Join a purpose-driven organization dedicated to creating a positive impact in the world. We embrace diversity and are committed to cultivating a culture that values unique perspectives and fosters a collaborative environment where everyone can excel.
Position Overview
We are seeking a highly skilled Senior Data Engineer to join our Health Engineering Systems team. In this pivotal role, you will work closely with clients to articulate a vision for success and implement effective solutions to realize it. Your knowledge and experience will be instrumental in building and maintaining robust data pipelines and systems that deliver significant outcomes.
Key Responsibilities
- Design, develop, and maintain scalable data pipelines using Spark, Hive, and Airflow.
- Create and implement data processing workflows on the Databricks platform.
- Develop API services to enable seamless data access and integration.
- Build interactive data visualizations and reports using Amazon QuickSight.
- Set up the infrastructure required for optimal extraction, transformation, and loading (ETL) of data from diverse sources using AWS and SQL technologies.
- Monitor and improve the performance of data infrastructure and processes.
- Build data quality and validation jobs.
- Assemble large, complex datasets that meet both functional and non-functional business requirements.
- Write and document unit and integration tests for all data processing code.
- Collaborate with DevOps engineers on CI/CD and Infrastructure as Code (IaC).
- Translate specifications into code and design documentation.
- Conduct code reviews and formulate processes to enhance code quality.
- Improve data availability and timeliness by implementing more frequent refreshes, tiered data storage, and optimizations of existing datasets.
- Safeguard the security and privacy of data both at rest and in transit.
- Perform additional duties as necessary.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a closely related field.
- A minimum of 7 years of practical experience in software or data development (5 years with a Master's degree).
- At least 4 years of experience in developing data pipelines using Python, Java, and cloud technologies.
- Must be able to obtain and maintain a Public Trust clearance.
- Candidates must reside in the US and be authorized to work in the US.
Required Skills
- Proficiency in Spark and Hive for big data processing.
- Experience in building workflows with the Databricks platform.
- Strong familiarity with AWS services such as S3, Redshift, RDS, EMR, Glue, and QuickSight.
- Knowledge of processes supporting data transformation, workload management, and data governance.
- Expertise in optimizing data systems and constructing data pipelines from the ground up.
- Understanding of both relational and NoSQL databases, including Cassandra and Postgres.
- Familiarity with workflow management tools like Airflow and stream-processing systems such as Spark Streaming.
- Awareness of DevOps practices, including CI/CD pipelines (GitHub Actions) and Infrastructure as Code (Terraform).
- Experience with Agile methodologies and test-driven development.
Career Growth Opportunities
This role provides a unique opportunity for professional growth and development within a dynamic and forward-thinking organization. You will have the chance to work on impactful projects, gain new skills, and contribute to the overall mission of the company.
Company Culture and Values
Our organization prides itself on its mission-driven approach and unwavering commitment to diversity. We foster a supportive and collaborative environment where all employees are encouraged to share their unique perspectives and work collectively towards shared goals.
Employment Type: Full-Time