Job Overview
We are excited to invite applications for a Senior Data Engineer position within our innovative Health Engineering Systems team. This remote role offers the chance to leverage your technical expertise in a mission-driven organization dedicated to making a meaningful difference in the world. By joining us, you will be part of a diverse team that values collaboration and fosters an inclusive workplace environment.
Key Responsibilities
- Design, implement, and maintain scalable data pipelines using technologies such as Spark, Hive, and Airflow.
- Develop and manage data processing workflows utilizing the Databricks platform.
- Create API services to streamline data access and integration.
- Develop interactive data visualizations and reports in Amazon QuickSight.
- Construct necessary infrastructure for efficient data extraction, transformation, and loading from various sources using AWS and SQL.
- Monitor and optimize the performance of data infrastructure and processing workflows.
- Establish data quality and validation processes.
- Assemble large, complex datasets that meet both functional and non-functional business requirements.
- Conduct unit and integration testing for all data processing code.
- Collaborate with DevOps engineers on CI/CD and Infrastructure as Code (IaC) initiatives.
- Translate specifications into working code and design documents.
- Conduct code reviews and implement procedures to enhance code quality.
- Improve data availability and timeliness through more frequent data refreshes and optimizations of existing datasets.
- Ensure the integrity and security of data during storage and transit.
- Perform additional duties as needed.
Required Skills
- Extensive experience in data pipeline development using Python, Java, and cloud technologies.
- Proficient in big data processing with Spark and Hive.
- Expertise in AWS services, including S3, Redshift, RDS, EMR, AWS Glue, and QuickSight.
- Familiarity with data transformation processes, workload management, and data governance.
- Understanding of relational databases such as Postgres and NoSQL databases such as Cassandra.
- Knowledge of workflow management tools such as Airflow and stream-processing systems like Spark Streaming.
- Familiarity with DevOps principles, including CI/CD pipelines and Infrastructure as Code methodologies.
- Experience working with Agile methodologies and test-driven development practices.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related discipline.
- A minimum of 7 years of relevant experience in software or data development (5 years with a Master's degree).
- At least 4 years of experience in developing data pipelines.
- Must meet the requirements to obtain and maintain a Public Trust clearance.
- Candidates must be based in the United States and authorized to work within the country.
Career Growth Opportunities
This position presents a unique opportunity to elevate your skills in data engineering while engaging with cutting-edge technologies. Additionally, you will have the chance to contribute to impactful projects that not only enhance your professional development but also benefit the community.
Company Culture And Values
We are committed to creating a diverse and inclusive workforce, and we encourage applications from individuals of all backgrounds and experiences. Our culture emphasizes collaboration and support, creating a welcoming environment for all employees.
Networking And Professional Opportunities
Joining our team offers extensive opportunities for networking and collaboration within the industry, providing a platform for professional growth and development.
Compensation And Benefits
The compensation for this role will be determined based on several factors, including relevant experience, skill set, geographic location, and educational background.
Employment Type: Full-Time