Must be a US citizen.
Responsibilities:
- Collaborate on and contribute to the architecture, design, development, and maintenance of large-scale data and analytics platforms, including system integrations, data pipelines, data models, and API integrations.
- Create a transformation path for migrating data from on-premises pipelines and sources to AWS.
- Provide input and insights to the client in conjunction with Data Architects.
- Coordinate with data engineers to provide feedback and structure to the team.
- Ensure that data is standardized and analysis-ready.
- Prototype emerging business use cases to validate technology approaches and propose potential solutions.
- Work with team members to ensure data quality and integrity during and after large-scale migrations.
- Deliver high-quality data assets that the business can use to transform processes and that enable leaders to perform data-driven analyses.
- Continuously improve data solutions to increase the quality, delivery speed, and trustworthiness of the data engineering team's deliverables and enable business outcomes.
- Reduce total cost of ownership of solutions by developing shared components and implementing best practices and coding standards.
- Collaborate with the team to re-platform and reengineer data pipelines from on-premises systems to the AWS cloud.
- Lead by example and pitch in to enable successful and seamless client delivery.
Location: Remote; DC-area candidates preferred.
Requirements:
- 8+ years of experience in data engineering.
- AWS Cloud certification.
- Minimum of 3 years of experience in the following:
  - Leading engineering teams, including task management and personnel management.
  - Working in or managing data-centric teams in government or other highly regulated environments.
  - Strong understanding of data lake, data lakehouse, and data warehousing architectures in a cloud-based environment.
  - Proficiency in Python for data manipulation, scripting, and automation.
  - In-depth knowledge of AWS services relevant to data engineering (e.g., S3, EC2, DMS, DataSync, SageMaker, Glue, RDS, Lambda, Elasticsearch).
  - Understanding of data integration patterns and technologies.
  - Proficiency designing and building flexible, scalable ETL processes and data pipelines using Python and/or PySpark and SQL.
  - Proficiency in data pipeline automation and workflow management tools such as Apache Airflow or AWS Step Functions.
  - Knowledge of data quality management and data governance principles.
  - Strong problem-solving and troubleshooting skills related to data management challenges.
  - Experience managing code in GitHub or similar tools.
- Minimum of 2 years of experience in the following:
  - Hands-on experience with Databricks, including data ingestion, transformation, analysis, and optimization.
  - Experience designing, deploying, securing, sustaining, and maintaining applications and services in a cloud environment (e.g., AWS, Azure) using infrastructure as code (e.g., Terraform, CloudFormation, Boto3).
  - Experience with database administration, optimization, and data extraction.
  - Experience using container orchestration technology such as Kubernetes or Mesos.
- Minimum of 1 year of experience in the following:
  - Hands-on experience migrating from an on-premises data platform to a modern cloud environment (e.g., AWS, Azure, GCP).
  - Linux/RHEL server and bash/shell scripting experience in on-premises or cloud environments.
Preferred Experience:
- Bachelor's degree in a related field.
- Previous experience with large-scale data migrations and cloud-based data platform implementations.
- Prior experience with Databricks Unity Catalog and its metastore.
- Familiarity with advanced SQL techniques for performance optimization and data analysis.
- Knowledge of data streaming and real-time data processing frameworks such as Spark Structured Streaming.
- Experience with data lakes and big data technologies (e.g., Apache Spark, Citus).
- Familiarity with serverless computing and event-driven architectures in AWS.
- Certifications in AWS, Databricks, or related technologies.
- Experience working in Agile or DevSecOps environments and using related tools for collaboration and version control.
- Extensive knowledge of software and data engineering best practices.
- Strong communication and collaboration skills with internal and external stakeholders.
- Experience establishing, implementing, and documenting best practices, standard operating procedures, etc.
Clearance Requirements:
- Must be able to obtain and maintain a Public Trust clearance.
- Must be a US citizen.