Job title: Big Data Engineer
Location: Malvern, PA
Duration: 6 months
Contract: W2 and C2C
Responsibilities
Job Description:
- Leverages data pipeline designs and supports the development of data pipelines for model development.
- Is proficient with software tools for developing data pipelines in a distributed computing environment (PySpark, AWS Glue ETL).
- Supports integration of model pipelines in a production environment.
- Develops an understanding of the software development life cycle (SDLC) for model production.
- Reviews pipeline designs and makes data model design changes as needed.
- Documents and reviews design changes with data science teams.
- Supports data discovery and automated ingestion for model development.
- Performs detailed analysis of raw data sources for data quality, applying business context and model development needs.
- Engages with internal stakeholders to understand and probe business processes in order to develop hypotheses.
- Brings structure to requests and translates requirements into an analytic approach.
- Participates in and influences ongoing business planning and departmental prioritization activities.
- Runs model monitoring scripts and follows the process for alerting management as needed.
- Addresses issues found in data pipelines from model monitoring alerts.
Qualifications
PySpark, Python, Advanced SQL, AWS, SageMaker Pipelines, ETL, Data Pipelines, understanding of AI/ML, Model Development and Deployment Life Cycle