Job Title: Data Engineer
Location: 100% remote/WFH – EST (onboarding will require fingerprinting).
Duration: 6-month contract
Skills
Hands-on experience with Azure Databricks, ADF, Python, and PySpark scripting to extract, load, and transform large data sets
Description
- This Data Engineer will possess strong experience with Azure Data Lake, ETL, ELT, ADF, Databricks, Python, and PySpark.
- Work with architects, other data engineers, and analysts to identify, engage, and integrate data sources for discovery and profiling and, where necessary, define data services that empower business processes
- Design, build, and scale data pipelines across a variety of source systems and streams (internal, third-party, as well as cloud-based), distributed / elastic environments, and downstream applications and/or self-service solutions
- Develop data pipelines using Azure Databricks, Python, and PySpark to parse semi-structured files (e.g., JSON, XML), creating automated derived data entities by reading the metadata and handling multiple parsing scenarios
- Collaborate in establishing and evolving development, testing, and documentation standards, as well as related code reviews
- Partner with business analysts, application engineers, and data scientists, leveraging the appropriate tools, solutions, and/or processes as part of their data mining, profiling, blending, and analytical activities
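As a rough illustration of the semi-structured parsing work described above, here is a minimal local sketch using only the Python standard library in place of PySpark; the file names, field names, and dispatch logic are hypothetical stand-ins for the "multiple parsing scenarios" the role involves:

```python
import json
import xml.etree.ElementTree as ET


def parse_record(name, payload):
    """Dispatch on file extension and return a list of flat dict rows.

    Covers two parsing scenarios (JSON and XML); unknown formats raise.
    Note that XML element text always arrives as strings.
    """
    if name.endswith(".json"):
        data = json.loads(payload)
        return data if isinstance(data, list) else [data]
    if name.endswith(".xml"):
        root = ET.fromstring(payload)
        return [{child.tag: child.text for child in row} for row in root]
    raise ValueError(f"unsupported format: {name}")


# Hypothetical inputs: one JSON source and one XML source, normalized
# into the same row shape for a downstream derived entity.
rows = parse_record("orders.json", '[{"id": 1, "amount": 25.0}]')
rows += parse_record(
    "orders.xml",
    "<orders><order><id>2</id><amount>40.5</amount></order></orders>",
)
```

In a real Databricks pipeline this dispatch would typically be replaced by `spark.read.json` / `spark.read.format("xml")` over ADLS paths; the sketch only shows the per-format branching idea.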
Qualifications
- Progressive experience in data pipeline development and cloud data application development
- Experience with sources/targets: ADLS Gen2, JSON, XML, semi-structured files, Azure SQL, Azure Databricks
- Orchestration using ADF and Databricks notebooks (mandatory)
- Experience working in large scale/distributed SQL, NoSQL, and/or Hadoop environments
- Experience modeling and implementing ETL / ELT on columnar MPP database technologies
- Experience with streaming architectures (e.g., Kafka, Stream, PubSub)
- Data Modeling experience
- Hands-on experience with Azure Databricks, ADF, Python, and PySpark scripting to extract, load, and transform large data sets
- Hands-on experience parsing semi-structured files (e.g., JSON, XML) to extract data and derive data models
- Hands-on experience working with large data sets in databases and data warehouses; strong analytical and coding skills
- Experience performance-tuning data pipelines and data processing/transformation scripts
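To make the extract/load/transform expectation concrete, here is a minimal sketch using `sqlite3` as a local stand-in for an Azure SQL target; the table name, columns, and sample rows are hypothetical:

```python
import sqlite3

# Extract: rows pulled from a semi-structured source typically arrive
# as strings before any typing is applied.
raw_rows = [("1", "25.0"), ("2", "40.5")]

# Load target: an in-memory database standing in for Azure SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# Transform: cast string fields to typed columns, then load in one batch.
typed = [(int(i), float(a)) for i, a in raw_rows]
conn.executemany("INSERT INTO orders VALUES (?, ?)", typed)

# Simple downstream check against the loaded data.
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

The same extract-transform-load shape applies at scale with PySpark DataFrames writing to Azure SQL; batching the insert (`executemany` here, partitioned writes there) is the basic performance-tuning lever the sketch hints at.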
Best Regards,
David Roy | Talent Acquisition Manager – US Staffing | Charter Global Inc. | https://www.charterglobal.com
One Glenlake Parkway | Suite 525 | Atlanta, GA 30328