Job Title: Data Engineer
Location: 100% remote/WFH – EST (onboarding will require fingerprinting).
Duration: 6-month contract
Skills
Hands-on experience with Azure Databricks, ADF, Python, and PySpark scripting to extract, load, and transform large data sets
Description
- This Data Engineer will possess strong experience with Azure Data Lake, ETL, ELT, ADF, Databricks, Python, and PySpark.
- Work with architects, other data engineers, and analysts to identify, engage, and integrate data sources for discovery and profiling and, where necessary, define data services that empower business processes
- Design, build, and scale data pipelines across a variety of source systems and streams (internal, third-party, as well as cloud-based), distributed / elastic environments, and downstream applications and/or self-service solutions
- Develop data pipelines using Azure Databricks, Python, and PySpark to parse semi-structured files (e.g., JSON, XML), creating automated derived data entities by reading the metadata and handling multiple parsing scenarios
- Collaborate in establishing and evolving development, testing, and documentation standards, as well as related code reviews
- Partner with business analysts, application engineers, and data scientists, leveraging the appropriate tools, solutions, and/or processes as part of their data mining, profiling, blending, and analytical activities
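As a rough illustration of the semi-structured parsing work described above, here is a minimal local sketch using only the Python standard library in place of PySpark; the file names, field names, and dispatch logic are hypothetical stand-ins for the "multiple parsing scenarios" the role involves:

```python
import json
import xml.etree.ElementTree as ET


def parse_record(name, payload):
    """Dispatch on file extension and return a list of flat dict rows.

    Covers two parsing scenarios (JSON and XML); unknown formats raise.
    Note that XML element text always arrives as strings.
    """
    if name.endswith(".json"):
        data = json.loads(payload)
        return data if isinstance(data, list) else [data]
    if name.endswith(".xml"):
        root = ET.fromstring(payload)
        return [{child.tag: child.text for child in row} for row in root]
    raise ValueError(f"unsupported format: {name}")


# Hypothetical inputs: one JSON source and one XML source, normalized
# into the same row shape for a downstream derived entity.
rows = parse_record("orders.json", '[{"id": 1, "amount": 25.0}]')
rows += parse_record(
    "orders.xml",
    "<orders><order><id>2</id><amount>40.5</amount></order></orders>",
)
```

In a real Databricks pipeline this dispatch would typically be replaced by `spark.read.json` / `spark.read.format("xml")` over ADLS paths; the sketch only shows the per-format branching idea.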
Qualifications
- Progressive experience in data pipeline development and cloud data application development
- Experience with sources/targets: ADLS Gen2, JSON, XML, semi-structured files, Azure SQL, Azure Databricks
- Orchestration using ADF and Databricks notebooks (mandatory)
- Experience working in large scale/distributed SQL, NoSQL, and/or Hadoop environments
- Experience modeling and implementing ETL / ELT on columnar MPP database technologies
- Experience with streaming architectures (e.g., Kafka, Stream, PubSub)
- Data Modeling experience
- Hands-on experience with Azure Databricks, ADF, Python, and PySpark scripting to extract, load, and transform large data sets
- Hands-on experience parsing semi-structured files (e.g., JSON, XML) to extract data and derive data models
- Hands-on experience working with large data sets in databases and data warehouses; strong analytical and coding skills
- Experience performance-tuning data pipelines and data processing/transformation scripts
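To make the extract/load/transform expectation concrete, here is a minimal sketch using `sqlite3` as a local stand-in for an Azure SQL target; the table name, columns, and sample rows are hypothetical:

```python
import sqlite3

# Extract: rows pulled from a semi-structured source typically arrive
# as strings before any typing is applied.
raw_rows = [("1", "25.0"), ("2", "40.5")]

# Load target: an in-memory database standing in for Azure SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# Transform: cast string fields to typed columns, then load in one batch.
typed = [(int(i), float(a)) for i, a in raw_rows]
conn.executemany("INSERT INTO orders VALUES (?, ?)", typed)

# Simple downstream check against the loaded data.
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

The same extract-transform-load shape applies at scale with PySpark DataFrames writing to Azure SQL; batching the insert (`executemany` here, partitioned writes there) is the basic performance-tuning lever the sketch hints at.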
Best Regards,
David Roy | Talent Acquisition Manager – US Staffing | Charter Global Inc. | https://www.charterglobal.com
One Glenlake Parkway | Suite 525 | Atlanta, GA 30328