Title : Data Engineer
Location : VA (Day One on-Site)
Duration : Long Term
Must : Python, Spark, Scala, AWS & SQL
Job Description:
The Data team is looking for a Data Engineer to join our Data Engineering (DE) team, which is responsible for building our data lake, maintaining our data pipelines and services, and facilitating the movement of large volumes of data each day. We work directly with business, platform, and engineering teams to support growth strategies. You are a structured, out-of-the-box thinker who is passionate about building services that scale. You will play a key role in providing end-to-end data engineering solutions to support key business initiatives.
Responsibilities
As a data engineer on the DE team, you will apply your strong technical experience building highly reliable services to managing and orchestrating multi-terabyte-scale data lakes and implementing a Data Mesh architecture, working closely with the Data Architecture/Modeling team. You enjoy working in an agile environment and are able to take vague requirements and transform them into solid solutions. You are motivated by solving challenging problems, where innovation, problem-solving, and creativity are as important as your ability to write code and test cases.
Minimum Qualifications and Expectations:
At least 3 years (5 or 10, depending on level) of professional experience as a software engineer or data engineer
A BS in Computer Science or equivalent experience
Strong programming skills (some combination of Python, Java, and Scala)
Experience writing SQL, structuring data, and data storage practices
Experience with NoSQL databases such as MongoDB and Cassandra
Experience with data modeling
Knowledge of data warehousing concepts
Experience building data pipelines and microservices
Experience with Spark, Kafka, Flink, Hive, Airflow, and other streaming and data pipeline technologies for processing large volumes of streaming data
Experience working with Amazon Web Services (in particular EMR, Kinesis, Redshift, S3, SQS, and the like)
Experience with object-oriented and functional languages: Python, Java, C++, Scala, etc.
An open mind to try solutions that may seem impossible at first
It's preferred, but not technically required, that you have:
Experience building self-service tooling and platforms
Built and designed Data Mesh architecture platforms
A passion for building and running continuous integration pipelines
Built pipelines using Databricks and are well versed in its APIs
Contributed to open-source projects (e.g., operators in Airflow)
EDUCATION
Bachelor's Degree in Computer Science, Information Systems, Engineering, or a related field, or equivalent work experience.