POSITION SUMMARY:
The Senior Data Engineer works with senior leadership to manage and direct the data, data
storage, and data pipeline development activities for the company’s products and services.
This position is responsible for building and maintaining optimized, highly available data
pipelines that support deeper analysis and reporting by the Data and Analytics team. The
Senior Data Engineer also builds data processing frameworks that scale with the business’s
growing data needs.
RESPONSIBILITIES:
- Create and maintain optimal data pipeline architecture.
- See projects through the entire lifecycle (full SDLC): project definition, system
requirements, design, development, configuration management, and deployment.
- Assemble large, complex data sets that meet functional / non-functional business
requirements.
- Ensure all requirements for data-related processes are met in a timely manner and in
accordance with our standard operating procedures.
- Keep our data separated and secure across national boundaries through multiple data
centers and AWS regions.
- Build data tools for analytics and data science team members that assist them in
building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data
systems.
- Ensure all processes and procedures used to develop and validate the core product
follow internal SOPs and meet regulatory guidelines such as ICH, GCP, and
21 CFR Part 11.
REQUIREMENTS:
- Education: BS in Computer Science or related field (MBA or advanced degree preferred).
- Experience: 5+ years in senior data engineering and configuration management, with full SDLC knowledge.
- Technical Skills: Proven experience building reliable and scalable ETL systems on big data platforms.
- Expertise in relational and NoSQL databases (e.g., MySQL, MongoDB, Redshift) and distributed data platforms (e.g., Hadoop, Spark).
- Strong understanding of data warehousing, dimensional modeling, scripting languages (Python, JavaScript), and machine learning.
- Familiarity with big data tools (MapReduce, Hive, HDFS, YARN, HBase, Oozie) and AWS infrastructure (EMR, SQS, Redshift), using Python, Scala, and Spark.
- Proficiency in Linux and OS-level troubleshooting.