Summary
We are seeking a highly skilled and experienced Data Engineer to join our dynamic team. The ideal candidate has experience designing, developing, and maintaining data architecture, ensuring the efficient and secure flow of information. As a Data Engineer, you will work closely with cross-functional teams to understand data requirements, implement robust data pipelines, and contribute to the overall data strategy. This role requires a deep understanding of data engineering principles, proficiency in relevant technologies, and the ability to work as a strong team player.
Responsibilities:
- Data Architecture Design:
- Design and implement scalable and efficient data architectures that support business needs and analytics initiatives.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and provide technical solutions.
- Data Pipeline Development:
- Develop, deploy, and maintain robust ETL (Extract, Transform, Load) processes and data pipelines to ensure the timely and accurate flow of data between systems.
- Implement data integration solutions that consolidate and centralize disparate data sources.
- Data Modeling:
- Design and implement data models that optimize the storage and retrieval of data, ensuring scalability and performance.
- Data Quality and Governance:
- Implement data quality checks and validation processes to ensure the accuracy and reliability of data.
- Adhere to data governance principles and contribute to the development and enforcement of data policies.
- Performance Optimization:
- Monitor and optimize the performance of databases and data processing systems to meet or exceed performance metrics.
- Cloud Technologies:
- Work with cloud platforms (e.g., AWS, Azure, GCP) to design, deploy, and manage data solutions in a cloud environment. Specifically, work with the Azure Data Platform stack: Azure Data Lake, Data Factory, Storage containers, etc.
- Collaboration:
- Collaborate with other engineering teams, data scientists, and business stakeholders to understand requirements and deliver integrated solutions.
- Documentation:
- Create and maintain comprehensive documentation for data engineering processes, systems, and workflows.
- Continuous Learning:
- Stay abreast of industry trends, emerging technologies, and best practices in data engineering. Implement continuous improvement initiatives to enhance data engineering processes.
Required:
Python Development:
- Leverage your 4+ years of Python experience to analyze and manipulate data effectively.
SQL Expertise:
- Apply strong SQL skills for queries with complex join conditions, data extraction, and conditional updates.
DevOps Proficiency:
- Work within a DevOps environment, demonstrating expertise in Linux, GitHub, and Bash scripting.
CI/CD Lifecycle:
- Contribute to and enhance the continuous integration and continuous delivery (CI/CD) lifecycle, ensuring smooth data workflows.
Testing:
- Design and implement unit, integration, and regression tests to ensure the accuracy and reliability of data pipelines.
Code Quality Standards:
- Implement and adhere to coding standards, utilizing static code analysis tools and linters to maintain code quality.
Cloud Platforms:
- Apply your experience with Microsoft Azure and/or Google Cloud Platform, along with infrastructure-as-code tooling such as Terraform, to optimize data analytics processes within cloud environments.
Agile SCRUM Projects:
- Collaborate effectively within an Agile SCRUM framework, participating in sprint planning, reviews, and retrospectives.
Data Management:
- Utilize expertise in Databricks (Delta/Delta Live pipeline implementation, asset bundles, Unity Catalog), PySpark, Spark Structured Streaming, Delta Live Tables, and Delta Sharing for advanced data management.
Highly Desired:
Expert SQL Knowledge:
- Demonstrate expert-level knowledge of SQL for complex data querying and manipulation.
Streaming Technologies:
- Apply experience with streaming technologies such as Apache Kafka, Azure Event Hubs, and IBM MQ, as well as serialization formats such as Avro, to enhance data processing capabilities.
Advanced DevOps:
- Utilize advanced DevOps tools and practices, including GitHub Actions, Terraform, and Artifactory to streamline data workflows.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Proven experience as a Data Engineer, with a focus on designing and implementing data architectures and pipelines.
- Must be proficient in Databricks.
- Must have experience with Terraform.
- Strong proficiency in programming languages such as Python, Java, or Scala.
- Experience with big data technologies (Hadoop, Spark), relational databases, and data warehousing solutions.
- Expertise in ETL tools and methodologies.
- Familiarity with cloud-based data technologies (AWS, Azure, GCP).
- Strong understanding of data security and privacy principles.
- Excellent problem-solving and analytical skills.
- Strong communication and leadership skills.
- Previous experience in a senior or lead role is advantageous.
- Relevant certifications (e.g., AWS Certified Big Data - Specialty) are a plus.
Join our team and take a leadership role in shaping the data landscape of our organization!