Data Engineer

Gandiva Insights LLC • Full-time • Texas, United States • 2d ago

Job Title: Data Engineer
Remote Position

Job Summary:

We are seeking a skilled Data Engineer to design, implement, and manage robust data pipelines and architecture. The ideal candidate will work closely with data scientists, analysts, and business stakeholders to ensure data is available, clean, and well-structured for analytics and decision-making. You will manage both structured and unstructured data and integrate various data sources into a cohesive, reliable system.

Key Responsibilities:

Design, develop, and maintain scalable data pipelines and ETL processes to support data integration and analytics needs.
Collaborate with data scientists, analysts, and stakeholders to understand data requirements and ensure data accuracy and availability.
Manage and optimize databases (SQL, NoSQL) for data storage, retrieval, and analysis.
Build and maintain batch and real-time data processing systems.
Ensure data quality and governance through monitoring and validation tools.
Develop and maintain data warehousing solutions (e.g., AWS Redshift, Snowflake).
Implement data security and privacy policies in compliance with industry regulations.
Optimize data workflows for scalability, performance, and cost efficiency.
Continuously evaluate and improve the data architecture to meet the evolving needs of the business.

Required Skills and Qualifications:

Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
2+ years of experience in a Data Engineering role or equivalent.
Strong proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL).
Experience with big data technologies such as Hadoop, Spark, Kafka, and Hive.
Proficiency in programming languages such as Python, Java, or Scala.
Experience with cloud-based data platforms (e.g., AWS, GCP, Azure).
Hands-on experience with ETL and workflow orchestration tools (e.g., Apache Airflow, Talend, Informatica).
Familiarity with data warehousing concepts and tools (e.g., Snowflake, Redshift).
Strong understanding of data modeling, data structures, and database design.
Knowledge of data governance, security, and privacy best practices.

Preferred Qualifications:

Experience with machine learning pipelines and tools.
Knowledge of DevOps and CI/CD practices in data engineering.
Familiarity with stream processing frameworks like Apache Flink or Apache Storm.
Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).