Job Title: Senior Data Architect.
Location: Dallas, Texas.
Roles and Responsibilities:
● Develop, maintain, and optimize data infrastructure using Delta Lake, MLflow, and Databricks SQL to enhance data management, processing, and analytics.
● Utilize Snowflake’s features such as data sharing, zero-copy cloning, and automatic scaling to optimize data storage, accessibility, and performance. Ensure effective management of both semi-structured and structured data within Snowflake’s architecture.
● Implement and manage data storage solutions using Amazon S3, perform data warehousing with Amazon Redshift, and build ETL workflows with AWS Glue.
● Design and implement data integration workflows using Azure Data Factory to orchestrate and automate data movement and transformation.
● Design and implement scalable data pipelines using tools like Apache Kafka or Apache Airflow to facilitate real-time data processing and batch data workflows.
● Apply advanced analytics techniques, including predictive modeling and data mining, to uncover insights and drive data-driven decision-making.
Must-Have Skills:
● Proficiency in Databricks features such as Delta Lake, MLflow, and Databricks SQL. Experience in managing Spark clusters and implementing machine learning workflows.
● Knowledge of Snowflake features such as data sharing, zero-copy cloning, and automatic scaling. Experience working with Snowflake’s architecture for semi-structured and structured data.
● Experience with services like Amazon S3, Amazon Redshift, and AWS Glue.
● Familiarity with Azure Synapse Analytics, Azure Data Lake Storage, and Azure Data Factory.
● Proficiency in tools such as Apache NiFi, Talend, Informatica, or Microsoft SQL Server Integration Services (SSIS).
● Experience in designing and implementing data pipelines using tools like Apache Kafka or Apache Airflow.
● Ability to perform data profiling, data quality assessments, and performance tuning.
● Experience in comparing and evaluating different data technologies based on criteria like performance, scalability, and cost.
● Skills in applying advanced analytics techniques, including predictive modeling and data mining.
Good to Have Skills:
● Experience with data governance tools such as Collibra or Alation.
● Knowledge of data quality frameworks and standards, such as Data Quality Dimensions (completeness, consistency, etc.).
● Familiarity with tools like Apache Beam or Luigi for managing complex data workflows.
● Awareness of emerging data technologies such as data mesh, data fabric, and real-time data processing frameworks.