Data Architect - DBX + ML
Location: Remote (US)
Exp: 10+ Years
Travel Expectation: Up to 10%
Required Past Experience:
● Databricks Certified Machine Learning Professional
● Current credential preferred, or the ability to obtain it within the first 60 days of employment
● 5+ years of proven experience in ETL testing
● Hands-on experience in SQL with advanced-level skills
● Experience with scripting languages (shell/Python)
● Knowledge and understanding of data warehousing and data modeling
● Experience in the healthcare insurance domain is a plus
● Experience using Data Warehousing and Business Intelligence tools such as DataStage/Informatica, GCP BigQuery, Tableau, SSRS, etc.
● Willingness to continuously learn and share learnings with others, and the ability to collaborate with stakeholders and project leaders to understand requirements and deliverables
● Experience working in an agile and collaborative team environment
● Excellent written and verbal communication, presentation, and professional speaking skills
● Proven problem-solving skills and attention to detail, with a commitment to excellence and high standards
Required Skills and Abilities:
● 5+ years of consulting experience working with external clients across a variety of industry markets
● 7+ years of experience in data architecture, data engineering, and analytics, in areas such as performance tuning, pipeline integration, and infrastructure configuration
● In-depth knowledge of AWS data services and related technologies, including but not limited to: Redshift, Glue, S3, Aurora, MWAA, Lambda, and SageMaker
● Deep knowledge and expertise in Databricks and its components, such as Unity Catalog, Delta Lake, Delta Live Tables, Apache Spark, and related technologies such as dbt.
● Production-level experience with data ingestion, streaming technologies (e.g., Kafka), performance tuning, troubleshooting, and debugging
● Deep understanding of machine learning algorithms, techniques, and methodologies, with hands-on experience in applying supervised, unsupervised, and deep learning techniques to real-world problems.
● Proficiency in programming languages such as Python, R, or Scala, and frameworks such as Spark, including expertise in data manipulation and analysis libraries (e.g., NumPy, pandas).
● Deep knowledge and hands-on experience with data streaming technologies, such as Apache Kafka, Apache Flink, or similar platforms.
● Experience implementing real-time data processing pipelines for streaming analytics.
● Experience with Terraform, Git, and CI/CD tools, as well as automation and integration testing
● Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and their machine learning services (AWS preferred)
● Strong problem-solving and analytical skills, with the ability to identify and resolve complex data engineering challenges.
● Excellent communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams and stakeholders.
● Ability to adapt to changing project requirements and manage multiple tasks simultaneously.
Education: Bachelor's Degree or equivalent experience required.