Who are we?
We are a health technology startup building the search and data infrastructure for the digital health systems of tomorrow. Our mission is simple, but the environment in which we specialize is complex.
We empower enterprise-scale health organizations and healthtech startups to improve and optimize access to health care and support across North America and Europe. At our heart, we are a product- and technology-driven organization, and we look for people who share our vision of leveraging technology to solve and scale some of the most impactful operational challenges in healthcare.
As our team grows, we are looking for an experienced data quality developer to help us improve our data quality and enable the development of our healthcare-specific ML models.
What are we looking for?
A FIT.
We would like the person who joins our team to be someone:
- who gets up in the morning wanting to be better than the day before, and for whom a 7/10 is not "ok";
- who wants to be part of a team and who sees passion and collaboration as essential to reaching a goal.
Your role
We are looking for a person with a robust background in building or managing data pipelines that provide datasets for training machine learning models. We want someone who is deeply passionate about data quality and dedicated to continuously improving established processes that support model training. This role is critical for maintaining the data quality standards required to build products that meet the rigorous demands of the healthcare environment, and for ensuring the reliability and efficiency of our data pipelines.
As a Data Quality Developer, you will:
- Collaborate closely with web scraping developers and internal stakeholders to understand data requirements.
- Design secure and efficient processes for ingesting healthcare data from private sources and crawled websites.
- Develop and maintain robust data cleaning and transformation procedures for ensuring data quality and consistency.
- Utilize Spark and Ray for scalable, high-performance distributed data processing optimized for large healthcare datasets.
- Implement and manage Apache Airflow workflows for scheduling and automating routine healthcare data processing tasks.
- Work collaboratively with the GRC Lead to ensure data processing aligns with regulations and certifications (e.g., GDPR, HIPAA).
- Implement security measures such as data encryption, access controls, and anonymization techniques to safeguard sensitive data.
- Maintain comprehensive documentation for data processing pipelines, including design decisions, configurations, and workflow dependencies.
- Facilitate knowledge-sharing sessions with the data engineering team to disseminate best practices, new techniques, and updates.
In terms of skills, you should have:
- Solid Python programming skills, as well as proficiency with SQL.
- Knowledge of techniques and methodologies around data cleaning and data quality.
- Knowledge of different data processing paradigms (ETL, ELT, etc.).
- Experience working with parallel and distributed data computing (Ray, Spark, Dask, Hadoop, etc.).
- Experience working with versioned data lakes (Apache Iceberg).
- Experience working with containers and cloud computing.
- Experience working with sensitive and clinical data (nice to have).
If you have other skills that you think would be a plus for the team, we are of course very curious to hear from you.
What we have to offer you
- 4 weeks of vacation;
- Summer schedules;
- Group insurance from day 1;
- Direct access to a 24/7 online doctor for you and your family through our partner (and client) Dialogue from day 1;
- Employee and Family Assistance Program (EFAP);
- Flexible hours: free to work the hours when you are most productive;
- Flexible office: free to work from wherever you want;
- Autonomy, because hey, you're the specialist;
- Independence of action in a highly collaborative environment;
- High-performance equipment (MacBook);
- Camellia Sinensis tea and Montreal roasted coffee for your office time;
- Pet therapy with Clinia's dogs @pico_the_teckle, Cacau, Alaska and Opale;
- Team-building events, 5@7s, and team activities.
But also, this:
Moving is important: Clinia fundamentally believes in a balanced, active lifestyle. That's why we offer a bonus ($) for every hour of physical activity you do: hiking, biking, running, climbing. Whatever your sport, whatever day of the week, we encourage you to keep going.
We also offer the opportunity to:
- Play an essential role in the development of a scaling company;
- Contribute to the development of a product used by millions of patients in Canada;
- Work with a team of persevering and ambitious people with a true team spirit.
Our approach is simple:
We are a young and dynamic team that advocates the involvement and equality of everyone in decision-making - we don't say that to be cool, we really believe in it. So we're looking for someone who can use their expertise to help us build a solid future. Do you have the motivation, focus and entrepreneurial spirit to meet this challenge? We're looking for someone like you!
We are proudly B Corp Certified. Join our team and be part of a company dedicated to making a positive impact on the world. Come grow with us!
Apply now!
*By submitting your application, you consent to share your personal information with Clinia, which will use it to process your application for this position. Clinia will not use this information for any purpose other than the one stated above. See our Privacy Policy for more information.