Data Scientist - Synthetic Data/Generative AI
Our client is a multinational marketing research consultancy exploring data-driven advances through the development of synthetic data. We are seeking a data scientist to join the team, bringing specific, targeted skills in Generative AI and in developing machine learning models. You will collaborate with a talented team of data scientists and engineers to explore and validate methodologies and to develop innovative solutions that harness the potential of synthetic data to enhance our client's data collection offerings.
What you will be doing:
- Design, implement, and evaluate synthetic data generation algorithms and models, including Generative Adversarial Networks (GANs)
- Test and validate the effectiveness of synthetic data in replicating the statistical properties and patterns of real-world data
- Develop and test synthetic data use cases
- Set up metrics and frameworks to assess the quality, diversity, and utility of the generated synthetic data
- Stay abreast of the latest advancements and research in generative AI and synthetic data techniques
- Conduct research and experiments to push the boundaries of synthetic data capabilities
- Work closely with data engineers, analysts, and other stakeholders to integrate synthetic data solutions into projects, teams and Service Lines
About you:
- You have a PhD or a Master’s in a quantitative field such as Statistics, Mathematics, Computer Science, or a related discipline
- You have a solid foundation in statistical analysis, data mining, and machine learning, with experience applying these skills to real-world datasets
- You are proficient in Python and have experience with data science libraries like pandas, scikit-learn, and TensorFlow or PyTorch
- You have a deep understanding of generative models, such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), as well as other synthetic data techniques such as SMOTE
- You are knowledgeable about the ethical considerations and best practices in generating synthetic data
- You are familiar with data visualisation tools (e.g., Matplotlib, Seaborn, or Tableau) and can effectively present data insights and findings
- You have experience with big data technologies and cloud platforms (e.g., AWS, Google Cloud, Azure) and can work with large-scale datasets
- You have a good understanding of data privacy regulations and best practices, ensuring that synthetic data projects comply with legal and ethical standards