Company Overview
Tomato.ai softens accents on calls. The company has raised over $12M and is led by two ex-Googlers who worked in the speech space for years. The founders previously sold four tech startups. The company is remote-first, based in the US, and hiring for this role worldwide.
Pay range
Highly competitive compensation and benefits. Exact compensation may vary based on skills, experience, and location.
Location
Fully remote.
Responsibilities
- Develop and operate pipelines for large-scale speech data processing using Apache Beam and Google Cloud Dataflow.
- Develop algorithms for training data selection and augmentation for speech ML models.
- Collaborate closely with researchers to achieve model performance goals.
Required Qualifications
- Minimum 5 years of experience in data engineering.
- Experienced in web scraping and data processing.
- Extensive hands-on experience with large-scale data processing.
- Proficient in at least one of Apache Beam, Spark, or Flink.
- Passionate about and skilled at data analysis; able to produce practical insights.
- Good understanding of state-of-the-art deep learning techniques.
- Proficient in Python and PyTorch.
- Hands-on experience with ML model training.
- Attention to detail.
- Effective communication skills.
- Ability to work independently in a remote-first environment.
Preferred Qualifications
- Experienced with audio data processing.
- Familiarity with GCP.
- Experience with speech or audio ML models is a big plus.