Company Description
Welcome to Maverick InnoGarage Inc (MIG), an innovative IT recruitment and consulting firm located in Toronto, ON. At MIG, we are dedicated to reshaping the landscape of IT solutions through our comprehensive services. Our focus areas include IT staffing, IT training, IT team auditing, system design, and staff augmentation. With a team of seasoned professionals, we stay ahead of industry trends and deliver client-centric solutions.
Role Description
This is a contract role for a Lead Data Scientist for one of our clients in Toronto.
As a data scientist on our team, you will work on new product development in a small team environment, writing production code in both run-time and build-time environments. You will help propose and build data-driven solutions for high-value customer problems by discovering, extracting, and modeling knowledge from large-scale natural language datasets, including matter and contract repositories, invoice/legal spend data, and work management data. You will prototype new ideas, collaborating with other data scientists as well as product designers, data engineers, front-end developers, and a team of experienced legal data annotators. You will get the experience of working in a start-up atmosphere with the large datasets and many other resources of an established company.
Roles & Responsibilities
Develop and implement LLM-based applications tailored for in-house legal
Fine-tune and deploy large language models to improve their performance on legal text processing tasks.
• Evaluate and help maintain our data assets and training/evaluation data sets
Design and build pipelines for pre-processing, annotating, and handling legal document datasets
Collaborate with legal authorities to understand requirements and ensure models meet domain-specific needs
Conduct experiments and evaluate model performance to drive continuous improvements
Collaborate with other technical personnel or team members to finalize requirements.
• Work closely with other development team members to understand moderately complex product requirements and translate them into software designs.
• Efficiently implement development processes, coding standards, and code reviews for production environments.
Requirements
Formal training in machine learning: dimensionality reduction, clustering, embeddings, and sequence classification algorithms
Experience with deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers.
Practical experience in Natural Language Processing methods and libraries such as spaCy, word2vec, TensorFlow, Keras, PyTorch, Flair, BERT
• Practical experience with large language models, prompt engineering, fine-tuning, and benchmarking, using frameworks such as LangChain