Data Curation Scientist
- Dialpad, Inc.
- Kitchener, ON
- Extremely competitive with market rate.
- Job Type: Full time
- Company Size: Scaling (20-499)
At Dialpad, we're a team of doers. We think outside the box and, when that doesn't work, we reinvent it. We don't settle for the status quo and neither do the things we build. Led by the same minds behind Google Voice, we build products that get businesses talking, whether it's across the hall, street, or country.
With $70 million in funding from Google Ventures, Andreessen Horowitz, and other top VCs, along with engineers from companies like Microsoft and Google, every member of our team plays an essential role in creating a voice product that doesn’t just combine design and mobility but works with you wherever productivity may strike.
Design and own strategies and pipelines for acquiring high-quality training data. Optimize the quality, latency, and cost of data labelled through crowdsourcing or by internal labellers.
Manage large quantities of text and audio data. Typical tasks include extracting samples from databases, writing scripts to trim and clean data, and making datasets available on cloud services.
Develop standards for text data. Typical tasks include creating processes to infer pronunciations for words, ensuring that spellings and capitalizations are consistent across data, and standardizing incoming data from human transcribers.
Manage human labellers. Typical tasks include writing instructions for labellers, directing data to the interface that labellers will use, and creating tests to ensure quality.
Interact with world-class speech recognition and NLP specialists to help them meet their models’ needs for labelled data.
Master's or Ph.D. degree in a technical or linguistic field required
5+ years' experience in data management
5+ years' experience in text processing
5+ years' experience using labelled data (for example, in a machine learning context)
3+ years' experience labelling data using crowdsourcing
Excellent attention to detail
Creative, resourceful problem solver
Excellent data management skills with various platforms and languages
Comfortable using Python for data cleaning and management
Shell scripting skills
Proven ability to handle big data
Fluency in English and excellent understanding of the English language from a phonetic, grammatical, and linguistic perspective
Some experience with machine learning
Bonus: Multiple spoken languages (particularly Spanish and Japanese)
Bonus: Advanced programming skills in other programming languages
Bonus: Data presentation and analysis skills
Joining our team means collaborating with people who aren’t just passionate about their work but about Argentine tango, musicals, sushi burritos, comic books - you name it. Because if you’re going to redefine the status quo, you need a group of people hungry to do more, to see more, and to be more than where they started.
There is no idea too crazy and no task too small — we work together to make things we’re proud of.
Compensation & Equity
Teamwork makes the dream work. We recognize that our dedicated team members are what make us successful. That’s why we offer competitive salaries in addition to stock options.
An apple a day keeps the doctor away - and it doesn’t hurt that we offer 100% paid medical, dental, and vision coverage for employees.
We offer a monthly stipend to help cover your cell phone, home internet, and even gym membership costs.
Location, Location, Location
San Francisco, Raleigh, Vancouver, Tokyo, San Antonio, San Jose. From coast to coast, our offices are nestled in active and growing downtown areas.