Junior Data Scientist

Job Description:

Social media generates a massive volume of data about online interactions every day. Such interactions provide useful information about users’ behaviours and psychological states. SafeToNet has an AI-based engine that constantly analyzes the patterns in written human language and produces psychological insights about users’ emotional and mental health. Our emerging technologies are used to promote the safety and welfare of children.

We are seeking a Data Scientist Co-op to help us create the next generation of SafeToNet technology.

The ideal candidate has a passion for quality, strong attention to detail, and a commitment to continuous learning. Join an interdisciplinary team of PhD-level researchers and software developers at SafeToNet!

Roles and Responsibilities:

Work with Data Scientists on the following:

  • Build production-worthy training sets from multiple sources (i.e., data fusion) and assess their quality.
  • Build tools and scripts to help assess and manage data from multiple sources.
  • Test and optimize algorithms that are designed to work on large corpora.
  • Manage the configuration, execution, and evaluation of machine learning tests on large datasets.
  • Develop performance benchmarks and perform hyperparameter optimization to report the optimal model parameters.
  • Analyze, interpret, and communicate results to the data science, engineering, product, and QA leads as required.

Requirements:

  • Working towards a Bachelor’s degree in Computer Science, Software Engineering, or equivalent. You will have a keen interest in AI supported by continued education and learning.

Technical Requirements:

  • Proficiency in machine learning and deep learning methods (demonstrated project work in deep learning on GitHub is an asset).
  • Proficiency in Python or Java with ML libraries such as Keras or scikit-learn.
  • Experience with data analysis libraries, such as Pandas and SciPy, and data visualization tools, such as matplotlib.
  • Experience with statistical and quantitative analysis, e.g. regression, properties of distributions, and statistical tests.

We consider it an asset if you also have any of the following:

  • Direct experience with the TensorFlow, Torch, or Theano libraries.
  • Proficiency in source control tools such as Git, ideally with Bitbucket and/or GitHub experience.
  • Experience with NLP and text processing libraries such as NLTK and CoreNLP.