Service Reliability Engineer
- Job Function: Operations, Software Development
- Job Type: Full time
- Company Size: Scaling (20-499)
Do you love your job? You should!
Join our team and help launch the future of online education!
Who are we?
Never heard of DigitalEd? That’s because we’re brand new – well, sort of! Maplesoft, a long-time tech company based in Waterloo, is spinning out its online education products into a separate corporation. DigitalEd will focus specifically on digital technology for online education, with Maplesoft’s powerful math engine embedded at the core. That’s the best of both worlds! By combining the strength and longevity of Maplesoft with the excitement and focus of a start-up, DigitalEd is poised to make major contributions to the future of education worldwide.
Who are you?
We are looking for a full-time Service Reliability Engineer to work in Waterloo or remotely.
In this role, you will design, build, deploy and manage DigitalEd’s internal Private Cloud infrastructure as well as our customer-facing SaaS application infrastructure on Google Public Cloud. You will also create the tooling and automation needed to enable our teams and give our customers a compelling experience.
The most important aspect of this team is that we’re never satisfied with the status quo. We continually learn, adapt and extend. DigitalEd is an agile shop, so we expect you to get things done by experimenting, failing, learning and most importantly, pushing forward.
This job might be for you if you:
- Can write code in any language – shell scripting, Ruby, Python, Java, whatever (ideally in multiple languages) – and have deployed some of your work to a production environment
- Have at least a few years’ experience in an Operational role, whether that be DevOps, SRE or traditional network/server management
- Are familiar with at least one configuration management system: Puppet, Chef, Ansible or similar
- Believe that a happy user is one who doesn’t have to call you with problems or to report errors
- Are comfortable communicating with people at all levels of the company who are spread all over the world
- Understand that a job is not finished until it’s tested, documented and highly automated
- Know that occasionally, you’ll probably be fixing something from home in your PJs
- Have worked on 24x7x365 Production systems in multiple sites and geographies powering business-critical services, including deployment, maintenance, troubleshooting, performance tuning, and security
- Know how to ensure there is proper monitoring, alerting, capacity planning and reporting in the production environment
- Can contribute to the evolving design and architecture of reliable and scalable infrastructure
- Have collaborated with product engineering teams to ensure Operations standards are observed, determine resource impacts for upcoming product deployments, and ensure successful product rollouts
- Have developed processes, tools, and documentation in support of production operations
You’re practically a shoo-in if you:
- Have experience with container orchestration systems like Docker Swarm or Kubernetes
- Are familiar with more than one cloud provider – mainly GCP, but also AWS and Alibaba Cloud
- Care passionately about efficiency – in your code, in system performance, in procedure
- Prefer Linux as your platform of choice, but are open to working wherever you need to
- Have worked on GCP before and are familiar with instance management, networking, IAM, and generally understand how it all works
Working at DigitalEd
DigitalEd is committed to providing every employee with professional growth opportunities, a supportive work environment, excellent compensation, and benefits. Our Waterloo, Ontario office provides a corporate concierge service and access to a fully equipped fitness facility.
If this sounds like your dream job, apply today! We’re waiting for you.