Site Reliability Engineer
- Job Function: Software Development
- Job Type: Full time
- Company Size: Scaling (20-499)
Digital-who? If you’ve never heard of us, it’s because we’re new!
Well, new-ish. DigitalEd is a spin-off from Maplesoft, a Waterloo-based mathematics company with 30 years of experience. Our focus is to create digital technology that enhances online STEM education. With the energy and excitement of a startup and a 15-year presence in the education market, we’re poised to make a serious contribution to improving education worldwide.
So if you want to make a difference and help improve the future of online education, come join our team.
Who are you?
We are looking for full-time Site Reliability Engineers in Waterloo or working remotely.
In this role, you will design, build, deploy and manage DigitalEd’s internal Private Cloud infrastructure as well as our customer-facing Google Public Cloud SaaS application infrastructure. You will also create the tooling and automation needed to enable our teams and give our customers a compelling experience.
The most important aspect of this team is that we’re never satisfied with the status quo. We continually learn, adapt and extend. DigitalEd is an agile shop, so we expect you to get things done by experimenting, failing, learning and most importantly, pushing forward.
This job might be for you if you:
- Can write code in any language (shell scripting, Ruby, Python, Java, whatever; ideally more than one) and have deployed some of your work to a production environment
- Have at least a few years’ experience in an operations role, whether that be DevOps, SRE or traditional network/server management
- Are familiar with at least one configuration management system: Puppet, Chef, Ansible or similar
- Believe that a happy user is one who doesn’t have to call you with problems or to report errors
- Are comfortable communicating with people at all levels of the company who are spread all over the world
- Understand that a job is not finished until it’s tested, documented and highly automated
- Know that occasionally, you’ll probably be fixing something from home in your PJs
- Have worked on 24x7x365 production systems across multiple sites and geographies powering business-critical services, including deployment, maintenance, troubleshooting, performance tuning, and security
- Know how to ensure there is proper monitoring, alerting, capacity planning and reporting in the production environment
- Can contribute to the evolving design and architecture of reliable and scalable infrastructure
- Have collaborated with product engineering teams to ensure Operations standards are observed, determine resource impacts for upcoming product deployments, and ensure successful product rollouts
- Have developed processes, tools, and documentation in support of production operations
You’re practically a shoo-in if you:
- Have experience with container orchestration systems like Docker Swarm or Kubernetes
- Are familiar with more than one cloud provider – mainly GCP, but also AWS and Alibaba Cloud
- Care passionately about efficiency – in your code, in system performance, in procedure
- Prefer Linux as your platform of choice, but are open to working wherever you need to
- Have worked on GCP before and are familiar with instance management, networking, IAM, and generally understand how it all works
Working at DigitalEd
DigitalEd is committed to providing every employee with professional growth opportunities, a supportive work environment, excellent compensation, and benefits. Our Waterloo, Ontario office provides a corporate concierge service and access to a fully equipped fitness facility.
If this sounds like your dream job, apply today! We’re waiting for you.