Site Reliability Engineering Manager
Software Engineering - The Netherlands, Amsterdam

At TomTom…

You’ll move the world forward. Every day, we create the most innovative mapping and location technologies to shape tomorrow’s mobility for the better.

We are proud to be one team of more than 5,000 unique, curious, passionate problem-solvers spread across the world. We bring out the best in each other. And together, we help the automotive industry, businesses, developers, drivers, citizens and cities move towards a safe, autonomous world that is free of congestion and emissions.

TomTom is hiring a Site Reliability Engineering Manager, a critical role that will contribute to TomTom’s success in the online services world. As a Site Reliability Engineering Manager, you will manage your team and partner with internal and external engineering teams to build reliable infrastructure, along with the automation and tooling to support it. You will have the chance to work through challenging scaling issues, dig deep to debug the hardest technical problems, solve problems across the stack, and drive unmatched reliability of our systems. You will be the conduit between your SRE team and the business.


What you’ll do

  • Work with partners to shape the architecture, design, and implementation of new and existing systems to enhance their reliability, efficiency, and scalability;
  • Manage a team of SREs who take the Incident Commander role in P1 incidents, getting hands-on with the platforms while improving mean time to resolve (MTTR) where required;
  • Apply software engineering disciplines to prevent incident recurrence;
  • Ensure all key services are measured, monitored, and properly alerting;
  • Work in a collaborative environment across many teams to ensure our systems are performing as well as possible;
  • Design, write and maintain software to improve the availability, scalability, latency, and efficiency of the services;
  • Engage in service capacity planning and demand forecasting, anticipating performance bottlenecks;
  • Conduct RFPs to standardize tooling across TomTom’s engineering teams;
  • Define the SRE strategy and roadmap;
  • Shift proactive operations (observability/assurance) left, so they are addressed earlier in the development lifecycle.


What you’ll need

  • A minimum of 8 years of industry experience in software engineering & SRE, and a minimum of 2-3 years as a manager of an SRE team;
  • Experience supporting a service-providing platform, with an appreciation of the importance of availability and predictability;
  • Good knowledge of Kubernetes, Prometheus, and Terraform;
  • Strong scripting skills (Bash, Python, Ansible);
  • Demonstrated experience of coding in production environments;
  • Experience working with AWS, Azure or a similar cloud environment;
  • Experience coding in higher-level languages (e.g. Python, Java, C++);
  • Experience in the Linux environment and a good understanding of its fundamentals and internals: filesystems and modern memory management, threads and processes, the user/kernel-space divide, etc.;
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems;
  • Unending curiosity and a tenacious desire to get to the bottom of technical problems;
  • Excellent written and oral communication skills with technical and non-technical stakeholders;
  • Ability to take initiative and collaborate with the rest of the team(s);
  • Bachelor’s degree in computer science/engineering or equivalent.


What’s nice to have

  • Experience with network protocols and theory (TCP/IP, UDP, ICMP, MAC addresses, IP packets, DNS, load balancing, etc.).


Meet your team

Our team is part of the core TomTom live services. We connect with all DevOps teams to make sure the customer experience is as good as possible during an incident and to minimize the mean time to resolve (MTTR). We also focus on reducing the number of incidents by participating in improvement actions focused on automation and reliability.

Our Site Reliability Engineers (SREs) are a hybrid of software and systems engineers. We code our way out of operational problems. We are responsible for reliability, scalability, and automation while keeping an eye on latency, performance, and capacity as well as other KPIs.
 

Achieve more

We are self-starters who play well with others. Every day, we solve new problems with creativity, meet new people and learn rapidly at our offices around the world. We will invest in your growth and are committed to supporting you. In everything we do, we’re guided by six values: We care, putting our heart into what we do; we build trust (you can count on us); we create – it’s how we make a difference; we are confident, but don’t boast; we keep it simple because life is complex enough; and we have fun because life’s too short to be boring. 


Ready to move the world forward?
 

After you apply

Our recruitment team will work hard to give you a meaningful experience throughout the process, no matter the outcome. Your application will be screened closely, and you can rest assured that all follow-up actions will be thorough, from assessments and interviews through onboarding.


TomTom is an equal opportunity employer

TomTom is an equal opportunity employer. We celebrate diversity, thrive on each other’s differences and are committed to creating an inclusive environment at our offices around the world. Naturally, we do not discriminate against any employee or job applicant because of race, religion, color, sexual orientation, gender, gender identity or expression, marital status, disability, national origin, genetics, or age.

More challenge, more growth
Join our hackathons, developer days, leadership programs and more.
Unlock your creativity
We have an agile work culture, entrepreneurial spirit and involved founders.
Together keeping the world moving
4,500+ people in 41 offices in 29 countries.
