Expert Site Reliability Engineer
Software Engineering - Amsterdam, The Netherlands

At TomTom, we’re on a mission to create the most innovative technologies to shape tomorrow’s mobility. For this purpose, we’re hiring a Site Reliability Engineer to perform a critical role that will contribute to our success in the world of online services.

The SRE team at TomTom brings software and system engineering skills together. We code our way out of operational problems, working with internal and external teams to build resilient, scalable and reliable systems in order to deliver services of the highest quality to our customers.
This is a unique chance to solve challenging problems, dig deep and troubleshoot complex systems, learn from incidents, work across the stack to drive the reliability of our services, and have a real impact on global mobility every day.
This position is based in our offices in Amsterdam, Netherlands or Berlin, Germany.

Responsibilities
  • Work with peers and partners to shape the architecture, design, and implementation of new and existing systems to enhance their reliability, efficiency and scalability.

  • Act as Incident Commander in high-priority incidents, getting hands-on when required to improve the TTR.

  • Apply resiliency engineering discipline to avoid recurrence of incidents, and drive incident response, analysis and remediation.

  • Ensure that critical services have a properly configured monitoring and alerting setup and that operational hygiene is applied to guarantee their continuity.

  • Provide observability expertise and enablement to the engineering teams.

  • Design, write and maintain software to improve the performance of services and the connected operational profile.

  • Be part of a collaborative environment, working together across many teams to ensure that systems are performing as well as possible.

  • Shift left operations and support the growing autonomy of our DevOps teams.

  • Support the definition of the SRE strategy and roadmap.

Requirements
  • You have proven experience developing, operating, and troubleshooting large-scale production systems running on public cloud infrastructure (e.g. AWS, Azure) and Kubernetes

  • You are comfortable with the concepts of infrastructure as code, configuration management, and related technologies

  • You have a deep understanding of observability concepts and experience with the technologies, tools and services related to its pillars

  • You have proven experience with the incident management process, incident debriefing, and deriving corrective and preventive actions

  • You have experience as a backend developer

  • You have a good understanding of Unix/Linux system internals

  • You have a good understanding of SLI, SLO, and SLA concepts

  • You are passionate about automation and eliminating toil
     

We consider favourably
  • Experience mentoring, sharing knowledge and acting as a development advocate 

  • Experience with tools such as Scalyr, Grafana Cloud, Prometheus, OpenTelemetry

  • Experience with Java, Spring Boot and the JVM

  • Expert-level certifications for AWS, Azure

Achieve more
We are self-starters who play well with others. Every day, we solve new problems with creativity, meet new people and learn rapidly at our offices around the world. We will invest in your growth and are committed to supporting you. In everything we do, we’re guided by six values: We care, putting our heart into what we do; we build trust (you can count on us); we create – driven to make a difference; we are confident, but don’t boast; we keep it simple, since life is complex enough; and we have fun because life’s too short to be boring. 

After you apply
Our recruitment team will work hard to give you a meaningful experience throughout the process, no matter the outcome. Your application will be screened closely and you can rest assured that all follow-up actions will be thorough, from assessments and interviews through your onboarding.

TomTom is an equal opportunity employer
We celebrate diversity, thrive on each other’s differences and are committed to creating an inclusive environment at our offices around the world. Naturally, we do not discriminate against any employee or job applicant because of race, religion, color, sexual orientation, gender, gender identity or expression, marital status, disability, national origin, genetics, or age.

Ready to move the world forward?

More challenge, more growth
Join our hackathons, developer days, leadership programs and more.
Unlock your creativity
We have an agile work culture, entrepreneurial spirit and involved founders.
Together keeping the world moving
4,500+ people in 41 offices in 29 countries.