Site Reliability Engineer
    What you'll do

    • Work with partners to shape the architecture, design, and implementation of new and existing systems to enhance their reliability, efficiency and scalability.

    • Assume the role of Incident Commander in high priority incidents, getting hands-on when required to improve the TTR.

    • Apply resiliency engineering disciplines to prevent incidents from recurring: drive incident response, analysis and remediation.

    • Ensure that critical services have a properly configured monitoring and alerting setup and that operational hygiene is applied to guarantee their continuity.

    • Design, write and maintain software to improve the performance of services and the connected operational profile.

    • Actively develop and maintain the Observability Platform that teams at TomTom rely on to monitor their services and aid them during incident response.

    • Be part of a collaborative environment, working together across many teams to ensure that systems are performing as well as possible.

    • Shift left operations and support the growing autonomy of our DevOps teams.

    • Support the definition of the SRE strategy and roadmap.

    What you'll need

    • 5+ years of working experience in a production environment, covering software and system engineering.

    • 3+ years of production experience operating Linux systems on cloud or bare metal, covering infrastructure as code, configuration management and monitoring.

    • Extensive experience designing, developing, troubleshooting and evolving large scale distributed systems.

    • Proficiency in Java, C++, Python or any other modern programming language.

    • Good understanding of Unix/Linux systems internals (e.g. memory management, file systems, threads and processes, system calls).

    • Good understanding of networking protocols and theory (e.g. TCP/IP, UDP, DNS, HTTP/HTTPS).

    • Experience working with AWS, Azure or a similar cloud environment at scale. 

    • Excellent written and oral communication skills, and the ability to collaborate successfully with technical and non-technical stakeholders.

    • Ability to establish successful mentorship relationships with colleagues, expressing technical leadership without pulling rank and role modeling the SRE principles.

    • Business acumen, ability to prioritize high ROI work, strong sense of ownership.

    What's nice to have

    • Experience working with Kubernetes and Prometheus in production.

    • Experience operating mission-critical SaaS workloads on large-scale cloud infrastructure.

    After you apply

    1. First call
    If your application matches the role, then it’s time to put a voice to the name! We’ll call you to set up an interview.

    2. First interview
    In this interview, we want to know more about you – what excites you about location technology and how you can help us solve global challenges.

    3. Online assessment
    We’ll set you an assignment - use your expertise to show us what you’ve got.

    4. Second interview
    We'll dive into your potential role, showing you how you’ll fit into your team and contribute to our vision.

    5. The final decision
    Cue the fireworks, because we’ll start the onboarding!
    More challenge, more growth
    Join our hackathons, developer days, leadership programs and more.
    Unlock your creativity
    We have an agile work culture, entrepreneurial spirit and involved founders.
    Together keeping the world moving
    4,500+ people in 30+ offices across 20+ countries.