Closes in 1 day

Senior Site Reliability Engineer

Paris, France · Competitive · Remote · 0 applicants

About this role

At Algolia, we’re proud to be a pioneer and market leader in AI Search, empowering 17,000+ businesses to deliver blazing-fast, predictive search and browse experiences at internet scale. Every week, we power over 30 billion search requests — four times more than Microsoft Bing, Yahoo, Baidu, Yandex, and DuckDuckGo combined.

In 2021, we raised $150 million in Series D funding, quadrupling our valuation to $2.25 billion. This strong foundation enables us to keep investing in our market-leading platform and serving incredible customers like Under Armour, PetSmart, Stripe, Gymshark, and Walgreens.

Algolia is set to enable every company to create world-class Search and Discovery experiences with an API-first approach. Performance and scalability are at the heart of our mission: we power 1.5 trillion searches a year for 10K+ customers all over the world.

If you're a problem solver, able to think outside the box and eager to nurture others and learn from them, then this is your challenge!

The Team

The Fleet team is a Site Reliability Engineering team focused on one goal: the Search products should always be available. To make this possible, the Fleet team creates pragmatic solutions to optimize the Search products' availability and costs at scale, taking into account the needs of customers, the product teams, and the many engineering teams involved in delivering a unique Search experience to our customers.

The Opportunity

The team is looking for an individual who has prior experience building and operating scalable architectures. You will contribute to the delivery of solutions that support other engineering teams and will have a direct impact on the success of Algolia's Search products.

In this role, you'll help design and implement systems focused on reliability, scalability, and cost efficiency, while also having opportunities to grow your skills and collaborate with team members.

Your role will include

Operating the Search products, building self-healing and automated incident response mechanisms

Building components that improve reliability and performance

Monitoring and computing the SLO and the error budget of the product you operate

Reducing the toil and the technical debt by automating tasks and increasing the quality of existing components

Managing Incidents and Customer Requests

You might be a good fit if you have

5 years of experience in a scalable environment

Knowledge of at least one programming language (Python, Golang, Ruby) and familiarity with software craftsmanship

Experience working with APIs

A focus on designing reliable, operable, and highly available applications

Familiarity with at least one public cloud provider, such as GCP, AWS, or Microsoft Azure, and its Kubernetes service

A good understanding of Linux system administration, networking, and troubleshooting

Strong communication and organizational skills

Team’s current stack:

Programming languages: Golang, Python, Ruby

CI/CD: GitHub Actions, CircleCI

IaC & configuration management: Terraform, Chef

Platform: Linux, Kubernetes

Hosting: Bare Metal Servers & Cloud on AWS & Azure

Monitoring: Datadog & custom monitoring stack for our Search infrastructure

GRIT - Problem-solving and perseverance capability in an ever-changing and growing environment.
TRUST - Willingness to trust our co-workers and to take ownership.
CANDOR - Ability to receive and give constructive feedback.
CARE - Genuine care about other team members, our clients and the decisions we make in the company.
HUMILITY - Aptitude for learning from others, putting ego aside.
Our open positions may appear on third-party job boards, but the best way to apply safely is directly through our careers page.
All genuine communication from Algolia will come from an @algolia.com email address. If you receive an email from someone claiming to work at Algolia who does not have an @algolia.com email address, please do not respond or share any personal information.
We’ll never ask for payments, purchases, or financial details during the hiring process.
