Senior Systems Engineer – Performance & Reliability (Analysis)

Gdańsk, Pomeranian Voivodeship, Poland

About this role

Salary Range: PLN 350,700 - 474,400 + Benefits + Equity

Salary is subject to alignment with the responsibilities and duties of the role.

Requirements

  • Strong software engineering experience, typically gained across multiple projects or systems over several years
  • Experience working in Linux-based environments, ideally with distributed or high-performance systems
  • Proficiency in Python
  • Experience with automation and CI/CD systems (e.g. GitLab CI, Jenkins, GitHub Actions)
  • Ability to design, implement, and run experiments or tests that produce meaningful results
  • Ability to interpret results and communicate findings clearly, with an emphasis on accuracy and usefulness to decision-making
  • Comfortable working in areas where requirements are not fully defined and judgement is required

Nice to have

  • Experience working with large-scale or distributed systems (e.g. clusters, cloud platforms, HPC environments)
  • Experience with performance, reliability, or systems-level testing/measurement
  • Familiarity with pytest or similar frameworks for structured test/measurement execution
  • Experience analysing system behaviour under load (compute, network, or ML workloads)
  • Experience working with containerisation, orchestration, or provisioning systems (e.g. Docker, Kubernetes, OpenStack)
  • Proficiency in other applications programming languages (e.g. C++)
  • Exposure to data analysis, statistics, or interpreting variability in results

About Graphcore

At Graphcore, we’re building the future of AI compute. We’re a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.

Job Summary

We turn measurements from large-scale systems into engineering decisions. Our team runs workloads on Linux clusters (from rack scale upwards) and collects detailed performance and reliability data. The key challenge is interpreting these results correctly and deciding whether a system is ready for production.

You will work on:

  • Analysing results from measurements of distributed systems
  • Understanding performance variability and repeatability
  • Defining what “normal” and “acceptable” system behaviour looks like

Typical work includes:

  • Working with measurement data from compute, network, and ML workloads
  • Analysing results produced by automated test frameworks (e.g. pytest-based systems)
  • Comparing results across runs, configurations, and system scales
  • Helping define thresholds for pass/fail decisions

You may also:

  • Influence how measurements are designed to produce better data
  • Improve how results are stored, queried, and interpreted

This is not a traditional data analysis or BI role. The focus is on understanding system behaviour and supporting engineering decisions.

We are looking for engineers who:

  • Are comfortable working with real-world, imperfect data
  • Can reason about distributed systems performance
  • Focus on evidence and correctness rather than presentation

Selection criteria

Our engineers typically bring significant practical experience and sound engineering judgement. Depth in one area is valued, but the ability to work across boundaries is equally important.

Job Details

Posted: 7 May 2026
Closes: 6 June 2026
