About Us
Graphcore is one of the world’s leading innovators in Artificial Intelligence compute. It is developing the hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry. As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.

Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.

Job Summary

We are looking for a Software Engineering Intern to join a team pioneering the development of high-performance machine learning (ML) kernels for a new generation of AI hardware. In this role, you will contribute to building optimised compute kernels that support a wide range of ML operators, powering applications from convolutional neural networks (CNNs) to large language models (LLMs). You will leverage low-level programming and hardware-aware optimisation techniques to extract maximum performance and efficiency from modern accelerators. This is a unique opportunity to work at the intersection of ML, numerical computing and scalable systems.

The Team

This is an exciting opportunity to join an expanding team at Graphcore. The Kernel Engineering team is responsible for delivering high-performance compute libraries that help customers gain the maximum performance from AI hardware.
Responsibilities and Duties

- Support the design and implementation of kernels for linear algebra and tensor operations (GEMM, batched GEMM, convolutions, reductions, elementwise and fused operations) in C++
- Profile and optimise for the next generation of AI hardware: threading, cache locality, memory layout and kernel launch efficiency
- Support performance and correctness: add microbenchmarks, regression tests and numerics validation
- Debug issues, resolve bugs and generally improve the quality and functionality of the product

You are open-minded and collaborative, with interests in performance optimisation and memory-efficient designs, and you are looking to join a team of experts. You are comfortable discussing technical trade-offs, receiving feedback and iterating on solutions, and you are drawn to technically challenging problems, using analytical reasoning to navigate unfamiliar domains.
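To give a flavour of the cache-locality work described above, here is a minimal, hypothetical sketch of a cache-blocked GEMM in C++. It is an illustration only (the function name, block size and layout are assumptions, not Graphcore library code): tiling the loops keeps small blocks of A and B resident in cache while the inner loops reuse them.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative cache-blocked GEMM sketch: C += A * B for square, row-major
// N x N matrices. Hypothetical example code, not a production kernel.
void gemm_blocked(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, std::size_t N, std::size_t BS = 32) {
    for (std::size_t i0 = 0; i0 < N; i0 += BS)
        for (std::size_t k0 = 0; k0 < N; k0 += BS)
            for (std::size_t j0 = 0; j0 < N; j0 += BS)
                // Work one BS x BS tile at a time so the A and B blocks
                // being reused stay cache-resident.
                for (std::size_t i = i0; i < std::min(i0 + BS, N); ++i)
                    for (std::size_t k = k0; k < std::min(k0 + BS, N); ++k) {
                        const float a = A[i * N + k];  // reused across j
                        for (std::size_t j = j0; j < std::min(j0 + BS, N); ++j)
                            C[i * N + j] += a * B[k * N + j];
                    }
}
```

In practice the block size would be tuned to the target's cache hierarchy, and production kernels would add vectorisation, threading and layout transformations on top of this basic tiling.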