Job Title: Machine Learning Systems Engineer
Salary: $250K - $300K + Equity
Location: Bay Area, CA
An AI startup is seeking a Senior Systems Engineer to optimize deep learning performance at scale. In this role, you’ll work at the intersection of systems, infrastructure, and machine learning, driving improvements across model training, inference, and distributed compute environments. You’ll focus on kernel-level optimization, GPU/accelerator efficiency, and deep framework tuning to push the boundaries of performance for next-generation AI workloads.
As part of a high-impact team, you’ll wear multiple hats in a startup environment and contribute across large-scale data processing, model parallelism, and runtime efficiency. The role requires expertise in CUDA/Triton, PyTorch internals, and distributed training systems, with the ability to diagnose and optimize performance bottlenecks across kernels, frameworks, and clusters. If you thrive on accelerating training and inference performance at scale, this is a chance to make a major impact.
Requirements:
3+ years of systems-level engineering experience in deep learning environments
Strong Python development and debugging skills
Kernel optimization (parallelization, performance tuning)
Familiarity with GPU and AI accelerator compute models
Large-scale distributed training (diagnosing bottlenecks in clusters)
PyTorch framework optimization and runtime improvements
Deep understanding of CUDA, Triton, and related internals
We make an active choice to be inclusive towards everyone every day. Please let us know if you require any accessibility adjustments during the application or interview process.
Our mission is to empower every person, regardless of their background or circumstances, with an equitable chance to achieve the careers they deserve. Building a diverse future, one placement at a time.