
Staff Backend Engineer – Generative AI Platform (Stealth Infra AI Scale-Up)
Location: Onsite (Bay Area preferred) | Full-Time
Compensation: $240K–$275K base + significant start-up equity + full benefits package
About the Role
Our client is building a next-gen platform at the intersection of generative AI, applied research, and real-world infrastructure. As a Staff Backend Engineer, you’ll join a high-caliber team of engineers, researchers, and systems experts working on the foundational systems behind cutting-edge generative applications. Your focus will be designing and scaling backend infrastructure that keeps high-throughput AI systems secure and reliable in production.
Key Responsibilities
Architect and build backend infrastructure that supports large-scale generative AI workloads and distributed applications
Design and optimize APIs and microservices to serve real-time inference, model orchestration, and data flow at scale
Own CI/CD pipelines, service monitoring, and incident response to maintain high system uptime and reliability
Collaborate cross-functionally with applied ML, research, and product teams to translate technical needs into production-grade systems
Implement data integrity safeguards, security measures, and access controls across infrastructure components
Build and scale data pipelines capable of handling high-volume input and retrieval tasks across structured and unstructured sources
Mentor other engineers and contribute to architectural decisions across the platform
Core Requirements
7+ years of backend engineering experience, ideally in high-performance, distributed systems environments
Strong coding proficiency in Python and either Java or Go
Deep experience with cloud-native architecture (e.g. Kubernetes, container orchestration, CI/CD, service meshes)
Hands-on experience with one or more cloud providers (AWS, GCP, or Azure); GCP or Azure preferred
Proven track record of building scalable infrastructure from the ground up (not just integrating into an existing stack)
Familiarity with data modeling, security, and access control for sensitive or regulated data domains
Nice to Have
Experience with LangChain or similar frameworks for generative AI applications
Exposure to vector DBs, pub/sub systems (Kafka, Redis Streams), or RAG infrastructure
Background building infrastructure for domains with compliance or high-reliability requirements