
Product Lead – Inference
Location: San Francisco (Onsite)
Employment Type: Full-Time
A well-funded, fast-growing AI company is building a next-generation platform for safe, performant, and cost-efficient deployment of AI agents across enterprise environments. Its team of top researchers, engineers, and product leaders has developed proprietary multi-model architectures designed to reduce hallucinations and improve reliability at scale.
The company recently closed a major funding round from leading institutional investors, bringing total funding to over $400M and valuing the business north of $3B. It is now expanding its platform team to continue scaling its custom LLMs, inference infrastructure, and cloud-native agent tooling.
As Product Lead for the Inference Platform, you’ll own the roadmap and execution for the infrastructure powering model deployment, orchestration, and usage across multiple cloud environments. This is a highly cross-functional individual contributor (IC) role with visibility across engineering, research, and go-to-market.
You’ll be responsible for defining scalable, high-performance systems that support rapid model experimentation, SaaS application launches, and cloud cost optimization. The role is ideal for someone who thrives in a technically complex environment and wants to shape the underlying foundation of production-grade AI products.
Responsibilities:
Own product strategy for the multi-cloud inference platform and agent hosting systems
Collaborate with research and infrastructure engineering to forecast and scale model and application capacity
Monitor and optimize usage, latency, and cost across LLM and voice inference workloads
Drive decisions around GPU allocation, cloud cost efficiency, and workload orchestration
Define internal tools to support evaluation, logging, and performance observability
Work closely with GTM and operations to align platform performance with business goals
Partner with finance and leadership on pricing and margin strategy across the agent stack
Must Have:
7+ years of product management experience
2+ years building AI/ML platforms, LLMOps tooling, or infrastructure products
Deep understanding of inference, training, and cloud compute (AWS/GCP/Azure)
Experience aligning Eng, Research, and GTM around complex technical products
Familiarity with cloud cost modeling, GPU orchestration, and workload optimization
Analytical mindset, strong execution, and bias toward measurable outcomes
Nice to Have:
Background in distributed systems, model evaluation, or GPU infrastructure
Experience launching dev tooling or internal platforms for AI teams
Prior work with LLMs, voice agents, or AI-native applications
Strong technical intuition and hands-on engineering exposure