Forward-Deployed AI Engineer

San Francisco, CA | In-person | Full-time | 3-7+ years

About

About Judgment Labs

Judgment Labs builds infrastructure for Agent Behavior Monitoring (ABM). While traditional observability focuses on logging exceptions and latency, ABM surfaces behavioral anomalies such as instruction drift and context retrieval loss in scaled production environments. Hundreds of teams building autonomous agents rely on Judgment to understand how their systems behave after deployment. The team has raised $30M+ across two rounds in the past five months. Investors include Lightspeed, SV Angel, Valor Equity Partners, Nova Global, Chris Manning, Michael Ovitz, Michael Abbott, Cory Levy, and Kevin Hartz. The team is under 20 people and ships at the velocity of a 50+ person company. It includes Olympiad medalists, debate champions, and competitive athletes who bring that same intensity to company building. Everyone is either an ex-founder or a founder-to-be.

About the role

Judgment Labs is hiring a Forward-Deployed AI Engineer to embed the ABM platform directly into customers' production systems. You will work inside customer codebases to integrate monitoring and evaluation into real agent workflows, diagnose failures in live environments, and drive deployments to reliable production use. This is a role of deep technical execution and customer ownership. Because the work is so technically demanding, we are prioritizing strong software engineers who can flex into a forward-deployed role over candidates with a purely solutions or forward-deployed background.

You will work directly with customer teams to reason about agent behavior, translate high-level goals into concrete ABM deployments, and own outcomes end to end across real production environments. The scope, judgment, and autonomy required make the role a training ground for founding or leading a technical company. The seat is heavily customer-facing. Strong communication (explaining complex technical concepts clearly, building trust with both technical and non-technical stakeholders, and driving long-term adoption) is as critical as engineering depth.

What you'll own

Deploy and embed Judgment Labs' ABM platform and AI components directly into customer codebases and production AI systems
Work inside customer systems to integrate monitoring, evaluation, and agent-facing components into real workflows
Guide customers through technical decisions around agent monitoring, evaluation strategy, and integrating these capabilities into existing production systems
Own multiple customer engagements end-to-end, ensuring successful integration and sustained adoption of monitoring and evaluation systems within production agent workflows
Translate customer feedback into roadmap input that shapes the next set of features

Requirements

Must-have

3-7 years of software engineering experience, with the product-engineering ability to ship features end to end inside customer codebases
Some form of applied AI experience. Direct agent or evals experience is not mandatory; a strong engineer with AI-adjacent exposure is preferred over a pure solutions hire
Strong customer-facing communication skills: explains complex technical concepts clearly, builds trust with technical and non-technical stakeholders
Comfort deploying AI or LLM-based systems into real production environments
Ability to translate ambiguous customer goals into concrete technical solutions
The Forward-Deployed Engineer role is your genuine first choice, not a fallback for another role
Based in SF or willing to relocate. 5 days in person.
Wants to be a technical founder in the future

Nice-to-have

Strong engineering pedigree from a solid product company (e.g. Roblox, Snapchat, Tesla). A top-tier infra background (e.g. Databricks) is a bonus, not a requirement
Prior agent or applied-AI work
Prior evals, observability, or behavior-monitoring product experience (heavily preferred)
Prior forward-deployed or solutions engineering experience
Prior 0-1 startup experience as an early hire
Founder background or strong founder-to-be signal
4-5+ years of experience: candidates at this level tend to perform best (stronger communication and interview pass rate); exceptionally strong candidates 1-2 years out of school are still considered case by case
Comfort across the full stack, weighted toward backend/infra

Compensation

Junior level (3 years experience): $200K
Mid to senior level (3-7 years experience): up to $300K
Exception: exceptional AI-savvy FDE leads can reach $400K on a case-by-case basis

Benefits & perks

Full benefits package
Equinox membership
Private chef
Competitive compensation
Direct customer interface and influence on product roadmap

Interview process

1. Application review
2. Recruiter screen
3. Evals FDE round
4. Booked evals round
5. Coding IQ round
6. Booked IQ round
7. FDE onsite
8. Offer
