Research Engineer
Chinatown, CA | In-person | Full-Time | 3 - 7 years
About Judgment Labs
Judgment Labs builds infrastructure to monitor and evaluate how AI agents behave in production, helping teams catch failures like hallucinations, drift, or bad decisions in real time. It turns real usage data into scoring and feedback loops so teams can continuously improve agent reliability, performance, and decision-making at scale.
Requirements
Responsibilities
- Build high-taste agent products that pair powerful behavior with consumer-grade UX polish
- Design and ship agent capabilities from 0 to 1 inside a fast-scaling product surface
- Contribute across the full stack as needed, with the majority of work focused on agent infrastructure and product
- Translate customer feedback on agent behavior into concrete product iterations
- Help raise the bar on product taste and craft as the team grows past 19 people
Must-have
- 3-7 years of engineering experience
- Prior experience designing and building agents in production (applied AI or agent engineering background), not hackathon projects
- Solid product engineering skills; ships features end-to-end
- Strong customer-facing communication skills: comfortable explaining complex technical concepts and building trust with technical and non-technical stakeholders
- Comfort with 0-1 product work and ambiguity
- Basic full-stack engineering ability
- Based in SF or willing to relocate; 5 days a week in the FiDi office
Nice-to-have
- Prior evals, observability, or behavior-monitoring product experience (heavily preferred)
- TypeScript fluency
- Prior Series A or seed startup experience
- Strong design taste at consumer-grade product polish levels
- Front-end or design engineering range on top of agent work
Benefits & perks
- Full benefits package
- Equinox membership
- Private chef
- Work with cutting-edge AI infrastructure
- Direct customer interaction and feedback
- Fast track to founding experience
Interview process
- 1. Application Review
- 2. Founder vibe check + optional 15-minute deeper dive into technical projects
- 3. Technical Interview: problem-solving + role-specific interview
- 4. Work Trial
- 5. Offer
- 6. Hired