Polymath · B2B

AI Research Resident

ML / AI · SF · Mid · Seed

About Polymath
Polymath is an applied research lab focused on advancing long-horizon agent capabilities through reinforcement learning. We design and scale simulation environments where agents learn to operate safely and autonomously. We work with the world’s leading model labs to push the frontier of agent capabilities. Polymath is backed by Base10, Founders Future, Y Combinator, and other incredible investors & angels. We've raised an $8M seed round and are growing the team.

About the role

We’re looking for talented researchers currently enrolled in MS or PhD programs to collaborate on a research project on frontier benchmarks and environments for long-horizon AI agents. The work involves 1) identifying failure modes in frontier models, 2) developing rigorous benchmarks that evaluate how well frontier agents perform on complex, realistic tasks requiring long-horizon reasoning and tool use in dynamic environments, and 3) training autonomous agents that can reason, plan, and act over extended time horizons.

We can accommodate full-time or part-time engagements. The residency is intended to culminate in a publication and, if there is a mutual fit, a transition into a full-time role. If you’re interested in joining Polymath but are not currently a student, please apply to the Member of Technical Staff role.

You’ll be a good fit if you:

Are currently pursuing an MS or PhD in Computer Science or a related field
Have experience with reinforcement learning, benchmarking frontier models, or model post-training
Have experience with systems engineering and can write production-quality code
Have a strong track record of publications
Have high agency, move quickly, and enjoy working on open-ended research problems

Culture

Polymath is a team of researchers, engineers, and operators focused on advancing the frontier of safe, superintelligent AI agents.

We have a flat organizational structure. We believe that people do their best work when they’re self-motivated and driven by a desire to learn, contribute to the team’s goals, and advance scientific progress.
We’re looking for folks who ship fast, set high standards for themselves, and are great team players.
