Member of Technical Staff (Software Engineer, Data Flywheel)
Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and specialized data sources. The Answer Quality team ensures that our prompts, tools, search, and specialized datasets, combined with both frontier and in-house models, create the best possible experience for our users. As our product evolves, our evaluations must remain fast, accurate, and actionable. In this role, you will build the data flywheel that serves teams across Perplexity.
Responsibilities
Build the systems and pipelines that enable Search, Product, and other teams to independently access and utilize reliable eval verdicts without bottlenecks
Take ownership of the "evals-to-product" loop, autonomously determining the best way to turn raw signals into durable datasets that power decision-making across the company
Build a robust simulator pipeline capable of replaying user interactions with the product in formats legible to LLMs and VLMs, reflecting product changes as they are shipped
Maintain data trust by implementing monitoring, lineage, and quality checks, ensuring downstream consumers can rely on the results implicitly
Operate in a small, high-impact team where your work directly shapes how Perplexity measures and improves Answer Quality
Qualifications
3+ years of software engineering experience shipping production systems
Strong proficiency in Python and SQL with the ability to write production-grade, maintainable code
Experience with big data systems including distributed compute and large-scale storage
Solid fundamentals in data modeling, system design, and debugging distributed systems
Experience with AWS and with lakehouse or distributed-compute ecosystems such as Databricks or Spark
Comfortable with agentic coding workflows and using AI-assisted development tools to iterate faster
Preferred Qualifications
Data engineering background including pipelines, orchestration, and warehousing patterns
Familiarity with LLM/VLM interfaces, tokenization, structured formats, and multimodal payloads
Experience with evaluation platforms, experimentation systems, or machine learning infrastructure
Prior work supporting customer-facing products at scale