Software Engineer - AI Data & Evaluation

1 day ago

Mercor

San Francisco, US · Full-time · $130,000 – $500,000

About this role

Mercor partners with leading AI labs to deliver the human intelligence that trains frontier models. As a Software Engineer on the AI Data & Evaluation team, you build the data infrastructure and evaluation systems that power next-generation AI development. The work centers on creating high-quality data types that advance model capabilities across the industry.

You design evaluation methodologies and flywheels that drive continuous improvements in data quality and model performance. You engineer synthetic data pipelines and simulation environments that produce high-signal training data at scale, and you architect operational automation systems that maintain precision and efficiency across the full data pipeline.

Engineers on the team operate as builders and innovators at the intersection of data engineering, systems design, and applied AI research. You collaborate with Operations, Research, and Product teams to translate evolving model needs into scalable solutions. The environment is fast-paced and rewards a product-oriented mindset and a bias toward shipping.

Your work has direct impact: it shapes the data that powers leading AI labs' frontier models. You get early exposure to cutting-edge capabilities months before they reach the market. You grow through ownership at the intersection of data engineering and AI research, with clear paths to leadership.

Requirements

Strong software engineering skills with a proven track record of shipping production systems end-to-end.
Deep interest in and experience with AI/ML data pipelines, evaluation frameworks, or training data systems.
Systems thinking: ability to design for scalability, quality, and operational reliability simultaneously.
Comfort operating with ownership and pragmatism in fast-moving, ambiguous environments.
Effective communication and collaboration with engineering, research, and operations teams.
Experience with synthetic data generation, reinforcement learning environments, or large-scale data quality systems is highly valued.

Responsibilities

Develop novel evaluation methodologies and flywheels that continuously improve data quality and model performance at scale.
Design and build synthetic data generation systems and simulation environments that produce high-signal, high-diversity training data.
Architect and ship operational automation systems that maximize throughput, efficiency, and quality across the end-to-end data pipeline.
Collaborate cross-functionally with Operations, Research, and Product to translate model needs into robust engineering solutions.
Own end-to-end delivery of critical systems from prototyping novel ideas to scaling production infrastructure.

Benefits

Impact: Your work directly shapes the quality of data powering the world's leading AI labs' frontier models.
Learning: Get early, first-hand exposure to cutting-edge model capabilities months before they reach the market.
Growth: Work at the intersection of data engineering and AI research with fast paths to ownership and leadership.