PhD Rater
1. Role Overview
Mercor is seeking experienced researchers and technical experts for a frontier-model evaluation project focused on agentic workflows. You’ll design and validate challenging benchmark tasks in data science, machine learning, finance, and coding that surface and diagnose reasoning and problem-solving gaps in a target STEM model. The work centers on building robust, real-world tasks with executable tests, then analyzing model and agent behavior.
2. Key Responsibilities
- Design challenging, real-world STEM problems
- Implement each task inside an agentic development environment using Python
3. Core Qualifications
- Deep expertise in data science, machine learning, finance, and/or Python-based coding
- Current PhD student or recent PhD graduate (top-20 U.S. school)
- Strong research background in frontier STEM topics
- Ability to engage reliably for 30+ hours/week, primarily on weekdays
- Demonstrated technical output such as high-quality open-source contributions (especially in agentic / LLM tooling ecosystems)
- Comfort reading and reasoning about agent behavior traces to diagnose failure modes beyond surface-level errors
4. More About the Opportunity
- Initial focus area: agentic workflows for STEM tasks
- Familiarity with agentic frameworks and OSS ecosystems is helpful (for example, LangChain, MetaGPT, AutoGen, AutoGPT, CrewAI, LlamaIndex, BabyAGI, SuperAGI, CAMEL, AgentGPT, and Dify)
- Deliverables are expected to be reproducible and testable (clear specs, deterministic tests where possible, documented environments)
5. About Mercor
- Mercor is a talent marketplace that connects top experts with leading AI labs and research organizations.
- Our investors include Benchmark, General Catalyst, Adam D’Angelo, Larry Summers, and Jack Dorsey.
- Thousands of professionals across domains such as law, creative fields, engineering, and research have joined Mercor to work on frontier projects shaping the next era of AI.
6. Compensation
- Pay: $50 – $100/hour
- Type: Part-time contract
- Location: Remote
