Data/Evaluations Engineer
Nous Research
Global
Full-time
Posted 1 week ago
Job Description
Summary
We’re looking for a data/evaluations engineer to own our post-training evaluation pipeline. You’ll build and scale evals, in both depth and breadth, that measure model capabilities across diverse tasks, identify failure modes, and drive model improvements.
Responsibilities:
- Identifying tasks for evaluation coverage
- Creating, curating, or generating test cases and ways to measure these tasks
- Implementing evaluation through objective output verification, LLM judge/reward modeling, human evaluation, or any tricks of the trade you may bring to the table
- Adding coverage and diving deep into analyzing what’s really gone wrong in failure cases
- Identifying ways to remedy failure cases
- Developing ways to present the evals and make them scalable and accessible internally (e.g., light GUIs, scalable Slurm scripts, etc., for running the evals)
Qualifications:
- Strong experience with evaluation frameworks
- Experience with both automated and human evaluation methodologies
- Ability to build evaluation infrastructure from scratch and scale existing systems
Preferred:
- History of OSS contributions
About the Company

Nous Research
IT / Telecommunication Services
11-50 employees, founded 2023
Nous Research is an AI research lab creating world-class models out in the open. We are best known for the Hermes series of open-source models: general-purpose, human-aligned, lightweight models that have been downloaded more than 50 million times on HuggingFace. In the process of developing these models, Nous is building a fully open AI stack that allows anyone to meaningfully participate in the development of frontier intelligence, beginning with our fully distributed pre-training network, Psyche.
Applications close soon