Sr. Software Engineer - Conversational AI Evaluator

Remote
Mercor $45 - $80
Part Time Senior Level

Posted 1 month ago Expired

About This Role

Evaluate LLM-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness, contributing to the advancement of AI research at Mercor.

Responsibilities

  • Evaluate LLM-generated responses to coding and software engineering queries for accuracy, reasoning, clarity, and completeness
  • Conduct fact-checking using trusted public sources and authoritative references
  • Execute code and validate outputs using appropriate tools to ensure accuracy
  • Annotate model responses by identifying strengths, areas of improvement, and factual or conceptual inaccuracies
  • Assess code quality, readability, algorithmic soundness, and explanation quality
  • Apply consistent evaluation standards by following clear taxonomies, benchmarks, and detailed evaluation guidelines

Requirements

  • BS, MS, or PhD in Computer Science or a closely related field
  • Significant real-world experience in software engineering or related technical roles
  • Expertise in at least one relevant programming language (e.g., Python, Java, C++, JavaScript, Go, Rust)
  • Ability to solve HackerRank or LeetCode Medium- and Hard-level problems independently
  • Experience contributing to well-known open-source projects, including merged pull requests
  • Significant experience using LLMs while coding, and an understanding of their strengths and failure modes
  • Strong attention to detail and comfort evaluating complex technical reasoning, including identifying subtle bugs and logical flaws

Nice to Have

  • Prior experience with RLHF, model evaluation, or data annotation work
  • Track record in competitive programming
  • Experience reviewing code in production environments
  • Familiarity with multiple programming paradigms or ecosystems
  • Experience explaining complex technical concepts to non-expert audiences

Skills

Python*, Java*, C++*, JavaScript*, Go*, Rust*

* Required skills

About Mercor

Mercor connects elite creative and technical talent with leading AI research labs.
