Language Model Evaluator - Bilingual
Remote
Mercor
$36 - $36
Part Time
Mid Level
Posted 4 weeks ago Expired
This job has expired
About This Role
Evaluate responses generated by large language models (LLMs), fact-check them, and produce high-quality human evaluation data. This role requires native or primary fluency in Chinese (Mandarin) and significant experience with LLMs.
Responsibilities
- Evaluate LLM-generated responses for effectiveness in answering user queries
- Conduct fact-checking using trusted public sources and external tools
- Generate high-quality human evaluation data by annotating response strengths, areas for improvement, and factual inaccuracies
- Assess reasoning quality, clarity, tone, and completeness of responses
- Ensure model responses align with expected conversational behavior and system guidelines
- Apply consistent annotations by following clear taxonomies, benchmarks, and detailed evaluation guidelines
Requirements
- Bachelor’s degree
- Native speaker or ILR 5/primary fluency (C2 on the CEFR scale) in Chinese (Mandarin)
- Significant experience using large language models (LLMs)
- Excellent writing skills
- Strong attention to detail
- Adaptable across topics, domains, and customer requirements
- Background in structured analytical thinking
- Excellent college-level mathematics skills
Nice to Have
- Experience with RLHF, model evaluation, or data annotation work
- Experience writing or editing high-quality written content
- Experience comparing multiple outputs and making fine-grained qualitative judgments
- Familiarity with evaluation rubrics, benchmarks, or quality scoring systems
Skills
Large Language Models (LLMs) *
* Required skills
About Mercor
Mercor connects elite creative and technical talent with leading AI research labs.
Technology