Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift

Part Time | Mid Level | 5+ years

Posted 3 months ago | Expired

About This Role

Create challenging, realistic coding test cases to evaluate and improve the capabilities of AI coding systems. Design comprehensive functional tests that validate end-to-end behavior, and analyze AI failures to understand model strengths and weaknesses.

Responsibilities

Review and refine realistic coding tasks based on provided production codebases
Write comprehensive functional tests that validate actual end-to-end behavior and edge cases
Craft "fair but hard" challenges for AI systems that require complex reasoning and information retrieval across scattered data
Analyze AI failures to understand where models struggle and where they succeed
Iterate based on feedback from expert QA reviewers

Requirements

Degree in Computer Science, Software Engineering or related fields
5+ years in software development, primarily Python (pytest, async/await, subprocess, file operations)
Background in full-stack development (React-based interfaces and robust back-end systems)
Experience writing functional and integration tests
Experience with Docker (running evaluations locally in containers)
Understanding of CI/CD (GitHub Actions as a user: triggers, labels, reading results)
English proficiency: B2

Skills

Python*, Docker*, React*, CI/CD*, GitHub Actions*, pytest*

* Required skills

About Mindrift

Mindrift connects specialists with AI projects from major tech innovators, unlocking the potential of Generative AI by tapping into real-world expertise from across the globe.

Technology