
Transforming global content for diverse audiences
Welocalize, founded in 1997 and headquartered in Midtown South, New York, NY, is the 4th largest language service provider in North America and the 9th globally. With over 250 language options, Welocalize transforms and localizes content for 2,000+ clients, including major brands like Disney, Uber, ...
Welocalize offers competitive salaries, equity options, generous PTO policies, and a remote-friendly work environment to support work-life balance. Em...
Welocalize fosters a culture of inclusivity and global awareness, emphasizing the importance of language diversity. The company is committed to levera...

Welocalize • Cairo, Egypt
Welocalize is seeking an Arabic (Egyptian) AI Evaluation Specialist to support the testing and evaluation of an Arabic language model. You'll design prompts and evaluate AI responses to enhance language model performance. This role requires native fluency in Egyptian Arabic and relevant experience in AI evaluation.
You are a native speaker of Egyptian Arabic with a strong understanding of linguistic nuances and cultural context. You hold a Bachelor's degree, or have equivalent experience, in Linguistics, Computational Linguistics, Communications, Technical Writing, or a related analytical field. Your background includes familiarity with AI evaluation, prompt engineering, and linguistic quality assurance, all of which are essential to assessing AI outputs effectively. You are detail-oriented and can create high-quality source documents that serve as the foundation for testing AI systems. Your analytical skills enable you to develop evaluation rubrics that assess AI responses across criteria such as factuality, tone, and helpfulness.
In this role, you will be instrumental in refining and evaluating large language models (LLMs) by designing scenario-based and edge-case prompts to test AI behavior. You will perform side-by-side evaluations of AI outputs, scoring them on a defined scale based on established criteria. Your responsibilities will also include generating accurate and well-structured Golden Responses that adhere to instructions and effectively handle ambiguity. You will collaborate with a team to ensure the AI systems are functional, accurate, and safe for users. Additionally, you will attend and complete all required webinars and continuous learning sessions to stay updated on best practices in AI evaluation.
This position offers a competitive pay rate of USD 10 per hour, with a commitment of 40 hours per week for the 3-month project. You will work remotely from Egypt, which allows for flexibility in your schedule. As part of Welocalize, you will contribute to cutting-edge AI technology, helping to build smarter and more reliable systems. We encourage you to apply even if your experience doesn't match every requirement, as we value diverse perspectives and backgrounds.