Autonomous Agents Hackathon team: Auto-Eval