
Research papers often claim reproducibility while leaving critical evidence scattered across PDFs, arXiv records, DOI metadata, GitHub repositories, datasets, and informal implementation notes. The Agentic Reproducibility Engine turns that messy audit process into a visible multi-agent workflow.

Users submit a paper or paper reference through a static web interface. The backend runs a sequence of specialized agents that plan the audit, extract claims and artifacts, retrieve external evidence, score reproducibility risk, design a replication plan, generate code/data follow-up commands, and critique unsupported assumptions before writing the final report. The UI streams the work live so judges can inspect agent artifacts, tool calls, resolver outcomes, verifier objections, and final provenance.

The target deployment uses a Hugging Face Static Space as the public frontend and an AMD Developer Cloud instance as the backend. The AMD host serves Qwen/Qwen3.5-27B through vLLM on ROCm and runs the FastAPI agent API.

The system is designed to fail closed: if arXiv, DOI, GitHub, or dataset evidence cannot be verified, the report records that gap and degrades the decision rather than inventing placeholder evidence. The project includes a public GitHub repo, a self-serve AMD/Hugging Face deployment kit, a deterministic eval harness, the static frontend, the API backend, live evidence resolver tools, and an audit trace format for reproducibility evidence.
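The agent sequence described above can be sketched as a minimal pipeline that threads a shared state object through each agent in order. The `AuditState` fields and agent names below are illustrative assumptions, not the project's actual schema or API:

```python
from dataclasses import dataclass, field

# Hypothetical audit state threaded through the agents; field names are
# illustrative, not the project's actual schema.
@dataclass
class AuditState:
    paper_ref: str
    claims: list = field(default_factory=list)
    evidence: list = field(default_factory=list)
    trace: list = field(default_factory=list)

def run_pipeline(paper_ref, agents):
    """Run each agent in order, appending its name to the audit trace."""
    state = AuditState(paper_ref=paper_ref)
    for agent in agents:
        state = agent(state)
        state.trace.append(agent.__name__)
    return state

# Toy stand-ins for the planner and claim extractor described in the text.
def plan_audit(state):
    return state

def extract_claims(state):
    state.claims.append(f"claim parsed from {state.paper_ref}")
    return state

state = run_pipeline("arXiv:2405.00001", [plan_audit, extract_claims])
```

Keeping the trace inside the state object is one simple way to make every step of the workflow inspectable afterward, which matches the live-streaming UI described above.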
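The fail-closed behavior can be illustrated with a small resolver sketch: any source that cannot be verified is recorded as a gap and downgrades the verdict, rather than being replaced with placeholder evidence. The function names, dict keys, and verdict labels here are assumptions for illustration:

```python
# Hypothetical fail-closed evidence resolver. `fetch(source)` is assumed to
# return an evidence record or raise on failure; names are illustrative.
def resolve_evidence(sources, fetch):
    evidence, gaps = [], []
    for source in sources:
        try:
            evidence.append(fetch(source))
        except Exception as exc:
            # Fail closed: record the gap so the report can degrade the
            # decision instead of inventing placeholder evidence.
            gaps.append({"source": source, "error": str(exc)})
    verdict = "verified" if not gaps else "degraded"
    return {"evidence": evidence, "gaps": gaps, "verdict": verdict}

# A fake fetcher standing in for the real arXiv/DOI/GitHub resolvers.
def fake_fetch(source):
    if source.startswith("github:"):
        raise RuntimeError("repo unreachable")
    return {"source": source, "status": "ok"}

result = resolve_evidence(["arxiv:2405.00001", "github:org/repo"], fake_fetch)
```

With one unreachable source, `result` carries a `"degraded"` verdict and one gap record, so downstream agents see exactly which evidence is missing.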
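The audit trace format could plausibly be a JSONL stream of per-step records, one object per agent action, which is easy to append to and replay. The record keys below are an assumed shape, not the project's published format:

```python
import json
import time

def trace_event(agent, action, payload, ok=True):
    """One record of a hypothetical JSONL audit trace; keys are illustrative."""
    return {
        "ts": time.time(),
        "agent": agent,
        "action": action,
        "ok": ok,
        "payload": payload,
    }

events = [
    trace_event("extractor", "extract_claims", {"n_claims": 3}),
    trace_event("resolver", "fetch_doi", {"doi": "10.0000/example"}, ok=False),
]
# One JSON object per line, so the trace can be tailed live by the UI
# and replayed deterministically by an eval harness.
trace_jsonl = "\n".join(json.dumps(e, sort_keys=True) for e in events)
```

An append-only line-oriented format like this suits both live streaming to the frontend and deterministic replay, since records are never rewritten after they are emitted.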
10 May 2026