
Every researcher who joins a new project hits the same wall. You clone the repo, open the folder, and see hundreds of files across dozens of modules with minimal documentation. You spend days, sometimes weeks, figuring out what's what.

I lived this. When I joined a research group, I spent days trying to understand a complex codebase with multiple abstraction layers and almost no documentation. No onboarding guide existed. No one warned me about placeholder classes masquerading as real implementations. That experience (wasted time, hidden gotchas, tribal knowledge locked in senior developers' heads) is universal across research labs, open-source ML projects, and academic codebases.

bob-onboard is a custom IBM Bob mode that transforms Bob into a research code onboarding specialist. When activated on any Python research codebase, it systematically discovers the project structure, maps the architecture with Mermaid diagrams, surfaces domain knowledge like paper citations and algorithm references, documents setup procedures and dependencies, and hunts for hidden gotchas: stub classes, NotImplementedError sites, TODO/FIXME comments, and unimplemented abstract methods. The output is a single ONBOARDING.md file that takes a new contributor from "just cloned this" to "ready to contribute" in one sitting.

To prove it works on real-world complexity, we ran it on IBM's Adversarial Robustness Toolbox, a 5.9k-star ML security library with 17 core modules, 90+ notebooks, and abstract base class hierarchies across 8 ML frameworks. The generated guide identified 30+ NotImplementedError sites with exact file paths, extracted paper citations verbatim from source code, and produced three complete workflow examples with working code.
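
To make the "gotcha hunt" concrete, here is a minimal sketch of how NotImplementedError sites and TODO/FIXME comments could be located in Python source. It is not bob-onboard's actual implementation, which runs as a Bob mode; the names find_gotchas, scan_repo, and SAMPLE are hypothetical and exist only for illustration. The sketch assumes the standard-library ast module for detecting raise statements and a simple regular expression for comments, which is a heuristic and can misfire on a "#" inside a string literal.

```python
import ast
import re
from pathlib import Path

# Heuristic: a '#' followed anywhere on the line by TODO or FIXME.
TODO_PATTERN = re.compile(r"#.*\b(TODO|FIXME)\b")


def find_gotchas(source, filename="<string>"):
    """Return sorted (filename, line, kind) tuples for one source string."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, ast.Raise) and node.exc is not None:
            # Cover both `raise NotImplementedError` and
            # `raise NotImplementedError("message")`.
            target = node.exc.func if isinstance(node.exc, ast.Call) else node.exc
            if isinstance(target, ast.Name) and target.id == "NotImplementedError":
                findings.append((filename, node.lineno, "NotImplementedError"))
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            findings.append((filename, lineno, match.group(1)))
    return sorted(findings, key=lambda item: item[1])


def scan_repo(root):
    """Apply find_gotchas to every .py file under root, skipping unparsable files."""
    results = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            results.extend(find_gotchas(path.read_text(encoding="utf-8"), str(path)))
        except (SyntaxError, UnicodeDecodeError):
            continue  # Not valid Python or not UTF-8; ignored in this sketch.
    return results


# Hypothetical sample input: a stub method plus a TODO comment.
SAMPLE = '''\
class Estimator:
    def fit(self, x):
        raise NotImplementedError

    def predict(self, x):
        # TODO: handle batches
        return x
'''

if __name__ == "__main__":
    for filename, lineno, kind in find_gotchas(SAMPLE):
        print(f"{filename}:{lineno}: {kind}")
```

Calling scan_repo on the root of a cloned repository would return the same kind of tuples with real file paths, which is the sort of "exact file path" listing described above. Detecting stub classes and unimplemented abstract methods would need additional checks that this sketch does not attempt.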
17 May 2026