
The internet is flooded with decontextualized and often dangerous "Ayurvedic" advice, while genuine classical texts like the Charaka-Samhita remain largely inaccessible behind the Sanskrit language barrier and dense formatting. Vaidya bridges this gap as a highly disciplined, citation-grounded Retrieval-Augmented Generation (RAG) system: when a user asks a medical or philosophical question in English, Hindi, or Sanskrit, Vaidya does not rely on the LLM's internal weights. Instead, it retrieves the exact relevant verses from the digitized Charaka-Samhita.

Under the hood, we built a custom pipeline to parse Devanagari OCR output, extract semantic chunks, and attach exact structural metadata (Sthana, Adhyaya, verse number). The chunks were embedded using multilingual-e5-large into a Qdrant vector database. For generation, we deployed Qwen2.5-72B-Instruct using vLLM on an AMD MI300X GPU via DigitalOcean.

Vaidya’s defining feature is its strict integrity. Every factual claim in a response is followed by a precise citation, and the system employs confidence-gating: if the vector search yields only low-relevance results, it admits it lacks a source rather than fabricating an answer. Vaidya represents a trustworthy AI gateway to classical Indian medicine for researchers, students, and practitioners.
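The structural metadata attached to each chunk can be sketched as a small record type. The field names and the citation format below are illustrative assumptions, not the project's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VerseChunk:
    text: str     # verse text (original or translated)
    sthana: str   # section, e.g. "Sutrasthana"
    adhyaya: int  # chapter number within the Sthana
    verse: int    # verse number within the chapter

    def citation(self) -> str:
        # Render a human-readable citation like "Sutrasthana 1.24"
        return f"{self.sthana} {self.adhyaya}.{self.verse}"

chunk = VerseChunk(text="...", sthana="Sutrasthana", adhyaya=1, verse=24)
print(chunk.citation())  # → Sutrasthana 1.24
```

Storing the citation fields alongside each chunk (e.g. as a Qdrant payload) lets the generator quote an exact verse reference after every claim.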
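One detail worth noting about the embedding step: the multilingual-e5-large model is trained to expect `"passage: "` and `"query: "` prefixes on indexed text and search queries respectively, so a thin wrapper like the following (a minimal sketch, not the project's code) is typically applied before embedding:

```python
def e5_passage(text: str) -> str:
    # multilingual-e5-large expects a "passage: " prefix at index time
    return "passage: " + text

def e5_query(text: str) -> str:
    # ...and a "query: " prefix at search time
    return "query: " + text

print(e5_query("tridosha"))  # → query: tridosha
```

Omitting these prefixes is a common cause of degraded retrieval quality with the E5 family.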
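The confidence-gating described above can be sketched as a filter over the vector-search hits. The threshold value and function shape here are illustrative assumptions, not the production configuration:

```python
REFUSAL = ("I could not find a sufficiently relevant passage in the "
           "Charaka-Samhita to answer this question.")

def gate_results(hits, min_score=0.82, min_hits=1):
    """Keep only hits whose relevance score clears the threshold.

    `hits` is a list of (score, chunk) pairs as returned by the vector
    search. The 0.82 threshold is an illustrative value; returning None
    signals the caller to emit REFUSAL instead of generating an answer.
    """
    passed = [(score, chunk) for score, chunk in hits if score >= min_score]
    if len(passed) < min_hits:
        return None
    return passed

print(gate_results([(0.91, "verse A"), (0.40, "verse B")]))
# → [(0.91, 'verse A')]
```

A call like `gate_results([(0.50, "x")])` returns `None`, which is where the system admits it lacks a source rather than fabricating one.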
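On the generation side, vLLM exposes an OpenAI-compatible HTTP server (`/v1/chat/completions`), so the grounded prompt can be sent as a standard chat payload. The system-prompt wording below is an illustrative assumption about how retrieved verses might be injected:

```python
def build_chat_request(question: str, passages: list[str]) -> dict:
    """Build an OpenAI-compatible chat payload for a vLLM server.

    `passages` are the retrieved, citation-tagged verses; the exact
    instruction wording is a sketch, not the deployed prompt.
    """
    context = "\n\n".join(passages)
    return {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [
            {"role": "system",
             "content": ("Answer ONLY from the passages below and cite each "
                         "claim by its verse reference. If the passages are "
                         "insufficient, say so.\n\n" + context)},
            {"role": "user", "content": question},
        ],
        "temperature": 0.2,
    }

payload = build_chat_request("What are the three doshas?",
                             ["Sutrasthana 1.57: ..."])
# POST this as JSON to http://<host>:8000/v1/chat/completions
```

Keeping the retrieved verses in the system message and the user's question untouched makes it straightforward to audit which sources the model was shown.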
10 May 2026