
AI Trainer is a fast, one-page MVP that demonstrates the full “bring your own data → train → ask” loop entirely in the browser: no backend, no data leaving the device. It’s ideal for quick investor demos, internal POCs, and privacy-sensitive showcases.

What it does

Upload your data: CSV, TXT/Markdown, JSON/JSONL, and Excel (first worksheet). If the browser limits unzip/clipboard, a guided Clipboard/CSV fallback keeps things moving.
Train locally: builds a lightweight TF-IDF-style lexical retriever in memory. No servers, keys, or background services required.
Ask (RAG-style): type a question and get top-k relevant passages from your corpus for instant insight.

Advanced mode

Preprocessing controls (chunk size, overlap) and selection of text/label columns for CSV/Excel.
Classifier using label-aware lexical overlap for simple categorization.
Local API for demos without a server (via safe fetch interception):
    GET /api/status → { ok, docs, trained }
    POST /api/train → trains current docs
    POST /api/index → add documents
    POST /api/ask → query and get top matches

Privacy & compatibility

Everything runs in memory inside the browser: no uploads or telemetry.
Plain HTML/CSS/Vanilla JS with conservative syntax for broad browser support.
Lightweight XLSX reader (first sheet) plus robust clipboard/paste handling.

Use cases

Investor/product demos, internal knowledge bases, support FAQs/policies, and quick data exploration without infrastructure.

Current limitations (MVP)

Lexical retrieval only (no embeddings/neural ranking yet).
No persistence across reloads.
No hosted fine-tune or external model_id.

Roadmap

API key + cloud LLMs for summarization and/or embeddings.
Persistence (vector DB or export/import).
Hosted fine-tune with returned model_id.
Service Worker to expose real network /api/* endpoints.
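
To make the "train locally" and "ask" steps concrete, here is a minimal sketch of an in-memory TF-IDF-style retriever with chunking and top-k lookup. It is not the app's actual code: the function names (`tokenize`, `chunkText`, `train`, `toVector`, `ask`), the smoothed IDF formula, the cosine scoring, and the sample corpus are all assumptions chosen for illustration.

```javascript
// Hypothetical sketch; every name and weighting choice here is an assumption.

// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Split text into word-based chunks of `chunkSize` words with `overlap` shared words.
function chunkText(text, chunkSize, overlap) {
  const words = text.split(/\s+/).filter(Boolean);
  const step = Math.max(1, chunkSize - overlap);
  const chunks = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // last chunk reached the end
  }
  return chunks;
}

// Build a unit-length TF-IDF vector (Map of term -> weight) for a token list.
function toVector(tokens, idf) {
  const vec = new Map();
  for (const term of tokens) {
    // Adding idf once per occurrence yields tf * idf; unknown terms are skipped.
    if (idf.has(term)) vec.set(term, (vec.get(term) || 0) + idf.get(term));
  }
  let norm = 0;
  for (const w of vec.values()) norm += w * w;
  norm = Math.sqrt(norm) || 1;
  for (const [term, w] of vec) vec.set(term, w / norm);
  return vec;
}

// "Train": compute smoothed IDF weights and one normalized vector per document.
function train(docs) {
  const tokenized = docs.map(tokenize);
  const df = new Map();
  for (const tokens of tokenized) {
    for (const term of new Set(tokens)) df.set(term, (df.get(term) || 0) + 1);
  }
  const idf = new Map();
  for (const [term, count] of df) {
    idf.set(term, Math.log((1 + docs.length) / (1 + count)) + 1);
  }
  const vectors = tokenized.map((tokens) => toVector(tokens, idf));
  return { docs, idf, vectors };
}

// "Ask": score each document by cosine similarity and return the top k matches.
function ask(index, question, k) {
  const q = toVector(tokenize(question), index.idf);
  const scored = index.vectors.map((vec, i) => {
    let score = 0;
    for (const [term, w] of q) score += w * (vec.get(term) || 0);
    return { text: index.docs[i], score };
  });
  return scored
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Made-up sample corpus for illustration only.
const corpus = [
  "Refunds are issued within 14 days of purchase.",
  "Support is available by email on weekdays.",
  "Shipping takes three to five business days.",
];
const index = train(corpus);
const results = ask(index, "How long do refunds take?", 2);
console.log(results.map((r) => r.text));
```

Because matching is purely lexical, the query above only hits the passage that shares the exact token "refunds"; "take" does not match "takes". That is the practical meaning of the "lexical retrieval only" limitation: a stemmer or embeddings would be needed to close such gaps.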
24 Aug 2025