Deep Chaos Scheduler with kernel optimization
A lottery scheduler for transformer fine-tuning. For each sticky block, it randomly selects 30–70% of the layers, attention heads, and MLP channels. A custom kernel optimizer reduces VRAM use and trains 2.25× faster. Every seed beats full fine-tuning on math evals.
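The selection step can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the project's actual implementation: it assumes a "sticky block" is a fixed number of consecutive training steps during which one draw is reused, and that layers, heads, and MLP channels are each sampled as a flat pool. The names `sample_mask`, `StickyLotteryScheduler`, `block_len`, and `masks_for_step` are hypothetical, and the kernel optimizer is not modeled.

```python
import math
import random


def sample_mask(n_units, rng, lo=0.30, hi=0.70):
    """Return a boolean mask that selects a random 30-70% of n_units."""
    k_lo = math.ceil(lo * n_units)               # smallest count that is >= 30%
    k_hi = max(k_lo, math.floor(hi * n_units))   # largest count that is <= 70%
    k = rng.randint(k_lo, k_hi)                  # how many units win the lottery
    chosen = set(rng.sample(range(n_units), k))  # which units win
    return [i in chosen for i in range(n_units)]


class StickyLotteryScheduler:
    """Hypothetical sketch: one draw per sticky block, reused for every step in it."""

    def __init__(self, n_layers, n_heads, n_mlp_channels, block_len, seed=0):
        self.sizes = {
            "layers": n_layers,
            "heads": n_heads,
            "mlp_channels": n_mlp_channels,
        }
        self.block_len = block_len  # ASSUMPTION: steps per sticky block
        self.seed = seed
        self._block = None
        self._masks = None

    def masks_for_step(self, step):
        block = step // self.block_len
        if block != self._block:
            # Seed from (seed, block) so the draw for a block is reproducible
            # no matter which steps were queried before it.
            rng = random.Random(f"{self.seed}-{block}")
            self._masks = {
                name: sample_mask(n, rng) for name, n in self.sizes.items()
            }
            self._block = block
        return self._masks


sched = StickyLotteryScheduler(
    n_layers=12, n_heads=8, n_mlp_channels=64, block_len=100, seed=0
)
masks = sched.masks_for_step(0)
print({name: sum(m) for name, m in masks.items()})  # selected count per pool
```

In a real training loop the masks would presumably decide which parameters receive updates during the block; how that is wired into the model is outside what this sketch covers.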