Targeted LoRA improves accuracy on a held-out benchmark of crypto-quant problems by a measurable margin vs the base model AND vs a random-layer LoRA control.
CEO
Two LLMs audit a small MoE model's reasoning, identify failure-correlated layers via statistical attribution, and train LoRA-DPO targeting only those layers. Pipeline validated end-to-end on DeepSeek-V2-Lite; held-out eval pending control runs.
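
The layer-selection step and the random-layer control could be sketched roughly as follows. This is a minimal illustration, not the project's actual attribution method: it assumes each layer yields one scalar signal per audited example and that "failure-correlated" means a high absolute Pearson correlation with a 0/1 failure flag. The function names, the toy data, and the choice to draw the control uniformly from all layers are hypothetical.

```python
import random
from statistics import mean, pstdev


def failure_correlation(signal, failed):
    """Pearson correlation between a per-example layer signal and a 0/1 failure flag."""
    sx, sy = pstdev(signal), pstdev(failed)
    if sx == 0 or sy == 0:
        return 0.0  # constant signal or constant outcome: nothing to attribute
    mx, my = mean(signal), mean(failed)
    cov = mean((x - mx) * (y - my) for x, y in zip(signal, failed))
    return cov / (sx * sy)


def select_target_layers(layer_signals, failed, k):
    """Return the k layer indices with the largest |correlation| to failure."""
    scores = {
        layer: abs(failure_correlation(sig, failed))
        for layer, sig in layer_signals.items()
    }
    return sorted(sorted(scores, key=scores.get, reverse=True)[:k])


def sample_control_layers(num_layers, n, seed=0):
    """Random-layer control: n distinct layers drawn uniformly from all layers.

    Assumption: the control may overlap with the targeted set.
    """
    return sorted(random.Random(seed).sample(range(num_layers), n))


# Toy data (hypothetical): 4 layers, 6 audited examples, 1 = reasoning failure.
failed = [1, 0, 1, 0, 1, 0]
layer_signals = {
    0: [0.1, 0.1, 0.2, 0.2, 0.1, 0.1],  # unrelated to failure
    1: [0.9, 0.1, 0.8, 0.2, 0.9, 0.1],  # high when the model fails
    2: [0.5, 0.4, 0.5, 0.4, 0.6, 0.5],  # weakly related
    3: [0.2, 0.9, 0.1, 0.8, 0.2, 0.9],  # low when the model fails
}

targeted = select_target_layers(layer_signals, failed, k=2)
control = sample_control_layers(num_layers=4, n=len(targeted), seed=0)
print("targeted layers:", targeted)
print("control layers:", control)
```

Matching the control's layer count to the targeted set keeps the adapter parameter budget comparable, so any accuracy gap is attributable to which layers were chosen rather than how many. If the adapters are built with Hugging Face PEFT, the resulting indices could plausibly be passed to `LoraConfig` through its `layers_to_transform` argument, though whether that fits this MoE architecture is not something the summary confirms.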