Researchers tackle LLM regression with on-policy training

Self-distillation method preserves earlier model skills.

MIT researchers propose fix for LLM catastrophic forgetting.

Researchers at MIT, the Improbable AI Lab and ETH Zurich have proposed a fine-tuning method to address catastrophic forgetting in large language models, an issue that often causes models to lose earlier skills when they are trained on new tasks.

The technique, called self-distillation fine-tuning, lets a model act as both teacher and student during training. In experiments run in Cambridge and Zurich, the approach preserved prior capabilities while improving accuracy on new tasks.
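As an illustration of the teacher-and-student idea, one common way to realise it is to penalise the updated model for drifting away from a frozen copy of its pre-fine-tuning self. The PyTorch sketch below is a minimal, hypothetical example of that pattern, not the authors' published implementation: the `TinyLM` toy model, the KL-based distillation loss and the weighting `alpha` are all assumptions made for clarity.

```python
import torch
import torch.nn.functional as F

# Toy causal LM stand-in: an embedding plus a linear head.
# Hypothetical; any real causal LM would slot in the same way.
class TinyLM(torch.nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, dim)
        self.head = torch.nn.Linear(dim, vocab)

    def forward(self, ids):
        return self.head(self.embed(ids))  # (batch, seq, vocab) logits

student = TinyLM()
teacher = TinyLM()
teacher.load_state_dict(student.state_dict())  # frozen pre-fine-tuning copy
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.AdamW(student.parameters(), lr=1e-3)
alpha = 0.5  # assumed trade-off between new-task and distillation losses

ids = torch.randint(0, 100, (4, 16))       # dummy token batch
inputs, targets = ids[:, :-1], ids[:, 1:]  # next-token prediction setup

for _ in range(10):
    logits = student(inputs)
    # Standard supervised fine-tuning loss on the new task.
    task_loss = F.cross_entropy(logits.reshape(-1, 100), targets.reshape(-1))
    # Self-distillation term: KL divergence between the frozen teacher's
    # and the student's next-token distributions discourages forgetting.
    with torch.no_grad():
        teacher_logp = F.log_softmax(teacher(inputs), dim=-1)
    student_logp = F.log_softmax(logits, dim=-1)
    distill_loss = F.kl_div(student_logp, teacher_logp,
                            log_target=True, reduction="batchmean")
    loss = task_loss + alpha * distill_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The extra forward pass through the frozen teacher also hints at why distillation-style fine-tuning tends to cost more compute than plain supervised fine-tuning, consistent with the overhead figure cited below.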

Enterprise teams often maintain separate model variants to prevent regression, which increases operational complexity. The researchers argue that their method could reduce this fragmentation and support continual learning within a single production model.

However, the method requires around 2.5 times more computing power than standard supervised fine-tuning. Analysts note that real-world deployment will depend on governance controls, training costs and suitability for regulated industries.
