“Catastrophic overtraining” could harm large language models trained on more data


  • Researchers from leading American universities find that extending pre-training can be harmful to performance
  • Too much pre-training can lead to worse performance, due to something resembling the butterfly effect
  • The more pre-trained models are, the more sensitive they become to small changes that can disrupt the final result

Researchers from Carnegie Mellon, Stanford, Harvard and Princeton are challenging one of the fundamental assumptions in AI development: that more pre-training data always means better performance.

As reported by HPCwire, a new paper describes the concept of “catastrophic overtraining”, whereby extensive pre-training can harm a model’s performance after fine-tuning.
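To make the idea of sensitivity to small changes concrete, here is a minimal, purely illustrative sketch of the kind of perturbation probe one might run on any PyTorch model. This is not the authors' methodology; the function name, noise scale, and toy model below are assumptions chosen only to show how loss degradation under small weight perturbations could be measured.

```python
# Toy sketch (not the paper's method): measure how much a model's loss degrades
# when its weights receive small Gaussian perturbations. The paper's claim is
# that longer pre-training increases this kind of sensitivity; here we only
# show how such a probe could be computed on an arbitrary model.
import copy
import torch
import torch.nn as nn

def perturbation_sensitivity(model, inputs, targets, loss_fn,
                             noise_std=1e-3, trials=10):
    """Average loss increase after adding N(0, noise_std^2) noise to all weights."""
    base_loss = loss_fn(model(inputs), targets).item()
    increases = []
    for _ in range(trials):
        noisy = copy.deepcopy(model)  # perturb a copy, leave the original intact
        with torch.no_grad():
            for p in noisy.parameters():
                p.add_(torch.randn_like(p) * noise_std)
        increases.append(loss_fn(noisy(inputs), targets).item() - base_loss)
    return sum(increases) / len(increases)

# Purely illustrative stand-in model and random data (hypothetical, not an LLM).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
print(perturbation_sensitivity(model, x, y, nn.CrossEntropyLoss()))
```

A larger average loss increase under the same noise would indicate a more fragile parameter configuration, which is the intuition behind the “catastrophic overtraining” claim.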
