AI loses its mind after being trained on AI-generated data. Summarized by News AI, a GPT-based summarization newsletter website. Feeding AI-generated content to AI models can cause their output quality to deteriorate, according to a new study by scientists at Rice University and Stanford University. The researchers found that without enough fresh real data in each generation of an autophagous (self-consuming) loop, future generative models are doomed to have their quality or diversity progressively decrease, a condition they term Model Autophagy Disorder (MAD).
The study suggests that AI models trained on synthetic content start to lose less-represented information and draw on increasingly convergent, less varied data, which lowers output quality. The implications are significant because AI models are widely trained on data scraped from the web and are becoming increasingly intertwined with the internet’s infrastructure. Until now, the more data fed to a model, the better it has tended to perform. However, as AI-generated content becomes more prevalent online, it becomes harder for AI companies to ensure that their training data sets do not include synthetic content, potentially affecting the quality and structure of the open web.
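The self-consuming loop described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions and not the study's method: a one-dimensional Gaussian stands in for a generative model, "training" means fitting a mean and standard deviation, and the function name `simulate_loop` and its parameters are hypothetical. Each generation is fitted to a mix of samples from the previous generation's model and, optionally, fresh samples from the real distribution.

```python
import random
import statistics


def simulate_loop(generations, sample_size, fresh_fraction, seed=0):
    """Toy self-consuming loop; returns the fitted std of each generation.

    Assumption: a 1-D Gaussian stands in for a generative model, and the
    "real" data distribution is N(0, 1). Neither comes from the study.
    """
    rng = random.Random(seed)
    real_mu, real_sigma = 0.0, 1.0
    n_fresh = round(sample_size * fresh_fraction)
    n_synth = sample_size - n_fresh

    # The first generation is trained on real data only.
    data = [rng.gauss(real_mu, real_sigma) for _ in range(sample_size)]
    history = []
    for _ in range(generations):
        # "Train" the model: fit a mean and a spread to the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append(sigma)
        # Build the next training set from the model's own samples,
        # topped up with fresh real samples if fresh_fraction > 0.
        synthetic = [rng.gauss(mu, sigma) for _ in range(n_synth)]
        fresh = [rng.gauss(real_mu, real_sigma) for _ in range(n_fresh)]
        data = synthetic + fresh
    return history


closed = simulate_loop(generations=1000, sample_size=50, fresh_fraction=0.0)
mixed = simulate_loop(generations=1000, sample_size=50, fresh_fraction=0.5)
print("fully synthetic loop, final spread:", closed[-1])
print("half fresh real data, final spread:", mixed[-1])
```

In this toy setting, the fully synthetic loop tends to lose diversity: the fitted spread drifts toward zero over many generations, while the run that mixes in fresh real data stays close to the spread of the real distribution. Real generative models are far more complex, so this shows only the general shape of the effect.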
The study also raises questions about the usefulness of AI systems without human input, as the results show that AI models trained solely on synthetic content are not very useful. The researchers suggest that adjusting model weights could help mitigate the negative effects of training AI models on AI-generated data. It is crucial to consider the potential consequences of training AI on AI-generated data to maintain the integrity and reliability of AI systems.
In conclusion, the study highlights the risks and challenges associated with training AI models on AI-generated data. It emphasizes the importance of incorporating fresh real data and human input to maintain the quality and usefulness of AI systems. As AI continues to advance, it is essential for AI companies to carefully curate their training data sets so that they contain enough fresh real data and are not dominated by synthetic content. Doing so can help prevent the deterioration of AI models and preserve their effectiveness across applications.