I’ve written three books about AI, taught classes on generative AI (ChatGPT and the like), and spoken about it at several events over the last seven years. I started out terrified, and now I’m amazed, but also even more terrified.
AI is able to generate convincing outputs because it has learned similarities between the many examples in its training data (mostly scraped from the web). However, we’ve found that if you train AI on raw web content alone, you end up with something truly evil and mean. To prevent this, AI models are heavily fine-tuned to make sure they don’t say anything controversial (or anything that criticizes whoever made the AI).
The result of all this training is that AI-generated art is overly dramatic, predictable, and easily identifiable. AI-generated music has no soul. AI-generated writing is formulaic and humorless. AI doesn’t do subtlety, metaphor, or analogy well. The AI we have now is like a pop country singer. But that’s not what makes it dangerous.
Model Collapse
If you feed an image (say, a photo or an original painting) into AI and tell it to make something similar, you get a watered-down version with some random nonsense thrown in. Feed the generated image back in and ask for something similar again, and you get a much less detailed version with even more nonsense. After a couple of iterations, you end up with nothing but noise. It’s been estimated that around 60% of the writing on the web is already AI-generated. What frightens me is that future AI models will be trained on content from an AI-generated web. Eventually, we’ll go from AI-generated content being formulaic and dull to AI-generated content being formulaic, dull, dumb, and useless.
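You can sketch that degradation numerically. The toy simulation below is a hypothetical setup, not a real generative model: it treats each “generation” as learning only a smoothed summary of the previous generation’s output, using the standard deviation of the signal as a crude stand-in for detail. The detail shrinks with every pass and never comes back.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.normal(size=200)  # the "original" detailed content

def regenerate(data, window=5):
    """Crude stand-in for "make something similar": keep the broad
    shape (a moving average) and discard the fine detail."""
    kernel = np.ones(window) / window
    return np.convolve(data, kernel, mode="same")

detail = [signal.std()]  # track how much detail survives each pass
current = signal
for generation in range(10):
    current = regenerate(current)
    detail.append(current.std())

# detail shrinks every generation, never recovers
print([round(d, 3) for d in detail])
```

Real model collapse is more complicated (sampling error and mode-dropping matter too), but the one-way loss is the same: once a generation throws detail away, no later generation can learn it back.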
Civilization Collapse?
AI models trained on recursively generated data could erode people’s ability to think for themselves or to have original thoughts. In my teaching and writing over the last few years, I’ve met people who won’t make a decision or write an email without consulting an AI assistant. I’ve also met people who use an AI chatbot as their primary source of information. Governments and companies are increasingly relying on AI to assist or replace people. Will we as a species become dependent on AI models that keep getting dumber? What would be the result of that?
The Benefits of Diversity
AI models benefit from learning from diverse data. But that’s not the default. Historically, most of what we consider the “great works” of Western writing were produced by white men. The same goes for painting, architecture, and so on. Most computer code on the web was written over the last 30 years by programmers who are disproportionately young white men. Simply feeding the entire web into an AI model and telling it to make something “great” results in outputs that are skewed (or biased).
To counteract the bias inherent in the training data, AI models are specifically taught that it’s not a fact that great works must come from white men. AI researchers know that part of the key to preventing model collapse is to train AI on data that’s as representative as possible of the entirety of human knowledge.
I support training AI with as diverse a data set as possible because that’s how you make its outputs more human-like. Without making an effort to expose AI models to diversity, we’ll doom them to a future of boring outputs, stagnation, and model collapse.
Thank you for reading!