The AI is eating itself - by Casey Newton - Platformer
www.platformer.news/...the-ai-is-eating-itself
AI effects threat internet inputs human knowledge research bias technology
shared by Javier E on 15 Jul 23
-
there also seems to be little doubt that [AI] is corroding the web.
-
Two new studies offered some cause for alarm. (I discovered both in the latest edition of Import AI, the indispensable weekly newsletter from Anthropic co-founder and former journalist Jack Clark.)
-
The first study, which had an admittedly small sample size, found that crowd-sourced workers on Amazon’s Mechanical Turk platform increasingly admit to using LLMs to perform text-based tasks.
-
Until now, the assumption has been that they will answer truthfully based on their own experiences. In a post-ChatGPT world, though, academics can no longer make that assumption. Given the mostly anonymous, transactional nature of the assignment, it’s easy to imagine a worker signing up to participate in a large number of studies and outsourcing all their answers to a bot. This “raises serious concerns about the gradual dilution of the ‘human factor’ in crowdsourced text data,” the researchers write.
-
“This, if true, has big implications,” Clark writes. “It suggests the proverbial mines from which companies gather the supposed raw material of human insights are now instead being filled up with counterfeit human intelligence.”
-
A second, more worrisome study comes from researchers at the University of Oxford, University of Cambridge, University of Toronto, and Imperial College London. It found that training AI systems on data generated by other AI systems (synthetic data, to use the industry’s term) causes models to degrade and ultimately collapse. While the decay can be managed by using synthetic data sparingly, researchers write, the idea that models can be “poisoned” by feeding them their own outputs raises real risks for the web.
-
that’s a problem because, to bring together the threads of today’s newsletter so far, AI output is spreading to encompass more of the web every day. “The obvious larger question,” Clark writes, “is what this does to competition among AI developers as the internet fills up with a greater percentage of generated versus real content.”
-
In The Verge, Vincent argues that the current wave of disruption will ultimately bring some benefits, even if it’s only to unsettle the monoliths that have dominated the web for so long. “Even if the web is flooded with AI junk, it could prove to be beneficial, spurring the development of better-funded platforms,” he writes. “If Google consistently gives you garbage results in search, for example, you might be more inclined to pay for sources you trust and visit them directly.”
-
the glut of AI text will leave us with a web where the signal is ever harder to find in the noise. Early results suggest that these fears are justified, and that soon everyone on the internet, no matter their job, may find themselves having to exert ever more effort seeking signs of intelligent life.