News Organizations Push Back Against Web Archive Used For AI
CNN headquarters in Atlanta.
Photographer: Brandon Bell/Getty Images
Major news organizations, including CNN, NBC and USA Today, have joined an effort to curb the storage of their content in a web archive used by artificial intelligence companies for training chatbots.
Those outlets are part of a group of 20 publishers that have opted out of having their content saved in an online repository maintained by the nonprofit Common Crawl, according to a letter reviewed by Bloomberg. The News/Media Alliance, which represents newspapers and magazines, sent the letter Wednesday to Common Crawl, urging it to honor publishers’ requests to remove content from dozens of their websites and to prohibit the unauthorized use of their work, including for AI purposes.