Cloudflare Launches AI Data Scraping Protections for Publishers

Cloudflare, one of the biggest content delivery networks, has announced it will give publishers the opportunity to protect themselves from having their content scraped by AI models.

The San Francisco-headquartered firm handles roughly 16 percent of the world’s internet traffic, so the move poses a threat to AI companies that use internet content without permission or payment.

The company said it will automatically block AI crawlers for new customers, preventing their content from being scraped and used without their knowledge.

A “Pay Per Crawl” offering is also available to customers, enabling publishers to grant AI companies access to their content in return for payment.

The new feature will prompt new website owners to decide whether they’re OK with their content being scraped by AI crawlers.

So far, major publishers such as AdWeek, BuzzFeed, Fortune, Independent Media, TIME and Sky News Group have shown support for the new permission-led approach.

“Trusted, truly independent journalism is vital for us all,” said Christian Broughton, CEO, The Independent & Independent Media. “So it’s great to see Cloudflare demonstrating that ingenuity and innovation, not just legislation, can play an important role in securing a sustainable model for how publishers and AI companies co-exist. Creating a marketplace for high-quality content from responsible publishers is crucial – for the AI companies as well as the news industry.”

A Significant Development in AI Copyright Infringement

News of the protections throws a curveball at AI companies and adds another layer to the ongoing copyright debate between AI firms and content creators and publishers.

In some cases, this has resulted in legal action. One of the first lawsuits was the high-profile case between The New York Times (NYT) and OpenAI, where the NYT accused the AI giant of using millions of articles to train its chatbot ChatGPT without consent.

As recently as this year, news outlets including Vox Media, Politico, Forbes, and Condé Nast challenged AI startup Cohere over the systematic use of their articles to train models.

Other publishers have taken a different route. Dozens have struck deals with AI companies over the past few years, often getting a cut of profits, or search traffic, in return for allowing their content to be used to train models.

Reuters, News Corp, and The Guardian have all entered licensing agreements with OpenAI in recent years, while Google, Meta, Microsoft and Perplexity AI have struck content partnerships with the likes of The Financial Times, Reuters, Le Monde, and Fortune.

The debate between AI firms and creatives has been playing out all over the world. In the UK, the same concerns around copyright protections have been rife amongst the country’s creative industries, with singers, actors, writers, and musicians – including Dua Lipa, Elton John and Kate Bush – all voicing their concerns about the use of their content to train models.

The UK government recently rejected an amendment to the Data (Use and Access) Bill, which would have forced AI companies to declare their use of copyrighted materials when training AI models.

Explaining the rejection, a Department for Science, Innovation and Technology (DSIT) spokesperson said the matter was “about using data to grow the economy and improve people’s lives, from health to infrastructure and we can now get on with the job of doing that.”

Stunting AI Growth?

AI models need quality data for training, and publishers provide that through their articles.

However, the new offering from Cloudflare, which automatically protects publishers from AI scraping, offers a potential solution to the problem of copyright infringement, a concern for newsrooms ever since the generative AI surge.

The offering potentially gives Cloudflare a competitive advantage over other network providers that don’t offer these types of protections.

AI companies, and some government and public figures, however, have argued that setting stringent parameters on how AI firms scrape content could stunt AI development and growth.

US Vice President JD Vance warned European leaders that “excessive regulation” could stifle innovation. OpenAI CEO Sam Altman and the CEOs of software firm SAP and appliance giant Bosch have also criticised the EU’s stricter approach to AI compared to the US.

Much of this debate has centred on remaining competitive on the global stage when it comes to AI growth, with the US and China at the forefront of this fight.
