AI has caused numerous conflicts between companies, industries, and even countries. One of the most prominent battlegrounds has been between online publishers and the AI tech giants, whose LLMs could pose a major existential threat to publishers if they aren’t fairly compensated.
Large language models need quality content for training, and publishers and website owners have taken issue with AI companies “scraping” their content to train models without asking or paying. In many cases, this has resulted in tiresome copyright lawsuits from big news outlets against AI companies. With the AI landscape looking more like the Wild West, the legal disputes have highlighted the potential need for regulation.
Some publishers have struck lucrative deals with AI giants – both to be paid for content used to train AI models, and to have their content ranked highly in chatbot query responses. Just as content creators and publishers were starting to hand over their content to AI companies, a major development occurred on July 1st, 2025.
Cloudflare, a leading cybersecurity and network company that’s responsible for sixteen percent of the world’s internet traffic, revealed in September last year it would be automatically blocking AI agents or crawlers – tools that analyse and extract content for AI models – from accessing its websites, unless publishers purposefully opt in to giving the AI companies access. Now, however, publishers would be given the opportunity to charge AI crawlers for using their content.
All of a sudden, the power dynamics shifted, with Cloudflare’s announcement being applauded by the likes of Condé Nast, TIME, and Sky News Group.
FutureWeek caught up with Will Allen, VP of Product at Cloudflare, to find out more about the company’s latest offering, what it actually means in real terms, and the impact it will have on publishers and the internet moving forward.
A New Relationship Between AI and Publishers
So how does it work? Publishers who use Cloudflare’s hosting will be automatically opted out of having their content scraped by AI crawlers, meaning they must consciously opt in to having their content used by AI companies for training models.
Allen said the company decided to automatically opt out publishers using its cloud after speaking to customers who told them “again-and-again” that they had no idea what was happening to their content. For Allen, it made perfect sense to make opting out the default rather than something publishers had to request, giving customers more control over, and insight into, where their content is going.
“We thought: how do we re-level the playing field so that news organisations, journalists, and anyone who creates content can decide what happens to what they create?” he said.
Cloudflare also introduced a new ‘pay-per-crawl’ feature that will enable website owners to generate revenue every time their content is scraped. Allen said that content creators and publishers will be able to choose which foundational model, AI agent, or AI search engine has access to their content, and which ones they want to block.
So, for example, a publisher or content creator can see the AI crawlers coming to their website, see what they’re doing, and then use a toggle on the Cloudflare interface to decide whether to block them, allow them access, or charge them – all on a crawler-specific basis.
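To make the crawler-by-crawler choice concrete, here is a minimal sketch of how such a policy could be modelled. It is an illustration only, not Cloudflare’s implementation: the crawler names, the `Action` type and the `decide` function are all hypothetical, and the fallback to blocking simply mirrors the default opt-out described above.

```python
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    ALLOW = "allow"
    CHARGE = "charge"


# Hypothetical per-crawler settings a publisher might choose.
# These crawler names are invented for illustration.
POLICY = {
    "ExampleSearchBot": Action.ALLOW,     # sends traffic back, so let it in for free
    "ExampleTrainingBot": Action.CHARGE,  # may crawl, but only if it pays
    "ExampleScraperBot": Action.BLOCK,    # explicitly refused
}


def decide(crawler_name: str, policy: dict, default: Action = Action.BLOCK) -> Action:
    """Return the publisher's chosen action for a crawler.

    Crawlers the publisher has not configured fall back to `default`,
    which is BLOCK here to reflect an opt-out-by-default stance.
    """
    return policy.get(crawler_name, default)
```

Under these assumptions, `decide("ExampleTrainingBot", POLICY)` returns `Action.CHARGE`, while a crawler the publisher has never seen is blocked until they choose otherwise.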
Allen says that publishers will have different reasons for their choices: pre-existing commercial relationships and the likelihood of appearing in AI chatbot answers are among the factors that will determine which AI companies publishers give their content to.
“You can choose who to charge and who will get allowed access for free. If you want to let a specific AI system have your content because it regularly sends you traffic, then you can do that without enabling access to all of them. At the same time, if you want to only create content for human consumption, without giving any of it to AI companies or tools, you can do that too,” he says.
Allen explains that website owners can set a price for their content, so when an AI crawler comes to scrape its pages, the crawler can choose whether to pay for that content or not. “It’s really our first step into thinking about a marketplace where the publisher gets to set their rates and the people willing to access that content can pay for it.”
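The exchange Allen describes, where the publisher names a price and the crawler decides whether to pay, can be sketched in a few lines. This is a simplified model under stated assumptions, not the actual pay-per-crawl protocol: the `CrawlResult` type, the `handle_paid_crawl` function and the idea of the crawler declaring a maximum price up front are all hypothetical. Prices are held in whole cents to avoid rounding problems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlResult:
    served: bool        # whether the page content was handed over
    charged_cents: int  # amount billed to the crawler for this request


def handle_paid_crawl(publisher_price_cents: int, crawler_max_cents: int) -> CrawlResult:
    """Serve the page only if the crawler is willing to pay the publisher's price.

    The publisher sets the rate; the crawler states the most it will pay
    (an assumed mechanism). If the rate is within that limit the crawler
    is charged the publisher's price, otherwise nothing is served or billed.
    """
    if publisher_price_cents < 0 or crawler_max_cents < 0:
        raise ValueError("prices must be non-negative")
    if crawler_max_cents >= publisher_price_cents:
        return CrawlResult(served=True, charged_cents=publisher_price_cents)
    return CrawlResult(served=False, charged_cents=0)
```

In this sketch the crawler never pays more than the publisher’s asking price, even if its own limit is higher, and a crawler that declines simply leaves without the content.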
Where this transparency and control currently ends is in seeing what AI crawlers might use the content for, such as training models or producing chatbot query answers. However, Allen says this is on Cloudflare’s radar and the latest product is only its first development. “We’re very open about the fact this is a ‘V1’,” he explains.
“There’s so much more we want to do. Eventually, we might be able to do dynamic pricing based on the agent or crawler coming to access content based on it being archived versus timely content, or based on what the real-time demand of the content is because of an event that’s happened – there are so many variables that we know we want to work towards.”
Managing Millions of Agents
Whilst we’ve seen multiple deals being struck by various AI companies, such as OpenAI’s deals with The Washington Post, The Guardian and News Corp, or Perplexity’s deals with Time, Fortune and The Independent, these deals are legally cumbersome and difficult to scale.
According to Allen, Cloudflare’s new offering provides a long-term solution to the millions of AI crawlers and agents there are going to be on the internet, not only the AI giants. “For a publisher, you can strike 10 deals, but the number of agents and the number of crawlers and bots that are out there looking for amazing content is growing enormously. It’s not just one, two or three companies, it’s gonna be tens of thousands of AI companies and millions of agents – how do you work with all of them at scale?”
Although the tool notably puts power back into the hands of publishers, Allen notes that the “pay-per-crawl” feature also gives AI companies’ agents and crawlers access to bespoke and timely content. “Maybe, as an AI company, you want the latest news in a particular region about a particular event on a one-off occasion? In this case, a publisher wouldn’t have an interest in doing a longer term licensing deal.”
Most major AI chatbots have also introduced research agents. OpenAI notably launched “Operator” and “Deep Research”, and Microsoft’s Copilot announced Researcher and Analyst. These “deep reasoning” agents can undertake multi-step research on their own.
The development of deep reasoning tools could provide an additional opportunity for publishers. Allen says one of the things he envisions is consumers giving agents a budget based on the quality of research they want.
He explains: “I would love to use a deep research agent, tell it to give me information about something – like how to plant a tree, or the best restaurant in Soho, for example – set a budget of $10, $50, or $100 for it to find the best, most relevant content from the world’s leading experts, for example. Then that agent is able to go off and purchase content at internet scale by finding it, making a bid, and purchasing that content on your behalf. That’s an incredible experience for a consumer.”
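As a rough illustration of the budgeted agent Allen imagines, the sketch below picks which pieces of content to buy without overspending. Everything in it is an assumption made for the example: the `Offer` fields, the relevance scores and the simple “most relevant first” rule are hypothetical, and a real agent would need far more sophisticated bidding and ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    source: str       # hypothetical publisher or page identifier
    price_cents: int  # price the publisher has set for this content
    relevance: float  # agent's own estimate of usefulness, 0.0 to 1.0


def buy_within_budget(offers, budget_cents):
    """Greedily buy the most relevant content that still fits the budget.

    Offers are considered from most to least relevant; any offer that
    costs more than the remaining budget is skipped. Returns the list
    of purchased offers and the unspent budget in cents.
    """
    purchased = []
    remaining = budget_cents
    for offer in sorted(offers, key=lambda o: o.relevance, reverse=True):
        if offer.price_cents <= remaining:
            purchased.append(offer)
            remaining -= offer.price_cents
    return purchased, remaining
```

With a $10 budget (1,000 cents) and three offers priced at 600, 500 and 300 cents in descending order of relevance, this rule buys the first and third and leaves 100 cents unspent. A greedy rule like this is easy to follow but is not guaranteed to get the best overall value for the budget.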
The AI Browser of the Future
Last week, Cloudflare’s CEO Matthew Prince replied to a user on X, saying the firm is actively attempting to block Google’s AI Overview and Answer Box without affecting where a website appears in Google’s search rankings. Currently, blocking Google’s crawlers does affect rankings for publishers.
When asked how that would work, Prince went on to reply: “Worst case we’ll pass a law somewhere that requires them to break out their crawlers and then announce all routes to their crawlers from there. And that wouldn’t be hard. But I’m hopeful it won’t need to come to that.”
Perplexity AI recently announced the launch of its new AI browser “Comet”, with OpenAI reportedly set to release its own search engine in the coming weeks. As search and AI intertwine, this may add a layer of complexity when publishers choose which crawlers to block.
From Allen’s perspective, the relationships between AI players and publishers are going to keep evolving for quite some time: “I’m excited about all of the amazing innovation that we are seeing on both the publishing side and the AI and agentic side as well – so it’s definitely an evolving space. There’s going to be a lot of new developments happening all the time,” he said.
“Publishers having the ability to decide how they want to experiment with their content is at the heart of what we are doing. It’s also exciting for consumers and AI companies to pay for content at scale easily. Cloudflare building those mechanisms – and then letting the marketplace and different models flourish – is what I would like to see.”



