Cloudflare is evolving its bot management capabilities to provide website owners with more granular control over AI traffic. This update introduces a pragmatic taxonomy to classify bots based on their behavior (Search, Agent, Training) and content use, moving beyond a simple block/allow mechanism. The new features enable finer tuning of access for various AI activities, aiming to balance content protection with discoverability and fair compensation for creators.
Cloudflare has significantly updated its bot management platform to address the evolving landscape of AI-driven web traffic. Recognizing that a simple "block all AI bots" approach is insufficient, the new system provides a more nuanced way for website owners to manage how AI crawlers interact with their content. This is crucial for maintaining content independence while ensuring discoverability and potential compensation.
The core of Cloudflare's new approach is a pragmatic taxonomy that classifies bots based on their *behavior* rather than just a generic "AI" label. This allows for more precise control and transparency regarding what bots are doing on a website. Website owners can now configure rules for each of these distinct behaviors: Search, Agent, and Training.
Architectural Transparency
Cloudflare encourages bot operators to run separate crawlers for each purpose (Search, Agent, Training) so that website owners can see what each crawler does and manage its access accordingly. A multi-purpose crawler is treated according to all of its declared behaviors, with defaults leaning towards the most restrictive rule.
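The "most restrictive rule wins" idea for multi-purpose crawlers can be illustrated with a minimal Python sketch. This is not Cloudflare's implementation or API: the `Action` levels (including `CHALLENGE`), the `SITE_RULES` mapping, and the `resolve_action` function are all hypothetical names chosen for illustration, and the fallback to `BLOCK` for unknown behaviors is an assumption.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical outcomes, ordered from least to most restrictive."""
    ALLOW = 0
    CHALLENGE = 1
    BLOCK = 2


# Hypothetical per-behavior rules a site owner might configure.
SITE_RULES = {
    "search": Action.ALLOW,
    "agent": Action.CHALLENGE,
    "training": Action.BLOCK,
}


def resolve_action(declared_behaviors, rules=SITE_RULES, default=Action.BLOCK):
    """Return the most restrictive action across all declared behaviors.

    Behaviors with no explicit rule, and bots that declare nothing,
    fall back to `default` (an assumption made for this sketch).
    """
    if not declared_behaviors:
        return default
    # IntEnum members compare by value, so max() picks the strictest action.
    return max(rules.get(behavior, default) for behavior in declared_behaviors)


# A crawler that declares both search and training is treated as a training bot.
print(resolve_action(["search", "training"]).name)
```

Under these assumed rules, a crawler used only for search would be allowed, while one that combines search with training would be blocked, which gives operators an incentive to split crawlers by purpose.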
For enterprise customers, Cloudflare introduces BotBase, a database that tracks all known bots and their classifications, giving greater visibility into automated traffic. Cloudflare is also building capabilities that let customers block or allow bots based on their *content use* level, which indicates what a bot may keep and reshare after crawling.
This granular control is extended with a new `use` signal in `robots.txt`, allowing website owners to express preferences for content use, which Cloudflare supports by tracking and verifying bot adherence. Bots that abuse these signals will lose their "Verified" status, impacting their ability to crawl effectively. This system design empowers site owners with more precise tools to manage their digital assets in the age of AI.