- Perplexity accused of ignoring signals such as robots.txt to scrape websites
- It even found Cloudflare's protected and hidden test sites
- OpenAI adheres to responsible crawling, but Perplexity is silent for now
Cloudflare has accused AI giant Perplexity of scraping websites that explicitly declined crawling via robots.txt and other network-level rules, by hiding its identity and engaging in obscured crawling activity.
The company's researchers said they observed Perplexity using multiple user agents, including one impersonating Google Chrome on macOS, as well as rotating IP addresses and ASNs to evade detection.
Alarmingly, Cloudflare has detected millions of daily requests across tens of thousands of domains, highlighting the scale of illegitimate scraping by one of the biggest players in the space.
Perplexity is scraping sites it shouldn't be
According to Cloudflare's analysis, in many cases Perplexity either ignored robots.txt or did not fetch it at all. A robots.txt file is a plain text file placed at the root of a site to tell automated agents (such as search engines, crawlers and link checkers) which URLs may or may not be retrieved.
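To illustrate how a well-behaved crawler consults robots.txt before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the bot names `ExampleBot` and `OtherBot`, and the `example.com` URLs are hypothetical and are not taken from Cloudflare's report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named bot entirely and keeps
# every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A compliant crawler performs this check before every request.
for agent, url in [
    ("ExampleBot", "https://example.com/index.html"),
    ("OtherBot", "https://example.com/index.html"),
    ("OtherBot", "https://example.com/private/report.html"),
]:
    print(agent, url, may_fetch(parser, agent, url))
```

The check only works if the crawler identifies itself honestly: robots.txt rules are matched against the user agent the bot declares, which is why switching to an undeclared or impersonated user agent sidesteps them.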
Moreover, Perplexity also tried to access test websites created by Cloudflare, even though they were blocked via robots.txt and not publicly discoverable, while using undeclared bots that were not even associated with its official IP range.
“Although Perplexity initially crawls from their declared user agent, when they are presented with a network block, they appear to obscure their crawling identity in an attempt to circumvent the website’s preferences,” the researchers wrote.
In response to its findings, Cloudflare has removed Perplexity's bots from its list of verified bots. The company has also added new heuristics to its managed rules to detect and block the stealth crawling.
OpenAI's bots, by contrast, have so far respected robots.txt and block pages, using transparent identifiers and documented behavior to obtain information.
Perplexity denied wrongdoing, calling Cloudflare's post a “sales pitch” and adding that the identified bots were not even its own. TechRadar Pro has asked Perplexity for comment.
Cloudflare urges bot operators to respect website preferences by being transparent, being well-behaved netizens, serving a clear purpose, using separate bots for separate activities, and following rules and signals such as robots.txt.