Hosting providers are asked to block OpenAI's GPTBot search bot

Brother

Media outlets report that the Federal State Unitary Enterprise “Main Radio Frequency Center” (GRFC), which is subordinate to Roskomnadzor, has sent Russian hosting providers a letter on identifying the GPTBot search robot. The agency recommends blocking the bot from viewing and analyzing web pages in order to “prevent the collection of information about critical resource vulnerabilities” within the companies' area of responsibility.

Journalists from Kommersant reviewed the text of the letter, along with the recommendations for identifying and blocking GPTBot, which the agency sent to a number of hosting providers on December 11. Representatives of the GRFC confirmed that the letter was sent.

The letter states that providers need to assess the risk of the bot collecting information about resource vulnerabilities or “other sensitive information, including those containing personal data.” If such risks are identified, the GRFC says the bot's requests should be blocked. The agency also sent instructions on how to do this.

OpenAI launched its crawler in August 2023 to collect open data from the web, which it later uses to improve and train ChatGPT. Shortly after the launch, the company came under criticism for unauthorized data collection, after which OpenAI published instructions on how to block the bot's access to a site or give it access to only part of the site.
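
For context, OpenAI's published instructions rely on robots.txt rules addressed to the `GPTBot` user agent. Below is a minimal sketch of how a site owner might check such rules before deploying them, using Python's standard `urllib.robotparser`. The robots.txt content, the `/public/` path, and the `example.com` URLs are assumptions for illustration only; they are not taken from the GRFC letter.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot may read only /public/,
# while every other crawler is left unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /public/
Disallow: /

User-agent: *
Disallow:
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("GPTBot", "https://example.com/private/report"),
        ("GPTBot", "https://example.com/public/page"),
        ("Googlebot", "https://example.com/private/report"),
    ]:
        print(agent, url, can_fetch(agent, url))
```

Note that robots.txt is only a request: it works for crawlers that choose to honor it, and it does not stop a bot that ignores the file.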

It is worth noting that Qrator Labs analysts recently published statistics on bot activity during sales periods. The report noted that the number of GPTBot requests to Russian resources has reached a record level, and that the bot crawls web resource locations and APIs at the highest possible speed.

“According to our observations, many market players have not yet updated their configurations, so GPTBot requests, if they are not blocked by security solutions, can cause serious parasitic load and increased consumption of server capacity. For a number of large online stores, the share of GPTBot requests in the mass of all bot requests reaches 90%,” wrote Qrator Labs experts.
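
To see whether a site is affected in this way, the share of GPTBot among bot requests can be estimated from the User-Agent values in an access log. This is a rough sketch, not Qrator Labs' methodology: the `BOT_MARKERS` heuristic for deciding what counts as a bot and the function name `gptbot_share` are assumptions made for illustration.

```python
from collections import Counter

# Assumption: a request counts as "bot" traffic if its User-Agent
# contains one of these substrings. Real classification is harder.
BOT_MARKERS = ("bot", "crawler", "spider")


def gptbot_share(user_agents):
    """Return the fraction of bot requests whose User-Agent names GPTBot."""
    counts = Counter()
    for ua in user_agents:
        ua = ua.lower()
        if not any(marker in ua for marker in BOT_MARKERS):
            continue  # looks like a regular browser, skip it
        counts["gptbot" if "gptbot" in ua else "other"] += 1
    total = counts["gptbot"] + counts["other"]
    return counts["gptbot"] / total if total else 0.0


if __name__ == "__main__":
    sample = (
        ["Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"] * 9
        + ["Mozilla/5.0 (compatible; Googlebot/2.1)"]
        + ["Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"] * 5
    )
    print(f"GPTBot share of bot requests: {gptbot_share(sample):.0%}")
```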

As Georgy Tarasov, Qrator.AntiBot product manager at Qrator Labs, has now told reporters, GPTBot is on the whole as well-behaved as a crawler can be: it always “introduces itself” and declares compliance with the access rules for bots set by resource owners.
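
Because the bot “introduces itself”, a server can also refuse its requests outright by matching the `GPTBot` token in the User-Agent header. The following is a minimal WSGI middleware sketch under that assumption. The names `is_blocked_agent` and `block_bots` are hypothetical, and this is not the procedure from the GRFC instructions, whose contents are not given here.

```python
# Assumption: the crawler's User-Agent header contains the token "GPTBot".
BLOCKED_TOKENS = ("gptbot",)


def is_blocked_agent(user_agent):
    """Return True if the User-Agent string names a blocked crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BLOCKED_TOKENS)


def block_bots(app):
    """Wrap a WSGI app so that blocked crawlers receive 403 Forbidden."""
    def middleware(environ, start_response):
        if is_blocked_agent(environ.get("HTTP_USER_AGENT")):
            start_response(
                "403 Forbidden",
                [("Content-Type", "text/plain; charset=utf-8")],
            )
            return [b"Forbidden\n"]
        # Everyone else is passed through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

This only catches bots that are honest about their identity. As Tarasov's remarks below suggest, a crawler disguised as a regular user would pass straight through such a filter.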

“If companies in RuNet and on the global Internet continue to deny access to GPTBot and other AI search robots at the same pace as now, then AI/ML businesses will have to resort to other methods of collecting relevant data,” notes Tarasov. “For example, disguising bots as legitimate users and purchasing aggregated data from bot farm owners, and this already falls into the category of unwanted bot attacks.”

In turn, the head of the hosting provider RUVDS, Nikita Tsaplin, told the publication that AI bots can be used not only for peaceful purposes, “but also serve the interests of cybercriminals.” According to him, tools for hacking, phishing and ensuring the operation of darknet sites are already being created on the basis of such solutions.

“They [bots] are becoming more efficient and therefore more dangerous. Of course, distinguishing a bot from a regular user is not always easy, but in general this work needs to be done. We regard the initiative [of Roskomnadzor] as sound, and we can advise all webmasters to use it,” says Tsaplin.