Geekzone: technology news, blogs, forums


atomjump

3 posts

Wannabe Geek


#318639 6-Feb-2025 09:49

Hi - we are a self-hosted website in New Zealand (AtomJump), and during a five-hour period last night we registered 70K hits from OpenAI's ChatGPT bot, i.e. an average of about 4 requests per second, peaking at maybe 10 requests/second.
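For anyone checking the numbers, the average works out as follows. This is just the arithmetic on the figures quoted above (70K hits over five hours); the variable names are only for illustration.

```python
# Figures taken from the post: roughly 70,000 hits over a five-hour period.
hits = 70_000
hours = 5

# Convert the period to seconds and divide to get the mean request rate.
average_rps = hits / (hours * 3600)

print(f"{average_rps:.2f} requests/second")  # prints "3.89 requests/second"
```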

 

We scrambled and blocked their bot on our robots.txt file. Our service stayed up, but probably only because we have a few load-balanced servers.

 

OpenAI have not got back to us, though they seemed to stop the bot 'attack' after our robots.txt change.  Is anyone else in NZ seeing similar behaviour?


davidcole
6029 posts

Uber Geek

Trusted

  #3339840 6-Feb-2025 11:46

Oh no, it’s started.  It’s become self-aware and decided our websites are the problem.  Next it will be us!!!!  Where’s John Connor when we need him?









atomjump

3 posts

Wannabe Geek


  #3339881 6-Feb-2025 13:15

😀 Curiously, I don't think their legal team (the only sensible contact address I could find on OpenAI's website) got the AI memo. No answer so far.  Maybe the bots have already seen the future and terminated them before they could reply?


marpada
475 posts

Ultimate Geek


  #3339887 6-Feb-2025 13:30

How do you know the requests came from ChatGPT? User Agent header is trivial to spoof.
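One way to get more confidence than the User-Agent string alone is to compare the client IP against whatever address ranges the crawler operator publishes. A minimal sketch, assuming you have such a list: the ranges below are reserved documentation networks used as placeholders, not real crawler ranges, and the function names are made up for this example.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), NOT real crawler ranges.
# Substitute the list published by the crawler operator, if there is one.
PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def ip_in_ranges(client_ip, cidr_ranges):
    """Return True if client_ip falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


def classify(client_ip, user_agent):
    """Label a request based on what it claims to be and where it came from."""
    if "GPTBot" not in user_agent:
        return "not-claimed"
    if ip_in_ranges(client_ip, PUBLISHED_RANGES):
        return "verified"
    # Claims to be the bot but comes from an unlisted address.
    return "possibly-spoofed"


print(classify("192.0.2.10", "Mozilla/5.0 (compatible; GPTBot/1.2)"))
print(classify("203.0.113.5", "Mozilla/5.0 (compatible; GPTBot/1.2)"))
```

A request that claims the bot's name from an unlisted address is only "possibly" spoofed, since a published list can lag behind reality.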




freitasm
BDFL - Memuneh
79250 posts

Uber Geek

Administrator
ID Verified
Trusted
Geekzone
Lifetime subscriber

  #3339888 6-Feb-2025 13:40

atomjump:

 

We scrambled and blocked their bot on our robots.txt file. Our service stayed up, but probably only because we have a few load-balanced servers.

 

 

This has no effect at all. It's well known that most of these bots ignore robots.txt. Very few bots are "good netizens". Also, the robots.txt file is not checked on every request, so it would probably continue doing it.

 

 

OpenAI have not got back to us, though they seemed to stop the bot 'attack' after our robots.txt change.  Is anyone else in NZ seeing similar behaviour?

 

 

You won't hear back from them. They aren't good netizens either.

 

If you are worried you'd need a service like Cloudflare Security. This is the free Bot detection (which also includes blocking RSS feeds, and might impact inbound API requests):

 

 

This is the paid version:

 

 

 

 

 

Or you can define some Web Application Firewall rules for this:

 

 

Again, robots.txt does nothing if they ignore it. And they mostly do.







atomjump

3 posts

Wannabe Geek


  #3339901 6-Feb-2025 15:16

Thanks both. Cloudflare is an interesting option, although we are purposely independent, as an organisation, from any US-based company.  Our router firewall is probably the next best bet.

 

re: spoofing the User-Agent header. Yes, we can't be 100% certain it is them.

 

The user agent was:

 

"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)"

 

On adding the recommended

 

"User-agent: GPTBot
Disallow: /"

 

it did seem to back down within a few hours (unless it simply ran its course of URIs).
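For reference, this is roughly how a crawler that honours robots.txt would read that rule. It is a sketch using Python's standard urllib.robotparser; the example.com URL and the "SomeOtherBot" name are placeholders. It only shows how a compliant client interprets the file and does nothing to stop a bot that ignores it.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post, laid out one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if a compliant crawler with this product token may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Pass the crawler's product token (e.g. "GPTBot"), not the full header string.
print(allowed(ROBOTS_TXT, "GPTBot", "https://example.com/page"))        # False
print(allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page"))  # True
```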

 

Another theory is malicious use of their API: https://www.theregister.com/2025/01/19/openais_chatgpt_crawler_vulnerability/

 

 


philipnewmannz80
2 posts

Wannabe Geek

Trusted

  #3366587 22-Apr-2025 18:57

Late to the party, but I was looking to see what others were doing without using a 3rd-party service and came across a few Apache Mod_Security rules to rate limit different types of bot traffic. You will find 99% of the bot traffic comes through HTTP/1.1, while normal website users will mostly be on HTTP/2.
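The rules themselves aren't reproduced here, but the general idea can be illustrated in a few lines of Python. This is not the Mod_Security configuration, just a sketch of a sliding-window limiter that is stricter for HTTP/1.1 clients; the limits, window length and class name are made-up values for the example.

```python
from collections import defaultdict, deque

# Hypothetical limits: requests allowed per window, by protocol.
LIMITS = {"HTTP/1.1": 5, "HTTP/2": 20}
WINDOW_SECONDS = 10.0


class SlidingWindowLimiter:
    def __init__(self, limits=LIMITS, window=WINDOW_SECONDS):
        self.limits = limits
        self.window = window
        self.hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip, protocol, now):
        """Return True if this request is within the client's limit."""
        recent = self.hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        # Unknown protocols get the strictest configured limit.
        limit = self.limits.get(protocol, min(self.limits.values()))
        if len(recent) >= limit:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter()
# Seven requests in under a second over HTTP/1.1: the first five pass.
print([limiter.allow("192.0.2.1", "HTTP/1.1", t * 0.1) for t in range(7)])
```

Keying on the protocol version is a heuristic, so legitimate HTTP/1.1 clients (older browsers, some API consumers) would share the stricter limit.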


Behodar
10501 posts

Uber Geek

Trusted
Lifetime subscriber

  #3366589 22-Apr-2025 19:18

I've also seen a trick of putting a "honeypot" URL in robots.txt (and nowhere else): if a given client hits that disallowed URL then you can automatically block them at the firewall. You'd probably want to set a reasonable timeout that auto-unblocks them again in case the IP address gets reallocated to a legitimate user.
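A rough sketch of that logic, assuming an in-memory table standing in for the firewall: the honeypot path, the timeout and the class name are all hypothetical, and a real setup would push the block to the firewall rather than keep it in a dict.

```python
import time

HONEYPOT_PATH = "/secret-trap/"  # hypothetical; listed only as a Disallow in robots.txt
BLOCK_SECONDS = 24 * 60 * 60     # hypothetical timeout before auto-unblocking


class HoneypotBlocklist:
    def __init__(self, honeypot_path=HONEYPOT_PATH, block_seconds=BLOCK_SECONDS):
        self.honeypot_path = honeypot_path
        self.block_seconds = block_seconds
        self.blocked_until = {}  # client IP -> time at which the block expires

    def is_blocked(self, client_ip, now=None):
        now = time.time() if now is None else now
        expiry = self.blocked_until.get(client_ip)
        if expiry is None:
            return False
        if now >= expiry:
            # Block has timed out: forget it so a reallocated IP gets a clean slate.
            del self.blocked_until[client_ip]
            return False
        return True

    def handle(self, client_ip, path, now=None):
        """Return "blocked" or "ok" for a request, blocking honeypot visitors."""
        now = time.time() if now is None else now
        if self.is_blocked(client_ip, now):
            return "blocked"
        if path.startswith(self.honeypot_path):
            self.blocked_until[client_ip] = now + self.block_seconds
            return "blocked"
        return "ok"


blocklist = HoneypotBlocklist(block_seconds=60)
print(blocklist.handle("192.0.2.1", "/secret-trap/page", now=0))  # blocked
print(blocklist.handle("192.0.2.1", "/index.html", now=30))       # blocked
print(blocklist.handle("192.0.2.1", "/index.html", now=61))       # ok
```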

