Ongoing Problem with Bots / Reading Forum when Not Logged In

Posted: Thu Jun 26, 2025 3:12 pm
by Mike Naberezny
Yesterday, the forum was unusable for several hours. You may notice on the forum's index page that we've hit a new record of "users":
Quote:
Most users ever online was 3457 on 25 Jun 2025 10:03 pm
All of these were bots that appeared within about 30 minutes and then continued hammering the forum continuously for hours until I blocked their IP addresses. The IPs were from all over the world and there doesn't seem to be any rhyme or reason to what they were requesting.

Unfortunately, a few hours after blocking them, a new wave appeared from different IP addresses. This brought the forum down again. As a stopgap measure, I've been forced to temporarily disable read access to the forum when not logged in. Whenever you load a page on the forum, some resources are consumed as the forum software fetches the posts from the database and formats them for display in your browser. The forum does do some caching but even with this, when thousands of bots request pages relentlessly, it overwhelms the server. The bots are "reading" the forum anonymously, so temporarily disabling read access for anonymous users has made the forum usable again, even though there are about a thousand bots hitting it right now as I type this.

I do not want to require logging in to read the forum since there's so much helpful content that users often find using search engines. However, at this moment, I've been forced to do this until we have a solution. There are various DDoS protection services and solutions intended to block AI that could be used to help. Once we get something in place that can get these bots under control, I will be able to open the forum back up for anonymous readers.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Thu Jun 26, 2025 3:27 pm
by BigEd
Thanks Mike! Much appreciate you fighting the fight for us. A pity it's needed but that's the present reality.

The hpmuseum.org forum went through a similar thing quite recently, with a similar tactic, and did in due course return to normal service.

See for example this post and nearby threads.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Thu Jun 26, 2025 3:38 pm
by L0uis.m
Thanks for all the effort, Mike!

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Thu Jun 26, 2025 9:13 pm
by BigDumbDinosaur
Thank you, Mike, for the effort.  Being the admin of several sites, and the systems on which they are hosted, I can well appreciate the work and frustration that’s involved.

I’ve had bots hitting both my private and business websites, although the effects on the business site have been minimal.  On the other hand, traffic on my sbc.steggy.net site got to where the bandwidth consumption was affecting everything else running on the same server.

Like what has been happening to 6502.org, AI bots seem to account for this explosion in Internet traffic.  In examining the logs on the server hosting my web sites, I determined about 90 percent of the connections are originating from the IP block assigned to the People’s “Republic” of China (PRC).  I have tried to counteract that by setting a connection limit in the Apache server software, which seems to have helped somewhat, although the average packet count per second is still quite high.  The problem with connection limiting is it is global in effect—all sites hosted on the server are throttled, so is ultimately a poor solution (BTW, sbc.steggy.net appears to get a fair amount of legitimate daily traffic, which was unexpected :)).

I’ve concluded the only practical solution is to enable packet filtration on all traffic from the PRC IP block—inbound packets will be discarded.  I don’t do business with anyone in PRC except JLCPCB (whose IP block would be excepted from packet filtration), so I’m not seeing any downside to totally blocking PRC access.
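
As a rough illustration of the block-with-exception logic described above (not the poster's actual firewall configuration), here is a minimal Python sketch. The network ranges are hypothetical placeholders taken from the reserved documentation blocks, and `should_drop` is an illustrative name; a real packet filter would apply the same decision in the kernel or firewall rather than in application code.

```python
import ipaddress

# Hypothetical placeholder ranges (reserved documentation blocks).
# A real country-level list is far longer and comes from registry data.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

# Hypothetical exception carved out of a blocked range,
# standing in for a trusted supplier's netblock.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.64/28"),
]


def should_drop(source_ip: str) -> bool:
    """Return True if a packet from source_ip should be discarded."""
    addr = ipaddress.ip_address(source_ip)
    # Exceptions are checked first so they override the broader block.
    if any(addr in net for net in ALLOWED_NETWORKS):
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Checking the exception list before the block list is what lets a small trusted range stay reachable inside a much larger blocked one.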

BTW, I occasionally get DoS attacks on my company mail server.  Most of those are from IP addresses in Russia.  The mail server has been configured to refuse such connections, since I know of no one in that part of the world who would have a reason to communicate with me.

<rant>
It’s really a shame that I or anyone else would have to resort to draconian measures to maintain system and site integrity.  The Internet has the potential to be a great equalizer in enabling universal access to knowledge, as well as a convenient way to internationally communicate and help break down barriers to mutual understanding.

Focusing on the PRC, their government doesn’t want that sort of thing, as too much information in the hands of the average Chinese citizen is potentially dangerous to the government’s policy of totalitarian control of the population.  My opinion, for what it’s worth, is DDoS attacks on websites and mail servers outside of the PRC are carried out with the intent of forcing website and mail server administrators to block access, thus denying Chinese citizens easy access to outside information and thinking that might lead to a weakening of the government’s iron-fisted control.
</rant>

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 12:34 am
by Yuri
BigDumbDinosaur wrote:
... My opinion, for what it’s worth, is DDoS attacks on websites and mail servers outside of the PRC are carried out with the intent of forcing website and mail server administrators to block access...
I suppose that's one way to look at it, but I don't think a website that predominately talks about 40+ year old technology is high on the PRC's hit list; they basically have a firewall around the country's access points, and that would be a much easier way for them to deny access to this site if they deemed it necessary.

More likely it's some troll who just gets their kicks by knocking sites offline for the fun of making others miserable. That's been my experience, at least, having run an IRC network for the last couple of decades.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 4:36 am
by BigDumbDinosaur
Yuri wrote:
BigDumbDinosaur wrote:
... My opinion, for what it’s worth, is DDoS attacks on websites and mail servers outside of the PRC are carried out with the intent of forcing website and mail server administrators to block access...
I suppose that's one way to look at it, but I don't think a website that predominately talks about 40+ year old technology is high on the PRC's hit list; they basically have a firewall around the country's access points, and that would be a much easier way for them to deny access to this site if they deemed it necessary.

Yet, most of the activity I’ve seen logged on the server hosting my homebrew computer site is coming out of the PRC netblock.  Are you suggesting PRC has a lot of trolls with nothing better to do than mount DDoS attacks against hobby websites such as this one or mine?

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 3:26 pm
by teamtempest
Thank you Mike!

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 3:42 pm
by Yuri
BigDumbDinosaur wrote:
Yuri wrote:
BigDumbDinosaur wrote:
... My opinion, for what it’s worth, is DDoS attacks on websites and mail servers outside of the PRC are carried out with the intent of forcing website and mail server administrators to block access...
I suppose that's one way to look at it, but I don't think a website that predominately talks about 40+ year old technology is high on the PRC's hit list; they basically have a firewall around the country's access points, and that would be a much easier way for them to deny access to this site if they deemed it necessary.

Yet, most of the activity I’ve seen logged on the server hosting my homebrew computer site is coming out of the PRC netblock.  Are you suggesting PRC has a lot of trolls with nothing better to do than mount DDoS attacks against hobby websites such as this one or mine?
Not at all, actually. More likely what you're seeing there is one of the many botnet-for-hire situations. Many of those botnets take root in places where people are lax in updating the security of their machines, or cannot receive regular security updates for one reason or another. Or their countries have decided to roll their own solutions with little to no oversight from outside perspectives and have a vested interest in NOT reporting security flaws.

So I don't think it's the PRC taking any direct interest in this site as much as taking more of a direct interest in not wanting to rely on technology they cannot control with an iron fist.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 4:14 pm
by jgharston
I had the same problem last month with the Wiki I maintain. Swamped with requests from essentially random IP addresses. I temporarily turned it off for a week (every access gave 403 Forbidden) while I investigated a fix.

I noticed that all the bot fetches had more than two query strings - so, if (count($QUERY)>2) return(403).
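
The check above is written as PHP-style shorthand. A minimal Python sketch of the same idea follows, assuming that "query strings" means the number of parameters in the URL's query part. The function name and the 200 fallback are illustrative and are not taken from the wiki's real code.

```python
from urllib.parse import parse_qsl, urlsplit

# Threshold taken from the post: more than two parameters is treated as a bot.
MAX_QUERY_PARAMS = 2


def status_for_request(url: str) -> int:
    """Return 403 when the URL carries more than MAX_QUERY_PARAMS parameters."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "x=" still counts as a parameter.
    params = parse_qsl(query, keep_blank_values=True)
    return 403 if len(params) > MAX_QUERY_PARAMS else 200
```

For example, `status_for_request("/wiki/index.php?title=Foo&action=edit")` passes, while a third parameter on the same URL is refused.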

The second thing I did was I found something called "blackhole". On noticing that almost all bots ignore robots.txt and nofollow directives, there's a display="invisible" nofollow link to a blackhole directory which is also blacklisted in robots.txt. Anything that does access the blackhole directory gets a 403 response, and added to a list of banned IPs which is checked by the main home page.
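
A minimal Python sketch of that trap logic, under stated assumptions: the `/blackhole/` path and the class name are hypothetical, the ban list lives in memory rather than in a file, and the list is consulted on every request rather than only by the home page. It is not the "blackhole" package's actual code.

```python
# Hypothetical trap path: linked invisibly with nofollow from real pages
# and listed under Disallow in robots.txt, so well-behaved crawlers skip it.
BLACKHOLE_PREFIX = "/blackhole/"


class BlackholeFilter:
    """Bans any client that requests the trap path."""

    def __init__(self) -> None:
        self.banned_ips: set[str] = set()

    def handle(self, client_ip: str, path: str) -> int:
        """Return the HTTP status this request should receive."""
        if path.startswith(BLACKHOLE_PREFIX):
            # Only a client ignoring robots.txt and nofollow ends up here.
            self.banned_ips.add(client_ip)
            return 403
        if client_ip in self.banned_ips:
            return 403
        return 200
```

Once an address has touched the trap, every later request from it is refused, while other clients are unaffected.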

It seems to be working so far. I didn't want to do things like captcha checks or these newfangled make-the-browser-do-some-arithmetic things (eg: see SABRE), as on most of the machines I use the browser fails to validate them.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 8:41 pm
by 6502inside
My bet is these are AI bots crawling for LLMs. It's not an intentional DDoS attack; it's the next AI gold rush, and they don't care if they knock over other people's machines in the process. I got slammed with them on my own site until I started cracking down on entire IP blocks. However, I did this at the IP filter level so they don't even get to the webserver to be booted.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Fri Jun 27, 2025 10:51 pm
by GARTHWILSON
6502inside wrote:
I got slammed with them on my own site until I started cracking down on entire IP blocks. However, I did this at the IP filter level so they don't even get to the webserver to be booted.
I just tried clicking on the link to your site that's in your signature line, and after a long time, got the message, "The connection has timed out."  You say you're in sunny SoCal.  So am I.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Sat Jun 28, 2025 2:40 am
by BigDumbDinosaur
GARTHWILSON wrote:
6502inside wrote:
I got slammed with them on my own site until I started cracking down on entire IP blocks. However, I did this at the IP filter level so they don't even get to the webserver to be booted.
I just tried clicking on the link to your site that's in your signature line, and after a long time, got the message, "The connection has timed out."  You say you're in sunny SoCal.  So am I.
I was able to connect to his site.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Sat Jun 28, 2025 2:50 am
by GARTHWILSON
I just tried again.  Same thing.  Timed out.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Sat Jun 28, 2025 4:43 am
by barnacle
No problem from Germany this morning.

Re: Ongoing Problem with Bots / Reading Forum when Not Logge

Posted: Sat Jun 28, 2025 5:53 pm
by 6502inside
GARTHWILSON wrote:
6502inside wrote:
I got slammed with them on my own site until I started cracking down on entire IP blocks. However, I did this at the IP filter level so they don't even get to the webserver to be booted.
I just tried clicking on the link to your site that's in your signature line, and after a long time, got the message, "The connection has timed out."  You say you're in sunny SoCal.  So am I.
If you can PM me the IP (or IP range) you're connecting from, I'll check the filter. I'd obviously like to avoid false positives. The hardware is in So Cal, though it routes through a VPN which acts as the front end.