6502.org Forum

All times are UTC




PostPosted: Mon Jan 29, 2024 10:47 pm 
Joined: Fri Dec 21, 2018 1:05 am
Posts: 1076
Location: Albuquerque NM USA
Yesterday morning when I was adding a new post to my VGA controller topic in the Hardware section, I noticed the "views" count was huge, 53008. That topic was mostly me talking to myself, so I thought the views count was big because it was an old topic started March 2021. Anyway, I wrote down the number, 53008 views. 30 hours later, the view count is 54557! 1500 views in 30 hours? Really? Do we have internet bots snooping like crazy? What does the views count really mean?
Bill
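
For scale, the arithmetic on those two readings can be worked through as a quick sketch. It uses only the figures quoted in the post; the variable names are just for illustration.

```python
# Figures quoted in the post above.
views_before = 53008
views_after = 54557
hours_elapsed = 30

new_views = views_after - views_before          # views gained between the two readings
views_per_hour = new_views / hours_elapsed      # average rate over the interval
minutes_between_views = 60 / views_per_hour     # average gap between counted views

print(new_views)                        # 1549
print(round(views_per_hour, 1))         # 51.6
print(round(minutes_between_views, 2))  # 1.16
```

So the "1500 views" is 1549 to be exact, or roughly one counted view every 70 seconds around the clock.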


PostPosted: Tue Jan 30, 2024 12:31 am 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8147
Location: Midwestern USA
plasmo wrote:
Yesterday morning when I was adding a new post to my VGA controller topic in the Hardware section, I noticed the "views" count was huge, 53008.  That topic was mostly me talking to myself, so I thought the views count was big because it was an old topic started March 2021.  Anyway, I wrote down the number, 53008 views.  30 hours later, the view count is 54557!  1500 views in 30 hours?  Really?  Do we have internet bots snooping like crazy?  What does the views count really mean?
Bill

I’ve noticed that my POC V1 topic now has several million views.  That is clearly the work of bots and may indicate a missing or improperly-defined robots.txt file in the site’s document root.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Tue Jan 30, 2024 8:58 am 
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1399
Location: Scotland
This site has no robots.txt file.

Not that that matters today - robots will search whether you ask them to or not.
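
To illustrate why robots.txt is only advisory, here is a minimal sketch of the check a well-behaved crawler performs on itself, using Python's standard `urllib.robotparser`. The robots.txt contents, the paths and the `example.org` URLs are all made up for illustration; they are not this site's layout, and the helper name is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (assumption: these paths are invented).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/search.php
Disallow: /forum/memberlist.php
"""


def polite_crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what a rule-following crawler would decide for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A polite crawler skips the disallowed page and fetches the rest.
print(polite_crawler_may_fetch(
    EXAMPLE_ROBOTS_TXT, "AnyBot", "http://example.org/forum/search.php"))
print(polite_crawler_may_fetch(
    EXAMPLE_ROBOTS_TXT, "AnyBot", "http://example.org/forum/viewtopic.php?t=1"))

# With no rules at all (the situation described above), everything is allowed.
print(polite_crawler_may_fetch("", "AnyBot", "http://example.org/forum/search.php"))
```

The point is that the check runs inside the crawler. Nothing on the server enforces it, so a scraper that never calls it is not slowed down at all.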

The searches/scrapes are used for many purposes, from the simple seeding of search engines to the more dodgy scraping for personal data. The more data people have, the better they can target advertising at you, or use it to launch phishing attacks on you once they have built up your profile of forum names, real names, addresses and so on. They can do this by scraping *everything* they can.

Some forums even have "legit" search engine logins.

It's then up to the forum owners to do something about this, as we end-users can't. Firewalling (not always possible with cheap hosting packages) or more....

But really - we lost the battle some time back by simply walking right into it.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


PostPosted: Tue Jan 30, 2024 10:08 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
In some forum software, and I think this applies here, the admin can declare a definition of "bot" such that those sessions don't take up as many resources as they might normally do. You'd see in the footer of the page this kind of thing:
Quote:
Users browsing this forum: BigEd and 3 guests
or at the footer of the front page
Quote:
Registered users: BigEd, gilhad, Google [Bot]

Of course, the identification of a bot (usually from user agent) is no guarantee, but over on anycpu this does help keep resource usage down. We have some 50-ish bots declared, and 30+ of them have visited in the past month.
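
As a rough sketch of what user-agent based identification amounts to (not the forum software's actual code), the check is essentially a substring match against a declared list. The table below is a hypothetical two-entry list; only the "Google [Bot]" label comes from the footer quoted above.

```python
from typing import Optional

# Hypothetical bot list: User-Agent substring -> display name.
BOT_SIGNATURES = {
    "googlebot": "Google [Bot]",
    "bingbot": "Bing [Bot]",
}


def identify_bot(user_agent: str) -> Optional[str]:
    """Return the declared bot name if the User-Agent matches, else None."""
    ua = user_agent.lower()
    for needle, name in BOT_SIGNATURES.items():
        if needle in ua:
            return name
    return None


print(identify_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(identify_bot("Mozilla/5.0 (X11; Linux x86_64) Firefox/122.0"))
```

This also shows why it is no guarantee: the header is supplied by the client, so a scraper that sends a browser-like string is counted as an ordinary guest.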

Sometimes it's the case that someone somewhere is trying to fetch the entire forum. This usually works poorly because every page is cross-linked many times, with each post having a URL, each thread, each page of each thread, each user and each sub-forum. There are also links to next and previous threads. You need a smart crawler to fetch a forum efficiently, whereas anyone can have the (not so) bright idea to just do a recursive fetch.
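
One small part of what a smarter crawler does differently is collapsing the many URLs that lead to the same page before fetching any of them. The sketch below assumes phpBB-style links with a session parameter named `sid` and a `#p...` fragment for individual posts; the URLs and the ignored-parameter list are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that change the URL but not the page that is served.
IGNORED_PARAMS = {"sid"}


def canonical_url(url: str) -> str:
    """Drop the fragment and ignored parameters, and sort what remains."""
    parts = urlsplit(url)
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), ""))


# Three hypothetical links that a naive recursive fetch would treat as three pages.
links = [
    "http://example.org/forum/viewtopic.php?f=4&t=100",
    "http://example.org/forum/viewtopic.php?t=100&f=4&sid=abc",
    "http://example.org/forum/viewtopic.php?f=4&t=100#p555",
]
unique_pages = {canonical_url(link) for link in links}
print(len(links), len(unique_pages))  # 3 1
```

A recursive fetch without this step requests the same topic page once per distinct link, and each of those requests may be counted as another view.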

A thread with many posts will, in my experience, accumulate views ever-faster. Perhaps a reason not to have mega-threads too often.

And then there are sites like Hackaday, Hacker News, Reddit and other forums, which have large audiences and can easily cause a massive burst of traffic, usually for a short time but with a long tail.

None of these are any sort of problem, so long as the server has the resources and the costs can be covered - we should be confident that Mike is on top of all that.

Certainly we should fully expect that everything we post here is public, is accessible, and will probably find its way into databases, private or commercial or otherwise. And into the current crop of large language models, for sure. Again, I can't see this as a problem, because we always post publicly, and knowingly so.

Edit: having said that, for my own amusement I try to keep an eye on the max-concurrent-users-ever statistic found on the front page. I see that it did recently take a big leap:
from "Most users ever online was 761 on Sat Dec 19, 2020 1:05 am"
to "Most users ever online was 1425 on Mon Jan 29, 2024 11:53 pm"


PostPosted: Tue Jan 30, 2024 4:50 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
As another datapoint, I see that Michael's thread "Using a 74HC151 as an address decoder?", which has only 3 replies and is only a couple of weeks old, has 145k views already. Possibly it's because it is an interesting or useful topic, people have shared links, and it has been well-read.

