Fijxu@programming.dev to

Privacy@lemmy.ml · 1 year ago

Does Google use Google Chrome users to discover new unindexed pages?

This is not a long post, but I wanted to post this somewhere. This may be useful if someone is doing an article about Google or something like that.

While I was changing some things in my server configuration, a user accessed a public folder on my site. I was watching the access logs at the time, and everything looked completely normal until, 10 SECONDS AFTER the user's request, a request from a Google IP address with the user-agent Googlebot/2.1; +http://www.google.com/bot.html hit the same public folder. Then I noticed that the user-agent of the user who had accessed that folder was Chrome/131.0.0.0.
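For anyone who wants to check their own logs for the same pattern, here is a minimal sketch. It assumes the access log is in the common "combined" format used by nginx and Apache (my real log format may differ), and the names `parse_line` and `googlebot_followups` are made up for this example. It reports every Googlebot request that hits a path within a short window after a Chrome request to that same path.

```python
import re
from datetime import datetime, timedelta

# Assumed "combined" log format:
# ip - - [time] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {"ip": m["ip"], "time": when, "path": m["path"], "agent": m["agent"]}


def googlebot_followups(lines, window_seconds=30):
    """Find Googlebot hits that follow a Chrome hit on the same path.

    Returns a list of (path, chrome_time, googlebot_time) tuples.
    Lines are expected in chronological order, as in a normal access log.
    """
    window = timedelta(seconds=window_seconds)
    last_chrome_hit = {}  # path -> time of the most recent Chrome request
    matches = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        # Check Googlebot first: its user-agent can also contain "Chrome/".
        if "Googlebot" in entry["agent"]:
            seen = last_chrome_hit.get(entry["path"])
            if seen is not None and timedelta(0) <= entry["time"] - seen <= window:
                matches.append((entry["path"], seen, entry["time"]))
        # Simplification: other Chromium-based browsers also send "Chrome/".
        elif "Chrome/" in entry["agent"]:
            last_chrome_hit[entry["path"]] = entry["time"]
    return matches
```

Running `googlebot_followups(open("access.log"))` (the file name is a placeholder) over a longer period would show whether this was a one-off coincidence or something that happens repeatedly. A single match proves nothing by itself.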

I have a subdomain, and some folders on that subdomain are actually indexed by the Google search engine, but that specific public folder doesn't appear to be indexed at all and doesn't show up in searches.

Could it be that Google uses Google Chrome users to discover unindexed paths on the internet and add them to its index?

I know it doesn’t sound very shocking, because most people here know that Google Chrome is a privacy nightmare and should be avoided at all times, but I have never seen this type of behavior mentioned in articles about “why you should avoid Google Chrome” or similar.

I’m not against anyone scraping the page either, since it’s public anyway, but the fact that they seem to discover new pages on the internet by making use of Google Chrome surprised me a little.

Edit: Fixed a typo

Fijxu@programming.dev (OP) · 1 year ago
Nope, it's just a file indexer that I host publicly. I don’t mind sharing the URL to provide more context.

The user accessed https://luna.nadeko.net/Movies/Ch3k0p3t3/ with Google Chrome.

And 10 seconds later, Googlebot scraped the folder.

Simple as that. I don’t have privacy-invasive trackers on any of my webpages/services.
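Since anyone can send a Googlebot user-agent, it is worth confirming that the follow-up hit really came from Google. A minimal sketch of the reverse-then-forward DNS check that Google describes for verifying its crawlers is below. The function name `is_verified_googlebot` and the suffix list are my own choices for this example, and the lookups use Python's standard `socket` module, which only covers IPv4 here.

```python
import socket

# Hostname suffixes assumed for Google crawlers; adjust if Google's
# published guidance lists others.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(
    ip,
    reverse_lookup=socket.gethostbyaddr,
    forward_lookup=socket.gethostbyname_ex,
):
    """Return True if `ip` passes a reverse + forward DNS check.

    The lookup functions are parameters so they can be swapped out
    (for example in tests) without touching the network.
    """
    # Step 1: reverse DNS, IP -> hostname.
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False
    if not hostname.endswith(GOOGLE_CRAWLER_SUFFIXES):
        return False
    # Step 2: forward DNS, hostname -> IPs, which must include the original IP.
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

Calling `is_verified_googlebot(ip)` with the IP address from the Googlebot log line would tell you whether the request was a real crawler hit or just something borrowing the user-agent string.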
