Ask The Tropers

Have a question about how the TVTropes wiki works? No one knows this community better than the people in it, so ask away! Ask the Tropers is the page you come to when you have a question burning in your brain and the support pages didn't help. It's not for everything, though. For a list of all the resources for your questions, click here. You can also go to this Directory thread for ongoing cleanup projects.

Fighteer MOD (Time Abyss)
2019-06-01 16:38:56

Any scraping algorithm or program would be blocked by our software, even if it's for the purpose you stated.

"It's Occam's Shuriken! If the answer is elusive, never rule out ninjas!"
RamenChef Since: Dec, 2017
2019-06-01 17:33:45

I'm confused. How would this algorithm tell apart a (sufficiently advanced) scraper from a browser or web spider?

This wouldn't be using a generic User-Agent header. This would be a very specialized header that I'm fairly certain will be unique to this particular script. If necessary, I can also slow it down as much as I like, so speed isn't an issue.
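A minimal sketch of what that setup might look like, assuming Python's standard library. The User-Agent string, contact address, delay, and the `Throttle` and `build_request` names are hypothetical placeholders, not details of the actual script, and nothing like this should be run against the site without permission.

```python
import time
import urllib.request

# Hypothetical identifying header; not a value taken from this thread.
USER_AGENT = "ExampleTropeScript/0.1 (contact: someone@example.com)"
# Assumed minimum pause between requests; can be raised as far as needed.
MIN_DELAY_SECONDS = 5.0


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_delay, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_request(url):
    """Build a request object that carries the script-specific User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

Calling `Throttle(MIN_DELAY_SECONDS).wait()` before each request would keep the pace as slow as required, and `build_request` only constructs the request object; it does not send anything.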

sgamer82 Since: Jan, 2001
2019-06-01 17:54:06

"I'm confused. How would this algorithm tell apart a (sufficiently advanced) scraper from a browser or web spider?"
If I'm understanding correctly, the answer to this question is "it wouldn't". Fighteer is saying that the site's software will block, or at least try to block, any scraper like the one you're suggesting, no matter what it's for.

RamenChef Since: Dec, 2017
2019-06-02 14:09:43

I just checked with a test script, and it does in fact appear to be blocking based on the User-Agent header. I would like to stress that setting that header really isn't an issue for me.

Back to the main point of this thread: I would like clearance to use the script I described before I actually start using it, because last I checked, clearance was required for any sort of HTML scraper on this site, though I can't seem to find where that policy is described.

Fighteer MOD (Time Abyss)
2019-06-03 06:17:04

You need permission from the admins to run any scraper. Moderators cannot grant it. Send a message to "The Staff" using our contact form. We cannot offer any promises as to whether (or when) you will get a response.

"It's Occam's Shuriken! If the answer is elusive, never rule out ninjas!"