Friday, June 28, 2013

Snowden Petition Blocked From Google? Like All Petitions, It Won’t Be When It Gets Enough Signatures

Search for “edward snowden petition” to find the petition filed through the White House’s petition site, and you’ll see something odd. The petition has no description, because the White House won’t let Google crawl the page. But it’s not a move against Snowden, as some might think. It’s part of how the White House petition site has worked with search engines for some time.


Acquired noted the oddity this week: the page is listed, but without a description, because the page is officially blocked from Google and other search engines such as Bing.
How can a page that’s blocked still be listed? This is what’s known as a “link only” listing, where Google guesses what the page is about from other pages linking to it and uses that to form a title. But it can’t generate a description or gather any information from the page itself, because the page is blocked and Google cannot access its content.
In fact, all pages on the White House petition web site are blocked like this, and have been since 2011, as shown by this copy of the robots.txt file via the Wayback Machine.
Why would this happen? The White House is blocking petitions that are below a certain threshold. Pages that gain enough signatures get an official response, and that also means they get a new page under the /responses folder that is not blocked from crawling.
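To make the mechanics concrete, here is a minimal sketch of how a crawler decides what it may fetch from a robots.txt file, using Python's standard urllib.robotparser. The rules, the /petition/ path, and the example.com host are hypothetical stand-ins for illustration; they are not copied from the actual White House file. Only the /responses folder name comes from the description above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real White House robots.txt.
# Assumption: petitions live under /petition/ and responses under /responses/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /petition/
"""


def is_crawlable(path, user_agent="Googlebot"):
    """Return True if the hypothetical rules let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # example.com is a placeholder host; only the path matters to the rules.
    return parser.can_fetch(user_agent, "https://example.com" + path)


print(is_crawlable("/petition/example-petition"))   # False: blocked
print(is_crawlable("/responses/example-response"))  # True: not blocked
```

A blocked URL can still show up in results as a “link only” listing, because robots.txt stops the crawler from fetching the page, not from learning that the URL exists through links on other sites.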
The White House has a page explaining the threshold needed, though it doesn’t explain the search engine blocking. However, our understanding is that this is how things work: pages below a threshold of signatures don’t get indexed, mainly to help deter people who might try to use the White House site to generate spam.
Get enough signatures, and you’re guaranteed a response, and also deemed Google-worthy. For Snowden, that means 100,000 signatures and then time for Google to recrawl the page.
