Center for Democracy and Technology
Working for Democratic Values in a Digital Age
PolicyBeta - Digital Policy in Process

White House Web Site Now ‘Crawler’ Friendly

January 22nd, 2009 by Heather West

The minute President Obama assumed office, before he had even taken his oath, WhiteHouse.gov was updated to reflect the new executive. As expected, the new WhiteHouse.gov uses some of the online tools that were used throughout the campaign. In addition, WhiteHouse.gov became drastically more accessible to search engines with an update to its robots.txt file.

Robots.txt is a standard way for Web sites to tell search engine Web crawlers which Web pages the crawlers may crawl and index for search purposes. In 2007, our Hiding in Plain Sight report noted that many federal Web sites abused the robots.txt file to hide content from search engines, making it hard for users to find federal information online. One of the worst offenders was the White House Web site itself, with almost 2,400 specific (and chuckle-worthy) exclusions from the search index. In comparison, the new robots.txt file has just two lines and excludes almost none of its content from Web crawlers’ reach: the only folder that is blocked is the ‘includes’ folder, which is generally used for files that are pulled in as parts of other pages.
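
For readers curious how a crawler reads such a file, here is a minimal sketch using Python's standard `urllib.robotparser` module. The two-line file below is our assumed reconstruction from the description above, not a verbatim copy of the White House file. The crawler name `ExampleBot` and the paths being checked are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of a two-line robots.txt that blocks only
# an "includes" folder; not a verbatim copy of the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /includes/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) crawler may fetch the path."""
    return parser.can_fetch(agent, path)


# Hypothetical paths, used only for illustration.
for path in ("/includes/header.html", "/about/"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions, only paths inside the blocked folder are refused; everything else on the site stays open to crawlers. Note that robots.txt is advisory: well-behaved crawlers honor it, but it does not itself restrict access.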

The widespread abuse of robots.txt on federal government Web sites is a questionable practice that serves to limit the availability of government information. We applaud the White House for stepping up its commitment to transparency and setting a good example for other federal Web sites to follow.


This entry was posted on Thursday, January 22nd, 2009 at 9:50 am and is filed under CDT, Open Government.

