Last week, the Twitter account @congressedits launched. The account is a bot that tweets Wikipedia edits made by members of the U.S. Congress, in order to promote transparency. It works by automatically tweeting any Wikipedia edit made by an anonymous contributor whose IP address falls within the known IP address blocks of the U.S. Senate or the House of Representatives.
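The matching step the bot relies on amounts to checking whether an editor's IP address sits inside a known block. Here is a minimal sketch of that check in Python, not the bot's actual code. The blocks below are hypothetical placeholders taken from the reserved documentation ranges, not the real Senate or House ranges, and `chamber_for_ip` is a name made up for illustration.

```python
import ipaddress

# Hypothetical placeholder blocks (reserved documentation ranges).
# These are NOT the real Senate or House of Representatives ranges.
CONGRESS_BLOCKS = {
    "Senate": [ipaddress.ip_network("192.0.2.0/24")],
    "House": [
        ipaddress.ip_network("198.51.100.0/24"),
        ipaddress.ip_network("203.0.113.0/24"),
    ],
}


def chamber_for_ip(ip_string):
    """Return the chamber whose block contains ip_string, or None."""
    try:
        ip = ipaddress.ip_address(ip_string)
    except ValueError:
        # Not an IP address at all, e.g. the edit came from a logged-in user.
        return None
    for chamber, blocks in CONGRESS_BLOCKS.items():
        if any(ip in block for block in blocks):
            return chamber
    return None
```

Calling `chamber_for_ip` on an anonymous editor's address returns the matching chamber name, and anything outside the listed blocks (or anything that is not an IP address) returns `None`.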

Google’s BigQuery tool has a sample dataset of Wikipedia data, covering 314 million article edits up to April 2010. Out of curiosity, I wrote a query that returns the 100 pages with the most edits by Wikipedia contributors in the U.S. Senate’s IP block.
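The logic of that query (filter edits to an IP block, count edits per page, keep the top 100) can be sketched in plain Python. This is an illustration of the idea rather than the BigQuery query itself. It assumes each edit is available as a `(title, contributor_ip)` pair, and the function name `top_edited_pages` and the block passed to it are hypothetical.

```python
import ipaddress
from collections import Counter


def top_edited_pages(edits, blocks, limit=100):
    """Count edits per page title for editors inside the given IP blocks.

    edits:  iterable of (title, contributor_ip) pairs; contributor_ip may be
            None or a non-IP string for edits made by logged-in users.
    blocks: iterable of CIDR strings (assumed format, e.g. "192.0.2.0/24").
    """
    networks = [ipaddress.ip_network(block) for block in blocks]
    counts = Counter()
    for title, contributor_ip in edits:
        if not contributor_ip:
            continue  # no IP recorded for this edit
        try:
            ip = ipaddress.ip_address(contributor_ip)
        except ValueError:
            continue  # not a parseable IP address
        if any(ip in network for network in networks):
            counts[title] += 1
    # Most-edited pages first, truncated to the requested number of rows.
    return counts.most_common(limit)
```

Running the same function with a different list of blocks gives the equivalent ranking for the other chamber.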

Using this query for the Senate’s IP block, and a similar one for the House of Representatives’ IP blocks, I retrieved the most-edited entries for both chambers. You can access the spreadsheet of this data by downloading a .pdf or by viewing the data online with Numbers for iCloud, both of which contain high-resolution charts and clickable Wikipedia links. A Google Sheets version is also available.

Here are the Top 10 Wikipedia entries with the most edits by members of the Senate:

Wait a minute. Hawk from G.I. Joe?!

Saying that the query results were surprising would be the understatement of the century.

Two of the top-edited entries pertain directly to the U.S. Senate, which helps confirm that the IP block is indeed the Senate’s. Both Kappa Upsilon Chi and Beta Upsilon Chi are Christian fraternities. (However, the Kappa Upsilon Chi Wikipedia entry no longer exists for some reason.)

The edits corresponding to actual people are the most interesting. William Swain Lee is a Delaware politician whose entry was created and edited by a user in the Senate IP block. OrangePie is a user who, according to his talk page, was criticized for repeatedly recreating an entry for “Michael Hardaway” after it was deleted; Hardaway, coincidentally, worked for the Senate according to his Twitter bio. In journalist Paul D. Thacker’s entry, one Senate editor replaced a paragraph of Thacker’s biography with the word “anus?” Jay Rockefeller is an actual U.S. Senator, so the edits to his entry are definitely a conflict of interest. The user who made the edits apparently also removed information about a government investigation into the Senator.

I have nothing to add for Hawk from G.I. Joe.

Other interesting Wikipedia entries frequently edited by members of the U.S. Senate are Primetime Emmy Award for Outstanding Supporting Actor – Comedy Series (11 edits), Wikipedia:Introduction (5 edits), and Crash (2004 film) (5 edits).

The Wikipedia entries with the most edits by members of the House of Representatives are somehow even weirder, and that’s quite an accomplishment.

Well, if anyone in the entire United States would be experts on the topics of cleft chins and dimples, it would be the members of the House of Representatives.

Again, one of the most-edited entries corresponds to a House of Representatives topic, which helps validate the IP blocks. The Cerritos, California entry had neutral edits made by a rather dedicated Wikiuser. The edits to Wynne, Arkansas and Michelle Ye were made by the same dedicated Wikiuser. Waverly, Pennsylvania was edited by a user who’s really passionate about Doc’s Deli. Luis Fortuño, former governor of Puerto Rico, had his history excised by another user. Betty Sutton, however, is an actual Representative from Ohio, and another user constructed most of her entry, which represents another conflict of interest.

I have nothing to add regarding effeminacy in the House of Representatives.

Other interesting edits by members of the House include Apocalypse Now (10 edits), History of Italy as a monarchy and in the World Wars (9 edits), and Whitney Houston (9 edits).

In the end, the members of the U.S. Congress have the same peculiar interests as typical Americans. However, when these people edit entries on topics in which they are directly involved, the potential bias threatens the integrity of all of Wikipedia. And this is just the tip of the iceberg.


Max Woolf (@minimaxir) is a former Apple Software QA Engineer living in San Francisco and a Carnegie Mellon University graduate.

In his spare time, Max uses Python to gather data from public APIs and ggplot2 to plot plenty of pretty charts from that data.
