Wednesday, October 25, 2006

Six Degrees of Wikipedia

"Six Degrees of Separation" is the theory that any person on earth can be connected to any other by six or fewer degrees of separation. That would mean that if you hop through your web of acquaintances, you can get to anyone (even famous people) quite quickly. This theory, while still heavily debated, is probably at least generally true. Unless you are thoroughly isolated, chances are you know someone who knows someone who knows someone who knows someone who knows someone who knows someone who knows Joe Smith. And if you need more degrees of separation than that, you or Joe probably live in an igloo or among some remote undiscovered tribe.

The theory was first proposed by Hungarian writer Frigyes Karinthy. There is a neat little web tool called "Six Degrees of Wikipedia". I'm not sure where else it can be used, but I've added it as a "widget" to my Google home page. You enter two names and it tells you how many degrees of separation there are between the two people in the Wikipedia database. The degrees of separation are not measured by acquaintances, but rather by article links. So if the Joe Plymoth article links to Bob Smith's, that would be 1 degree of separation. However, if Joe Plymoth's article links to Mike Walters, and Mike Walters's article links to Bob Smith, then that would be two degrees of separation between Joe Plymoth and Bob Smith.
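To make the link-counting idea concrete, here is a minimal sketch of how such a count could be computed with a breadth-first search. I don't know how the actual tool works internally, so this is only an assumption; the `degrees_of_separation` function and the `toy_links` data are made up for illustration, using the hypothetical names from the paragraph above.

```python
from collections import deque


def degrees_of_separation(links, start, goal):
    """Return the fewest link hops from start to goal, or None if unreachable.

    links maps an article title to the list of titles it links to.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])  # (article, hops taken to reach it)
    while queue:
        article, hops = queue.popleft()
        for target in links.get(article, ()):
            if target == goal:
                return hops + 1
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return None


# Hypothetical link data, not taken from Wikipedia.
toy_links = {
    "Joe Plymoth": ["Mike Walters"],
    "Mike Walters": ["Bob Smith"],
    "Bob Smith": [],
}

print(degrees_of_separation(toy_links, "Joe Plymoth", "Bob Smith"))  # 2
```

Note that links only point one way, so in this toy data there is no path from Bob Smith back to Joe Plymoth at all.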

Six Degrees of Wikipedia is neat in concept, but there is one little problem which spoils it. The problem is Wikipedia's entries for years (e.g. 1542) and for particular days (e.g. November 19). These entries link almost anyone. For example, the combination of anarchist Abbie Hoffman and Protestant reformer John Knox should test the Six Degrees of Wikipedia theory to the brink. These fellows have very little, if anything at all, in common. However, if John Knox was born on July 20th, 1505 (hypothetical) and Abbie Hoffman staged some antic on July 20th, 1969 (hypothetical), they would be listed as having a very low degree of separation, since the common date makes them "close" in the Wikipedia database. This sort of distortion calls into question most attempts to use Six Degrees of Wikipedia for anything useful. When a path does not pass through an arbitrary date, though, the result can sometimes be a meaningful indication of how contextually or informationally related two people are.
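One way to picture a fix is a search that refuses to pass through year and day articles. The sketch below is only an illustration under my own assumptions: the title pattern (a bare number for a year, "Month Day" for a day), the `is_date_article` and `degrees` names, and the `toy_links` data with its placeholder "Article A/B/C" entries are all invented, reusing the hypothetical July 20 link from above.

```python
import re
from collections import deque

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
# Assumed title shapes: a bare year such as "1542", or a day such as "November 19".
DATE_TITLE = re.compile(rf"(?:\d{{1,4}}|(?:{MONTHS}) \d{{1,2}})")


def is_date_article(title):
    """Return True if the title looks like a year or day-of-year article."""
    return DATE_TITLE.fullmatch(title) is not None


def degrees(links, start, goal, skip_dates=False):
    """Fewest link hops from start to goal, optionally never passing through date articles."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        article, hops = queue.popleft()
        for target in links.get(article, ()):
            if target == goal:
                return hops + 1
            if skip_dates and is_date_article(target):
                continue  # do not route the path through a date article
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return None


# Hypothetical link data, not taken from Wikipedia.
toy_links = {
    "John Knox": ["July 20", "Article A"],
    "July 20": ["Abbie Hoffman"],
    "Article A": ["Article B"],
    "Article B": ["Article C"],
    "Article C": ["Abbie Hoffman"],
}

print(degrees(toy_links, "John Knox", "Abbie Hoffman"))                   # 2
print(degrees(toy_links, "John Knox", "Abbie Hoffman", skip_dates=True))  # 4
```

In this made-up data the date shortcut cuts the distance in half, which is the kind of skew described above.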

Anyways, let's look at some of the Degrees of Wikipedia. We'll need enough data to gauge how well the tool performs:

  • John Knox, Abbie Hoffman (3 degrees)

  • Barry Bonds, Gandhi (5 degrees)

  • Martin Luther King Jr, David Duke (3 degrees)

  • Bill Clinton, Al Gore (1 degree)

  • Bill Lee, Ronald Reagan (3 degrees)

  • Jacobus Arminius, John Calvin (2 degrees)

  • Genghis Khan, Greg Bahnsen (4 degrees)

  • Stephen Harper, Donald Knuth (3 degrees)

  • Michael Jordan, Peter Sellers (3 degrees)

  • John Lennon, John Wayne (2 degrees)

  • Neil Young, George Bush (3 degrees)

  • Francis Schaeffer, Toronto Maple Leafs (3 degrees)

  • Ken Kesey, Hunter S. Thompson (2 degrees)

  • Cornelius Van Til, Rick Warren (4 degrees)

  • Jerry Rubin, George Wallace (3 degrees)

  • Sparky Anderson, Lou Whitaker (2 degrees)

What do you think? Do these degrees of separation do justice to the people in question? Not in whether they know each other, but rather in how closely related (in terms of ideas, context, history) they are? I guess it is up for debate. It does seem that pairs with 1 or 2 degrees of separation are genuinely related in terms of their context. And any pair that rates 4 or 5 degrees definitely shows little or no relation. However, it seems that 3 degrees is frequently attained merely by linkage through arbitrary dates (e.g. Francis Schaeffer's birthday and a milestone in Toronto Maple Leafs history). So maybe the results aren't that far off base if we are careful to account for skewing due to arbitrary date linkage.

One thing is for sure: none of the test pairs even reached 6 degrees of separation. Almost any two people can be linked through Wikipedia in fewer than 6 degrees, and some can be linked even without arbitrary date linking.
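For anyone who wants to check the spread, here is a quick tally of the sixteen results listed above. The numbers come straight from the list; the variable names are just mine.

```python
from collections import Counter

# Degrees reported for the sixteen pairs, in the order they are listed above.
results = [3, 5, 3, 1, 3, 2, 4, 3, 3, 2, 3, 3, 2, 4, 3, 2]

tally = Counter(results)
for degree in sorted(tally):
    print(f"{degree} degrees: {tally[degree]} pairs")

average = sum(results) / len(results)
print("largest:", max(results))  # 5
print("average:", average)       # 2.875
```

Half of the pairs (8 of 16) land on exactly 3 degrees, which is the bucket where the date links seem to do most of their work.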

