Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

The Code Project Kevin Bacon Game (Part 3)

4.93/5 (11 votes)
28 Jun 2008CPOL8 min read 2   716  
A Code Project orientated Kevin Bacon game utilising the CodeProject.dll

http://www.derek-bartram.co.uk/index.cfm?PageID=kevinbacon - Online version of the Code Project Kevin Bacon Game. Please note that the game is still being populated so finding links presently is very hard (try a link from 10 to 1).

codeprojectkevinbacon2.png

codeprojectkevinbacon1.jpg

Introduction

This article demonstrates how to use the CodeProject.dll to extract information for the Code Project Kevin Bacon game!

IMPORTANT NOTE - the data retrieval if set of 5,000,000 as it should be will consume lots of Code Project's bandwidth; if you want to give the demo a whirl, please use a limited set of say 1000 users. I have included the bulk load data for the first 1000 users which you should really use instead. Please respect Code Project. The online version is being populated presently!

Background

"The trivia game Six Degrees of Kevin Bacon is based on a variation of the concept of the small world phenomenon and states that any actor can be linked through his or her film roles to actor Kevin Bacon. The game requires a group of players to try to connect any film actor in history to Kevin Bacon as quickly as possible and in as few links as possible. The game was played across various college campuses in the early 1990s.[citation needed] Its name is a pun on the concept of six degrees of separation. In 2007, Bacon started a charitable organization named SixDegrees.org."

Kevin Bacon, in a 1994 Premiere interview for the film The River Wild, while talking about his fame and career, comments that he's worked with everybody in Hollywood or someone who's worked with them.

Six Degrees of Kevin Bacon first surfaced at about the same time. On April 7 1994, a lengthy newsgroup thread headed "Kevin Bacon is the Center of the Universe" appeared.

The game was created in early 1994 by three students at Albright College, Craig Fass, Brian Turtle, and Mike Ginelli. According to an interview with the three in the Spring 1999 issue of the college's magazine, The Albright Reporter, they were watching Footloose during a heavy snowstorm. When the film was followed by Quicksilver they began to speculate on how many movies Bacon had been in and the number of people he'd worked with.

In the interview Brian Turtle said, "It became one of our stupid party tricks, I guess. People would throw names at us and we'd connect them to Kevin Bacon."

The trio wrote talk show host Jon Stewart a letter telling him that "Kevin Bacon was the center of the entertainment universe" and explaining the game.[citation needed]

They appeared on The Jon Stewart Show and The Howard Stern Show with Bacon to explain the game. Bacon admitted that he initially disliked the game because he believed it was ridiculing him, but he eventually came to enjoy it. The three inventors of the game released the book Six Degrees of Kevin Bacon (ISBN 9780452278448) with an introduction written by Bacon., and a board game based on the concept was released by Endless Games.

By the 2000s, the game was familiar enough to be referred to in United States popular culture. For example, In an episode of NBC's Will & Grace titled "Bacon and Eggs", Bacon makes a guest appearance, playing himself. During this episode, he makes an obvious reference to the game when talking with Will (played by Eric McCormack):

Will: You— you did a movie with Val Kilmer?
Kevin: No, but Val was in Top Gun with Tom Cruise, and Tom was in A Few Good Men with me. Huh, that was a short one. Bacon also appeared in a commercial for the Visa check card that parodied the game. In the commercial, Bacon wants to write a check to buy a book, but the clerk asks for his ID and he does not have it. He leaves and returns with a group of people, then says to the clerk, "Okay, I was in a movie with an extra, Eunice, whose hairdresser, Wayne, attended Sunday school with Father O'Neill, who plays racquetball with Dr. Sanjay, who recently removed the appendix of Kim, who dumped you sophomore year. So you see, we're practically brothers."

The concept was also presented in an episode of the TV show Mad About You dated November 19, 1996 in which a character expressed the opinion that every actor is only three degrees of separation from Kevin Bacon.

Bacon provides the voice-over commentary for the NY Skyride attraction in the Empire State Building in New York City. At several points throughout the commentary Bacon alludes to his connections to Hollywood stars via other actors he has worked with."

Quote: Wikipedia - Six Degrees of Kevin Bacon

Aim of the Game

The aim of the game is simple, find a pair of member id's who are linked via comments on articles but with the longest minimal chain of connections. For example user Mike Klimentiev (3045) to Michael Dunn (152) is pretty good, with a Bacon number of 3 (since there are three links in the chain).

  1. 3045 left a message in the "Tips for writing documentation" article written by 230
  2. 230 left a message in the "Surviving the Release Version" article written by 117
  3. 117 left a message in the "Neat Stuff to Do in List Controls using Custom Draw" article by 152

User 230 is Tibor Blazko.
User 117 is Joseph M. Newcomer.

Game Theory

This article is not intended to serve as an introduction to the theories behind the Kevin Bacon game, however a summary is presented herein for completeness.

Consider the link between users and comments in articles as a (non-fully) directed graph as shown below;

codeprojectkevinbacon3.jpg

The Bacon Number (or rather path) is the path with the shortest number of hops between source and destination. This may be determined via Dijkstra's algorithm (see http://en.wikipedia.org/wiki/Dijkstra's_algorithm). I have used an existing .Net implementation in this version of the game, courtesy of Michael Demeersseman ( http://www.codeproject.com/KB/recipes/ShortestPathCalculation.aspx), albeit with a few minor alterations to store the artical information in Connection and removing Connection weight (as all connections have a weight of 1 in this game).

Retriving Required Data

Two pieces of information are required, user articles and comments left in articles. The nodes (i.e. the end points) represent all articles from a single user and can be data mined via the ArticlesSummary class of the CodeProject.dll. The ArticlesSummary requires the member number, however a simple loop from 1 to 5,000,000 (the number of Code Project members) performs this task;

C#
counter = 1;
StreamWriter articleDataWriter = new StreamWriter(System.Environment.CurrentDirectory 
   + "\\articles.blf", false);
articleDataWriter.AutoFlush = true;

while (counter < maxUsers && false)
{
    ArticlesSummary summary = new ArticlesSummary(counter.ToString());
    summary.update();

    List<Article> articles =  summary.Articles;
    foreach (Article a in articles){
        articleDataWriter.Write(counter + "," + a.Title + "," + a.Link + "\r");
    }

    if (counter % 100 == 0)
    {
        articleDataWriter.Flush();
        Console.Write(".");
    }

    counter++;
}
articleDataWriter.Flush();
articleDataWriter.Close();
Console.WriteLine(".");

The results are written to a simple text file ready for bulk loading.

The second piece of required information is comments left in articles by each user, i.e. the connections between nodes. A similar loop over user numbers performs this task;

C#
counter = 1;
StreamWriter commentDataWriter = new StreamWriter(System.Environment.CurrentDirectory 
   + "\\comments.blf", false);
commentDataWriter.AutoFlush = true;

while (counter < maxUsers)
{
    User u = new User(counter.ToString());

    foreach (String article in u.ArticleComments)
    {
        commentDataWriter.Write(counter + "," + article + "\r");
    }

    if (counter % 100 == 0)
    {
        commentDataWriter.Flush();
        Console.Write(".");
    }

    counter++;
}
commentDataWriter.Flush();
commentDataWriter.Close();
Console.WriteLine(".");

In a similar manor the resultant data is written to text file for bulk loading.

Processing Data

Before the data may be successfully used, it is imported into a SQL Server database using the following commands.

Create Tables

USE [KevinBacon]
GO
/****** Object:  Table [dbo].[Articles]    Script Date: 04/27/2008 21:03:15 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Articles](
    [memberID] [nchar](10) NOT NULL,
    [articleName] [nchar](512) NOT NULL,
    [webLink] [nchar](128) NOT NULL
) ON [PRIMARY];

USE [KevinBacon]
GO
/****** Object:  Table [dbo].[Comments]    Script Date: 04/27/2008 21:03:19 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Comments](
    [memberID] [nchar](10) NOT NULL,
    [articleName] [nchar](512) NOT NULL
) ON [PRIMARY];

Import Data

For those not familiar with the BULK INSERT command (or the BCP utility), it allows significantly faster data import into a SQL Server database table as constraints are not checked and only one command is required for multiple row insertion. Consider the case of importing comments data; there are 5,000,000 each with on average 20 comments, resulting on 100,000,000 INSERTs or 1 BULK INSERT.

BULK INSERT Articles
   FROM 'C:\Users\Administrator\Documents\Visual Studio 2008\Projects\CodeProject\
KevinBaconDataRetrieval\bin\Debug\articles.blf'
   WITH 
      (
         FIELDTERMINATOR =',',
         ROWTERMINATOR ='\r'
      );

BULK INSERT Comments
   FROM 'C:\Users\Administrator\Documents\Visual Studio 2008\Projects\CodeProject\
KevinBaconDataRetrieval\bin\Debug\comments.blf'
   WITH 
      (
         FIELDTERMINATOR =',',
         ROWTERMINATOR ='\r'
      );

Note that the resultant output of data retrieval requires that the text file needs a minor alteration; the last character of the file needs to be removed (i.e. the last \n). Notepad or similar is sufficient; the process should be performed as follows;

  1. Open file in Notepad
  2. Go to end of file (via Ctrl+End)
  3. Delete last character (via backspace)*
  4. Save and exit Notepad
  5. Repeat for both files

* Note that the carat position does not appear to change, this is normal and correct.

Calculating the Shortest Path

The game program proceeds along the following stages;

  1. Get location data, that is the user/article information
  2. Get connection data, that is which user has commented on which article
  3. Create a routing engine (based on Michael Demeersseman's work)
  4. Loop until program close:
    1. Get first member id (and corresponding Location)
    2. Get second member id (and corresponding Location)
    3. Use engine.CalculateMinCost(fromLocation) function to get a dictionary of routes to all other locations
    4. Select route corresponding toLocation from engine results

Get Location Data

SELECT memberID FROM Comments GROUP BY memberID 

Get a list of the distinct member IDs

Get Connection Data

SELECT 
    Comments.memberID AS fromMemberID, 
    Articles.memberID AS toMemberID, 
    Articles.articleName AS articleName 
FROM Articles, Comments 
WHERE Articles.articleName = Comments.articleName

Performs a join on the Article and Comments tables. The fromMemberID and toMemberID is used as a key to a dictionary of Locations as previously generated, and a new Connection created accordingly (along with the articleName)

Create Routing Engine

C#
RouteEngine engine = new RouteEngine();
engine.Connections = connectionsList;
engine.Locations = locationsList;

The routing engine is instantiated.

History

Version 1.0.0.0 - First build

Useful Links

codeprojectarticle.aspx (Part 1, Articles)

codeprojectuser.aspx (Part 2, Users)

codeprojectkevinbacon.aspx (Part 3, Kevin Bacon)

Additional Licensing Notes

Please feel free to use this in your work, however please be aware that a modified The Code Project Open License (CPOL) is in use; basically it is the same as the standard license except that this code must not be used for commercial or not-for-profit commercial use without prior authorisation. Please see license.txt or license.pdf in the included source and demo files.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)