Introduction
As a school project, I had to write an application that extracts the social relations in a web community. I chose the Flickr community and planned to extract all the contacts of a given username, then all the contacts of each of those contacts, and so on, until reaching a depth specified in the config.xml file.
So I started to look for an API to access this community.
I found the flickrj API here. The problem was that there were almost no examples on the internet for what I wanted to do, so after reading the examples provided in the .jar and searching through the thin documentation also included in the archive, I came up with the code presented here.
Background
The code is written in Java and compiled with Java SE 6. You need to be familiar with XML, the DOM and the related packages that implement it in Java, JAXP, and JDOM.
Using the code
The code is written carefully and is largely self-explanatory, but I'll give a brief description of the three main functions. The archive provided contains a README file with compile and run instructions. Make sure the flickrj jar is in the folder where you compile and run the application.
Be aware that the application runs slowly because of the communication with the community. Retrieving one contact takes approximately one second, so patience is required.
The first function reads the data from the config.xml file. We are mainly interested in the search depth and the username to start from. To find a username, go here and type the name of the contact in the search box, let's say "john". After the search finishes, click on the John you want, then click his Profile link. There you can see his username after "About", and his contacts in the middle of the page.
public void ReadInput(String xmlFile)
{
    try
    {
        // Parse the XML file into a DOM tree.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new File(xmlFile));
        Element elem = doc.getDocumentElement();
        if(elem.getTagName().equals("config"))
        {
            NodeList children = elem.getChildNodes();
            for(int i = 0; i < children.getLength(); i++)
            {
                Node child = children.item(i);
                if(child.getNodeName().equals("url"))
                {
                    // The URL is the text content of the <url> element.
                    url = new URL(child.getChildNodes().item(0).getNodeValue());
                }
                if(child.getNodeName().equals("username"))
                {
                    // Resolve the starting username to a flickrj User.
                    id = child.getChildNodes().item(0).getNodeValue();
                    PeopleInterface people = flickr.getPeopleInterface();
                    userGen = people.findByUsername(id);
                }
                if(child.getNodeName().equals("depth"))
                {
                    // The depth is stored in the "value" attribute of <depth>.
                    depth = Integer.parseInt(child.getAttributes().getNamedItem("value").getNodeValue());
                }
            }
        }
    }
    catch(Exception e)
    {
        System.out.println(e.getMessage());
    }
}
This function uses JAXP and the DOM API to read the contents of the XML file. You create a factory and a document builder to parse the XML into a Document tree. Then you walk this tree and extract the username and depth for later use in the program, along with the URL, which is written to the output to make it clear which community was analyzed.
The next function is the most interesting one in the program; it is the one I wrote from scratch.
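To make the expected layout of config.xml more concrete, here is a minimal sketch that parses a sample file from a string. The element and attribute names (config, url, username, depth with a value attribute) are taken from ReadInput; the sample values, the ConfigSketch class, and its Config holder are assumptions made up for illustration, and no Flickr call is involved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ConfigSketch {
    // Hypothetical holder for the three values ReadInput extracts.
    static final class Config {
        String url;
        String username;
        int depth;
    }

    // Assumed sample file: the names follow ReadInput, the values are made up.
    static final String SAMPLE =
          "<config>\n"
        + "  <url>http://www.flickr.com/</url>\n"
        + "  <username>john</username>\n"
        + "  <depth value=\"2\"/>\n"
        + "</config>";

    static Config parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            if (!"config".equals(root.getTagName())) {
                throw new IllegalArgumentException("root element must be <config>");
            }
            Config cfg = new Config();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Whitespace between elements shows up as text nodes; skip it.
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue;
                }
                Element el = (Element) child;
                String name = el.getTagName();
                if ("url".equals(name)) {
                    cfg.url = el.getTextContent().trim();
                } else if ("username".equals(name)) {
                    cfg.username = el.getTextContent().trim();
                } else if ("depth".equals(name)) {
                    cfg.depth = Integer.parseInt(el.getAttribute("value"));
                }
            }
            return cfg;
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Parser setup, I/O and SAX errors are wrapped to keep the sketch short.
            throw new IllegalStateException("cannot parse config", e);
        }
    }

    public static void main(String[] args) {
        Config cfg = parse(SAMPLE);
        System.out.println(cfg.username + " " + cfg.depth + " " + cfg.url);
    }
}
```

Unlike ReadInput, this sketch skips non-element nodes explicitly and reads the text with getTextContent, which is one possible way to avoid depending on the first child being a text node.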
public void ProcessContact(String username, int degree, String parentId)
{
    try
    {
        User userLoc = people.findByUsername(username);
        System.out.println("Username: " + username + ", id: " + userLoc.getId() + ", parent id: " + parentId);
        // Record this user, the starting user, and the contact that led here.
        PersonContact person = new PersonContact();
        person.userURL = "http://www.flickr.com/people/" + userLoc.getId() + "/";
        person.relationURL = "http://www.flickr.com/people/" + userGen.getId() + "/";
        person.degree = depth - degree;
        person.refURL = "http://www.flickr.com/people/" + parentId + "/";
        contactsList.add(person);
        if(degree > 0)
        {
            // Fetch the contact list once; calling getPublicList in the loop
            // condition would ask the service again on every iteration.
            Iterator<Contact> it = ci.getPublicList(userLoc.getId()).iterator();
            while(it.hasNext())
            {
                Contact contact = it.next();
                ProcessContact(contact.getUsername(), degree - 1, userLoc.getId());
            }
        }
    }
    catch(Exception e)
    {
        System.out.println(e.getMessage());
    }
}
Here, you populate a User instance with the information for a user by calling findByUsername, a method of flickrj's PeopleInterface. The details for this user are added to a list. Then, for each of the user's contacts, obtained with the getPublicList method of flickrj's ContactsInterface, the function calls itself recursively. The recursion continues until the specified search depth is reached.
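The recursion can be tried out without the network. Below is a minimal sketch of the same depth-limited walk over a small in-memory contact map. The CrawlSketch class, the CONTACTS map, and the usernames in it are hypothetical stand-ins for the Flickr calls, not part of flickrj.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CrawlSketch {
    // Hypothetical stand-in for Flickr: username -> public contacts.
    static final Map<String, List<String>> CONTACTS =
        new HashMap<String, List<String>>();
    static {
        CONTACTS.put("john", Arrays.asList("anna", "bob"));
        CONTACTS.put("anna", Arrays.asList("carl"));
        CONTACTS.put("bob", Arrays.asList("john")); // points back to the start
        CONTACTS.put("carl", Collections.<String>emptyList());
    }

    // Records "username@degree" for every visit; the degree is computed as
    // depth - remaining, the same way ProcessContact computes person.degree.
    static void crawl(String username, int remaining, int depth, List<String> out) {
        out.add(username + "@" + (depth - remaining));
        if (remaining > 0) {
            List<String> contacts = CONTACTS.get(username);
            if (contacts == null) {
                return; // unknown user: nothing to follow
            }
            for (String contact : contacts) {
                crawl(contact, remaining - 1, depth, out);
            }
        }
    }

    static List<String> crawl(String start, int depth) {
        List<String> out = new ArrayList<String>();
        crawl(start, depth, depth, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(crawl("john", 2));
    }
}
```

Like ProcessContact, this sketch keeps no record of users it has already seen, so a user reachable by several paths (john in this map) is listed more than once. Adding a set of visited usernames would be one way to avoid that, at the cost of losing the extra relations in the output.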
The last function creates the output file.
public void CreateOutput(String outXmlFile)
{
    try
    {
        File f = new File(outXmlFile);
        // Build an empty DOM tree with <socialnetwork> as its root.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = factory.newDocumentBuilder();
        Document doc = docBuilder.newDocument();
        Element root = doc.createElement("socialnetwork");
        doc.appendChild(root);
        // Add one <user> element with a nested <relation> per contact.
        for(int i = 0; i < contactsList.size(); i++)
        {
            Element userElement = doc.createElement("user");
            userElement.setAttribute("url", contactsList.get(i).userURL);
            Element relationElement = doc.createElement("relation");
            relationElement.setAttribute("url", contactsList.get(i).relationURL);
            relationElement.setAttribute("degree", Integer.toString(contactsList.get(i).degree));
            relationElement.setAttribute("ref", contactsList.get(i).refURL);
            userElement.appendChild(relationElement);
            root.appendChild(userElement);
        }
        // Serialize the tree to the output file.
        TransformerFactory tranFactory = TransformerFactory.newInstance();
        Transformer aTransformer = tranFactory.newTransformer();
        Source src = new DOMSource(doc);
        Result dest = new StreamResult(f);
        aTransformer.transform(src, dest);
    }
    catch(Exception e)
    {
        System.out.println(e.getMessage());
    }
}
First, you create a document tree to hold the XML data. Then you populate the tree with the entries in the contacts list. Finally, you create a transformer and use it to write the tree to the desired file as an XML document.
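To see what the output looks like for one entry, here is a minimal sketch that builds the same socialnetwork / user / relation structure and serializes it to a string instead of a file. The OutputSketch class, the toXml helper, and the ids A and B are assumptions for illustration; the element and attribute names come from CreateOutput.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class OutputSketch {
    // Builds a one-user document and returns it as XML text.
    static String toXml(String userUrl, String relationUrl, int degree, String refUrl) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("socialnetwork");
            doc.appendChild(root);

            Element user = doc.createElement("user");
            user.setAttribute("url", userUrl);
            Element relation = doc.createElement("relation");
            relation.setAttribute("url", relationUrl);
            relation.setAttribute("degree", Integer.toString(degree));
            relation.setAttribute("ref", refUrl);
            user.appendChild(relation);
            root.appendChild(user);

            // The identity transformer copies the DOM tree to the result.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            // Checked XML exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException("cannot build XML", e);
        }
    }

    public static void main(String[] args) {
        // "A" and "B" are placeholder ids, not real Flickr users.
        System.out.println(toXml(
            "http://www.flickr.com/people/B/",
            "http://www.flickr.com/people/A/",
            1,
            "http://www.flickr.com/people/A/"));
    }
}
```

Writing to a StringWriter rather than a File is only a convenience for experimenting; passing a File to StreamResult, as CreateOutput does, works the same way.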
Points of Interest
I had problems using the ContactsInterface: it has no method to extract the list of contacts for a username. You have to specify a contact id to get its contact list, and older users' contact ids are rather hard to find on the Flickr site (because the alias option there hides the contact id).
Eventually I found out that I could use the PeopleInterface to look up a user by username and get the id from there. I found this in a quick answer posted on the web, somewhat unrelated to my problem but still very useful :D.
History
This is the final version. It can be modified to suit other people's needs, but it is final for my purpose.