Saturday, 9 February 2013

Complex Graph Visualizations

This post isn’t going to be about defining mathematical relations on complex graphs. It is about the first step: visualizing complex graph data. Of course, some people may disagree that visualization is the first step, but it is conventionally assumed that visualizations play a major role in the initial study of complex systems’ data.

The most obvious question which comes to mind is “Why visualize?” since visualizations do not make any mathematical contributions to the data (not directly of course) nor do they augment the already existing data in any form. Here are some points which may answer this question.
• Visualizations help in identifying the major nodes (or the key players) in the graph being analyzed. We can “look” at which node is most influential, which node can disrupt the graph setting or which node's movement can alter major properties of the graph.
• Graph visualizations are also helpful in ascertaining various relationships between nodes. Say, for example, you have close to 800 Facebook friends and about 10 of them are really close to you. You may want to invite the close ones to your birthday party and make sure that none of the invitees gets bored because no common friends are present at the party.
• Relationship strengths can also be determined from graph visualizations. The best example to quote here would be “The Hubway Data Visualization Challenge”. Hubway is a bicycle sharing system in Boston, MA and is an absolute delight for visitors who wish to move around the city without having to keep track of the metro. The following visualization shows the strength of the relationship between each pair of stations.

• Temporal analysis of complex networks is another crucial application of graph visualizations. Relationships and properties of a graph may change over time.

Here is a visualization I made for half a million geo-tagged tweets collected during the Masters Golf Tournament 2012. The frames go a bit too fast because we did not want to exceed the time limit for our presentation. I recommend watching the 720p HD version of it in full screen mode.
• Cliques and communities become much more apparent in visualizations, and it is easier to compare the properties of a subgraph with those of the complete graph. Here is one of my favorites, by Moritz Stefaner, called Well-formed Eigenfactor. It gives an overview of a whole citation network.

• Graphs in 3D - Recently, I noticed some 3D visualizations of graphs. It felt like floating inside a web amongst the nodes. However, it should be admitted that 3D visualizations are not as helpful in extracting aggregated information but they are sexy! They might prove beneficial in enticing investors to your project.
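Before drawing anything, it can help to compute the numbers that a visualization would encode. The snippet below is only a minimal sketch of the "key players" and "relationship strength" ideas from the list above: it sums the edge weights touching each node and ranks nodes by that total. The station names and trip counts are made up for illustration and are not Hubway data; the function names `node_strengths` and `top_nodes` are my own.

```python
from collections import defaultdict


def node_strengths(edges):
    """Return {node: total weight of all edges touching that node}."""
    strength = defaultdict(float)
    for source, target, weight in edges:
        strength[source] += weight
        strength[target] += weight
    return dict(strength)


def top_nodes(edges, k=3):
    """Return the k nodes with the largest total edge weight."""
    strengths = node_strengths(edges)
    # Sort by descending strength, breaking ties alphabetically.
    return sorted(strengths, key=lambda n: (-strengths[n], n))[:k]


# Hypothetical trip counts between pairs of stations (not real data).
trips = [("A", "B", 120), ("A", "C", 45), ("B", "C", 80), ("C", "D", 10)]

if __name__ == "__main__":
    print(top_nodes(trips, k=2))
```

In a drawing, these totals would typically map to node size, and the individual weights to edge thickness.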

This section of my post will enumerate some of the widely used tools for visualizing complex systems.

• Gephi is the Photoshop for graphs. One can easily color, cluster and manipulate elements of the graph. It is open source.
• Sigma.js is a JavaScript library which brings Gephi-style graphs to the web. It is also open source and offers many of the layout algorithms found in Gephi, along with features such as zooming in and out and several ways to import and export data.
• Processing is a programming language and environment for visual art. It is built on Java and is one of the most sophisticated visualization tools available. The project was initiated in 2001 by Casey Reas and Ben Fry, then at the MIT Media Lab. One of its most striking applications was visualizing graphs for the Max Planck Research Networks project, where the graph visualizations were controlled by touch on a surface table.

Processing also has many sister projects such as Processing.js for integration with the web and iProcessing for the iOS environment.
• D3.js (Data-Driven Documents) is a JavaScript library for data visualization and is an important tool for generating interactive graphics from data. It was developed as a successor to Protovis by Prof. Jeffrey Heer and his PhD students at Stanford. Even if one is working with NoSQL or graph databases, the result of a query can usually be flattened into a table. Such tables can easily be converted to JSON on the fly for visualization with D3. The force-directed and collapsible force layouts are very useful for visualizing weighted complex graphs.
• Cinder is the tool for you if you are a C++ fan. Even though Cinder may not be the first choice for most interaction designers, it may be the tool for you if you need the speed gain associated with C++ and you are ready to spend some time building stuff. It is also known as the toolbox for creative coding and is often the subject of blog posts trying to explain music generation and visualization.
• Flare, three.js, VivaGraph.js etc. are other such libraries for graph visualizations. VivaGraph.js is known to scale well for larger graphs.
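The table-to-JSON step mentioned in the D3.js entry above can be sketched in a few lines. This is a minimal illustration, not an official D3 recipe: it assumes each row of the query result is a (source, target, weight) triple, and it emits the `nodes` / `links` shape commonly used in D3 force layout examples. The field names `id` and `value`, the sample rows, and the function name `rows_to_graph` are assumptions; depending on the D3 version, links may need to refer to nodes by array index rather than by id.

```python
import json


def rows_to_graph(rows):
    """Convert (source, target, weight) rows into a nodes/links dictionary."""
    nodes = {}  # preserves first-seen order of node names
    links = []
    for source, target, weight in rows:
        for name in (source, target):
            nodes.setdefault(name, {"id": name})
        links.append({"source": source, "target": target, "value": weight})
    return {"nodes": list(nodes.values()), "links": links}


# Hypothetical rows, as they might come back from a database query.
rows = [("alice", "bob", 3), ("bob", "carol", 1), ("alice", "carol", 2)]

# JSON string that a page could load and hand to a force layout.
graph_json = json.dumps(rows_to_graph(rows), indent=2)

if __name__ == "__main__":
    print(graph_json)
```

Keeping the weight on each link lets the front end map it to edge thickness or link distance.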
