Last night I had the honor of leading the first BetaLabs class. BetaLabs, organized by product designer extraordinaire Summer Bedard, is an internal Betaworks skill-sharing class that takes place every two weeks. I wanted to teach a class that uses Python to access the Twitter API, grab some interesting data, and visualize it. The necessary code was provided, and participants received a list of downloads and installs to complete before the class. Within two hours, everyone was playing with a graph based on data that they had grabbed from the Twitter API!
We used the tweepy Python library to access the Twitter API. The first thing we did was grab a number of Betaworks Twitter lists that different users had created. Once we had this list of Twitter users associated with Betaworks, we queried the Twitter API again to grab their relationships: who follows whom. With all this data in hand, we used the Python networkx library to build a network graph reflecting people’s relationships, where a node represents a Betaworks Twitter user and a directed edge represents a follow (an edge from node A to node B means A follows B on Twitter).
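To make the graph-building step concrete, here is a minimal sketch of how such a directed graph could be assembled with networkx. It is not the class code: the `follows` mapping, the made-up handles, and the `build_follow_graph` helper are all assumptions for illustration, standing in for the data we pulled from the Twitter API.

```python
import networkx as nx

# Hypothetical sample data: each key follows every account in its list.
# In the class this information came from the Twitter API; these handles are made up.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["alice", "carol"],
}


def build_follow_graph(follows):
    """Return a directed graph with an edge A -> B whenever A follows B."""
    members = set(follows)
    graph = nx.DiGraph()
    graph.add_nodes_from(members)
    for follower, followed_accounts in follows.items():
        for followed in followed_accounts:
            # Keep only relationships between members of the list
            if followed in members:
                graph.add_edge(follower, followed)
    return graph


graph = build_follow_graph(follows)
print(graph.number_of_nodes(), "users,", graph.number_of_edges(), "follow edges")
```

Because the graph is directed, `graph.in_degree(user)` counts how many list members follow a user, while `graph.out_degree(user)` counts how many members that user follows.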
Lastly, we exported this graph to the .graphml format, which we then imported into Gephi, the open source network visualization software. Using Gephi, each person personalized the design of their graph: colors, sizes, and layout.
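The export itself is a one-liner in networkx. The sketch below uses a tiny made-up graph, a hypothetical `followers` node attribute, and an assumed file name (`follow_graph.graphml`); only `nx.write_graphml` is the actual library call.

```python
import networkx as nx

# A tiny made-up follow graph (an edge A -> B means A follows B)
graph = nx.DiGraph()
graph.add_edge("alice", "bob")
graph.add_edge("bob", "alice")
graph.add_edge("carol", "alice")

# Attributes stored on nodes are written to the file along with the structure
for node in graph.nodes:
    graph.nodes[node]["followers"] = graph.in_degree(node)

# Write the graph to a GraphML file that can then be opened in Gephi
nx.write_graphml(graph, "follow_graph.graphml")
```

Reading the file back with `nx.read_graphml("follow_graph.graphml")` is a quick way to check that the nodes, edges, and attributes survived the round trip.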
While the Betaworks network is pretty densely clustered (as expected, since many Betaworkers follow other Betaworkers), different parameters reveal some interesting insights.
Eigenvector Centrality is a measure of the importance of a node in a network. Nodes are scored on the principle that connections to high-scoring nodes contribute more. As with Google’s PageRank algorithm, the more high-scoring nodes that follow you, the higher your score. We get a pretty straightforward graph, with a high correlation between size/weight and the amount of time that person has been at Betaworks. This is what we’d expect: the longer one has been at Betaworks, the more connections that person is likely to have with other folks, especially other “central” folks, the older geezers!
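To show the scoring principle in action, here is a rough pure-Python sketch that approximates eigenvector centrality by repeatedly passing each user’s score along to the accounts they follow. It is an illustration under stated assumptions, not what any particular tool does internally: the `follows` data, the function name, and the fixed iteration count are all made up for the example.

```python
def eigenvector_centrality(follows, iterations=100):
    """Approximate eigenvector centrality by repeated score propagation.

    `follows` maps each user to the list of users they follow, so a user's
    score grows with the scores of the users who follow them.
    """
    nodes = set(follows)
    for followed_accounts in follows.values():
        nodes.update(followed_accounts)

    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Start from the current scores, which helps the iteration settle
        updated = dict(scores)
        for follower, followed_accounts in follows.items():
            for followed in followed_accounts:
                # Each follower passes their own score on to the account they follow
                updated[followed] += scores[follower]
        # Rescale so the scores stay bounded from one round to the next
        norm = sum(value * value for value in updated.values()) ** 0.5
        scores = {node: value / norm for node, value in updated.items()}
    return scores


# Hypothetical sample data: nobody follows dave, so his score fades toward zero
follows = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["alice"],
    "dave": ["alice", "bob"],
}
scores = eigenvector_centrality(follows)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network alice ends up on top because she is followed by everyone else, including bob, who is himself well followed.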
Now if we look at Betweenness Centrality, we get a slightly different picture of the graph. Betweenness Centrality measures how often a node lies on the shortest paths between other nodes: the number of shortest paths from all vertices to all others that pass through this node. In effect, it measures how much of a bridge the user is within the displayed network. While @Borthwick is still a central bridge, it is clear that @MattLeMay is not only followed by many other colleagues (above) but is also an important bridge between the different parts of the network.
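The definition translates fairly directly into code. The sketch below is a brute-force, unnormalized version that is only sensible for small graphs; tools may use a faster algorithm and may normalize the scores. The `follows` data and both function names are assumptions for illustration.

```python
from collections import deque


def shortest_path_counts(follows, source):
    """Breadth-first search from `source`.

    Returns two dicts over the reachable nodes: the distance from `source`
    and the number of distinct shortest paths from `source`.
    """
    distance = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in follows.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                paths[neighbor] = 0
                queue.append(neighbor)
            if distance[neighbor] == distance[node] + 1:
                paths[neighbor] += paths[node]
    return distance, paths


def betweenness_centrality(follows):
    """For each node, sum the fraction of shortest paths passing through it."""
    nodes = set(follows)
    for followed_accounts in follows.values():
        nodes.update(followed_accounts)

    bfs = {node: shortest_path_counts(follows, node) for node in nodes}
    scores = {node: 0.0 for node in nodes}
    for source in nodes:
        dist_s, paths_s = bfs[source]
        for target in dist_s:
            if target == source:
                continue
            for middle in dist_s:
                if middle == source or middle == target:
                    continue
                dist_m, paths_m = bfs[middle]
                # `middle` is on a shortest path if going through it adds no distance
                if target in dist_m and dist_s[middle] + dist_m[target] == dist_s[target]:
                    scores[middle] += paths_s[middle] * paths_m[target] / paths_s[target]
    return scores


# Hypothetical sample data: two tight groups that only connect through carol
follows = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "carol"],
    "carol": ["alice", "bob", "dave", "erin"],
    "dave": ["carol", "erin"],
    "erin": ["carol", "dave"],
}
scores = betweenness_centrality(follows)
print(max(scores, key=scores.get))
```

Every shortest path between the alice/bob group and the dave/erin group runs through carol, so she is the only user with a nonzero score here, which is exactly the "bridge" role described above.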
The class’s Python code and instructions are available here.
Photos and people’s graphs can be seen here.