Brian Shaler

Occasionally Interesting

Gravity is for chumps

Archive for March, 2007

Data Visualization: Digging Into DiggTaggr’s Usage Stats

The Challenge

In the last seven weeks, DiggTaggr has delivered about 115,000 sets of links to relevant stories to several thousand unique users. That is a good-sized dataset to tinker with, so I decided to hack through it and see if I could present the data in an interesting way.

I have to admit, I was partially inspired to do this by Stamen Design’s data visualizations of Digg’s traffic. If you haven’t seen their scatter-plots, you should check them out.

Graph #1: User ID vs. Story ID

This was my first attempt at displaying the dataset in an interesting manner. Two stories related to DiggTaggr hit the front page and are labeled on the graph. DiggTaggr debuted on Friday, February 2nd, and received a complete redesign on the 4th.

You can see the curve of new users flattening through most of the graph, which shows that fewer and fewer new users are arriving. Grid-like patterns also emerge. Horizontal lines represent highly active users, while the dark horizontal gaps represent users who tried the tool and stopped using it. Vertical lines represent high activity during peak hours, while dark vertical gaps represent low activity on weekends.

Graph #2: Time vs. User ID

The Story ID axis in the previous graph gave a fairly accurate chronological reference, but if it’s time you want, it’s time you should use.

This graph shows peak hours and peak days of the week more explicitly. I labeled the distinct weekday and weekend patterns. You can see that the pattern is more clearly defined in one area of the graph. These users involuntarily grouped themselves together by first seeing the tool on Monday morning (the white horizontal line at the top of that section of users), while daily users had already had the tool for 2 days.

The graph is color-coded to show how quickly users went through 40 Digg stories using DiggTaggr. Some users quickly went to red, while others used the tool less frequently.
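For anyone who wants to try similar color-coding, here is a minimal sketch in Python. It is not the code behind the graph. The function name view_count_to_rgb and the blue starting color are assumptions made for illustration; the only details taken from the post are the cap of 40 stories and red as the end of the ramp.

```python
def view_count_to_rgb(views_so_far, cap=40):
    """Map a user's running story count to an (R, G, B) tuple.

    Assumption: a plain linear ramp from blue (0 stories) to red (cap
    stories). Only the cap of 40 and the red end come from the post.
    """
    # Clamp to [0, cap] so heavy users stay red instead of overflowing.
    fraction = min(max(views_so_far, 0), cap) / cap
    red = round(255 * fraction)
    blue = round(255 * (1 - fraction))
    return (red, 0, blue)


if __name__ == "__main__":
    for count in (0, 10, 40, 250):
        print(count, view_count_to_rgb(count))
```

A user's dots would drift from blue toward red as their count climbs, and anything past 40 stays fully red.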

Graph #3: Stories Viewed vs. Time

Okay, okay. I was having a little bit of fun with this one. This one took over an hour to render on my laptop, partially because each dot had its own database query to determine how many previous instances there were for that user.
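In hindsight, the per-dot query could probably be avoided by keeping a running count per user while walking the events in time order. The sketch below is in Python and assumes each event is a (timestamp, user_id) pair already loaded into memory. The function name running_view_counts and the sample data are made up for illustration and are not the code that produced the graph.

```python
from collections import defaultdict


def running_view_counts(events):
    """Yield (timestamp, user_id, views_so_far) in chronological order.

    `events` is an iterable of (timestamp, user_id) pairs (an assumed
    format). A single pass with a per-user counter replaces one
    "how many earlier rows for this user?" query per dot.
    """
    views_by_user = defaultdict(int)
    for timestamp, user_id in sorted(events):
        views_by_user[user_id] += 1
        yield timestamp, user_id, views_by_user[user_id]


if __name__ == "__main__":
    # Hypothetical sample: two users, five story views.
    sample = [(3, "b"), (1, "a"), (2, "a"), (4, "a"), (5, "b")]
    for point in running_view_counts(sample):
        print(point)
```

Each yielded tuple is one dot: time on one axis, that user's story count so far on the other.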

Each squiggly line represents a user. When the line is nearly vertical, the user is viewing stories quickly, one after another. You can see that only a handful of users have made it to the 1,000 story mark. I should give them a prize!

The density at the bottom left illustrates the high volume of new users. Some Digg at a rapid pace and shoot up, while others are more moderate and gradually climb. The density at the bottom tells us that a high percentage of DiggTaggr users either rarely visit Digg or uninstalled the tool.


Data visualization is still fascinating and fun.

DiggTaggr has sent almost half a million links to relevant stories to its users.

Yesterday, I chose to parse datasets instead of going outside. geek++

I still enjoy hearing from users. Feedback is always welcome and appreciated.
