What is this site?

This site lets you explore traffic patterns of MBTA commuters across the various lines and stations over a single day. The data for this visualization was released as part of the Visualization Challenge by the Executive Office of Transportation.

What is this visualization called?

This visualization is a variant of a "Theme River" visualization. You can read more about the visualization here.
[A theme river] visualizes thematic variations over time across a collection of documents. The “river” flows through time, changing width to depict changes in the thematic strength of documents temporally collocated. Themes or topics are represented as colored “currents” flowing within the river that narrow or widen to indicate decreases or increases in the strength of a topic in associated documents at a specific point in time. The river is shown within the context of a timeline and a corresponding textual presentation of external events.
The main difference is that instead of plotting the data along a linear axis, I chose to depict it around a 24-hour clock.
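
As a rough illustration of what wrapping a timeline around a clock involves, here is a minimal sketch in JavaScript. It is not the site's actual code: the function names `hourToAngle` and `polarToPoint` are hypothetical, and it assumes midnight sits at the top of the clock with hours advancing clockwise.

```javascript
// Hypothetical helper: map an hour (0-23) to an angle in radians.
// Assumption: midnight is at the top and one full turn covers 24 hours.
function hourToAngle(hour) {
  return (hour / 24) * 2 * Math.PI;
}

// Hypothetical helper: convert an hour and a distance from the center
// into screen coordinates (y grows downward, as in SVG or canvas).
function polarToPoint(hour, radius, cx = 0, cy = 0) {
  const angle = hourToAngle(hour);
  return {
    x: cx + radius * Math.sin(angle),
    y: cy - radius * Math.cos(angle),
  };
}

// Midnight lands straight above the center, 6:00 to the right of it.
console.log(polarToPoint(0, 100));
console.log(polarToPoint(6, 100));
```

On a linear axis the hour would map directly to an x position; here it maps to an angle, and the traffic value becomes a distance from the center.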

What does the 'thickness' at any given hour mean?

help image 1

Let's start with the basic scenario: We're looking at a single station.

You can see that the line marked "a" is longer than the line marked "b". The length of each "white ray" corresponds directly to the percentage of that particular line's overall traffic, over a 24-hour period, that traveled during that hour. Thus, more people were traveling at hour "a" than at hour "b".

The simpler explanation is that by visually comparing the two lengths, you can tell that there were about twice as many travelers at hour "a", because its line is about twice as long as the line at hour "b".
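
To make the proportional-length idea concrete, here is a minimal sketch in JavaScript. It is an illustration only, not the code behind this site: the function `rayLengths`, the `scale` parameter, and the passenger counts are all made up for the example.

```javascript
// Hypothetical helper: turn per-hour passenger counts into ray lengths.
// Each length is that hour's share of the day's total, times `scale`
// (assumed to be the length a ray would have if it held 100% of the traffic).
function rayLengths(hourlyCounts, scale) {
  const total = hourlyCounts.reduce((sum, n) => sum + n, 0);
  if (total === 0) {
    return hourlyCounts.map(() => 0); // no traffic at all: every ray has zero length
  }
  return hourlyCounts.map((n) => (n / total) * scale);
}

// Invented counts: hour "a" has 200 travelers, hour "b" has 100, a third hour has 0.
const lengths = rayLengths([200, 100, 0], 300);
console.log(lengths);
```

With these invented numbers the ray for hour "a" comes out twice as long as the ray for hour "b", and the hour with no travelers gets a ray of length zero.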

What about multiple "waves" stacked together?

help image2

That is a slightly more complex scenario. Let's take the "All..." line view as an example to work with. We are looking at 8:00am as indicated on the image.

Let's assume that across all stations at that hour we have 100 passengers. The total length of the entire segment then represents 100 passengers. Now look at the individual segments as parts of a whole: each part represents, proportionally, the number of travelers on that line. In this example, the red and orange lines carry about 4/5 of the traffic (about 2/5 each), while the green and blue lines carry about 1/5 (about 1/10 each). If we assume those proportions are correct, then the green and blue lines each have 10 passengers while the red and orange lines have 40 each. Add those up, and you get 100.
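
The same arithmetic can be sketched in JavaScript. This is a hypothetical illustration under the assumptions of the example above (100 passengers split 40/40/10/10), not the site's actual implementation; `stackSegments` is a made-up name.

```javascript
// Hypothetical helper: stack per-line counts for one hour into segments.
// Each segment records its share of the hour's total and where it starts
// and ends along the stacked ray (0 = inner edge, 1 = outer edge).
function stackSegments(countsByLine) {
  const total = Object.values(countsByLine).reduce((sum, n) => sum + n, 0);
  let offset = 0;
  return Object.entries(countsByLine).map(([line, count]) => {
    const share = total === 0 ? 0 : count / total;
    const segment = { line, share, start: offset, end: offset + share };
    offset += share;
    return segment;
  });
}

// Assumed numbers from the example: 100 passengers at 8:00am.
const segments = stackSegments({ red: 40, orange: 40, green: 10, blue: 10 });
for (const s of segments) {
  console.log(s.line, s.share);
}
```

Because the shares are fractions of the same total, the stacked segments always add up to the full length of the ray, whatever the actual passenger count is.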

How do I see some real data?

Don't forget to click on any "wave" to see a bar chart showing the actual numbers below.
For example, when clicking on a red line:

help image3

Why do some waves disappear at various hours?

For certain lines and certain hours, there was no traffic. As such, at those hours, the "wave" is 0% thick, or rather, has no thickness at all.

Why doesn't the Silver/Green Line look like I expect it to?

You've probably noticed that the data is somewhat misleading for the Silver Line and Green Line, which have above-ground boardings on the vehicles, not at station gates. The relatively few boardings at stations like Brookline Hills, Waban, and Beaconsfield do not reflect actual usage. Data for those boardings has not been integrated yet.

Also, some stations that sit at intersections of lines are listed under only one color, even though they actually serve several. For example, "North Station", which I always considered a Green Line station, is actually listed as an "Orange Line" station and as such appears in that list.

Can I use this in print?

Sure. A quick suggestion: print to PDF first to get a high-resolution file, which can then be scaled to the appropriate size.
Also, I kindly ask that you inform me of your intentions prior to using the work. You can reach me via imirene [at] gmail [dot] com.

Have you thought about releasing any of the code?

Yep! The JavaScript source (which really is the only interesting part) is up on GitHub. GitHub Repo


I would like to offer my sincerest gratitude to the following people for their insightful, helpful and honest input: Oliver Chong, Jesse Kriss and Marian Dörk. Without your support and suggestions this would probably have been a much uglier and unfinished project ;)