2012 Aurora shootings

Data, Dissertation, Wikipedia

The “2012 Aurora shooting” article on Wikipedia is an example of a breaking news article in which many editors intensively and jointly edit a single article. As of approximately 1pm EDT on July 21, 290 unique editors have made 1,281 changes to the article in a period of less than 36 hours.

The chart below shows how rapidly the article has grown, as well as interesting changes in the concentration of work done per editor (times are in UTC). The blue dots are the length of the article in bytes. There is a rapid growth of the article at about 19:00 UTC which is later reverted, a rapid deceleration of article growth at about 1:00 UTC on 7/21, a momentary expansion of the article at about 6:00 UTC, and a sudden contraction at about 10:00 UTC. The average number of edits per editor (green) also shows interesting damped sinusoidal behavior: the earliest part of the article involves many contributions from few editors, followed by a sudden decline as many new editors join the collaboration, a rise again as some of these new contributors make many revisions, and then a stabilization around 4 edits per editor.
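As a rough sketch of how the green series could be computed, the running average is just the cumulative number of edits divided by the cumulative number of distinct editors. The function name and the (timestamp, user) input shape are my assumptions for illustration, not the code used to produce the chart.

```python
def edits_per_editor(revisions):
    """Running average of edits per editor.

    `revisions` is assumed to be a chronologically sorted list of
    (timestamp, user) pairs; this input shape is hypothetical.
    Returns a list of (timestamp, cumulative_edits / cumulative_editors).
    """
    seen_editors = set()
    series = []
    for edit_count, (timestamp, user) in enumerate(revisions, start=1):
        seen_editors.add(user)
        series.append((timestamp, edit_count / len(seen_editors)))
    return series
```

The ratio falls whenever a new editor makes a first edit and rises when existing editors keep revising, which is the kind of oscillation described above.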

Of course, the actual work editors are doing on the article is very uneven and follows classic long-tail behavior. Most editors make only a single contribution, but a handful of editors make dozens of contributions. Users O’Dea and Sandstein have each made more than 65 contributions over the last 36 hours.
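A minimal sketch of how this long-tail distribution could be tabulated from the list of usernames in the revision history. The function name and input format are assumptions made for illustration.

```python
from collections import Counter


def contribution_distribution(users):
    """Tabulate how unevenly edits are spread across editors.

    `users` is assumed to be a list with one username per revision
    (hypothetical input shape). Returns two Counters:
      per_editor:   username -> number of edits by that editor
      distribution: number of edits -> how many editors made that many
    """
    per_editor = Counter(users)
    distribution = Counter(per_editor.values())
    return per_editor, distribution
```

With a long tail, `distribution[1]` (editors with a single edit) would be by far the largest entry, while `per_editor.most_common(2)` would surface the handful of very active editors.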

Moreover, the activity on this article is also extremely intense. The chart below plots the distribution of time between edits. Again, the vast majority of edits occur within seconds or minutes of each other, and only twice in the entire 36-hour history of the article have 20 minutes gone by without someone making a change to the article.
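The time between edits can be sketched as the difference between consecutive revision timestamps. This assumes the timestamps are ISO 8601 strings in UTC such as "2012-07-20T19:00:00Z"; that format, and both function names, are assumptions for illustration.

```python
from datetime import datetime

# Assumed timestamp format; adjust if the revision log differs.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def inter_edit_gaps(timestamps):
    """Return the gaps, in seconds, between consecutive edits."""
    times = sorted(datetime.strptime(t, TIMESTAMP_FORMAT) for t in timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


def count_long_gaps(gaps, threshold_seconds=20 * 60):
    """Count gaps longer than the threshold (20 minutes by default)."""
    return sum(1 for gap in gaps if gap > threshold_seconds)
```

Applying `count_long_gaps` to the gaps for the full history is the kind of check behind the 20-minute observation above.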

Finally, I used a method to mine the log of revisions made to the article to create a network of users modifying each other’s work (see paper here). For example, if user B makes a change to the article which was previously the version from user A, a directed link would be created from user B towards user A: user B modified user A’s version. I can also encode a variety of other information into this network. I color the nodes such that bluer nodes are users who started editing early in the article’s history and redder nodes are users who joined the collaboration later. Larger nodes are editors who have more connections to other editors. Darker links are larger changes made to the article. Larger links are more interactions between editors. Visualization was done in Gephi.
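To make the link rule concrete, here is a minimal sketch that builds the directed edges from a chronological revision list. It is not the code used for this post: the input shape, the attribute names, the choice to skip consecutive edits by the same user, and measuring change size as the absolute difference in article bytes are all assumptions.

```python
from collections import defaultdict


def build_edit_network(revisions):
    """Build directed editor-to-editor links from a revision log.

    `revisions` is assumed to be a chronologically sorted list of
    (user, article_size_in_bytes) pairs (hypothetical input shape).
    Returns a dict mapping (editor, previous_editor) to
    {"interactions": int, "bytes_changed": int}.
    """
    edges = defaultdict(lambda: {"interactions": 0, "bytes_changed": 0})
    for (prev_user, prev_size), (user, size) in zip(revisions, revisions[1:]):
        if user == prev_user:
            # Assumption: an editor revising their own version is not a link.
            continue
        edge = edges[(user, prev_user)]      # user modified prev_user's version
        edge["interactions"] += 1            # could drive link width
        edge["bytes_changed"] += abs(size - prev_size)  # could drive link darkness
    return dict(edges)
```

Node attributes such as an editor's first edit time (for color) and number of distinct neighbors (for size) could be derived from the same log before exporting to a format Gephi reads.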

 

Eyeballing the diagram (which is no way to do real analysis) suggests that the most prolific editors joined the collaboration early but were not the first contributors either (they are at approximately 2 o’clock). The dense core of the network consists of several editors who appear to be working closely together, modifying each other’s work. O’Dea is at the center. Most of the editors who join later (greens, oranges, and reds) are on the periphery of the network, suggesting they make relatively minor contributions, sometimes with other less central editors, but their work is subsequently revised by the central editors. These high-activity editors are maintaining their involvement over time by interacting with many different types of editors who joined earlier and later (as indicated by being connected to nodes of different colors). However, the largest changes prolific users make are still small relative to the changes other users are making (lighter links compared to the black links elsewhere).

The data for this analysis is based on the article’s revision history (CSV here) and was translated into a network format (GEXF here) using custom code I promise I’ll post when it’s not embarrassingly hacky.

Update via Taha Yasseri:

Within 48 hours of the event, 30 Wikipedia language editions had coverage of the event. 10 of these Wikipedias had articles within 6 hours of the article on the English Wikipedia. Interestingly, articles in smaller languages like Latvian (3) and Danish (7) appeared before those in major languages like Polish (8), French (10), and German (13).