On Monday, October 27, Andy Baio posted an analysis of 72 hours of tweets with the #Gamergate hashtag. With the very best of intentions, he also shared the underlying data containing over 300,000 tweets saved as a CSV file. There are several technical and potential ethical problems with that, which I’ll get to later, but in a fit of “rules are for thee, not for me,” I grabbed this very valuable data while I could, knowing that it wouldn’t be up for long.
I did some preliminary data analysis and visualization of the retweet network using this data in my spare time over the next day. On Wednesday morning, October 29, I tweeted out a visualization of the network describing the features of the visualization and offering a preliminary interpretation, “intensely retweeting and following other pro-#gamergate is core to identity and practice. Anti-GG is focused on a few voices.” I intended this tweet as a criticism of pro-Gamergaters for communicating with each other inside an insular echo chamber, but it was accidentally ambiguous and it left room for other interpretations.
The tweet containing the image has since been retweeted and favorited more than 300 times. I also received dozens of responses, ranging from benign questions about how to interpret the visualization, to more potentially problematic questions about identifying users, to responses that veered into motivated and conspiratorially flavored misreadings. Examples of the latter are below:
— Paul Blank (@MrPeaTea) October 30, 2014
— Alrenous (@Alrenous) October 30, 2014
To be clear, I do not share these interpretations and I’ll argue that they are almost certainly incorrect (a good rule of thumb is to always back away slowly from anyone who says “data does not lie”). But I nevertheless feel responsible for injecting information having the veneer of objectivity into a highly charged situation. Baio mentioned in his post that he had a similar experience in posting results to Twitter before writing them up in more detail. A complex visualization like this is obviously ripe for misinterpretation in a polarized context like Gamergate, and I wasn’t nearly clear enough in describing the methods I used or the limitations on the inferences that can be drawn from this approach. I apologize for prematurely releasing information without doing a fuller writeup about what went in and what you should take away.
So let’s get started.
I will have to defer to Baio for the specific details on how he collected these original data. He said:
“So I wrote a little Python script with the Twython wrapper for the Twitter streaming API, and started capturing every single tweet that mentioned the #Gamergate and #NotYourShield hashtags from October 21–23.”
This data collection approach is standard, but has some very important limitations. First, this only looked at tweets containing the hashtags “#gamergate” and “#notyourshield.” These hashtags have largely been claimed by the “pro-gamergate” camp, but there are many other tweets on the topic of Gamergate under other partisan hashtags (e.g., “#StopGamerGate2014”) as well as tweets from people speaking on the topic but consciously not using the hashtag to avoid harassment. So the tweets in this sample are very biased towards a particular community and should not be interpreted as representative of the broader conversation. A second and related point is that these data do not include other tweets made by these users. On a topic like Gamergate, users are likely to be involved in many related and parallel conversations, so grabbing all these users’ timelines would ideally give a fuller account of the context of the conversation and other people involved in their mentions and replies.
A third point is that Baio’s data was saved as a comma-separated values (CSV) file, which is a common way of sharing data, but is a non-ideal way to share textual data. Reading the data back in, many observations end up being improperly formatted because errant commas and apostrophes in the tweets break fields prematurely. So much of the analysis involves checking that the values of fields are properly formatted and tossing those entries that are not. Out of 307,932 tweets, various stages of data cleanup will toss thousands of rows of data for being improperly formatted, depending on the kind of analysis I’m focusing on. While this was not a complete census of data to begin with, this is still problematic because the discarded rows are likely non-random: they are the ones containing a combination of mischief-causing commas and apostrophes, which is another important caveat. Please use formats like JSON (ideally) to share textual data like this in the future!
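To make the cleanup step concrete, here is a minimal Python sketch of that kind of check. It is not the code used for this analysis: the three-column layout and the `id` and `text` column names are assumptions made for illustration, and only the `user_statuses_count` field name comes from the data itself. Rows with the wrong number of fields, or with a non-numeric count, are set aside, and the surviving rows are written as JSON Lines, where commas and quotes inside a tweet are escaped for you.

```python
import csv
import io
import json

EXPECTED_FIELDS = 3  # assumed layout: id, user_statuses_count, text


def split_clean_rows(lines, expected_fields=EXPECTED_FIELDS):
    """Return (kept, dropped) rows.

    A row is kept only if it has the expected number of fields and its
    count column is made up of digits.
    """
    kept, dropped = [], []
    for row in csv.reader(lines):
        if len(row) == expected_fields and row[1].isdigit():
            kept.append(row)
        else:
            dropped.append(row)
    return kept, dropped


def to_json_lines(rows):
    """Serialize rows as JSON Lines (one JSON object per line)."""
    return "\n".join(
        json.dumps({"id": r[0], "user_statuses_count": int(r[1]), "text": r[2]})
        for r in rows
    )


# Toy input: the second row has an unquoted comma, so it parses as 4 fields.
sample = io.StringIO(
    "1,120,plain tweet\n"
    "2,87,a tweet, with a stray comma\n"
    '3,45,"a quoted tweet, with a comma"\n'
)
kept, dropped = split_clean_rows(sample)
json_lines = to_json_lines(kept)
```

The dropped rows are exactly the ones whose text contains troublesome punctuation, which is why the loss is non-random.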
To review, besides only being a three-day window, this dataset doesn’t include other Gamergate-related conversations occurring outside of the included hashtags, ignores participating users’ contextual tweets during this timeframe, and throws out data for tweets that contain particular punctuation. With these important caveats now made explicit, let’s proceed with the analysis.
Baio worked with the very awesome Gilad Lotan to do some network analysis of the follower network. I wanted to do something along similar lines, but looking at the user-to-user retweet network to understand how messages are disseminated within the community. For our purposes of looking at the structure of information sharing in GamerGate, we can turn to some really interesting prior scholarship that’s looked at how retweet networks can be used to understand political polarization [1,2] and the factors that influence people to retweet others [3,4]. Their work does far more in-depth analyses and modeling than I’ll be able to replicate for a blog post in the current time frame, but I wanted to highlight a few points. boyd and her coauthors identify a list of uses for retweeting, including amplifying information to new audiences, entertaining a specific audience, making one’s presence as a listener known to the author, or otherwise agreeing, validating, or demonstrating loyalty. These are obviously not exhaustive of all the uses of the retweet, but they can help frame the goals users have in mind when retweeting.
Using additional metadata in the file about the number of statuses and followers a user has around the time of each tweet, I can create additional variables. One measure is “tweet_delta”, or the difference between the maximum and minimum observed value for a user’s “user_statuses_count” field recorded for each of their tweets. This ideally captures how many total tweets the user made over the observation window, including those that do not appear in the dataset. A second related variable, “tweet_intensity”, is the ratio of tweets in the (cleaned) dataset to the tweet_delta. This value should range between 0 (none of the tweets this user made over this timespan contain #Gamergate/#NotYourShield) and 1 (all of the tweets this user made over this timespan contain #Gamergate/#NotYourShield).
A third measure is “friend_delta”, or the difference between the maximum and minimum observed value for the number of other users that a given user follows. Like the “tweet_delta” above, this captures how the number of friends (I prefer the term “followees”, but “friends” is the official Twitter term) a user has changes across their tweets. A similar measure can be defined for followers. Since you have less control over who or how many people follow you, friends/followees is a better metric for measuring changes in an individual’s behavior like actively seeking out information by creating new followee links. This value varies from 0 (no change in followees) to n (where n is the maximum number of followees observed over these 3 days).
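As a rough illustration of these three variables, the following Python sketch computes them from a list of per-tweet records. It is a sketch under stated assumptions rather than the code behind the figures: `user` and `user_friends_count` are assumed field names, and capping the ratio at 1 is my own choice here, since a user whose every tweet is in the dataset would otherwise come out slightly above 1 (n observed tweets span a count difference of only n - 1).

```python
from collections import defaultdict


def user_metrics(tweets):
    """Compute tweet_delta, tweet_intensity, and friend_delta per user.

    `tweets` is an iterable of dicts with the keys "user",
    "user_statuses_count", and "user_friends_count" (the first and last
    key names are assumptions for this sketch).
    """
    statuses, friends = defaultdict(list), defaultdict(list)
    for t in tweets:
        statuses[t["user"]].append(t["user_statuses_count"])
        friends[t["user"]].append(t["user_friends_count"])

    metrics = {}
    for user, counts in statuses.items():
        tweet_delta = max(counts) - min(counts)
        if tweet_delta > 0:
            # Cap at 1.0 (an assumption) to absorb the off-by-one described above.
            tweet_intensity = min(len(counts) / tweet_delta, 1.0)
        else:
            # Only one distinct count observed: the ratio is undefined.
            tweet_intensity = None
        metrics[user] = {
            "tweet_delta": tweet_delta,
            "tweet_intensity": tweet_intensity,
            "friend_delta": max(friends[user]) - min(friends[user]),
        }
    return metrics


# Toy records: user "a" appears three times, user "b" once.
sample_tweets = [
    {"user": "a", "user_statuses_count": 100, "user_friends_count": 50},
    {"user": "a", "user_statuses_count": 104, "user_friends_count": 50},
    {"user": "a", "user_statuses_count": 109, "user_friends_count": 53},
    {"user": "b", "user_statuses_count": 7, "user_friends_count": 12},
]
metrics = user_metrics(sample_tweets)
```

Users who appear only once have a tweet_delta of 0, so their intensity is left undefined instead of being forced to a number.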
The retweet graph visualization
The basic network relationship I captured was whether User A retweeted User B. This network has directed (A retweeting B is distinct from B retweeting A) and weighted (A could retweet many of B’s tweets) connections. Again, to be explicit, the large colored circles are users and the colored lines connecting them indicate that one retweeted the other (read an edge “A points to B” in the clockwise direction). This is not every retweet relationship in the data, but only those nodes belonging to retweet relationships where A retweeted B at least twice. This has the effect of throwing out even more data and structural information (so inferences about the relative size of clusters should take into account that single instances of retweeting have been discarded), but reveals the core patterns. This is an extremely coarse-grained approach and there are smarter ways to highlight the more important links in complex networks, but this is cheap and easy to do.
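The construction described above can be sketched in a few lines of Python. This is an illustration with made-up user names, not the original analysis code: each retweet is a (retweeter, original author) pair, the edge weight is the number of times that pair occurs, and edges below the threshold of two retweets are dropped. The last step counts how many distinct users retweeted each account in the filtered network, which is the in-degree.

```python
from collections import Counter


def retweet_edges(retweets, min_weight=2):
    """Build weighted, directed edges and keep those at or above min_weight.

    `retweets` is an iterable of (retweeter, original_author) pairs.
    """
    weights = Counter(retweets)
    return {edge: w for edge, w in weights.items() if w >= min_weight}


def in_degree(edges):
    """Count the distinct retweeters pointing at each account."""
    return Counter(target for _, target in edges)


# Toy data with hypothetical user names.
sample_retweets = [
    ("alice", "bob"), ("alice", "bob"),
    ("bob", "alice"),  # a single retweet, discarded by the filter
    ("carol", "bob"), ("carol", "bob"), ("carol", "bob"),
]
edges = retweet_edges(sample_retweets)
degrees = in_degree(edges)
```

Note that the single retweet from bob to alice disappears entirely, so alice has no incoming edge in the filtered graph even though she was retweeted.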
The x and y coordinates don’t have any substantive meaning like in a scatterplot; instead, I used the native ForceAtlas2 force-directed layout algorithm to position nodes relative to each other such that nodes with more similar patterns of connections are closer together. Making this look nice is more art than science and most of you can’t handle all my iterative layout heuristics jelly.
- I’ve sized the nodes on “in-degree” such that users that are retweeted by more unique users are larger and users that are retweeted by fewer are smaller.
- The color of the node corresponds to the “friend_delta” such that “hotter” colors like red and orange are larger changes in users followed and “cooler” colors like blue and teal are 0 or small changes in users followed. Nodes are colored grey if there’s no metadata available for the user.
- The color of the link corresponds to the “weight” of the relationship, or the number of times A retweeted B. Again hotter colors are more retweets of the user and cooler colors are fewer retweets within the observed data.
Manual inspection of a few of the largest nodes in the larger cluster reveals that these are accounts that I would classify as “pro-Gamergate” while the largest nodes in the cluster in the lower left I would classify as “anti-Gamergate.” I didn’t look at every node’s tweet history or anything like that, so maybe there are some people on each side being implicated by retweet association. There were a lot of questions about who the large blue anti-GG node is. Taking him at his word as someone who would welcome being targeted, this is the “ChrisWarcraft” account belonging to Chris Kluwe, who tweeted out this (hilarious) widely disseminated post on October 21, which is during the time window of our data.
— Chris Kluwe (@ChrisWarcraft) October 21, 2014
Let’s return to my original (and insufficient) attempt at interpretation:
Basic story: intensely retweeting and following other pro-#gamergate is core to identity and practice. Anti-GG is focused on a few voices.
— Brian Keegan (@bkeegan) October 29, 2014
The technical term that network scientists like myself use for images like the one above is a “hairball,” which often offers more sizzle than steak in terms of substantive insights. Eyeballing a diagram is a pretty poor substitute for doing statistical modeling and qualitative coding of the data (much more on this in the next section). Looking at a single visualization of retweet relationships from three days of data on a pair of hashtags can’t tell you a lot about authoritarianism, astroturfing, or other complex issues that others were offering as interpretations. I don’t claim to have the one “right” answer, but let me try to offer a better interpretation.
- The pro-GG sub-community is marked by high levels of activity across several dimensions. They retweet each other more intensively (the cluster is larger in a network where every edge represents at least 2 retweets). They are actively changing who they follow more than the anti-GG group (this would need an actual statistical test). It’s certainly the case that participants are highly distributed and decentralized, but as I discuss more below, it also suggests they’re highly insular and that retweeting each other’s content is an important part of supporting each other and making sense of outside criticism by intensively sharing information.
- I suspect the anti-GG sub-community is smaller not because there are fewer people opposed to GG, but because the data analysis and visualization choices Baio and I made included only those people using the hashtag and excluded people who only retweeted once. In other words, one shouldn’t argue there are more Republicans than Democrats by only looking at highly active #tcot users. Ignoring Kluwe’s post as an outlier, the anti-GG sub-community looks smaller but similarly dense.
- There’s a remarkable absence of retweeting “dialogue” between the two camps, something that’s also seen in other online political topics. Out of the thousands of users in the pro-GG camp, only 2 appear to retweet Kluwe’s rant. So contra the “diversity” argument, there actually appears to be a profound lack of information being exchanged between these camps which suggests they’re both insular. But if #Gamergate is where a lot of the pro-GG discussion happens while anti-GG discussion happens across many other channels not captured here, we can’t say much about anti-GG’s size or structure but we can have more confidence about what pro-GG looks like.
- The reaction among the pro-GG crowd to my visualization also gives me an unanticipated personal insight into the types of conversations that this image became attached to. The speed and extent to which the visualization spread, the kinds of interpretations it was used to support, and the conversations it sparked all suggested to me that there were many pro-Gamergaters looking for evidence to support their movement, denigrate critics, or delegitimize opponents. My first-hand experience observing these latter two points (the tweets above being a sample) lends further weight to many other critics’ arguments about these and other forms of harassment being part and parcel of tactics used by many pro-GGers.
If anything, I hope this exercise demonstrates that while visualization is an important part in the exploratory data analysis workflow, hairballs will rarely provide definitive conclusions. I already knew this, but as I said before, I should have known better. But to really drive the point home that you might fashion the same hairball visualization to support very different conclusions, here are some more hairballs below from the very same dataset using other kinds of relationships.
First off, here is the mention network where User A is linked to User B if User B’s account is mentioned in User A’s tweet. Now this is a classic hairball (Kluwe is again the isolated-ish green node in the upper left, for those of you keeping score at home). Links of weight 1 are black and higher weights range from cool to hot. Unlike the highly polarized retweet network, here we have an extremely densely connected core of nodes. I’ve decided to color the nodes by a different attribute than above, specifically normalized degree difference. This is calculated as (out-degree – in-degree)/(out-degree + in-degree) and varies from -1 where a user receives only mentions but never mentions anyone else (bluer) to 1 where a user only makes mentions but is never mentioned by anyone else (redder). There’s really no discernible structure as far as I can tell, and anti-GG accounts are mixed in with pro-GG accounts and other accounts like Adobe and Gawker that have been caught up.
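The normalized degree difference is simple enough to compute by hand, and a small Python sketch may make the two extremes clearer. This is an illustration, not the code behind the figure: it assumes degree means the number of distinct accounts on the other end of an edge (the formula works the same way with weighted counts), and the account names are made up.

```python
from collections import defaultdict


def normalized_degree_difference(mentions):
    """Return (out - in) / (out + in) for every user in the mention network.

    `mentions` is an iterable of (mentioner, mentioned) pairs. Degrees are
    counted over distinct neighbors, which is an assumption of this sketch.
    """
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for src, dst in mentions:
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)

    scores = {}
    # Every user appears in at least one edge, so the denominator is never 0.
    for user in set(out_nbrs) | set(in_nbrs):
        out_deg = len(out_nbrs.get(user, ()))
        in_deg = len(in_nbrs.get(user, ()))
        scores[user] = (out_deg - in_deg) / (out_deg + in_deg)
    return scores


# Toy data with hypothetical account names.
sample_mentions = [
    ("talker", "target"), ("talker", "bystander"),  # talker is never mentioned
    ("a", "b"), ("b", "a"),                         # a balanced exchange
]
ndd = normalized_degree_difference(sample_mentions)
```

An account that only talks at others scores 1, an account that is only talked at scores -1, and a balanced back-and-forth scores 0.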
But the node colors do tell us something about the nature of the conversation, namely, there are very many nodes that appear to be engaged in harassment (red nodes talking at others but not being responded to) and many nodes that are being targeted for harassment (blue nodes being talked at but not responding, like Feliciaday). Indeed, plotting this relationship out, the more tweets a user makes mentioning another account (x-axis), the lower their normalized degree difference (y-axis). I’ve fit a lowess line to clarify this relationship in red. In other words, we’re capturing one feature of harassment where more tweets mentioning other people buys you more responses from others up to about a dozen tweets and then continuing to tweet mentioning other people results in fewer people mentioning you in return.
Second, here’s the multigraph containing the intersection of the retweet and mention networks. User A is linked to User B if A both retweeted and mentioned B within the dataset. Unlike the previous graphs, I haven’t filtered the data to include only edges and nodes above weight 1, so there are more nodes and weaker links present. I’ve colored the nodes here by account_age, or the number of days the account existed before October 24. Bluer nodes are accounts created in recent weeks, redder nodes are accounts that have existed for years, and grey nodes are ones we have no data on. I’ve left the links as black rather than coloring by weight, but the edges are still weighted to reflect the sum of the number of mentions and number of retweets. This network shows a similarly polarized structure as the retweet network above. Manual inspection of nodes suggests the large, dense cluster of blue nodes in the upper-right is pro-GG and the less dense cluster of greener nodes in the lower-left is anti-GG. By overlapping the data in this way, we have another perspective on the structure of a highly polarized conversation. The pro-GG camp looks larger in size, owing to the choice not to discard low-weight links, which suggests that anti-GG participation is not as intense and cohesive as the tightly connected pro-GG camp, whose density suggests more insularity.
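Here is a minimal Python sketch of how such an intersection network and the account_age attribute could be built. It illustrates the idea under stated assumptions, not the original code: the account names and creation date are made up, and the account creation date is assumed to be available as a `datetime.date`.

```python
from collections import Counter
from datetime import date

REFERENCE_DATE = date(2014, 10, 24)  # the cutoff named in the text


def intersect_networks(retweets, mentions):
    """Keep edges present in both networks; weight = retweets + mentions.

    Both arguments are iterables of directed (source, target) pairs.
    """
    rt, mn = Counter(retweets), Counter(mentions)
    return {edge: rt[edge] + mn[edge] for edge in rt.keys() & mn.keys()}


def account_age(created_on, reference=REFERENCE_DATE):
    """Number of days the account existed before the reference date."""
    return (reference - created_on).days


# Toy data with hypothetical account names.
sample_rt = [("alice", "bob"), ("alice", "bob"), ("carol", "bob")]
sample_mn = [("alice", "bob"), ("bob", "alice")]
both = intersect_networks(sample_rt, sample_mn)
age = account_age(date(2014, 10, 1))
```

Because direction matters, bob mentioning alice does not create an edge unless bob also retweeted alice.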
It’s also worth noting there are substantially more new accounts in the pro-GG camp than the anti-GG camp. We can examine whether there’s a relationship between the age of the account and the clustering coefficient. The clustering coefficient captures whether one’s friends are also friends with each other: the pro-GG camp appears to have more clustering and more new accounts and the anti-GG camp appears to have less clustering and older accounts. The boxplots below bear this rough relationship out: as the clustering coefficient increases (the other users mentioned by a user also mention each other), the average age of these accounts goes down substantially. This also seems to lend more weight to the echo chamber effect: newly created accounts are talking within dense networks that veer towards pro-GG while older accounts are talking within sparser networks that veer towards anti-GG.
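For readers unfamiliar with the clustering coefficient, this short Python sketch computes the standard local version on a toy network. It assumes the network is treated as undirected, which may differ from the variant used for the boxplots, and the node names are made up.

```python
from collections import defaultdict
from itertools import combinations


def to_undirected(edges):
    """Turn (a, b) pairs into an undirected adjacency map, ignoring self-loops."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def local_clustering(adjacency):
    """Fraction of each node's neighbor pairs that are themselves connected."""
    coefficients = {}
    for node, nbrs in adjacency.items():
        k = len(nbrs)
        if k < 2:
            coefficients[node] = 0.0  # no neighbor pairs to compare
            continue
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in adjacency[u])
        coefficients[node] = 2 * linked / (k * (k - 1))
    return coefficients


# Toy network: a triangle (a, b, c) plus a pendant node d attached to a.
sample_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
clustering = local_clustering(to_undirected(sample_edges))
```

Node b sits entirely inside the triangle and scores 1, while a scores 1/3 because only one of its three neighbor pairs is connected.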
Third, here’s the hashtag network where User A (users are blue) is linked to Hashtag B (hashtags are red) if the user mentions the hashtag in the tweet. I’ve intentionally omitted the #Gamergate and #NotYourShield hashtags as one of these would show up in every tweet, so it’s redundant to include them. I’ve also focused only on the giant component, ignoring the thousands of other unconnected hashtags and users in the network. This graph is distinct from the others as it is a bipartite graph containing two types of nodes (hashtags and users) instead of one type of node (users, in the previous graphs). This graph is also weighted by the number of times the user mentions a hashtag (warmer = more). Some of the noticeable related hashtags are #gamer (top), #fullmcintosh (centerish), and #StopGamergate2014 (bottom right). Interestingly, many of these hashtags appear to be substrings of “gamergate” such as “gamerg”, “gamerga”, “gam”, etc., which are some combination of an artifact of Twitter clients shortening hashtags and improvisation among users to find related backchannels. But a number of anti-GG hashtags are present and connected here, suggesting the discussion isn’t as polarized as the RT graph would suggest. This likely reflects users including hashtags sarcastically, like a pro-GG user including #StopGamergate2014. There are also outwardly unrelated hashtags such as #ferguson, #tcot, and #ebola included.
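The bipartite construction and the giant-component filter can be sketched in Python as follows. This is an illustration, not the original code: the regular expression is a simplification of how Twitter parses hashtags, lowercasing tags before comparing them is an assumption, and the user names and tweets are made up.

```python
import re
from collections import Counter, defaultdict, deque

OMITTED = {"gamergate", "notyourshield"}  # present in every tweet, so dropped
HASHTAG = re.compile(r"#(\w+)")           # simplified hashtag pattern


def hashtag_edges(tweets):
    """Weighted user-to-hashtag edges; nodes are tagged with their type."""
    weights = Counter()
    for user, text in tweets:
        for tag in HASHTAG.findall(text):
            tag = tag.lower()
            if tag not in OMITTED:
                weights[(("user", user), ("hashtag", tag))] += 1
    return weights


def giant_component(edges):
    """Return the node set of the largest connected component (via BFS)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, best = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node] - component:
                component.add(nbr)
                queue.append(nbr)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Toy tweets from hypothetical users.
sample = [
    ("u1", "#GamerGate #tcot news"),
    ("u2", "#gamergate #tcot #ebola"),
    ("u2", "#NotYourShield #tcot"),
    ("u3", "#gamergate #gamerg"),
]
tag_edges = hashtag_edges(sample)
giant = giant_component(tag_edges)
```

Truncated tags such as “gamerg” survive the filter because only exact matches of the two collection hashtags are omitted, which is how those substrings end up as nodes in the graph.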
Each of these new networks reveals alternative perspectives about the structure and cohesiveness of Gamergate supporters and opponents. I should have shared all these images from the start, but these latter three required a bit more work to put together over the past few days. My priors about it being an insular echo chamber are borne out by some evidence and not by other hairballs. Each side might divine meaning from these blobs of data to support their case, but in the absence of actual hypothesis testing, statistical modeling, and qualitative coding of data, it’s premature to draw any conclusions. I did some other exploratory data analysis that suggests features associated with being pro-GG, like highly clustered networks and authoring many tweets mentioning other users, are associated with potentially harassing behavior like using newly created accounts and getting few replies from others.
So where should we go from here? First and most obviously, I hope others are collecting data about how Gamergate has unfolded over a wider range of time than three days and a wider set of hashtags than #gamergate itself. I hope that literature around online political polarization, online mobilization of social movements, and the like is brought to bear on these data. I hope qualitative and quantitative methods are both used to understand how content and structure are interacting to diffuse ideas. I hope researchers are sensitive to the very real ethical issues of collecting data that can be used for targeting and harassment if it falls into the wrong hands. I hope my 15 minutes of fame as an unexpected B-list celebrity in the pro-GG community doesn’t invite ugly reprisals.