The Infinite Zenith

Where insights on anime, games, academia and life dare to converge

What you get is what you see: social network analysis on a subset of the AnimeSuki forum community

“Thank you for the civil discussion. I wish others would be up to that standard.” —Akeiko Sumeragi

This post is somewhat unusual in that it deviates from the topics typically discussed here. What motivates it is a bit of curiosity: last year, I encountered a chart depicting a subset of AnimeSuki’s forum members. It remained little more than a novel illustration of how the community’s more tightly-knit sections were connected until recently, after I had taken a course on social network analysis and data mining (I had intended it to give me some background in data mining techniques for biology and other forms of big data). Some of the concepts covered renewed my interest in this chart, and so I decided to take a crack at seeing whether there were any notable patterns amongst the members of the AnimeSuki community it depicts. This post does make use of some social network analysis (SNA) terminology, but I’ll do my best to keep the talk accessible.

Before going on, let’s bring everyone up to speed on what social network analysis is. There’s a long, uninteresting definition, so for the purposes of this discussion, SNA is the application of concepts from graph theory to study how individuals in a group are related. Such networks are represented as a graph, which is a collection of vertices and edges [1, 2]. The vertices represent actors, and the edges denote any connection that two actors share (such as friendship, collaboration, communication, kinship, etc.). Patterns that arise in these graphs can help predict behaviours within the group, and SNA has diverse applications, being useful in anything from figuring out how genes might be related to studying the organisation of terror groups [1, 2]. While analysing a sub-network in AnimeSuki is nothing like studying genes or terrorism, it nonetheless serves as a reminder that SNA is very powerful and relevant in an age where connections and data continue to grow, and moreover, it does yield some interesting results for small-community interactions amongst anime fans.

  • I’m no expert on SNA, but I do have a minimal amount of familiarity with the topic. Upon re-encountering the chart on my hard disk, the notion of an interesting and unusual blog post came to mind: this chart could be converted into a social network graph, with the different individuals as nodes and the edges indicating their connectivity. I understand that my methods and definitions are nowhere near as rigorous as a formal publication would require. As such, I’m sure there are all sorts of errors and inconsistencies in the post, but since this was done purely for fun, accuracy should not be a major concern.

While this post may be written much more informally than a paper, I’ll still go into detail about what I did to gather the data, as well as what the metrics I’ve used mean, by comparing them to a network of friends. The original relationship chart acted as the network that I built the graph from. I treated it as an undirected graph on the assumption that two actors linked by an edge have above-average trust for one another. Lastly, I’ll assume that Mädchen und Panzer is a clique (i.e. everyone in it is connected to everyone else). Throughout this post, I will mention degree centrality, closeness centrality, betweenness centrality, PageRank and several others quite frequently. Degree centrality refers to the number of edges a node has in a graph; in a network of friends, someone with a high degree centrality has many friends [1]. Closeness centrality is the reciprocal of farness, where farness is the sum of a node’s distances to the other nodes. In graphs, the distance between two nodes is the number of edges on the shortest path between them [1]. In our example, someone with a high closeness value is influential amongst a sub-group of friends but does not particularly hang out with people outside that group. Betweenness centrality refers to the number of shortest paths between other pairs of nodes that pass through a given node [1]. In a group of friends, the people with a high betweenness centrality are those who are friends with a comparatively large number of people in other sub-groups. As a result, they are considered to be the people who disseminate information between different groups. Degree, closeness and betweenness are common SNA metrics, but other common ones include eigenvector centrality and PageRank. Eigenvector centrality measures how connected a node is to other well-connected nodes within the graph [3]. In a group of friends, this is the most popular person with the greatest influence, who typically does not communicate widely with the others. PageRank is a measure of importance: originally intended to assess the influence of a webpage, it is calculated by considering the number of connections a node has and the PageRank of the nodes it is connected to [4].
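To make the first three definitions a little more concrete, here is a minimal Python sketch that computes degree, closeness and betweenness centrality using only the standard library. The friendship network in it is made up for illustration (it is not the AnimeSuki data), and the code is my own assumption of a straightforward implementation, not what Gephi runs internally: closeness is the plain reciprocal of farness over reachable nodes, and betweenness uses Brandes' algorithm for unweighted graphs.

```python
from collections import deque

# Hypothetical friendship network (NOT the AnimeSuki data): two tight
# groups, {A, B, C} and {E, F, G}, joined only through D.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]


def build_graph(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Distance (number of edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def degree_centrality(graph):
    """Number of edges attached to each node."""
    return {node: len(nbs) for node, nbs in graph.items()}


def closeness_centrality(graph):
    """Reciprocal of farness (sum of distances to all reachable nodes)."""
    result = {}
    for node in graph:
        farness = sum(bfs_distances(graph, node).values())
        result[node] = 1.0 / farness if farness else 0.0
    return result


def betweenness_centrality(graph):
    """Brandes' algorithm: shortest paths between other pairs through a node."""
    score = {node: 0.0 for node in graph}
    for s in graph:
        order = []                          # nodes in the order BFS reaches them
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from either endpoint.
    return {node: value / 2 for node, value in score.items()}


if __name__ == "__main__":
    g = build_graph(EDGES)
    print("degree:     ", degree_centrality(g))
    print("closeness:  ", closeness_centrality(g))
    print("betweenness:", betweenness_centrality(g))
```

In this toy network, D only has two friends, yet every shortest path between the two groups runs through D, so D ends up with the highest betweenness. That is the "bridge" role described above.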

  • Figure I: The original relationship chart from AnimeSuki. I found it back in early 2014, and it was made sometime in April 2013 (as per the information in the image’s properties) by an AnimeSuki member. This is the chart that forms the basis for the calculations below, and as it’s public domain, I wouldn’t need ethics approval. As one can imagine, anything more complicated would take a fair bit of time and would require ethics approval from CAREB-ACCER to carry out. A more detailed study would also need access to the AnimeSuki database, and I imagine that I’d need to go to great lengths to obtain it. Between this and the time required to gain ethics clearance for such a project, it’s clear that no serious academic organisation would wish to analyse an anime community.

  • Figure II: This is the graph that Gephi generated using information solely from the relationship chart. Owing to the dynamics in a community, I had to make several assumptions when converting the chart into a social network: I decided to treat Mädchen und Panzer as a complete graph, while Sumeragi only had edges to Wilx and Myssa Rei (based on a priori knowledge). The Audience was likewise treated as a complete subgraph that lacked edges to any other part of the graph, and for obvious reasons, the popcorn machine was not included. While there are arrows indicating the directionality of a relationship in the chart, I assume that this will be an undirected graph; every edge is considered to represent trust, denoting an increased likelihood of sharing information with one another before releasing it to the forums. The default visualisation is not particularly informative and does not show any of the cliques that form. However, through visualisation of the metrics later on, some interesting properties can be obtained through Gephi: this software is amongst the best for graph and social network analysis. It’s easy to install, but at the time of writing, it only runs if the Java Runtime Environment 1.7 is installed (1.8 causes it to crash) [5].
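The conversion described in the Figure II caption can be sketched in a few lines of Python. All member names below are placeholders I made up (`panzer_1`, `audience_1`, `bridge` and so on), and the edge list is hypothetical; only the modelling assumptions come from the caption: groups become complete subgraphs, arrows lose their direction, and one group stays disconnected from the rest.

```python
from itertools import combinations


def add_edge(graph, u, v):
    """Add an undirected edge; adding the same edge twice has no effect."""
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)


def add_clique(graph, members):
    """Connect every member to every other member (a complete subgraph)."""
    for u, v in combinations(members, 2):
        add_edge(graph, u, v)


graph = {}

# Assumption from the caption: a group is treated as a complete graph.
# The member names are placeholders, not the real forum handles.
add_clique(graph, ["panzer_1", "panzer_2", "panzer_3", "panzer_4"])

# Assumption from the caption: a second complete subgraph with no edges
# to any other part of the graph.
add_clique(graph, ["audience_1", "audience_2", "audience_3"])

# Arrows on the chart are directed, but direction is dropped here: an arrow
# in each direction between the same two members collapses into one edge.
arrows = [("panzer_1", "bridge"), ("bridge", "panzer_1"), ("bridge", "panzer_2")]
for source, target in arrows:
    add_edge(graph, source, target)

# Every undirected edge appears in two adjacency sets, hence the halving.
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2

if __name__ == "__main__":
    print(len(graph), "nodes and", edge_count, "edges")
```

Storing neighbours in sets is a deliberate choice here: it makes the graph undirected and free of duplicate edges without any extra bookkeeping.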

So I’ve explained what the metrics mean, and I’ve converted the relationship chart into something that graph visualisation tools can understand using Gephi (Figure I, Figure II) [5]. What does all of that data actually mean? The results are summarised in Table I. The first metric to consider is weighted degree centrality (Figure III), which sums the weights of a node’s edges; since every edge in this graph carries the same weight, it amounts to the ordinary degree here. Notice that members of the Mädchen und Panzer community have the highest weighted degree centralities, and this is not surprising, since everyone knows everyone here. These individuals can be assumed to communicate frequently behind the scenes to coordinate their games and, as such, maintain strong ties to one another. On the whole, however, degree centrality is a relatively uninteresting metric, as it only indicates who has connections with whom. Closeness centrality determines who is more popular in a small group: the metric finds that Genji-chan (presently NoemiChan), Dr. Casey and Azuma Denton have the highest closeness values, while Terrestrial Dream, Endless Soul and Kanon have the lowest (Table I, Figure IV). This suggests that Genji-chan, Dr. Casey and Azuma Denton probably spend more time interacting with their preferred contacts, so if they were to come across anime-related news and information, their preferred contacts may learn of the news quite quickly, and the news would propagate from there. Betweenness centrality is (to me) one of the most exciting metrics, as it measures how important an individual is in linking communities together. Here, we find that KonaKaga has the highest value out of everyone in the network (83.05), which is to be expected; KonaKaga is a moderator and interacts with a large number of different groups. We also find that Hooves, FlavoryFantasy and Sumeragi have high betweenness values (Table I, Figure V). These are the individuals who bridge communities and play a major role in spreading news between different portions of the community. Where eigenvector centrality is concerned, Myssa Rei and Wilx have the highest values; these individuals post frequently in many locations and appear to be quite influential in their own community, but also seem to maintain a very small number of close connections with others in the network (Table I, Figure VI). Lastly, FlavoryFantasy, Chaos2Frozen and SaintessHeart have the greatest PageRank values, suggesting that these are the individuals with a large number of connections to other influential members (Table I, Figure VII).
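For readers curious how eigenvector centrality and PageRank can be calculated, the following Python sketch uses plain power iteration on a small made-up network. The node names, the damping factor of 0.85 and the iteration counts are my own assumptions for illustration; I have not checked what settings Gephi uses internally. The eigenvector scores are scaled so that the largest is 1, which is how the top values in Table I appear to be presented.

```python
# Hypothetical network (NOT the AnimeSuki data): a clique {A, B, C, D}
# with a short chain D - E - F hanging off it.
GRAPH = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D", "F"},
    "F": {"E"},
}


def eigenvector_centrality(graph, iterations=200):
    """Power iteration; scores are scaled so that the largest equals 1.

    Each node keeps its own score and adds its neighbours' scores, which is
    iteration with (A + I). This has the same leading eigenvector as the
    adjacency matrix A, but avoids oscillation on bipartite graphs.
    """
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new = {node: score[node] + sum(score[nb] for nb in graph[node])
               for node in graph}
        top = max(new.values())
        score = {node: value / top for node, value in new.items()}
    return score


def pagerank(graph, damping=0.85, iterations=100):
    """PageRank on an undirected graph, each edge acting as a two-way link."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in graph}
        for node, neighbours in graph.items():
            if neighbours:
                # A node splits its rank evenly amongst its neighbours.
                share = damping * rank[node] / len(neighbours)
                for nb in neighbours:
                    new[nb] += share
            else:
                # An isolated node spreads its rank over every node.
                for other in graph:
                    new[other] += damping * rank[node] / n
        rank = new
    return rank


if __name__ == "__main__":
    print("eigenvector:", eigenvector_centrality(GRAPH))
    print("pagerank:   ", pagerank(GRAPH))
```

In this toy graph, D scores highest on both measures because it sits inside the clique and also has one extra connection, while F, at the end of the chain, scores lowest.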

  • Table I: Results of the SNA. The values were obtained using the built-in tools from Gephi and are sorted alphabetically, with the absolute value given for each metric [5]. The values do not necessarily reflect the real-world properties of the AnimeSuki community, and only represent the values obtained from the network generated (Figure II) from the chart in Figure I under the assumptions described above. For formatting purposes, I’ve rounded the values to two decimal places.
    • *-Alternatively known as Naomi-chan, inactive
    • **-Alternatively known as Daigensui, inactive
| Label | Weighted Degree | PageRank | Closeness Centrality | Betweenness Centrality | Eigenvector Centrality |
|---|---|---|---|---|---|
| Ascaloth | 7.00 | 0.04 | 2.64 | 0.00 | 0.95 |
| Azuma Denton | 2.00 | 0.02 | 3.00 | 0.00 | 0.04 |
| Chaos2Frozen | 6.00 | 0.05 | 2.14 | 37.13 | 0.18 |
| Dr. Casey | 2.00 | 0.02 | 3.00 | 0.00 | 0.04 |
| Endless Soul | 2.00 | 0.02 | 1.50 | 0.00 | 0.01 |
| Eroking | 5.00 | 0.05 | 2.14 | 27.53 | 0.14 |
| FlavoryFantasy | 6.00 | 0.06 | 2.09 | 47.73 | 0.15 |
| Genji-Chan* | 2.00 | 0.02 | 3.32 | 0.00 | 0.03 |
| Hasumi | 3.00 | 0.03 | 2.55 | 18.42 | 0.07 |
| Hooves | 8.00 | 0.05 | 2.05 | 59.83 | 0.99 |
| Kamijou_Touma | 7.00 | 0.04 | 2.64 | 0.00 | 0.95 |
| Kanon | 2.00 | 0.03 | 1.00 | 1.00 | 0.01 |
| Kimidori | 7.00 | 0.04 | 2.64 | 0.00 | 0.95 |
| KonaKaga | 5.00 | 0.04 | 1.77 | 83.05 | 0.25 |
| Mangatron | 3.00 | 0.03 | 2.91 | 7.25 | 0.05 |
| Myssa Rei | 8.00 | 0.05 | 2.32 | 13.95 | 1.00 |
| Patchy | 2.00 | 0.02 | 2.86 | 0.75 | 0.05 |
| Ridwan | 2.00 | 0.02 | 2.77 | 0.00 | 0.07 |
| RRW | 7.00 | 0.04 | 2.64 | 0.00 | 0.95 |
| SaintessHeart | 6.00 | 0.05 | 2.09 | 31.65 | 0.19 |
| Seitsuki | 3.00 | 0.03 | 2.68 | 4.23 | 0.08 |
| Sumeragi** | 4.00 | 0.03 | 2.09 | 38.57 | 0.35 |
| Tak | 7.00 | 0.04 | 2.64 | 0.00 | 0.95 |
| Terrestrial Dream | 1.00 | 0.02 | 1.50 | 0.00 | 0.00 |
| Tsundere Louise | 2.00 | 0.02 | 2.73 | 0.95 | 0.06 |
| Wilx | 8.00 | 0.05 | 2.32 | 13.95 | 1.00 |

  • Figure III: Weighted degree centrality values. Nodes with a high weighted degree centrality are larger in size and red, while nodes with a low weighted degree centrality are smaller and have a lighter colour.

  • Figure IV: Closeness centrality values. Nodes with a high closeness centrality are larger in size and red, while nodes with a low closeness centrality are smaller and have a lighter colour.

  • Figure V: Betweenness centrality values. Nodes with a high betweenness centrality are larger in size and red, while nodes with a low betweenness centrality are smaller and have a lighter colour.

  • Figure VI: Eigenvector centrality values. Nodes with a high eigenvector centrality are larger in size and red, while nodes with a low eigenvector centrality are smaller and have a lighter colour.

  • Figure VII: PageRank values. Nodes with a high PageRank are larger in size and red, while nodes with a low PageRank are smaller and have a lighter colour.

We introduce a few new terms to discuss the network topology, that is, the metrics that measure patterns across the network as a whole. The graph density measures how many edges there are in the graph compared to how many edges are possible. A graph’s diameter is the longest of the shortest paths between any two nodes. The average path length is the average length of the shortest paths between all pairs of nodes in the graph; smaller values for the diameter and the average path length mean it’s easier to reach one node from another [1, 2]. Lastly, the clustering coefficient refers to the degree to which the nodes cluster together. Overall, Gephi finds that the graph density is 0.182, the diameter is 4, the average path length is 2.508 and the clustering coefficient is 0.585. Thus, while the graph is not particularly dense, the different nodes in the graph (each representing an AnimeSuki member) are reasonably well-connected. For comparison, in a random graph, the average path length is 3.23, the diameter is 6 and the clustering coefficient is 0.02 [6]. The AnimeSuki graph has a somewhat shorter average path length and diameter than the random graph, but a much higher clustering coefficient; this combination of short paths and high clustering corresponds to the small-world topology. A small-world graph can be roughly defined as a graph where most of the nodes are not neighbours of one another, but there are a sufficient number of connections such that all of the nodes can be reached through a relatively small number of jumps [7]. This is not surprising for an online community, since a large number of individuals in online communities are usually linked together by means of mutually shared connections. Thus, even if the individuals do not share connections with one another (recall that we assumed edges between two nodes to denote trust), information can nonetheless propagate rapidly throughout the group, and moreover, can continue propagating through the network even if a few connections are removed (for instance, permanently banned members have a relatively small impact, and the community nonetheless feels like it always was). The small-world topology is quite robust, although if choice nodes were targeted, the impact on community cohesion would be more substantial.
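The four topology metrics above can also be written out directly. The Python sketch below computes them for a small made-up graph (not the AnimeSuki data). I am assuming that path lengths are averaged over reachable pairs only and that the clustering coefficient is the average of the per-node local coefficients, in the style of Watts and Strogatz [7]; whether Gephi makes exactly the same choices is something I have not verified.

```python
from collections import deque

# Hypothetical graph (NOT the AnimeSuki data): two triangles bridged by D.
GRAPH = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D", "F", "G"},
    "F": {"E", "G"},
    "G": {"E", "F"},
}


def bfs_distances(graph, source):
    """Distance (number of edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def density(graph):
    """Edges present divided by edges possible, n * (n - 1) / 2."""
    n = len(graph)
    m = sum(len(nbs) for nbs in graph.values()) / 2
    return 2 * m / (n * (n - 1))


def path_stats(graph):
    """Return (diameter, average path length) over reachable pairs."""
    lengths = [d
               for source in graph
               for target, d in bfs_distances(graph, source).items()
               if target != source]
    return max(lengths), sum(lengths) / len(lengths)


def average_clustering(graph):
    """Mean over nodes of: neighbour pairs that are linked / possible pairs."""
    total = 0.0
    for node, nbs in graph.items():
        k = len(nbs)
        if k < 2:
            continue  # fewer than two neighbours contributes zero
        links = sum(1 for u in nbs for v in nbs if u < v and v in graph[u])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


if __name__ == "__main__":
    diameter, avg_path = path_stats(GRAPH)
    print("density:            ", round(density(GRAPH), 3))
    print("diameter:           ", diameter)
    print("average path length:", round(avg_path, 3))
    print("clustering coeff.:  ", round(average_clustering(GRAPH), 3))
```

Even this seven-node toy graph shows the pattern discussed above: it is sparse, yet the two triangles keep the clustering coefficient high while the single bridge keeps every node within four jumps of every other.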

  • Social networks are dynamic and ever-changing, so the values calculated here do not hold true for AnimeSuki as of 2015. This attribute of social networks makes them more difficult to study, since their structure (and corresponding properties) is fluid. One of the areas of interest in SNA is devising the means of studying dynamic networks. In my case, since the AnimeSuki community is quite tightly-knit, the predictions afforded by SNA remain quite consistent with what is actually seen at the forums. For those who are considering a critique of my methods, again, I note that I did not employ the same formalisations that a proper study would have, so readers should not trouble themselves with the nitty-gritty details that I might have missed. Had I wished to do something like this for keeps, I would have spent more than an hour putting it together.

Social network analysis was applied here (loosely!) to determine whether or not patterns existed within a subset of the AnimeSuki community. It was found that this group has a small-world topology, which accounts for how information can propagate quickly through the network even though the graph is relatively sparse, and for why bans on members with lower SNA values do not appear to interrupt information flow to any significant extent. Individuals who occupy the more interesting positions (high betweenness and eigenvector centrality) appear to have a more active role in the forums, being frequent posters who have a fair bit of influence over the discussions. Of course, the results of this experiment cannot be said to be particularly meaningful, as the forums extend well beyond the 26 members here, and the interactions are certainly more complex than depicted. Nonetheless, it represents a fun exercise that illustrates the sort of insights that SNA can offer on a network of individuals; it is not surprising that some of the metrics do indeed correspond with what is empirically observed, which can help quantify some trends and behaviours that would otherwise be quite difficult to describe through other means.

Bibliography

  1. Hanneman RA and Riddle M. (2005). Introduction to social network methods. Riverside, CA: University of California, Riverside.
  2. Wasserman S and Faust K. (1994). Social network analysis: Methods and applications (Vol. 8). Cambridge University Press.
  3. Spizzirri L. (2011). Justification and Application of Eigenvector Centrality. In: Algebra in Geography: Eigenvectors of Networks.
  4. Page L, Brin S, Motwani R and Winograd T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.
  5. Bastian M, Heymann S and Jacomy M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.
  6. Qin J et al. (2005). Analyzing terrorist networks: A case study of the global salafi jihad network. In Intelligence and security informatics (pp. 287-304). Springer Berlin Heidelberg.
  7. Watts DJ and Strogatz SH. (1998). Collective dynamics of ‘small-world’ networks. Nature 393 (6684): 440–442.
