Could it be that easy?
My initial thought was that the tightly clustered yellow nodes in the center made up the community I was interested in monitoring, but after further analysis, I discovered I was completely wrong.
Looking closer at the content of the interactions, I began to notice these accounts were actually ones that were verbally attacking the fifty or so accounts I had been watching.
The individuals on the list I was monitoring were in fact the blue nodes surrounding the center.
It occurred to me that I had gone about it all wrong.
Because there was such a strong reaction to those initial accounts being called out, an online lynch mob of sorts had formed that began to aggressively respond to everything the group posted.
Though I could see high volumes of bi-directional communication between the accounts of interest (football-shaped patterns between blue nodes), the attacking accounts were heavily skewing the data.
After this discovery, I sat back and tried to sort out what it all meant.
I had no doubt discovered something interesting in the data, but I had nearly classified a few dozen accounts as something they were not.
In fact, I started to realize that my strong reaction to this investigation, something stemming from my own morals and ethics, might in itself be somewhat misdirected.
After a bit of reflection, I decided to re-approach the investigation, but this time with a different mindset.
Setting aside my personal bias, I simply looked to the facts to tell the story.
My now obvious first mistake was that I based my initial collection of data on a publicly tweeted list that had rapidly gained a lot of attention.
I had hastily chosen a list that thousands of others had access to and were directly acting upon.
The mere existence of that list had, in fact, resulted in creating the exact pattern I was expecting to find.
I accepted that first mistake as one that was easy to make and treated it as a learning experience, but I was really mad at myself for making the second one, because I knew better…
I had gone into the investigation with such a strong opinion about what I would find that I jumped to a conclusion the moment a pattern emerged that met my expectations.
This was something that I had worked hard to train myself not to do and something I have cautioned my students to watch out for.
I unwittingly allowed my own personal beliefs to interfere with the results.
I decided a better approach would be to take the list of accounts that were confirmed to be of interest, enumerate the friend/follower networks of those accounts, and then build out a tentative “community” with the resulting data set.
By selecting just the nodes that had ten or more relationships in common, I was able to home in on 115 accounts that had a high probability of being relevant to my investigation.
Filtered list of friend/follower relationships
This list was still not perfect, but it would serve as a better starting point, since it was derived from friend/follower relationships instead of biased responses.
The attacking accounts would still appear in the graph, but at least there should be less bias in the collection of data.
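The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the tooling I actually used: the input format, the function name `shared_relationship_filter`, and the account names are all made up, and it assumes "ten or more relationships in common" means an account appears in the friend/follower networks of at least ten of the confirmed accounts.

```python
from collections import Counter


def shared_relationship_filter(networks, min_shared=10):
    """Return accounts linked to at least `min_shared` seed accounts.

    `networks` maps each seed account to the set of accounts in its
    friend/follower network (a hypothetical input format).
    """
    counts = Counter()
    for seed, connections in networks.items():
        # Each seed contributes at most one count per connected account,
        # and a seed never counts toward itself.
        counts.update(set(connections) - {seed})
    return {account: n for account, n in counts.items() if n >= min_shared}


# Toy data: 12 made-up seeds all connect to "acct_common",
# while "acct_rare" appears in only 3 of the networks.
seeds = [f"seed_{i}" for i in range(12)]
networks = {s: {"acct_common"} for s in seeds}
for s in seeds[:3]:
    networks[s].add("acct_rare")

community = shared_relationship_filter(networks, min_shared=10)
print(community)  # {'acct_common': 12}
```

Because the threshold depends only on who follows whom, accounts that merely reply to or attack the seed accounts do not earn counts just by being loud, which is the point of switching away from response-based collection.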
This event taught me a valuable lesson.
Though I felt I was better prepared than most, I was still vulnerable to letting my emotions drive an investigation.
It was still possible for me to fall into the trap of just looking for what I wanted to find…
Data visualization tools like Graphistry are incredibly powerful.
I personally feel they are some of the most significant pieces of technology I have discovered in my career.
The way they are able to assist an analyst in finding a needle in a massively sized haystack of data is fascinating to say the least.
This specific investigation made me realize, though, that with any new technology come new challenges.
As analysts, whether we are on the job or helping the community on our personal time, we must strive to eliminate as much personal bias from our research as possible.
To do this, I suggest trying to continuously loop through alternate theories that might explain the outcome more accurately, even if that explanation completely goes against your original hypothesis.
It is up to us to find the truth in the data.
In the end, I made the decision not to pursue that group any further.
Over the period of time that I was watching, the only truly harmful behavior I saw was from the mob that formed around them.
With increasing frequency, we see reactions on social media that are primarily based on emotion rather than fact.
I just did not feel comfortable sharing any research that could easily have a negative influence on the opinions of others.
Knowing just how quickly I jumped to my own conclusion, I felt the right thing to do was stand down from this case and move on to the next one.
My quest for truth in the data continues…
“The human talent for pattern recognition is a two-edged sword.
We’re especially good at finding patterns, even when they aren’t really there.
Something known as false pattern recognition.
We hunger for significance, for signs that our personal existence is of special meaning to the universe.
To that end, we’re all too eager to deceive ourselves and others, to discern a sacred image in a grilled cheese sandwich or find a divine warning in a comet.”
Neil deGrasse Tyson, Cosmos: A Spacetime Odyssey