And more specifically, threat intelligence? Well, we plot hundreds and thousands of attributes and relationships within threat intelligence.
The amount of data we collect on the likes of malware & threat actors results in big data challenges that only data science can solve.
Handling extremely large data sets and making sense of them is one of the largest challenges we face in cyber security.
We can use network science to solve big data challenges with threat intelligence data.
We can look to understand the relationships between things like threat actors and malware strains, which over time become very difficult to manage at scale.
By applying network science, we can build complex networks of relationships within our threat intelligence gathering.
Being able to build these complex networks programmatically not only helps us make more sense of the data we are collecting, but also gives us a huge range of capabilities for building new solutions and integrating with security systems.
Below is a basic example of how network science can be applied to an APT network to understand relationships.
Whilst this is a small example, when building large complex networks, Network Science is really important to be able to handle big data sets.
This tutorial will provide an entry-level understanding of how we can program networks in Python for large-scale network analysis.
What is NetworkX in Python?
NetworkX is an open-source network analysis package for Python that allows us to perform network science.
It was developed in 2005 and is a package for the creation, manipulation and study of the structures, dynamics and functions of complex networks.
So, let's install NetworkX with pip (pip install networkx).
Now we must import networkx, which we will name nx for the rest of this tutorial.
We will also import matplotlib's pyplot module to allow us to draw our graphs; we will call this plt.
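The setup described above can be sketched as follows. The original command and import lines did not survive in this copy of the article, so treat this as a reconstruction that assumes a standard pip-based install.

```python
# Install from a shell first (not from inside Python):
#   pip install networkx matplotlib

import networkx as nx            # "nx" is the alias used throughout this tutorial
import matplotlib.pyplot as plt  # "plt" is used later for drawing graphs

# Quick check that the import worked.
print(nx.__version__)
```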
Next, we will define a node graph object by calling the graph constructor.
Our graph is currently empty, which we can confirm by inspecting its nodes attribute.
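A minimal sketch of creating the empty graph and confirming that it has no nodes yet. The variable name G is an assumption carried through the rest of these examples.

```python
import networkx as nx

# Calling the constructor with no arguments gives an empty undirected graph.
G = nx.Graph()

# The nodes attribute is a view over the graph's nodes; it is empty for now.
print(list(G.nodes))
print(G.number_of_nodes())
```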
Helper function: This function prints all the nodes we have on our graph and draws a visualisation of it.
We will also set the layout of our graphs.
We will extract node attributes within our code and we will display them as labels.
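One possible shape for such a helper is sketched below. The function name show_graph, the "label" attribute name, the show flag, and the choice of a seeded spring layout are all assumptions for illustration, since the original helper is not reproduced in this copy.

```python
import matplotlib.pyplot as plt
import networkx as nx


def show_graph(G, label_attr="label", show=True):
    """Print the nodes of G, then draw it with attribute-based labels."""
    nodes = list(G.nodes)
    print("Nodes:", nodes)

    # Assumed layout: a spring layout with a fixed seed for repeatable positions.
    pos = nx.spring_layout(G, seed=42)

    # Use the chosen node attribute as the label, falling back to the node identifier.
    labels = {n: data.get(label_attr, n) for n, data in G.nodes(data=True)}

    fig, ax = plt.subplots()
    nx.draw(G, pos, ax=ax, labels=labels, with_labels=True, node_color="lightblue")
    if show:
        plt.show()
    plt.close(fig)
    return nodes


# Small demonstration graph with placeholder node names.
demo = nx.Graph()
demo.add_node("A", label="Node A")
demo.add_node("B")
demo.add_edge("A", "B")
shown = show_graph(demo, show=False)
```

Passing show=False skips the plot window, which is handy when running the helper in a script or a test.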
Add our First Node
Let's get started… First, let's add a node.
Each node in NetworkX has an identifier, which lets us add any specific node we like with add_node.
We can add several more nodes at once with add_nodes_from.
The main purpose of creating complex networks is joining nodes together.
We join nodes together by adding what we call “Edges”.
In NetworkX, edges are essentially tuples consisting of two node identifiers, one for origin and one for destination.
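As a minimal sketch of the steps above, using placeholder identifiers "A", "B" and "C" rather than the article's original values:

```python
import networkx as nx

G = nx.Graph()

# Add a single node by its identifier.
G.add_node("A")

# Add several nodes at once from any iterable of identifiers.
G.add_nodes_from(["B", "C"])

# An edge is described by two node identifiers: origin and destination.
G.add_edge("A", "B")

print(list(G.nodes))
print(list(G.edges))
```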
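A single edge is added with add_edge.
You can also add edges in bulk with add_edges_from, which is necessary when you plot large data sets.
Here we will show the concept of nbunches and ebunches: an nbunch is an iterable of nodes, and an ebunch is an iterable of edge tuples.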
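A minimal sketch of bulk loading with an nbunch and an ebunch. The actor and tool names here are hypothetical placeholders, not real intelligence data.

```python
import networkx as nx

G = nx.Graph()

# nbunch: any iterable of node identifiers (placeholder names).
nbunch = ["ActorA", "ActorB", "ToolX", "ToolY"]

# ebunch: any iterable of edge tuples (origin, destination).
ebunch = [("ActorA", "ToolX"), ("ActorB", "ToolY")]

G.add_nodes_from(nbunch)
G.add_edges_from(ebunch)

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```

Note that add_edges_from creates any nodes it has not seen before, so the explicit add_nodes_from call is only needed for nodes that have no edges yet.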
We will create a family network. For the purpose of this demonstration, let's use some threat actors and see how they are connected.
Once we have plotted our bunches, we might want to create a list of neighbours for one threat actor.
As we only have two nodes within each bunch, each node has only one neighbour, but at scale this won't be the case. When you have a large network, you can list the neighbours to query things like an actor or a technology.
So let's see what neighbours "GoldDragon" has by calling the neighbors function built into NetworkX.
We could also remove nodes and edges from our network if we wanted to.
When dealing with large sets of data, you might find that a threat actor or technology is incorrect or benign, for example.
So we can remove this by calling the remove_node function.
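The neighbour lookup and node removal described above might look like the sketch below. Only "GoldDragon" comes from the article; every other node name is a hypothetical placeholder, and the pairings are illustrative rather than real attribution.

```python
import networkx as nx

G = nx.Graph()

# Two small "bunches" of two nodes each (placeholder pairings).
G.add_edges_from([
    ("GoldDragon", "ExampleActorA"),
    ("ExampleToolB", "ExampleActorB"),
])

# neighbors() returns an iterator over the nodes adjacent to the given node.
golddragon_neighbours = list(G.neighbors("GoldDragon"))
print(golddragon_neighbours)

# Suppose ExampleToolB turned out to be benign: removing the node
# also removes every edge attached to it.
G.remove_node("ExampleToolB")
remaining = list(G.nodes)
print(remaining)
```

A single relationship can be dropped in the same way with remove_edge, which leaves both nodes in place.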
Let's use DiGraphs (directed graphs) to plot direction between malware strains.
You might want to use this to show malware used in an attack campaign consisting of multiple strains.
I’m going to use the malware from my latest blog post as an example.
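A minimal sketch of such a directed campaign graph, together with the successor, predecessor and subgraph queries that a DiGraph supports. The edge list assumes Emotet drops both Trickbot and Ryuk, which is how this article describes the campaign; the variable names are assumptions.

```python
import networkx as nx

# Directed edges point from the dropper to the malware it delivers.
campaign = nx.DiGraph()
campaign.add_edges_from([
    ("Emotet", "Trickbot"),
    ("Emotet", "Ryuk"),
])

# Successors follow outgoing edges; predecessors follow incoming edges.
successors = list(campaign.successors("Emotet"))
predecessors = list(campaign.predecessors("Emotet"))
print("Successors:", successors)
print("Predecessors:", predecessors)

# A subgraph keeps only the listed nodes and the edges between them.
pair = campaign.subgraph(["Emotet", "Trickbot"])
print("Subgraph edges:", list(pair.edges))
```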
Once we have plotted this, we have access to the successors and predecessors functions. For successors, we follow outgoing edges; for predecessors, we follow incoming edges.
For example, we can see that the successors of Emotet are Trickbot and Ryuk during this campaign, because Emotet was used as the dropper. We can see that there are no predecessors for Emotet because of this.
If we want to look at one specific relationship, we can use subgraph, which returns a subset of nodes and the edges between them. We could get a subgraph of the network between Emotet and Trickbot this way.
So, let's introduce a new attack campaign that might relate to our existing one. We will do this to show you how to combine two networks together.
We can use the union function to combine connected fragments of the network. In this example they are not connected, but it shows how you might like to combine fragments and build out more complex networks.
As the node sets of G and H are not disjoint, we must rename the nodes of each network (union accepts a rename argument that prefixes the node names) for NetworkX to combine them.
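A minimal sketch of the union step. G reuses the Emotet campaign from earlier in the article, while H and its "ExamplePayload" node are hypothetical stand-ins for the second campaign, which is not reproduced in this copy.

```python
import networkx as nx

# Existing campaign.
G = nx.DiGraph([("Emotet", "Trickbot"), ("Emotet", "Ryuk")])

# Hypothetical second campaign that shares the "Emotet" node name.
H = nx.DiGraph([("Emotet", "ExamplePayload")])

# union() requires disjoint node sets, so prefix each graph's node names.
combined = nx.union(G, H, rename=("G-", "H-"))

print(sorted(combined.nodes))
print(combined.number_of_edges())
```

The two fragments stay separate here because the shared node becomes "G-Emotet" and "H-Emotet". If you would rather merge nodes with the same name, nx.compose is the related call to look at.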
Next we’re going to create a new network, but use the same nodes.
We will define these as old and new, allowing us to compare the edges.
This is a VERY basic example of how you might perform this.
But, at scale you can see why this is extremely important.
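One way to perform this comparison is with plain set arithmetic over the edges, sketched below. The edge_set helper and the edges in old and new are illustrative assumptions; the article's original comparison code is not shown in this copy.

```python
import networkx as nx

nodes = ["Emotet", "Trickbot", "Ryuk"]

# Same nodes in both graphs, different edges (illustrative values).
old = nx.Graph()
old.add_nodes_from(nodes)
old.add_edges_from([("Emotet", "Trickbot")])

new = nx.Graph()
new.add_nodes_from(nodes)
new.add_edges_from([("Emotet", "Trickbot"), ("Trickbot", "Ryuk")])


def edge_set(graph):
    """Return undirected edges as frozensets so (u, v) equals (v, u)."""
    return {frozenset(edge) for edge in graph.edges}


added = edge_set(new) - edge_set(old)
removed = edge_set(old) - edge_set(new)

print("Added:", [sorted(edge) for edge in added])
print("Removed:", [sorted(edge) for edge in removed])
```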
We can also look at the node degree, which is the number of edges a node has.
Let's say you have an extremely large data set with hundreds of nodes and edges between them.
You might want to query the degree attribute to see how many edges each node has.
This can allow you to understand metrics and variations in things like malware and APT relationships.
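A minimal sketch of degree queries on the small Emotet campaign graph. On a directed graph, in_degree and out_degree split the total by direction, and each of these views also accepts an nbunch to restrict the result to specific nodes.

```python
import networkx as nx

G = nx.DiGraph([("Emotet", "Trickbot"), ("Emotet", "Ryuk")])

# Total number of edges touching each node, regardless of direction.
total = dict(G.degree())

# Incoming and outgoing edges counted separately.
incoming = dict(G.in_degree())
outgoing = dict(G.out_degree())

# Restrict the query to specific nodes by passing an nbunch.
subset = dict(G.degree(["Emotet", "Ryuk"]))

print("degree:", total)
print("in_degree:", incoming)
print("out_degree:", outgoing)
print("subset:", subset)
```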
However, degree does not differentiate between incoming and outgoing edges. To do this on a directed graph, you can ask for in_degree and out_degree instead.
You can also query specific nodes by passing them as an nbunch.
This is the end of Part One of this tutorial series.
Part Two is coming soon.
See the full code from this article on my GitHub; you are welcome to contribute or fork as you wish: rylittlefield/Network-Analysis-of-Threat-Actors-Malware-Strains-PART-1- (github.com)
If you liked this article, please hit the follow button or view my other blog posts at littlefield.co.
And if you're on socials: Ryan Littlefield (@rylittlefield) on Twitter.
Morales-Luna, Chapter 5 — Characterization and Traversal of Large Real-World Networks, Editor(s): Rajkumar Buyya, Rodrigo N.
Calheiros, Amir Vahid Dastjerdi, Big Data, Morgan Kaufmann, 2016, Pages 119–136, ISBN 9780128053942, https://doi.