Capture the Essence of Any Video: Visualize its Subtitles as a Graph

Let me demonstrate how it works.

Step 1: Network Prism - Getting an Overview of the Video’s Content

First, you need to find a video that you’re interested in.

It can be a MOOC lecture, a TED talk, or a YouTube video.

In the example below I will use one of the most popular TED talks: Ken Robinson on how “schools kill creativity”.

You will then need to get the subtitles of this video: these can be downloaded as a file or simply extracted from YouTube (as it provides automatic captions to most uploads in many different languages).

The subtitles can then be visualized using a text-to-network transformation algorithm.

In this example I use the open-source InfraNodus software, which can automatically extract the subtitles from any YouTube video and visualize them as a network graph.

If your video is not on YouTube, you can simply copy and paste the subtitles from the file into a new graph in InfraNodus.
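If you work with a downloaded subtitles file yourself, the first step is to turn it into a list of timed excerpts. Here is a minimal sketch of that step, assuming the file is in the common SRT format (a cue number, a `start --> end` line, then the text, with cues separated by blank lines). The function name and the sample cues are made up for illustration.

```python
def parse_srt(srt_text):
    """Turn SRT text into a list of (start_seconds, text) pairs."""
    cues = []
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        # A valid cue has a number line, a timing line and at least one text line.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start = lines[1].split("-->")[0].strip()   # e.g. "00:03:05,000"
        clock = start.split(",")[0]                # drop the milliseconds
        hours, minutes, seconds = (int(part) for part in clock.split(":"))
        cues.append((hours * 3600 + minutes * 60 + seconds, " ".join(lines[2:])))
    return cues

# Made-up sample cues, only to show the expected input shape.
sample = """1
00:00:12,000 --> 00:00:15,000
Good morning.
How are you?

2
00:03:05,000 --> 00:03:09,000
Creativity is as important as literacy.
"""
cues = parse_srt(sample)
print(cues)
```

Each excerpt keeps its start time in seconds, which is what makes it possible to jump to the right moment of the video later on.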

This is the result:

Figure: Automatic import of the Ken Robinson talk using the InfraNodus YouTube Import feature.

You can see an interactive version of this graph at https://infranodus.com/ted/iG9CE55wbtY

The words are normalized (using a lemmatization algorithm), the stopwords are excluded (the stopword list can be adjusted), and the main topical clusters are identified (indicated with different colors). The words that link those clusters together, that is, the nodes with a higher betweenness centrality, are shown bigger on the graph.
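To give a feel for how a text becomes a network, here is a simplified sketch in plain Python. It is not the InfraNodus implementation: the tiny stopword list, the choice to link only neighbouring words, and the sample sentence are all assumptions for the example, and lemmatization and cluster detection are left out to keep it short. Betweenness centrality is computed with Brandes’ algorithm for unweighted graphs.

```python
import re
from collections import defaultdict, deque

# Assumption: a tiny illustrative stopword list, not the one InfraNodus uses.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "that", "we", "our", "for"}

def tokenize(text):
    """Lowercase the text, keep the words, drop stopwords (no lemmatization here)."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def build_graph(words, span=1):
    """Link every word to the `span` words that follow it."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + span]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)

def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized scores)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []                         # nodes in order of increasing distance
        preds = {v: [] for v in graph}     # predecessors on shortest paths
        paths = dict.fromkeys(graph, 0)    # number of shortest paths from source
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected path was counted once from each of its two endpoints.
    return {v: s / 2 for v, s in score.items()}

# Made-up sentence for illustration.
text = "people talk of education and the education system is an idea for a child to dance in education"
scores = betweenness(build_graph(tokenize(text)))
top_word = max(scores, key=scores.get)
print(top_word)
```

The word with the highest score is the one that most often sits on the shortest path between other words, which is why such words end up as the big connecting nodes on the graph.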

As a result we obtain a pretty good representation of what the video is about in just a few seconds. We can see that the main topics are:

• education - system - idea
• people - talk - human
• hear - happen - woman
• child - dance - future

The most influential words in this talk are:

• education - people - year - talk

This gives a pretty good idea of what the talk is about.

Step 2: Nonlinear Watching - Get to the Most Interesting Part of the Video

Now that we have an overview, we might want a more precise idea of the content, putting the keywords and topics we extracted into context.

The best way to do that is to select the topic or the keywords we’re interested in, for example:

education | creativity | important

and then search the subtitles for the part of the text that contains the highest concentration of these terms.
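One simple way to run such a search is to measure what share of the words in each subtitle excerpt belongs to the selected terms and pick the excerpt with the highest share. The sketch below assumes the subtitles are already split into (start_seconds, text) pairs. The sample excerpts are made up, and this is not necessarily how InfraNodus ranks statements.

```python
import re

def concentration(text, terms):
    """Share of the words in `text` that belong to `terms` (0.0 if there are no words)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(word in terms for word in words) / len(words)

def best_excerpt(excerpts, terms):
    """Return the (start_seconds, text) pair with the highest concentration of `terms`."""
    terms = {term.lower() for term in terms}
    return max(excerpts, key=lambda item: concentration(item[1], terms))

# Made-up excerpts for illustration; real ones would come from the subtitles file.
excerpts = [
    (12, "good morning how are you"),
    (185, "creativity is as important in education as literacy"),
    (550, "dance should be taught to every child"),
]
start, text = best_excerpt(excerpts, ["education", "creativity", "important"])
print(start, text)
```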

If you click these nodes in InfraNodus, you’ll be able to get to the relevant parts of the text:

Figure: Click on the nodes you’re interested in on the graph to see the corresponding excerpt of the video. (Use the interactive version at https://infranodus.com/ted/iG9CE55wbtY to try it out.)

On this graph I see the excerpt from the subtitles file where the speaker is talking about “education”, “creativity” and “important”.

Click on the video link in that excerpt and you will get directly to the timecoded part of the video where Ken Robinson is talking about these topics: http://youtu.be/iG9CE55wbtY?t=185

You can then select some other part of the video, for example:

Figure: Select the topic you’re interested in on the graph, find the part of the video where Ken Robinson is talking about the importance of using dance in education, then click on the link to see him speak about this directly: http://youtu.be/iG9CE55wbtY?t=550

This offers a new, non-linear way to watch videos, focusing on the important concepts, which can be a huge time-saver in the context of information overload.
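The timecoded links used above follow a simple pattern: the video ID plus a `t` parameter holding the offset in seconds. A minimal sketch of how such a link could be built from a subtitle timestamp, assuming the timestamps use the SRT `HH:MM:SS,mmm` format (the helper names are hypothetical):

```python
def srt_time_to_seconds(timestamp):
    """Convert an SRT timestamp such as '00:03:05,250' to whole seconds."""
    clock = timestamp.split(",")[0]  # drop the milliseconds
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def timecoded_link(video_id, timestamp):
    """Build a short YouTube link that starts playback at the given timestamp."""
    return f"http://youtu.be/{video_id}?t={srt_time_to_seconds(timestamp)}"

print(timecoded_link("iG9CE55wbtY", "00:03:05,000"))
# http://youtu.be/iG9CE55wbtY?t=185
```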

Step 3: Intelligent Skimming - Getting the Gist of the Video

If you don’t feel like interacting with a graph, you can still get to the most essential parts of the video by selecting the excerpts that have the highest concentration of the main words for each topic.

In InfraNodus this feature is available in the Essence tab, which is essentially a summarization tool.

It arranges the excerpts that contain the main topics in chronological order, so after you watch four of them (15 seconds each), you can get a pretty good idea of the video’s content.
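A rough sketch of this idea: for each topic, keep the excerpt that contains the most of that topic’s words, then order the selected excerpts by their start time. The topics are taken from the graph of the talk, while the excerpts and the scoring rule are assumptions for illustration, not the actual InfraNodus summarizer.

```python
def topic_hits(text, topic):
    """Number of distinct topic words that occur in the excerpt."""
    return len(set(text.lower().split()) & topic)

def essence(excerpts, topics):
    """Pick the best excerpt for each topic and return them in chronological order."""
    chosen = set()
    for topic in topics:
        best = max(excerpts, key=lambda item: topic_hits(item[1], topic))
        if topic_hits(best[1], topic) > 0:   # skip topics that never occur
            chosen.add(best)
    return sorted(chosen)                    # tuples sort by start time first

topics = [
    {"education", "system", "idea"},
    {"people", "talk", "human"},
    {"child", "dance", "future"},
]
# Made-up excerpts as (start_seconds, text); real ones would come from the subtitles.
excerpts = [
    (550, "every child should dance for the future"),
    (40, "people love to talk about it"),
    (185, "the whole education system rests on one idea"),
    (300, "and then something else happened"),
]
summary = essence(excerpts, topics)
for start, text in summary:
    print(start, text)
```

Because the selected excerpts are sorted by time, watching them in order gives a condensed version of the talk that still follows its original flow.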

Figure: Too lazy to navigate? Click the “Essence” tab and see the most important stuff: https://infranodus.com/ted/iG9CE55wbtY

This data can of course also be streamed to another application using an API, which could then offer a more user-friendly interface to play the video (not everyone likes graphs).

Let me know if you’d like to build an app like that :)

This feature is pretty cool, because instead of watching a 20-minute video or skimming through it randomly, you can now skim it much more efficiently, in a matter of seconds.

The algorithm chooses the main junctions for meaning circulation and helps you get to the parts of the video that are most relevant for the discourse formation.

You can try this with several popular TED talk videos we visualized below to see how it works.

Just open the page, click on the “Essence” tab at the top left corner, and then follow the timecoded YouTube links from each statement to get to the relevant parts of the videos:

• What makes a good life? Lessons from the longest study on happiness by Robert Waldinger (video)
• The Power of Introverts by Susan Cain (video)
• 10 Ways to Have a Better Conversation by Celeste Headlee (video)

Step 4: Generate Insight - Find the Structural Gaps in the Network

Most recommender systems work on the basis of similarity: “people who liked the video you like also like…” This method works pretty well, but it suffers from popularity bias and may lock viewers into filter bubbles.

A solution for this is to develop recommender systems that work on a different basis: helping to generate insight from text using network analysis.

The basic idea is very simple: in the social sciences there’s a well-known concept of “structural gaps”, the parts of the graph that are not connected.

This is the place where innovation occurs.

If we use this metaphor when we study text, structural gaps are the places where the new ideas are born.

If you connect two different topics that were disconnected before, you’ll generate an insight.

This can be done both within a graph that describes a certain narrative and between different graphs.

In the context of our graph, the structural gaps between the most prominent clusters indicate the parts of the discourse that are underdeveloped.

In InfraNodus you can get to those structural gaps by clicking the “Insight” pane at the top left of the graph:

Figure: We find the two topical clusters that are not so well connected and then invite the user to make a connection between them.

Usually this leads to new ideas and insight.

As you can see, it shows the parts of the graph and the statements within the discourse that are prominent enough but are not really connected.

So asking a question that links the two statements (or the two sets of keywords) together may lead to a novel idea that will still be relevant to the whole discourse.
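To make the idea concrete, here is a minimal sketch of how such a gap could be located: count the links that run between every pair of clusters and pick the pair with the fewest connections. The clusters reuse three of the topics found in the talk, while the edge list is made up for illustration. InfraNodus may well use a different measure.

```python
from itertools import combinations

def weakest_link(clusters, edges):
    """Return the pair of cluster names with the fewest edges running between them."""
    owner = {word: name for name, words in clusters.items() for word in words}
    between = {pair: 0 for pair in combinations(sorted(clusters), 2)}
    for a, b in edges:
        # Only count edges whose two ends sit in different clusters.
        if a in owner and b in owner and owner[a] != owner[b]:
            between[tuple(sorted((owner[a], owner[b])))] += 1
    return min(between, key=between.get)

clusters = {
    "education": {"education", "system", "idea"},
    "people": {"people", "talk", "human"},
    "child": {"child", "dance", "future"},
}
# Made-up co-occurrence edges between words.
edges = [
    ("education", "people"), ("system", "human"), ("idea", "talk"),
    ("education", "child"), ("people", "future"), ("talk", "dance"),
    ("system", "idea"),
]
gap = weakest_link(clusters, edges)
print(gap)
```

In this toy example the two least connected clusters would be the ones to bridge with a question, for instance by asking how the words of one relate to the words of the other.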

This article was written by Dmitry Paranyushkin from Nodus Labs.

If you’re interested in learning more about various applications of network analysis, check out my posts in Towards Data Science:

• Identifying Bias in Discourse Using Network Analysis
• How to Identify Gaps in Public Discourse Using SEO and Text Mining

You are also welcome to try the open-source text network visualization tool InfraNodus.

The online version is available on www.[…].com, or you can download it and run it on your own machine: www.[…].com/noduslabs/infranodus (if you’d like to contribute to the code, please let me know!).
