Finding Success on Twitch

Natasha Borders
Jun 20

Photo by Caspar Camille Rubin on Unsplash

Introduction

It seems only a few weeks ago that I started my journey at Metis, the immersive and challenging data science program in San Francisco, and here we are, twelve weeks later, working on our final project.
In this blog post, we’re going to walk through how we built a streamer recommender for Twitch and the various tools we used to make the resulting app, available now on Heroku.
The code for this project can be found here.
It all started with an idea.
After two years of working with my immersive events studio, Hanging Lantern, I became intimately familiar with creative content creators by joining their ranks.
The constant drive to deliver a delightful experience to one’s viewers or attendees is a tough but thrilling challenge, so I explored the various aspects of it during my time at Metis by studying the process of Kickstarter campaigns and the public’s opinion of Game of Thrones.
When the time came to select the final project, I was mulling over making a recommender system for the content creators on the popular streaming website, Twitch, as an expression of both my love of video games and my passion for helping out content creators.
It looked like no such recommender was available at the time, even though there were multiple articles out there giving advice on what to stream and several websites dedicated to Twitch running stats.
It seemed like a perfect and overly ambitious final project, and I was not the only one in our cohort passionate about video games and Twitch.
Together with Jeremy Chow and Randy Macaraeg, we began the first three-person final project effort ever seen at Metis as we set out to produce a fully functional Twitch Recommender for Streamers.
How does Twitch work?

Ninja’s channel page on Twitch

For those unfamiliar with it, Twitch is a streaming video website where content creators attract wide audiences of viewers and subscribers by streaming themselves while they play popular video games or other entertaining content.
In 2018 alone, over 1 million years of content was consumed on Twitch, with over 4 million unique monthly streamers providing it.
As you can imagine, with so many choices of what to watch, the streamers are competing with each other for viewers and dedicated subscribers.
Choosing a game (or category) to stream on Twitch is one of the hardest choices to make and the one that will impact your success more than anything else.
(by Mark Longhurst of The Emergence, in “How to Choose What Game to Stream”)

This is precisely the problem our recommender aims to tackle, and such a problem is not unique to Twitch. Many companies connect providers of content, goods, or services with consumers of these products, so the diversity and quality of what’s available for consumers has a direct influence on the welfare of the company.
Specifically for Twitch, our goal is to help streamers pick the best game for GROWTH, to attract viewers and subscribers, grow their channel and share the content they enjoy.
Let’s take a look at two hypothetical Twitch streamers:

Photo by KAL VISUALS on Unsplash

Melany is a brand new streamer who knows what she likes to play but is not sure if her favorite genres and games will bring her viewers and subscribers.
If we know that she likes Role Playing and Adventure games, and her favorite games are The Witcher 3: Wild Hunt and Dragon Age: Origins, we can show her how those genres and games are currently performing on Twitch so she can decide if she wants to pursue them.
We can also display the current streaming trends to help her understand the market, identify gaps in supply which she can fill, and make sure she can focus on the types of games she is already familiar with.
She needs to know:
Should she stream her favorites?
Which content will result in more views?

Photo: Unsplash

Alex, on the other hand, is an existing streamer.
She already has viewers for her Fortnite and Madden NFL games and would like to explore growth options that will bring her existing fans along and grow her viewer base even further, exploring her favorite genres of Shooters and Sports games.
She needs to know:
What is the next big game her existing viewers will love?
What other games should she try to keep her momentum going?

Let’s Build a Data Pipeline

With those two streamer cases in mind, let’s walk through our data collection process.
Having only three weeks to execute the complex vision we had in our minds, we plunged into data collection.
After studying several Twitch analytics websites, I sent a Twitter message to the owner of SullyGnome, who graciously provided us with 40 days’ worth of aggregate Twitch data, which gave us viewership numbers and channel streaming information.

To extract the constant streaming data, we built a pipeline that queried the Twitch API, funneled the data through our Amazon EC2 instance into the PostgreSQL database we created using the Amazon Relational Database Service, and saved it there for our usage.
We then sourced this data for our data analysis and development and used Heroku to launch our app.
The architecture of this pipeline ensured that we all had access to the same stream of information at all times. The Python script collected Twitch data every hour, depositing the results into the three PostgreSQL tables we had set up to receive the information.
Our PostgreSQL Tables Schema

stream_data
    stream_id      text
    user_id        text
    user_name      text
    game_id        text
    stream_type    text
    title          text
    viewer_count   int
    started_at     timestamp
    language       text
    time_logged    timestamp

game_information
    game_id        text
    game_name      text
    pic_url

game_genres
    game_name      text
    game_genres    text

Once the tables were completed, we queried the data directly and began our exploratory data analysis.
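As a rough illustration of the hourly logging step, here is a minimal sketch that flattens API stream records into stream_data rows and stores them. It is not the project's actual script: sqlite3 stands in for PostgreSQL so the example runs anywhere, the column types are simplified, the helper names (`create_stream_table`, `to_row`, `log_snapshot`) are hypothetical, and the sample payload is made up, with keys assumed to resemble what the Twitch streams endpoint returns.

```python
import sqlite3
from datetime import datetime, timezone

# Column names follow the stream_data listing; types are simplified for sqlite3.
STREAM_COLUMNS = (
    "stream_id", "user_id", "user_name", "game_id", "stream_type",
    "title", "viewer_count", "started_at", "language", "time_logged",
)


def create_stream_table(conn):
    """Create the stream_data table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS stream_data (
            stream_id TEXT, user_id TEXT, user_name TEXT, game_id TEXT,
            stream_type TEXT, title TEXT, viewer_count INTEGER,
            started_at TEXT, language TEXT, time_logged TEXT
        )
        """
    )


def to_row(stream, logged_at):
    """Flatten one stream record (a dict) into a tuple in STREAM_COLUMNS order."""
    return (
        stream["id"], stream["user_id"], stream["user_name"], stream["game_id"],
        stream["type"], stream["title"], int(stream["viewer_count"]),
        stream["started_at"], stream["language"], logged_at.isoformat(),
    )


def log_snapshot(conn, streams, logged_at):
    """Insert one snapshot of live streams and return the number of rows written."""
    rows = [to_row(s, logged_at) for s in streams]
    placeholders = ", ".join("?" for _ in STREAM_COLUMNS)
    conn.executemany(
        f"INSERT INTO stream_data ({', '.join(STREAM_COLUMNS)}) VALUES ({placeholders})",
        rows,
    )
    conn.commit()
    return len(rows)


# Hypothetical payload; a real run would fetch this from the API once per hour.
sample_streams = [
    {"id": "s1", "user_id": "u1", "user_name": "streamer_one", "game_id": "g1",
     "type": "live", "title": "First stream", "viewer_count": 1200,
     "started_at": "2019-06-01T10:00:00Z", "language": "en"},
    {"id": "s2", "user_id": "u2", "user_name": "streamer_two", "game_id": "g2",
     "type": "live", "title": "Second stream", "viewer_count": 35,
     "started_at": "2019-06-01T11:30:00Z", "language": "en"},
]

conn = sqlite3.connect(":memory:")
create_stream_table(conn)
inserted = log_snapshot(
    conn, sample_streams, datetime(2019, 6, 1, 12, 0, tzinfo=timezone.utc)
)
```

Stamping every row with the same time_logged value makes each hourly run a self-contained snapshot, which keeps later questions such as "how many channels were live at noon" down to a single filter.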
The distillation of this analysis is available on the first tab of the recommender app and in our Tableau dashboard.
For example, we examined which genres of content were on the rise by looking at the changing ratio of viewers to channels, and finding those genres that had room for streamer growth and entry.
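The viewers-to-channels comparison can be sketched in a few lines. This is only an illustration under stated assumptions: the genre totals below are invented, the period labels and the function names (`viewers_per_channel`, `rank_by_ratio_change`) are hypothetical, and ranking by the simple difference between two periods is one possible reading of "changing ratio", not necessarily the one used in the dashboard.

```python
from collections import defaultdict


def viewers_per_channel(snapshots):
    """Map genre -> {period: viewers / channels} from (period, genre, viewers, channels) rows."""
    ratios = defaultdict(dict)
    for period, genre, viewers, channels in snapshots:
        if channels > 0:  # skip periods with no live channels
            ratios[genre][period] = viewers / channels
    return ratios


def rank_by_ratio_change(snapshots, start, end):
    """Rank genres by how much their viewers-per-channel ratio rose between two periods."""
    ratios = viewers_per_channel(snapshots)
    changes = {
        genre: by_period[end] - by_period[start]
        for genre, by_period in ratios.items()
        if start in by_period and end in by_period
    }
    return sorted(changes.items(), key=lambda item: item[1], reverse=True)


# Invented totals: (period, genre, total viewers, live channels)
snapshots = [
    ("week1", "RPG", 12000, 400),
    ("week2", "RPG", 18000, 450),
    ("week1", "Shooter", 90000, 1500),
    ("week2", "Shooter", 88000, 1760),
]

ranking = rank_by_ratio_change(snapshots, "week1", "week2")
print(ranking)  # [('RPG', 10.0), ('Shooter', -10.0)]
```

In this toy data the RPG genre gains viewers faster than it gains channels, so each channel's share of the audience grows; the Shooter genre shows the opposite pattern.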
A taste of our Tableau graphs; the entire selection is linked below.

Trends in Games, Genres, Channels, and Viewership for Twitch over time.

Surprise! Our Recommender Creation Process

We used the Surprise Python library, which is a great tool for developing a recommender system.
All we need for this framework is the information about our streamer, the genres and games they like or already stream, and a custom game success metric we developed which ranges from 1–5 and acts as a rating for each streamer-game relationship.
Our algorithm selection focused on both accuracy and speed. A lower test Root Mean Squared Error (RMSE) means that an algorithm did a better job with unseen data, and a fast algorithm is easy to retrain and deploy live while the amount of available streaming data is constantly increasing.
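For readers unfamiliar with the metric, RMSE is the square root of the average squared difference between predicted and actual ratings. The sketch below computes it by hand on made-up held-out ratings for two hypothetical models; it illustrates the comparison only and is not the evaluation code used in the project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long rating lists."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up held-out ratings on the 1-5 scale and two hypothetical models' predictions.
held_out = [5, 3, 4, 1]
model_a = [4, 3, 5, 1]
model_b = [3, 3, 2, 1]

print(round(rmse(held_out, model_a), 3))  # 0.707
print(round(rmse(held_out, model_b), 3))  # 1.414
```

Here model_a would be preferred, since its predictions sit closer to the held-out ratings on average.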
The three algorithms we selected work in tandem to identify the most suitable games and genres based on the streamer’s preferences, their existing experience (if any), and the relative success of other streamers with similar content.
We are blending them together to produce a single list of recommendations for each streamer, offering a variety of genres and games to stream, all of which have the potential to improve their viewership numbers and retain subscribers.
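One simple way to blend several models' outputs into a single list is to average each game's predicted rating and sort by the result. The sketch below does exactly that; the plain mean, the tie-break by name, the function name `blend_recommendations`, and the placeholder game names and scores are all assumptions made for illustration, since the post does not spell out the blending rule.

```python
def blend_recommendations(model_scores, top_n=3):
    """Average each game's predicted rating across models and return the top_n games."""
    totals, counts = {}, {}
    for scores in model_scores:
        for game, rating in scores.items():
            totals[game] = totals.get(game, 0.0) + rating
            counts[game] = counts.get(game, 0) + 1
    # A game is averaged only over the models that actually scored it.
    blended = {game: totals[game] / counts[game] for game in totals}
    # Highest blended rating first; ties are broken alphabetically for stable output.
    ranked = sorted(blended.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Placeholder predictions (1-5 scale) from three hypothetical models for one streamer.
scores_a = {"Game A": 4.0, "Game B": 3.0, "Game C": 2.0}
scores_b = {"Game A": 5.0, "Game B": 3.0, "Game D": 4.0}
scores_c = {"Game A": 3.0, "Game B": 4.5, "Game D": 3.0}

top_games = blend_recommendations([scores_a, scores_b, scores_c], top_n=2)
print(top_games)  # [('Game A', 4.0), ('Game B', 3.5)]
```

A weighted mean, with weights tied to each model's test RMSE, would be a natural variation on the same idea.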
You can try out the recommender for yourself on Heroku.
At the moment the models used in the recommender app are the simpler, faster ones we have tested, but given a faster server able to handle large amounts of streaming data, we can utilize the more robust algorithms we developed.
Timing is Everything

Apart from recommending what to stream, we also wanted to be able to tell streamers when would be a good time to stream.
Because Twitch viewership is cyclical in nature and typically peaks in the evening, identifying gaps in supply over the course of a day or a week becomes really important.
We used a Long Short-Term Memory (LSTM) network model to examine the fluctuations of viewers and channels over time, aiming to predict the anticipated changes.
An LSTM uses its long-term memory to quantify growth over time and examine seasonal trends and its short-term memory to take into account recent, daily trends, while its ability to forget allows it to look back further without the speed penalty of standard recurrent neural networks (RNNs).
We looked at the entire last week of data (168 hours), checked the last 24 model states, and predicted the following week (168 hours).
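Before a sequence model can be trained on this kind of task, the hourly series has to be cut into (input week, target week) pairs. The sketch below shows only that data preparation step, not the LSTM itself. The 168-hour input and 168-hour horizon come from the text; the 24-hour stride between windows, the function name `make_windows`, and the synthetic channel counts are assumptions for illustration.

```python
def make_windows(series, input_len=168, horizon=168, step=24):
    """Slice an hourly series into (inputs, targets) pairs for week-ahead forecasting."""
    windows = []
    last_start = len(series) - input_len - horizon
    for start in range(0, last_start + 1, step):
        inputs = series[start:start + input_len]
        targets = series[start + input_len:start + input_len + horizon]
        windows.append((inputs, targets))
    return windows


# Synthetic hourly channel counts for three weeks, with a bump every evening (18:00-23:59).
hourly_channels = [1000 + (200 if hour % 24 >= 18 else 0) for hour in range(24 * 21)]

pairs = make_windows(hourly_channels)
print(len(pairs))                           # 8
print(len(pairs[0][0]), len(pairs[0][1]))   # 168 168
```

Each pair can then be reshaped into whatever tensor layout the chosen deep learning library expects for its recurrent layers.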
With the in-sample and test RMSE at 10%, we are able to predict fluctuations in streaming channels over time quite well.
Growth Score

In addition to making recommendations and specifying when to stream, we wanted to be able to rank the games we recommend based on a common metric that includes not only the recommender rating we’ve discussed before but also the growth potential as a function of the market share of the top 8 streamers.
This approach would prevent a streamer from trying to break into a game already monopolized by a few people with no room for expansion.
[formula image]

Doesn’t this look clear and uncomplicated? What it really translates to is this:

[formula image]

Ah, much better!

The first term, streamer affinity, comes from the recommender itself and is personal for each streamer-game success potential. Game growth and game popularity change with time and can be inferred from the incoming stream of data. The market penetrability stems from how monopolized each game is by its top streamers.
Taken as a whole, a growth score reflects each game’s full potential for each streamer.
It is currently in development and we’re hoping to implement it in our modeling in the future.
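The market share of a game's top 8 streamers is straightforward to compute from per-streamer viewer counts. The sketch below covers that one ingredient only and does not attempt to reproduce the full growth score. Treating penetrability as one minus the top-8 share is an assumption made here for illustration, as are the function names and the invented viewer counts.

```python
def top_share(viewers_by_streamer, top_k=8):
    """Fraction of a game's total viewers held by its top_k streamers."""
    total = sum(viewers_by_streamer.values())
    if total <= 0:
        return 0.0
    top_viewers = sorted(viewers_by_streamer.values(), reverse=True)[:top_k]
    return sum(top_viewers) / total


def penetrability(viewers_by_streamer, top_k=8):
    """Assumed definition: the share of viewers NOT held by the top_k streamers."""
    return 1.0 - top_share(viewers_by_streamer, top_k)


# Invented data: one game dominated by a single streamer, one spread evenly.
monopolized_counts = [9000, 500, 100, 100, 100, 50, 50, 50, 25, 25]
monopolized = {f"streamer_{i}": v for i, v in enumerate(monopolized_counts)}
open_game = {f"streamer_{i}": 100 for i in range(20)}

print(round(penetrability(monopolized), 3))  # 0.005
print(round(penetrability(open_game), 3))    # 0.6
```

Under this assumed definition, a newcomer would find far more room in the evenly spread game than in the one where eight channels already hold almost the whole audience.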
Overall, our recommender system accomplishes several important things: it improves streamer experience and retention, it encourages channel growth and diversity of content, and it increases revenue for Twitch.
This kind of content creator recommender system can be adapted to other companies and types of content creation.
This project has been an amazing challenge to tackle. Going forward, we would like to scale our model to work with Twitch streaming data in real time, keep collecting that data to retrain our recommender model and improve its accuracy, and fully integrate it into the streamer experience on Twitch.
You can find our app on Heroku.
The code for this recommender can be found on GitHub.
Thank you for reading.
If you’d like to get in touch, I can be found on GitHub, LinkedIn, and Twitter.
Natasha Borders
Data scientist with a background in marketing research and brand management, CPG, pet and immersive event industries…