Do you think it’s important for product managers who work on recommendations to have math skills?
I started at Netflix as a data scientist, and worked on a wide range of projects across the still-small company.
However, after talking to the engineering and product teams working on the recommendation algorithm, I became very interested in that specific area.
Netflix was planning a streaming-based global product, and the recommendation algorithm was still relatively basic; it optimized mostly with the information that was available from the DVDs-by-mail product.
I came up with a number of ideas to learn statistical models based on streaming data to improve the recommendations, and I proposed the ideas to the product team.
They were open and interested in the models, but did not feel comfortable enough with the quantitative methods involved to decide to sponsor any of them.
Thankfully, my boss at the time, Chris Pouliot, was supportive (although skeptical) of the ideas, and gave me a few weeks to develop one of them into an offline prototype.
The recommendations that the prototype produced looked interesting enough that once we showed them to the product team they sponsored the project.
We worked with engineering and other teams to fully develop and test it, and based on the results, we deployed it.
This success led the product team to invite me to work on other recommendation-related projects.
In the course of one of them, I became convinced that it was suboptimal to have people (the product managers, in this case) who were not deeply versed in quantitative methods manage the entire innovation process.
So I passed on that criticism to the product team.
To my pleasant surprise, after thinking about it for a few days, they agreed that it would be interesting to experiment with having product managers with deep and broad quantitative experience, and they created a new position to try the idea out.
Even more surprising: they asked me to interview for the new position.
So I moved to the product innovation team.
The result worked well enough that I was then asked to build a team of product managers with a strong quantitative background to manage the innovation process for recommendations and search at Netflix.
Finding the right people for this team took time, but the effort was well worth it because they were better able to lead the cross-functional teams that built the global recommendation and search technology for Netflix streaming.
This experience is in stark contrast to what I have seen in other firms I’ve spent time in, and I remain convinced that having strongly quantitative product leaders manage the innovation process is the only way to create high-quality algorithmic products.
You were at Netflix for seven years; what changes did you see in the way machine learning was used over that time?
It became more sophisticated.
We started with a few relatively simple time series or regression models for video ranking, and over time we found more important use cases, such as constructing the recommendations page or selecting the right image to support a recommendation.
In parallel, we found that more complex models of various kinds, appropriately tuned and used, delivered a better experience.
It can be hard to entrust critical parts of the product experience to machine learning, but only deploying machine learning in marginal parts of the product guarantees limited impact and limited learning.
How do you think about making the leap of faith to trust machine learning with impactful aspects of the product?
I think you should never blindly trust a machine learning system to behave better than an alternative (such as hand-curation), until proven otherwise.
Even after getting such proof, a machine learning system needs constant monitoring to make sure the entire system or product continues to behave well — in a way that is consistent with the mission of your organization — as the world around it changes.
The point is to articulate how you expect machine learning will improve a product or experience, and then invest in the research and development to explore that hypothesis.
Only through careful testing and analysis aimed at understanding the full implications of a change should you switch to a product that is based on machine learning.
The process is long and expensive enough that it does not make sense to only try it on small parts of your product, because even in the best cases, the return to the business and to the people who use your product will be small.
Last summer, Monica Rogati wrote a great piece called “The AI Hierarchy of Needs” and argued, “You need a solid foundation for your data before being effective with AI and machine learning.”
However, sometimes cool results from machine learning can provoke the kind of improvements a company needs to extract more value from its data.
Is it productive to develop machine learning capabilities before your data engineering is perfect?
Your data engineering will never be perfect, so it is unwise to wait for perfection to develop machine learning solutions.
However, if the data is too noisy, brittle or sparse, you will waste your time building and deploying large-scale machine learning solutions.
You need to be somewhere in between.
One approach is to define a hierarchy of the data you care about, and start developing machine learning solutions once the first tier of that hierarchy is solid.
In parallel, continue to work on improving the quality and availability of other data.
Also, I think it is always productive to try to better understand your clients, your business, your product and any interactions between them.
Data modeling and analysis can be a great way to do that; it can help generate hypotheses and potentially useful input signals for machine learning technologies that can be developed later.
Much of this exploratory work can be done before the data is production ready.
Talent feels very constrained right now.
With such an intense recruiting effort from big tech companies, it’s very hard for the rest of us to hire qualified people, even straight out of school.
So how do we move forward when talent is so constrained, and what does it mean for the development of AI?
Universities are responding by starting to train more and more people in the combination of skills like statistics and computation that are necessary for this kind of work, so supply is increasing.
But demand is also growing as more organizations realize they can derive value from these skills.
It is unclear to me how the next decade will play out in terms of this.
However, I also think companies that want folks with an AI background should be very clear about the mission of their organization, and the role AI can play within that mission.
Surely, within the pool of well-trained people in AI, you should be able to find those who are not just interested in it, but rather passionate about the specific intersection of it with the mission of your organization.
Illustration by Alfonso de Anda
There are several ways data can influence a product, such as A/B testing and reinforcement learning to optimize or personalize.
But even before that, Netflix has been able to use data to suggest new features to test, or even new content strategies.
Can you talk about how digital companies can use data to suggest new features, new products, or even new content strategies?
This goes back to an earlier question: using data analysis and modeling to better understand your business, your clients, your products and their interactions.
For example, you can find groups of people who tend to behave similarly when interacting with your product.
Using the sizes of such audiences and analyzing the kinds of content each of these audiences engages with may lead you to change the content mix to better match the audiences with the offering.
However, analyzing past data should mostly be used to generate hypotheses.
Many will turn out to be false, such as simple correlations that play out differently when implemented.
So testing, whether fully controlled (as in an A/B test) or not (changing something and analyzing the before and after), is always essential.
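As a minimal sketch of the fully controlled case, assuming the outcome is a simple yes/no event per person (a click, a play), one common way to analyze an A/B test is a two-proportion z-test. The function name `two_proportion_z_test` and the counts in the usage note are hypothetical and chosen only for illustration; the interview does not say which analysis any company actually uses.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and treatment (B).

    conv_a, conv_b: number of people who converted in each group.
    n_a, n_b: number of people exposed in each group.
    Returns (z, two_sided_p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For instance, `two_proportion_z_test(100, 1000, 150, 1000)` compares a hypothetical 10% rate in control against 15% in treatment. A small p-value suggests the difference is unlikely to be noise, but it says nothing about whether a slower metric such as retention moves in the same direction.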
You wrote a paper this summer about an online approach for matrix factorization models, which Netflix became famous for as a way to do recommendations.
But you’ve also written about multi-armed bandits.
When is a supervised learning approach like matrix factorization the way to go for recommendations, and when should a company be using bandits?
Factorization models have proven to work very well for personalization because they efficiently share information across individuals and items to learn one small (e.g., 100) set of numbers to describe the behavior of each person and each item.
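To make "a small set of numbers per person and per item" concrete, here is a minimal sketch of a factorization model trained by plain stochastic gradient descent on squared error. It assumes a toy set of explicit ratings; the names `factorize`, `predict`, and `TOY_RATINGS`, and all hyperparameters, are hypothetical. It is not the online method from the paper mentioned in the question, nor a description of any production system.

```python
import random

# Hypothetical toy data: (person, item, rating). Persons 0-1 like items 0-1,
# persons 2-3 like items 2-3. Person 0 has not rated items 1 and 3.
TOY_RATINGS = [
    (0, 0, 5), (0, 2, 1),
    (1, 0, 5), (1, 1, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 0, 1), (3, 1, 1), (3, 2, 5), (3, 3, 5),
]

def predict(users, items, u, i):
    """Predicted rating: dot product of the person and item vectors."""
    return sum(a * b for a, b in zip(users[u], items[i]))

def factorize(ratings, n_users, n_items, k=2, epochs=2000,
              lr=0.02, reg=0.02, seed=0):
    """Learn k numbers per person and k numbers per item."""
    rng = random.Random(seed)
    users = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    items = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(users, items, u, i)
            for f in range(k):
                pu, qi = users[u][f], items[i][f]
                # Gradient step on squared error with L2 regularization.
                users[u][f] += lr * (err * qi - reg * pu)
                items[i][f] += lr * (err * pu - reg * qi)
    return users, items
```

After `users, items = factorize(TOY_RATINGS, n_users=4, n_items=4)`, calling `predict(users, items, 0, 1)` scores an item that person 0 never rated. The score borrows from what other people's ratings say about item 1, which is the information sharing described above.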
On the other hand, multi-armed bandits are essentially a regression model that predicts an outcome based on a weighted sum of input signals.
This is problematic for personalization, where by definition, we expect that the interaction of an individual and an item is important.
But quantifying such an interaction to turn it into a small enough number of input signals for a regression model can be complex.
Multi-armed bandits can be very successful in unpersonalized settings because they use so-called explore-exploit strategies to avoid missing out on good decisions (such as which articles to recommend) just because you do not have enough data for them.
The point of the first article you mention here is that one can also implement such explore-exploit strategies with factorization models, if relevant for a particular problem.
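The explore-exploit idea can be sketched with epsilon-greedy, one of the simplest such strategies, used here only for brevity. This is a minimal illustration in an unpersonalized setting, assuming each arm is an article with a fixed click rate; the class `EpsilonGreedy`, the function `simulate`, and the click rates in the usage note are hypothetical, and the sketch does not use input signals or a regression model the way the bandits described above do.

```python
import random

class EpsilonGreedy:
    """Unpersonalized bandit: one running mean reward per arm."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.rng = random.Random(seed)

    def select(self):
        # Explore: with probability epsilon, try a random arm so that arms
        # with little data still get a chance.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Exploit: otherwise pick the arm with the best observed mean.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental update of the running mean for this arm.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def simulate(click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Run the bandit against simulated clicks; returns the trained bandit."""
    bandit = EpsilonGreedy(len(click_rates), epsilon, seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if env.random() < click_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

Running `simulate([0.02, 0.05, 0.30])` and inspecting `bandit.counts` shows how traffic is split across the three hypothetical articles. The exploration step is what keeps a good article from being ignored just because it had no data early on.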
Since Netflix is a subscription service, it seems like the most important thing to optimize is retention, which only gives you a signal every few months.
How do you prioritize fast proxy KPIs like clickthrough against more important but slow KPIs like retention? How do you balance quick optimization via reinforcement learning vs. the slow response of A/B testing?
I left Netflix almost two years ago, so I don’t know how they make algorithm decisions today.
But in the internet tech industry in general, you find a very quick proxy (plays, reads, clicks, etc.), then you carefully test a model trained on such short-term data to determine if the longer-term metric actually improves.
At Netflix, this always meant waiting three to six months before we would be confident about having found a better version of the product.
Most companies I know aren’t as patient and I think that’s a mistake.
In internet tech, I have not yet seen any successful application of reinforcement learning to directly optimize the long-term metric one may care about, but I think this direction is interesting and has not been explored enough.
Overall, what lessons did you learn about how a company should (or should not) integrate machine learning into their product development?
Very carefully and thoughtfully. In particular, make sure that the resulting product is of high quality (not favoring the worst quality of content or interaction one can imagine, regardless of your KPI).
Make sure that the product ends up being at least consistent with, and ideally supportive of, the company’s mission.
And finally, have an opinion about the product’s societal consequences.