Combining Data to Create Powerful Modeling
April 25, 2012
The Netflix Tech Blog recently posted an entry that discusses the company’s recommendation algorithms and outlines the Netflix Prize, a machine learning and data mining competition to predict movie ratings. The 2009 winner of the contest improved the accuracy of Netflix’s ratings predictions by more than 10% with a new algorithm. For a company that owes 75% of its viewership to recommendations, this would seem to be a huge step forward. Netflix didn’t adopt the winning algorithm, however, and the reasoning is perfectly logical.
The winning algorithm focused mainly on predicting ratings. Ratings are an important source of data for Netflix, but new types of analyses and inputs beyond ratings have emerged, all of which can help Netflix create even better recommendations. These include context, movie title popularity, novelty, diversity and freshness.
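To make the idea concrete, here is a minimal sketch of how a predicted rating might be blended with other signals when ranking titles. It is an illustration only, not Netflix’s actual system: the `Title` fields, the weights, and the sample catalog are all hypothetical, and only two of the extra signals mentioned above (popularity and freshness) are modeled.

```python
from dataclasses import dataclass


@dataclass
class Title:
    name: str
    predicted_rating: float  # predicted stars on a 1-5 scale
    popularity: float        # assumed to be pre-scaled to 0-1
    freshness: float         # assumed to be pre-scaled to 0-1


# Hypothetical weights chosen for illustration; a real system would learn them.
WEIGHTS = {"predicted_rating": 0.6, "popularity": 0.3, "freshness": 0.1}


def blended_score(title, weights=WEIGHTS):
    """Weighted sum of the predicted rating (rescaled to 0-1) and other signals."""
    rating_norm = (title.predicted_rating - 1.0) / 4.0
    return (
        weights["predicted_rating"] * rating_norm
        + weights["popularity"] * title.popularity
        + weights["freshness"] * title.freshness
    )


def rank(titles, weights=WEIGHTS):
    """Return titles ordered from highest to lowest blended score."""
    return sorted(titles, key=lambda t: blended_score(t, weights), reverse=True)


# Made-up sample data.
catalog = [
    Title("Niche Classic", predicted_rating=4.8, popularity=0.05, freshness=0.1),
    Title("New Crowd-Pleaser", predicted_rating=4.2, popularity=0.9, freshness=0.9),
    Title("Old Blockbuster", predicted_rating=3.9, popularity=0.8, freshness=0.2),
]

for title in rank(catalog):
    print(f"{title.name}: {blended_score(title):.3f}")
```

Ranked on predicted rating alone, the niche title would come first; with the blended weights above, the popular and fresh title moves to the top. That shift is the kind of difference the extra inputs are meant to capture.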
If you’ve been reading this blog or following us on Twitter for the last two months, then you are probably familiar with maX™ (Modeled Audience eXtension), our custom modeling product that helps advertisers build a scaled audience with broad reach that is 3-5x more likely to respond to their campaigns.
We make it work by focusing on all the data: the advertiser’s first-party data as well as our eXelate premium marketplace data. And we look at as many data points as possible, including demographic, lifestyle, intent, behavior and brand affinity information, plus other proprietary data sets that many advertisers may not have access to. We’ve learned, as Netflix seems to have, that combining multiple forms of data can prove to be the most powerful way to reach an audience. If you were Netflix, what would you have done?