In the old days, a movie genre was a simple, communal category: Action/Adventure, Comedy, Drama. One had to locate oneself in the Drama aisle at the video store and then look for just the right thing: A dark road trip movie with a strong female lead? Aha, Thelma & Louise.
But Netflix, the movie streaming and DVD rental service, doesn't work like that. It recommends genres that are intensely, almost bizarrely personalized. Netflix might tell you not just that you like road trip movies, but that you like Understated Romantic Road Trip Movies, Dark Road Trip Thrillers, Road Trip Art House Movies, Road Trip Musicals or, of course, Canadian Independent Road Trip Movies.
That's because seven years ago, Todd Yellin, a film-obsessed executive at Netflix, set out to break down every movie into data. He hired aspiring screenwriters and paid them to watch movies and rate their levels of romance, gore, quirkiness and even plot resolution. In a sense, Yellin wanted to reverse-engineer all the Hollywood formulas so that Netflix could mathematically show you the movies it knew you would like.
Now it's become one of the company's big selling points. Netflix doesn't just provide streaming movies and TV shows; it knows you.
Thinking about how specific Netflix could get, I started to wonder, "Just how many microgenres does Netflix really have?"
A friend pointed out that the Web addresses for the categories in the Netflix database were sequentially numbered, and that I could type through each URL, one by one, and figure out all the microgenres.
The first brought up African-American Crime Documentaries. The second pulled up Scary Cult Movies From The 1980s. The next was Tearjerkers From The 1970s. After a couple more minutes, I tried entry 10,000, just to see if the database was really that big. Japanese Horror Movies From The 1960s was in that slot.
There was no way I could copy and paste tens of thousands of genre titles by hand, so I wrote a simple script, a little piece of code, that would copy the names to a list. I set it up to run and then I waited, as the script kept copying and pasting for more than 20 hours.
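A script of this kind can be sketched in a few lines of Python. This is not the script described above, only a minimal illustration of the idea: walk the sequential numbers, read the title at each address, and save it to a list. The URL template, the function names and the assumption that the first page is number 1 are all hypothetical, and a small lookup table stands in for the real web requests.

```python
from typing import Callable, Dict, Optional

# Hypothetical address format; the real one is not given in the text.
URL_TEMPLATE = "https://example.com/genre?id={genre_id}"


def collect_genres(
    fetch_title: Callable[[str], Optional[str]],
    first_id: int,
    last_id: int,
) -> Dict[int, str]:
    """Visit each numbered address in turn and record the title found there."""
    found: Dict[int, str] = {}
    for genre_id in range(first_id, last_id + 1):
        title = fetch_title(URL_TEMPLATE.format(genre_id=genre_id))
        if title:  # empty slots return nothing and are skipped
            found[genre_id] = title
    return found


# Stand-in for real HTTP requests, using the first three titles quoted above.
# Numbering them 1, 2 and 3 is an assumption made for this sketch.
SAMPLE_PAGES = {
    URL_TEMPLATE.format(genre_id=1): "African-American Crime Documentaries",
    URL_TEMPLATE.format(genre_id=2): "Scary Cult Movies From The 1980s",
    URL_TEMPLATE.format(genre_id=3): "Tearjerkers From The 1970s",
}

genres = collect_genres(SAMPLE_PAGES.get, 1, 5)
for genre_id, title in genres.items():
    print(genre_id, title)
```

Passing the page-fetching step in as a function keeps the loop itself simple; swapping the lookup table for a real request would not change the rest of the sketch.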
I found that Netflix has 76,897 separate categories. To my knowledge, no one outside Netflix has ever compiled this mass of data before. And now we can really understand how the system works.
The microgenres are formed from Netflix's version of Mad Libs, an algorithm that takes all the tags in Netflix's system and combines them based on specific criteria, especially the number of movies fitting the category.
Traditional genres, like Drama, form the center of each microgenre, but Netflix can toss in actors and directors and a bunch of descriptors, including time period, location, age level and the squishier human words, the adjectives. These are really what make Netflix's movie genres seem uncannily precise.
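The Mad Libs idea can be illustrated with a toy example. The Python sketch below is not Netflix's algorithm, whose details are not public; it simply pairs every adjective with every base genre and keeps a combination only if enough movies fit it. The catalog, the movie names and the threshold of two movies are invented for illustration.

```python
from itertools import product
from typing import Dict, List

# Toy catalog: titles and tags are invented for this sketch.
CATALOG = {
    "Movie A": {"adjective": "Dark", "genre": "Road Trip Movies"},
    "Movie B": {"adjective": "Dark", "genre": "Road Trip Movies"},
    "Movie C": {"adjective": "Romantic", "genre": "Road Trip Movies"},
    "Movie D": {"adjective": "Dark", "genre": "Thrillers"},
}

MIN_MOVIES = 2  # assumed cutoff; the real criteria are not described


def build_microgenres(
    catalog: Dict[str, Dict[str, str]], min_movies: int
) -> Dict[str, List[str]]:
    """Combine each adjective with each genre and keep well-populated pairs."""
    adjectives = sorted({tags["adjective"] for tags in catalog.values()})
    genres = sorted({tags["genre"] for tags in catalog.values()})
    kept: Dict[str, List[str]] = {}
    for adjective, genre in product(adjectives, genres):
        matches = [
            title
            for title, tags in catalog.items()
            if tags["adjective"] == adjective and tags["genre"] == genre
        ]
        if len(matches) >= min_movies:
            kept[f"{adjective} {genre}"] = matches
    return kept


microgenres = build_microgenres(CATALOG, MIN_MOVIES)
print(microgenres)
```

Here only Dark Road Trip Movies survives, because it is the only combination with at least two matching titles. A fuller version would add slots for actors, directors, time period and location in the same way.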
Netflix's favorite adjective is "romantic," which appears in 5,272 categories. Following it are foreign, classic, dark, British, critically acclaimed, suspenseful, gritty, independent, visually striking, family, violent and feel-good.
But not all the adjectives are used thousands of times. Some of the least-used adjectives are telling, too: experimental, screwball, satanic, stoner, visionary and Depression-era.
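Tallies like these come from counting how many category names contain each adjective. The Python sketch below shows one way to do that, assuming the names are already in a list; it is an illustration, not the method used for the figures above. The sample list holds only category names quoted in this story, and the function name is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Category names quoted in this story; the full list has 76,897 entries.
CATEGORIES = [
    "Understated Romantic Road Trip Movies",
    "Dark Road Trip Thrillers",
    "Road Trip Art House Movies",
    "Road Trip Musicals",
    "Canadian Independent Road Trip Movies",
    "Violent Action Thrillers Starring Bruce Willis",
]

ADJECTIVES = ["romantic", "dark", "independent", "violent", "critically acclaimed"]


def count_adjectives(categories: Iterable[str], adjectives: Iterable[str]) -> Counter:
    """Count the categories whose name contains each adjective as a whole word."""
    patterns = {
        adjective: re.compile(rf"\b{re.escape(adjective)}\b", re.IGNORECASE)
        for adjective in adjectives
    }
    counts: Counter = Counter()
    for name in categories:
        for adjective, pattern in patterns.items():
            if pattern.search(name):
                counts[adjective] += 1
    return counts


counts = count_adjectives(CATEGORIES, ADJECTIVES)
for adjective, total in counts.most_common():
    print(adjective, total)
```

Matching on whole words keeps an adjective such as "dark" from being counted inside a longer word, and multiword adjectives such as "critically acclaimed" are handled the same way.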
Hollywood is a popularity contest, though, so we have to ask: Which actor is the most Netflix famous? That is to say, which actor appears in the most Netflix microgenres?
The No. 2 answer is precisely whom you might expect: Bruce Willis, who has 17 dedicated categories, including Violent Action Thrillers Starring Bruce Willis. But the actor with the most categories dedicated to himself is not Tom Cruise or Angelina Jolie or Jackie Chan or Meryl Streep or Clint Eastwood or Doris Day. It's Raymond Burr, star of the TV show Perry Mason.
Why? I have no idea. Even Yellin, who created the Netflix system, was baffled by the number of Burr categories.
And that's the interesting thing about wandering through Netflix's big data. Only some of the logic that drives these categories feels human. But perhaps that's exactly what we like about Netflix's recommendations. They take our taste, break it down into its constituent parts, and spit it back to us in new and revealing ways. Netflix's strange machine wants to make us happy. And to do so, it must know us and our culture in ways that are not always obvious to humans.
How else do we explain the Raymond Burr phenomenon? That's the eye of software staring into the American soul.
This story was adapted from Alexis Madrigal's Atlantic story "How Netflix Reverse Engineered Hollywood," which includes a hand-built Netflix Genre Generator.
Madrigal is a senior editor at The Atlantic and a visiting scholar at Berkeley's Center for Science, Technology, Medicine and Society.
Transcript
TERRY GROSS, HOST:
This is FRESH AIR. The rise of on-demand video content delivered over the Internet has made it possible to watch many movies and TV shows any time, anywhere. But with so many choices available at our fingertips, deciding what to watch can be a bit daunting. In an attempt to help viewers find something that appeals to them, Netflix presents its subscribers with personalized viewing recommendations. Our tech contributor Alexis Madrigal explains how and why they do it.
ALEXIS MADRIGAL, BYLINE: In the old days, a movie genre was a simple communal category - Action/Adventure, Comedy, Drama. One had to locate oneself in the drama aisle in the video store and then look for just the right thing - a dark road trip movie with a strong female lead? A-ha: "Thelma and Louise." But Netflix, the movie streaming and DVD rental service, doesn't work like that. It recommends genres that are intensely, almost bizarrely personalized.
Netflix might tell you not just that you like road trip movies but that you like understated, romantic road trip movies, dark road trip thrillers, road trip art house movies, road trip musicals, or of course, Canadian independent road trip movies.
That's because seven years ago, Todd Yellin, a film-obsessed executive at Netflix, set out to break down every movie into data. He hired aspiring screenwriters and paid them to watch movies and rate their level of romance, gore, quirkiness, and even plot resolution. In a sense, Yellin wanted to reverse-engineer all the Hollywood formulas so that Netflix could mathematically show you the movies they knew you would like.
Now it's become one of the company's big selling points. Netflix doesn't just provide streaming movies and TV shows - it knows you. Thinking about how specific Netflix could get, I started to wonder: just how many micro-genres does Netflix really have? A friend pointed out that the web addresses for the categories in the Netflix database were sequentially numbered and that I could type through each URL one by one and figure out all the micro-genres.
The first brought up African-American Crime Documentaries. The second pulled up Scary Cult Movies from the 1980s. The next was Tear-Jerkers from the 1970s. After a couple more minutes, I tried entering 10,000 just to see if the database was really that big. Japanese Horror Movies from the 1960s was in that slot. There was no way I could copy and paste tens of thousands of genre titles by hand.
So I wrote a simple script, a little piece of code, that would copy the names to a list. I set it up to run and then I waited as the script kept copying and pasting for more than 20 hours. I found that Netflix has 76,897 separate categories. To my knowledge, no one outside Netflix has ever compiled this mass of data before, and now we can really understand how the system works.
The micro-genres are formed from Netflix's version of Mad Libs, an algorithm that takes all the tags in Netflix's system and combines them based on specific criteria, especially the number of movies fitting the category. Traditional genres like Drama form the center of each micro-genre but Netflix can toss in actors and directors and a bunch of descriptors, including time period, location, age level, and the squishier human words - the adjectives.
These are really what make Netflix's movie genres seem uncannily precise. Netflix's favorite adjective is romantic, which appears in 5,272 categories. Following it are foreign, classic, dark, British, critically acclaimed, suspenseful, gritty, independent, visually striking, family, violent, and feel-good. But not all the adjectives are used thousands of times.
Some of the least used adjectives are telling, too - experimental, screwball, Satanic, stoner, visionary, and Depression Era. Hollywood's a popularity contest, though, so we have to ask: Which actor is the most Netflix famous? That is to say, which actor appears in the most Netflix micro-genres? The number two answer is precisely who you might expect - Bruce Willis - who has 17 dedicated categories including violent action thrillers starring Bruce Willis.
But the actor with the most categories dedicated to himself is not Tom Cruise or Angelina Jolie or Jackie Chan or Meryl Streep or Clint Eastwood or Doris Day, but Raymond Burr, star of "Perry Mason." Why? I have no idea. Even Todd Yellin, who created the Netflix system, was baffled by the number of Burr categories, and that's the interesting thing about wandering through Netflix's big data - only some of the logic that drives these categories feels human.
But perhaps that's exactly what we like about Netflix's recommendations. They take our taste, break it down into its constituent parts, and spit it back to us in new and revealing ways. Netflix's strange machine wants to make us happy and to do so it must know us and our culture in ways that are not always obvious to humans. How else do we explain the Raymond Burr phenomenon? That's the eye of software staring into the American soul.
(SOUNDBITE OF MUSIC, "PERRY MASON THEME")
GROSS: Alexis Madrigal is a senior editor at The Atlantic and a visiting scholar at Berkeley's Center for Science, Technology, Medicine, and Society. He created an interactive webpage where you can explore the 76,897 genres in Netflix's database and have fun creating new ones. The webpage includes a Netflix genre generator. You'll find a link to that page on our website freshair.npr.org. Transcript provided by NPR, Copyright NPR.