SINGAPORE — We watch TV on Netflix so much nowadays, but do you wonder sometimes how Netflix’s interface decides what content to recommend you from among the thousands of films or shows in its inventory? Well, the tech and entertainment company uses sophisticated algorithms to figure out your viewing preferences, and then personalises the content that is recommended to you so as to increase the likelihood of you watching a particular show.
The people shaping the algorithms are a team of thousands comprising coders, engineers, and data scientists. And one of these data scientists is 33-year-old Singaporean Grace Tang.
To find out more about how Netflix delivers its content, we caught up over the telephone with Grace, who is currently based in Netflix’s Los Gatos headquarters in Silicon Valley.
What is your role at Netflix?
I’m a senior data scientist. I do write code, but I don't personally write the algorithms; that's the job of the algorithm engineers. What I do is analyse the data those algorithms produce, and then assess the effectiveness of each version of the algorithm.
How did you come to work for Netflix as a data scientist?
I majored in neuroscience at the University of Wisconsin-Madison, then I did grad school in neuroscience at Stanford. After grad school, I took my first data science job at a Singaporean startup, a real estate website called 99.co. At 99.co, I used well-known recommendation algorithms developed by Netflix and adapted them for 99.co’s purposes. After that, I joined a few other companies, also working in data science, before the opportunity to work at Netflix arose. Even though I'm trained in neuroscience, and not specifically in data science, I've always been very interested in data science and took some classes on it during grad school. I'm also very interested in creative pursuits like illustration, fiction writing and screenwriting. So taking the opportunity to work with both the data science and creative departments at Netflix was an easy choice for me.
Funnily enough, most colleagues in my team have PhDs in subjects like physics, neuroscience, statistics and math. None of them was actually trained specifically as a data scientist.
What kind of algorithms does Netflix use?
Netflix has a very famous algorithm called “collaborative filtering” that they have published. So what I did in my first job at 99.co was, I took the algorithm that Netflix developed, and adapted it to build recommendation algorithms for 99.co. And this algorithm is applicable to many domains.
An analogy for collaborative filtering is that, based on users’ viewing preferences, we find people who are similar to you, and then we recommend things that they enjoy to you.
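To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in Python. It is an illustration of the general technique only, not Netflix's implementation: the viewing data is made up, and the names `views`, `similarity` and `recommend` are hypothetical.

```python
from math import sqrt

# Made-up viewing data for illustration: each user maps to the set of titles watched.
views = {
    "ana": {"Shirkers", "Street Food", "Chef's Table"},
    "ben": {"Shirkers", "Street Food", "Our Planet"},
    "cal": {"Narcos", "Our Planet"},
}

def similarity(a, b):
    """Cosine similarity between two sets of watched titles."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def recommend(user, views, top_n=3):
    """Rank titles the user has not seen by the similarity of the users who watched them."""
    seen = views[user]
    scores = {}
    for other, titles in views.items():
        if other == user:
            continue
        sim = similarity(seen, titles)
        if sim == 0:
            continue  # users with nothing in common contribute nothing
        for title in titles - seen:
            scores[title] = scores.get(title, 0.0) + sim
    # Highest score first; ties broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]

print(recommend("ana", views))
```

In this toy data, "ana" shares two titles with "ben" and none with "cal", so the only recommendation comes from what "ben" watched that "ana" has not.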
Is collaborative filtering the only algorithm that Netflix uses, or are there others?
It's the main one, and there are variations on it. As for the other algorithms, I would prefer not to go into too much detail.
How big is the data science team at Netflix?
You know, it's not really just one team. There’s a cast of thousands. There are the creative people who create the actual images, videos, and write the text that goes with the show. And then there are the research engineers who build recommendation algorithms and optimise them. And countless other engineers who make sure these elements are integrated well into the platform. And then the data scientists like me, who assess the effectiveness of the images, videos and texts, as well as the effectiveness of the personalisation algorithms. So it's not really just one team working in isolation.
So how exactly does Netflix personalise recommendations? What kind of data is collected and how is it used?
Mostly we rely on viewing data. As I mentioned, we use each person's viewing history to find others with similar viewing histories, and then the algorithms recommend things to you that those other people also viewed.
I think Netflix is better known for the personalisation algorithms that pick the titles. But in a very similar way, the poster images, the trailers, and all those other things associated with the shows are also personalised for each user. A show can have several different images associated with it, each highlighting a different aspect: there might be an action scene in the movie, a romantic scene with a couple, and so on. What the algorithm does is try to predict which image is most attractive to each user. So if you tend to watch romantic shows, the algorithm might pick an image with a couple in it, whereas if you skew towards action shows, it might pick an image with an action scene.
We collect data on viewing time and viewing numbers per show, and also on whether the poster image was actually representative of the show and informative. We want the image to not only be attractive, but also help the user make an informed decision on whether or not they will enjoy the show. So we care not only about whether they click on the image, but also whether they actually complete the show, which indicates that they actually enjoyed it.
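The image selection described above can be sketched as a toy rule in Python: tag each candidate image with a genre, then pick the image whose genre makes up the largest share of the user's viewing history. This is an assumption-laden illustration, not Netflix's method, which the interview does not detail. The function names, file names and genre tags are all hypothetical.

```python
from collections import Counter

def genre_shares(history):
    """Fraction of the viewing history that falls in each genre."""
    counts = Counter(history)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}

def pick_image(history, images, default):
    """Return the image whose genre tag best matches the user's history.

    history: list of genre labels, one per title watched (hypothetical input)
    images:  dict mapping image name -> genre tag (hypothetical input)
    default: image to use when nothing in the history matches any tag
    """
    shares = genre_shares(history)
    best, best_share = default, 0.0
    for name, tag in images.items():
        share = shares.get(tag, 0.0)
        if share > best_share:
            best, best_share = name, share
    return best

images = {"couple.jpg": "romance", "chase.jpg": "action"}
print(pick_image(["romance", "romance", "action"], images, "title_card.jpg"))
```

A viewer with a mostly romantic history gets the image of the couple, a mostly action history gets the chase scene, and a new user with no history falls back to the default image.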
Personalising show synopses is also something that we're planning for the future.
Is there anything special about the viewing behaviour of Singapore Netflix users?
I don't actually have that information. I’ve never looked specifically at Singapore users. But I'm very happy to see that there are shows (on Netflix) that feature Singapore, like Shirkers and Street Food.
Netflix has a very user-friendly interface compared to some competitors in streaming. How do they do it?
I don't think I'm in a position to make comparisons against competitors. But I can say that Netflix does devote a lot of resources to making the user experience very good.
Artificial intelligence has many profound implications for society. How do you think data science and AI will impact society, and not just in entertainment?
We've already seen data science and AI impact a lot of areas like science, government and finance; I can go on and on. And, you know, there’s the example from Singapore where the government tech team used data science to track the cause of MRT breakdowns a few years ago. There’s vast potential for data science to improve our lives. On the flip side, there are things like deepfakes, which allow people to create really convincing, realistic videos of, you know, famous people saying anything you want them to. So I think data science is a powerful tool, but like everything else, it can be used for good things, and it can be abused. And it's really up to the person or company wielding that power to use it responsibly.