Google is teaming up with a Harvard professor to promote a new scale for measuring skin tones in hopes of fixing bias and diversity issues in the company’s products.
The tech giant is working with Ellis Monk, an assistant professor of sociology at Harvard and the creator of the Monk Skin Tone Scale, or MST Scale. The scale is designed to replace outdated skin tone scales that are biased toward lighter skin. When these older scales are used by tech companies to classify skin color, it can lead to products that work worse for people with darker skin, Monk says.
“Unless we have an adequate measure of differences in skin tone, we can’t really integrate that into products to make sure they’re more inclusive,” Monk tells The Verge. “The Monk Skin Tone Scale is a 10-point skin tone scale that has been deliberately designed to be much more representative and inclusive of a wider range of different skin tones, especially for people [with] darker skin tones.”
There are many examples of tech products, particularly those that use AI, that perform worse with darker skin tones. These include apps designed to detect skin cancer, facial recognition software, and even the machine vision systems used by self-driving cars.
Although there are many ways this sort of bias gets programmed into these systems, one common factor is the use of outdated skin tone scales when collecting training data. The most popular skin tone scale is the Fitzpatrick scale, which is widely used in academia and AI. The scale was originally designed in the 1970s to classify how people with paler skin burn or tan in the sun and was only later expanded to include darker skin.
This has led to some criticism that the Fitzpatrick scale fails to capture a full range of skin tones and may mean that when machine-learning software is trained on Fitzpatrick data, it is also biased towards lighter skin types.
The Fitzpatrick scale consists of six categories, but the MST Scale expands this to 10 different skin tones. Monk says this number was chosen based on his own research as a balance between diversity and ease of use. Some skin tone scales offer more than a hundred different categories, he says, but too much choice can lead to inconsistent results.
“Usually, if you got past 10 or 12 points on these types of scales [and] ask the same person to repeatedly pick out the same tones, the more you increase that scale, the less people are able to do that,” says Monk. “Cognitively speaking, it just becomes really hard to differentiate accurately and reliably.” A choice of 10 skin tones is much more manageable, he says.
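Monk’s point about reliability can be illustrated with a toy simulation. The Python sketch below is purely hypothetical (the noise model and all numbers are invented for illustration, not drawn from his research): it models a rater whose perception of a tone wobbles slightly, and shows how the chance of picking the same category twice falls as the scale gains points.

```python
import random


def repeat_pick_agreement(true_tone: float, scale_points: int,
                          trials: int = 20, noise: float = 0.04) -> float:
    """Fraction of trials in which a rater picks the same category twice in a row.

    `true_tone` is a value in [0, 1]; `noise` models perceptual wobble.
    All numbers here are invented for illustration, not taken from Monk's studies.
    """
    def rate() -> int:
        perceived = min(max(true_tone + random.gauss(0, noise), 0.0), 1.0)
        return min(int(perceived * scale_points), scale_points - 1)

    # Two independent ratings per trial; count how often they land in the same bucket.
    return sum(rate() == rate() for _ in range(trials)) / trials


if __name__ == "__main__":
    random.seed(0)
    for points in (6, 10, 40, 100):
        avg = sum(repeat_pick_agreement(t / 100, points) for t in range(100)) / 100
        print(f"{points:>3}-point scale: repeat-pick agreement = {avg:.2f}")
```

In this toy setup, agreement is high for coarse scales and drops off as the categories become narrower than the rater’s perceptual noise, which is the trade-off Monk describes.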
Creating a new skin tone scale is, however, only a first step, and the real challenge is integrating this work into real-world applications. To promote the MST Scale, Google has created a new website, skintone.google, dedicated to explaining the research and best practices for its use in AI. The company says it is also working to apply the MST Scale to a number of its own products. These include its “Real Tone” photo filters, which are designed to work better with darker skin tones, and its image search results.

Google says it is introducing a new feature to image search that will let users refine searches based on skin tones classified by the MST Scale. So, for example, if you search for “eye makeup” or “bridal makeup looks,” you can then filter results by skin tone. In the future, the company also plans to use the MST Scale to check the diversity of its results, so that if you search for images of “cute babies” or “doctors,” you won’t be shown only white faces.
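As a rough illustration of what such a feature implies under the hood, here is a hypothetical Python sketch. It is not Google’s implementation: it assumes each result already carries an MST tone label (1 to 10) produced by some upstream classifier, and it uses Shannon entropy as one possible stand-in for measuring the “diversity” of a result set.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical shape of a search result: (image_id, relevance_score, mst_tone),
# where mst_tone is an MST category from 1 to 10. Nothing here mirrors Google's actual APIs.
Result = Tuple[str, float, int]


def filter_by_tone(results: List[Result], wanted: Iterable[int]) -> List[Result]:
    """Keep only results whose classified MST tone is in the user's selection."""
    wanted_set = set(wanted)
    return [r for r in results if r[2] in wanted_set]


def tone_entropy(results: List[Result]) -> float:
    """Shannon entropy of the tone distribution; low values flag a homogeneous result set."""
    if not results:
        return 0.0
    counts = Counter(tone for _, _, tone in results)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A ranking system could, in principle, use a signal like `tone_entropy` to notice when a query surfaces only a narrow band of tones and then rebalance, which is the kind of monitoring the company describes.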
“One of the things we’re doing is taking a set of [image] results, understanding when those results are particularly homogeneous across a set of tones, and improving the diversity of the results,” Tulsee Doshi, head of product for Google’s responsible AI team, told The Verge. Doshi stressed, however, that these updates were at a “very early” stage of development and had not yet been rolled out across the company’s services.
This should strike a note of caution, not only for this specific change but also for Google’s approach to fixing bias problems in its products more generally. The company has a patchy history when it comes to these issues, and the AI industry as a whole has a tendency to promise ethical guidelines and guardrails and then fail on the follow-through.
Take, for example, the infamous Google Photos error that led to its search algorithm tagging photos of Black people as “gorillas” and “chimpanzees.” This mistake was first noticed in 2015, yet Google confirmed to The Verge this week that it has still not fixed the problem but has simply removed these search terms altogether. “While we’ve significantly improved our models based on feedback, they’re still not perfect,” Michael Marconi, a spokesperson for Google Photos, told The Verge. “In order to prevent this type of mistake and avoid potentially causing additional harm, the search terms remain disabled.”
Introducing these kinds of changes can also be culturally and politically fraught, reflecting broader questions about how we integrate this sort of technology into society. In the case of filtering image search results, for example, Doshi notes that “diversity” may look different in different countries, and if Google adjusts image results based on skin tone, it may need to vary those results by geography.
“What diversity means, for example, when we’re surfacing results in India [or] when we’re surfacing results in different parts of the world, is going to be inherently different,” says Doshi. “It’s hard to say, ‘Oh, this is the right set of good results we want,’ because that will differ by user, by region, by query.”
Introducing a new and more inclusive scale for measuring skin tones is a step forward, but far thornier issues involving AI and bias remain.