Thursday, January 23, 2020

The Dream of Olfaction Prediction

You might want to take this post in doses, because it's a mouthful. I tried to help by adding some totally unrelated but beautiful images from Richard Pousette-Dart, a founder of the New York School of art.

I've been pushing this off for years now, waiting for my schedule to allow me to dive in and give it the respect it deserves.

We're looking at the DREAM challenge (Dialogue on Reverse Engineering Assessment and Methods), a science and technology research consortium that set its sights on olfactory perception a couple of years ago.

Forever, olfaction has been an unruly member of the human sensory suite, refusing to offer any insight into how we perceptually organize odors. Colors have a spectrum and sounds have frequencies, but smells are simply un-organizable.

I'll take a portion of the abstract from the winning team, because they've written a concise, comprehensive explanation of the problem of olfactory recognition:
The olfactory stimulus-percept problem has been studied for more than a century, yet it is still hard to precisely predict the odor given the large-scale chemoinformatic features of an odorant molecule. A major challenge is that the perceived qualities vary greatly among individuals due to different genetic and cultural backgrounds. Moreover, the combinatorial interactions between multiple odorant receptors and diverse molecules significantly complicate the olfaction prediction.
 Some structurally similar compounds display distinct odor profiles, whereas some dissimilar molecules exhibit almost the same smell. Many attempts have been made to establish structure-odor relationships for intensity and pleasantness, but no models are available to predict the personalized multi-odor attributes of molecules.

But some recent advancements in the field have made it worth trying again. Number one is the Dragon software. It's chemoinformatics software that computes features for chemicals on a scale worthy of the Big Data era. Each of the hundreds of odorous chemicals gets thousands of features like functional group, boiling point, etc. It's a lot easier to find patterns in the chemicals when you have this much correlating data.

The number two development is a new set of odor words. Just about all olfactory perception science since the 1980s has been using one specific set of odor names, called the Dravnieks set. Some use the Arctander set, but the Dravnieks has ASTM behind it, so it's usually the main one. The thing is, it's now almost 40 years old. And that means a lot when it comes to smells, because the language of smell is a very dynamic thing.

I'll give a quick example. The first commercial toothpaste ever invented, Pepsodent, was called "minty," but you know what it was made with? Sarsaparilla, like Root Beer. Who knew "root beer" and "minty" were the same thing? They were at that time and in that place. And that's how smells work. The language we use to talk about smells is not so much related to the molecules themselves but to our experiences with them.

You know how baggy pants are popular sometimes (1995), and then later on (2015) they make you look homeless? That's similar to the way our odor lexicon changes. The words themselves are just as fashionable and ephemeral as the fragrance market itself. From the authors of the new study: "Another problem with verbal descriptors is that they are culturally biased. The current standard set of 146 Dravnieks descriptors was developed in the United States in the mid-1980's and is increasingly semantically and culturally obsolete." (Keller 2016 below)

Also, let's not forget that the entire Oceanic/Ozonic/Marine class of fragrance aromas (Cool Water, Acqua Di Gio) didn't exist until the chemical Calone was discovered by a pharmaceutical company researching benzodiazepine derivatives for anti-depression meds circa 1990.

So finally, a bunch of vigilant olfactory enthusiasts got together and generated a killer dataset for smellable molecules and the words we use to describe them (Keller et al 2016). This new set leaves Dravnieks in the dust. It's got 480 molecules tested on 55 subjects. Dravnieks had 146 smell-word combinations and the subjects were all American/Western European. It's important to get the subjects to be as diverse as possible, because whether it's cultural or genetic, we all smell things differently from each other and we all use different words to describe those sensations. Pigeon-holing your demographic yields a pretty distorted dataset.

Other ways they out-did the Dravnieks dataset: they used odorless compounds (like water), they included molecules with unfamiliar smells, they included familiarity ratings (we'll see why this is important later), and they extracted both population average data AND individual reporting data.

Summary: updated datasets, both on the chemical-feature side and on the odor-descriptor side. Now for the DREAM Challenge itself. This is where crowdsourcing, which I guess is now just another word for "competition," narrows down the best approach to tying together categories of chemical features and the words we use to describe the way they smell.


Let's start with a basic pair, to get an idea. Sulfur smells like rotten eggs. Simple, right? If it's got a sulfur molecule, it probably smells 1. bad and 2. like rotten eggs.

Actually, according to this new dataset, it's related more to "garlic"-smell than anything else. ... but that's because "garlic" was one of the pre-determined descriptors that the participants were allowed to choose from; "rotten egg" was not on that list.

Let's get a bit more complicated. Below I'll give bulleted summaries of the three pieces: first the Challenge itself, then the new and improved dataset used in the challenge, and finally the winner of the challenge.

And for the record, I'd really like to see this kind of work done using not just the Dragon database of chemoinformatics, but with the almighty Human Metabolome Database which contains 40,000 entries of all the metabolites that exist within and among the human body. Because that would be interesting to see.

The DREAM Olfaction Prediction Challenge
This challenge aims to develop the most comprehensive computational approach to date to predict olfactory perception based on the physical features of the stimuli.

Teams developed machine learning algorithms that predict the sensory attributes of molecules from their chemoinformatic features, with the goal of predicting the perceptual qualities of virtually any molecule with high accuracy and also reverse-engineering the smell of a molecule.

Predicting human olfactory perception from chemical features of odor molecules. Keller A, Gerkin RC, Guan Y, Dhurandhar A, Turu G, Szalai B, Mainland JD, Ihara Y, Yu CW, Wolfinger R, Vens C, Schietgat L, De Grave K, Norel R, DREAM Olfaction Prediction Consortium, Stolovitzky G, Cecchi GA, Vosshall LB, Meyer P. Science. 2017 Feb 24; 355(6327):820-826.

The Dataset Used in the DREAM Challenge
[aka the Rockefeller University Smell Study]
[aka The New Dravnieks]

Their dataset captured the sensory perception of 480 different molecules (249 cyclic molecules, 52 organosulfur molecules, 165 ester molecules) each with 4884 corresponding chemical features, at two different concentrations, experienced by 55 demographically diverse healthy human subjects (really 49 because some were removed). Subjects rated intensity (0-100), pleasantness (0-100), and familiarity, were asked to apply 20 pre-determined semantic odor quality descriptors to these stimuli, and were offered the option to describe the smell in their own words.

Pre-determined semantic attributes: edible, bakery, sweet, fruit, fish, garlic, spices, cold, sour, burnt, acid, warm, musky, sweaty, ammonia/urinous, decayed, wood, grass, flower, and chemical.

Findings in General

·      Familiarity had a strong effect on the ability of subjects to describe a smell.
·      Many subjects used commercial products to describe familiar odorants, highlighting the role of prior experience in verbal reports of olfactory perception.
·      Nonspecific descriptors like "chemical" were applied frequently to unfamiliar odorants.
·      Unfamiliar odorants were generally rated as neither pleasant nor unpleasant.
·      Many molecules had unfamiliar smells: of the stimuli that subjects could perceive, 70% were rated as unknown and were given low familiarity ratings.
·      Highlights the dominant role of familiarity and experience in assigning verbal descriptors to odorants.

Findings Specific

·      Compounds that contain sulfur or nitrogen (amines) are probably unpleasant
·      Compounds that contain oxygen are probably pleasant
·      If it's got sulfur atoms, there's a good chance someone will choose "garlic" from the list of descriptors (note "rotten eggs" is not on that list)
·      The number of sulfur atoms in a molecule was correlated with the odor quality descriptors "garlic" "fish" and "decayed"
·      Large and structurally complex molecules were perceived to be more pleasant.
·      Vanillin (and ethyl vanillin) was the most likely to record as pleasant
·      Vanillin likely to be called “edible”, “bakery”, “sweet”
·      Vanillin acetate was rated the “warmest” stimulus
·      (−)-Carvone and various esters were the rest of the pleasant odors
·      Methyl thiobutyrate was the least pleasant, also the most intense
·      Methyl thiobutyrate most likely to be called "Decayed"
·      Isovaleric acid received the highest rating for both “musky” and “sweaty”
·      Others of the least pleasant compounds were sulfur-containing (4 in total) and carboxylic acids (4 in total)
·      Benzenethiol, 3-pentanone, and androstadienone had the most variable intensity perception
·      The most commonly used descriptor was “chemical”
·      The least frequently used descriptor was “fish” 
·      "Chemical" was used most often for unfamiliar odors
·      "Edible" was used most often for familiar odors
·      Words least likely to be used for the same compound (negatively correlated) were:
o   edible/chemical
o   sweet/musky
o   sweet/sweaty
·      When describing in their own words, participants used often:
o   “sweet”
o   “burnt”
o   “grass”
o   “candy”
o   “vanilla”
·      Women used their own words more than men
·      Commercial names and trade names (like Vicks VapoRub) were used a lot.
·      In concert with Dravnieks, the most representative descriptor/molecule pairs:
o   “garlic”            (diethyl disulfide)
o   “flower”          (2-phenylethanol)
o   “decayed”       (methyl thiobutyrate)
o   “sweaty”         (isovaleric acid)
o   “spicy”             (eugenol)
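
To make the word "correlated" in the findings above a bit more concrete, here is a minimal Python sketch of a Pearson correlation, the same kind of score used later to judge the challenge predictions. The numbers are made up for illustration and are NOT from the study; the names pearson, sulfur_atoms, garlic_rating and sweet_rating are my own.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / (spread_x * spread_y)


# Hypothetical toy data: five imaginary molecules, invented ratings.
sulfur_atoms = [0, 0, 1, 1, 2]
garlic_rating = [2.0, 5.0, 30.0, 45.0, 70.0]
sweet_rating = [60.0, 55.0, 10.0, 8.0, 3.0]

r_garlic = pearson(sulfur_atoms, garlic_rating)  # positive: more sulfur, more "garlic"
r_sweet = pearson(sulfur_atoms, sweet_rating)    # negative: more sulfur, less "sweet"
print(round(r_garlic, 2), round(r_sweet, 2))
```

A value near +1 means two things rise together (sulfur count and "garlic"), and a value near -1 means they move in opposite directions, which is how a pair like sweet/sweaty ends up labeled "negatively correlated."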

Special Note 1

"Only descriptors with an unambiguous reference odorant can be predicted based on molecular features." (For example, garlic means something pretty specific, but chemical is as ambiguous as it gets.)

The winners of the competition in their own paper mentioned this: "The large differences may result from the relative ambiguity of the word “warm” to describe odor." (Li et al 2018)

This is one of the most important conclusions to come out of this study, because it shows us how olfaction and language really work together. You can't name smells you've never smelled before. And you need very specific references to develop a useful lexicon. This has a lot to do with why commercial products are used in these cases (like in the 2016 World Coffee Research Sensory Lexicon). It's better to say McDonald's Chicken McNuggets or Vicks VapoRub or Hasbro's Play-Doh because they are highly controlled substances (in terms of quality not illegality!) and so they are exactly the same every time.

It also suggests that any universal odor lexicon needs to have an ambiguity rating next to each word.

(FYI: Play-Doh is one of the only branded scents, ever, because you can't have copyright protection for smells, and the brand Mama Celeste's microwave pizza is the World Coffee reference standard for "Cardboard" aroma, poor Mama!)

Special Note 2

This last one is great, for me at least, because it echoes many ideas already posted on this weblog. Here, taken from the authors:
However, we also found marked differences in how descriptors were used by our untrained subjects and experts. For example, subjects used “musky” to describe unpleasant body odors. In contrast, experts use “musky” to describe compounds naturally sourced from animal glands or their synthetic analogues. These are often used as base notes in perfumery, and experts associate musks with pleasant descriptors such as “sweet,” “powdery,” and “creamy.” However for our subjects, “musky” had a negative correlation with pleasantness, and was instead correlated with the descriptor “sweaty.”
"The molecule rated as most “musky” in this study was isovaleric acid, which experts do not rate as “musky” (Dravnieks). The five molecules that Dravnieks lists as representative of the “musk” descriptor are also rated “fragrant” and “perfumery” by experts (Dravnieks).
Therefore, the word “musky” has a colloquial meaning that is different from its technical meaning in perfumery.

Olfactory perception of chemically diverse molecules. Keller A, Vosshall LB. BMC Neurosci. 2016 Aug 8; 17(1):55.

DREAM Challenge Winners

The winners were from the University of Michigan and used a random forest-type machine learning algorithm. It won 1st place for predicting individual responses and 2nd place for predicting population responses.

Right off the bat, one of the important things they do is to combine the (stable) population average with the (highly varied) individual responses. This is a big deal because there is so much variety to individual responses, as explained above, because of either culture or genetics. We all smell things differently, and we all use different words to refer to those smells. And that difference is large enough to make the data messy as heck. So this winning team introduced a weighted value, alpha, to balance the two, and it works like this:

"When α equals 0, only population ratings are considered. Conversely, when α equals 1, only individual ratings are used (see the “Methods”). Surprisingly, a small α = 0.2 achieves the largest Pearson's correlation coefficient (Fig. ​(Fig.3B).3B). Without population information (α = 1.0), the correlation of predicting the 19 semantic descriptors is the lowest. This reveals that population perceptions play a crucial role when individual responses display large fluctuations."

And this is a great improvement, because cultural influence / cultural conditioning is so influential on our own subjective perception (see Greta Garbo and the Vermeer forgeries).
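
The alpha weighting quoted above can be sketched as a simple linear blend. This is only my reading of the quote, not the team's actual code (their exact formula is in the Methods section of their paper); the function name blend_ratings and the ratings below are invented for illustration.

```python
def blend_ratings(individual, population, alpha):
    """Mix one subject's ratings with the population average.

    alpha = 0 -> only population ratings are used
    alpha = 1 -> only the individual's ratings are used
    (Assumed linear blend; see the lead-in for caveats.)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if len(individual) != len(population):
        raise ValueError("rating lists must have the same length")
    return [alpha * i + (1.0 - alpha) * p for i, p in zip(individual, population)]


# Hypothetical 0-100 ratings of one molecule on three descriptors.
one_subject = [80.0, 10.0, 0.0]
everyone = [50.0, 30.0, 20.0]

# alpha = 0.2 is the value the quote reports as working best.
blended = blend_ratings(one_subject, everyone, alpha=0.2)
print(blended)
```

With a small alpha the blended target stays close to the stable population average and only leans a little toward the noisy individual, which is the balance the quote describes.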

The next thing to note about the winning team's algorithm is that it performs just like you would expect: the results seem to make more sense to a machine than to a person.

For anyone familiar with recent examples of machine learning, you hear a lot of 1. it's like a black box and we can't see what it's doing to make its decisions, or 2. it's like the adversarial image hack where they do what looks like absolutely nothing to the image, and yet the network reads it as something wildly different than what it is.

In this case, the algorithm found the most "obvious" patterns in chemical features did not correspond to things we already know about chemicals. Sure, sulfur atoms correlate to bad smells, but the 2nd and 3rd most correlated features had nothing to do with features we would associate with odor.

I guess one of the main reasons for this disconnect is the idea of degeneracy, which refers to the fact that so many molecules have identical or similar values for simple features. If the point of the algorithm is to predict a smell from chemical information alone, a feature shared by more than one smell group won't help you predict which smell it's going to be. Say a chemical has an oxygen atom. Well, lots of different-smelling chemicals have oxygen in them, so we can't use that simple feature as a way to organize.
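
Here is a toy Python sketch of that degeneracy problem. It assumes each molecule is reduced to one dominant descriptor, which is a big simplification. The odor labels are the descriptor/molecule pairs mentioned earlier in this post; the record layout and the function name descriptors_per_feature_value are hypothetical.

```python
from collections import defaultdict


def descriptors_per_feature_value(molecules, feature):
    """Map each value of `feature` to the set of odor labels seen with it."""
    seen = defaultdict(set)
    for mol in molecules:
        seen[mol[feature]].add(mol["odor"])
    return dict(seen)


# Hand-made records; one dominant descriptor per molecule (simplification).
molecules = [
    {"name": "vanillin", "has_oxygen": True, "sulfur_atoms": 0, "odor": "sweet"},
    {"name": "isovaleric acid", "has_oxygen": True, "sulfur_atoms": 0, "odor": "sweaty"},
    {"name": "eugenol", "has_oxygen": True, "sulfur_atoms": 0, "odor": "spicy"},
    {"name": "diethyl disulfide", "has_oxygen": False, "sulfur_atoms": 2, "odor": "garlic"},
]

by_oxygen = descriptors_per_feature_value(molecules, "has_oxygen")
by_sulfur = descriptors_per_feature_value(molecules, "sulfur_atoms")

# "has_oxygen = True" covers three different smells, so it cannot tell them apart.
print(sorted(by_oxygen[True]))
# "sulfur_atoms = 2" points at a single smell in this toy set.
print(sorted(by_sulfur[2]))
```

A feature value that maps to many different descriptors is the degenerate case: knowing it tells you almost nothing about which smell to predict.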

After all this work, one point should be noted: the top 5 features achieve similar performance to a random forest with all 4884 features for almost all olfactory qualities (with the exception of “intensity,” for which the top 15 features are adequate).

Accurate prediction of personalized olfactory perception from large-scale chemoinformatic features. Hongyang Li, Bharat Panwar, Gilbert S Omenn, and Yuanfang Guan. Gigascience. 2018 Feb; 7(2): 1–11.
Dravnieks A. Atlas of odor character profiles. Philadelphia: ASTM; 1985.
Arctander S. Perfume and flavor chemicals (aroma chemicals). Montclair, NJ: Author; 1969.
Keller A, Vosshall LB. Olfactory perception of chemically diverse molecules. BMC Neurosci. 2016 Aug 8; 17(1):55.

Here is a link to my original post on this DREAM Challenge from 2017

Monday, January 20, 2020

Did You Smell That No I Didn't

aka The 378-Dimensional Individual Olfactory Receptor Subtype Genome
aka The Olfactory Fingerprint


I would like to go back to Conspiracy Keanu on this one, because I know you feel the same way.

It's pretty hard to tell if the Blue you see is the same Blue I see. (And The Dress is still fresh enough in our collective memory to know how this plays out.)

For the most part, we can be sure that when we're detecting energy in the 610-670 THz range, it will be called Blue by anyone who knows what Blue means (which is not everyone; check this out for a lesson on the cultural evolution of color terms).
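
For anyone who thinks of color in wavelengths rather than frequencies, the conversion is just wavelength = speed of light / frequency. A quick Python sketch (the function name thz_to_nm is mine):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def thz_to_nm(freq_thz):
    """Convert a light frequency in terahertz to a vacuum wavelength in nanometres."""
    if freq_thz <= 0:
        raise ValueError("frequency must be positive")
    wavelength_m = SPEED_OF_LIGHT_M_PER_S / (freq_thz * 1e12)
    return wavelength_m * 1e9


low_end_nm = thz_to_nm(670)   # higher frequency -> shorter wavelength
high_end_nm = thz_to_nm(610)  # lower frequency -> longer wavelength
print(round(low_end_nm), round(high_end_nm))
```

That puts the 610-670 THz band at roughly 447 to 491 nm, which is the stretch of the spectrum most of us agree to call Blue.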

Smells however, not the same. Isovaleric Acid is the best example of this, because half the people who smell it will call it delicious and the other half disgusting. So we have differences in opinion, big deal. But there's more – we are not perceiving the same thing.

That is to say, the hundreds of chemically-sensitive receptors in our nose, which are programmed by hundreds of genes in our personal genome, are pretty personal. They can be quite different from person to person, with as high as a 30% variation. This means the actual receptors encoding molecular features into our minds are not all the same. This isn't just about subjectivity; the actual hardware is different from person to person.
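
To make "30% variation" concrete, here is a minimal Python sketch that counts how many receptor positions differ between two people. The two profiles are invented, and reducing each receptor to a single "A" or "B" variant label is an assumption made purely for illustration (real receptor genetics is far messier).

```python
def fraction_different(profile_a, profile_b):
    """Share of receptor positions where two people carry different variants."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same receptors")
    if not profile_a:
        raise ValueError("profiles must not be empty")
    mismatches = sum(1 for a, b in zip(profile_a, profile_b) if a != b)
    return mismatches / len(profile_a)


# Hypothetical profiles over ten receptors (a real nose has hundreds).
person_1 = ["A", "A", "A", "A", "A", "A", "A", "A", "A", "A"]
person_2 = ["A", "B", "A", "A", "B", "A", "A", "A", "B", "A"]

difference = fraction_different(person_1, person_2)
print(f"{difference:.0%} of receptors differ")
```

Three mismatches out of ten gives the 30% figure; scale the same idea up to hundreds of receptors and you get a different nose for every person.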

General olfactory acuity will change based on age, gender, smoking habits, body type, and race. Other factors such as prior upper respiratory infection, trauma, and environmental toxin exposure are also involved. Even something as simple as prior exposure to an odor can change the way we perceive it (in the case of androstenone). Not only that, the same odor will smell different to the same person at different times!

With all that variation, how do we ever really know what each other is talking about? My androstenone is not your androstenone. A great study from a powerhouse in olfactory research was done back in 2012 to characterize these perceptual differences across a diverse metropolitan population, and I'm summarizing it here because without it, all other smell research is kind of useless.

The Study
They gave 66 different odors to 391 people who closely reflected the diverse population of New York City (meaning that this study group has a fighting chance of representing the diversity of the human genome).

The Findings
·      Young, female, non-smoking subjects had the highest average olfactory acuity.
·      Deviations from normal body type in either direction were associated with decreased olfactory acuity.
·      General olfactory acuity declines with age.
·      Reduced general olfactory acuity can be caused by genetics, trauma, exposure to toxic agents, neurodegenerative diseases, or infections.

Differences in Olfactory Acuity Between Races
·      African-Americans 149
·      Asians 231
·      Caucasians 225
·      (Asians are more similar to each other, and African-Americans are more different)

On Hedonics, i.e., Pleasant vs Unpleasant
·      Unpleasant odours were generally perceived to be more intense than pleasant odours
·      Across all subjects, the eight most pleasant odours were food odours such as vanilla, citrus, minty, and cinnamon odours
·      The seven least pleasant odours were fatty acid derivatives associated with the sour smell of rancid butter or body odour
·      The biggest variability in pleasantness perception was found for floral odours
·      The two most pleasant stimuli were the two concentrations of ethyl vanillin, followed by the high concentration of vanillin
·      The least liked odors were isovaleric acid (rancid butter) and isobutyraldehyde (sour)
·      Throughout our subject population, odours perceived to be most and least pleasant were remarkably stable
·      For 18 of the 134 stimuli the pleasantness rating differed significantly between African-American and Caucasian subjects
·      The biggest difference between younger and older subjects was that older subjects perceived anise, the odour of liquorice, to be more pleasant
*I thought this one was easy, since I know that most candy experience by someone born in ~1930 came from licorice, and that changes pretty drastically thereafter; the first Twizzler flavor was licorice, not strawberry, which didn't appear until the late 1970's.
·      Men vs Women: older studies showed differences, but these were not reproduced in this study. Instead, new ones turned up: for guaiacol, the odour of wood smoke, both concentrations were perceived to be significantly more pleasant by men, and the high concentration of guaiacol showed the largest difference between men and women.
·      All the odours that were perceived to be more pleasant by perfume users were odours used in perfumes (pentadecalactone, heptyl acetate, octyl aldehyde, nonyl aldehyde); Perfume use may result in these odours being rated as more pleasant, or, alternatively, those who perceive these odours to be more pleasant are more likely to use perfumes.
·      Geranyl acetate (floral, rose, lavender) was perceived to be more pleasant by Caucasians than by Asians
·      Androstenone was more likely to smell “musky” and “aromatic” to women, whereas men found it to be more “chemical” and “sickening.”

Further Findings
Interestingly, for any given stimulus, the responses were as similar when the ratings were spaced over one year apart as when they were around 30 minutes apart.

This may seem surprising, but for thresholds it has even been reported that the variability within a day is significantly larger than the variability between days [1]. Day-to-day variability in olfactory perception is therefore largely a consequence of sniff-to-sniff variability.

The main causes of within-individual variability are processes that operate on the scale of seconds or minutes such as changes in the stimulus signal-to-noise ratio [2] or the reallocation of attention by the subject [3], rather than on the scale of hours or days, such as hormonal changes or infections of the upper respiratory tract.
1. Stevens JC, Cain WS, Burke RJ. Variability of olfactory thresholds. Chem Senses. 1988;13:643–653. doi: 10.1093/chemse/13.4.643.
2. Cain WS. Differential sensitivity for smell: “noise” at the nose. Science. 1977;195:796–798. doi: 10.1126/science.836592.
3. Keller A. Attention and olfactory consciousness. Front Psychol. 2011;2:380.

·      Males have been shown to be more sensitive to the odour bourgeonal (lily-of-the-valley); also the only known odour that men are more sensitive to than women.
Olsson P, Laska M. Human male superiority in olfactory sensitivity to the sperm attractant odorant bourgeonal. Chem Senses. 2010;35:427–432. doi: 10.1093/chemse/bjq030.
·      African-Americans have been shown to have a higher threshold for isovaleric acid than Caucasians, but a lower threshold for pentadecalactone.
Whissell-Buechy D, Amoore JE. Odour-blindness to musk: simple recessive inheritance. Nature. 1973;242:271–273. doi: 10.1038/242271a0.
·      Asians perceive each odour of the homologous series of nonyl aldehyde, decyl aldehyde, and undecanal to be stronger than Caucasians do (no mechanistic explanation for this).
·      The three stimuli showing the greatest variability between subjects in intensity ratings were the high concentrations of androstenone and androstadienone, and methanethiol (cabbage-like odour present in urine of people who have previously ingested asparagus).

Specific Odorants – Androstenone and Androstadienone
·      Altered by genetic variation in the odorous steroid-sensitive odorant receptor OR7D4.
Keller A, Zhuang H, Chi Q, Vosshall LB, Matsunami H. Genetic variation in a human odorant receptor alters odour perception. Nature. 2007;449:468–472. doi: 10.1038/nature06162.
Lunde K, Egelandsdal B, Skuterud E, Mainland JD, Lea T, Hersleth M, Matsunami H. Genetic variation of an odorant receptor OR7D4 and sensory perception of cooked meat containing androstenone. PLoS One. 2012;7:e35259. doi: 10.1371/journal.pone.0035259.
Knaapila A, Zhu G, Medland SE, Wysocki CJ, Montgomery GW, Martin NG, Wright MJ, Reed DR. A genome-wide study on the perception of the odorants androstenone and galaxolide. Chem Senses. 2012;37:541–552. doi: 10.1093/chemse/bjs008.
·      Androstadienone was perceived to be stronger by older subjects, women and African-Americans.
·      The functional RT variant of OR7D4 is more common in African-Americans.
·      This is consistent with the finding from the National Geographic Smell Survey.

Specific Odorants – Methanethiol
Associated with a single nucleotide polymorphism within a 50-gene cluster of olfactory receptors.
Pelchat ML, Bykowski C, Duke FF, Reed DR. Excretion and perception of a characteristic odor in urine after asparagus ingestion: a psychophysical and genetic study. Chem Senses. 2011;36:9–17. doi: 10.1093/chemse/bjq081.
Tung JY, Do CB, Hinds DA, Kiefer AK, Macpherson JM, Chowdry AB, Francke U, Naughton BT, Mountain JL, Wojcicki A, et al. Efficient replication of over 180 genetic associations with self-reported medical data. PLoS One. 2011;6:e23473. doi: 10.1371/journal.pone.0023473.
Eriksson N, Macpherson JM, Tung JY, Hon LS, Naughton B, Saxonov S, Avey L, Wojcicki A, Pe’er I, Mountain J. Web-based, participant-driven studies yield novel genetic associations for common traits. PLoS Genet. 2010;6:e1000993. doi: 10.1371/journal.pgen.1000993.

Specific Odorants – Pentadecalactone, Vanillin and Isovaleric Acid
·      Perceived as more intense by non-perfume users.

Any two individuals differ by 30% of their olfactory receptor subtype genome:
Mainland JD, et al. (2014) The missense of smell: Functional variability in the human odorant receptor repertoire. Nat Neurosci 17(1):114–120.

The human olfactory genome contains 418 intact odorant receptor genes and their 912,912 intact odorant receptor alleles:
The 1000 Genomes Project (2008-2015), the largest public catalogue of human variation and genotype data.

378-dimensional individual olfactory receptor subtype genome:
Individual olfactory perception reveals meaningful nonolfactory genetic information.
Secundo L, Snitz K, Weissler K, Pinchover L, Shoenfeld Y, Loewenthal R, Agmon-Levin N, Frumin I, Bar-Zvi D, Shushan S, Sobel N. Proc Natl Acad Sci U S A. 2015 Jul 14; 112(28):8750-5.

Androstenone perception is heavily influenced by prior exposure:
Wysocki CJ, Dorries KM, Beauchamp GK. Ability to perceive androstenone can be acquired by ostensibly anosmic people. Proc Natl Acad Sci U S A. 1989;86:7976–7978. doi: 10.1073/pnas.86.20.7976.

Study on inter-individual variability:
An olfactory demography of a diverse metropolitan population.
Keller A, Hempstead M, Gomez IA, Gilbert AN, Vosshall LB
BMC Neurosci. 2012 Oct 10; 13:122.