Limbic Signal: Semantic Bingo

Tuesday, May 5, 2020

Semantic Bingo

In olfactory research, there's a test called a pairwise similarity test that's used to measure smells, allowing researchers to construct a map of odor perception.

It's hard to make sense out of smells; it doesn't work like the rest of our sensory system. For a bunch of reasons it's proven quite difficult to produce a model which predicts how a molecule will be perceived.

It's not broken beyond repair, but it is frustrating because we can never seem to get an airtight model that works for all smells and for all people. With hundreds of different receptors, varying over thousands of alleles, scientists often look somewhere else for the organizing principles -- they look for patterns in the words themselves.

In a study from 2015, distributional semantics is used to create an odor map. The authors say it's the first attempt to do so. This technique rests on the theory that words occurring in similar contexts are in fact similar. Some of you might remember this as "context clues": if you come across a new word while you're reading, use the surrounding context to help you guess what the word means.

So instead of trying to make a map of molecular features and receptor actuation potentials, they make a map of the words themselves. They use large text datasets, i.e., really big books, one of which was the Sigma-Aldrich Flavors and Fragrances catalog, then score words based on their co-occurrences in the text.
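If you want to see the co-occurrence idea in miniature, here's a rough sketch. It is not the pipeline from the paper: the three-sentence corpus, the window size of 2, and the function names are all made up for illustration, and cosine similarity is assumed as the scoring function. Each word gets a vector counting which other words show up near it, and two words score high when those counts line up.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words that appear within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy corpus, standing in for the "really big books".
toy_corpus = [
    "the bakery smells of fresh warm bread",
    "the kitchen smells of fresh warm bread",
    "the lawn smells of cut green grass",
]

vectors = cooccurrence_vectors(toy_corpus)
sim_close = cosine(vectors["bakery"], vectors["kitchen"])  # identical contexts
sim_far = cosine(vectors["bakery"], vectors["grass"])      # no shared contexts
print(sim_close, sim_far)
```

In this toy corpus "bakery" and "kitchen" sit in exactly the same contexts, so they score near the top of the scale, while "bakery" and "grass" share no neighboring words and score zero. A real corpus is what makes the in-between scores interesting.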

I started this post just so I could paste these lists of words, so let's get on with it. On a scale of 0-1, how likely is it that these words can be interchanged?

Similarity Test:

bakery-bread 0.96

grass-lawn 0.96

dog-terrier 0.90

bacon-meat 0.88

oak-wood 0.84

daisy-violet 0.76

daffodil-rose 0.74

Nearest Neighbor Test:

apple - pear, banana, melon, apricot, pineapple

bacon - smoky, roasted, coffee, mesquite, mossy

brandy - rum, whiskey, wine-like, grape, fleshy

cashew - hazelnut, peanut, almond, hawthorne, jam

chocolate - cocoa, sweet, coffee, licorice, roasted

lemon - geranium, grapefruit, tart, floral

cheese - grassy, butter, oily, creamy, coconut

caramel - nutty, roasted, maple, butterscotch, coffee
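A nearest neighbor list like the ones above can be sketched as a ranking problem: score every other word against the target and keep the top few. The vectors below are invented numbers, not values from the study, and `nearest_neighbors` is a hypothetical helper that assumes cosine similarity as the score.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_neighbors(target, vectors, k=3):
    """Return the k words whose vectors are most similar to the target's."""
    scored = [
        (cosine(vectors[target], vec), word)
        for word, vec in vectors.items()
        if word != target
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]


# Made-up 3-dimensional vectors, for illustration only.
toy_vectors = {
    "apple": [0.9, 0.1, 0.0],
    "pear": [0.8, 0.2, 0.0],
    "banana": [0.7, 0.3, 0.1],
    "bacon": [0.0, 0.2, 0.9],
    "smoky": [0.1, 0.1, 0.8],
}

neighbors = nearest_neighbors("apple", toy_vectors, k=2)
print(neighbors)
```

With these toy numbers the fruit words cluster together and "bacon" lands next to "smoky", which is the same kind of grouping the real lists show.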

Notes:

Kiela, D., Bulat, L. & Clark, S. Grounding semantics in olfactory perception. Assoc. Comput. Linguist. 231–236 (2015).

https://www.aclweb.org/anthology/P15-2038.pdf

Distributional Semantics – represents the meanings of words as vectors in a “semantic space”, relying on the distributional hypothesis: the idea that words that occur in similar contexts tend to have similar meanings.
