Artwork by Alex Grey
The title of this post is taken from an article about categorizing smells,
although it would work just as well as the title of a work of science fiction.
The article is from August 2018, so it's old news by now; but that title isn't
getting old anytime soon.
Probing the interconnectedness of odors, and sketching a
map of an omnicategorical odor network, the article starts out with a basic
premise.
Let's say the olfactory system is designed to warn us of
poisons in the environment. But a poison could be many chemicals, or a chemical
we've never encountered before. So it would be necessary for the odors of those
chemicals to be classified not by features intrinsic to the chemicals
themselves, but by the likelihood of their co-occurrence with other chemicals.
You can't be born with a database of chemicals to recognize and avoid. So
instead, the hypothesis here is that an odor is identified only by its
relationship to the other odors it shows up with.
This idea, at least to my ears, sounds really similar to the
way statistical analysis of word-to-word correlations can determine whether a piece of
writing was written by a robot or not. (Also called visual forensics.)
It's way easier to visualize than to explain, so I'm taking these three
images from the GLTR paper itself:
In the above three images, the first is a chunk of text written
by a robot (most of the words are green, with a few yellows sprinkled in); the
second is a real New York Times article (only about half is green, the rest is
yellow, with some red and a sprinkle of purple); and the third is a
clip from "the most unpredictable human text ever written", James
Joyce's Finnegans Wake (green, yellow, red, and purple are all evenly
distributed about the page).
Green words are ones the model found very predictable: they were among its top
guesses for the word that came next. Yellow words are less likely to follow the
words that precede them. And red and
purple are for when the next word is something you absolutely did not expect.
Because today's text-writing algorithms use a statistical
model trained on a huge compendium of written language (so they know
what words typically occur together, and can therefore sound more like a normal
person), the output of such algos will tend to look like the topmost image with
all green words. Very predictable. The algos can't think for themselves, they
can't "come up with" new stuff, and they can't be unpredictable. The
whole point of writing an algorithm to do this is to prescribe in advance what it's going
to do, i.e., it's predictable.
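For the curious, here's roughly what that coloring amounts to in code. This is my own sketch of a GLTR-style procedure, not the authors' actual tool: it assumes the Hugging Face transformers library and the small GPT-2 checkpoint, asks the model how highly it ranked the word that actually came next, and buckets that rank into green, yellow, red, or purple.

    # A GLTR-style sketch (not the authors' exact tool): rank each actual
    # next token under a language model and bucket it by rank, the way the
    # green / yellow / red / purple coloring works.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def color_buckets(text):
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits          # (1, seq_len, vocab_size)
        buckets = []
        for pos in range(ids.shape[1] - 1):
            next_id = ids[0, pos + 1].item()
            # rank of the token that actually came next, under the model's
            # prediction from everything before it (0 = most expected)
            ranked = logits[0, pos].argsort(descending=True)
            rank = (ranked == next_id).nonzero().item()
            if rank < 10:
                bucket = "green"       # among the model's top guesses
            elif rank < 100:
                bucket = "yellow"
            elif rank < 1000:
                bucket = "red"
            else:
                bucket = "purple"      # the model did not see it coming
            buckets.append((tokenizer.decode(next_id), bucket))
        return buckets

    print(color_buckets("riverrun, past Eve and Adam's, from swerve of shore"))

Run it on machine-written text and you get a wall of green; run it on Finnegans Wake and the buckets scatter, just like the third image.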
Bringing this back to olfaction, unfortunately there isn't a compendium of odor associations: no Bible for smells, no encyclopedia of the volatile organic compounds found in nature. Furthermore, even if there were, we would need to augment it with a companion encyclopedia of the odors in the anthroposphere, because your supermarket isn't "nature" and yet it organizes a whole lot of our daily scentscape. One day though.
Notes:
Hyperbolic geometry of the olfactory space.
Yuansheng Zhou, Brian H. Smith, Tatyana O. Sharpee.
Science Advances, 29 Aug 2018: Vol. 4, no. 8, eaaq1458.
DOI: 10.1126/sciadv.aaq1458
Catching a Unicorn with GLTR: A tool to detect automatically generated text.
Hendrik Strobelt and Sebastian Gehrmann.
Proceedings of the 57th Annual Meeting of the Association for Computational
Linguistics: System Demonstrations. Florence, Italy, July 2019.
DOI: 10.18653/v1/P19-3019