Thursday, March 26, 2020

Colexify My Insides

Comparison of universal colexification networks of emotion concepts with Austronesian and Indo-European language families. Credit: T. H. Henry

How do you know that a 12-inch ruler is in fact 12 inches long? You don't. You trust. I don't know who you trust, if it's the ruler manufacturer, or the society you live in, or who else. But you don't actually know how long that ruler really is.

How do ruler manufacturers know how long 12 inches is? They use a ruler, of course. And where does that ruler come from?

I work in a field where we have to take very precise and accurate measurements of environmental conditions, such as nanogram-level concentrations of mercury vapor in the air. If your equipment thinks it's pulling 0.2 liters per minute of air instead of 0.3, then what happens after 8 hours' worth of minutes? You get a very distorted sense of how much mercury is in the air (96 vs 144 liters of sampled air, to begin with).
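To put numbers on that, here is a quick back-of-the-envelope sketch in Python. The flow rates and the 8 hours come from the example above; the amount of mercury collected (1,000 nanograms) is a made-up figure purely for illustration, and I'm assuming the pump really pulls 0.3 liters per minute while the equipment believes it's pulling 0.2.

```python
def sampled_volume_liters(flow_l_per_min, hours):
    """Total air volume pulled through the sampler over the given time."""
    minutes = hours * 60
    return flow_l_per_min * minutes


# Flow rates from the example above; which one is the "true" rate is an assumption.
believed_volume = sampled_volume_liters(0.2, 8)  # what the equipment thinks it sampled
actual_volume = sampled_volume_liters(0.3, 8)    # what it really sampled

# Hypothetical mass of mercury collected, in nanograms (illustrative only).
collected_ng = 1000.0

reported_conc = collected_ng / believed_volume  # ng per liter, using the wrong volume
true_conc = collected_ng / actual_volume        # ng per liter, using the real volume

# How far off the reported concentration is, as a ratio.
overestimate_ratio = reported_conc / true_conc

print(f"believed volume: {believed_volume:.0f} L, actual volume: {actual_volume:.0f} L")
print(f"reported concentration is {overestimate_ratio:.2f}x the true value")
```

Under those assumptions the reported concentration comes out half again as high as the real one, and it does so no matter how much mercury was actually collected, because the error lives entirely in the volume.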

This is why we calibrate our equipment, using another piece of equipment to make sure ours is doing what it says it does. Sure, we could talk about The Kilogram, which until last year was used to calibrate every other kilogram-measuring thing ever, and was protected in multiple nested glass encasements in a vault in the basement of a nondescript building just outside Paris.

But instead, we're going to talk about language. Because there's no Kilogram for language.

In the same way that we don't know how long any particular ruler is if we don't have an ur-ruler, how do I know that your meaning of a word is the same as mine? This is like asking if the red you see is the same red I see. Or if the pain you feel is the same pain I feel. Language, like feelings in general, is subjective. How can we calibrate something that has no universal standard?

Language, unlike feelings, does offer a metric by which we can compare and even measure its meaning to different people. It's not a surprise; words are the way we measure language. But not until now, with the era of Big Data fully upon us, could we put all the words in the world into one database and compare their meanings across all languages, using the database itself as the closest thing to a universal measuring rod that we can get.

This is where colexification comes in. A language colexifies two concepts when it uses the same word for both of them. Draw a line between every pair of concepts that share a word somewhere in that database, and common denominators and groupings start to appear. The goal is to create a universal structure of emotional language that can be used to calibrate and understand these words and especially the people who use them. The results are called "emotion colexification networks," and they show us, for example, how in Austronesian languages "surprise" is associated with "fear," whereas Tai-Kadai languages associate "surprise" with the concepts "hope" and "want." (Take a look at the top image in this post.)

In other words, we can now see that if you say you're surprised, but you're saying that in an Austronesian language, then you're probably not so happy, although in English, the word surprise represents something more like happiness.
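For the curious, here is roughly what "drawing lines" looks like in practice, as a minimal Python sketch. Everything in it is invented for illustration: the three languages, their word forms, and the meanings attached to them are toy data, not entries from the real database, and this is not the researchers' actual code. The idea is just to count, for each pair of concepts, how many languages have a single word covering both.

```python
from collections import Counter
from itertools import combinations

# Invented toy data: language -> {word form: set of concepts that word can express}.
toy_lexicon = {
    "lang_A": {"wa": {"surprise", "fear"}, "mo": {"hope"}},
    "lang_B": {"ti": {"surprise", "fear"}, "ke": {"hope", "want"}},
    "lang_C": {"su": {"surprise", "hope", "want"}},
}


def colexification_edges(lexicon):
    """Count, per concept pair, the number of languages that colexify it."""
    edges = Counter()
    for words in lexicon.values():
        pairs_in_language = set()
        for concepts in words.values():
            # Every pair of concepts sharing one word form is colexified.
            for pair in combinations(sorted(concepts), 2):
                pairs_in_language.add(pair)
        # Each language contributes at most one count per concept pair.
        edges.update(pairs_in_language)
    return edges


edges = colexification_edges(toy_lexicon)

# Strongest links first: these are the thick lines in the network picture.
for (concept_a, concept_b), n_languages in edges.most_common():
    print(f"{concept_a} -- {concept_b}: {n_languages} language(s)")
```

Run the same count separately for each language family and you get one network per family, which is what makes it possible to compare where "surprise" sits in one family versus another.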

The researchers working with this ultimate cross-lingual lexicon found significant variation in the positioning of words in the network: the meaning of words changes a lot as you go from one language to another, even when those words are treated as direct translations of each other.

In closing, this is interesting research for the world of olfaction, which is another one of those severely subjective phenomena. In fact, the researchers in this study use the same two dimensions that olfactory studies use, those being valence and intensity (the paper's term for the second one is activation). That shouldn't come as a shock, because the limbic system is the common denominator between the two. The limbic system is the domain of our emotions and of olfactory experience.

Also, as in the very recent olfactory research, this study was made possible by an advance in the database used. CLICS is a database of colexifications covering 2,474 languages from around the world; only a few years ago it had only 300 languages in it.

J.C. Jackson et al. Science (2019).

Dec 2019,
