Probabilistic programming does in 50 lines of code what used to take thousands
I wanted to write something here about the state of computer
programming, because the way we smell is akin to a special kind of computer
program, one that does not act like the kind we know.
I should start like this: I grew up on Logo, and then NES
video games, so my experience with, and thinking about, computer
programming is ‘coded’ according to a top-down style. Someone writes the code,
and the computer executes it. There are no surprises (unless you have
bugs to fix). You tell the turtle (that’s what they call the cursor in Logo)
what to do and it does exactly that.
Look at the picture above. That little triangle (the turtle) was told to go 100
spaces, rotate 90 degrees, then go 100 more spaces, and so on, until a square is
born.
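To make the turtle example concrete, here is a minimal sketch of that top-down style. It is written in Python rather than Logo, and the function `run_turtle` and its two command names are my own inventions for illustration, not anything from a real Logo interpreter. The point is only that the same list of instructions always produces the same square.

```python
import math

def run_turtle(commands):
    """Follow (command, value) pairs exactly and return every point visited."""
    x, y, heading = 0.0, 0.0, 0.0  # start at the origin, facing along +x
    path = [(x, y)]
    for name, value in commands:
        if name == "forward":
            # Move `value` units in the direction the turtle is facing.
            x += value * math.cos(math.radians(heading))
            y += value * math.sin(math.radians(heading))
            path.append((round(x, 6), round(y, 6)))
        elif name == "right":
            # Turn clockwise by `value` degrees.
            heading = (heading - value) % 360
        else:
            raise ValueError(f"unknown command: {name}")
    return path

# "Go 100, rotate 90 degrees", four times over: a square.
square = [("forward", 100), ("right", 90)] * 4
path = run_turtle(square)
print(path)
```

Run it a thousand times and the turtle ends up back where it started a thousand times. Nothing in the program is learned or guessed.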
King Koopa was told to jump every time you throw fireballs
at him, or whatever he does. There is no fuzzy logic here. Everything is clear,
concise, exact, predictable. (Again, that’s when the program runs as intended;
surely this kind of programming is unpredictable when it goes wrong.)
Enter a new kind of programming. With the dual advent of big
data and big processors to crunch it, we are seeing a different approach. Computers
are now powerful enough that they can be asked to figure out their own program from the
data they are given. This helps with a lot of the problems faced in computing today.
The data is too varied (this is ultimately what big data is about: not sheer
quantity, but many different qualities)
for us to hand-write programs that can handle all of it. A program
written that way ends up being as big as the dataset.
This is where we see the parallels to smells and olfaction.
The number of smells we could potentially be exposed to is infinite and
multifarious. Vision has only a few categories. Things can look light or dark,
a binary classification, or they can be categorized by their color on the
spectrum, which is a discrete classification. They have a shape, a size, maybe
a texture category. Odors, however, cannot be organized this way. There are too
many and they are too different from each other. Therefore, olfactory perception
is distinct from our other senses. In order for us to create an artificial
intelligence that can smell, we would have to come up with a different kind of
programming.
Facial recognition provides a good visual analogy to the
olfaction problem. What a face looks like doesn’t really depend on its color
or its shape alone, but on the combination of these features, the whole. That makes
for a lot of initial parameters, practically an unlimited number. Face recognition
uses these new types of programs, and they are almost the
opposite, in every way, of what programming has been. I’ll let this guy describe
them:
“When you think about probabilistic programs, you think very
intuitively when you're modeling. You don't think mathematically. It's a very
different style of modeling.” … “The code can be generic if the learning
machinery is powerful enough to learn different strategies for different tasks.”
- Tejas Kulkarni, an MIT graduate student in brain and
cognitive sciences, phys.org
In the same way that we are not born already knowing every
smell we will ever encounter, these programs must ‘learn on the fly.’ This is an
advance in computing, but it also foreshadows a very different world, where information
is not distinct, discrete, or exact. It is instead more like that thing you
smell but you don’t know what it is, but you swear you know yet you don’t know…you
know what I’m talking about? Doesn’t sound like the kind of output your
computer would produce.
POST SCRIPT
[lots of good explaining in this article, so I just copied
most of it]
A Grand Unified Theory of Artificial Intelligence
Embracing uncertainty
In probabilistic AI, by contrast, a computer is fed lots of
examples of something — like pictures of birds — and is left to infer, on its
own, what those examples have in common. This approach works fairly well with
concrete concepts like “bird,” but it has trouble with more abstract concepts —
for example, flight, a capacity shared by birds, helicopters, kites and
superheroes. You could show a probabilistic system lots of pictures of things
in flight, but even if it figured out what they all had in common, it would be
very likely to misidentify clouds, or the sun, or the antennas on top of
buildings as instances of flight. And even flight is a concrete concept
compared to, say, “grammar,” or “motherhood.”
As a research tool, Goodman has developed a computer
programming language called Church — after the great American logician Alonzo Church
— that, like the early AI languages, includes rules of inference. But those
rules are probabilistic. Told that the cassowary is a bird, a program written
in Church might conclude that cassowaries can probably fly. But if the program
was then told that cassowaries can weigh almost 200 pounds, it might revise its
initial probability estimate, concluding that, actually, cassowaries probably
can’t fly.
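[My own aside, not part of the article: the cassowary revision can be sketched with plain Bayes' rule. This is Python, not Church, and every number in it is made up for illustration. The helper `posterior` is a hypothetical name of mine.]

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a yes/no hypothesis after seeing one piece of evidence."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)

# Assumed starting belief: told only "a cassowary is a bird",
# the program guesses it probably flies.
p_fly = 0.95

# New evidence: "it can weigh almost 200 pounds".
# Assumed likelihoods: that weight is very rare among birds that fly
# and fairly common among birds that do not.
p_fly_after = posterior(p_fly, likelihood_if_true=0.001, likelihood_if_false=0.5)

print(round(p_fly_after, 3))  # about 0.037 with these made-up numbers
```

[With these invented figures the belief drops from 95 percent to under 4 percent. No rule about heavy birds was written by hand. The revision falls out of the arithmetic.]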
“With probabilistic reasoning, you get all that structure
for free,” Goodman says. A Church program that has never encountered a
flightless bird might, initially, set the probability that any bird can fly at
99.99 percent. But as it learns more about cassowaries — and penguins, and
caged and broken-winged robins — it revises its probabilities accordingly.
Ultimately, the probabilities represent all the conceptual distinctions that
early AI researchers would have had to code by hand. But the system learns
those distinctions itself, over time — much the way humans learn new concepts
and revise old ones.
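[Another aside of mine, not from the article: one simple way to picture "revising probabilities over time" is to keep running counts of birds that flew and birds that did not. The sketch below is Python, not Church. The starting counts are an assumption chosen so the initial estimate is 99.99 percent, and the function names are hypothetical.]

```python
def probability_of_flight(flew, grounded):
    """Estimate P(a bird can fly) as the share of counted birds that flew."""
    return flew / (flew + grounded)

def observe(flew, grounded, sightings):
    """Add each sighting (True = it flew, False = it did not) to the counts."""
    for can_fly in sightings:
        if can_fly:
            flew += 1
        else:
            grounded += 1
    return flew, grounded

# Assumed prior, expressed as pseudo-counts worth 100 imaginary birds,
# 99.99 of which flew. This gives the 99.99 percent starting estimate.
flew, grounded = 99.99, 0.01
before = probability_of_flight(flew, grounded)

# Then the program meets a cassowary, a penguin, and a caged robin.
flew, grounded = observe(flew, grounded, [False, False, False])
after = probability_of_flight(flew, grounded)

print(f"{before:.4f} -> {after:.4f}")
```

[How quickly the estimate moves depends entirely on how many imaginary birds the prior is worth. I picked 100 arbitrarily. A real system like the one described would also track which kinds of birds are flightless, not just one overall number.]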