When I was on my honeymoon in 1984 with Alice, I took along Doug Hofstadter’s Gödel, Escher, Bach for beach reading. How cool is that? Changing your personal life and your professional life, all in 1.5 weeks of R&R close to Alexander Hamilton’s birthplace.
The essential idea of Gödel’s Theorem from 1931 is that any symbolic system sufficiently complex to describe ordinary arithmetic is either logically inconsistent or logically incomplete. In other words, there are grammatically correct sentences (symbol strings, logical statements) whose truth cannot be ascertained (“The color of money is A flat minor.”) or that cannot be evaluated in an internally consistent manner (“The next sentence is true. The previous sentence is false.”). Because language is a symbolic system and laws are expressed in language, laws, languages, and cultures are inevitably subject to illogic and inconsistency.
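For readers who want the formal version, the first incompleteness theorem is usually stated along the following lines (here T is any formal theory, and “effectively axiomatized” means its axioms can be listed by a mechanical procedure):

```latex
\text{If } T \text{ is consistent, effectively axiomatized, and strong enough to express basic arithmetic,}
\text{then there is a sentence } G_T \text{ such that } T \nvdash G_T \text{ and } T \nvdash \lnot G_T .
```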
When we founded SpectroClick, one of our goals was to place a virtual expert analytical chemist into our operating software, so that non-experts would always have the benefit of expert thinking in making measurements with our instruments. Thus, the customer would always get a recommendation for useful action without having to deal with numbers, measurement science, instrument engineering, and so on. Of course we knew that Gödel’s Theorem was lurking in the shadows, but it seemed so remote that we didn’t consider it to be limiting.
However, on January 7, 2019, a paper was published: Shai Ben-David, Pavel Hrubeš, Shay Moran, Amir Shpilka, and Amir Yehudayoff, “Learnability Can Be Undecidable,” Nature Machine Intelligence, 1, 44–48 (2019). The paper maps machine learning onto Gödel’s Theorem, showing that for certain sets of information, no sampling of that information can yield a model nuanced enough to describe the system’s entire behavior.
Let’s look at a couple of examples not described in the Ben-David paper. The first is a system of two isolated bodies (say, a star and a single planet orbiting that star) interacting only via gravity. Newton’s Law of Gravitation (or, where needed, Einstein’s General Relativity correction to Newton’s Law) can be applied to short-term data to predict the behavior of the system for all future time. That is, observing the system for a short time provides everything one needs to know to understand it for all time.
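To make that concrete, here is a minimal Python sketch (the numbers are illustrative, roughly Earth–Sun values, and the star’s mass is assumed known): a single observed position and velocity, fed into Newton’s law by way of the vis-viva equation and Kepler’s third law, fixes the orbit’s size and period for all future time.

```python
import numpy as np

# Gravitational parameter of the star, mu = G*M, in m^3/s^2 (value for the Sun).
MU = 1.32712440018e20

# One "snapshot" observation: position (m) and velocity (m/s) of the planet
# relative to the star. These are roughly Earth-like, purely illustrative numbers.
r_vec = np.array([1.496e11, 0.0, 0.0])
v_vec = np.array([0.0, 2.978e4, 0.0])

r = np.linalg.norm(r_vec)
v = np.linalg.norm(v_vec)

# Vis-viva: the specific orbital energy from this one snapshot fixes the
# semi-major axis a, and Kepler's third law then fixes the period forever.
a = 1.0 / (2.0 / r - v**2 / MU)            # semi-major axis (m)
period = 2.0 * np.pi * np.sqrt(a**3 / MU)  # orbital period (s)

print(f"semi-major axis: {a:.3e} m")
print(f"orbital period:  {period / 86400:.1f} days")  # ~365 days for these inputs
```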
In contrast, suppose the system we’re looking at is the behavior of people in Champaign, Illinois. We can talk to many of these people and develop a model of how each person behaves and how all the people we’ve met will behave. While that may provide a vague idea of how the other people in town will act, our model of human behavior cannot be extrapolated; we cannot predict how the people we haven’t spoken with will act. If our sample includes only people without steady jobs, we know nothing about how fully employed people will view reality. Conversely, if we talk only to people who live in McMansions whose mortgages are paid off, we have no idea how people with different economic means will view reality.
Aside: my humanities-degreed wife initially commented on the survey example, “the study was badly designed; biased sampling will always give a biased result.” The point of Ben-David et al. is that, in all but the simplest situations, there is always a bias if there is less than 100% sampling.
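Her point, and theirs, can be illustrated with a toy numerical sketch (the two groups, their sizes, and their “survey responses” below are invented for the illustration): no matter how large a sample we draw from only one subgroup, the estimate never converges to the town-wide truth.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy "town": two groups with genuinely different average responses to some
# survey question (arbitrary values on a 0-10 scale).
employed   = rng.normal(7.0, 1.0, size=80_000)
unemployed = rng.normal(4.0, 1.5, size=20_000)
town = np.concatenate([employed, unemployed])

# Sample only one subpopulation, however generously...
biased_sample = rng.choice(employed, size=5_000, replace=False)

print(f"true town average:      {town.mean():.2f}")
print(f"biased-sample estimate: {biased_sample.mean():.2f}")
# More sampling from the same subgroup never closes the gap; only widening
# the coverage of who gets sampled does.
```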
The Ben-David et al. paper addresses the theory of machine learning. Machine learning works by sampling a set of information, abstracting patterns in that information, and then extrapolating when presented with novel inputs. Because the information set is a statistical sampling of some part of the real world, any machine learning system will run into situations, per Gödel, that cannot be accurately evaluated. There are limits to machine learning and machine intelligence.
People can be modeled as one variety of learning machine, a neural network that learns from its environment. Separately, some day, I hope to explain why humans can never know everything: the amount of information each of us can absorb is bandwidth-limited. But if we are learning machines, then we too learn by sampling data from the real world, and the patterns we discern are limited in the manner Ben-David et al. recognize. Our models of the world are always incomplete and inconsistent.
And this brings us to SpectroClick and the codification of analytical chemistry and spectrometry. Our strategy for providing people with action recommendations rather than just raw scientific data presumes that we can codify each user’s problem and environment, and then develop a method to solve the problem in that environment. We would anticipate many (or all) of the complications one might encounter in carrying out the method, and make video instructions to show users how to proceed to a worthwhile answer. We can test our level of problem anticipation by having inexperienced users try our methods. But now Ben-David et al. essentially tell us that we cannot anticipate all the problems, because our sampling of customer behavior will be incomplete, as will the information available to our programs. Therefore, there will always be circumstances we either cannot sense or cannot anticipate.
Perhaps this even exposes what we mean by innovation. If we know the inputs and outputs for a particular situation, we have an algorithm, i.e., a rule that says: given such-and-such inputs, perform a fixed set of operations to produce an output. How do we judge whether an output is sensible or anomalous? By comparison to experience and expectation. What if an output is not sensible? At some point, an algorithm runs out of options and terminates, saying, “this doesn’t make sense and none of the tools to which I have access tells me what to do.”
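In code, that dead end looks something like the following toy sketch (the function name, the “fixed set of operations,” and the expected range are all invented for illustration):

```python
def analyze(sample_signal, expected_range=(0.0, 2.0)):
    """Toy decision procedure: apply a fixed set of operations, then judge the
    output against experience and expectation."""
    result = sum(sample_signal) / len(sample_signal)  # the "fixed set of operations"
    low, high = expected_range
    if low <= result <= high:
        return f"result {result:.2f} looks sensible; proceed"
    # The algorithm has run out of options: the output is anomalous and nothing
    # in its toolbox says what to do next. A human innovator takes over here.
    raise RuntimeError("this doesn't make sense and none of my tools tells me what to do")


print(analyze([0.9, 1.1, 1.0]))   # in range: returns a recommendation
# analyze([40.0, 52.0, 47.0])     # out of range: raises, i.e. the algorithm gives up
```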
But an innovator can say, “since we know the existing approaches fail, we need to see where the anomaly occurs, ask what could lead to that anomaly, and rethink the context of the problem.” The ability to jump outside the system constraints to form a supersystem (thank you, Doug Hofstadter!) is what an innovator does. From the broader perspective, additional refinement of the algorithm may be possible; the dead ends turn into branch points for additional activity. Ben-David et al. imply that one never gets to the point where any sufficiently complicated activity is fully containable within an algorithm. Humans interacting with the natural world is probably so complicated a system that the Ben-David et al. argument both describes the problem and enlightens us about the solution. SpectroClick may build ever more sophisticated subsystems that help a wider and wider range of people, but we will never get to the point that we function flawlessly for everyone. The question is: will we work well enough for enough people to become a thriving business? Stay tuned!