From Wittgenstein's Philosophy to ChatGPT

Why LLMs Matter

"The limits of my language mean the limits of my world." — Ludwig Wittgenstein

Language and human intelligence grow together; they are inseparable. If a foundation model can absorb information in all languages, it holds essentially all human knowledge. Put simply, there is no knowledge that cannot be expressed in language.

This is why large language models (LLMs) matter.

Prompt Learning and In-context Learning

Wittgenstein's philosophical view actually goes through two phases:

In the first phase, in his Tractatus Logico-Philosophicus, Wittgenstein explored how human beings communicate effectively. In his view, we use language to build pictures of facts in our minds and thereby describe the world. However, ambiguity easily arises because we hold different, or even incorrect, pictures in our minds, and eliminating those differences takes a long time. People often lack a clear and accurate mental picture of what they want to express, and so they say things that are meaningless or ambiguous. Hence, Wittgenstein took a negative view of natural language, pointing out that it is not precise enough to describe the world. He also believed that many things in the world are unspeakable, and that these things, such as ethics, aesthetics, and the logical form of the world, are the more meaningful ones. "Whereof one cannot speak, thereof one must be silent," the book concludes.

In the second phase, by contrast, Wittgenstein opposed his earlier ideas. In Philosophical Investigations, he points out that language is not only tied to the pictures in our minds but is also a universal tool for expressing ideas. We use language to play different games, describing both the objective world and the subjective world, such as emotion. The reason there are so many misunderstandings is that we do not know which games others are playing. For example, if a doctor hides a diagnosis from a cancer patient, we would say the doctor's words are inconsistent with the facts; yet given the context, this is reasonable.

So perhaps ChatGPT giving "nonsensical answers" is actually caused by a mismatch of context between ChatGPT and us? That is to say, we raise a question with an assumption in mind, but ChatGPT has never learned that assumption. The ideas of prompt learning, in-context learning, and reinforcement learning from human feedback (RLHF) in LLMs amount to training or steering ChatGPT by supplying exactly this missing context.
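
The context-mismatch point can be made concrete with a minimal sketch of prompt construction. Everything here is illustrative, not a real API: the `build_prompt` helper and the `context` field are assumptions used to show how an asker's hidden assumption can be made explicit in the input the model conditions on.

```python
from typing import Optional

# Illustrative sketch: the same question asked with and without the
# asker's hidden assumption spelled out as context in the prompt.

def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Compose a prompt; `context` states assumptions the asker holds."""
    if context is None:
        return f"Question: {question}\nAnswer:"
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

# Without context, the model must guess which "game" we are playing;
# with context, the assumption becomes part of the input it conditions on.
bare = build_prompt("Is it safe?")
grounded = build_prompt(
    "Is it safe?",
    context="'It' refers to swimming at this beach during a storm warning.",
)
```

In-context learning goes one step further than this sketch: instead of a single context line, the prompt carries worked examples of the game being played, and the model infers the pattern from them.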

Of course, when we humans communicate with each other, context is involved, including facial expressions and tone of voice. But our conversation with ChatGPT takes place in a purely symbolic world, which makes surfacing these contexts more complicated.

Can LLMs Do Inference? Token-based Expression and Probability Inference

Just a few weeks ago, I still thought that LLMs performed well only in a general sense, and that their analysis and inference capabilities remained insufficient. Recently, however, there have been some promising signs of artificial general intelligence (AGI):

(1) The difference between the two phases of Wittgenstein's thought is this: the first phase is rational constructivism, which hopes to build our language system, much as Chomsky's grammar does, so as to achieve precision and remove ambiguity; the second phase is empiricism, in which Wittgenstein holds that we understand the world by combining induction from facts with inference.

As ChatGPT now stands, Noam Chomsky, an American public intellectual known for his work in linguistics, political activism, and social criticism, sees the use of ChatGPT as "basically high-tech plagiarism" and "a way of avoiding learning."

(2) Personally, I think the reason ChatGPT shows poor inference capability in science and engineering is that its knowledge in these fields is not sufficiently tokenized. I believe that if we convert enough resources in these fields into tokens suitable for LLM learning, or tokenize the external interfaces of the relevant knowledge bases, ChatGPT will reach a high level of inference in these fields.
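
To see what "not tokenized enough" means, here is a toy greedy longest-match tokenizer, a drastic simplification of BPE-style tokenization; the vocabulary is invented for illustration. Text the vocabulary "knows" maps to a few whole-word tokens, while unfamiliar domain notation fragments into single characters, leaving the model little structure to learn from.

```python
# Toy greedy longest-match tokenizer (illustrative, not a real BPE).

def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedily match the longest known substring at each position,
    falling back to a single character when nothing in `vocab` matches."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest match first
            if text[i:j] in vocab or j == i + 1:  # 1-char fallback
                tokens.append(text[i:j])
                i = j
                break
    return tokens

vocab = {"the", "energy", "of", "a", "photon", " "}
print(tokenize("the energy of a photon", vocab))  # whole-word tokens
print(tokenize("E=hν", vocab))                    # fragments to characters
```

Real tokenizers behave analogously: prose that dominates the training corpus compresses into few tokens, while formulas, chemical notation, or code in a rare language shatter into fragments, which is one plausible reason domain inference suffers.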

(3) The inference capability of LLMs may differ slightly from inference as we usually understand it: LLM inference is probabilistic. Compared with traditional inference methods, this approach has its own advantages and disadvantages.

Renowned mathematician Terence Tao has used ChatGPT in his mathematical practice (see https://news.ycombinator.com/item?id=35538945).

ChatGPT's inference is more divergent; although uncertain, it may bring the creativity necessary to generate novel ideas.

Meanwhile, I personally think that because each step carries a probability, inference over multiple steps may magnify the error. This is why LLM practice emphasizes the chain of thought (step-by-step guidance).
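
The compounding-error intuition can be made concrete with a back-of-the-envelope model: if each reasoning step is independently correct with probability p, an entire k-step chain is correct with probability p^k, which decays quickly. The numbers below are illustrative, not measured.

```python
# If each inference step is correct with probability p (independently),
# the chance that an entire k-step chain is correct is p**k.
# Values of p and k here are illustrative.

def chain_success(p: float, k: int) -> float:
    """Probability that all k independent steps are correct."""
    return p ** k

print(chain_success(0.95, 1))                  # 0.95
print(round(chain_success(0.95, 10), 3))       # 0.599
print(round(chain_success(0.95, 30), 3))       # 0.215
```

On this reading, spelling out the chain of thought helps because it replaces one long, error-prone leap with short, checkable steps, keeping each step's p high, even though the total number of steps grows.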

What Is the Next Hop of LLMs?

LLMs have proved their success in the world of symbols. The next step is to model the physical world and figure out how to connect it to systems such as ChatGPT.