Formal reasoning in LLMs

Igor Zalutski
3 min read · Aug 4, 2024


This is my naive take after seeing this tweet by Linus, who works on AI at Notion. It got me thinking, I started typing a response, and it got out of hand. I have only a superficial understanding of LLMs and no credentials in the field whatsoever, so please don’t take it too seriously.

We humans acquire formal knowledge through learning, and it tends to be the hardest kind. It probably relies on the same mechanics as any other type of learning - presumably pattern recognition and prediction; but if so, how is precision achieved?

When you are working through a mathematical problem, your brain doesn't have access to any external source of definitive knowledge. And there's no perfect precision either; otherwise scientists wouldn't be able to routinely find mistakes in other scientists' papers.

The brain might be achieving "good enough" precision by doing what a tutor does to help a student who's stuck: what kind of equation is this? Have you seen this before? What known techniques might we try to simplify it? It's pure pattern matching, again imprecise, and the brain does it orders of magnitude faster when no talking is involved.

Curiously, a determined student is often able to unblock themselves by asking the same questions a tutor might ask inside their head - but in this case it'd still be orders of magnitude slower. This seems important: it suggests that the brain has two ways of working through formal problems: the "fast" way when the learned knowledge is readily available, and the "slow" way when it's not. (No relation to Kahneman's fast/slow; a curious coincidence nonetheless.)

Interestingly, the “slow” way of solving a problem (verbalising each step with a tutor or inside your head) and the learning process itself seem to be indistinguishable - work through many similar problems the “slow” way, and you acquire the ability to solve this class of problems the “fast” way without verbalising each step: “you just know”.

What if formal reasoning with LLMs could be achieved in a similar way? The “slow” way can be simulated via an agentic workflow that models a tutor-student conversation, taking baby steps until the solution is reached. Just like a tutor would, it cross-checks the validity of every step the student makes, and asks clarifying questions that might surface a potential solution in the student's head if they're stuck.
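To make this a bit more concrete, here is a minimal sketch of what such a tutor-student loop might look like. Everything in it is an assumption rather than an existing system: `call_llm` is a hypothetical stand-in for whatever model API is used, and the role prompts are placeholders.

```python
# A minimal sketch of the "baby steps" tutor-student loop.
# call_llm is a hypothetical placeholder, not a real API.

def call_llm(role: str, transcript: list[str]) -> str:
    """Placeholder: send the transcript to a model playing the given role."""
    raise NotImplementedError("wire this up to a real LLM API")

def solve_slowly(problem: str, max_steps: int = 1000) -> list[str]:
    """Work through a problem one tiny, cross-checked step at a time."""
    transcript = [f"Problem: {problem}"]
    for _ in range(max_steps):
        # The tutor asks one small guiding question about the current state.
        question = call_llm("tutor: ask one small guiding question", transcript)
        transcript.append(f"Tutor: {question}")

        # The student attempts exactly one small step in response.
        step = call_llm("student: take exactly one small step", transcript)
        transcript.append(f"Student: {step}")

        # The tutor cross-checks the step before the loop moves on.
        verdict = call_llm("tutor: reply VALID, INVALID or SOLVED", transcript)
        transcript.append(f"Check: {verdict}")
        if "SOLVED" in verdict:
            break
    return transcript
```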

This is, of course, slow. Each prompt has to be designed to make progress within one step; there could be multiple questions needed before progress is made; and there could be hundreds if not thousands of steps. So with the current speed of LLMs it is impractical.

But humans also resort to the “slow” way of reasoning only as a fallback; most formal reasoning done by humans is orders of magnitude faster because we've solved many similar problems before. What if LLMs can be trained similarly?

There probably isn't much training data out there that could teach LLMs how to work through problems step by step, but it doesn't matter. Just like with humans, knowing the language is sufficient! If we can achieve good enough precision of formal reasoning with the current generation of LLMs using the “baby steps” model - even if it's impractically slow, say solving a simple equation takes a week - we can then produce enough data to train the next LLM to be fast at these problems. And this might unlock using the “slow” way on the next class of problems that weren't feasible before because it'd be too slow. Just like humans learn!
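As a sketch of how that bootstrapping might look, the snippet below collects the slow transcripts into a training set. `is_correct` is a hypothetical answer checker and the JSONL format is just one plausible choice; this isn't an existing pipeline.

```python
import json

def is_correct(problem: str, transcript: list[str]) -> bool:
    """Placeholder: check the final answer, e.g. against a known solution or a verifier."""
    raise NotImplementedError

def harvest_traces(problems: list[str], out_path: str) -> None:
    """Run the slow loop on each problem and keep only the successful transcripts."""
    with open(out_path, "w") as f:
        for problem in problems:
            transcript = solve_slowly(problem)  # the "baby steps" loop sketched above
            if is_correct(problem, transcript):
                # Each successful transcript becomes one training example:
                # the problem as input, the full reasoning trace as the target.
                record = {"input": problem, "target": "\n".join(transcript)}
                f.write(json.dumps(record) + "\n")
```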

Training an LLM, then, is not the “creation of a brain” that does or doesn't have some capabilities. Rather, it's one evolutionary step in a learning journey. The brain isn't any single LLM - it's the entire lineage!

To get a model capable of advanced mathematics, we'd need to train multiple generations of them, starting with pre-school basics. Each generation can be stretched to the limits of the “slow” way using verbal reasoning, and the reasoning data generated in the process fed into training the next generation, which will be able to work through this level of problems quickly without resorting to baby steps.
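Put together, the generational curriculum might look like the loop below. `problems_at_level` and `train_from_scratch` are hypothetical helpers standing in for a problem source and a full training run; the point is only the shape of the process, each generation trained on the slow traces of the previous one.

```python
def problems_at_level(level: str) -> list[str]:
    """Placeholder: return problems of a given difficulty level."""
    raise NotImplementedError

def train_from_scratch(traces_path: str) -> None:
    """Placeholder: train the next-generation model on the collected traces."""
    raise NotImplementedError

def bootstrap_generations(levels: list[str]) -> None:
    """One generation per level, each trained on the previous generation's slow traces."""
    for gen, level in enumerate(levels):  # e.g. ["pre-school", "arithmetic", "algebra", ...]
        traces_path = f"traces_gen{gen}.jsonl"
        harvest_traces(problems_at_level(level), traces_path)  # stretch the current model the "slow" way
        train_from_scratch(traces_path)                        # the next generation learns to do it fast
```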

I doubt that fine-tuning would be enough here; more likely than not it'd be a from-scratch training run for each generation. For humans, mastering STEM is much harder than learning the language: it takes far more time and cognitive effort. That suggests the volume of training data needed for an LLM to become good at maths might dwarf the amount it needs to become good at language, which probably means training from scratch on a bigger data set, possibly even with more parameters.
