I'm curious about the limitations of teaching AI, particularly large language models (LLMs). Why don't we focus on making them 'think' more autonomously instead of relying on massive amounts of data for training? Is it feasible to train LLMs on reasoning skills alone, supplying facts through ultra-large context windows at inference time rather than having them memorize irrelevant details like historical dates? I'm looking for a deeper understanding of how we could achieve better instruction-following without the cost of processing vast amounts of training data.
5 Answers
It's interesting to note how much the human brain learns from raw sensory input long before it can articulate any concepts. Since we lack an equivalent input stream for AI, we have to lean on the internet's vast resources to give these models something to learn from. Without a fundamental understanding of how the brain works, we end up needing enormous volumes of training data just to shape any form of artificial intelligence.
That's a good point! But remember, if we could just make AI think without data, it wouldn't even be considered an LLM anymore; it would be something radically different. We're still in the discovery stage of AI learning.
It's a good question, but the reality is we don't fully understand how to make computers 'think' like humans do. We can replicate some kinds of intelligence through models, but they still need a ton of data to learn and grow. Sadly, without figuring out how human reasoning works, we won't be able to bypass all that training data anytime soon. Some researchers are exploring various approaches like hybrid models and self-learning, yet so far, Transformers seem to be the most effective.
To be fair, asking why we don't just teach them to think is a bit like asking why we haven't solved the hard problem of consciousness yet. There's a big gap between our understanding and what you're proposing, and without that knowledge, we're stuck with the current method of data-fed training.
Yeah, and the fact is that every time we try to go beyond that, it seems we need even more data to make meaningful progress.
Also, we should keep in mind that reasoning and learning are not quite the same. LLMs do manage to learn patterns, but they don't truly reason the way we do—they mostly extract statistical regularities from data and apply them to operate effectively.
Exactly! We haven't yet cracked the code on human cognition, so any attempts to mimic it in machines will always require extensive data as a foundation.