François Chollet: Why Scaling Alone Isn’t Enough for AGI
Takeaways
- Ndea is developing symbolic program synthesis, an alternative to deep learning, to build more efficient and generalizable AI models.
- Current LLMs thrive in domains with verifiable reward signals (e.g., coding, mathematics) but struggle with 'fuzzy' problems like essay writing.
- ARC AGI V3 measures 'agentic intelligence': an AI's ability to explore, set goals, and plan in novel, interactive environments, a key differentiator from V1 and V2.
- AGI is defined as human-level skill-acquisition efficiency across arbitrary tasks, distinct from merely automating economically valuable work.
- Chollet predicts AGI will arrive by the early 2030s, potentially with a core codebase under 10,000 lines, suggesting it is not inherently a scale problem.
- Alternative AI approaches such as genetic algorithms, if scaled with sufficient investment, could yield significant breakthroughs.
Insights
1. Ndea's Symbolic AI Paradigm
Ndea is pioneering a new branch of machine learning, symbolic program synthesis, as an alternative to deep learning. Instead of fitting parametric curves via gradient descent, Ndea uses 'symbolic descent' to find the simplest, most concise symbolic models that explain the data. This approach aims for much higher optimality: requiring less data, running more efficiently at inference, and generalizing and composing better.
Chollet states, 'we're replacing the parametric curve with a symbolic model that is meant to be as small as possible... we are building something that we call symbolic descent... giving you extremely concise symbolic models... you're going to need much less data... run much more efficiently... generalize much better and compose much better.'
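Chollet does not describe Ndea's machinery in detail, but the core idea of preferring the shortest symbolic model that explains the data can be sketched with a toy brute-force enumerator. Everything below (the tiny expression grammar, the depth bound, the exact-match criterion) is an illustrative assumption, not Ndea's actual system:

```python
# Toy symbolic model search: enumerate small arithmetic expressions over x,
# shortest first, and return the first one that exactly explains the data.
# Illustrative only -- Ndea's real method is unpublished and far more capable.

def candidate_expressions(max_depth):
    """Build expression strings up to a bounded nesting depth."""
    exprs = ["x", "1", "2", "3"]                    # atoms of the toy grammar
    for _ in range(max_depth):
        exprs += ["(%s %s %s)" % (a, op, b)
                  for a in exprs for op in "+-*" for b in exprs]
    # Shortest first: the MDL-style preference for concise models.
    return sorted(set(exprs), key=lambda e: (len(e), e))

def fit_symbolic(data, max_depth=2):
    """Return the shortest expression consistent with every (x, y) pair."""
    for expr in candidate_expressions(max_depth):
        if all(eval(expr, {"x": x}) == y for x, y in data):
            return expr
    return None

print(fit_symbolic([(0, 1), (1, 3), (2, 5), (3, 7)]))  # a concise form of 2*x + 1
```

The search recovers a tiny closed-form rule from four data points, whereas a parametric curve fit would need regularization and still would not produce an inspectable program.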
2. Verifiable Reward Signals Drive LLM Success
The recent success of coding agents and LLMs stems from their ability to operate in domains with 'verifiable reward signals,' such as code (unit tests, compilation) and mathematics (theorems, proofs). This allows models to generate their own training data through trial and error in post-training environments, leading to dense coverage of problem spaces and the development of execution models, mimicking how humans debug code.
Chollet explains, 'code provides you with a verifiable reward signal... any problem where the solutions you propose can be formally verified... can be fully automated with current technology... the big unlock is when people started creating this code-based like training environment... where the reward signal... is provided by things like unit tests.'
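The loop Chollet describes can be sketched as generate-and-verify, with unit tests standing in for the reward signal. Here `propose_candidates` is a hypothetical stub in place of an LLM sampler; only the pass/fail structure reflects the text:

```python
# Sketch of a verifiable-reward loop: candidates are proposed (here by a stub
# standing in for an LLM) and accepted only if they pass the unit tests.

def unit_tests(fn):
    """The verifiable reward signal: binary pass/fail, no human judgment."""
    try:
        return fn(2, 3) == 5 and fn(-1, 1) == 0 and fn(0, 0) == 0
    except Exception:
        return False

def propose_candidates():
    # Stand-in for model sampling: a mix of wrong and right implementations.
    yield lambda a, b: a - b          # fails the tests
    yield lambda a, b: a * b          # fails the tests
    yield lambda a, b: a + b          # passes: would become a training example

def search_with_verifier():
    for candidate in propose_candidates():
        if unit_tests(candidate):     # trial and error against the verifier
            return candidate
    return None

solution = search_with_verifier()
print(solution(10, 7))  # 17
```

Because the verifier is automatic, failed attempts cost nothing but compute, which is exactly what lets these systems generate their own training data at scale.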
3. ARC AGI V3 Measures Agentic Intelligence
ARC AGI V3 is designed to measure 'agentic intelligence,' focusing on an AI's ability to actively explore, acquire goals, plan, and execute in novel, interactive environments without explicit instructions. Unlike previous ARC versions that focused on passive modeling of given data, V3 challenges AI to gather its own data efficiently in 'mini video games' and match human-level efficiency, which current frontier models struggle with.
Chollet states, 'v3 is completely different. We are trying to measure agentic intelligence. So it's interactive, it's active like the data is not provided to you. You must go get it... it must figure out everything on its own via trial and error.' He adds, 'We're trying to look for AI systems that could match this efficiency [of humans].'
4. AGI: A Small Codebase with Large Knowledge
Chollet predicts that when AGI is achieved, its core 'fluid intelligence engine' will be a very small codebase, potentially less than 10,000 lines of code, occupying megabytes of space. This compact engine would operate on a much larger, self-improving knowledge base. He controversially suggests that if this core insight had been known in the 1980s, AGI could have been developed with the compute resources available then.
Chollet states, 'I think it's going to be a very, very small code base... on the order of megabytes... less than 10,000 lines of code, and if you had known about it back in the 1980s you could have done AGI back then.'
Bottom Line
The current LLM stack, while productive, is not optimal and will likely be replaced by more efficient AI architectures in the long term, trending towards optimality.
This suggests that significant long-term value in AI research lies in exploring foundational alternatives rather than solely building on existing LLM paradigms.
Invest in or research radically different machine learning substrates, like symbolic AI or scaled genetic algorithms, that prioritize efficiency and conciseness from first principles.
Human-engineered 'harnesses' are currently essential for LLMs to solve complex problems in verifiable domains, indicating a lack of true AGI.
While harnesses enable powerful task automation, they highlight that current AI cannot autonomously structure novel problems or devise solution strategies without human intervention.
Develop AI systems that can autonomously generate and refine their own problem-solving 'harnesses' or frameworks, reducing human dependency and advancing towards true fluid intelligence.
The ARC AGI benchmark series is a 'moving target' designed to continually identify and test the 'residual gap' between frontier AI capabilities and human learning efficiency.
This implies that achieving AGI is not about passing a single static test, but about continually closing the gap in learning and adaptation across increasingly complex and novel challenges.
Focus AI development on systems capable of continuous, curriculum-based learning and invention, as these are the next frontiers ARC AGI will target (V4 and V5).
Opportunities
Develop AI systems based on symbolic program synthesis.
Create a new machine learning engine that generates extremely concise symbolic models, requiring less data, running more efficiently, and generalizing better than current deep learning models. This could lead to highly optimized, low-resource AI solutions.
Scale up alternative AI approaches like genetic algorithms.
Invest significant compute and resources into developing and scaling genetic algorithms or other non-gradient descent based search methods. This could unlock new scientific discoveries and powerful automation capabilities in domains where search is optimal.
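As a reminder of what a non-gradient-descent search method looks like, here is a minimal genetic algorithm on the classic "onemax" problem (evolve a bit string toward all ones). It is purely illustrative; the population size, mutation rate, and fitness function are arbitrary choices, not anything from the episode:

```python
import random

# Minimal genetic algorithm: selection, single-point crossover, and mutation
# evolve bit strings toward the all-ones optimum. No gradients anywhere.

def fitness(bits):
    return sum(bits)                      # reward: number of 1s in the string

def evolve(n_bits=20, pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]  # selection: keep the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]     # single-point crossover
            i = rng.randrange(n_bits)
            child[i] ^= rng.random() < 0.2  # occasional bit-flip mutation
            children.append(child)
        pop = survivors + children        # elitism: best individuals persist
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))  # best fitness found (the optimum here is 20)
```

Scaling this pattern means bigger populations, richer genomes (programs rather than bit strings), and far more compute, which is the investment Chollet argues has never seriously been made.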
Build AI tools for 'verifiable domains' beyond coding.
Identify other domains that offer formally verifiable reward signals (e.g., formal mathematics, scientific discovery, complex system design) and develop AI agents that can fully automate tasks within them using current LLM-based techniques and iterative self-correction loops.
Key Concepts
Minimum Description Length Principle
The model of the data that is most likely to generalize is the shortest and simplest one. This principle underpins Ndea's symbolic approach, which aims for concise symbolic models rather than complex parametric curves.
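A back-of-the-envelope calculation makes the principle concrete. The bit costs below are arbitrary assumptions chosen for illustration: the total cost of a model is the bits needed to describe the model itself plus the bits for any data it leaves unexplained.

```python
# Toy MDL comparison: a short rule vs. a lookup table over the same data.
data = [(x, 2 * x + 1) for x in range(50)]

def total_cost(model_bits, unexplained_points):
    return model_bits + 32 * unexplained_points   # assume 32 bits per raw point

# Model A: the rule "y = 2*x + 1" -- a few symbols, say 64 bits, zero residual.
rule_cost = total_cost(64, 0)
# Model B: a lookup table memorizing all 50 pairs -- no rule, all data raw.
table_cost = total_cost(0, len(data))

print(rule_cost, table_cost)  # 64 1600 -- the concise rule wins
```

The rule is also the only model of the two that says anything about unseen inputs, which is why MDL ties conciseness to generalization.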
Intelligence vs. Knowledge Trade-off
Competency involves a balance between intelligence and knowledge. More knowledge or better training can compensate for less raw intelligence. Current AI models are becoming more useful through extensive training (knowledge) rather than increased fluid intelligence.
Science as Symbolic Compression
Science fundamentally involves compressing a large set of observations into a simple, elegant symbolic rule or equation. This process is analogous to Ndea's goal of finding the most compressive symbolic models of data, expressed in software form.
Lessons
- Learn domain expertise deeply to effectively leverage AI tools; AI progress empowers those who can apply it to specific fields.
- For new open-source projects, prioritize a simple, intuitive API, comprehensive documentation that teaches the domain, and active community building.
- When building an AI lab, focus on creating a 'compounding stack' with reusable foundations, ensuring that each new layer builds upon and enhances previous learnings, rather than constantly trying new, disconnected approaches.
Notable Moments
Chollet's shift from deep learning evangelist to symbolic AI proponent.
As the creator of Keras, a widely used deep learning framework, Chollet's pivot to symbolic AI underscores a fundamental belief in the limitations of current deep learning for true AGI, lending significant weight to alternative research paths.
The ARC AGI benchmark's role in signaling AI breakthroughs.
ARC AGI V1 signaled the advent of reasoning models, and V2 signaled agentic coding. This demonstrates the benchmark's effectiveness in identifying new AI capabilities beyond mere scaling, making V3 a critical indicator for future 'agentic intelligence' breakthroughs.
Quotes
"The model of the data that is most likely to generalize is the shortest, and I think you cannot find a model like this if you're doing parametric learning; you need to try symbolic."
"If you have a big idea and it has very low chance of success, but if it works, it's going to be big and no one else is going to be working on it... then you should try your chance."
"AGI is basically going to be a system that can approach any new problem, any new task, any new domain and make sense of it... with the same degree of efficiency as a human could."
"When it comes to competency, there's always a trade-off between intelligence and knowledge. If you have more knowledge, if you have better training, you need less intelligence to be competent."
"I do believe that when we create AGI, retrospectively it will turn out that it's a code base that's less than 10,000 lines of code, and that if you had known about it back in the 1980s, you could have done AGI back then."
"Science is fundamentally a symbolic compression process where you're looking at a big mess of observations... and you're compressing that down to a very simple symbolic rule."
"You're not going to stop AI progress. I think it's too late for that. And so the next question is: okay, AI progress is here... How do you make use of it? How do you leverage it? How do you ride the wave? That's the question to ask."