Y Combinator
March 27, 2026

François Chollet: Why Scaling Alone Isn’t Enough for AGI

Quick Read

François Chollet, creator of Keras and the ARC AGI benchmark, details his lab Ndea's pursuit of optimal, symbolic AI, arguing that current LLM scaling is insufficient for true AGI and predicting AGI by the early 2030s.
Current LLMs excel at tasks with 'verifiable reward signals' (like coding), but lack true fluid intelligence.
Ndea is building 'symbolic descent' to create optimal, concise AI models that generalize better and require less data.
AGI, defined as human-level skill acquisition efficiency, is projected for the early 2030s, potentially with a surprisingly small codebase.

Summary

François Chollet, founder of the ARC Prize and Ndea, discusses his lab's novel approach to AI research, focusing on symbolic program synthesis as an alternative to deep learning. He explains that Ndea aims to build a new machine learning branch that generates extremely concise, symbolic models, requiring less data, running more efficiently, and generalizing better than current parametric deep learning models. Chollet differentiates his definition of AGI (human-level skill acquisition efficiency) from the industry's common definition (automating economically valuable tasks), suggesting the latter is achievable sooner with current LLM technology, particularly in domains with verifiable reward signals like coding and mathematics. He introduces ARC AGI V3, a benchmark designed to measure 'agentic intelligence' – an AI's ability to actively explore, set goals, and plan in novel, interactive environments without prior instructions, which he believes current models struggle with. Chollet predicts AGI will arrive by the early 2030s and posits that the core AGI codebase could be surprisingly small, potentially less than 10,000 lines, operating on a large knowledge base.
This episode offers a critical perspective on the current AI trajectory, challenging the prevailing 'scaling laws' paradigm. Chollet's insights into symbolic AI, the limitations of LLMs for true fluid intelligence, and the design of benchmarks like ARC AGI V3 provide a roadmap for understanding the next frontiers of AI development. His predictions about AGI's timeline and the potential for surprisingly compact AGI solutions could reshape research and investment strategies, highlighting opportunities for alternative approaches beyond the dominant deep learning stack.

Takeaways

  • Ndea is developing symbolic program synthesis, an alternative to deep learning, to build more efficient and generalizable AI models.
  • Current LLMs thrive in domains with verifiable reward signals (e.g., coding, mathematics) but struggle with 'fuzzy' problems like essay writing.
  • ARC AGI V3 measures 'agentic intelligence' – an AI's ability to explore, set goals, and plan in novel, interactive environments, a key differentiator from V1 and V2.
  • AGI is defined as human-level skill acquisition efficiency across arbitrary tasks, distinct from merely automating economically valuable work.
  • Chollet predicts AGI will arrive by the early 2030s, potentially with a core codebase under 10,000 lines, suggesting it's not inherently a scale problem.
  • Alternative AI approaches like genetic algorithms, if scaled with sufficient investment, could yield significant breakthroughs.

Insights

1. Ndea's Symbolic AI Paradigm

Ndea is pioneering a new branch of machine learning, symbolic program synthesis, as an alternative to deep learning. Instead of fitting parametric curves via gradient descent, Ndea uses 'symbolic descent' to find the simplest, most concise symbolic models that explain the data. This approach aims for much higher optimality: requiring less data, running more efficiently at inference, and generalizing and composing better.

Chollet states, 'we're replacing the parametric curve with a symbolic model that is meant to be as small as possible... we are building something that we call symbolic descent... giving you extremely concise symbolic models... you're going to need much less data... run much more efficiently... generalize much better and compose much better.'
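The contrast with curve-fitting can be made concrete. Below is a minimal, illustrative sketch of shortest-first symbolic search over a toy arithmetic grammar (Ndea's actual method is unpublished; the grammar, function name, and search strategy here are all assumptions for illustration): enumerate candidate expressions in order of size and return the shortest one that explains the examples exactly.

```python
import itertools

def synthesize(examples, max_depth=2):
    """Return the shortest expression string fitting all (x, y) pairs.

    Toy grammar: terminals {x, 1, 2} combined with + and *. Shortest-first
    search is a crude stand-in for the minimum-description-length bias.
    """
    def grow(depth):
        if depth == 0:
            return ["x", "1", "2"]
        subs = grow(depth - 1)
        exprs = list(subs)
        for a, b in itertools.product(subs, repeat=2):
            exprs.append(f"({a}+{b})")
            exprs.append(f"({a}*{b})")
        return exprs

    for expr in sorted(set(grow(max_depth)), key=len):
        # eval is acceptable for a toy grammar; a real system would
        # interpret an AST rather than strings.
        if all(eval(expr, {"x": x}) == y for x, y in examples):
            return expr
    return None

# Recovers a model equivalent to 2*x + 1 from just three observations.
model = synthesize([(0, 1), (1, 3), (2, 5)])
print(model)
```

Note how little data the search needs: three points suffice because the shortest-first bias rules out every more complex hypothesis that also fits.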

2. Verifiable Reward Signals Drive LLM Success

The recent success of coding agents and LLMs stems from their ability to operate in domains with 'verifiable reward signals,' such as code (unit tests, compilation) and mathematics (theorems, proofs). This allows models to generate their own training data through trial and error in post-training environments, leading to dense coverage of problem spaces and the development of execution models, mimicking how humans debug code.

Chollet explains, 'code provides you with a verifiable reward signal... any problem where the solutions you propose can be formally verified... can be fully automated with current technology... the big unlock is when people started creating this code-based like training environment... where the reward signal... is provided by things like unit tests.'
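The mechanism is easy to sketch. In the hypothetical loop below, candidate solutions are scored purely by how many unit tests they pass, so the signal requires no human labeling; this is an illustration of the idea, not any lab's actual training code, and all names are invented.

```python
def reward(candidate_fn, test_cases):
    """Verifiable reward: fraction of (args, expected) pairs passed."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply earns no reward
    return passed / len(test_cases)

# The target behavior is specified only by tests, as with coding agents.
tests = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
candidates = [lambda a, b: a * b, lambda a, b: a + b]
best = max(candidates, key=lambda f: reward(f, tests))
print(reward(best, tests))  # the correct candidate scores 1.0
```

Because the reward is computed mechanically, a model can propose thousands of candidates, keep the high scorers, and train on them, which is the self-generated-data loop Chollet describes.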

3. ARC AGI V3 Measures Agentic Intelligence

ARC AGI V3 is designed to measure 'agentic intelligence,' focusing on an AI's ability to actively explore, acquire goals, plan, and execute in novel, interactive environments without explicit instructions. Unlike previous ARC versions that focused on passive modeling of given data, V3 challenges AI to gather its own data efficiently in 'mini video games' and match human-level efficiency, which current frontier models struggle with.

Chollet states, 'v3 is completely different. We are trying to measure agentic intelligence. So it's interactive, it's active like the data is not provided to you. You must go get it... it must figure out everything on its own via trial and error.' He adds, 'We're trying to look for AI systems that could match this efficiency [of humans].'

4. AGI: A Small Codebase with Large Knowledge

Chollet predicts that when AGI is achieved, its core 'fluid intelligence engine' will be a very small codebase, potentially less than 10,000 lines of code, occupying megabytes of space. This compact engine would operate on a much larger, self-improving knowledge base. He controversially suggests that if this core insight had been known in the 1980s, AGI could have been developed with the compute resources available then.

Chollet states, 'I think it's going to be a very, very small code base... on the order of megabytes... less than 10,000 lines of code, and if you had known about it back in the 1980s you could have done AGI back then.'

Bottom Line

The current LLM stack, while productive, is not optimal; in the long term it will likely be replaced by more efficient AI architectures as the field trends toward optimality.

So What?

This suggests that significant long-term value in AI research lies in exploring foundational alternatives rather than solely building on existing LLM paradigms.

Impact

Invest in or research radically different machine learning substrates, like symbolic AI or scaled genetic algorithms, that prioritize efficiency and conciseness from first principles.

Human-engineered 'harnesses' are currently essential for LLMs to solve complex problems in verifiable domains, indicating a lack of true AGI.

So What?

While harnesses enable powerful task automation, they highlight that current AI cannot autonomously structure novel problems or devise solution strategies without human intervention.

Impact

Develop AI systems that can autonomously generate and refine their own problem-solving 'harnesses' or frameworks, reducing human dependency and advancing towards true fluid intelligence.

The ARC AGI benchmark series is a 'moving target' designed to continually identify and test the 'residual gap' between frontier AI capabilities and human learning efficiency.

So What?

This implies that achieving AGI is not about passing a single static test, but about continually closing the gap in learning and adaptation across increasingly complex and novel challenges.

Impact

Focus AI development on systems capable of continuous, curriculum-based learning and invention, as these are the next frontiers ARC AGI will target (V4 and V5).

Opportunities

Develop AI systems based on symbolic program synthesis.

Create a new machine learning engine that generates extremely concise symbolic models, requiring less data, running more efficiently, and generalizing better than current deep learning models. This could lead to highly optimized, low-resource AI solutions.

Source: Ndea's research

Scale up alternative AI approaches like genetic algorithms.

Invest significant compute and resources into developing and scaling genetic algorithms or other non-gradient descent based search methods. This could unlock new scientific discoveries and powerful automation capabilities in domains where search is optimal.

Source: Chollet's suggestion for alternative approaches
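As a concrete reference point, here is a minimal genetic algorithm on the classic OneMax problem (evolve a bit-string toward all ones). It is a toy illustration of the non-gradient, population-based search Chollet alludes to, not a claim about what a scaled-up system would look like; all parameters are arbitrary.

```python
import random

def evolve(length=20, pop_size=30, generations=60, seed=0):
    """Evolve bit-strings toward all ones via selection, crossover, mutation."""
    rng = random.Random(seed)
    fitness = sum  # count of 1-bits; the optimum is the all-ones string
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)      # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(length)           # occasional point mutation
            child[i] ^= rng.random() < 0.5
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = evolve()
print(sum(best), "of 20 bits set")
```

The loop uses no gradients at all: improvement comes entirely from selection pressure over a population, which is why such methods remain candidates for problems where gradient descent has no purchase.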

Build AI tools for 'verifiable domains' beyond coding.

Identify other domains that offer formally verifiable reward signals (e.g., formal mathematics, scientific discovery, complex system design) and develop AI agents that can fully automate tasks within them using current LLM-based techniques and iterative self-correction loops.

Source: Success of coding agents and mathematics as a primed domain

Key Concepts

Minimum Description Length Principle

The model of the data most likely to generalize is the shortest and simplest one. This principle underpins Ndea's symbolic approach, which aims for concise symbolic models rather than complex parametric curves.
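In its standard two-part formulation (the textbook form of the principle, not a quote from the episode), MDL selects the model $M$ that minimizes the combined cost of describing the model and of describing the data given the model:

```latex
M^{*} = \arg\min_{M} \bigl[\, L(M) + L(D \mid M) \,\bigr]
```

where $L(M)$ is the length in bits of the model's description and $L(D \mid M)$ is the length of the data encoded with the model's help; a concise symbolic program keeps $L(M)$ small, while a fit that explains the data well keeps $L(D \mid M)$ small.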

Intelligence vs. Knowledge Trade-off

Competency involves a balance between intelligence and knowledge. More knowledge or better training can compensate for less raw intelligence. Current AI models are becoming more useful through extensive training (knowledge) rather than increased fluid intelligence.

Science as Symbolic Compression

Science fundamentally involves compressing a large set of observations into a simple, elegant symbolic rule or equation. This process is analogous to Ndea's goal of finding the most compressive symbolic models of data in software form.

Lessons

  • Learn domain expertise deeply to effectively leverage AI tools; AI progress empowers those who can apply it to specific fields.
  • For new open-source projects, prioritize a simple, intuitive API, comprehensive documentation that teaches the domain, and active community building.
  • When building an AI lab, focus on creating a 'compounding stack' with reusable foundations, ensuring that each new layer builds upon and enhances previous learnings, rather than constantly trying new, disconnected approaches.

Notable Moments

Chollet's shift from deep learning evangelist to symbolic AI proponent.

As the creator of Keras, a widely used deep learning framework, Chollet's pivot to symbolic AI underscores a fundamental belief in the limitations of current deep learning for true AGI, lending significant weight to alternative research paths.

The ARC AGI benchmark's role in signaling AI breakthroughs.

ARC AGI V1 signaled the advent of reasoning models, and V2 signaled agentic coding. This demonstrates the benchmark's effectiveness in identifying new AI capabilities beyond mere scaling, making V3 a critical indicator for future 'agentic intelligence' breakthroughs.

Quotes


"The model of the data that is most likely to generalize is the shortest, and I think you cannot find a model like this if you're doing parametric learning; you need to try symbolic."

François Chollet

"If you have a big idea and it has very low chance of success, but if it works, it's going to be big and no one else is going to be working on it... then you should try your chance."

François Chollet

"AGI is basically going to be a system that can approach any new problem, any new task, any new domain and make sense of it... with the same degree of efficiency as a human could."

François Chollet

"When it comes to competency, there's always a trade-off between intelligence and knowledge. If you have more knowledge, if you have better training, you need less intelligence to be competent."

François Chollet

"I do believe that, you know, when we create AGI, retrospectively it will turn out that it's a codebase that's less than 10,000 lines of code, and that if you had known about it back in the 1980s you could have done AGI back then."

François Chollet

"Science is fundamentally a symbolic compression process where you're looking at a big mess of observations... and you're compressing that down to a very simple symbolic rule."

François Chollet

"You're not going to stop AI progress. I think it's too late for that. And so the next question is okay like AI progress is here... How do you make use of it? How do you leverage? How do you ride the wave? That's the question to ask."

François Chollet
