Why causal inference, not scale, will decide AI’s scientific future
A top computer scientist argues that without causal reasoning and theory, AI risks making confident but dangerous mistakes in science

Artificial intelligence (AI) has become extraordinarily good at finding patterns, but pattern recognition alone may be the wrong benchmark for judging progress. As AI systems move from prediction into intervention, from recommending content to shaping medicine, climate policy, and scientific discovery, a deeper problem is coming into focus: today’s dominant models largely learn correlation, not causation.
That limitation matters because the most consequential uses of AI are not descriptive but prescriptive. In medicine, climate science, and materials research, the cost of acting on the wrong inference can be severe. Without understanding what causes what, even highly accurate predictions can lead to harmful decisions.
“AI might just be noticing correlation, but not causation,” said Jennifer Chayes, dean of the College of Computing, Data Science, and Society at the University of California, Berkeley, in an interview with TechJournal.uk in Hong Kong. “Whenever you want to do an intervention for a human being, therapeutically, or for something in climate change, you need to know about causation.”
As large models become more capable, this distinction has become easier to overlook. Systems optimised to maximise predictive accuracy can appear reliable while still producing misleading guidance once deployed in complex, real-world settings.
Correlation traps
In traditional statistical modelling and econometric research, regression analysis is used to estimate the relationship between a dependent variable and one or more independent variables. Such methods are designed to test assumptions, control for confounding factors, and distinguish correlation from causal effect, often through theory-driven model design, robustness checks, or quasi-experimental techniques.
The challenge becomes acute when AI systems are asked to guide real-world interventions. Models can identify relationships in data without understanding underlying mechanisms, a gap that becomes dangerous when decisions affect human health or environmental systems.
A familiar example illustrates the risk.
“Many people drink coffee and smoke. That does not mean drinking coffee causes lung cancer,” Chayes said. “It means drinking coffee is correlated with something that causes lung cancer.”
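To make the trap concrete, consider a minimal simulation, not drawn from the interview, in which smoking drives both coffee drinking and cancer risk while coffee itself has no effect. The variable names and effect sizes below are illustrative assumptions; a naive regression of risk on coffee finds a strong association, and adjusting for the confounder makes that association vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (illustrative numbers):
# smoking causes both coffee drinking and cancer risk;
# coffee itself has no causal effect on risk.
smoking = rng.binomial(1, 0.3, n).astype(float)
coffee = 2.0 * smoking + rng.normal(0.0, 1.0, n)
risk = 3.0 * smoking + rng.normal(0.0, 1.0, n)

def ols(columns, y):
    """Least-squares fit with an intercept; returns coefficients."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Naive model, risk ~ coffee: coffee soaks up smoking's effect,
# so its coefficient comes out strongly positive.
print(ols([coffee], risk))

# Adjusted model, risk ~ coffee + smoking: controlling for the
# confounder drives the coffee coefficient to roughly zero.
print(ols([coffee, smoking], risk))
```

The naive model is a perfectly good predictor, which is exactly the danger: it would still recommend the wrong intervention.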
This distinction, long central to statistics and econometrics, is becoming unavoidable for AI. As systems move beyond classification and prediction toward decision-making, the inability to distinguish cause from coincidence turns into a structural weakness rather than a philosophical concern.
“There are deep questions in the theory of computer science and statistics that underlie all of this, and nobody knows exactly what to do yet,” Chayes said.
Statistics and discovery
Progress, in this view, will not come from scaling existing architectures alone. It will require renewed attention to statistical machine learning, causal inference, and theory-driven modelling, particularly in domains where data is sparse and experimentation is costly.
These concerns are being explored internationally, including in China, through work supported by Tianqiao Chen, founder of Shanda Group and co-founder of the Tianqiao and Chrissy Chen Institute (TCCI). Through TCCI, Chen has promoted what he describes as discoverative intelligence — an approach that frames AI not as a faster pattern recogniser, but as a system capable of forming hypotheses, constructing explanatory world models, and revising beliefs through interaction with reality.
In that framework, many claims of AI “discovery” fall short because they remain confined to extrapolation within known statistical or energy functions. Genuine discovery requires falsifiability, theory formation, and causal structure — precisely the gaps identified in today’s dominant models.
Chayes has worked closely with Chen and his institute, including co-organising conferences on AI for science. She has argued that sparse data, statistical structure, and causation are inseparable from AI’s role in advancing human knowledge rather than merely accelerating search.
Separately, these ideas intersect with work by Fei-Fei Li, professor of computer science at Stanford University and a leading researcher in computer vision and human-centred AI. Li has emphasised the need for world models and trustworthy systems that go beyond surface-level pattern recognition.
Chayes and Li were jointly appointed by the state of California to help draft principles for governing generative AI, focusing on auditability, safety, and accountability as models move into high-stakes domains.
Chayes has also accepted an invitation from Tony Chan, the former president of King Abdullah University of Science and Technology, to chair the selection committee for the Shaw Prize’s newly established computer science category, expanding the prize beyond its existing categories of astronomy, life science and medicine, and mathematical sciences.
The committee brings together senior figures in computing, including John Hennessy, chairman of Alphabet Inc and former president of Stanford University; Yann LeCun, professor at New York University and former chief scientist of Meta AI; and Harry Shum, former executive vice president of Microsoft’s AI and Research group.
Scientific discovery highlights the limits of scale particularly clearly.
“In scientific domains, you do not have enough data,” Chayes said. “We are never going to have enough experimental data to teach a foundation model the way we have taught language.”
The combinatorial space of materials and molecular structures dwarfs available data.
“There are all these elements and three-dimensional structures, and we will never have enough data to cover that space. This is a very deep question for computer science,” she said. “This is essentially a mathematical question of how you incorporate sparse data so that discovery can actually happen.”
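A back-of-the-envelope count makes the point; the figures below are illustrative assumptions rather than numbers from the interview. Even picking a few distinct elements, before any choice of stoichiometry or three-dimensional structure, the number of combinations explodes far past the size of any experimental dataset.

```python
from math import comb

# Unordered choices of k distinct elements from roughly 90
# practically usable elements of the periodic table.
ELEMENTS = 90
for k in (3, 4, 5):
    print(k, comb(ELEMENTS, k))
# k=3 ->     117,480
# k=4 ->   2,555,190
# k=5 ->  43,949,268
# Each composition then admits many stoichiometries and crystal
# structures, so the real design space is vastly larger still,
# while curated experimental datasets are orders of magnitude smaller.
```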
From theory to materials
Those theoretical questions are already shaping applied research Chayes is directly involved in. In recent years, she has worked on using generative AI to accelerate materials discovery for carbon capture and atmospheric water harvesting, in close collaboration with chemists and materials scientists.
A central partner in that work is Omar Yaghi, professor of chemistry at the University of California, Berkeley, and a pioneer of metal–organic frameworks (MOFs), highly porous crystalline materials designed to selectively bind specific molecules.
Generative AI is used to guide the design and screening of new MOFs, compressing discovery timelines from years to weeks.
“When a chemist gets an idea for a new material, it can take two to three years,” Chayes said. “Using generative AI, we took that down to two weeks.”
The same platform is being applied to materials that bind carbon dioxide efficiently and release it under controlled conditions, a prerequisite for scalable carbon capture. Rather than relying on massive datasets, the models are constrained by physical chemistry, structural rules, and sparse experimental results.
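The article does not describe the pipeline’s internals, but the constraint-first approach it describes can be sketched as a generate-filter-rank loop. Everything below is a hypothetical stand-in: propose_candidates, satisfies_chemistry, and score_binding are invented placeholders, not the Berkeley group’s code. The point is the shape of the loop, in which hard domain constraints prune the generative model’s output before a scorer ranks what survives.

```python
import random
from typing import List

def propose_candidates(n: int) -> List[str]:
    """Stand-in for a generative model emitting candidate MOF specs."""
    return [f"mof-{random.randrange(10_000)}" for _ in range(n)]

def satisfies_chemistry(mof: str) -> bool:
    """Stand-in for structural and physical-chemistry validity rules."""
    return hash(mof) % 3 != 0  # placeholder filter

def score_binding(mof: str) -> float:
    """Stand-in for a predicted CO2 binding score."""
    return random.random()

def screen(n_proposals: int, top_k: int) -> List[str]:
    # Generate, filter by hard domain constraints, then rank:
    # the constraints stand in for theory where data is sparse.
    candidates = propose_candidates(n_proposals)
    valid = [m for m in candidates if satisfies_chemistry(m)]
    return sorted(valid, key=score_binding, reverse=True)[:top_k]

print(screen(1_000, 5))  # shortlist for experimental validation
```

The design choice worth noticing is that validity is enforced as a hard filter rather than learned from data, which is how sparse experimental results can still steer a generative model.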
“One gram of these materials has as much surface area as a football field,” she said.
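That figure is consistent with a quick sanity check using standard literature values (the numbers below are mine, not the article’s): record MOFs have measured BET surface areas of several thousand square metres per gram, and a full-size football pitch covers roughly 7,000 square metres.

```python
# Sanity check with standard literature figures (illustrative,
# not quoted in the article): high-porosity MOFs reach BET
# surface areas of roughly 6,000-7,800 m^2 per gram.
pitch_area_m2 = 105 * 68             # full-size pitch, ~7,140 m^2
mof_area_m2_per_gram = 7_000         # order of magnitude for record MOFs
print(mof_area_m2_per_gram / pitch_area_m2)  # ~0.98 pitches per gram
```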
Related work has demonstrated atmospheric water harvesting in desert environments, where the materials extract moisture from the air overnight and release it during the day using only ambient heat.
“They can be designed to pull carbon dioxide out of the atmosphere and then release it under slightly different conditions,” she said. “In Death Valley, these materials pulled water out of the air at night and released it during the day using only ambient heat.”
For Chayes, the significance of the project extends beyond climate applications. It provides a concrete example of how AI can accelerate discovery without brute-force scale, instead combining theory, causal reasoning, and domain constraints.
Foundations before frontiers
Chayes has long argued that advances in AI are inseparable from advances in systems and architectures, not just algorithms. Much of her own work has focused on the less visible layers of computing that make modern AI possible, including system design and verification.
She has worked closely with Fei-Fei Li on questions of how AI systems should be built to reason about the world rather than merely predict outputs. That work has emphasised world-model-driven approaches and the infrastructure required to support them reliably.
Modern AI systems run on specialised hardware such as graphics processing units (GPUs) and tensor processing units (TPUs). Chayes has stressed that these chips matter not simply because they are fast, but because they must behave correctly at scale — a challenge rooted in systems design and formal verification.
“You hear about AI, but AI would not exist without GPUs, and GPUs would not exist without verification systems,” she said. “These are deep questions in computer science that people do not hear about as much as AI, but without them, none of this works.”
As AI systems are increasingly deployed in medicine, climate science, and scientific discovery, progress will depend less on novelty and more on intellectual discipline. Without causal reasoning, statistical rigour, and verified systems, powerful models risk becoming confident amplifiers of error.
The next phase of AI is likely to be defined not by how much data and compute systems consume, but by whether those systems are built to reason about the world they act upon.


