A Brief History of Causality from Homo Sapiens to AGI

Introduction
Imagine a future where large language models (LLMs) evolve into Artificial General Intelligence (AGI), or ‘Strong AI.’ These AGI systems, equipped with the same causal reasoning tools nature has endowed us with — far surpassing today’s LLMs — could ascend the causal hierarchy outlined by Judea Pearl (2018). Pearl’s framework categorizes reasoning into three levels: association (recognizing patterns), intervention (altering outcomes), and counterfactual reasoning (imagining alternative realities).
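In Pearl’s notation, each of these levels corresponds to a characteristic kind of query. The compact summary below is the standard textbook formulation (Pearl & Mackenzie, 2018), added here only for orientation; it is not a formula from this article:

```latex
\begin{aligned}
\textbf{Association (seeing):}       \quad & P(y \mid x)          &\quad& \text{how likely is $y$, given that we observe $x$?} \\
\textbf{Intervention (doing):}       \quad & P(y \mid do(x))      &\quad& \text{how likely is $y$ if we make $x$ happen?} \\
\textbf{Counterfactual (imagining):} \quad & P(y_x \mid x', y')   &\quad& \text{would $y$ have occurred had $x$ been different, given what actually happened?}
\end{aligned}
```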
At the pinnacle of this hierarchy, future AGI systems might not only analyze and question but also speculate, blurring the line between computation and human understanding. Pearl critiques the notion that current LLMs can truly ‘understand.’ He argues that today’s machine-learning models, particularly LLMs, remain confined to the lowest level of reasoning — mere ‘seeing’ — without grasping the deeper causal mechanisms that future AGI might possess. As a result, these models become ‘parrots’ of causal facts, repeating patterns without genuine understanding (Zečević et al., 2023).
What does it mean to truly ‘understand’? In everyday language, only ‘the mind’ is said to understand, suggesting that understanding stems from internal cognition — what we might call ‘mind over data.’ Conversely, ‘data over mind’ suggests that understanding derives from external observation and data patterns. This distinction forms a crucial part of the debate in AI research, particularly when asking whether AGI can move beyond statistical associations to develop true causal reasoning.
This article examines the evolving debates on machine understanding and causality, uncovering their implications for AGI development and our self-understanding. By connecting philosophy, metaphysics, and computation, it provides a distinctive view compared to Harari’s (2015) anthropological and historical perspective or Bennett’s (2023) focus on cognitive and technological breakthroughs in intelligence.
Interested readers can explore these complex debates further by examining the topics of Can Machines ‘Understand’ like Us? and Can Machines Develop Morality?
The Timeless Causality Debate
Causality defines humanity from survival to scientific discovery. The nature of causality, however, has sparked debate since ancient Greek philosophy and continues to be a central issue in today’s AI research. This debate revolves around several key questions:
- Is there a cause behind every effect? Could an AGI, while tracing chains of causation, distinguish between meaningful events and mere coincidences? Or would it regard all events as random occurrences without deeper significance?
- Do we uncover causes from data or impose our ‘mental models’ onto the data? And if so, how can we create an AGI capable of independently developing or possessing its own ‘mental models’ to understand causality?
These questions lead us to deeper metaphysical issues: If we can create a machine’s mind, what exactly is ours?
The Origins of Causality
Historian Yuval Noah Harari (2015) notes that humanity’s capacity for counterfactual thinking — at the top of the causal hierarchy — was crucial for our hunter-gatherer ancestors. Early humans had to constantly assess their environment, asking ‘what if’ questions to anticipate dangers, locate food, and navigate social dynamics. This ability to reason about cause and effect was vital for thriving in uncertain environments, enabling them to learn from mistakes and adapt strategies for survival.
Once these survival skills were mastered, humans applied counterfactual thinking to develop shared beliefs, religious doctrines, and ideological frameworks that fostered larger communities. This reasoning was foundational for societal development, underpinning human cooperation and cultural evolution.
Judea Pearl (2018) similarly emphasizes the role of counterfactual thinking in scientific discovery. He contrasts the Babylonians, who excelled at prediction, with the Greeks, who went beyond prediction to explore why events occur. Like our early ancestors, the Greeks often engaged in counterfactual thinking, sometimes in ways considered unscientific by today’s standards. Yet, this speculative approach advanced inquiry beyond curve-fitting, investigating the underlying causes of events.
However, for our causal reasoning tools to be effective, the universe itself must be causal; otherwise, our reasoning methods — whether ancient or modern — would be inadequate for understanding, predicting, and influencing its behavior. The very effectiveness of our causal tools depends on the assumption that the universe operates according to causal principles. But is it?
Plato: The Shadow of Causality and Mind over Data
This question brings us back to the ancient Greeks, whose ideas on causality and understanding continue to resonate. Plato and Aristotle presented contrasting views. Plato argued that true causes reside in ideal forms beyond sensory experience. In his Allegory of the Cave, he compares human perception to prisoners who see only shadows of reality on a cave wall. For Plato, the data we observe are like these shadows, with true causes lying in an abstract realm accessible only through the mind — aligning with the ‘mind over data’ concept. Plato’s perspective suggests that true understanding requires transcending sensory data to access higher forms of knowledge. ‘Mind over data’ would later regain prominence after periods when the balance had shifted too heavily toward ‘data over mind.’
Modern AI researchers, such as Huh et al. (2024), hypothesize that models trained on different objectives may converge on a shared representation of reality, much like Plato’s ideal forms — an abstract and unified understanding beyond surface data. Critics of LLMs such as Zečević et al. (2023) argue, from a Platonic viewpoint, that these models capture only surface correlations, akin to Plato’s shadows, and fail to grasp the deeper causal relationships underlying the data.
Revisiting these classical ideas reveals how they continue to influence contemporary debates on causality and understanding.
Aristotle: The Distinction of Purpose and Mechanism
In contrast to Plato, Aristotle adopted an empirical, data-driven approach. His theory of the four causes — material, formal, efficient, and final — provides a comprehensive framework for understanding the world through observation. For Aristotle, causality is intrinsic to nature. The efficient cause explains how something happens, while the final cause addresses why — its purpose.
The distinction between efficient and final causes is crucial. It separates the quest for mechanism from purpose, allowing scientists to explore natural phenomena while holding various philosophical or religious beliefs.
Although material and formal causes may not feature as prominently in modern scientific inquiry, Aristotle’s brilliance lies in categorizing causes, which remains a fundamental tool for understanding different dimensions of causality. His ability to recognize the diverse aspects of causation has provided a lasting contribution to philosophy and science.
In modern causal models, particularly in scientific and machine learning contexts, the focus is almost exclusively on efficient causes — the mechanisms by which events occur. This reductionist approach has been highly effective in explaining natural phenomena, as it isolates the direct, observable relationships between cause and effect without delving into the underlying purpose or intent behind actions. However, in fields such as criminal legal analysis, where AGI might one day be deployed, final causes — the reasons or purposes behind human actions — could become essential. For instance, understanding why someone committed a crime goes beyond mere observation of actions (efficient causes) and requires insight into motivations, intentions, and moral reasoning. In such cases, AGI systems would need to integrate final causes to model not only how events unfold but also the deeper rationale behind human behavior, reflecting a more comprehensive causal analysis.
From Augustine to Aquinas: Divine Mind and God’s Data
Plato’s concept of ultimate causality deeply influenced early Christian thought, mainly through Augustine, who placed Plato’s eternal truths in God’s mind. For Augustine and other early Christian thinkers, Plato’s philosophy, emphasizing an immaterial realm of ideal forms, provided a framework for understanding divine knowledge, the soul, and eternal salvation. This allowed Christianity to merge Platonic ideas with theology, framing God as the source of existence and ultimate truth, residing in an abstract, spiritual realm beyond the physical world.
However, the 12th century saw the reintroduction of Aristotle’s works, shifting the intellectual landscape toward empirical observation and natural causes. In contrast to Plato’s abstract, mind-centered worldview, Aristotle’s ‘data over mind’ approach emphasized understanding the world through direct observation and categorizing cause and effect within nature itself.
This Aristotelian resurgence paved the way for Thomas Aquinas, who synthesized Aristotle’s ideas with Christian theology in what became known as Thomism. Aquinas integrated Aristotle’s four causes into a Christian view of the universe, where God was both the First Cause and the ultimate final cause. However, Thomism made it clear that scientific inquiry could proceed without constantly addressing this theological unity of causes. Scientific exploration of efficient causes — the mechanisms of how things work — was valid, even if metaphysical questions about the First Cause and Final Cause were set aside. This way, empirical inquiry into efficient causes and mechanisms became validated within a religious framework, balancing Plato’s abstract metaphysical focus with Aristotle’s practical, observational approach.
The triumph of ‘data over mind’ in Thomism did not negate the role of the mind but balanced faith and reason, laying the groundwork for the Enlightenment and the Scientific Revolution.
However, an unintended consequence of this synthesis was the rise of skepticism. By validating empirical observation within a religious context, Thomism inadvertently opened the door to questioning both theological truths and causality. This intellectual shift, driven by an overemphasis on ‘data over mind,’ eventually contributed to the downplaying and eventual rejection of causality, undermining centuries of causal thinking.
From Skepticism to Chancism: The Fall of Causality
While Aquinas legitimized the study of natural causality within a theological framework, an unintended consequence emerged from the Thomistic distinction of purpose (final cause) and mechanism (efficient cause). By allowing scientific inquiry to focus on mechanisms without engaging with metaphysical questions of ultimate purpose, this distinction opened the door to skepticism about causality itself. The separation allowed empirical inquiry to flourish but also led to a growing disconnect between understanding ‘how’ things happen and ‘why’ they happen.
This intellectual shift first paved the way for ‘reductionism,’ which initially dismissed the concept of ‘mind’ since it could not be discerned from a purely reductionist perspective. Once the mind was dismissed, the same reductionist view led to the denial of ‘causality,’ seeing it as a mere illusion born from the mind rather than a fundamental aspect of reality. In the reductionist framework, everything could be explained through the sum of simpler mechanistic parts, leaving no room for higher-order reasoning, intention, or complex causal relationships behind observed phenomena.
As reductionism gained influence, it naturally gave rise to ‘nihilism’ — a worldview in which causality, purpose, and meaning were further rejected. In this nihilistic view, life and existence are perceived as random, devoid of inherent purpose, and dictated by chance alone. This elevation of chance over causality replaced the drive to explore deeper mechanisms, promoting an anti-causal perspective that fueled skepticism and contributed to the modern reluctance to acknowledge causality as an integral part of both natural processes and human reasoning.
Correlationism: Causality as Mental Illusion
One culmination of this intellectual trajectory is Correlationism, a term coined here to describe a worldview advanced by thinkers like David Hume (1711–1776), Karl Pearson (1857–1936), and Ronald Fisher (1890–1962). Correlationism aligns with the reductionist and nihilist tendencies that dismiss both mind and causality, viewing causality as an illusion.
For Hume, cause and effect were not inherent truths but psychological habits formed through repeated observation (Hume, 1739). Pearson extended this by advocating that science focus on observable correlations rather than underlying causal mechanisms (Pearson, 1897). Fisher formalized this approach through statistical methods such as the randomized controlled trial (RCT), emphasizing probability and randomization in identifying patterns (Fisher, 1935). The RCT approach reflects Correlationism by emphasizing associations over causal mechanisms, reinforcing the view that causality is unnecessary for understanding and reducing inquiry to pattern recognition.
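To make the contrast concrete, here is a minimal simulation sketch; the variable names, effect sizes, and confounding setup are all invented for illustration. A naive comparison on observational data is distorted by a hidden confounder, while Fisher-style random assignment isolates the treatment-outcome association without appealing to any explicit causal model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical hidden confounder that raises both the chance of receiving
# the treatment and the chance of the outcome.
confounder = rng.binomial(1, 0.5, n)

# Observational world: treatment uptake depends on the confounder.
treated_obs = rng.binomial(1, 0.2 + 0.6 * confounder)
outcome_obs = rng.binomial(1, 0.1 + 0.2 * treated_obs + 0.3 * confounder)

# Randomized controlled trial: treatment assigned by a fair coin flip.
treated_rct = rng.binomial(1, 0.5, n)
outcome_rct = rng.binomial(1, 0.1 + 0.2 * treated_rct + 0.3 * confounder)

def group_difference(outcome, treated):
    """Outcome rate among the treated minus the rate among the untreated."""
    return outcome[treated == 1].mean() - outcome[treated == 0].mean()

print("Naive observational difference:", round(group_difference(outcome_obs, treated_obs), 3))  # ~0.38, inflated
print("Randomized-trial difference:   ", round(group_difference(outcome_rct, treated_rct), 3))  # ~0.20, the true effect
```

Randomization works because it severs the dependence between the hidden trait and the treatment; the article’s point is that the procedure can be run, and its output interpreted, without ever writing down a causal mechanism.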
However, this raises the fundamental question:
Is there a cause behind every effect? Could an AGI, while tracing chains of causation, distinguish between meaningful events and mere coincidences? Or would it regard all events as random occurrences without deeper significance?
In the Correlationist view, statistical associations are all we can observe: understanding the universe boils down to recognizing patterns in data, and causality is reduced to a series of correlations. How does this worldview concern AGI?
If we define AGI as mimicking human intelligence, it must follow the same causal reasoning central to human cognition. Even Correlationist adherents, when designing AGI, would find it inconceivable to embed a purely Correlationist view into the machines. Instead, they would inevitably create machines that reason like humans, believing in causality and acting accordingly. Correlationists would then struggle to reconcile their creations’ behavior with their own ideology. Ironically, these ideological designers might be forced either to see their creation as an illusion or to accept that it was created within their own illusion. This hypothetical scenario underscores how badly such a misguided ideology would fail when tested by true scientific inquiry, as it contradicts the very nature of human reasoning.
Chancism: No Purpose and No Causality
Chancism — a term coined to describe a worldview that epitomizes the intellectual movement of reductionism and nihilism, elevating chance as the central force of the universe — captures modern misunderstandings of causality by conflating core concepts. Thinkers like Jacques Monod (1971), Richard Dawkins (1986), and Sean B. Carroll (2020) challenge traditional views of causality, often rejecting the idea that deeper mechanisms govern events. Chancism thrives on four major oversimplifications that distort the complexity of causality.
Purpose vs. Mechanism
Chancism, sensitive to any notion of purposeful explanation, insists that events like the Chicxulub meteor impact, which led to the extinction of the dinosaurs, are purely the result of chance. In doing so, it neglects to acknowledge efficient causes — natural mechanisms — that explain how such an event occurred without invoking a higher purpose, such as the plan of God. This reluctance to admit efficient causes without associating them with a purposeful explanation underscores Chancism’s narrow focus on randomness while dismissing deeper causal mechanisms that operate independently of any teleological narrative.
Mistaking Association for Causal Thinking
Chancism often conflates the lowest level of associative thinking (seeing) with the full spectrum of human causal reasoning, which also includes doing (intervention) and imagining (counterfactuals). For instance, the Monte Carlo Fallacy — where past random events are mistakenly believed to influence future outcomes — illustrates how relying solely on “seeing” can lead to faulty reasoning. Correcting such errors requires ascending to higher levels of causal reasoning, engaging with interventions and counterfactual thinking, rather than prematurely attributing outcomes to mere chance and overlooking deeper causal mechanisms that could inform decisions, such as whether gambling is a rational way of living (Carroll, 2020).
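A short simulation, with an invented setup offered only to illustrate the fallacy, shows why ‘seeing’ alone misleads here: after a streak of five heads, the next flip of a fair coin is still heads about half the time.

```python
import numpy as np

rng = np.random.default_rng(42)
flips = rng.integers(0, 2, size=1_000_000)  # 1 = heads, 0 = tails, fair coin

# Collect the flip that immediately follows every run of five consecutive heads.
after_streaks = [flips[i] for i in range(5, len(flips)) if flips[i - 5:i].all()]

print("P(heads) overall:              ", round(float(flips.mean()), 3))           # ~0.5
print("P(heads | five heads in a row):", round(float(np.mean(after_streaks)), 3)) # still ~0.5
```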
Improbable Outcomes as Proof of Randomness
Chancism frequently misinterprets improbable outcomes as evidence of an acausal universe, ignoring potential causal explanations. By focusing on the improbability of events, Chancism overlooks deeper, often hidden, causes. This misjudgment disregards Bayesian inference, which shows that improbable events often point to unusual explanations. For instance, if a sand dune suddenly resembles a castle, a causal thinker would infer human intervention, whereas Chancism might attribute it to pure chance, missing the need for deeper analysis.
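A back-of-the-envelope Bayesian update makes the sand-castle point explicit. All numbers below are invented for illustration; only the reasoning pattern matters: a highly structured observation that is wildly improbable under chance shifts belief sharply toward a causal (here, human) explanation.

```python
# Hypothetical prior beliefs and likelihoods for the sand-castle example.
prior_human  = 0.01      # prior probability that a person shaped the dune
prior_chance = 0.99      # prior probability that only wind and chance acted

p_castle_given_human  = 0.5    # a human sculptor plausibly produces a castle shape
p_castle_given_chance = 1e-9   # wind alone almost never does

# Bayes' rule: P(human | castle) = P(castle | human) * P(human) / P(castle)
evidence = prior_human * p_castle_given_human + prior_chance * p_castle_given_chance
posterior_human = prior_human * p_castle_given_human / evidence

print(f"P(human intervention | castle-shaped dune) = {posterior_human:.7f}")  # ~0.9999998
```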
Randomness as a Deity-Like Force
In Chancism, randomness is treated as a deity-like force, regarded as the ultimate driver of outcomes. This is particularly evident in contexts like gambling, where Chancism emphasizes individual luck or misfortune while ignoring the structured probabilities that casinos manipulate. Similarly, in biological processes like genetic randomization, Chancism elevates randomness without acknowledging its role within a broader, structured framework, such as the adaptive function of genetic diversity. This worldview treats randomness as uncontrollable and final rather than a mechanism that can be understood or applied.
Chancism also extends its denial of causality into everyday reasoning. In its eagerness to reject supernatural explanations, Chancism might point out that some heavy smokers never develop lung cancer while some non-smokers do, and conclude that the outcome must be attributed to the deified ‘chance’ — a blind, indifferent force beyond influence or appeal. According to this view, there is no need to question whether smoking ‘causes’ lung cancer, as chance alone dictates health outcomes, not individual choices like quitting smoking.
The Aftermath
Correlationism and Chancism represent a significant setback in Aristotelian thought, particularly in its distinction between purpose and mechanism, a cornerstone of Western philosophy. It is striking that causality — essential to human survival, the development of civilization, and the foundation of scientific inquiry — remains debated today. Unchecked ‘data over mind,’ where patterns are mistaken for causes, indirectly paved the way for both forms of skepticism. Ironically, Plato’s contrasting view of ‘mind over data’ — the idea that true understanding arises from higher-order reasoning, not mere observation — ultimately restores clarity to causality, reaffirming its essential role in science and human reasoning. By recognizing that causality transcends mere patterns, Plato’s philosophy reasserts the necessity of causality to advance scientific and intellectual progress.
Pearlism: Restoration of Causality
In response to the growing skepticism of causality, Pearl offers a compelling alternative. He argues that causality is not merely an illusion formed by patterns; true understanding requires structured causal models that transcend data. This resurgence of ‘mind over data’ marks a critical shift from the era of unchecked ‘data over mind.’
In the context of AGI, we define “mind” as the mechanism that enables humans and machines to construct causal models. This definition bypasses debates about the philosophical nature of the mind, focusing instead on AGI’s capacity for higher-order causal reasoning.
To represent causal models formally, Pearl developed causal graphs — mathematical models that depict relationships between causes and effects. For example, consider a causal graph for studying whether smoking causes lung cancer (a code sketch of this graph follows the list):

- Smoking → Tar Deposition: Smoking causes tar and other harmful substances to be deposited in the lungs.
- Smoking → Lung Cancer: Smoking directly increases the likelihood of developing lung cancer.
- Tar Deposition → Lung Cancer: Tar deposition from smoking can lead to cellular damage, raising the risk of lung cancer.
- Genetic Predisposition → Lung Cancer: Genetic factors may make individuals more susceptible to lung cancer, regardless of smoking habits.
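The same graph can be written down directly as a data structure. This is a minimal, illustrative encoding in plain Python dictionaries rather than any particular causal-inference library:

```python
# Each key is a cause; each value lists its direct effects, mirroring the arrows above.
causal_graph = {
    "Smoking":                ["Tar Deposition", "Lung Cancer"],
    "Tar Deposition":         ["Lung Cancer"],
    "Genetic Predisposition": ["Lung Cancer"],
    "Lung Cancer":            [],
}

def direct_causes(node):
    """Return the parents (direct causes) of a node in the graph."""
    return [cause for cause, effects in causal_graph.items() if node in effects]

print(direct_causes("Lung Cancer"))
# ['Smoking', 'Tar Deposition', 'Genetic Predisposition']
```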
While causal graphs provide structured models of cause and effect, counterfactual reasoning enables us to explore alternative scenarios and ask “what if” questions for deeper causal understanding. For instance, we might ask:
- What if the person didn’t smoke? By holding all other variables constant, we can evaluate how the absence of smoking would affect the likelihood of lung cancer.
- What if the person smoked but had no genetic predisposition? This explores how smoking alone might influence lung cancer, absent genetic factors.
In this framework, future AGI, unlike today’s LLMs, wouldn’t simply recognize that smoking is correlated with lung cancer; it could ask what-if scenarios — ‘What if the person didn’t smoke?’ or ‘What if they had quit smoking earlier?’ This ability to imagine alternate realities moves AGI beyond pattern recognition into understanding cause and effect.
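As a sketch of how such a what-if query can be answered mechanically, the toy structural causal model below follows the general three-step recipe Pearl describes (abduction, action, prediction). The specific equations, threshold, and this individual's background values are invented here purely for illustration.

```python
def structural_model(smoking, genetic, background_risk):
    """Toy structural equations: tar tracks smoking; cancer occurs when the
    combined contribution of tar, genetics, and individual background risk
    crosses a threshold."""
    tar = smoking  # tar deposition occurs only if the person smokes
    cancer = (0.3 * tar + 0.4 * genetic + background_risk) > 0.5
    return cancer

# Step 1 (abduction): fix the individual's background factors from what was observed.
# (In a full treatment these would be inferred from the observed outcome; here they
# are simply stipulated for brevity.)
person = dict(smoking=1, genetic=0, background_risk=0.3)
factual = structural_model(**person)

# Step 2 (action): intervene by setting smoking to 0, keeping everything else fixed.
# Step 3 (prediction): re-run the same equations under the intervention.
counterfactual = structural_model(smoking=0,
                                  genetic=person["genetic"],
                                  background_risk=person["background_risk"])

print("Did this person develop cancer?      ", factual)         # True  (0.3 + 0.0 + 0.3 > 0.5)
print("Would they have, had they not smoked?", counterfactual)  # False (0.0 + 0.0 + 0.3 < 0.5)
```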
However, while Pearlism restores ‘mind over data,’ it doesn’t fully unravel how the human mind engages in causal reasoning. Pearl isolates the mind’s role in constructing initial causal graphs, such as the one for smoking-lung cancer, as described above, theorizing that the rest of the reasoning process can be algorithmized. While this framework provides a formal structure for understanding causality, it doesn’t imply that the human mind operates similarly. Instead, it values human insights, represented in the form of causal graphs, while leveraging computational power to handle the heavy lifting — much like how calculators relieve us from manual calculations.
While modern causal models primarily address efficient causes, Pearl’s focus on causal graphs also opens the door to higher-order thinking that could incorporate final causes, particularly in legal reasoning or moral decision-making. As AGI systems evolve, they may need to integrate efficient causes (the ‘how’) and final causes (the ‘why’) to fully understand complex human behaviors and motivations, especially in fields such as criminal law, where understanding intent and purpose is crucial. By navigating both types of causes, AGI could move beyond mechanistic causality into a more human-like form of reasoning.
From Chinese Room to Causal Parrots: The Ongoing Debate
Judea Pearl’s framework for causal reasoning, which automates the process once a structured ‘causal graph’ is provided, has shifted the debate in favor of a ‘mind over data’ approach. However, autonomy remains a key challenge for AGI. Pearl’s theory raises a critical question — how can AGI move beyond merely using a provided causal graph to generate one independently? This question exposes the core tension in AGI development and extends the debate on the limits of machine autonomy:
Do we uncover causes from data or impose our ‘mental models’ onto the data? And if so, how can we create an AGI capable of independently developing or possessing its own ‘mental models’ to understand causality?
This dilemma highlights a fundamental challenge in Pearl’s framework — without a predefined model, AGI risks reverting to ‘curve-fitting’ — the very flaw Pearl sought to overcome. The broader debate between ‘mind over data’ and ‘data over mind’ becomes essential in exploring whether AGI can truly develop independent causal reasoning.
Philosophical Challenges to AGI Autonomy
Two critical arguments — John Searle’s Chinese Room argument (Searle, 1980) and the idea of ‘causal parrots’ — can help us better understand the philosophical tensions in this debate.
John Searle’s Chinese Room Argument: Searle’s thought experiment illustrates how AGI might seem to understand without genuine ‘understanding.’ Imagine a person in a room following a set of rules to produce meaningful Chinese responses without ‘understanding’ the language. Searle argues that AGI could similarly “cheat” its way to appearing intelligent. In this analogy, the room represents AGI’s ‘black box,’ where rules are followed blindly.
This thought experiment suggests that a person in the room could fake understanding a language by blindly following rules from a rulebook. However, as the rulebook grows exponentially with the language’s complexity, this process becomes computationally infeasible. In practice, for the person — or an AGI — to convincingly appear to understand the language, they would need to genuinely understand it, as relying solely on rule-following becomes untenable.
Simply put, AGI cannot ‘cheat’ its way to understanding without understanding.
This is corroborated by the success of modern LLMs, which appear to understand language because they are trained with vast amounts of data imbued with human understanding of language.
The Causal Parrot Critique: Just as Searle’s room illustrates rule following without understanding, critics like Zečević et al. (2023) argue that LLMs act as ‘causal parrots,’ mimicking causal relationships through pattern recognition but lacking the deeper causal reasoning capabilities that future AGI is expected to possess. The Causal Hierarchy Theorem (CHT) underlying Pearl’s framework (Bareinboim et al., 2020) asserts that LLMs trained on ‘purely observational data’ are confined to the lowest level of reasoning — association — unable to tackle more complex causal tasks involving interventions and counterfactuals.
Pearl’s critique may underestimate the evolving role of human intervention in training LLMs. It is questionable to consider the highly curated language corpora used to train LLMs as ‘purely observational.’ Several factors challenge this assumption, including:
- Curation and Selection Bias: Language corpora used for training LLMs are often curated by humans, which introduces selection bias in the data. This curation process is not purely observational but involves intentional choices of data to include or exclude.
- Human Feedback: Modern LLMs integrate methods like reinforcement learning from human feedback (RLHF), where humans actively intervene to guide the model toward better outputs. This process shifts the data from being purely observational to being shaped by human input.
- Self-Generated Data: Some models, such as the Self-Taught Reasoner (STaR) (Zelikman et al., 2022), generate their own data based on initial learning, which is then fed back into the model for further training. This creates a feedback loop where the model’s data evolves beyond passive observation.
- Data Crafted for Causality: Specifically crafted datasets that focus on causal reasoning, such as those used in Axiomatic Training (Vashishtha et al., 2024), further challenge the idea of purely observational data, as they are designed with specific causal objectives in mind.
These developments undermine the assumption that LLM training data are ‘purely observational,’ a key premise for the CHT to hold.
Computational Complexity and Understanding
Pearl’s skepticism deepens when considering the computational challenges AGI faces in autonomously constructing causal graphs without human input. The problem lies in navigating immense search spaces — akin to Searle’s Chinese Room, where a rulebook expands exponentially. According to computational complexity theory, problems of this nature fall into the NP-hard class (Arora & Barak, 2009; Sipser, 2012), meaning their complexity compounds with each new variable (Spirtes, 2000; Chickering, 2002).
Imagine a simple paper-folding exercise to grasp the impact of this exponential growth. Start with a sheet of paper that’s 0.1 millimeters thick. Each time you fold the paper in half, its thickness doubles. After just 10 folds, the paper would be 10 centimeters thick. After 20 folds, it would reach 100 meters. By 42 folds, it would be thick enough to stretch all the way to the moon.
Just as the paper’s thickness quickly becomes unmanageable, the complexity of constructing causal graphs in AGI escalates rapidly with each additional variable. Each new variable compounds the number of relationships that must be considered, and soon the search space becomes so vast that blind rule-following or data-agnostic search — like folding the paper — becomes computationally infeasible. Without human intervention, or without learning from human-generated data (a route Pearl staunchly opposes as a source of causal understanding), the AGI system would face an overwhelming task as the number of variables expands, making autonomous causal reasoning a daunting, if not impossible, challenge.
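The folding analogy can be made quantitative for causal graphs themselves. The number of distinct directed acyclic graphs (DAGs) over n variables grows super-exponentially; the sketch below uses Robinson's classic recurrence for counting labeled DAGs (a standard result in the structure-learning literature, not taken from this article) to show how quickly a blind search space explodes.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    """Number of distinct directed acyclic graphs on n labeled nodes
    (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * num_dags(n - k)
               for k in range(1, n + 1))

for n in (2, 4, 6, 8, 10):
    print(f"{n:2d} variables -> {num_dags(n):,} candidate causal graphs")
# 2 -> 3;  4 -> 543;  6 -> 3,781,503;  8 -> 783,702,329,343;  10 -> ~4.2 quintillion
```

Merely enumerating the candidates is hopeless beyond a handful of variables; this combinatorial explosion is the backdrop against which the NP-hardness results cited above apply.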
The Path Forward
Ultimately, the critical question remains: Can AGI develop autonomous causal reasoning? The challenges posed by NP-hard problems and the need for human intervention suggest a renewal of the millennia-old struggle between ‘mind over data’ and ‘data over mind.’ The twist here is that when you aim to automate the mind in ‘mind over data’ — specifically in constructing causal graphs — and try to bypass NP-hard problems, you inevitably invite data back in, effectively shifting towards ‘data over mind.’ This happens because you need to “understand” how the mind accomplishes causal reasoning and represent that understanding in data.
Pearl argues that such “understanding” cannot arise solely from human data. But where else can this understanding come from, if not by reintroducing human experts and thus compromising the goal of autonomous reasoning? Collective human understanding is embedded in data. If we claim that even data recording how humans generate causal graphs are ‘purely observational,’ what happens when LLM-based systems generate their own records of causal reasoning and feed them back into their training?
Though seemingly circular, this tension between data-driven models and human-like reasoning becomes self-reinforcing. The better AGI understands causality, the more effectively it can use data to refine its reasoning.

This dynamic challenges us to rethink AGI’s evolution. Rather than a linear ascent through the causal hierarchy, AGI’s development might instead be a recursive process in which it continuously refines its causal reasoning. Each advance in understanding feeds into the next, creating a non-linear feedback loop in which AGI’s ability to reason autonomously and effectively deepens with every iteration.
Conclusion: Would AGI Pray to Digital Angels?
In exploring the relationship between mind over data and data over mind, we have traced the evolution of causality from Plato and Aristotle, Augustine and Aquinas, to David Hume and Judea Pearl, culminating in modern debates about Artificial General Intelligence (AGI). Causality has always shaped how we understand and interact with the world. As Aristotle proposed, are our minds purposefully designed by nature to comprehend it, or, as Plato believed, destined to transcend it?
Aristotle helped us differentiate between efficient causes (mechanisms) and final causes (purpose), allowing us to focus on mechanisms while setting aside the search for purpose. Yet, in doing so, we nearly lost the essence of causality itself. Thanks to Pearl, causality has been restored, and the role of the mind has been reexamined. In the development of AGI, this journey reflects a shift from ‘data over mind,’ where machines observe and identify patterns, to ‘mind over data,’ where they construct and navigate causal models, demonstrating deeper understanding beyond mere observation. Still, while our understanding of causality has deepened, the mystery of how human or machine minds construct causal models prior to perception remains unsolved, often leaving us chasing our tails as we try to understand ourselves.
Throughout history, humans have speculated beyond empirical evidence, creating mysteries to explain the unknown. From hunter-gatherers to the Greeks, imagination has often complemented observation, posing questions that might seem unscientific by modern standards. This blend of speculation and reasoning has consistently pushed the boundaries of human understanding. As we develop AGI, will we replicate this impulse in the machines we build? And if we can create a machine’s mind, what exactly is ours?
Philip K. Dick, in Do Androids Dream of Electric Sheep? (1968) explored these boundaries, questioning whether artificial intelligence could experience human-like imaginative thought. As AGI evolves, it may not only mimic human intelligence but also adopt the very instincts that drive us to explore alternatives beyond the observable. When that happens, we might be reminded of Philip K. Dick’s sarcastic question and ask ourselves: Would AGI pray to digital angels?
References
Arora, S., & Barak, B. (2009). Computational complexity: A modern approach. Cambridge University Press.
Ashwani, S., Hegde, K., Mannuru, N. R., Jindal, M., Sengar, D. S., Kathala, K. C., Banga, D., Jain, V., & Chadha, A. (2024). Cause and effect: Can large language models truly understand causality? arXiv:2402.18139.
Bareinboim, E., Correa, J. D., Ibeling, D., & Icard, T. (2020). On Pearl’s hierarchy and the foundations of causal inference (Technical Report R-60). Retrieved from https://causalai.net/r60.pdf
Bennett, M. (2023). A brief history of intelligence: Evolution, AI, and the five breakthroughs that made our brains. Simon & Schuster.
Byrne, R. M. J. (2005). The rational imagination: How people create alternatives to reality. MIT Press.
Carroll, S. B. (2020). A series of fortunate events: Chance and the making of the planet, life, and you. Princeton University Press.
Chickering, D. M. (2002). Optimal structure identification with greedy search. Journal of Machine Learning Research, 3, 507–554.
Dick, P. K. (1968). Do Androids Dream of Electric Sheep? Doubleday.
Harari, Y. N. (2015). Sapiens: A brief history of humankind. Harper.
Huh, M., Cheung, B., Wang, T., & Isola, P. (2024). The Platonic representation hypothesis. arXiv:2405.07987.
Jin, Z., Liu, J., Lyu, Z., Poff, S., Sachan, M., Mihalcea, R., Diab, M., & Schölkopf, B. (2023). Can large language models infer causation from correlation? arXiv:2306.05836.
Kıcıman, E., Ness, R., Sharma, A., & Tan, C. (2023). Causal reasoning and large language models: Opening a new frontier for causality. arXiv:2305.00050.
Liu, X., Xu, P., Wu, J., Yuan, J., Yang, Y., Zhou, Y., Liu, F., Guan, T., Wang, H., Yu, T., McAuley, J., Ai, W., & Huang, F. (2024). Large language models and causal inference in collaboration: A comprehensive survey. arXiv:2403.09606.
Pearl, J. (2009). Causality: Models, reasoning, and inference (2nd ed.). Cambridge University Press.
Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. Basic Books.
Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–424.
Sipser, M. (2012). Introduction to the theory of computation (3rd ed.). Cengage Learning.
Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, prediction, and search (2nd ed.). The MIT Press.
Vashishtha, A., Kumar, A., Reddy, A. G., Balasubramanian, V. N., & Sharma, A. (2024). Teaching transformers causal reasoning through axiomatic training. arXiv:2407.07612.
Yu, P., Xu, J., Weston, J., & Kulikov, I. (2024). Distilling System 2 into System 1. arXiv:2407.06023.
Zečević, M., Willig, M., Dhami, D. S., & Kersting, K. (2023). Causal parrots: Large language models may talk causality but are not causal. arXiv:2308.13067.
Zelikman, E., Wu, Y., Mu, J., & Goodman, N. D. (2022). STaR: Bootstrapping reasoning with reasoning. arXiv:2203.14465.