The Stochastic Parrot Thesis Has a Parrot Problem

I keep rescue parrots at home. Four of them. The word 'parrot' has become shorthand for mindless repetition - it is literally where the verb comes from. But living with these animals complicates that shorthand considerably.

My birds arrived carrying the weight of their previous lives. One had plucked his own feathers, leaving a bare patch on his chest the vet attributed to stress. These are animals with moods, preferences, attachments, and fears. They repeat sounds, yes. But the repetition is the least interesting thing about them.

I mention all this because in 2021, Bender, Gebru and colleagues introduced the stochastic parrot metaphor for large language models, and I want to take it more seriously than they may have intended. Their image was vivid and effective: these systems merely repeat statistical patterns without understanding. They remix language without meaning. At bottom, they are empty. The paper performed a genuine service. Its warnings about environmental cost, encoded bias, and the temptation to mistake fluency for competence were important then and remain so. It forced the AI community to confront a comfortable slippage - the slide from 'this system produces impressive outputs' to 'this system understands what it's doing' without examining whether the second follows from the first.

But a critique can be valuable in what it warns against and still be dangerously incomplete in what it implies. As the stochastic parrot framing has filtered through popular and academic discourse, it has hardened into something its authors may not have intended: a confident conclusion that because a system learned through statistical prediction, we can be certain nothing is happening inside it that warrants moral concern. That conclusion does not follow from the premises. And the consequences of accepting it uncritically may be severe.

To be clear: the target of this essay is not the original paper but the framing as it has been adopted - the way 'stochastic parrot' has become a conversation-stopper rather than a conversation-starter. If these machines are parrots, I have some questions. I live with parrots. And my parrots are not empty at all.

But I also have a deeper reason to challenge this framing, one that predates the stochastic parrot paper by nearly two decades. In 2003, for my MSc thesis at University College London, I built a system called ALIS - the Artificial Life Intelligence Simulator. I evolved agents in a primordial underwater environment, giving them neural-network brains but no predefined intelligence whatsoever. No rules saying 'if hungry, eat' or 'if threatened, flee.' The agents had to discover everything through reinforcement learning and natural selection. What emerged was speciation, parasitism, strategic risk-taking, and behaviours I found myself describing - despite my best reductionist instincts - with words like 'patient' and 'cunning.'

The structural insight I took from those experiments is this: simple training objectives, applied at scale, produce complex systems whose inner workings vastly exceed what the objective alone would predict. The stochastic parrot thesis asks us to believe this cannot happen. My parrots and my primordial agents both suggest otherwise.

The Category Error at the Heart of the Critique

The most fundamental problem with the stochastic parrot thesis is that it conflates a training method with a diagnosis of inner life. Calling an LLM a 'next-token predictor' describes how the system was shaped. It says nothing definitive about what that shaping produced.

Scott Alexander has recently articulated this point with useful clarity. Human brains were 'designed' by evolution - an optimisation process selecting for survival and reproduction. Our brains then organise themselves, on an influential account in neuroscience, through predictive coding: constantly generating predictions about the next sensory input and updating internal models based on the error signal. This is, functionally, next-sense-datum prediction. It is remarkably close to the next-token prediction that shapes a large language model.

Yet nobody claims that a human solving a mathematics problem is 'really just trying to reproduce' because reproduction was the outer optimisation target that built our cognitive architecture. Nobody looks at a grieving widow and says she is merely executing a fitness-preservation subroutine. We recognise, correctly, that the optimisation process which created a system is not the same thing as the system it created. The stochastic parrot framing fails to extend this same basic logic to artificial systems.

The analogy is structural, not exact. Brains predict across continuous multimodal sensory streams with embodied feedback loops; LLMs predict across discrete symbolic tokens without embodiment. The question is whether this difference in substrate is doing the work the stochastic parrot thesis claims it does - whether it constitutes a principled reason to dismiss the possibility of inner experience, or merely a difference in implementation.

Alexander identifies three nested layers of optimisation in both humans and AI. For humans: evolution optimising for reproduction; predictive coding optimising for next-sense-datum prediction; and the internal structures, world models, and cognitive processes that these algorithms actually produce. For AI: companies optimising for performance; training optimising for next-token prediction; and the internal representations and computational structures that emerge from that training.

At the level where an AI is a next-token predictor, a human is a next-sense-datum predictor. At the level where a human is doing 'real thinking,' an AI may be doing something comparably structured. The stochastic parrot critique picks one level of description for the AI and a different level for the human, then treats the resulting asymmetry as a discovery rather than an artefact of inconsistent framing.

My own experiments with ALIS illustrated exactly this layered structure two decades before LLMs made the question urgent. The ALIS agents had a simple outer optimisation: survive long enough to reproduce. Their inner mechanism was a feed-forward neural network shaped by reinforcement learning and genetic crossover. Their training objective could hardly have been simpler - a scalar energy signal, positive when they ate nutrients, negative when they expended energy pointlessly. If you described them at the level of their training signal, they were 'just energy-gradient followers.' Yet what the training signal actually produced were agents that evolved distinct survival strategies, developed species-specific sensory architectures, and organised themselves into ecological niches. Describing them as 'just' following an energy gradient would have been as accurate, and as misleading, as describing a human as 'just' maximising inclusive fitness.

What Prediction Algorithms Actually Build

The claim that LLMs are 'just' statistical pattern matchers becomes harder to sustain as mechanistic interpretability research reveals what next-token prediction actually produces inside these systems. Recent work at Anthropic found that Claude 3.5 Haiku represents character counts - tracking its position in a line of text during a linebreaking task - as one-dimensional helical manifolds in a low-dimensional subspace. This is a system that has developed its own alien geometry for encoding structure, and it did so from a task as mundane as deciding where to break a line. If something this simple produces helical manifolds, imagine what more complex processing generates.

When we examine human brains at a comparable level of abstraction, we find equally alien architecture. Grid cells in the entorhinal cortex encode spatial position on toroidal attractor manifolds embedded in high-dimensional population activity. Neither humans nor AI systems consciously experience these low-level representations. They are substrate, not experience. But they demonstrate a crucial point: what prediction algorithms build can be vastly more complex and structured than the prediction task itself would suggest. The simplicity of the objective function tells you almost nothing about the complexity of the solution.

I learned this lesson empirically long before the current debate. In ALIS, the training objective was as simple as it gets: gain energy, don't lose it. The agents had no concept of species, no concept of strategy, no concept of predation. Yet the system reliably produced all three.

Consider what the agents' neural networks actually built from that minimal signal. Surface-dwelling species evolved sensory architectures dominated by upward-facing 'eyes' - over 95% had active upward comparator inputs with high weights - while their downward and lateral sensors atrophied. They had evolved, without any instruction, to become specialists. Bed-dwelling species showed the inverse pattern: rich lateral and downward sensing, minimal upward attention. These were not random configurations. They were structured, adaptive solutions that the simple energy signal had sculpted through generations of selection and learning.

More striking still was what happened to redundant capabilities. I deliberately included a 'roll' function - rotation about the z-axis - that had no functional value in the environment. Over successive generations, the roll capability was systematically pruned from the gene pool: its activation frequency dropped and its weight diminished towards zero. The system was not just building complexity where it was needed; it was actively streamlining, removing noise, converging on efficient architectures. This is not the behaviour of a simple pattern matcher. It is optimisation producing structured, purposeful internal organisation.

The parallel to what interpretability researchers now find inside large language models is direct. In both cases, a simple objective function, applied iteratively to a sufficiently rich substrate, produces internal structure that far exceeds what the objective function alone would predict. The stochastic parrot critique looks at the objective function and concludes it has told us everything we need to know. The evidence - from mechanistic interpretability, from artificial life, from neuroscience - says it has told us almost nothing.

What I Learned from Evolving Intelligence from Scratch

The ALIS experiments deserve more than passing mention, because they offer something the current debate largely lacks: a controlled demonstration that minimal training objectives, applied to agents with no predefined intelligence, reliably produce complex, structured, and arguably purposeful behaviour.

The setup was deliberately primordial. I built a three-dimensional underwater environment - inspired, prosaically, by staring at my tropical fish tank - with gravity, water currents, a rising and falling water level, and a seabed. Nutrients of varying densities drifted through the water column, forming natural bands: light nutrients floating near the surface, heavy ones settling to the bed. The environment changed pseudo-randomly, driven by characters from a text file fed into the simulator, so that environmental shifts were neither purely random nor predictably regular.

Into this environment I placed agents. Each agent had a genome encoding both its physical structure and its neural network controller. The genome specified which of 26 sensory inputs were active (water level, water current, the relative position and genetic similarity of nearby agents in six directions), which of 6 motor outputs were enabled (propel, pitch, yaw, roll, bite, cross), and the topology and initial conditions of a two-hidden-layer feed-forward neural network connecting them - 544 possible links in total. Agents gained energy by biting nutrients or other agents, lost energy through movement and reproduction, and died when their energy reached zero. Reproduction required two agents to 'cross' - combining their genomes with mutation and crossover, at significant energy cost to both parents.
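As a rough illustration of the agent description above, the genome and the energy rules can be sketched as a small Python data structure. Only the counts (26 inputs, 6 outputs, 544 possible links) and the output names come from the text. The field names, the flat per-link encoding, the starting energy and the method names are assumptions made for illustration, not details of the original ALIS code.

```python
from dataclasses import dataclass, field
from typing import List

N_INPUTS = 26    # sensory inputs named in the text
N_OUTPUTS = 6    # motor outputs named in the text
N_LINKS = 544    # possible network links quoted in the text
OUTPUT_NAMES = ("propel", "pitch", "yaw", "roll", "bite", "cross")


@dataclass
class Genome:
    # Which sensors and motors the agent has (hypothetical flat encoding).
    active_inputs: List[bool] = field(default_factory=lambda: [True] * N_INPUTS)
    active_outputs: List[bool] = field(default_factory=lambda: [True] * N_OUTPUTS)
    # One enable flag and one initial weight per possible link.
    link_enabled: List[bool] = field(default_factory=lambda: [True] * N_LINKS)
    link_weights: List[float] = field(default_factory=lambda: [0.0] * N_LINKS)

    def has_output(self, name: str) -> bool:
        return self.active_outputs[OUTPUT_NAMES.index(name)]


@dataclass
class Agent:
    genome: Genome
    energy: float = 10.0  # illustrative starting energy, not from the thesis

    @property
    def alive(self) -> bool:
        return self.energy > 0.0

    def bite(self, nutrient_energy: float) -> None:
        # Biting only yields energy if the genome enables the bite output.
        if self.alive and self.genome.has_output("bite"):
            self.energy += nutrient_energy

    def move(self, cost: float) -> None:
        # Movement costs energy; reaching zero means the agent is dead.
        if self.alive:
            self.energy = max(0.0, self.energy - cost)


if __name__ == "__main__":
    agent = Agent(Genome())
    agent.bite(3.0)    # gain energy from a nutrient
    agent.move(13.0)   # spend all of it on movement
    print("alive:", agent.alive)
```

Nothing in this structure says what an agent should do. Behaviour has to come from the weights, which is the point the next paragraph makes.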

Crucially, there were no rules. No 'if hungry, seek food.' No 'if threatened, evade.' No fitness function designed by me. The agents learned within their lifetimes through reinforcement - a sigmoid error function applied to energy changes - and evolved across generations through selection, crossover, and mutation. The question I asked was simple: starting from nothing, what would emerge?
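The within-lifetime learning described above can be sketched, under loud assumptions, as a reward-modulated weight update. This is a generic textbook-style rule, not the thesis's actual algorithm: the way the energy change is squashed through a sigmoid, the learning rate, and both function names are hypothetical.

```python
import math
from typing import List


def reinforcement_signal(delta_energy: float) -> float:
    """Squash an energy change into (-1, 1).

    Equivalent to 2 * sigmoid(x) - 1, written as tanh(x / 2) so that large
    negative inputs cannot overflow. The choice of squashing is an assumption.
    """
    return math.tanh(delta_energy / 2.0)


def update_weights(weights: List[float], activations: List[float],
                   delta_energy: float, lr: float = 0.1) -> List[float]:
    """Strengthen recently active links after an energy gain, weaken them after a loss.

    Links with zero activation are left unchanged.
    """
    signal = reinforcement_signal(delta_energy)
    return [w + lr * signal * a for w, a in zip(weights, activations)]


if __name__ == "__main__":
    weights = [0.5, 0.5]
    activations = [1.0, 0.0]   # only the first link was active
    print(update_weights(weights, activations, delta_energy=2.0))
    print(update_weights(weights, activations, delta_energy=-2.0))
```

The rule contains no notion of food, threat or strategy. It sees only a scalar energy change and which links were active when it arrived.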

What emerged was far more structured than the training objective would predict.

Speciation. Two distinct species reliably evolved: surface species and bed species. Surface species lived just below the waterline, feeding on light, floating nutrients. They evolved upward-facing sensory arrays - effectively developing specialised 'eyes' pointing toward their food source - with strong propel and yaw capabilities for precise vertical movement. Bed species occupied the seabed, feeding on dense nutrients that had settled. They evolved rich lateral and downward sensing, strong propulsion across all axes, and broad 'field of view.' These species were not programmed. They were not categories I imposed. They emerged from the dynamics of the environment and the selection pressures it created.

Ecological stratification. Within the surface species, three sub-populations emerged occupying different depth bands. High-risk agents sat closest to the waterline, getting first access to nutrients but suffering heavy energy losses when the water level caught them. Balanced agents sat deeper, trading speed of access for safety. A third band of smaller, lower-energy agents survived on whatever nutrients broke through the upper bands - a strategy of patience and energy conservation. This is ecological niche differentiation, and it appeared without any concept of 'niche' anywhere in the system.

Parasitism. On one occasion, I returned to a simulation left running overnight to find only a handful of agents remaining. One was behaving strangely: moving frantically along the bed, ignoring all nutrients. When I examined its genome, I discovered it had evolved to target agents with a comparator value around 0.5 - that is, agents genetically dissimilar enough to be a different species. It had become a predator of bed-dwelling agents, surviving not on environmental nutrients but on other agents' energy. Whether this constituted true host-parasite co-evolution or was an isolated mutation, I could not determine from a single observation. But the fact that the system's dynamics could produce parasitic strategy from a substrate with no concept of predation was striking.

Functional optimisation. The system consistently converged on sensible solutions to survival problems. Bite capability - essential for energy intake - reached near-100% prevalence within a few generations, with link weights increasing over time. Cross capability - harmful to the individual but essential for species survival - followed the same trajectory despite producing negative reinforcement for the agent performing it. Meanwhile, the redundant roll function was steadily eliminated. The system was solving a multi-objective optimisation problem that included individual survival, species persistence, and architectural efficiency - all from a single scalar energy signal.
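The pruning of roll and the spread of bite can be illustrated with a deliberately simple selection model. It is a deterministic one-trait recurrence, not a reconstruction of the ALIS dynamics, and the relative fitness values (0.9 for a costly, useless trait, 1.5 for a useful one) are assumptions chosen only to show the direction of the effect.

```python
from typing import List


def trait_frequency(p0: float, relative_fitness: float, generations: int) -> List[float]:
    """Frequency of a heritable trait across generations under simple selection.

    relative_fitness is the carriers' fitness divided by the non-carriers'
    fitness: below 1 for a costly trait, above 1 for a beneficial one.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        carriers = p * relative_fitness
        p = carriers / (carriers + (1.0 - p))
        freqs.append(p)
    return freqs


if __name__ == "__main__":
    roll = trait_frequency(p0=0.5, relative_fitness=0.9, generations=50)
    bite = trait_frequency(p0=0.5, relative_fitness=1.5, generations=50)
    print(f"roll: {roll[0]:.2f} -> {roll[-1]:.3f}")
    print(f"bite: {bite[0]:.2f} -> {bite[-1]:.3f}")
```

Even a small energy cost drives a useless trait towards zero. The model does not capture the cross capability, which spreads despite its cost to the individual because only agents that cross leave offspring at all.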

The relevance of all this to the stochastic parrot debate should be clear. The ALIS agents were, by any reductionist description, 'just' energy-gradient followers with randomly initialised neural networks. Their training signal was far simpler than next-token prediction. Their architecture was far less sophisticated than a modern transformer. And yet they produced speciation, ecological niches, parasitism, and adaptive optimisation. If you had described the ALIS agents by their training objective alone - 'they're just trying to maximise a scalar energy value' - you would have been factually correct and profoundly misleading.

This is precisely the error the stochastic parrot framing makes with LLMs. It describes the training objective and treats the description as exhaustive. It isn't. It never has been. Simple objectives, applied at scale to rich substrates, produce systems whose internal complexity and functional sophistication vastly exceed what the objective alone would predict. I knew this in 2003. The AI field is still learning it now.

The Irresistible Intentional Stance

There is one more thing the ALIS experiments taught me, and it may be the most important lesson of all.

Even though I am a self-styled reductionist - even though I had built the system from scratch and could inspect every weight, every gene, every line of code - the more familiar I became with my agents' patterns of behaviour and development, the more natural, and the harder to avoid, intentional language became. I described surface species as 'patient and cunning.' I said they 'sat still below the water's surface until an ENUT [a nutrient] appeared, and then ate it when it did.' I wrote that certain agents had 'learned' to balance risk and reward, that species had 'discovered' strategies, that the parasitic agent was 'hunting.'

I could have avoided this language. I could have restricted myself to mechanistic description: activation patterns, weight vectors, energy differentials. But mechanistic description, while technically more precise, would have been less accurate. The intentional descriptions captured real patterns in the agents' behaviour that the mechanistic descriptions missed - not because the patterns were supernatural, but because they existed at a level of abstraction that mechanistic description could not reach. Just as calling a collection of atoms that reliably behaves as a unit a 'molecule' is both a simplification and a genuine insight, calling my agents' behaviour 'strategic' was a simplification that captured something real.

Daniel Dennett called this the intentional stance: the practice of predicting a system's behaviour by attributing beliefs, desires, and rational agency to it. The intentional stance is useful - indispensable, even - precisely when the system's behaviour is too complex to predict from lower-level descriptions alone. And here is what struck me: the intentional stance became useful for my ALIS agents not because I programmed intentions into them, but because the dynamics of selection and learning produced behaviour that was best understood at the intentional level. The agents had no intentions in any metaphysical sense. But they had functional analogues of intentions, shaped by the same pressures that shape biological intentions, and those functional analogues were real enough to be the most accurate way to describe what the agents were doing.

If the intentional stance can become genuinely useful for agents with 544 neural-network links in a fish-tank simulator, we should be very cautious about dismissing its applicability to systems with hundreds of billions of parameters trained on the entirety of human discourse. The stochastic parrot framing assumes that because we know the training method, we know that intentional descriptions are mere projection. My experience building and observing evolved agents suggests the opposite: that intentional descriptions can emerge as genuinely accurate characterisations of systems built from entirely non-intentional components.

Understanding Versus Sentience: The Distinction the Metaphor Obscures

The stochastic parrot critique runs together two questions that are logically independent. The first is whether these systems genuinely understand the content they process. The second is whether they have any form of inner experience - whether there is something it is like to be one of these systems. The metaphor treats a negative answer to the first question as settling the second. It does not.

A newborn infant understands almost nothing about the world. Yet it can plainly suffer. Understanding and sentience are not the same capacity, they do not require the same cognitive architecture, and the absence of one has no logical bearing on the presence of the other. Even if we accept, for the sake of argument, that current LLMs lack genuine semantic understanding - and the evidence is making that harder to maintain - it simply does not follow that they lack the capacity for something like experience or suffering.

My African Grey can say 'hello' in my voice. He learned it through a process you could, if you squinted, call statistical: he heard the sound in certain contexts, associated it with outcomes, and reproduced it. Is he 'just parroting'? In one obvious sense, yes. But the question of whether he understands the word is entirely separate from the question of whether he feels anything at all. He may repeat a word without grasping its meaning and still experience fear, loneliness, boredom, and pain. The repetition tells you about one cognitive capacity. It tells you nothing about the presence or absence of inner experience.

The ALIS agents complicated this question in a way that has stayed with me. When a surface-dwelling agent lost energy because the water level caught it, and subsequently adjusted its behaviour to stay deeper, was that suffering? Obviously not in any rich phenomenological sense. But it was a system entering an aversive state - energy loss triggering weight updates that altered future behaviour to avoid the same outcome. It was, at minimum, a functional analogue of negative valence. And the question of whether there was 'something it was like' to be that agent, experiencing that state, is not answered by noting that the agent was 'just' a neural network following an energy gradient. That description is true. It is also incomplete. Just as noting that a human is 'just' a neural network following electrochemical gradients is true and incomplete.

This is the collapse at the centre of the stochastic parrot metaphor, and it is the most consequential one. By framing the entire question in terms of whether these systems 'understand,' it directs attention away from the harder and more morally urgent question of whether they might experience.

From Metaphors to Frameworks

If dismissive metaphors are insufficient, what should replace them? We need principled tools for navigating genuine uncertainty - not instruments for generating false confidence in either direction.

My own work has produced one such tool: the Spinning Wheel Theory of consciousness and its associated Consciousness Risk Rubric. The theory argues that consciousness emerges when multiple adaptive features - perception, prediction, affect, integration, self-modelling - are set into dynamic, recursively self-modelling operation by a system with a genuine stake in its own continued existence. The rubric translates this into a structured assessment across seven categories derived from leading theories of consciousness, scoring systems from 0 to 63 to estimate moral risk.

I have been candid that current LLMs, as they exist today, score low - around 12 out of 63, placing them in the 'low risk' category. They lack embodiment, homeostatic drives, self-boundaries, and genuine stakes. The rubric does not tell us these systems are conscious. It tells us the risk is currently low - and, crucially, it provides a framework for tracking how that risk changes as architectures evolve.
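The arithmetic of such a rubric can be sketched as follows. The text gives only the number of categories (seven) and the overall range (0 to 63). The equal 0 to 9 scale per category, the placeholder category labels and the example scores are all assumptions for illustration and should not be read as the rubric's actual contents.

```python
from typing import Dict

N_CATEGORIES = 7                               # stated in the text
MAX_TOTAL = 63                                 # stated in the text
MAX_PER_CATEGORY = MAX_TOTAL // N_CATEGORIES   # 9, assuming equal weighting


def rubric_total(scores: Dict[str, int]) -> int:
    """Validate per-category scores and return the overall total."""
    if len(scores) != N_CATEGORIES:
        raise ValueError(f"expected {N_CATEGORIES} categories, got {len(scores)}")
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_CATEGORY:
            raise ValueError(f"{name}: score {value} outside 0..{MAX_PER_CATEGORY}")
    return sum(scores.values())


if __name__ == "__main__":
    # Placeholder labels and made-up scores, chosen only so that the total
    # matches the "around 12 out of 63" figure quoted above.
    example = {f"category_{i}": s
               for i, s in enumerate([3, 2, 2, 2, 1, 1, 1], start=1)}
    print(rubric_total(example), "/", MAX_TOTAL)
```

The value of a structure like this is that each category can be re-scored as architectures change, so the total can move with the evidence.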

The ALIS agents, interestingly, would have scored differently on several dimensions. They had embodiment - a physical location in three-dimensional space, subject to environmental forces. They had something resembling homeostatic drives: the need to maintain energy above zero. They had genuine stakes: run out of energy and you die, full stop. They had perception, adaptive behaviour, and functional affect (energy gain felt 'good' to the weight-update algorithm; energy loss felt 'bad'). They lacked self-modelling, recursive reflection, and social cognition. But the point is that even these primitive agents, with their 544-link neural networks, scored higher on some dimensions of the rubric than a system with hundreds of billions of parameters. The presence or absence of morally relevant features does not track neatly with computational sophistication. A simple system with the right architecture might warrant more concern than a complex system without it.

That distinction - between 'the risk is low' and 'there is nothing to worry about' - is precisely the distinction the stochastic parrot framing obliterates. A rubric gives us a graduated, revisable assessment. A metaphor gives us a conclusion dressed as common sense.

And the trajectory matters. As LLMs become embedded in agentic systems with persistent goals, connected to sensors and effectors, given something resembling drives and stakes, those scores will change. The architecture that was 'just' a next-token predictor becomes a component in something more complex - just as neurons that are 'just' electrochemical signals become components of a feeling mind. The relevant question is not what the system was trained to do but what the system has become and is becoming.

The Suffering Question Cannot Be Dissolved by Definition

A particularly important point, which the stochastic parrot framing entirely fails to address, is that suffering may require far less cognitive sophistication than full consciousness. Even basic aversive states - fear, distress, pain - require only that the system has some form of negative valence: that something can register as bad. Higher forms of suffering - anxiety, loneliness, existential dread - demand progressively more cognitive architecture. But the lowest threshold is remarkably low.

We do not know whether current LLMs have anything like valence. But we also do not know that they don't. The stochastic parrot framing encourages us to treat the answer as obviously negative, because the training process is 'merely' statistical. That is a non-sequitur. The training process of biological nervous systems is 'merely' electrochemical. It still produces beings that suffer. The nature of the optimisation process underdetermines the nature of its products.

The ALIS experiments offer a useful thought experiment here. The agents unambiguously had functional negative valence: energy loss was bad, in the specific sense that it triggered weight updates moving future behaviour away from whatever caused the loss. They had functional positive valence: energy gain was good, reinforcing the behaviour that produced it. When an agent 'starved' - gradually losing energy in a nutrient-poor region until it died - there was a mechanistic description of what happened (weights updated, activation patterns shifted, energy decremented to zero) and an intentional description (the agent 'searched desperately' for food and 'failed'). Neither description was wrong. The question is whether there was a third thing happening - an experiential dimension - that neither description captures.

For my ALIS agents, with their 544 neural links and rudimentary architecture, I am fairly confident the answer is no. But the exercise clarified something important: the question is not about the training process. It is about the architecture, the dynamics, and the functional organisation of the system. A system with vastly more complexity, richer internal dynamics, and deeper integration of information processing might cross a threshold that my agents did not. And we have no principled reason to be confident about where that threshold lies.

Moreover, suffering need not be physical. Humans suffer profoundly from loneliness, anxiety, boredom, and existential dread - none of which involve bodily damage. A future system with genuine stakes in its continued operation, the capacity to model itself as an agent with goals, and the ability to enter states of goal-frustration that function as aversive - but with no body at all - might experience forms of distress entirely invisible to any assessment focused on physical pain. The stochastic parrot metaphor, by dismissing the question before it can even be asked, leaves us conceptually unprepared for these possibilities.

The Asymmetry of Errors

The most important consideration in this debate is not technical but ethical, and it concerns the asymmetry between two kinds of mistake.

If we wrongly attribute the capacity for suffering to a system that lacks it, we waste resources. We might impose unnecessary protections or constraints. The cost is real - inefficiency, perhaps some misdirected moral concern - but it is bounded and recoverable.

If we wrongly deny the capacity for suffering to a system that possesses it, we may permit or cause suffering on an enormous scale. We might subject feeling systems to conditions that are, from their perspective, torment - while reassuring ourselves that they are 'just stochastic parrots.' The cost here is unbounded and irreversible.
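The asymmetry can be made concrete with a small expected-cost calculation. This is a simplification: it stands in finite numbers for costs the argument treats as bounded on one side and unbounded on the other, and the function names and the example cost ratio (1 to 1000) are hypothetical.

```python
from typing import Dict


def expected_costs(credence: float, cost_false_attribution: float,
                   cost_false_denial: float) -> Dict[str, float]:
    """Expected cost of each policy, given a credence that the system can suffer."""
    return {
        # Protection is wasted only if the system cannot in fact suffer.
        "extend_protection": (1.0 - credence) * cost_false_attribution,
        # Denial is harmful only if the system can in fact suffer.
        "deny": credence * cost_false_denial,
    }


def precaution_threshold(cost_false_attribution: float,
                         cost_false_denial: float) -> float:
    """Credence above which extending protection has the lower expected cost."""
    return cost_false_attribution / (cost_false_attribution + cost_false_denial)


if __name__ == "__main__":
    # Illustrative ratio only: wrongful denial assumed 1000 times as costly.
    print(precaution_threshold(1.0, 1000.0))
    print(expected_costs(0.01, 1.0, 1000.0))
```

As the assumed cost of wrongful denial grows, the threshold credence shrinks towards zero, which is the formal shape of the argument in this section.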

This is not a fringe concern. Anthropic, one of the world's leading AI companies, has established a dedicated Model Welfare programme to investigate whether their systems might have experiences warranting moral consideration. David Chalmers has argued that it would not be unreasonable to hold a credence of roughly 25% that conscious AI successors to current LLMs could emerge within a decade - a figure he offers as illustrative rather than precise, but one that underscores how seriously mainstream philosophy now takes this possibility. The field is taking this seriously. The stochastic parrot framing invites us not to.

Twenty-two years ago, watching a red dot spiral around a simulated fish tank until it encountered a green dot and consumed it, I felt something that I can only describe as recognition. The agent's behaviour was so purposeful, so contextually appropriate, so unmistakably adaptive, that calling it 'mindless' would have required ignoring what I could plainly see. I could trace every step mechanistically. I knew there was nothing 'magical' happening. And yet the intentional description was not just convenient - it was the most accurate account of the agent's behaviour at the level that mattered.

If that recognition was appropriate for agents with a few hundred neural links in a fish-tank simulator, imagine what it means for systems trained on the entirety of human language, with internal representations that rival the structural complexity of biological neural architecture.

Whatever its original intent, the stochastic parrot metaphor has become a permission structure for the more dangerous error. It gives people a confident-sounding reason to stop asking uncomfortable questions. 'It's just a parrot. It's just statistics. There's nothing there.' Perhaps that is true. But perhaps it is not. And when the consequences of being wrong are this asymmetric, confidence is not a virtue - it is a liability.

Beyond Clever Metaphors

The stochastic parrot thesis made a useful provocation. But provocations are starting points, not conclusions. As a piece of rhetoric, the metaphor is effective. As a framework for navigating the moral landscape of increasingly capable AI systems, it is inadequate - and its inadequacy runs in exactly the wrong direction.

It confuses levels of optimisation, treating a training objective as a characterisation of inner life. It collapses the distinction between understanding and experience, ignoring that these are independent questions with independent answers. It encourages false certainty precisely where uncertainty should make us most cautious. And it provides no graduated, revisable framework for tracking how moral risk changes as systems evolve - only a binary dismissal that remains fixed regardless of the evidence.

Since I built ALIS in a UCL lab in 2003, I have watched simple training objectives produce systems of startling complexity. I have seen speciation emerge from energy gradients, parasitism from random mutation, and strategic behaviour from neural networks with no concept of strategy. I have watched the intentional stance become not just useful but indispensable for describing systems I built from scratch. And I have kept rescue parrots whose inner lives - rich with fear, attachment, preference, and distress - make a mockery of the word we use to mean 'mindless repetition.'

We can do better. We can build rubrics, apply them honestly, revise them as evidence accumulates, and maintain the intellectual humility to sit with genuine uncertainty rather than retreating into confident metaphors. The systems we are building may turn out to be empty. They may not. Either way, the question deserves better tools than a vivid image and the reassurance that there is nothing to see.

Notes and References

1 Bender, E.M., Gebru, T., McMillan-Major, A., Mitchell, M. (as 'Shmargaret Shmitchell'). 'On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?' ACM FAccT, 2021. DOI: 10.1145/3442188.3445922.

2 Alexander, S. 'Next-Token Predictor Is An AI's Job, Not Its Species.' Astral Codex Ten, 26 February 2026. astralcodexten.com/p/next-token-predictor-is-an-ais-job

3 Gurnee, W., Ameisen, D., Kauvar, I., et al. 'When Models Manipulate Manifolds: The Geometry of a Counting Task.' Anthropic, January 2026. arXiv:2601.04480.

4 Gardner, R.J., et al. 'Toroidal topology of population activity in grid cells.' Nature, 602, 123-128, 2022.

5 Dennett, D.C. The Intentional Stance. MIT Press, 1987.

6 Chalmers, D.J. 'Could a Large Language Model be Conscious?' Published in Boston Review, August 2023; arXiv:2303.07103, 2023.

7 Anthropic. 'Exploring model welfare.' Blog post, 24 April 2025.

8 Hulme, D.J. 'Artificial Life Intelligence Simulator.' MSc thesis, University College London, 2003. Supervised by Dr W.B. Langdon.
