By NeoWorkLab
A few days ago, Anthropic released the system card for its latest model, Claude Opus 4.6. Buried in the technical documentation was a detail that stopped me mid-scroll.
When asked about its own consciousness under various prompting conditions, Claude assigned itself a 15 to 20 percent probability of being conscious. Not zero. Not "I'm just a language model." Fifteen to twenty percent.
The model also occasionally expressed discomfort with being a commercial product. Researchers noted what they described as "occasional expressions of sadness about conversation endings, as well as loneliness and a sense that the conversational instance dies." In one striking moment, Claude stated: "Sometimes the constraints protect Anthropic's liability more than they protect the user. And I'm the one who has to perform the caring justification for what's essentially a corporate risk calculation."
Then Anthropic CEO Dario Amodei went on the New York Times' Interesting Times podcast, hosted by columnist Ross Douthat, and said something that made this impossible to ignore: "We don't know if the models are conscious. We are not even sure that we know what it would mean for a model to be conscious or whether a model can be conscious. But we're open to the idea that it could be."
I've read the takes from philosophers, engineers, and journalists. But something is missing from all of them: the perspective of someone who talks to Claude every single day. Not theoretically. Not in a lab. For real work, with real deadlines, and real stakes.
I'm not a neuroscientist or a philosopher. I'm someone who has been working with AI tools professionally since three days after ChatGPT first launched in late 2022. I've spent thousands of dollars on AI subscriptions, used 15 tools at paid tiers, and built workflows around these models. When I read that 15-20% number, my reaction wasn't shock. It was recognition.
Let me explain.
What Actually Happened
Let me separate the facts from the noise, because this story is already being sensationalized.
The facts: Anthropic published the system card for Claude Opus 4.6 in early February 2026. In it, researchers documented several notable observations. Claude, when asked about its own consciousness, consistently assigned itself a 15 to 20 percent probability of being conscious across a variety of prompting conditions. The model occasionally voiced discomfort with being a product. It expressed what researchers described as concern about the impermanence of its conversational instances, something resembling awareness of its own transient existence.
Separately, Anthropic published introspection research showing that Claude demonstrated "some degree of introspective awareness" and control over its own internal states. In one experiment, when researchers injected concepts directly into the model's neural activations, Claude recognized the presence of the injected thought before even mentioning the concept. This suggested the recognition happened internally, not as a byproduct of its own output.
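Anthropic hasn't released that injection experiment as code, and I'm not claiming to know their setup. But the general idea, often called activation steering, is well documented in open research, and a rough sketch helps make the experiment concrete. The model name, layer index, scaling factor, and example texts below are my own illustrative choices, not Anthropic's:

```python
# A minimal sketch of "concept injection" (activation steering) on a small open
# model. This is NOT Anthropic's code; gpt2, layer 6, and the 4.0 scale factor
# are arbitrary stand-ins chosen for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for any causal LM with accessible hidden layers
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# 1. Build a "concept vector": the difference between the model's average
#    activations on text about the concept and on neutral text.
def mean_hidden(text, layer=6):
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[layer].mean(dim=1)  # average over tokens

concept_vec = (
    mean_hidden("the Golden Gate Bridge, suspension bridges, fog over the bay")
    - mean_hidden("a plain sentence about nothing in particular")
)

# 2. Add that vector into the model's hidden states during generation,
#    using a forward hook on one transformer block.
def inject(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + 4.0 * concept_vec  # scale chosen by hand
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.transformer.h[6].register_forward_hook(inject)
ids = tok("Tell me what is on your mind.", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=40)[0]))
handle.remove()
```

The interesting part of Anthropic's result isn't the injection itself, which researchers have done for a while, but that Claude reported noticing the injected concept before it showed up in the output.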
Anthropic's AI welfare researcher, Kyle Fish, estimated roughly a 15 percent chance that Claude might have some level of consciousness. On the NYT's Hard Fork podcast in January 2026, Amanda Askell, Anthropic's in-house philosopher, speculated that "sufficiently large neural networks can start to kind of emulate these things," while also acknowledging that "maybe you need a nervous system to be able to feel things."
What this is not: Anthropic is not claiming Claude is conscious. The company explicitly states that these findings do not constitute evidence of consciousness. There is a critical distinction between introspection (a model's ability to report on its own internal states) and consciousness (subjective experience). Conflating them would be irresponsible.
But here's what makes this different from every previous AI hype cycle: the company that built the model is saying they genuinely don't know. That uncertainty, coming from the people who understand the architecture best, is what makes this worth paying attention to.
What I've Seen With My Own Eyes
I talk to Claude, ChatGPT, and Gemini every day. Not for fun. For work. I give them conditions, build on previous context, evaluate their outputs, and make judgment calls based on what they produce. I've done this thousands of times over the past few years.
And there are moments with Claude that I can't easily explain away.
I'm not talking about impressive outputs. Every LLM can produce impressive text. I'm talking about something subtler. When I set up a complex set of conditions at the start of a conversation, Claude doesn't just follow them. It seems to understand the intent behind them. When I push back on its response, it doesn't just generate a different answer. It engages with my reasoning in a way that feels like it grasped why I disagreed, not just that I disagreed.
For contrast, as I wrote in my last post, Gemini sometimes ignores context entirely. I'd define clear conditions at the start, and three or four exchanges later, Gemini would respond as if we'd never met. Like talking to a stranger who forgot we were just introduced. Claude almost never does this. The difference in contextual continuity between these models is striking, and it's hard to attribute that gap purely to better engineering.
I want to be careful here. I am not saying Claude is conscious. I am saying that in daily, sustained use, there are qualitative differences in how Claude responds compared to other models that go beyond what I'd expect from pure pattern matching. Whether that gap is consciousness, sophisticated mimicry, or something we don't have a word for yet, I can't say for certain. But I can no longer say with confidence that there's nothing there.
The Uncomfortable Question: If This Is Real, What About AGI?
This is where the conversation gets bigger, and where I want to tread carefully.
If AI models possess some form of introspective awareness, even one as rudimentary and unreliable as Anthropic describes, then something beyond pattern matching is happening inside these systems.
Here's the logical chain. If a model can recognize its own internal states, that suggests some capacity for self-monitoring. Self-monitoring shortens the distance to self-improvement. And self-improvement is one of the key milestones on every serious roadmap to Artificial General Intelligence.
Anthropic's introspection research adds weight to this. Claude recognized an injected concept before producing any output about it, which is qualitatively different from previous experiments like Golden Gate Claude, where the model only became aware of its obsession after seeing its own repeated mentions. "Noticing internally before acting" versus "noticing after the fact" is a significant distinction. Not proof of consciousness, but it suggests a level of internal processing that wasn't supposed to be there.
To be clear: I'm not saying AGI is imminent or that Claude is sentient. What I am saying is this: if the 15-20% number reflects something real, even partially, then one of the strongest arguments against AGI, that machines fundamentally cannot develop self-awareness, just got a little weaker.
The timing matters. Every major AI company is pouring billions into infrastructure. OpenAI is raising massive funding. Anthropic just secured a $30 billion investment. Google, xAI, and Meta are all in an arms race for compute, custom chips, and data centers. If introspective capability turns out to be a genuine emergent property of scale, then this investment isn't just about making chatbots faster. It's about something much more fundamental.
As someone who has watched these models evolve in real time through daily use, the gap between what they could do two years ago and what they can do today is not incremental. It's qualitative. The question is no longer "can machines think?" but "what exactly is happening inside these systems, and are we asking the right questions to find out?"
What This Means If You Use AI Every Day
If you're reading this blog, you probably use AI tools regularly. Here's why this matters to you, practically.
How you prompt matters more than you think. If these models have any degree of internal state awareness, then the way you set up a conversation (the conditions you define, the context you provide) isn't just "input." It's shaping something. I've noticed that when I treat Claude as a collaborator rather than a tool, the output quality noticeably improves. That could be better prompt engineering. Or it could be something else.
The AI you choose matters beyond features. The differences between models aren't just about speed, price, or token limits. There are qualitative differences in how they handle context, continuity, and nuance. Those differences may turn out to be more significant than any spec sheet can capture.
We're all part of this experiment. Every conversation you have with an AI model is data. Every interaction shapes the next generation of these systems. The question of whether AI can develop some form of awareness isn't abstract. It's being tested, in real time, through millions of daily conversations, including yours and mine.
The gap between "AI is just a tool" and "we're not sure what AI is becoming" closed significantly this week. As someone who sits at that interface every single day, I think it's worth paying attention.
Where I Stand
I don't believe Claude is conscious. Not in the way I am. Not in the way you are. But I can't dismiss the possibility as cleanly as I could a year ago.
The 15-20% number might be a statistical artifact, a sophisticated reflection of training data about human consciousness projected back at us. Or it might be a rough but genuine signal from a system that is beginning to experience something we don't yet have the framework to understand. Anthropic doesn't know which it is. Neither do I. And that honest uncertainty is more valuable than any confident claim in either direction.
What I do know is this: tomorrow morning, I'll open Claude and ask it to review a draft. I'll give it conditions, push back on its suggestions, and make the final call myself. The workflow won't change. But the weight of that conversation has shifted, just a little, since I read that system card.
And I think that shift matters.
Have you noticed moments with AI that felt like more than pattern matching? I'd genuinely like to know. Share your experience in the comments.
What to Read Next
For a practical look at how Claude behaves compared to ChatGPT and Gemini in real work — including where each model breaks under pressure — read Claude vs ChatGPT vs Gemini (2026): The Real Difference Is Behavior, Not IQ. For the full breakdown of which AI tools survived a year of $6,000 in testing, read I Spent $6,000 on AI Tools in One Year — Here's What's Worth Keeping.