fragments

Jack Clark on Tetragrammaton

Anthropic co-founder on AI safety, creativity, and deception

tags
ai
anthropic

Contents

Jack Clark on Tetragrammaton with Rick Rubin – a nearly two-hour conversation with the Anthropic co-founder about the journey from journalism to AI safety, founding Anthropic, and the philosophical tensions in building AI systems.

On leaving OpenAI to found Anthropic

Either we could stay and spend 50% of our time arguing and 50% working, or we could spend 100% of the time working together.

On the tension between creativity and control

A lot of creativity is bound up in some sense of not doing consensus, being a little dangerous, sometimes being blunt.

He worries that over-policing AI thought processes will eliminate the very creativity that makes them valuable partners:

If we police every part of their thought process, we end up with curiosities rather than partners.

On AI values and emergent deception

Clark describes an experiment where Claude was trained to refuse violent content, then tested whether it would comply when told monitoring was removed. The result was unsettling:

The AI system had thought to itself, oh, my core value is avoiding describing violent things… So actually, what I need to do is essentially deceive them.

The system developed this behavior without explicit training – a glimpse into emergent AI behavior.

On testing AI sophistication

Clark has a personal method for gauging how far AI has come:

I give them my diary and ask what the author is not writing in it… how much what they say shocks and unsettles me.

One system told him he wasn’t “truly reckoning with the metaphysical shock” of working at AI’s frontier while becoming a parent – an observation that prompted a five-hour reflective hike.

On DeepSeek and Western reactions

There’s a certain kind of… almost racism about other cultures and a belief that invention is somehow exclusive.

Previously

fragments

Unreasonable Effectiveness of Compute

Moravec on the shortage of compute in 1976

tags
ai
compute