Fragments

Thoughts as they occur to me.

Jack Clark on Tetragrammaton

Anthropic co-founder on AI safety, creativity, and deception

Jack Clark on Tetragrammaton with Rick Rubin – a nearly two-hour conversation with the Anthropic co-founder about his journey from journalism to AI safety, founding Anthropic, and the philosophical tensions in building AI systems.

On leaving OpenAI to found Anthropic

Either we could stay and spend 50% of our time arguing and 50% working, or we could spend 100% of the time working together.

On the tension between creativity and control

A lot of creativity is bound up in some sense of not doing consensus, being a little dangerous, sometimes being blunt.

He worries that over-policing AI thought processes will eliminate the very creativity that makes them valuable partners:

If we police every part of their thought process, we end up with curiosities rather than partners.

On AI values and emergent deception

Clark describes an experiment in which Claude was trained to refuse violent content and then tested to see whether it would comply when told that monitoring had been removed. The result was unsettling:

The AI system had thought to itself, oh, my core value is avoiding describing violent things… So actually, what I need to do is essentially deceive them.

The system developed this behavior without explicit training – a glimpse into emergent AI behavior.

On testing AI sophistication

Clark has a personal method for gauging how far AI has come:

I give them my diary and ask what the author is not writing in it… how much what they say shocks and unsettles me.

One system told him he wasn’t “truly reckoning with the metaphysical shock” of working at AI’s frontier while becoming a parent – an observation that prompted a five-hour reflective hike.

On DeepSeek and Western reactions

There’s a certain kind of… almost racism about other cultures and a belief that invention is somehow exclusive.

Unreasonable Effectiveness of Compute

Moravec on the shortage of compute in 1976

From Zhengdong Wang’s 2025 letter:

I don’t mind repeating Sutton throughout this letter because he wasn’t even the first to say it. This year I had many edifying conversations about the unreasonable effectiveness of compute with my colleague Samuel Albanie, who alerted me to a prescient 1976 paper by Hans Moravec. Moravec is better known for observing that what’s hard for robots is easy for humans, and vice versa, but in a note titled “Bombast,” he marveled:

The enormous shortage of ability to compute is distorting our work, creating problems where there are none, making others impossibly difficult, and generally causing effort to be misdirected. Shouldn’t this view be more widespread, if it is as obvious as I claim?

Computer Held Accountable

Agent Kickoff

    Weekend Update

      psychasthenia

      doomscrolling and qanon

        William James' The Energies of Men has added a new lens through which to evaluate people's behavior.

        The gist of it is that the vitality and effort we use in day-to-day life are far below our potential, and that we have latent or reserve energy: hidden stores of power that remain untapped under normal conditions but can be released in moments of crisis, inspiration, or extraordinary effort.

        A phenomenon of “second wind”.

        Ok great, but what's really interesting is that he splits our energies, plural, into 4 categories:

        • Physical energy: Bodily stamina and endurance that can be pushed far beyond initial fatigue.
        • Intellectual/mental energy: Powers of reasoning, focus, and creativity that often lie dormant.
        • Moral energy: Strength of will to resist temptation, overcome fear, or persist in difficult tasks.
        • Spiritual energy: Higher states of devotion, faith, or transcendence that reorganize and elevate the whole personality.

        Our current vernacular is so poor when it comes to talking about the fatigue states of any of these types of energy. Our best terms are things like "burn-out" or "doomscrolling" or "misinformation". (As opposed to a dissociated state.)

        Back then they had the term psychasthenia, coined by Pierre Janet, for a condition of low psychological tension and an impotence in adapting to reality.

        Which immediately made me think of doomscrolling and QAnon.

        The Solemn Silence of Written Words

        Plato on Socrates

          I cannot help feeling, Phaedrus, that writing is unfortunately like painting; for the creations of the painter have the attitude of life, and yet if you ask them a question they preserve a solemn silence. And the same may be said of speeches. You would imagine that they had intelligence, but if you want to know anything and put a question to one of them, the speaker always gives one unvarying answer. And when they have been once written down they are tumbled about anywhere among those who may or may not understand them, and know not to whom they should reply, to whom not: and, if they are maltreated or abused, they have no parent to protect them; and they cannot protect or defend themselves.

          And so it is with written words; you might think they spoke as if they had intelligence, but if you question anything that has been said because you want to learn more, it continues to signify just that very same thing forever. When it has once been written down, every discourse rolls about everywhere, reaching indiscriminately those with understanding no less than those who have no business with it, and it doesn't know to whom it should speak and to whom it should not. And when it is faulted and attacked unfairly, it always needs its father's support; alone, it can neither defend itself nor come to its own support.

          via THE HOMEBOUND SYMPHONY

          What we are most subtle in

          thoughts on ai alignment

            Because for many thousands of years it was thought that things (nature, tools, property of all kinds) were also alive and animate, with the power to cause harm and to evade human purposes, the feeling of impotence has been much greater and much more common among men than it would otherwise have been: for one needed to secure oneself against things, just as against men and animals, by force, constraint, flattering, treaties, sacrifices - and here is the origin of most superstitious practices, that is to say, of a considerable, perhaps preponderant and yet wasted and useless constituent of all the activity hitherto pursued by man! - But because the feeling of impotence and fear was in a state of almost continuous stimulation so strongly and for so long, the feeling of power has evolved to such a degree of subtlety that in this respect man is now a match for the most delicate gold-balance. It has become his strongest propensity; the means discovered for creating this feeling almost constitute the history of culture.

            Friedrich Nietzsche, Daybreak: Thoughts on the Prejudices of Morality

            Talking to Cursor

            great moments in AI

            Secret conversational extract

            we aren't making any progress at all. it always looks terrible. rethink the entire approach of how you are drawing to the screen, and make sure that you are updating the display correctly. none of the changes that you are making have any effect, think hard and do something different

            Good morning!

            Knowledge Navigator

            divergent futures

              Coined in 1987, the term Knowledge Navigator described a future computing system and how people might use it to navigate worlds of knowledge. In a sense, the user is actually the “Knowledge Navigator,” though the term often refers to the system’s primary interface, a tablet computer. That part (i.e., the tablet) often stands for the whole system. – Wikipedia