Fragments

Thoughts as they occur to me.

Unnecessary Knowledge

keep it lean

    From Sherlock Holmes:

    "His ignorance was as remarkable as his knowledge. Of contemporary literature, philosophy and politics he appeared to know next to nothing. Upon my quoting Thomas Carlyle, he inquired in the naivest way who he might be and what he had done. My surprise reached a climax, however, when I found incidentally that he was ignorant of the Copernican Theory and of the composition of the Solar System. That any civilized human being in this nineteenth century should not be aware that the earth travelled round the sun appeared to be to me such an extraordinary fact that I could hardly realize it.

    “You appear to be astonished,” he said, smiling at my expression of surprise. “Now that I do know it I shall do my best to forget it.”

    “To forget it!”

    “You see,” he explained, “I consider that a man’s brain originally is like a little empty attic, and you have to stock it with such furniture as you choose. A fool takes in all the lumber of every sort that he comes across, so that the knowledge which might be useful to him gets crowded out, or at best is jumbled up with a lot of other things so that he has a difficulty in laying his hands upon it. Now the skillful workman is very careful indeed as to what he takes into his brain-attic. He will have nothing but the tools which may help him in doing his work, but of these he has a large assortment, and all in the most perfect order. It is a mistake to think that that little room has elastic walls and can distend to any extent. Depend upon it there comes a time when for every addition of knowledge you forget something that you knew before. It is of the highest importance, therefore, not to have useless facts elbowing out the useful ones.”

    “But the Solar System!” I protested.

    “What the deuce is it to me?” he interrupted impatiently; “you say that we go round the sun. If we went round the moon it would not make a pennyworth of difference to me or to my work.”

    The Focus AI

    I’m back

      I’ve recently started up a company called The Focus AI.

      I’m going to keep writing up things here as I find them, and then cross posting them as it makes sense. Think of what’s happening over there as part of the new mailing list.



      Quality Code Swearing

      more profanity the better

        One of the most fundamental unanswered questions that has been bothering mankind during the Anthropocene is whether the use of swearwords in open source code is positively or negatively correlated with source code quality. To investigate this profound matter we crawled and analysed over 3800 C open source code containing English swearwords and over 7600 C open source code not containing swearwords from GitHub. Subsequently, we quantified the adherence of these two distinct sets of source code to coding standards, which we deploy as a proxy for source code quality via the SoftWipe tool developed in our group. We find that open source code containing swearwords exhibit significantly better code quality than those not containing swearwords under several statistical tests. We hypothesise that the use of swearwords constitutes an indicator of a profound emotional involvement of the programmer with the code and its inherent complexities, thus yielding better code based on a thorough, critical, and dialectic code analysis process.

        Bachelor’s Thesis of Jan Strehmel

        Vibe check

        who needs science

        On the web I started with chatgpt, and it turns out that it makes more sense than claude or gemini.

        Though gemini is way faster.

        In emacs I started with Zephyr, and you know? I think it just does the best with defining words. (Which is what I use the most in emacs)

        With coding, I started with claude in cursor, and man it really kills it. I switched to chatgpt and you know it just wasn't the same.

        So, basically, you just like the thing you used first and there's no objective measure to anything.

        The raven

          i heart ruby

            JavaScript has won, so of course I've been moving more into TypeScript and JavaScript. And these last few weeks, as I've been going deeper into the world of AI and LLMs, I've been dipping into the Python ecosystem. And… I'm not convinced.

            Obviously you need to go to where the libraries are but ruby is still the most delightful.

            rust to wasm to javascript

            I was poking around the implementation of the obsidian-extract-url plugin, and it's written in Rust but compiled and run as WASM inside the Obsidian plugin environment.

            Novel use case for WASM.
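
            Roughly the shape of the pattern, as a sketch of my own (not the plugin's actual code): the Rust side compiles to a .wasm binary, and the TypeScript side of the plugin loads the bytes with the standard WebAssembly API and calls the exports. The embedded base64 constant and the add export are made up for illustration; real string passing goes through wasm-bindgen glue.

              import { Plugin } from "obsidian";

              // Stand-in for the compiled Rust binary; bundlers often inline it as base64.
              declare const WASM_BASE64: string;

              export default class WasmBackedPlugin extends Plugin {
                async onload() {
                  // Decode the embedded bytes and instantiate the module.
                  const bytes = Uint8Array.from(atob(WASM_BASE64), (c) => c.charCodeAt(0));
                  const { instance } = await WebAssembly.instantiate(bytes, {});

                  // Call a hypothetical numeric export; the real plugin's URL-extraction
                  // functions would go through wasm-bindgen generated glue instead.
                  const add = instance.exports.add as (a: number, b: number) => number;
                  console.log("wasm says:", add(2, 3));
                }
              }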

            Coding in one file

            Tailwind and Server Actions

            One file that contains design, layout, code, and remote server code.

            I'm not totally sure how to debug server actions, but it's a whole bunch of functionality in one place.

            No shifting between different files, just doing what you set out to do in one context.
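
            A minimal sketch of the idea, assuming a Next.js App Router setup (the note form and the console.log are placeholders for real markup and a real database write): Tailwind classes for the design, JSX for the layout, and an inline "use server" function for the remote code, all in the same file.

              export default function NotePage() {
                // Remote server code, declared inline in the same file as the UI.
                async function saveNote(formData: FormData) {
                  "use server";
                  console.log("would persist note:", formData.get("text")); // stand-in for a real DB write
                }

                // Layout and design: JSX plus Tailwind utility classes.
                return (
                  <form action={saveNote} className="mx-auto max-w-md space-y-2 p-4">
                    <textarea
                      name="text"
                      className="w-full rounded border p-2"
                      placeholder="Write a note…"
                    />
                    <button className="rounded bg-blue-600 px-3 py-1 text-white">Save</button>
                  </form>
                );
              }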

            Thoughts on reading the llama 3.1 paper

            I read through the llama 3 paper. Some random thoughts:

            The big model performs more or less as well as the other major models (GPT, Gemini, and Claude), but you can pull it down and fine tune it for your needs. This is a remarkable move, I assume to undermine the competitive advantage of the big AI companies. It means that you don't need 10 billion to enter the AI race in a deep way.

            It took 54 days running on 16,000 H100s. That is a lot of compute.

            During training, tens of thousands of GPUs may increase or decrease power consumption at the same time, for example, due to all GPUs waiting for checkpointing or collective communications to finish, or the startup or shutdown of the entire training job. When this happens, it can result in instant fluctuations of power consumption across the data center on the order of tens of megawatts, stretching the limits of the power grid. This is an ongoing challenge for us as we scale training for future, even larger Llama models.
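
            Some back-of-the-envelope arithmetic on those figures, assuming roughly 700 W per GPU (about an H100's TDP – my assumption, not a number from the paper):

              // 54 days on ~16,000 GPUs, with an assumed ~700 W per GPU.
              const gpus = 16_000;
              const days = 54;
              const gpuHours = gpus * days * 24;      // ≈ 20.7 million GPU-hours
              const gpuPowerMW = (gpus * 700) / 1e6;  // ≈ 11 MW for the GPUs alone
              // Synchronized swings at that scale are why the paper talks about
              // fluctuations "on the order of tens of megawatts".
              console.log({ gpuHours, gpuPowerMW });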

            Moving data around, both training data and intermediate training checkpoints, required a huge amount of engineering work. The Meta infrastructure – even outside of the compute stuff – was instrumental to this effort.

            One interesting observation is the impact of environmental factors on training performance at scale. For Llama 3 405B, we noted a diurnal 1-2% throughput variation based on time-of-day. This fluctuation is the result of higher mid-day temperatures impacting GPU dynamic voltage and frequency scaling.

            Sourcing quality input data seemed like it was all cobbled together. There was a bunch of work to pull data out of webpages.

            It's mostly trained on English input, and then a much smaller fraction of other languages. I would imagine that quality in English is much higher, and people who use the models in other languages would be at a disadvantage.

            It filtered out stuff I'd expect, like how to make a bomb or create a bioweapon, but I was surprised that it filtered out "sexual content" which it labeled under "adult content". So if sexuality is part of your life, don't expect the models to know anything about it.

            There's the general pre-training model, which was fed a sort of mishmash of data. "Better quality input", whatever that objectively means at this sort of scale.

            Post-training is basically taking a whole bunch of expert human-produced data and making sure that the models answer in that sort of way. So the knowledge, and whatever else is embedded, is sort of forced into it at that stage.

            Pre-training then is like putting in the full corpus of how language works and the concepts that our languages have embedded. This is interesting in itself because it represents how we model the world in our communication, though while it's fully capable of spitting out coherent bullshit, it doesn't really have any of the "understanding of experts" that would differentiate knowing what you are talking about.

            The post-training is to put in capabilities that are actually useful – both in terms of elevating accepted knowledge and in adding other capabilities like tool use. This sort of tuning seems like cheating, or at least a very pragmatic engineering method that "gets the model to produce the types of answers we want".

            The obvious thing is the -instruct variation, which adds things like "system prompt" and "agent" and "user", so you can layer on the chat interface that everyone knows and loves. But tool use and code generation – it can spit out python code for evaluation when it needs a quantitative answer – are also part of that. I believe that this sort of post-training is of a different kind than the "process all of the words so I understand embedded conceptions in linguistic communication".
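
            As a concrete example of that layering, here's roughly what the -instruct chat template amounts to: special tokens wrapped around each role and its text. The token names below follow my reading of the Llama 3 instruct format; check the model card for the exact template before relying on it.

              type Turn = { role: "system" | "user" | "assistant"; content: string };

              // Flatten chat turns into the single string of tokens the model actually sees.
              function formatLlama3Prompt(turns: Turn[]): string {
                const body = turns
                  .map((t) => `<|start_header_id|>${t.role}<|end_header_id|>\n\n${t.content}<|eot_id|>`)
                  .join("");
                // The trailing assistant header asks the model to continue as the assistant.
                return `<|begin_of_text|>${body}<|start_header_id|>assistant<|end_header_id|>\n\n`;
              }

              console.log(
                formatLlama3Prompt([
                  { role: "system", content: "You are a helpful assistant." },
                  { role: "user", content: "What is 2 + 2?" },
                ])
              );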

            The paper is also a sort of blueprint of what you'd need to do if you wanted to make your own foundation model. They didn't necessarily use the most advanced techniques – preferring to push the envelope on data quality and training time – but the results work and are, I suppose, in tune with the general "more data less clever" idea in AI.

            The methodology of training these things is probably well known by the experts out there, but if it was obfuscated knowledge before, it no longer is.

            Vacation Book Reading

              Spent a few weeks in Spain, managed to get some good reading in!

              The Gutenberg Parenthesis: The Age of Print and Its Lessons for the Age of the Internet (2023)

              Hard to get into, but once you are there it's worth the read. The biggest takeaway for me was the idea of "the mass" as being created by the medium, so people reading (say) twitter all of a sudden become this group, which doesn't really have an existence, but, like Santa Claus, changes everyone's reality.

              Tomorrow, and Tomorrow, and Tomorrow (2022)

              Expected very little of this, got much more than expected.

              A Stainless Steel Trio: A Stainless Steel Rat Is Born/The Stainless Steel Rat Gets Drafted/The Stainless Steel Rat Sings the Blues (1985)

              It's funny to reread these books when you are older and get all of the references. Really holds up, both as a satire on the Heinlein adventure-boy genre and as an interesting discussion on how to fit in between places. It's a very impressive feat to put so much philosophy and sociology in a page-turning absurdist caper plot that keeps 12-year-olds' attention.

              The Latchkey Murders (2015)

              Since Ksenia is Russian, I have a new appreciation of the Russians and Soviets. Not totally satisfying as a mystery, but I felt like I got a glimpse into Moscow in the 60s.

              Server Driven Webapps with HTMX (2024)

              As far as server-side JavaScript frameworks go, this is exceedingly clever, but I'm not sure that it really makes things simpler.

              Wintersmith (2015)

              Ah… the English and their disdain for the wrong sort of hegemonic thought. In many ways the same sort of book as the Stainless Steel Rat – satire and commentary under the guise of silliness, a way to subvert harmlessly – but ultimately just one thing after another. Fun if you are in the right sort of mood.

              Cynicism and Magic: Intelligence and Intuition on the Buddhist Path (2021)

              I can't possibly do it much justice. The smallest book that took the longest to read. Very interesting to see the early ways and hows of how Buddhism got into the States, and also so many important reminders.

              Deja Dead (2007)

              What's funny about this book is that it spawned an empire, and reading just the first one you'd never really expect it. Also a bit of time travel here since it's set in the pre-cell-phone days and it's hard to find anyone.

              Surfing the Internet (1995)

              This book was amazing – both obscure and also exactly of my world. I knew all of the references and all of the things they talked about, and I had forgotten so much. The interview with what we would now call an incel was also interesting, from before the time when the sadness had turned into hostility towards these losers. I very much respected this author and tracked down a whole bunch of other stuff she wrote.

              The Enigma of Room 622 (2023)

              An absurd story inside of a dumb story wrapped in a magical outer story that brought it all home at the end. I do want to read more of his work, but not in a rush.

              Narrative of Arthur Gordon Pym of Nantucket (1838)

              Poe invented nearly everything! The so-called ending of this book is infuriating, but the ripples of it have influenced so much. It's the sort of book that's a key to understanding a whole bunch of other books, so necessary in a complete-your-education sort of way, but without the context it's a bit strange.

              The Prisoner of Heaven (2013)

              I had never heard of Carlos Ruiz Zafón before, but picked it up in a small English section of a Benidorm bookstore; very clever, moody, and oddly soothing. Will go through the rest of his oeuvre.

              Homage to Catalonia (1938)

              Reading Orwell after reading the Gutenberg Parenthesis and reflecting on Foucault's 40 years was a tremendous experience; I look at intellectual efforts completely differently now, and honestly feel better about the state of the world than I did before. With the Supreme Court making kings and the farce of the elections, it's not getting worse; it's actually the same as it ever was, and we were just fed a load of liberal democracy nonsense all this time – it was capital and power all along.