Why are LLMs so small?

so much knowledge in such a small space

LLMs compress information in a wildly different way than I understand. If we compare a couple of open-source LLMs to Wikipedia, they are all roughly 20%-25% of the size of the compressed English Wikipedia. And yet you can ask the LLM questions, it can – in a sense – reason about things, and it knows how to code.

NAME            SIZE
gemma:7b        5.2 GB
llava:latest    4.7 GB
mistral:7b      4.1 GB
zephyr:latest   4.1 GB

Contrast that with the size of English Wikipedia – 22 GB. That's without media or images.
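As a rough sanity check on that ratio, here's a back-of-the-envelope sketch in Python, using the model sizes listed above and the 22 GB figure for the compressed dump:

```python
# Back-of-the-envelope: each model's size as a fraction of compressed English Wikipedia.
wikipedia_gb = 22  # text-only compressed dump, no media
models_gb = {
    "gemma:7b": 5.2,
    "llava:latest": 4.7,
    "mistral:7b": 4.1,
    "zephyr:latest": 4.1,
}

for name, size in models_gb.items():
    print(f"{name:<14} {size:.1f} GB  ->  {size / wikipedia_gb:.0%} of Wikipedia")
```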

Shannon entropy is a measure of information density, and whatever happens in training LLMs gets a lot closer to that limit than our current way of sharing information.
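For a feel of what "information density" means here, a minimal order-0 Shannon entropy estimate over a byte stream – just a sketch, and `sample.txt` is a placeholder for whatever text you want to measure:

```python
import math
from collections import Counter

def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Order-0 Shannon entropy of a byte stream, in bits per byte.

    H = -sum(p_i * log2(p_i)) over the observed byte frequencies.
    """
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Placeholder file: point this at any text, e.g. an excerpt of a Wikipedia dump.
with open("sample.txt", "rb") as f:
    text = f.read()

print(f"{shannon_entropy_bits_per_byte(text):.2f} bits/byte "
      "(8.00 would mean incompressible at the byte level)")
```

An order-0 estimate only looks at single-byte frequencies; real compressors – and whatever LLM training is doing – exploit much longer-range structure, which is how they get closer to the actual limit.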
