5 year old hacking chatgpt
LLMs are compressing information in a wildly different way than I understand. If we compare a few open source LLMs to Wikipedia, they are all only 20%–25% of the size of the compressed English Wikipedia. And yet you can ask the LLM questions, it can – in a sense – reason about things, and it knows how to code.
| NAME | SIZE |
|------|------|
| gemma:7b | 5.2 GB |
| llava:latest | 4.7 GB |
| mistral:7b | 4.1 GB |
| zephyr:latest | 4.1 GB |
Contrast that with the size of the compressed English Wikipedia – 22 GB. That's without media or images.
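To make the comparison concrete, here's a quick back-of-the-envelope sketch using the sizes in the table above and the 22 GB figure for the compressed, text-only dump:

```python
# Rough comparison: local model sizes vs. the compressed English
# Wikipedia text dump (~22 GB, no media). Sizes from the table above.
model_sizes_gb = {
    "gemma:7b": 5.2,
    "llava:latest": 4.7,
    "mistral:7b": 4.1,
    "zephyr:latest": 4.1,
}
wikipedia_gb = 22.0

for name, size in model_sizes_gb.items():
    ratio = size / wikipedia_gb
    print(f"{name:15s} {size:.1f} GB -> {ratio:.0%} of compressed Wikipedia")
```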
Shannon entropy is a measure of information density, and whatever happens in training LLMs gets a lot closer to that limit than our current way of sharing information.
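As a rough illustration of what that limit means, here is a minimal sketch that estimates the Shannon entropy of a text sample in bits per character. It assumes a simple character-level model, which is a much weaker model of the text than what a real compressor (or an LLM) captures, so it only hints at the idea:

```python
import math
from collections import Counter

def shannon_entropy_bits_per_char(text: str) -> float:
    """Estimate Shannon entropy of a text sample, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

sample = "LLMs are compressing information in a wildly different way than I understand."
bits = shannon_entropy_bits_per_char(sample)
print(f"{bits:.2f} bits/char, vs. 8 bits/char for raw ASCII")
```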