LLM Neuroanatomy: Topping the Leaderboard Without Changing a Weight
AI
LLMs
inference
links
Duplicate a block of middle transformer layers and run them twice at inference. The power of thinking it over twice.
An LLM can be improved without any training: duplicate a specific block of middle transformer layers and run that block twice during inference, leaving every weight unchanged. That is the power of thinking it over twice [1].
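To make the idea concrete, here is a minimal sketch of what "running a block twice" could look like as a layer schedule. It is not the implementation from [1]: the function names `repeated_layer_order` and `forward`, the six toy layers, and the block boundaries (2 to 4) are all assumptions chosen for illustration, and simple scalar functions stand in for real transformer layers.

```python
from typing import Callable, List, Sequence

# A "layer" here is any function from hidden state to hidden state.
# Real transformer layers operate on tensors; floats keep the sketch runnable.
Layer = Callable[[float], float]


def repeated_layer_order(num_layers: int, start: int, end: int) -> List[int]:
    """Return layer indices in execution order, visiting [start, end) twice."""
    if not 0 <= start < end <= num_layers:
        raise ValueError("need 0 <= start < end <= num_layers")
    through_block = list(range(0, end))      # layers 0 .. end-1 (first pass over the block)
    second_pass = list(range(start, end))    # the same block again, same weights
    remainder = list(range(end, num_layers)) # layers after the block
    return through_block + second_pass + remainder


def forward(layers: Sequence[Layer], order: Sequence[int], x: float) -> float:
    """Apply the layers in the given order; no layer is copied or modified."""
    for i in order:
        x = layers[i](x)
    return x


# Hypothetical toy model: layer k simply adds k to the hidden state.
layers = [lambda h, k=k: h + k for k in range(6)]

# Assumed block boundaries, purely for illustration.
order = repeated_layer_order(len(layers), start=2, end=4)
print(order)
print(forward(layers, order, 0.0))
```

The schedule reuses the same layer objects on the second pass, so the model's weights are untouched and only the inference path gets longer. Which block to repeat is the interesting part, and that choice is what [1] investigates.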
References
[1] “LLM Neuroanatomy: How I Topped the LLM Leaderboard Without Changing a Single Weight.” dnhkng.github.io. https://dnhkng.github.io/posts/rys/
Originally posted on LinkedIn.
