LLM Research Papers: The 2025 List (January to June)
The latest in LLM research with a hand-curated, topic-organized list of over 200 research papers from 2025.
KV caching is one of the most critical techniques for efficient LLM inference in production. KV caches are an important component for compute-efficient…
Why build an LLM from scratch? It’s probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot…
A lot has happened this month, especially with the releases of new flagship models like GPT-4.5 and Llama 4. But you might have noticed that reactions to…
As you know, I’ve been writing a lot lately about the latest research on reasoning in LLMs. Before my next research-focused blog post, I wanted to offer…
This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on the inference-time compute scaling methods that have emerged…
In this article, I will describe the four main approaches to building reasoning models, or how we can enhance LLMs with reasoning capabilities. I hope this…
This article covers 12 influential AI research papers of 2024, ranging from mixture-of-experts models to new LLM scaling laws for precision.
This is a standalone notebook implementing the popular byte pair encoding (BPE) tokenization algorithm, which is used in models from GPT-2 to GPT-4, Llama…
I want to share my running bookmark list of many fascinating (mostly LLM-related) papers I stumbled upon in 2024. It’s just a list, but maybe it will come…