deep-learning - Provide.ai

ai, deep-learning, llm, Machine Learning

LLM Research Papers: The 2025 List (July to December)

Sebastian Raschka, PhD / December 30, 2025

A curated list of LLM research papers from July–December 2025, organized by reasoning models, inference-time scaling, architectures, training efficiency…

From Random Forests to RLVR: A Short History of ML/AI Hello Worlds

Sebastian Raschka, PhD / December 8, 2025

Two years ago, I posted a list of Hello World examples for machine learning and AI on social media. Here, Hello World means beginner-friendly examples to…

From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates

Sebastian Raschka, PhD / December 3, 2025

As with DeepSeek V3, the team released their new flagship model over a major US holiday weekend. Given DeepSeek V3.2’s strong performance (on GPT-5…

Recommendations for Getting the Most Out of a Technical Book

Sebastian Raschka, PhD / November 12, 2025

This short article compiles a few notes I previously shared when readers asked how to get the most out of my "building large language model from scratch" books…

Beyond Standard LLMs

Sebastian Raschka, PhD / November 4, 2025

After I shared my Big LLM Architecture Comparison a few months ago, which focused on the main transformer-based LLMs, I received a lot of questions with…

DGX Spark and Mac Mini for Local PyTorch Development

Sebastian Raschka, PhD / October 29, 2025

The DGX Spark for local LLM inference and fine-tuning has been a popular discussion topic recently. I got to play with one myself, primarily working…

Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)

Sebastian Raschka, PhD / October 5, 2025

Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples

Understanding and Implementing Qwen3 From Scratch

Sebastian Raschka, PhD / September 6, 2025

Previously, I compared the most notable open-weight architectures of 2025 in The Big LLM Architecture Comparison. Then, I zoomed in and discussed the…

From GPT-2 to gpt-oss: Analyzing the Architectural Advances

Sebastian Raschka, PhD / August 9, 2025

OpenAI just released their new open-weight LLMs this week: gpt-oss-120b and gpt-oss-20b, their first open-weight models since GPT-2 in 2019. And yes, thanks…

The Big LLM Architecture Comparison

Sebastian Raschka, PhD / July 19, 2025

It has been seven years since the original GPT architecture was developed. At first glance, looking back at GPT-2 (2019) and forward to DeepSeek-V3 and…