Open Source Video & LLM Roundup: The Best of What’s New | Runpod Blog
Open-source AI is booming—and 2024 delivered an incredible wave of new LLMs and generative video models. Here’s a quick roundup of the most exciting releases you can run today.
Learn when to use open source vs. closed source LLMs, and how to deploy models like Llama-7B with vLLM on Runpod Serverless for high-throughput, cost-efficient inference.
Need a virtual desktop with serious GPU power? This guide walks you through setting up a GPU-accelerated virtual desktop on Runpod—perfect for 3D rendering, video editing, and other high-performance workflows in the cloud.
Learn how vLLM achieves up to 24x higher throughput than Hugging Face Transformers by using PagedAttention to eliminate memory waste, boost inference performance, and enable efficient GPU usage.
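The key idea behind PagedAttention is that the KV cache is stored in fixed-size blocks allocated on demand, so a sequence wastes at most one partially filled block instead of a large preallocated buffer. Here is a toy Python sketch of that allocation scheme — not vLLM's actual implementation, just an illustration of the bookkeeping:

```python
class PagedKVCache:
    """Toy sketch of PagedAttention-style block allocation (not vLLM's API)."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # pool of physical blocks
        self.block_tables = {}  # seq_id -> list of physical block ids

    def append_token(self, seq_id: str, pos: int) -> None:
        # A new block is needed only when the sequence crosses a block
        # boundary, so per-sequence waste is at most one partial block.
        table = self.block_tables.setdefault(seq_id, [])
        if pos % self.block_size == 0:
            table.append(self.free_blocks.pop())

    def free(self, seq_id: str) -> None:
        # Finished sequences return their blocks to the pool immediately,
        # letting other requests reuse the memory.
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
```

Because blocks are returned to a shared pool as soon as a request finishes, many more concurrent sequences fit in the same GPU memory — which is where the throughput gain over naive contiguous KV caches comes from.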
DeepSeek R1 remains one of the top open-source models. This post shows how you can run it efficiently on just 480GB of VRAM without sacrificing performance.
Runpod surpasses $120M ARR, now serving over 500,000 developers worldwide. Founder Zhen reflects on the journey from basement GPU rigs to AI-first cloud infrastructure powering startups, research labs, and Fortune 500 teams.
Deploy Serverless endpoints directly from GitHub and roll back instantly if needed. Runpod’s improved GitHub integration lets you revert to previous builds without rebuilding Docker images, enabling faster, safer, and more confident deployments.
Slurm on Runpod Instant Clusters makes it simple to scale distributed AI and scientific computing across multiple GPU nodes. With pre-configured setup, advanced job scheduling, and built-in monitoring, users can efficiently manage training, batch proce…
Learn how to deploy ComfyUI as a serverless API endpoint on Runpod to run AI image generation workflows at scale. The tutorial covers deploying from Runpod Hub templates or Docker images, integrating with Python for synchronous API calls, and customizi…
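A synchronous call goes to the endpoint's `/runsync` route on the Runpod serverless API, which blocks until the job completes. The sketch below uses only the standard library; the endpoint ID and API key are placeholders, and the exact shape of the `input` payload (here assumed to be `{"workflow": ...}`) depends on how your worker's handler is written:

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"  # Runpod serverless API base URL


def build_runsync_request(endpoint_id: str, api_key: str, workflow: dict):
    """Build the URL, headers, and JSON body for a synchronous /runsync call."""
    url = f"{API_BASE}/{endpoint_id}/runsync"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Payload shape is an assumption: adjust the "input" dict to match
    # whatever your serverless handler expects.
    body = json.dumps({"input": {"workflow": workflow}}).encode()
    return url, headers, body


def run_workflow(endpoint_id: str, api_key: str, workflow: dict, timeout: int = 300):
    """Submit a ComfyUI workflow and block until the result is returned."""
    url, headers, body = build_runsync_request(endpoint_id, api_key, workflow)
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

For long-running image generations you would typically use the asynchronous `/run` route and poll for status instead, since `/runsync` holds the connection open for the duration of the job.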
DeepSeek V3.1 introduces a breakthrough hybrid reasoning architecture that dynamically toggles between fast inference and deep chain-of-thought logic using token-controlled templates—enhancing performance, flexibility, and hardware efficiency over its …