Guest User - Provide.ai

Uncategorised

SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation

Guest User / March 25, 2025

We present Stable Video 4D 2.0 (SV4D 2.0), a multi-view video diffusion
model for dynamic 3D asset generation. Compared to its predecessor SV4D,
SV4D 2.0 is more robust to occlusions and large motion, generalizes better
to real-world videos, and produces higher-quality outputs in terms of
detail sharpness and spatio-temporal consistency.

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Guest User / March 18, 2025

Diffusion models are the main driver of progress in image and video
synthesis, but they suffer from slow inference. Distillation methods, such as
the recently introduced adversarial diffusion distillation (ADD), aim to
shift the model from many-shot to single-step inference, albeit at the cost
of expensive and difficult optimization due to ADD's reliance on a fixed
pretrained DINOv2 discriminator.

Stable Virtual Camera: Multi-View Video Generation with 3D Camera Control

Guest User / March 18, 2025

We present Stable Virtual Camera, a generalist diffusion model that creates
novel views of a scene, given any number of input views and target cameras.

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Guest User / January 8, 2025

We study the problem of single-image 3D object reconstruction. Recent works
have diverged into two directions: regression-based modeling and generative
modeling. In this paper, we present SPAR3D, a novel two-stage approach
aiming to take the best of both directions.

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

Guest User / August 1, 2024

We present SF3D, a novel method for rapid and high-quality textured object
mesh reconstruction from a single image in just 0.5 seconds.

SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency

Guest User / July 24, 2024

We present Stable Video 4D (SV4D), a latent video diffusion model for
multi-frame and multi-view consistent dynamic 3D content generation.

Stable Audio Open

Guest User / July 19, 2024

Here we describe the architecture and training process of a new
open-weights text-to-audio model trained with Creative Commons data. Our
evaluation shows that the model’s performance is competitive with the
state-of-the-art across various metrics.

Shaping Realities: Enhancing 3D Generative AI with Fabrication Constraints

Guest User / April 15, 2024

This workshop paper highlights the limitations of generative AI tools in
translating digital creations into the physical world and proposes new
augmentations to generative AI tools for creating physically viable 3D
models.

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

Guest User / March 18, 2024

We present Stable Video 3D (SV3D), a latent video diffusion model for
high-resolution, image-to-multi-view generation of orbital videos around a
3D object.

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Guest User / March 5, 2024

In this work, we improve existing noise sampling techniques for training
rectified flow models by biasing them towards perceptually relevant scales.
