Store
Text-to-Speech using IndicF5 for Indian languages
[NVIDIA ONLY] Direct3D-S2 is a scalable 3D shape generation framework leveraging sparse volumetric representations for high-resolution outputs. It features Spatial Sparse Attention (SSA), a novel mechanism that accelerates Diffusion Transformer computations on sparse data, achieving up to 9.6× speedup in training. The unified Sparse VAE architecture maintains a consistent sparse volumetric format across input, latent, and output stages, significantly improving efficiency and stability.
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
A tool for hosting AI vtubers that runs fully locally and offline: https://github.com/0Xiaohei0/LocalAIVtuber
[NVIDIA ONLY] Dead-simple web UI for training FLUX LoRA with low-VRAM support (from 12 GB)
Install Control-Lora Models and Workflows to ComfyUI with 1 click
Transform YouTube videos into stunning animated GIFs with perfectly-timed, stylized subtitles and eye-catching effects.
lavie (Featured)
Text-to-Video (T2V) generation framework from Vchitect https://github.com/Vchitect/LaVie
A local implementation of the Kokoro Text-to-Speech model
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions. https://github.com/halr9000/sdxs
SD.Next: All-in-one WebUI for AI generative image and video creation
This project is an enhanced version of the IC-Light repository, designed for advanced image relighting and enhancement using Stable Diffusion and deep learning techniques
Dia is a 1.6B-parameter text-to-speech model created by Nari Labs. Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control. The model can also produce nonverbal sounds such as laughter, coughing, and throat clearing. https://github.com/nari-labs/dia
A local-install LLM backend
[NVIDIA ONLY - WINDOWS ONLY] InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity [LoRA support fork] https://github.com/petermg/InfiniteYou
