Store
DramaBox (Featured)
Expressive TTS with voice cloning, prompt-driven speech synthesis built on LTX-2.3 by Resemble AI
Image to Prompt (Featured)
Generate editable Ideogram JSON prompts from uploaded images.
FaceFusion 3.6.1 (Featured)
Industry-leading face manipulation platform
Ultimate-TTS-Studio (Featured)
Kokoro, KittenTTS, Higgs audio, Chatterbox/Multi, Fish-Speech, F5 & index-tts & indextts2, VoxCPM and VibeVoice in one app
Clarity Refiners UI (Featured)
An enhanced local port of finegrain-image-enhancer powered by Refiners (https://huggingface.co/spaces/finegrain/finegrain-image-enhancer), which was adapted from philz1337x's Clarity Upscaler (https://github.com/philz1337x/clarity-upscaler)
Wan2GP - AMD (Featured)
[AMD ONLY] Super Optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video, Flux and more. (On Windows, supported on all dedicated AMD GPUs from RDNA 2 through RDNA 4.)
Phosphene (Featured)
Local generative video, image, and character training on Apple Silicon. Train face + voice LoRAs in-app. Q8 HQ for character clips. MLX native — no cloud, no API key.
fluxgym (Featured)
[NVIDIA ONLY] Dead-simple web UI for training FLUX LoRAs with low-VRAM support (from 12GB)
StableDAW (Featured)
Browser-based AI audio DAW for Stable Audio 3 with text-to-audio, inpainting, LoRA training, FFmpeg effects, waveform editing, sequencer, piano roll, and persistent library. https://github.com/gantasmo/stabledaw
ComfyUI (Featured)
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface. https://github.com/comfyanonymous/ComfyUI
Wan2GP (Featured)
Super Optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP
Qwen3-TTS MLX WebUI Enhanced (Featured)
High-quality text-to-speech with Beautiful Web UI & API, optimized for Apple Silicon using MLX. Features include Custom Voice (preset speakers), Voice Design (natural language), and Voice Cloning. With enhanced features for saving custom voices and long-form / endless TTS streaming.
SongGeneration Studio (Featured)
AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model. [NVIDIA ONLY]
RMBG-2-Studio (Featured)
Enhanced background remove and replace app built around BRIA-RMBG-2.0 https://huggingface.co/briaai/RMBG-2.0
Orpheus-TTS-FastAPI (Featured)
Orpheus TTS is an open-source text-to-speech system built on the Llama-3b backbone. Orpheus demonstrates the emergent capabilities of using LLMs for speech synthesis. https://github.com/canopyai/Orpheus-TTS
Odysseus (Featured)
Self-hosted AI workspace for local-first chat, agents, tools, memory, research, documents, email, and model endpoint management.
Whisper-WebUI (Featured)
A web UI for easy subtitle generation using the Whisper model.
Ideoprompt (Featured)
Describe an image, get a 100% schema-valid Ideogram 4 JSON prompt — generated fully locally with an embedded llama.cpp (no Ollama or LM Studio required).
IP-Adapter-FaceID (Featured)
Provide a face image and transform it into any other image. Demo for the h94/IP-Adapter-FaceID model https://huggingface.co/spaces/multimodalart/Ip-Adapter-FaceID
Underfit (Featured)
LoRA fine-tuning dashboard for Stable Audio 3
