Store
Agentic AI Software Engineer https://github.com/stitionai/devika
dreamtalk (Featured)
When Expressive Talking Head Generation Meets Diffusion Probabilistic Models (https://github.com/ali-vilab/dreamtalk)
StyleAligned (Featured)
Style Aligned Image Generation via Shared Attention https://style-aligned-gen.github.io/
A web UI for easy subtitle generation using the Whisper model (https://github.com/jhj0517/Whisper-WebUI)
Install AnimateDiff Automatic1111 Extension and the models with one click
A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA.
Upload an image and generate new images in its style. Instant generation with no LoRA required. https://huggingface.co/spaces/InstantX/InstantStyle
Gradio web interface for Photoroom's PRX-1024-t2i-beta text-to-image model
Generate realistic and expressive speech with natural language voice design.
A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics.
A mass video player for easy browsing of large video datasets
Improving Diffusion Models for Authentic Virtual Try-on in the Wild https://huggingface.co/spaces/yisol/IDM-VTON
Super-fast multilingual TTS supporting 54 voices across 8 languages.
SongBloom, a novel framework for full-length song generation
[NVIDIA ONLY] Requires 24GB VRAM (Use the lowvram option, it has the same quality). High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models. https://github.com/Tencent/Hunyuan3D-2
Florence2 (Featured)
An advanced vision foundation model from Microsoft https://huggingface.co/spaces/gokaygokay/Florence-2
[WINDOWS, NVIDIA] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
[NVIDIA ONLY] Image generation, image editing and free-form manipulation with a VLM (Minimum requirements: 12GB VRAM / 32GB RAM. Recommended requirements: 24GB VRAM / 48GB RAM.)
