Store
differential-diffusion-ui (Featured)
Differential Diffusion modifies an image according to a text prompt and a map that specifies the amount of change in each region. https://differential-diffusion.github.io/
Image generation using zai-org/GLM-Image with Gradio UI. Supports text-to-image and image-to-image generation.
Image Upscale is an AI-powered application designed to enhance and upscale images using advanced techniques like Stable Diffusion and Tile ControlNet. It provides high-quality image enhancement with options for HDR effects and customizable settings.
hallo (Featured)
[NVIDIA Only] Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation https://github.com/fudan-generative-vision/hallo
StoryDiffusion Comics (Featured)
Create a story by generating consistent images. https://github.com/HVision-NKU/StoryDiffusion
LFM2-Audio-1.5B is Liquid AI's first end-to-end audio foundation model, designed with low latency and real-time conversation in mind.
Pinokio script for https://huggingface.co/Ole1/Joy_Caption_Batch-GUI
Gradio-based web interface for the LuxTTS voice cloning and text-to-speech model, enabling users to generate customized speech from text using uploaded or recorded audio references with adjustable parameters like speed, guidance scale, and inference steps.
A tool that takes a text document containing a book or a novel, ingests it with an LLM to produce an annotated script, and then uses a TTS API to generate the voice lines, finally stitching them together into an audiobook in MP3 format.
OneTrainer for Pinokio, vato loco
Imposing Consistent Light - Control lighting of images

Fast AI video generation for the GPU poor (Wan2.1, Hunyuan, LTV). Gradio UI at http://127.0.0.1:7860
RVC (Featured)
One-click installer for Retrieval-based-Voice-Conversion-WebUI (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI)
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion WebUI (based on Gradio) to make development easier, optimize resource management, and speed up inference. https://github.com/Panchovix/stable-diffusion-webui-reForge
An open-source, modern-design UI and framework for ChatGPT and other LLMs. Supports speech synthesis, multimodal input, and an extensible (function call) plugin system. https://github.com/lobehub/lobe-chat
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
One-click installer for Microsoft TRELLIS.2: High-quality 3D asset generation from images with PBR textures.
Google's official AI agent for your terminal. Access Gemini 2.5 Pro with 1M token context window directly from the command line.

My personal AI portal