Store

A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini, and open-source models.
Next-generation face-swapping and enhancement (Codeberg fork of Roop). Easy GUI for images & videos.
Industry leading face manipulation platform
DreamID-V: Bridging the Image-to-Video Gap for High-Fidelity Face Swapping via Diffusion Transformer
Contribute to taylorchu/2cent-tts development by creating an account on GitHub.
Clone voices into different languages using just a quick 3-second audio clip. (A local version of https://huggingface.co/spaces/coqui/xtts)
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio quality
This application lets you upload an image and generate a caption tailored to your choice of style and length. You can select from options like descriptive, informal, or specific formats like traini...
🚀🪐🌕🌑☄️🛸 Open-source equivalent of Google's Antigravity
🙌 OpenHands: AI-Driven Development
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Transform any video into a professional multilingual production with natural voice cloning, lip-sync, and on-screen text translation. No cloud APIs, no subscriptions, no data leaving your machine.
Taming Stable Diffusion for Lip Sync!
Arbitrary-steps Image Super-resolution via Diffusion Inversion (CVPR 2025)
Edit Videos with Wan 2.2