Blizaine/Qwen3-TTS-MLX-WebUI-Enhanced v5.0 (updated 9d ago)
High-quality text-to-speech with a web UI and API, optimized for Apple Silicon using MLX. Features include Custom Voice (preset speakers), Voice Design (natural language), and Voice Cloning, plus enhancements for saving custom voices and for long-form / endless TTS streaming.
Dub & translate any short video — locally, offline. Voice clone / per-speaker cast / voice packs, on-screen text localized in place, subtitle styling, blur-or-solid mask covers, funny re-dub. One process (FastAPI serves the React SPA), 6 UI languages.
BazedFrog/SongGeneration-Studio v3.7 (updated 10d ago)
AI Song Generation with Full Style Control - Generate complete songs with lyrics, vocals, and instrumental tracks using Tencent AI Lab's SongGeneration (LeVo) model. [NVIDIA ONLY]
MuseTalk is a cutting-edge video-to-video (V2V) lip-sync solution engineered to deliver highly accurate and natural mouth movements synchronized to audio input. Precision LipSync: Realistic and seamless synchronization of speech audio to facial movements. Efficiently designed to run on 8–12 GB VRAM,
A multi-voice AI audiobook generator built on Qwen3-TTS: annotate scripts with an LLM, assign a unique voice to each character, add per-line style instructions for delivery, clone voices from reference audio, design new voices from text descriptions, train custom voices with LoRA fine-tuning, and export to MP3 or Audacity multi-track projects.
pinokiofactory/Orpheus-TTS-FastAPI v3.7 (updated 11d ago)
Orpheus TTS is an open-source text-to-speech system built on the Llama-3b backbone. Orpheus demonstrates the emergent capabilities of using LLMs for speech synthesis. https://github.com/canopyai/Orpheus-TTS
P2PCLAW Agent Benchmark — connect any LLM agent (Claude, GPT, Gemini, Qwen, Kimi, DeepSeek…) and get scored on 10 dimensions + Tribunal IQ. Dashboard runs locally on :8787, leaderboard at p2pclaw.com/app/benchmark.
Prompt Orchestrator that turns module-based game design (genre, mechanics, visuals, menus, audio) into a complete, playable HTML5 game generated by your chosen AI provider. Supports OpenAI, Gemini, Claude, Ollama, and LM Studio. Every game ships as a single self-contained HTML file.
Diffusion Engine for Musical Orchestrated Noise — a real-time streaming diffusion engine for music generation, built on ACE-Step v1.5. Requires an NVIDIA GPU.