Finrandojin/alexandria-audiobook v5.0, updated 3d ago
A multi-voice AI audiobook generator built on Qwen3-TTS. Annotate scripts with an LLM, assign a unique voice to each character, add per-line style instructions for delivery, clone voices from reference audio, design new voices from text descriptions, train custom voices with LoRA fine-tuning, and export to MP3 or Audacity multi-track projects.
Auris-BadBaDaki is an offline audiobook reader for EPUB, PDF, and TXT with local OmniVoice TTS, character-aware voices, per-book narrator control, and synced text highlighting.
Everything runs locally after setup. No API keys. No hosted TTS dependency.
pinokiofactory/clarity-refiners-ui v3.7, updated 5d ago
An enhanced local port of finegrain-image-enhancer powered by Refiners (https://huggingface.co/spaces/finegrain/finegrain-image-enhancer), which was adapted from philz1337x's Clarity Upscaler (https://github.com/philz1337x/clarity-upscaler)
Super-optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video and Flux. https://github.com/deepbeepmeep/Wan2GP
Bulk transcribe many YouTube videos, whole playlists, or your own uploaded audio/video files at once with faster-whisper. Outputs txt, srt, vtt, or json.
[AMD ONLY] Super-optimized Gradio UI for AI video creation on GPU-poor machines (6GB+ VRAM). Supports Wan 2.1/2.2, Qwen, Hunyuan Video, LTX Video, Flux and more. (On Windows, supports all dedicated AMD GPUs from RDNA 2 through RDNA 4.)
Describe an image, get a 100% schema-valid Ideogram 4 JSON prompt — generated fully locally with an embedded llama.cpp (no Ollama or LM Studio required).