Store
llama.cpp with the BakLLaVA model describes what it sees (https://github.com/Fuzzy-Search/realtime-bakllava)
Contribute to cocktailpeanut/stable-diffusion-webui-forge development by creating an account on GitHub.
Segment Anything for Stable Diffusion WebUI
Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
Stable Diffusion UI with patches by lllyasviel
Flexible automapper for Beat Saber, made for any difficulty
A Streamlit app that uses Google's AI to summarize YouTube video transcripts, providing concise, point-form notes. Perfect for quick content overviews.
Fault-tolerant, highly scalable GPU orchestration, and a machine learning framework designed for training models with billions to trillions of parameters
Contribute to coqui-ai/xtts-streaming-server development by creating an account on GitHub.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A Web UI for easy subtitle generation using the Whisper model.
A fork of WebUI to edit dataset captions for txt2img models
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVA training dataset, and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder. https://huggingface.co/spaces/vikhyatk/moondream1
Wav2Lip UHQ extension for Automatic1111
Unofficial Re-Trained AnimateAnyone (Image + DWPose Video → Animated Video of Image)
NSFW protection bypass for the next-generation face swapper and enhancer
PhotoMaker
Contribute to candywrap/Moore-AnimateAnyone-for-windows development by creating an account on GitHub.
Useful tool to track location or mobile number
