moondream1 is a tiny (1.6B parameter) vision language model trained by @vikhyatk that performs on par with models twice its size. It is trained on the LLaVa training dataset, and initialized with SigLIP as the vision tower and Phi-1.5 as the text encoder. https://huggingface.co/spaces/vikhyatk/moondream1
Declaratively define and modify agents and multi-agent workflows through a point and click, drag and drop interface (e.g., you can select the parameters of two agents that will communicate to solve your task).
Upload a clean 20 seconds WAV file of the vocal persona you want to mimic, type your text-to-speech prompt and hit submit! A local version of https://huggingface.co/spaces/fffiloni/instant-TTS-Bark-cloning
an open vision-language model by Google. PaliGemma is designed as a versatile model for transfer to a wide range of vision-language tasks such as image and short video caption, visual question answering, text reading, object detection and object segmentation https://huggingface.co/spaces/google/paligemma
supersonic13/sillytavern-pinokiov1.5updated 2y ago
Brought to you by Cohee, RossAscends, and the SillyTavern community, SillyTavern is a local-install interface that allows you to interact with text generation AIs (LLMs) to chat and roleplay with custom characters.
supersonic13/superprompter-pinokiov1.0updated 2y ago
SuperPrompter is a Python-based application that utilises the SuperPrompt-v1 model to generate optimised text prompts for AI/LLM image generation (for use with Stable Diffusion etc...) from user prompts.
The script utilizes various deep learning models to create detailed character cards, including names, summaries, personalities, greeting messages, and character avatars.