The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Sharp Monocular View Synthesis in Less Than a Second
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
The 3rd Eye is a modular OSINT (Open Source Intelligence) framework built on an agent-based, graph-driven architecture. It automates public information discovery, identity correlation, and exposure analysis across multiple platforms, and generates structured intelligence reports. The system follows a LangGraph agent design.
Industry-leading face manipulation platform
Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Translate manga/images: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/ (no longer working)
AI-powered image generation tool using Hugging Face API and Stable Diffusion. Create images from text prompts with multiple style options.
This ComfyUI node lets you browse the Civitai gallery directly within the interface, featuring infinite scroll, advanced filters (including NSFW), and favorites management. It also allows you to retrieve prompts, metadata, and images/videos to seamlessly reuse them in your workflows.
Generative Models by Stability AI
Qwen-Image text-to-image LoRA trainer
Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
OpenKombai: A free, privacy-first alternative to Kombai. Instantly convert screenshots and designs into production-ready React + Tailwind code using local LLMs (Llama 3.2 Vision & Qwen 2.5). No API keys, zero cloud costs.
beeble-ai/SwitchLight-Studio (no project description provided)
[ICLR 2025] CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8 GB VRAM at 1024×768 resolution).
iamdinhthuan/viterbox-tts (no project description provided)
