Facefusion
Dione Team
Industry-leading face manipulation platform.
Explore the latest available applications at Dione.
Industry-leading face manipulation platform.
Prompt, run, edit, and deploy full-stack web applications using any LLM you want!
A Universal Customization Method for Both Single and Multi-Subject Conditioning
A unified image generation model that you can use to perform various tasks, including but not limited to text-to-image generation, subject-driven generation, identity-preserving generation, and image-conditioned generation.
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for RAG, making it a powerful AI deployment solution.
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.
Stable Video Matting with Consistent Memory Propagation
Applio is a powerful, AI-driven voice conversion tool that enables you to create personalized voices or make use of a variety of pre-existing voices. Whether you prefer local installation or cloud-based usage through Google Colab, Applio is designed to be efficient and user-friendly.
MMAudio generates synchronized audio given video and/or text inputs. Our key innovation is multimodal joint training which allows training on a wide range of audio-visual and audio-text datasets. Moreover, a synchronization module aligns the generated audio with the video frames.
This project builds upon the foundation of browser-use, which is designed to make websites accessible for AI agents.
A Step Towards Music Generation Foundation Model
WanGP by DeepBeepMeep: The best Open Source Video Generative Models Accessible to the GPU Poor
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion WebUI (based on Gradio) to make development easier, optimize resource management, speed up inference, and study experimental features.
Invoke is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. Invoke offers an industry-leading web-based UI, and serves as the foundation for multiple commercial products.
n8n is a workflow automation platform that gives technical teams the flexibility of code with the speed of no-code. With 400+ integrations, native AI capabilities, and a fair-code license, n8n lets you build powerful automations while maintaining full control over your data and deployments.
NVIDIA ONLY – LLM UI with advanced features, easy setup, and multiple backend support.
NVIDIA ONLY – All-in-One TTS App with Kokoro, Chatterbox, Fish-Speech, F5 & index-tts. Supports Conversation Mode & eBook-to-Audiobook. All features work across all engines in a unified interface.
This project provides a user-friendly Gradio-based Graphical User Interface (GUI) for Kohya's Stable Diffusion training scripts. Stable Diffusion training empowers users to customize image generation models by fine-tuning existing models, creating unique artistic styles, and training specialized models like LoRA (Low-Rank Adaptation).
Transform your images with AI-powered editing magic! Upload an image and describe your desired changes, then watch the magic happen! All under 20 GB of VRAM.
A web interface for Stable Diffusion, implemented using the Gradio library.
Zonos is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS providers.
Welcome to Kokoro, a high-quality text-to-speech synthesis program powered by deep learning. This tool converts any text into high-fidelity speech in just a few seconds. Simply input text, select a voice, adjust the speed, and enjoy the generated audio.
Stable Audio Open allows anyone to generate up to 47 seconds of high-quality audio data from a simple text prompt. Its specialised training makes it ideal for creating drum beats, instrument riffs, ambient sounds, foley recordings and other audio samples for music production and sound design.
A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
A powerful, AI-powered Gradio application for downloading, transcribing, and analyzing YouTube videos and audio.
Forget everything you thought you knew about AI art generation - RuinedFooocus is here to completely reinvent the game!
Want to see more? Check out the home page for more details.