💬 Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Visual ChatGPT
Visual ChatGPT connects ChatGPT with a series of Visual Foundation Models so that images can be sent and received during a chat.
See our paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Demo
System Architecture
Quick Start
# create a new environment
conda create -n visgpt python=3.8
# activate the new environment
conda activate visgpt
# prepare the basic environments
pip install -r requirement.txt
# download the visual foundation models
bash download.sh
# set your private OpenAI API key
export OPENAI_API_KEY={Your_Private_Openai_Key}
# create a folder to save images
mkdir ./image
# Start Visual ChatGPT !
python visual_chatgpt.py
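Before launching, it can help to confirm that the key is exported and the image folder exists. The following is a minimal sketch, not part of the project: the `preflight` helper is a hypothetical name, and it assumes the folder is `image` relative to the current directory, as in the commands above.

```python
import os
from pathlib import Path


def preflight(image_dir="image", env=None):
    """Return a list of problems likely to stop the demo from starting.

    `preflight` is a hypothetical helper for illustration; it is not
    provided by Visual ChatGPT itself.
    """
    env = os.environ if env is None else env
    problems = []
    # The Quick Start exports the key as OPENAI_API_KEY.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # Create the image folder if it is missing (same effect as `mkdir ./image`).
    Path(image_dir).mkdir(parents=True, exist_ok=True)
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

Running this script from the repository root before `python visual_chatgpt.py` prints a warning for each missing prerequisite and prints nothing when everything is in place.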
GPU memory usage
The table below lists the GPU memory usage of each visual foundation model. To save GPU memory, modify self.tools so that it loads fewer visual foundation models:
Foundation Model | Memory Usage (MB) |
---|---|
ImageEditing | 6667 |
ImageCaption | 1755 |
T2I | 6677 |
canny2image | 5540 |
line2image | 6679 |
hed2image | 6679 |
scribble2image | 6679 |
pose2image | 6681 |
BLIPVQA | 2709 |
seg2image | 5540 |
depth2image | 6677 |
normal2image | 3974 |
InstructPix2Pix | 2795 |
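To decide which models to keep, it may be useful to add up the figures in the table for a candidate subset. The sketch below assumes that the listed figures are roughly additive when several models are loaded on the same GPU, which the table does not state; `total_memory_mb` and `fits` are hypothetical helper names, not part of the project.

```python
# Memory figures (MB) copied from the table above.
MEMORY_MB = {
    "ImageEditing": 6667,
    "ImageCaption": 1755,
    "T2I": 6677,
    "canny2image": 5540,
    "line2image": 6679,
    "hed2image": 6679,
    "scribble2image": 6679,
    "pose2image": 6681,
    "BLIPVQA": 2709,
    "seg2image": 5540,
    "depth2image": 6677,
    "normal2image": 3974,
    "InstructPix2Pix": 2795,
}


def total_memory_mb(models):
    """Sum the listed memory usage (MB) of the given model names."""
    return sum(MEMORY_MB[name] for name in models)


def fits(models, budget_mb):
    """Return True if the summed usage is within the budget (assumes additivity)."""
    return total_memory_mb(models) <= budget_mb


if __name__ == "__main__":
    subset = ["ImageCaption", "T2I", "BLIPVQA"]
    print(subset, total_memory_mb(subset), "MB")
    print("all models:", total_memory_mb(MEMORY_MB), "MB")
```

Under this assumption, loading every model in the table would need about 69052 MB, while the three-model subset in the example needs 11141 MB.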
Acknowledgement
We thank the following open-source projects:
Hugging Face, LangChain, Stable Diffusion, ControlNet, InstructPix2Pix, CLIPSeg, BLIP
Source Distribution
Hashes for visualchatgpt-0.0.1.dev0.linux-x86_64.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | 24c1bec087a1f8d1d2807d75782fb4f5a955e3dcf147caebf10d5712107a1a38 |
MD5 | 5602a98e306679ea69efd577469ea382 |
BLAKE2b-256 | 38457a59ee00311e6222ce94f4ea148497cf73abaee206ef6bd4d97936b4c634 |
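A downloaded archive can be checked against the digests listed above. This is a minimal sketch using Python's standard `hashlib`; `file_digests` and `verify` are hypothetical helper names, the local file path is an assumption, and BLAKE2b-256 is taken to mean BLAKE2b with a 32-byte digest.

```python
import hashlib

# Digests copied from the table above.
EXPECTED = {
    "SHA256": "24c1bec087a1f8d1d2807d75782fb4f5a955e3dcf147caebf10d5712107a1a38",
    "MD5": "5602a98e306679ea69efd577469ea382",
    "BLAKE2b-256": "38457a59ee00311e6222ce94f4ea148497cf73abaee206ef6bd4d97936b4c634",
}


def file_digests(path, chunk_size=1 << 20):
    """Compute SHA256, MD5 and BLAKE2b-256 hex digests of a file in one pass."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as fh:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected=EXPECTED):
    """Return a dict mapping each algorithm name to True if its digest matches."""
    actual = file_digests(path)
    return {name: actual[name] == digest.lower() for name, digest in expected.items()}


if __name__ == "__main__":
    # Assumed local path: the archive saved in the current directory.
    print(verify("visualchatgpt-0.0.1.dev0.linux-x86_64.tar.gz"))
```

If any value in the returned dict is `False`, the file differs from the one the digests were computed for and should be downloaded again.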