Awesome Computer Vision Prompting

Description

This repo explores how to apply prompt engineering to computer vision tasks.

Use state-of-the-art models such as `Diffusion`, or other baseline models, to generate, inpaint, and paint images.

streamlit run basic_app.py --server.port 5555 --server.enableCORS false

User interaction with Segment Anything Model

streamlit run sam_app.py --server.port 5555 --server.enableCORS false
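A minimal sketch of how mouse clicks could be turned into SAM point prompts. SAM's predictor accepts point coordinates in original-image pixels as (x, y) pairs, plus one label per point (1 for foreground, 0 for background). The `Click` class, the function name, and the idea that the canvas is a resized copy of the image are assumptions for illustration; `sam_app.py` may handle this differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Click:
    """A mouse click on the displayed canvas (hypothetical helper type)."""
    x: float        # horizontal position on the canvas, in pixels
    y: float        # vertical position on the canvas, in pixels
    positive: bool  # True: "this is the object"; False: "this is background"


def clicks_to_point_prompts(
    clicks: List[Click],
    canvas_size: Tuple[int, int],
    image_size: Tuple[int, int],
) -> Tuple[List[List[float]], List[int]]:
    """Scale canvas clicks to image pixels and build SAM-style labels.

    Sizes are (width, height). Returns (point_coords, point_labels).
    """
    canvas_w, canvas_h = canvas_size
    image_w, image_h = image_size
    scale_x = image_w / canvas_w
    scale_y = image_h / canvas_h
    coords = [[c.x * scale_x, c.y * scale_y] for c in clicks]
    labels = [1 if c.positive else 0 for c in clicks]
    return coords, labels
```

The two returned lists can be converted to NumPy arrays and passed as `point_coords` and `point_labels` to `SamPredictor.predict` from the `segment_anything` package.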

Inpaint via user interaction with `Diffusion` and `SAM`

streamlit run sam_inpaint_app.py --server.port 5555 --server.enableCORS false

Image: assets/van.jpg

Prompts:

[Van] A Volkswagen California van, parked on a beach, with a surfboard on the roof.
[Ground] Big grassland, a lot grass, green or grey grass.
[Between sky and ground] Endless grassland.
[Sky] Clear night sky with stars and full moon.
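The region prompts above can be read as a small plan: one inpainting pass per selected region, each with its own prompt. The sketch below shows that loop only. `RegionPrompt`, `inpaint_regions`, and the `inpaint_fn` callable are hypothetical names; `inpaint_fn` stands in for whatever combination of SAM mask and diffusion inpainting the app actually uses.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class RegionPrompt:
    region: str  # label of the region selected with SAM
    prompt: str  # text prompt used to repaint that region


REGION_PROMPTS: List[RegionPrompt] = [
    RegionPrompt("Van", "A Volkswagen California van, parked on a beach, "
                        "with a surfboard on the roof."),
    RegionPrompt("Ground", "Big grassland, a lot grass, green or grey grass."),
    RegionPrompt("Between sky and ground", "Endless grassland."),
    RegionPrompt("Sky", "Clear night sky with stars and full moon."),
]


def inpaint_regions(
    image: Any,
    masks: Dict[str, Any],
    region_prompts: List[RegionPrompt],
    inpaint_fn: Callable[[Any, Any, str], Any],
) -> Any:
    """Repaint the regions one after another, feeding each result forward."""
    for rp in region_prompts:
        # masks[rp.region] is the mask selected for this region.
        image = inpaint_fn(image, masks[rp.region], rp.prompt)
    return image
```

Running the passes in sequence means each prompt sees the output of the previous one, so the order of the list can change the final image.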

Few-shot object tracking in video via `SAM`

Use SAM to get a mask of the object, then use that mask to track the object through all frames.
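One plausible way to carry a mask from frame to frame, shown here only as a sketch: take the bounding box of the previous mask, pad it slightly, and use it as the prompt for the next frame. This is not necessarily what `sam_tracker.py` does. `segment_fn` is a hypothetical callable standing in for a SAM call with a box prompt, and masks are plain nested lists to keep the example dependency-free.

```python
from typing import Callable, List, Optional, Tuple

Mask = List[List[int]]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Bounding box of all non-zero cells, or None for an empty mask."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) for v in row if v]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def pad_box(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow the box by `margin` pixels, clamped to the frame."""
    x0, y0, x1, y1 = box
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(width - 1, x1 + margin), min(height - 1, y1 + margin))


def track(first_mask: Mask, frames: List[Mask],
          segment_fn: Callable[[Mask, Box], Mask], margin: int = 2) -> List[Mask]:
    """Propagate `first_mask` (for frames[0]) through the remaining frames."""
    masks = [first_mask]
    for frame in frames[1:]:
        box = mask_to_box(masks[-1])
        if box is None:  # object lost, stop tracking
            break
        height, width = len(frame), len(frame[0])
        masks.append(segment_fn(frame, pad_box(box, margin, width, height)))
    return masks
```

The margin is a trade-off: too small and fast motion escapes the box, too large and the prompt may capture neighbouring objects.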

streamlit run sam_tracker.py --server.port 5556 --server.enableCORS false

Video src: https://dl.dropbox.com/s/0lalmh95tylyw4s/sculpture.mp4

Working comments

I personally think that prompting is a new programming approach. Don’t assume that guiding models with natural language is easy. On the contrary, I believe it’s quite the opposite. Natural language programming lacks the syntax of traditional programming languages, which means there are no type checks or any protective mechanisms in place. If the model (AI) receives an inappropriate prompt, the generated results can be completely different from what was expected.
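Since natural language offers no type checks of its own, a thin guard in front of the model can catch the most obvious mistakes. The sketch below is an illustration only: the `ImagePrompt` class and the `MAX_WORDS` budget are assumptions, not part of this repo, and passing these checks says nothing about whether the prompt will produce the expected image.

```python
from dataclasses import dataclass

MAX_WORDS = 60  # assumed budget; pick a limit that suits your text encoder


@dataclass(frozen=True)
class ImagePrompt:
    """A prompt that is validated once, at construction time."""
    text: str

    def __post_init__(self) -> None:
        # Collapse runs of whitespace so accidental spacing does not matter.
        cleaned = " ".join(self.text.split())
        if not cleaned:
            raise ValueError("prompt is empty")
        if len(cleaned.split()) > MAX_WORDS:
            raise ValueError(f"prompt has more than {MAX_WORDS} words")
        # The dataclass is frozen, so set the cleaned text explicitly.
        object.__setattr__(self, "text", cleaned)
```

Code downstream can then accept `ImagePrompt` instead of `str`, which at least moves the empty or oversized cases from "surprising image" to "clear error".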

Here is a prompt I have used with the Diffusion model in computer vision. It has brought some surprises, but it is not my ultimate goal.

AI new trend, prompt engineering

Mouse interactive prompt engineering

Install via Docker

# setup
docker build --no-cache --tag cv-prompt-engineering -f Dockerfile .

# run
docker run --gpus all -v /home/ubuntu/work/cv-prompt-engineering/:/workspace/ -p 5555:5555 --rm -it --shm-size=55gb -d cv-prompt-engineering tail -f /dev/null

Run

streamlit run basic_app.py --server.port 5555 --server.enableCORS false

Repository: XinyueZ/cv-prompt-engineering
