
SAM-HQ model addition #35147

Draft · wants to merge 27 commits into main

Conversation

@sushmanthreddy sushmanthreddy commented Dec 8, 2024

Closes #31137

Pull Request Title: Add HQ-SAM Functionality to Transformers Library

Model Overview

HQ-SAM (Segment Anything in High Quality) is an enhanced version of the Segment Anything Model (SAM), addressing limitations in mask quality for intricate structures and challenging segmentation tasks. The model refines SAM’s predictions using a High-Quality Output Token and Global-Local Feature Fusion while preserving SAM’s efficiency and zero-shot generalization capabilities.

According to the original implementation, HQ-SAM significantly improves mask boundaries and reduces segmentation errors by introducing minimal additional parameters (<0.5%) and computational overhead. The model is designed to maintain compatibility with SAM’s existing prompt-based design and mask decoder architecture.
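
Since this PR is still a draft, the final API is not settled. For context, here is a minimal usage sketch assuming hypothetical `SamHQModel`/`SamHQProcessor` classes and a placeholder checkpoint id, mirroring how the existing SAM model is used in transformers:

```python
# Hypothetical usage sketch -- SamHQModel / SamHQProcessor and the checkpoint id
# are placeholders, not the final API of this draft PR. The flow mirrors the
# existing SAM API in transformers.
import requests
import torch
from PIL import Image

from transformers import SamHQModel, SamHQProcessor  # hypothetical class names

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SamHQModel.from_pretrained("<org>/sam-hq-vit-base").to(device)  # placeholder repo id
processor = SamHQProcessor.from_pretrained("<org>/sam-hq-vit-base")

img_url = "https://huggingface.co/ybelkada/segment-anything/resolve/main/assets/car.png"
raw_image = Image.open(requests.get(img_url, stream=True).raw).convert("RGB")
input_points = [[[450, 600]]]  # one 2D point prompt on the image

inputs = processor(raw_image, input_points=input_points, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Upscale the low-resolution mask logits back to the original image size.
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks.cpu(), inputs["original_sizes"].cpu(), inputs["reshaped_input_sizes"].cpu()
)
scores = outputs.iou_scores
```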

Repository and Weights

The HQ-SAM implementation and pre-trained weights are available in the following repository:
https://github.com/SysCV/sam-hq

HQ-SAM provides three pre-trained weight variants:

  • sam_hq_vit_b – Small vision encoder.
  • sam_hq_vit_l – Medium vision encoder.
  • sam_hq_vit_h – Large vision encoder.

The main difference between these variants is the size of the Vision Transformer (ViT) encoder, while the prompt encoder and mask decoder remain unchanged.
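
For comparison, the reference repository keeps the original `segment_anything` package interface, so the checkpoints above can be loaded through `sam_model_registry`. A sketch, assuming a locally downloaded checkpoint file; the `hq_token_only` flag is an assumption based on the fork's demo code:

```python
# Sketch of loading the reference checkpoints with the SysCV/sam-hq codebase,
# which reuses the original `segment_anything` package interface.
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

image = np.array(Image.open("example.jpg").convert("RGB"))  # placeholder image path

# Registry keys match the variant names: "vit_b", "vit_l", "vit_h".
sam = sam_model_registry["vit_b"](checkpoint="sam_hq_vit_b.pth")  # local checkpoint path
sam.to("cuda")

predictor = SamPredictor(sam)
predictor.set_image(image)
masks, scores, logits = predictor.predict(
    point_coords=np.array([[450, 600]]),
    point_labels=np.array([1]),  # 1 marks a foreground point
    multimask_output=False,
    hq_token_only=True,  # sam-hq-specific flag (assumption from the reference repo's demo)
)
```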

Functionality

For each input prompt (e.g., bounding boxes, 2D points, or coarse masks), HQ-SAM predicts high-quality binary masks that improve segmentation precision (see the prompt sketch after this list). Improvements include:

  • More accurate boundaries.
  • Correction of coarse masks and segmentation errors.
  • Enhanced detail preservation for thin structures and complex object geometries.
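
To make the prompt formats concrete, here is a sketch of point and box prompts together, reusing the hypothetical `model`, `processor`, `raw_image`, and `device` from the earlier sketch and following the input conventions of the existing `SamProcessor`:

```python
# Prompt-type sketch -- hypothetical SamHQProcessor, same input conventions
# as the existing SamProcessor.
# Shapes: input_points is (batch, point_batch, num_points, 2),
#         input_boxes  is (batch, num_boxes, 4).
input_points = [[[450, 600]]]           # one foreground 2D point
input_boxes = [[[75, 275, 1725, 850]]]  # one box as [x_min, y_min, x_max, y_max]

inputs = processor(
    raw_image,
    input_points=input_points,
    input_boxes=input_boxes,
    return_tensors="pt",
).to(device)
outputs = model(**inputs, multimask_output=False)  # one refined mask per prompt
```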

Reviewers: @molbap

@sushmanthreddy sushmanthreddy marked this pull request as draft December 8, 2024 09:15
@sushmanthreddy sushmanthreddy marked this pull request as ready for review December 20, 2024 00:48
@sushmanthreddy sushmanthreddy marked this pull request as draft December 22, 2024 09:36
@sushmanthreddy (Contributor, Author):

Still a work in progress: the implementation is being converted from standalone model files to a modular file, since there is a lot of code that can be reused from the SAM model.
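
For reference, the modular-transformers pattern lets the new model inherit directly from the SAM classes, with the library's converter script generating the standalone modeling file. A minimal hypothetical sketch of what `modular_sam_hq.py` could look like (class names and structure are placeholders, not the final PR code):

```python
# modular_sam_hq.py -- hypothetical sketch of the modular-transformers pattern.
# The converter script would generate a standalone modeling_sam_hq.py from this.
from transformers.models.sam.modeling_sam import SamMaskDecoder, SamModel


class SamHQMaskDecoder(SamMaskDecoder):
    """Placeholder: would add the High-Quality Output Token and the
    global-local feature-fusion layers on top of SAM's mask decoder."""


class SamHQModel(SamModel):
    """Placeholder: reuses SAM's vision encoder and prompt encoder unchanged,
    swapping in the HQ mask decoder."""
```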

Development

Successfully merging this pull request may close: SAM-HQ implementation in transformers (#31137).