
[Official Repo] A Survey on Vision Mamba: Models, Applications and Challenges


jiangren-tech/Awesome-Vision-Mamba-Models

 
 


Awesome-Vision-Mamba-Models

Awesome License: MIT GitHub last commit GitHub issues Arxiv Page

[NEWS.2024/04/29] Our paper is released!

[NEWS.2024/05/02] 🎉🎉🎉Congratulations to Vision Mamba on being accepted to ICML 2024.

[NEWS.2024/07/06] The updated version of our paper is now available!

[NEWS.2024/09/26] 🎉🎉🎉Congratulations to VMamba on being accepted to NeurIPS 2024.

📢NOTE: If you have any questions, please don't hesitate to contact us at any of the following emails: [email protected], [email protected], [email protected], [email protected].

Mamba, a selective state space model, has gained recognition across diverse domains for its strong performance and for computation that scales linearly with sequence length. By addressing limitations of traditional visual foundation architectures, such as the difficulty CNNs have in capturing long-range dependencies and the quadratic cost of self-attention in Vision Transformers, Mamba is a promising candidate for advancing computer vision.
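To make the complexity point concrete, here is a minimal sketch of a discrete linear state space recurrence in plain Python. It is an illustration only: the function name `ssm_scan` and the diagonal parameters `a`, `b`, `c` are hypothetical, and the parameters are fixed over time. Mamba itself makes its parameters input-dependent (the "selective" part) and relies on a hardware-aware scan, neither of which is reproduced here. The sketch only shows that one pass over the sequence suffices, so cost grows linearly with sequence length.

```python
def ssm_scan(a, b, c, xs):
    """Run a diagonal, time-invariant state space recurrence over a 1-D sequence.

    Per step (elementwise over the state dimension):
        h_t = a * h_{t-1} + b * x_t
        y_t = sum(c * h_t)
    This is an illustrative simplification, not Mamba's selective scan.
    """
    h = [0.0] * len(a)  # hidden state, initialised to zero
    ys = []
    for x in xs:  # single pass: O(len(xs) * len(a)) work in total
        h = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


if __name__ == "__main__":
    # One-dimensional state with decay 0.5, driven by a constant input.
    print(ssm_scan([0.5], [1.0], [1.0], [1.0, 1.0, 1.0]))
```

Each output depends on the whole input prefix through the hidden state, yet no pairwise token comparison is computed, in contrast to self-attention.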

⭐ This repository hosts a curated collection of literature associated with Mamba models in computer vision. Feel free to star and fork. For further details, refer to the following paper:

Visual Mamba: A Survey and New Outlooks
Rui Xu, Shu Yang, Yihui Wang, Yu Cai, Bo Du, Hao Chen
SMART Lab, The Hong Kong University of Science and Technology

If you find this repository useful, please cite our paper:

@misc{2024visual_mamba,
      title={Visual Mamba: A Survey and New Outlooks}, 
      author={Rui Xu and Shu Yang and Yihui Wang and Yu Cai and Bo Du and Hao Chen},
      year={2024},
      eprint={2404.18861},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contents

Mamba

Date Paper Figure Link Code
Arxiv 23.12.01 (COLM 2024) Mamba: Linear-Time Sequence Modeling with Selective State Spaces image image Link Code
Arxiv 24.05.31 (ICML 2024) Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality image image Link Code

Related Survey

Date Paper Link
Arxiv 24.04.15 State Space Model for New-Generation Network Alternative to Transformers: A Survey Link
Arxiv 24.04.24 A Survey on Visual Mamba Link
Arxiv 24.04.24 Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges Link
Arxiv 24.05.07 Vision Mamba: A Comprehensive Survey and Taxonomy Link
Arxiv 24.06.05 Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis Link
Arxiv 24.06.24 Venturing into Uncharted Waters: The Navigation Compass from Transformer to Mamba Link
Arxiv 24.08.02 A Survey of Mamba Link

Visual Mamba Backbone Networks


Detailed Performance Comparison

Date Paper Figure Link Code
Arxiv 24.01.17 (ICML 2024) Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model image Link Code
Arxiv 24.01.18 (NeurIPS 2024) VMamba: Visual State Space Model image image Link Code
Arxiv 24.02.08 (ECCV 2024 Oral) Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data image Link Code
Arxiv 24.03.14 LocalMamba: Visual State Space Model with Windowed Selective Scan image Link Code
Arxiv 24.03.15 EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba image Link Code
Arxiv 24.03.22 SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series image Link Code
Arxiv 24.03.26 PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition image Link Code
Arxiv 24.05.23 (NeurIPS 2024) Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model image image Link Code
Arxiv 24.05.23 Scalable Visual State Space Model with Fractal Scanning image Link
Arxiv 24.05.23 Mamba-R: Vision Mamba ALSO Needs Registers image Link Code
Arxiv 24.05.29 Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain image Link Code
Arxiv 24.06.11 Autoregressive Pretraining with Mamba in Vision image image Link Code
Arxiv 24.07.10 MambaVision: A Hybrid Mamba-Transformer Vision Backbone image Link Code
Arxiv 24.07.18 GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model image Link Code
Arxiv 24.07.26 VSSD: Vision Mamba with Non-Causal State Space Duality image image Link Code
Arxiv 24.08.30 Stochastic Layer-Wise Shuffle: A Good Practice to Improve Vision Mamba Training image Link Code
Arxiv 24.09.15 SparX: A Sparse Cross-Layer Connection Mechanism for Hierarchical Vision Mamba and Transformer Networks image image Link Code
Arxiv 24.09.18 Distillation-free Scaling of Large SSMs for Images and Videos image Link
Arxiv 24.10.01 MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining image Link

Vision Application

Image

Natural Image

Date Paper Figure Link Code Task
Arxiv 24.02.06 U-shaped Vision Mamba for Single Image Dehazing image Link Code Dehazing/Low Light Enhancement/Deraining
Arxiv 24.02.08 Scalable Diffusion Models with State Space Backbone image Link Code Image Generation
Arxiv 24.02.23 (ECCV 2024) MambaIR: A Simple Baseline for Image Restoration with State-Space Model image Link Code Super-resolution/Denoising
Arxiv 24.03.04 MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection image Link Code Infrared Image Segmentation
Arxiv 24.03.13 Activating Wider Areas in Image Super-Resolution image Link Super-resolution
Arxiv 24.03.18 VmambaIR: Visual State Space Model for Image Restoration image Link Code Image Restoration
Arxiv 24.03.20 (ECCV 2024) ZigMa: A DiT-style Zigzag Mamba Diffusion Model image Link Code Generation
Arxiv 24.03.27 Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction image Link 3D Reconstruction
Arxiv 24.03.29 Learning Enriched Features via Selective State Spaces Model for Efficient Image Deblurring image Link Image Deblurring
Arxiv 24.04.04 InsectMamba: Insect Pest Classification with State Space Model image Link Image Classification
Arxiv 24.04.09 (NeurIPS 2024) MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection image Link Code Anomaly Detection
Arxiv 24.04.11 (ACM MM 2024) DGMamba: Domain Generalization via Generalized State Space Model image Link Code Domain Generalization
Arxiv 24.04.15 (ACM MM 2024) FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining image Link Deraining
Arxiv 24.04.17 CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration image Link Denoising/Deblurring
Arxiv 24.04.22 MambaUIE: Unraveling the Ocean's Secrets with Only 2.8 GFLOPs image Link Code Image Enhancement
Arxiv 24.05.03 FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space image Link Code Emotion recognition & Facial Expression Recognition & Detection
Arxiv 24.05.05 DVMSR: Distillated Vision Mamba for Efficient Super-Resolution image Link Code Super-Resolution
Arxiv 24.05.05 SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion image Link Motion Style Transfer
Arxiv 24.05.06 Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement image Link Code Image Enhancement
Arxiv 24.05.07 VMambaCC: A Visual State Space Model for Crowd Counting image Link Crowd Counting
Arxiv 24.05.14 WaterMamba: Visual State Space Model for Underwater Image Enhancement image Link Image Enhancement
Arxiv 24.05.16 IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model image Link Code Infrared Image Super-resolution
Arxiv 24.05.23 Efficient Visual State Space Model for Image Deblurring image Link Code Image Deblurring
Arxiv 24.05.23 DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis image Link Code Generation
Arxiv 24.05.25 Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation image Link Generation
Arxiv 24.05.25 (NeurIPS 2024) MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space image Link Code Image Enhancement
Arxiv 24.05.26 Image Deraining with Frequency-Enhanced State Space Model image Link Image Deraining
Arxiv 24.05.28 MambaVC: Learned Visual Compression with Selective State Spaces image Link Code Visual Compression
Arxiv 24.05.29 FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining image Link Image Deraining
Arxiv 24.06.03 LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network image Link Low-Light Enhancement
Arxiv 24.06.06 MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation image Link Depth Estimation
Arxiv 24.06.09 Mamba YOLO: SSMs-Based YOLO For Object Detection image Link Code Object Detection
Arxiv 24.06.12 PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement image Link Code Image Enhancement
Arxiv 24.06.18 LFMamba: Light Field Image Super-Resolution with State Space Model image Link Code Super-Resolution
Arxiv 24.06.13 Q-Mamba: On First Exploration of Vision Mamba for Image Quality Assessment image Link Image Quality Assessment
Arxiv 24.06.23 Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning image Link Super-resolution
Arxiv 24.06.24 Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces image Link Crack Segmentation
Arxiv 24.06.25 SUM: Saliency Unification through Mamba for Visual Attention Modeling image Link Code Visual Saliency Prediction
Arxiv 24.07.02 (ECCV 2024) MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders image Link Code Multi-Task Dense Scene Understanding
Arxiv 24.07.08 Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning image Link Code Few-Shot Class-Incremental Learning
Arxiv 24.07.11 (ICML 2024 Workshop) Parallelizing Autoregressive Generation with Variational State Space Models image Link Generation
Arxiv 24.07.12 (NeurIPS 2024) Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba image Link Code 3D Hand Reconstruction
Arxiv 24.07.16 PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer image Link Image Classification/Object Detection/Point Cloud Object Detection
Arxiv 24.07.22 Mamba meets crack segmentation image Link Code Segmentation
Arxiv 24.07.23 MxT: Mamba x Transformer for Image Inpainting image Link Image Inpainting
Arxiv 24.07.25 ALMRR: Anomaly Localization Mamba on Industrial Textured Surface with Feature Reconstruction and Refinement image Link Code Anomaly Localization
Arxiv 24.07.27 Mamba-UIE: Enhancing Underwater Images with Physical Model Constraint image Link Code Image Enhancement
Arxiv 24.07.27 Mamba? Catch The Hype Or Rethink What Really Helps for Image Registration image Link Code Image Registration
Arxiv 24.08.01 MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection image Link Monocular 3D Object Detection
Arxiv 24.08.02 (ACM MM 2024) Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement image Link Code Image Enhancement
Arxiv 24.08.04 DeMansia: Mamba Never Forgets Any Tokens Link Code Classification
Arxiv 24.08.05 LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba image Link Generation
Arxiv 24.08.06 Pose Magic: Efficient and Temporally Consistent Human Pose Estimation with a Hybrid Mamba-GCN Network image Link Human Pose Estimation
Arxiv 24.08.07 PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model image Link Human Pose Estimation
Arxiv 24.08.11 Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition image Link Palm-Vein Recognition
Arxiv 24.08.16 QMambaBSR: Burst Image Super-Resolution with Query State Space Model image Link Super-Resolution
Arxiv 24.08.19 Multi-Scale Representation Learning for Image Restoration with State-Space Model image Link Image Restoration
Arxiv 24.08.21 MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering image Link Code Occupancy Prediction
Arxiv 24.08.21 MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs image Link Code Super-resolution
Arxiv 24.08.22 Scalable Autoregressive Image Generation with Mamba image Link Code Generation
Arxiv 24.08.23 O-Mamba: O-shape State-Space Model for Underwater Image Enhancement image Link Code Image Enhancement
Arxiv 24.08.27 ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning image Link Code Zero-Shot Learning
Arxiv 24.08.27 MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders image Link Code Multi-Task Dense Scene Understanding
Arxiv 24.08.31 A Hybrid Transformer-Mamba Network for Single Image Deraining image Link Code Deraining
Arxiv 24.09.02 DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios image Link Object Detection
Arxiv 24.09.09 DSDFormer: An Innovative Transformer-Mamba Framework for Robust High-Precision Driver Distraction Identification image Link Driver Distraction Identification
Arxiv 24.09.11 Retinex-RAWMamba: Bridging Demosaicing and Denoising for Low-Light RAW Image Enhancement image Link Code Image Enhancement
Arxiv 24.09.15 (ECCV 2024 Workshop) Famba-V: Fast Vision Mamba with Cross-Layer Token Fusion image Link Code Efficiency
Arxiv 24.09.16 Mamba-ST: State Space Model for Efficient Style Transfer image Link Code Style Transfer
Arxiv 24.09.20 OneBEV: Using One Panoramic Image for Bird's-Eye-View Semantic Mapping image Link Code Bird's-Eye-View Semantic Mapping
Arxiv 24.09.25 Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement image Link Code Image Enhancement
Arxiv 24.09.29 (NeurIPS 2024) Hybrid Mamba for Few-Shot Segmentation image Link Code Few-Shot Segmentation
Neurocomputing 2024 MambaTSR: You only need 90k parameters for traffic sign recognition image Link Code Traffic Sign Recognition
ACM MM 2024 Realistic Full-Body Motion Generation from Sparse Tracking with State Space Model image Link Motion Generation

Remote Sensing Image

Date Paper Figure Link Code Task
Arxiv 24.03.28 (GRSL 2024) RSMamba: Remote Sensing Image Classification with State Space Model image Link Code Remote Sensing Images Classification
Arxiv 24.04.02 Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model image Link Code Semantic Segmentation
Arxiv 24.04.03 (GRSL 2024) RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation image Link Code Semantic Segmentation
Arxiv 24.04.03 RS-Mamba for Large Remote Sensing Image Dense Prediction image Link Code Semantic Segmentation/Change Detection
Arxiv 24.04.04 ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model image Link Code Change Detection/Building Damage Assessment
Arxiv 24.04.12 SpectralMamba: Efficient Mamba for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.04.15 HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising image Link Hyperspectral Denoising
Arxiv 24.04.28 S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.04.29 Spectral-Spatial Mamba for Hyperspectral Image Classification image Link Hyperspectral Image Classification
Arxiv 24.05.02 SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients image Link Code Detection
Arxiv 24.05.02 (TGRS 2024) SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising image Link Code Hyperspectral Image Denoising
Arxiv 24.05.08 Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution image Link Super Resolution
Arxiv 24.05.13 GMSR:Gradient-Guided Mamba for Spectral Reconstruction from RGB Images image Link Code Spectral Reconstruction from RGB Images
Arxiv 24.05.14 Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study image Link Semantic Segmentation
Arxiv 24.05.16 RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing image Link Dehazing
Arxiv 24.05.17 CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation image Link Code Semantic Segmentation
Arxiv 24.05.20 Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.05.21 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification image Link Hyperspectral Image Classification
Arxiv 24.06.01 Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging image Link Code Spectral Compressive Imaging
Arxiv 24.06.06 CDMamba: Remote Sensing Image Change Detection with Mamba image Link Code Change Detection
Arxiv 24.06.09 HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model image Link Code Dehazing
Arxiv 24.06.11 DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification image Link Hyperspectral Image Classification
Arxiv 24.06.16 PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery image Link Code Semantic Segmentation
Arxiv 24.07.08 A Mamba-based Siamese Network for Remote Sensing Change Detection image Link Code Change Detection
Arxiv 24.07.09 HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model image Link Code Hyperspectral Target Detection
Arxiv 24.07.11 DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing image Link Code Oriented Object Detection
Arxiv 24.07.11 GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.08.01 Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement image Link Snapshot Compressive Imaging
Arxiv 24.08.02 Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.08.02 WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.08.02 Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.08.21 UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images image Link Code Semantic Segmentation
Arxiv 24.08.26 MSFMamba: Multi-Scale Feature Fusion State Space Model for Multi-Source Remote Sensing Image Classification image Link Code Hyperspectral Image Classification
Arxiv 24.09.05 UV-Mamba: A DCN-Enhanced State Space Model for Urban Village Boundary Identification in High-Resolution Remote Sensing Images image Link Segmentation
Arxiv 24.09.10 PPMamba: A Pyramid Pooling Local Auxiliary SSM-Based Model for Remote Sensing Image Semantic Segmentation image image Link Semantic Segmentation
Arxiv 24.09.15 SITSMamba for Crop Classification based on Satellite Image Time Series image Link Code SITS Classification
TGRS 2024 MambaHSI: Spatial–Spectral Mamba for Hyperspectral Image Classification image Link Code Hyperspectral Image Classification
GRSL 2024 MambaFormerSR: A Lightweight model for Remote-Sensing Image Super-Resolution image Link Super-Resolution
ACM MM 2024 VmambaSCI: Dynamic Deep Unfolding Network with Mamba for Compressive Spectral Imaging image Link Compressive Spectral Imaging

Medical Image

Date Paper Figure Link Code Task
Arxiv 24.01.09 U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation image Link Code 2D Medical Segmentation/3D Medical Segmentation
Arxiv 24.01.24 (MICCAI 2024) SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation image Link Code 3D Medical Segmentation
Arxiv 24.02.04 VM-UNet: Vision Mamba UNet for Medical Image Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.02.05 nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model image Link Code 3D Medical Segmentation
Arxiv 24.02.05 (MICCAI 2024) Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining image Link Code 2D Medical Segmentation
Arxiv 24.02.07 Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.02.09 FD-Vision Mamba for Endoscopic Exposure Correction image Link Code Endoscopic Exposure Correction
Arxiv 24.02.11 (KBS 2024) Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.02.13 P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation image Link 2D Medical Segmentation
Arxiv 24.02.16 Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.02.28 MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation image Link Code Medical Image Reconstruction/Uncertainty Estimation
Arxiv 24.03.06 MedMamba: Vision Mamba for Medical Image Classification image Link Code 2D Medical Classification
Arxiv 24.03.08 LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation image Link Code 2D Medical Segmentation/3D Medical Segmentation
Arxiv 24.03.08 MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models image Link Cancer Subtyping
Arxiv 24.03.11 (MICCAI 2024) MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology image Link Code Cancer Subtyping/Survival Prediction
Arxiv 24.03.12 Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention image Link Code 2D Medical Segmentation/3D Medical Segmentation
Arxiv 24.03.12 (MICCAI 2024) LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.03.13 MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction image Link Code Radiation Dose Prediction (Segmentation)
Arxiv 24.03.14 (ISBRA 2024) VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.03.20 H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.03.20 (MICCAI 2024) ProMamba: Prompt-Mamba for polyp segmentation image Link 2D Medical Segmentation
Arxiv 24.03.25 CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification image Link Alzheimer’s disease Classification (CT/MRI)
Arxiv 24.03.26 Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion image Link 2D Medical Segmentation (2D MRI)
Arxiv 24.03.26 Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space Models image Link Image Restoration
Arxiv 24.03.26 Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation image Link 2D Medical Segmentation
Arxiv 24.03.29 UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation image Link Code 2D Medical Segmentation
Arxiv 24.04.01 T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation image Link Code 3D Medical Segmentation (Tooth)
Arxiv 24.04.10 ViM-UNet: Vision Mamba for Biomedical Segmentation image Link Code 2D Medical Segmentation (Cell/Neurite)
Arxiv 24.04.19 (CVPR 2024 Workshop) Vim4Path: Self-Supervised Vision Mamba for Histopathology Images image Link Code Cancer Subtyping
Arxiv 24.04.26 Optimizing Universal Lesion Segmentation: State Space Model-Guided Hierarchical Networks with Feature Importance Adjustment image Link Universal Lesion Segmentation
Arxiv 24.04.26 Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model image Link ODT Sparse Reconstruction
Arxiv 24.05.05 AC-MAMBASEG: An adaptive convolution and Mamba-based architecture for enhanced skin lesion segmentation image Link Code Skin Lesion Segmentation
Arxiv 24.05.08 HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation image Link 2D Medical Segmentation
Arxiv 24.05.09 VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis image Link Medical Image Generation
Arxiv 24.05.24 MUCM-Net: A Mamba Powered UCM-Net for Skin Lesion Segmentation image Link Code Medical Image Segmentation
Arxiv 24.05.25 UU-Mamba: Uncertainty-aware U-Mamba for Cardiac Image Segmentation image Link Medical Image Segmentation
Arxiv 24.09.22 UU-Mamba: Uncertainty-aware U-Mamba for Cardiovascular Segmentation image Link Code Medical Image Segmentation
Arxiv 24.05.27 TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction image Link Code Pre-training/Medical Image Segmentation
Arxiv 24.05.27 Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba image Link Medical Image Reconstruction
Arxiv 24.05.28 (MICCAI 2024 Oral) Cardiovascular Disease Detection from Multi-View Chest X-rays with BI-Mamba image Link Code CVD Risk Prediction
Arxiv 24.06.01 SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation image Link Medical Image Segmentation
Arxiv 24.06.05 Combining Graph Neural Network and Mamba to Capture Local and Global Tissue Spatial Relationships in Whole Slide Images image Link Code Cancer Subtyping/Survival Prediction
Arxiv 24.06.09 Vision Mamba: Cutting-Edge Classification of Alzheimer's Disease with 3D MRI Scans image Link 3D Medical Classification
Arxiv 24.06.09 Convolution and Attention-Free Mamba-based Cardiac Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.06.10 MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba image Link Code Medical Image Segmentation
Arxiv 24.06.12 On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models Link Code Medical Image Segmentation
Arxiv 24.06.22 Soft Masked Mamba Diffusion Model for CT to MRI Conversion image Link Code CT to MRI Conversion
Arxiv 24.07.04 Vision Mamba for Classification of Breast Ultrasound Images image Link Classification
Arxiv 24.07.08 (MICCAI 2024) Deform-Mamba Network for MRI Super-Resolution image Link Super-resolution
Arxiv 24.07.08 Self-Prior Guided Mamba-UNet Networks for Medical Image Super-Resolution image Link Super-resolution
Arxiv 24.07.11 SR-Mamba: Effective Surgical Phase Recognition with State Space Model image Link Code Surgical Phase Recognition
Arxiv 24.07.11 SliceMamba for Medical Image Segmentation image Link Medical Image Segmentation
Arxiv 24.08.14 Costal Cartilage Segmentation with Topology Guided Deformable Mamba: Method and Benchmark image Link Medical Image Segmentation
Arxiv 24.08.15 MambaMIM: Pre-training Mamba with State Space Token-interpolation image Link Code Medical Image Segmentation
Arxiv 24.08.21 HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.08.23 Hierarchical Spatio-Temporal State-Space Modeling for fMRI Analysis image Link Medical Image Classification and Regression
Arxiv 24.08.25 MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.08.26 (MICCAI 2024) ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation image Link Medical Image Segmentation
Arxiv 24.08.26 LoG-VMamba: Local-Global Vision Mamba for Medical Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.08.28 SpineMamba: Enhancing 3D Spinal Segmentation in Clinical Imaging through Residual Visual Mamba Layers and Shape Priors image Link Medical Image Segmentation
Arxiv 24.09.06 MpoxMamba: A Grouped Mamba-based Lightweight Hybrid Network for Mpox Detection image Link Code Medical Image Classification
Arxiv 24.09.06 Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation with Selective State-Space Model image Link Medical Image Segmentation
Arxiv 24.09.09 SX-Stitch: An Efficient VMS-UNet Based Framework for Intraoperative Scoliosis X-Ray Image Stitching image Link Medical Image Stitching
Arxiv 24.09.12 Microscopic-Mamba: Revealing the Secrets of Microscopic Images with Just 4M Parameters image Link Code Medical Image Classification
Arxiv 24.09.12 OCTAMamba: A State-Space Model Approach for Precision OCTA Vasculature Segmentation image Link Code Medical Image Segmentation
Arxiv 24.09.12 MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation image Link Medical Image Segmentation
Arxiv 24.09.13 (MICCAI 2024) Tri-Plane Mamba: Efficiently Adapting Segment Anything Model for 3D Medical Images image Link Code Medical Image Segmentation
Arxiv 24.09.17 SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance image Link Code Medical Image Segmentation
Arxiv 24.09.18 SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba image Link Surgical Phase Recognition
Arxiv 24.09.19 MambaRecon: MRI Reconstruction with Structured State Space Models image Link Code Medical Image Reconstruction
Arxiv 24.09.19 MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.09.24 Segmentation Strategies in Deep Learning for Prostate Cancer Diagnosis: A Comparative Study of Mamba, SAM, and YOLO image Link Code Medical Image Segmentation
Arxiv 24.09.25 Classification of Gleason Grading in Prostate Cancer Histopathology Images Using Deep Learning Techniques: YOLO, Vision Transformers, and Vision Mamba image Link Code Medical Image Classification
Arxiv 24.09.26 EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image Segmentation image Link Code Medical Image Segmentation
Arxiv 24.09.28 MambaEviScrib: Mamba and Evidence-Guided Consistency Make CNN Work Robustly for Scribble-Based Weakly Supervised Ultrasound Image Segmentation image Link Code Medical Image Segmentation
KDD 2024 Workshop State Space Model-based Classification of Major Depressive Disorder Across Multiple Imaging Sites image Link Medical Image Classification
Scientific Reports 2024 A mixed Mamba U-net for prostate segmentation in MR images image Link Medical Image Segmentation
MICCAI 2024 EM-Net: Efficient Channel and Frequency Learning with Mamba for 3D Medical Image Segmentation Medical Image Segmentation
MICCAI 2024 PathMamba: Weakly Supervised State Space Model for Multi-class Segmentation of Pathology Images Medical Image Segmentation
MICCAI 2024 Efficient and Gender-adaptive Graph Vision Mamba for Pediatric Bone Age Assessment Bone Age Assessment

Video

Date Paper Figure Link Code Task
Arxiv 24.01.25 Vivim: a Video Vision Mamba for Medical Video Object Segmentation image Link Code Medical Video Segmentation
Arxiv 24.03.11 (ECCV 2024) VideoMamba: State Space Model for Efficient Video Understanding image Link Code Action Recognition/Video Understanding/Text-to-video Retrieval
Arxiv 24.03.12 SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces image Link Code Video Generation
Arxiv 24.03.14 Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding image Link Code Action Recognition/Action Localization/...
Arxiv 24.04.09 RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos image Link Code Remote Photoplethysmography Prediction
Arxiv 24.04.11 Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos image Link Skeleton Action Recognition
Arxiv 24.05.05 Matten: Video Generation with Mamba-Attention image Link Video Generation
Arxiv 24.05.30 DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark image Link Code AI-Generated Video Detection
Arxiv 24.06.18 Slot State Space Models image Link Object-centric Video Understanding/3D Visual Reasoning/Video Prediction
Arxiv 24.07.02 VideoMambaPro: A Leap Forward for Mamba in Video Understanding image Link Code Video Understanding
Arxiv 24.07.02 (NeurIPS 2024) VFIMamba: Video Frame Interpolation with State Space Models image Link Code Video Frame Interpolation
Arxiv 24.07.03 BVI-RLV: A Fully Registered Dataset and Benchmarks for Low-Light Video Enhancement image Link Code Low-Light Video Enhancement
Arxiv 24.07.04 QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024 image Link Video Action Forecasting
Arxiv 24.07.11 (ECCV 2024) VideoMamba: Spatio-Temporal Selective State Space Model image Link Code Action Recognition
Arxiv 24.07.25 Harnessing Temporal Causality for Advanced Temporal Action Detection image Link Code Moment Queries/Action Recognition/Action Detection/Audio-Based Interaction Detection
Arxiv 24.08.15 MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking image Link RGB-T Tracking
Arxiv 24.08.17 (ACM MM 2024 Oral) MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model image Link Multiple Object Tracking
Arxiv 24.08.20 DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal Mamba image Link Video Demoireing
Arxiv 24.08.31 TrackSSM: A General Motion Predictor by State-Space Model image Link Motion Prediction
Arxiv 24.09.02 FMRFT: Fusion Mamba and DETR for Query Time Sequence Intersection Fish Tracking image Link Fish Tracking
Arxiv 24.09.04 MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos image Link Code Hand Trajectory Prediction
Arxiv 24.09.18 (CCBR 2024) PhysMamba: Efficient Remote Physiological Measurement with SlowFast Temporal Difference Mamba image Link Code Remote Photoplethysmography
CVPR 2024 Workshop VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting image Link Code Spatiotemporal Forecasting
ACM MM 2024 Object-Level Pseudo-3D Lifting for Distance-Aware Tracking image Link Tracking
ACM MM 2024 (Oral) RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining image Link Code Deraining

Point Cloud

Date Paper Figure Link Code Task
Arxiv 24.02.16 (NeurIPS 2024) PointMamba: A Simple State Space Model for Point Cloud Analysis image Link Code Classification, Part Segmentation
Arxiv 24.02.23 (CVPR 2024 Spotlight, SSM) State Space Models for Event Cameras image Link Code Object Detection
Arxiv 24.03.01 Point Cloud Mamba: Point Cloud Learning via State Space Model image Link Code Classification, Part Segmentation, Semantic Segmentation
Arxiv 24.03.11 Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy image Link Code Classification, Semantic Segmentation
Arxiv 24.04.08 3DMambaIPF: A State Space Model for Iterative Point Cloud Filtering via Differentiable Rendering image Link Point Cloud Filtering
Arxiv 24.04.10 3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion image Link Point Cloud Completion
Arxiv 24.04.23 (ACM MM 2024) Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model image Link Classification, Part Segmentation
Arxiv 24.05.09 Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba image Link Classification, Regression
Arxiv 24.05.13 OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition image Link Code LiDAR Place Recognition
Arxiv 24.05.23 MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models image Link Point Cloud Video Understanding
Arxiv 24.05.24 PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis image Link Code Classification, Part Segmentation
Arxiv 24.05.27 LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling image Link Classification, Part Segmentation, Object Detection
Arxiv 24.06.07 Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs image Link Code Generation
Arxiv 24.06.10 PointABM: Integrating Bidirectional State Space Model with Multi-Head Self-Attention for Point Cloud Analysis image Link Classification
Arxiv 24.06.15 (NeurIPS 2024) Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection image Link Code Object Detection
Arxiv 24.06.25 Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model image Link Semantic Segmentation
Arxiv 24.07.15 Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model image Link Semantic Segmentation, Instance Segmentation
Arxiv 24.07.25 LION: Linear Group RNN for 3D Object Detection in Point Clouds image image Link Code Object Detection
Arxiv 24.08.19 Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms image Link Code Action Recognition
Arxiv 24.08.20 MambaEVT: Event Stream based Visual Object Tracking using State Space Model image Link Code Object Tracking
Arxiv 24.08.20 MV-MOS: Multi-View Feature Fusion for 3D Moving Object Segmentation image Link Code Object Segmentation
Arxiv 24.08.20 OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model image Link Code Semantic Prediction/Scene Completion
Arxiv 24.09.17 Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation image Link Object Detection
Arxiv 24.09.24 FSF-Net: Enhance 4D Occupancy Forecasting with Coarse BEV Scene Flow for Autonomous Driving image Link 4D Occupancy Forecasting
ACM MM 2024 MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model image Link Object Segmentation

Multi-Modal

Date Paper Figure Link Code Task Modality
Arxiv 24.01.25 MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration image Link Code Registration MRI & CT
Arxiv 24.02.19 Pan-Mamba: Effective pan-sharpening with State Space Model image Link Code Pansharpening HISR Images & LRMS Images
Arxiv 24.03.07 InstructGIE: Towards Generalizable Image Editing image Link Image Editing Image & Text
Arxiv 24.03.12 (ECCV 2024) Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM image Link Code Text-to-Motion Generation Motion & Text
Arxiv 24.03.20 VL-Mamba: Exploring State Space Models for Multimodal Learning image Link Code MLLM tasks Image & Text
Arxiv 24.03.21 Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference image Link Code MLLM tasks Image & Text
Arxiv 24.03.26 (ECCV 2024) ReMamber: Referring Image Segmentation with Mamba Twister image Link Referring Image Segmentation Image & Text
Arxiv 24.04.01 SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding image Link Temporal Video Grounding Video & Text
Arxiv 24.04.05 Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation image Link Code Semantic Segmentation RGB Images & Depth/Thermal Images
Arxiv 24.04.07 VMambaMorph: a Multi-Modality Deformable Image Registration Framework based on Visual State Space Model with Cross-Scan Module image Link Code Registration MRI & CT
Arxiv 24.04.09 Deep Mamba Multi-modal Learning image Link Multimedia Retrieval Image & Text
Arxiv 24.04.11 SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction image Link Cancer Subtyping/Survival Prediction WSIs & Gene
Arxiv 24.04.11 FusionMamba: Efficient Image Fusion with State Space Model image Link Pansharpening HISR Images & LRMS Images
Arxiv 24.04.12 MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion image Link Multi-modality Image Fusion RGB & Thermal Images, MRI & CT/PET/SPECT
Arxiv 24.04.14 Fusion-Mamba for Cross-modality Object Detection image Link Visible-infrared Images Fusion RGB Images & Infrared Images
Arxiv 24.04.14 A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion image Link Pansharpening HISR Images & LRMS Images
Arxiv 24.04.15 FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba image Link Code Image Fusion RGB & Infrared Images, MRI & CT/PET/SPECT, PC & GFP
Arxiv 24.04.17 Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion image Link Temporal Grounding Motion & Text
Arxiv 24.04.25 CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions image Link Code Visible-infrared Images Fusion RGB Images & Infrared Images
Arxiv 24.04.27 Revisiting Multi-modal Emotion Learning with Broad State Space Models and Probability-guidance Fusion image Link Multi-modal Emotion Recognition Text & Video & Audio
Arxiv 24.04.28 Mamba-FETrack: Frame-Event Tracking via State Space Model image Link Code RGB-Event Tracking RGB Frames & Event
Arxiv 24.04.29 (GRSL 2024) RSCaMa: Remote Sensing Image Change Captioning with State Space Model image Link Code Image Captioning Remote Sensing Image & Text
Arxiv 24.04.30 CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation image Link Code OOD Image & Text
Arxiv 24.05.22 I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling image Link Code Medical Image Generation MRI/CT
Arxiv 24.05.24 (NeurIPS 2024) Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models image Link Code Large Language and Vision Model Image & Text (Question/Rationale)
Arxiv 24.05.29 (NeurIPS 2024) Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model image Link Multi-modal Sentiment Analysis Text & Video & Audio
Arxiv 24.05.31 S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion image Link Image Fusion RGB Images & Infrared Images
Arxiv 24.06.02 MGI: Multimodal Contrastive Pre-training of Genomic and Medical Imaging image Link Multimodal Contrastive Pre-training Medical Image & Genomic
Arxiv 24.06.03 Dimba: Transformer-Mamba Diffusion Models image Link Code Text to Image Generation Image & Text
Arxiv 24.06.06 (NeurIPS 2024) RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation image Link Code Robot Reasoning and Manipulation Image & Text
Arxiv 24.06.10 MVGamba: Unify 3D Content Generation as State Space Sequence Modeling image Link 3D Generation Image & Text
Arxiv 24.07.02 MMR-Mamba: Multi-Contrast MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion image Link Image Fusion Multi-Contrast MRI
Arxiv 24.07.14 InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation image Link Code Text-to-Motion Generation Motion & Text
Arxiv 24.07.15 An Empirical Study of Mamba-based Pedestrian Attribute Recognition image Link Code Pedestrian Attribute Recognition Image & Text
Arxiv 24.07.15 OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting image Link 360-degree Image Out-painting Image & Text
Arxiv 24.07.22 GFE-Mamba: Mamba-based AD Multi-modal Progression Assessment via Generative Feature Extraction from MCI image Link Code AD Progression Assessment MRI & PET
Arxiv 24.07.29 ML-Mamba: Efficient Multi-Modal Large Language Model Utilizing Mamba-2 image Link Code MLLM Tasks Image & Text
Arxiv 24.07.29 (ACM MM 2024) MambaGesture: Enhancing Co-Speech Gesture Generation with Mamba and Disentangled Multi-Modality Fusion image Link Co-Speech Gesture Generation Motion & Audio
Arxiv 24.08.01 DiM-Gesture: Co-Speech Gesture Generation with Adaptive Layer Normalization Mamba-2 framework image Link Code Co-Speech Gesture Generation Motion & Audio
Arxiv 24.08.02 MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection image Link Code Pedestrian Detection RGB & Thermal Images
Arxiv 24.08.02 PhysMamba: Leveraging Dual-Stream Cross-Attention SSD for Remote Physiological Measurement image Link Remote Physiological Measurement Video & rPPG
Arxiv 24.08.03 JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Language Model image Link 3D Talking Head Generation Motion & Audio
Arxiv 24.08.07 DRAMA: An Efficient End-to-end Motion Planner for Autonomous Driving with Mamba image Link Driver Motion Plan Image & Text
Arxiv 24.08.15 ColorMamba: Towards High-quality NIR-to-RGB Spectral Translation with Mamba image Link Code NIR-to-RGB Translation NIR Images & RGB Images
Arxiv 24.08.16 RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba image Link RGBT Tracking RGB Videos & TIR Videos
Arxiv 24.08.19 R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation image Link Code Medical Report Generation Image & Text
Arxiv 24.08.19 OccMamba: Semantic Occupancy Prediction with State Space Models image Link Semantic Occupancy Prediction LiDAR Points & RGB Images
Arxiv 24.08.20 Event Stream based Sign Language Translation: A High-Definition Benchmark Dataset and A New Algorithm image Link Code Event Stream based Sign Language Translation Event & Text
Arxiv 24.08.20 MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval image Link Code Text-video Retrieval Video & Text
Arxiv 24.08.22 Adapt CLIP as Aggregation Instructor for Image Dehazing image Link Dehazing Image & Text
Arxiv 24.08.27 DualKanbaFormer: Kolmogorov-Arnold Networks and State Space Model Transformer for Multimodal Aspect-based Sentiment Analysis image Link Multi-modal Sentiment Analysis Image & Text
Arxiv 24.08.28 MambaPlace: Text-to-Point-Cloud Cross-Modal Place Recognition with Attention Mamba Mechanisms image Link Code Cross-Modal Place Recognition Point Cloud & Text
Arxiv 24.09.03 PixelBytes: Catching Unified Embedding for Multimodal Generation image Link Code Multi-Modal Generation Image & Text
Arxiv 24.09.03 Shuffle Mamba: State Space Models with Random Shuffle for Multi-Modal Image Fusion image Link Multi-Modality Image Fusion HISR Images & LRMS Images, MRI & CT/PET/SPECT
Arxiv 24.09.04 LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture image Link Code MLLM Tasks Image & Text
Arxiv 24.09.05 Why mamba is effective? Exploit Linear Transformer-Mamba Network for Multi-Modality Image Fusion image Link Multi-Modality Image Fusion RGB & Thermal Images, MRI & CT/PET/SPECT
Arxiv 24.09.08 Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations image Link Code Multi-modal Emotion Recognition Text & Audio & Video
Arxiv 24.09.09 Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling image Link Code MLLM Tasks Image & Text
Arxiv 24.09.11 Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models image Link Code Action Prediction Point Cloud & Robot State
Arxiv 24.09.13 Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection image Link Code Open-Vocabulary Detection Image & Text
Arxiv 24.09.17 Mamba Fusion: Learning Actions Through Questioning image Link Code Action Prediction/Action Anticipation Video & Text
Arxiv 24.09.22 GraspMamba: A Mamba-based Language-driven Grasp Detection Framework with Hierarchical Feature Learning image Link Code Grasp Detection Image & Text
Arxiv 24.09.24 DepMamba: Progressive Fusion Mamba for Multimodal Depression Detection image Link Code Multi-modal Depression Detection Video & Audio
Arxiv 24.09.30 MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation image Link Image Generation Image & Text
Arxiv 24.10.01 CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset image Link Code Medical Report Generation Image & Text
TGRS 2024 Mask-Guided Mamba Fusion for Drone-based Visible-Infrared Vehicle Detection image Link Cross-Modal Detection RGB Images & Infrared Images

Others

Date Paper Figure Link Code Task
Arxiv 24.02.24 Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning image Link Code Food Classification
Arxiv 24.03.08 Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy image Link Code Endoscope Tip Tracking
Arxiv 24.03.14 MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models image Link Gesture Synthesis
Arxiv 24.03.22 Music to Dance as Language Translation using Sequence Models image Link Code Music-to-Dance

Valuable Insights

Date Paper Link
Arxiv 24.03.03 The Hidden Attention of Mamba Models Link
Arxiv 24.03.15 On the low-shot transferability of [V]-Mamba? Link
Arxiv 24.03.16 Understanding Robustness of Visual State Space Models for Image Classification Link
Arxiv 24.05.13 MambaOut: Do We Really Need Mamba for Vision? Link
Arxiv 24.05.26 (NeurIPS 2024) Demystify Mamba in Vision: A Linear Attention Perspective Link
Arxiv 24.05.26 A Unified Implicit Attention Formulation for Gated-Linear Recurrent Sequence Models Link
Arxiv 24.06.11 (NeurIPS 2024) MambaLRP: Explaining Selective State Space Sequence Models Link
Arxiv 24.06.13 Towards Evaluating the Robustness of Visual State Space Models Link

Other Domains

Reinforcement Learning

Date Paper Figure Link Code
Arxiv 24.05.20 (NeurIPS 2024) Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? image Link
Arxiv 24.05.31 Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling image Link
Arxiv 24.06.04 Mamba as Decision Maker: Exploring Multi-scale Sequence Modeling in Offline Reinforcement Learning image Link Code
Arxiv 24.06.08 (NeurIPS 2024) Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL image Link
Arxiv 24.06.12 MaIL: Improving Imitation Learning with Mamba image image Link
Arxiv 24.06.21 KalMamba: Towards Efficient Probabilistic State Space Models for RL under Uncertainty image Link
Arxiv 24.08.05 Context-aware Mamba-based Reinforcement Learning for social robot navigation image Link
Arxiv 24.08.20 Integrating Multi-Modal Input Token Mixer Into Mamba-Based Decision Models: Decision MetaMamba image Link Code
Arxiv 24.09.04 Mamba as a motion encoder for robotic imitation learning image Link
Arxiv 24.09.23 DiSPo: Diffusion-SSM based Policy Learning for Coarse-to-Fine Action Discretization image Link

Graph Learning

Date Paper Figure Link Code
Arxiv 24.02.13 (KDD 2024) Graph Mamba: Towards Learning on Graphs with State Space Models image Link Code
Arxiv 24.05.22 HeteGraph-Mamba: Heterogeneous Graph Learning via Selective State Space Model image Link
Arxiv 24.08.08 DyGMamba: Efficiently Modeling Long-Term Temporal Dependency on Continuous-Time Dynamic Graphs with State Space Models image Link
Arxiv 24.08.13 DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs image Link
Arxiv 24.09.18 Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes image Link
KDD 2024 Workshop Identifying Subphenotypes for Sepsis with Acute Kidney Injury via Multimodal Graph State Space Models image Link

Audio

Date Paper Figure Link Code
Arxiv 24.04.02 SPMamba: State-space model is all you need in speech separation image Link
Arxiv 24.05.02 TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms image Link
Arxiv 24.05.10 An Investigation of Incorporating Mamba for Speech Enhancement image Link
Arxiv 24.05.20 SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model image Link Code
Arxiv 24.05.21 Mamba in Speech: Towards an Alternative to Self-Attention image Link
Arxiv 24.05.22 Audio Mamba: Pretrained Audio State Space Model For Audio Tagging Link Code
Arxiv 24.06.04 Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations image Link Code
Arxiv 24.06.05 Audio Mamba: Bidirectional State Space Model for Audio Representation Learning image Link Code
Arxiv 24.06.10 RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection image Link Code
Arxiv 24.06.24 (Interspeech 2024) Exploring the Capability of Mamba in Speech Applications image Link
Arxiv 24.07.13 Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis image Link Mamba-TasNet Code ConMamba Code
Arxiv 24.08.09 SELD-Mamba: Selective State-Space Model for Sound Event Localization and Detection with Source Distance Estimation image Link
Arxiv 24.09.04 MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision image Link
Arxiv 24.09.04 (SLT 2024 Workshop) An Analysis of Linear Complexity Attention Substitutes with BEST-RQ image Link
Arxiv 24.09.07 Cross-attention Inspired Selective State Space Models for Target Sound Extraction image image Link
Arxiv 24.09.08 TF-Mamba: A Time-Frequency Network for Sound Source Localization image Link
Arxiv 24.09.09 Vector Quantized Diffusion Model Based Speech Bandwidth Extension image Link
Arxiv 24.09.10 A Two-Stage Band-Split Mamba-2 Network for Music Separation image Link Code
Arxiv 24.09.11 Rethinking Mamba in Speech Processing by Self-Supervised Models image Link Code
Arxiv 24.09.13 MambaFoley: Foley Sound Generation using Selective State-Space Models image Link Code
Arxiv 24.09.14 Wave-U-Mamba: An End-To-End Framework For High-Quality And Efficient Speech Super Resolution image Link
Arxiv 24.09.15 Self-supervised Learning for Acoustic Few-Shot Classification image Link
Arxiv 24.09.16 Ultra-Low Latency Speech Enhancement - A Comprehensive Study Link
Arxiv 24.09.16 Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement image Link
Arxiv 24.09.18 Dense-TSNet: Dense Connected Two-Stage Structure for Ultra-Lightweight Speech Enhancement image Link Code
Arxiv 24.09.19 DeFT-Mamba: Universal Multichannel Sound Separation and Polyphonic Audio Classification image Link
Arxiv 24.09.26 MC-SEMamba: A Simple Multi-channel Extension of SEMamba image Link
Arxiv 24.09.27 Speech-Mamba: Long-Context Speech Recognition with Selective State Spaces Models image Link Code
Arxiv 24.09.30 Mamba for Streaming ASR Combined with Unimodal Aggregation image Link Code
Arxiv 24.10.01 Zero-Shot Text-to-Speech from Continuous Text Streams image Link Code
Expert Systems with Applications 2024 A barking emotion recognition method based on Mamba and Synchrosqueezing Short-Time Fourier Transform image Link Code

Time Series

Date Paper Figure Link Code
Arxiv 24.03.14 (ECAI 2024) TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting image Link Code
Arxiv 24.04.23 SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting image Link Code
Arxiv 24.04.23 Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting image Link
Arxiv 24.04.24 Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting image Link
Arxiv 24.05.11 DTMamba: Dual Twin Mamba for Time Series Forecasting image Link
Arxiv 24.05.25 Time-SSM: Simplifying and Unifying State Space Models for Time Series Forecasting image Link
Arxiv 24.05.26 MambaTS: Improved Selective State Space Models for Long-term Time Series Forecasting image Link
Arxiv 24.06.06 Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models image Link
Arxiv 24.06.06 TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification image Link
Arxiv 24.06.08 C-Mamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting image Link Code
Arxiv 24.06.17 SpoT-Mamba: Learning Long-Range Dependency on Spatio-Temporal Graphs with Selective State Spaces image Link Code
Arxiv 24.07.15 MSegRNN: Enhanced SegRNN Model with Mamba for Long-Term Time Series Forecasting image Link
Arxiv 24.07.20 FMamba: Mamba based on Fast-attention for Multivariate Time-series Forecasting image Link
Arxiv 24.08.04 Mamba-Spike: Enhancing the Mamba Architecture with a Spiking Front-End for Efficient Temporal Data Processing image Link Code
Arxiv 24.08.22 (CGI 2024) Simplified Mamba with Disentangled Dependency Encoding for Long-Term Time Series Forecasting image Link
Arxiv 24.08.27 Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need image image Link Code
Arxiv 24.09.13 (ICECCE 2024) Integration of Mamba and Transformer -- MAT for Long-Short Range Time Series Forecasting with Application to Weather Dynamics image Link
Arxiv 24.09.21 Test Time Learning for Time Series Forecasting image Link
Arxiv 24.09.30 A SSM is Polymerized from Multivariate Time Series image Link Code
