From f1aed083d17164826ff5f6f8bf8c7f31f3c6e9df Mon Sep 17 00:00:00 2001
From: Zizheng Pan
Date: Fri, 3 Jun 2022 13:38:46 +1000
Subject: [PATCH] update readme

---
 README.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/README.md b/README.md
index e9fb319..1e74110 100644
--- a/README.md
+++ b/README.md
@@ -5,13 +5,11 @@ By [Zizheng Pan](https://scholar.google.com.au/citations?user=w_VMopoAAAAJ&hl=en
 
 ![hilo](.github/arch.png)
 
-LITv2 Architecture
 
 We introduce LITv2, a simple and effective ViT that performs favourably against existing state-of-the-art methods across a spectrum of model sizes while running faster.
 
 ![hilo](.github/hilo.png)
 
-HiLo Attention
 
 The core of LITv2 is **HiLo attention**. HiLo is motivated by the insight that high frequencies in an image capture local fine details while low frequencies capture global structure, whereas a standard multi-head self-attention layer ignores this distinction. We therefore disentangle the high- and low-frequency patterns within an attention layer by splitting the heads into two groups: one group encodes high frequencies via self-attention within each local window, and the other group models global relationships by attending each query position in the input feature map to low-frequency keys average-pooled from each window.
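The two-branch split described in the README can be sketched in plain NumPy. This is a simplified illustration only, not the repository's implementation: the real LITv2 uses learned q/k/v projections and multiple heads per branch, and the names `hilo_attention`, `window`, and `alpha` here are illustrative assumptions, not the project's API.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def hilo_attention(x, window=2, alpha=0.5):
    """Minimal sketch of HiLo attention on an (H, W, C) feature map.

    window: local window size s (H and W must be divisible by s).
    alpha:  fraction of channels given to the low-frequency branch.
    Channels stand in for head groups; real HiLo splits attention heads.
    """
    H, W, C = x.shape
    s = window
    assert H % s == 0 and W % s == 0
    cl = int(C * alpha)          # low-frequency channels
    ch = C - cl                  # high-frequency channels
    x_hi, x_lo = x[..., :ch], x[..., ch:]

    # Hi-Fi branch: self-attention inside each non-overlapping s x s window,
    # capturing local fine details.
    out_hi = np.zeros_like(x_hi)
    for i in range(0, H, s):
        for j in range(0, W, s):
            win = x_hi[i:i + s, j:j + s].reshape(s * s, ch)   # (s^2, ch)
            attn = softmax(win @ win.T / np.sqrt(ch))          # (s^2, s^2)
            out_hi[i:i + s, j:j + s] = (attn @ win).reshape(s, s, ch)

    # Lo-Fi branch: every query attends to one average-pooled token per
    # window, modelling global structure with far fewer keys.
    pooled = x_lo.reshape(H // s, s, W // s, s, cl).mean(axis=(1, 3))
    kv = pooled.reshape(-1, cl)                                # (HW/s^2, cl)
    q = x_lo.reshape(-1, cl)                                   # (HW, cl)
    attn = softmax(q @ kv.T / np.sqrt(cl))                     # (HW, HW/s^2)
    out_lo = (attn @ kv).reshape(H, W, cl)

    # Concatenate the two groups back along the channel dimension.
    return np.concatenate([out_hi, out_lo], axis=-1)
```

Note the efficiency argument this sketch makes visible: the Lo-Fi attention matrix has only `H*W / s^2` columns instead of `H*W`, which is why pooling the low-frequency keys per window reduces the cost of modelling global relationships.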