Dear authors,
Thank you for your great work and the publicly available code! I noticed in Table 3 of the paper that the fine-tuning parameter percentage for PYRA in ViT-B/16 is listed as 0.35%. Could you please clarify how this percentage is calculated?
When I try to replicate the results using the `experiments/LoRA/ViT-B_prompt_lora_8.yaml` configuration from the public code, I get the following parameter information:
total training parameters: 399652 (adapter 0, LoRA 294912, prompt 0, prefix 0, PYRA 27840, head 76900)
total parameters in model: 86198308
However, when I calculate the percentage as `(LoRA + PYRA) / total`, I get `(294912 + 27840) / 86198308 = 0.3744%`. Alternatively, when I calculate it as `(LoRA + PYRA) / (total - head)`, I get `(294912 + 27840) / (86198308 - 76900) = 0.3748%`, which differs from the value of 0.35% in Table 3.
Similarly, when I use the `experiments/LoRA/ViT-L_prompt_lora_12.yaml` configuration to fine-tune ViT-L, the parameter information is as follows:
total training parameters: 1356068 (adapter 0, LoRA 1179648, prompt 0, prefix 0, PYRA 73920, head 102500)
total parameters in model: 304657700
The percentage I calculate is `(1179648 + 73920) / 304657700 = 0.4115%` or `(1179648 + 73920) / (304657700 - 102500) = 0.4116%`, which again does not match the value of 0.40% in Table 3.
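For reference, here is the arithmetic above as a short, self-contained Python snippet (the helper name `tuned_ratio` is mine, purely for illustration):

```python
# Sanity check of the percentages quoted above; uses only the numbers from the logs.
def tuned_ratio(lora: int, pyra: int, total: int, head: int = 0) -> float:
    """Fine-tuned parameter percentage: 100 * (LoRA + PYRA) / (total - head)."""
    return 100 * (lora + pyra) / (total - head)

# ViT-B/16 numbers from the log above
print(f"{tuned_ratio(294912, 27840, 86198308):.4f}%")             # 0.3744%
print(f"{tuned_ratio(294912, 27840, 86198308, 76900):.4f}%")      # 0.3748%

# ViT-L numbers from the log above
print(f"{tuned_ratio(1179648, 73920, 304657700):.4f}%")           # 0.4115%
print(f"{tuned_ratio(1179648, 73920, 304657700, 102500):.4f}%")   # 0.4116%
```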
Could you please explain how the fine-tuning parameter percentage is computed in the paper? Am I misunderstanding the calculation process?
Thank you for your time and assistance!
Hello! Sorry for the discrepancy between our codebase and the paper. When we were working on the draft, we used a version of PYRA without the LayerNorm, so the numbers in the paper correspond to that LayerNorm-free version.
You are correct, and that was a good catch. We will update the arXiv draft accordingly.
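For anyone reproducing the numbers in the table, here is a minimal sketch of how the LayerNorm-free count could be obtained with plain PyTorch. Identifying the PYRA LayerNorms by module type is an assumption on my part; depending on how the modules are registered in the codebase, a name-based filter may be needed instead.

```python
import torch.nn as nn

def count_trainable(model: nn.Module, exclude_layernorm: bool = False) -> int:
    """Count trainable parameters, optionally excluding all LayerNorm weights.

    Assumption: the LayerNorms to exclude can be found by module type;
    the actual module layout in the released code may differ.
    """
    skip = set()
    if exclude_layernorm:
        for m in model.modules():
            if isinstance(m, nn.LayerNorm):
                skip.update(id(p) for p in m.parameters())
    return sum(p.numel() for p in model.parameters()
               if p.requires_grad and id(p) not in skip)

# Usage sketch: percentage relative to the full model size.
# ratio = 100 * count_trainable(model, exclude_layernorm=True) / total_params
```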