Add resume for adapter_v2, enable continued finetuning for adapter #1354

Open · wants to merge 2 commits into main
Conversation

altria-zewei-wang (Author)

Hi all!
I was looking at #238 and added a function to resume finetuning for the adapter. It searches out_dir for the checkpoint saved with the largest step count and restores the state_dict from it.
Current problem: I update step_count on resume, but carrying over the iteration count from the previous run would require reading the metrics in the log folder. I don't see how to retrieve the matching version of metrics.csv without adding an extra argument for it (currently not implemented).
Let me know what you think! Thanks for your repo!
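
(For readers following along, a minimal sketch of the checkpoint-discovery step described above; the step-<n> directory layout and file pattern are assumptions for illustration, not necessarily what this PR implements.)

```python
from pathlib import Path
from typing import Optional


def find_latest_checkpoint(out_dir: Path) -> Optional[Path]:
    """Return the checkpoint saved at the largest step count, if any.

    Assumes checkpoints are written as out_dir/step-<n>/*.pth, mirroring
    the layout used by litgpt's full-finetuning script.
    """
    candidates = list(out_dir.rglob("step-*/*.pth"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: int(p.parent.name.split("-")[1]))
```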

```diff
@@ -16,12 +16,12 @@
 from torchmetrics import RunningMean

 from litgpt.adapter_v2 import GPT, Block, Config, adapter_filter, mark_only_adapter_v2_as_trainable
-from args import EvalArgs, TrainArgs
+from litgpt.args import EvalArgs, TrainArgs
```
Collaborator

This is also how we import elsewhere and looks good to me.

rasbt (Collaborator) commented Apr 25, 2024

Thanks for looking into this. Sorry, I haven't spent much time thinking through the ramifications here, but would the simple resuming from the full finetuning code not work in your case?

https://github.com/Lightning-AI/litgpt/blob/main/litgpt/finetune/full.py#L43
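
(For context, the approach in the linked full.py keeps the training counters inside the checkpoint state itself, so resuming doesn't need to consult any log files. Below is a hedged sketch of that pattern, not verbatim litgpt code; the toy model, out_dir path, and step-<n> layout are stand-ins.)

```python
from pathlib import Path

import lightning as L
import torch

fabric = L.Fabric(accelerator="cpu", devices=1)
fabric.launch()

model = torch.nn.Linear(4, 4)  # stand-in for the GPT model
optimizer = torch.optim.AdamW(model.parameters())

# Counters live in the same state dict as the weights, so resuming
# restores them without reading metrics.csv from the log folder.
state = {"model": model, "optimizer": optimizer, "iter_num": 0, "step_count": 0}

out_dir = Path("out/finetune/adapter_v2")  # hypothetical output directory
checkpoints = sorted(
    out_dir.rglob("step-*/*.pth"),
    key=lambda p: int(p.parent.name.split("-")[1]),
)
if checkpoints:
    # fabric.load restores weights, optimizer state, and counters in place
    fabric.load(checkpoints[-1], state)
```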

altria-zewei-wang (Author)

I was specifically testing finetuning with adapters and LoRA for my paper, and my GPU allocation cuts off after a certain time limit. I figured adding this feature could help anyone in a similar situation.
