Request: integrate the latest Aquila-7B and baichuan-7B models #45

Open
1 task
AILWQ opened this issue Jun 16, 2023 · 9 comments
Labels
enhancement New feature or request wontfix This will not be worked on

Comments

@AILWQ

AILWQ commented Jun 16, 2023

  • [✅] I checked to make sure that this is not a duplicate issue
  • I'm submitting the request to the correct repository (for model requests, see here)

Describe the solution you'd like

As the title says, I hope the author can integrate BAAI's Aquila-7B and Baichuan's baichuan-7B. Thanks 🙏

@AILWQ AILWQ added the enhancement New feature or request label Jun 16, 2023
@shibing624
Owner

Training is in progress.

@shibing624
Owner

The training code is generic, so I only need to adapt it slightly.

@AILWQ
Author

AILWQ commented Jun 16, 2023

Training is in progress.

Thanks!

@shibing624
Owner

@AILWQ
Author

AILWQ commented Jun 17, 2023

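Training for baichuan-7B is now supported: https://github.com/shibing624/textgen/blob/main/examples/gpt/training_baichuan_mydata_demo.py

Thanks! But I hit a bug when running it:
[image]

It looks like the input and target dimensions do not match when the cross-entropy loss is computed. Why does this error occur?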

@shibing624
Owner

Have you updated the code? This error is usually caused by input_ids and labels having mismatched dimensions after the collator.
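One way to narrow down this kind of mismatch is to check each sample before padding. The following is a minimal sketch in plain Python, not textgen's actual collator: the function name `collate`, the default `pad_token_id=0`, and the feature layout (dicts with `input_ids` and `labels` lists) are assumptions made for illustration. The value -100 is the default `ignore_index` of `torch.nn.CrossEntropyLoss`, so label positions padded with it do not contribute to the loss.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def collate(
    features: List[Dict[str, List[int]]],
    pad_token_id: int = 0,  # assumption: the real value comes from the tokenizer
) -> Dict[str, List[List[int]]]:
    """Right-pad input_ids and labels so both end up with the same shape."""
    # Fail early and name the offending sample instead of failing in the loss.
    for i, f in enumerate(features):
        if len(f["input_ids"]) != len(f["labels"]):
            raise ValueError(
                f"sample {i}: input_ids has {len(f['input_ids'])} tokens "
                f"but labels has {len(f['labels'])}"
            )

    max_len = max(len(f["input_ids"]) for f in features)
    batch: Dict[str, List[List[int]]] = {"input_ids": [], "labels": []}
    for f in features:
        pad = max_len - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * pad)
        # Padded label positions are masked out of the loss.
        batch["labels"].append(f["labels"] + [IGNORE_INDEX] * pad)
    return batch


features = [
    {"input_ids": [5, 6, 7], "labels": [-100, 6, 7]},
    {"input_ids": [8, 9], "labels": [-100, 9]},
]
batch = collate(features)
```

If the `ValueError` fires, the mismatch is introduced during preprocessing rather than during padding, which narrows down where to look.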

@AILWQ
Author

AILWQ commented Jun 18, 2023

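Have you updated the code? This error is usually caused by input_ids and labels having mismatched dimensions after the collator.

I downloaded and installed the latest code, but the problem persists. Separately, I ran into another issue when running ChatGLM-6B: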

/data/home/scv9197/.conda/envs/competition/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:731: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:245.)
  tensor = as_tensor(value)

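After that, data loading was extremely slow (70,000 samples took two and a half hours). This did not happen before, so I wonder whether the author changed the data-loading code.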
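The warning above recommends converting a list of arrays into a single `numpy.ndarray` before creating a tensor. A minimal sketch of that conversion follows, assuming the list is built in your own preprocessing code and every row has the same length; the helper name `stack_encoded` is hypothetical. Whether this accounts for the two-and-a-half-hour load time is not established in this thread.

```python
import numpy as np


def stack_encoded(rows):
    """Combine equal-length 1-D arrays into one 2-D int64 array.

    np.stack raises ValueError if the rows differ in shape, which also
    surfaces ragged (unpadded) samples early.
    """
    return np.stack(rows).astype(np.int64)


rows = [np.array([1, 2, 3]), np.array([4, 5, 6])]
batch = stack_encoded(rows)
# The single array can then be converted in one call, for example with
# torch.as_tensor(batch), instead of converting a Python list of arrays.
```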

@shibing624
Owner

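I'm not sure what your data format is. Does it include multi-turn dialogues?

Also, two and a half hours is not normal; loading usually takes under 2 minutes.

For baichuan-7B, I have completed SFT on both the alpaca and belle-multi-round data. If your data might be the problem, test with the sample data first, then switch to your own data once that works.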


stale bot commented Dec 27, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. (Due to prolonged inactivity, the bot is automatically closing this issue; feel free to ask again if needed.)

@stale stale bot added the wontfix This will not be worked on label Dec 27, 2023
2 participants