ilab data generate does not specify the correct num of samples generated #227

Open
alinaryan opened this issue Jul 29, 2024 · 4 comments · May be fixed by #445
Assignees
Labels
bug Something isn't working

Comments

@alinaryan
Member

When running through the CLI e2e workflow, the output of the 'ilab data generate' command said only 2 samples were generated when more than 2 actually were. The command should report the correct number of generated samples every time.

@alinaryan alinaryan added the bug Something isn't working label Jul 29, 2024
@aakankshaduggal
Member

Thanks @alinaryan for pointing this out. Could you also share what command you were using and what your final output looked like?

@khaledsulayman khaledsulayman self-assigned this Jul 29, 2024
@nathan-weinberg nathan-weinberg changed the title 'ilab data generate' does not specify the correct num of samples generated ilab data generate does not specify the correct num of samples generated Aug 20, 2024

This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days.

@github-actions github-actions bot added the stale label Nov 19, 2024
@bbrowning
Contributor

This was fixed quite a while back, but looks like we forgot to close this issue. Thanks for reporting it, and sorry we missed closing it when it was fixed!

@bbrowning
Contributor

Actually, it looks like this is not fixed! I just rediscovered this issue in some of our CI logs, so reopening.

@bbrowning bbrowning reopened this Dec 10, 2024
bbrowning added a commit to bbrowning/instructlab-sdg that referenced this issue Dec 10, 2024
We were logging the length of our `generated_data` list instead of the
number of newly generated samples in each leaf node, causing confusion
as this is typically 1, 2, 3, etc instead of 50, 500, 1000, etc that
users would expect.

Fixes instructlab#227

Signed-off-by: Ben Browning <[email protected]>
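
The miscount described in the commit message above can be illustrated with a minimal Python sketch. This is not the actual instructlab-sdg code: the function name `count_logged_samples`, the logger name, and the shape of the input are hypothetical, and only the `generated_data` name comes from the commit message. It assumes each leaf node yields a list of new samples that is appended to `generated_data`.

```python
import logging

logger = logging.getLogger("sdg_sketch")  # hypothetical logger name


def count_logged_samples(leaf_node_outputs):
    """Return the counts a buggy and a fixed log line would report per leaf node.

    leaf_node_outputs: a list with one entry per leaf node, where each entry
    is the list of samples newly generated for that node (assumed shape).
    """
    generated_data = []
    buggy_counts, fixed_counts = [], []
    for new_samples in leaf_node_outputs:
        generated_data.append(new_samples)

        # Buggy: length of the outer list, i.e. leaf nodes processed so far.
        buggy = len(generated_data)
        # Fixed: number of samples newly generated for this leaf node.
        fixed = len(new_samples)

        logger.info("Generated %d samples", fixed)
        buggy_counts.append(buggy)
        fixed_counts.append(fixed)
    return buggy_counts, fixed_counts


# Three leaf nodes producing 50, 500 and 1000 samples respectively.
buggy, fixed = count_logged_samples([[0] * 50, [0] * 500, [0] * 1000])
```

With this input the buggy counts grow as 1, 2, 3 while the fixed counts are 50, 500, 1000, which matches the confusion described in the commit message.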
@github-actions github-actions bot removed the stale label Dec 11, 2024
4 participants