amrita-rajput changed the title from "Increasing the number of samples in generated synthetic data" to "Not able to Increase the number of samples in generated synthetic data" on Nov 8, 2024
Because the number of generated samples is always stuck at 30, this user is generating skills and hitting #420. Our behavior today is confusing: we do use --sdg-scale-factor to scale some of our generated data, but we only ever mix 30 samples from any taxonomy leaf node into the final output.
I'm using the following command to generate synthetic data with --sdg-scale-factor set to 100:
ilab data generate --endpoint-url http://localhost:8080/v1 --chunk-word-count 1000 --model mistralai/mistral-large --pipeline full --sdg-scale-factor 100
The number of samples in the generated synthetic data always stays at 30 and is not affected by the --sdg-scale-factor value.
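To make the reported behavior concrete, here is a minimal Python sketch of a per-leaf cap applied at mixing time. It is an illustration only, not InstructLab's actual code: the names `SAMPLES_PER_LEAF`, `mix_leaf_samples`, and `final_counts` are hypothetical, and the only detail taken from this thread is the figure of 30 samples per taxonomy leaf node.

```python
import random

# Hypothetical constant: the per-leaf cap of 30 mentioned in this thread.
SAMPLES_PER_LEAF = 30


def mix_leaf_samples(generated, cap=SAMPLES_PER_LEAF, seed=0):
    """Return at most `cap` samples from one leaf node's generated pool."""
    if len(generated) <= cap:
        return list(generated)
    # Down-sample deterministically so the sketch is reproducible.
    return random.Random(seed).sample(generated, cap)


def final_counts(pool_sizes, cap=SAMPLES_PER_LEAF):
    """Map each leaf name to the number of samples that survive mixing."""
    return {
        leaf: len(mix_leaf_samples(list(range(size)), cap))
        for leaf, size in pool_sizes.items()
    }
```

Under these assumptions, a larger scale factor only enlarges the pool passed to `mix_leaf_samples`; the mixed output per leaf stays at 30, which would match the symptom reported here.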