consolidated.safetensors #9916
base: master
Easier handling (e.g. for Ministral).
  for filename in os.listdir(dir_model):
-     if filename.startswith(prefix) and filename.endswith(suffix):
+     if any(filename.startswith(prefix) for prefix in prefixes) and any(filename.endswith(suffix) for suffix in suffixes):
          part_names.append(filename)
+     elif filename == "consolidated.safetensors":
+         part_names.append(filename)
What if there are both model*.safetensors files and consolidated.safetensors in the same directory?
For example, https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1/ (which needs #9126) has both consolidated.safetensors and model-0000?-of-00003.safetensors.
Since git config --local lfs.fetchinclude <some_pattern> can be used to selectively download model files, I'm not sure how to handle that case if consolidated.safetensors is detected. I think the convert script should not use both at once (since duplicated tensor names are problematic), but how to choose?
What do you think?
Indeed. Something like the following, then?
def get_model_part_names(dir_model: Path, prefixes: list[str], suffixes: list[str]) -> list[str]:
    """
    Retrieves the list of model part filenames from the model directory.
    Prioritizes 'model-XXXX-of-XXXX.safetensors' files over 'consolidated.safetensors'.

    Parameters:
    - dir_model (Path): Path to the model directory.
    - prefixes (list[str]): List of filename prefixes to match.
    - suffixes (list[str]): List of filename suffixes to match.

    Returns:
    - list[str]: Sorted list of model part filenames.
    """
    part_names: list[str] = []

    # Collect files matching the given prefixes and suffixes
    for filename in os.listdir(dir_model):
        if any(filename.startswith(prefix) for prefix in prefixes) and any(filename.endswith(suffix) for suffix in suffixes):
            part_names.append(filename)
        elif filename == "consolidated.safetensors":
            part_names.append(filename)

    # Sort the list for consistency
    part_names.sort()

    # Check if both split files and 'consolidated.safetensors' are present
    split_files = [f for f in part_names if f.startswith("model-") and f.endswith(".safetensors")]
    consolidated_present = "consolidated.safetensors" in part_names

    if split_files and consolidated_present:
        logger.debug("Both split model files and 'consolidated.safetensors' found. Ignoring 'consolidated.safetensors'.")
        # Remove 'consolidated.safetensors' from part_names
        part_names = [f for f in part_names if f != "consolidated.safetensors"]

    # Final sort after potential removal
    part_names.sort()

    if not part_names:
        logger.warning("No model weight files found in the directory.")

    return part_names
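To sanity-check the priority rule discussed above (split parts win over consolidated.safetensors when both are present), here is a minimal, self-contained sketch that runs the same selection idea against throwaway directories. The names select_model_parts and demo are hypothetical and not part of the convert script. Note one deliberate difference from the proposal above: this sketch treats anything matching the given prefixes and suffixes as a split part, rather than hard-coding the "model-" prefix.

```python
from __future__ import annotations

import logging
import os
import tempfile
from pathlib import Path

logger = logging.getLogger(__name__)

CONSOLIDATED = "consolidated.safetensors"


def select_model_parts(dir_model: Path, prefixes: list[str], suffixes: list[str]) -> list[str]:
    """Hypothetical helper: prefer split parts, fall back to the consolidated file."""
    names = os.listdir(dir_model)
    # Split parts are files matching any prefix and any suffix,
    # excluding the consolidated file itself.
    split_parts = sorted(
        n for n in names
        if n != CONSOLIDATED
        and any(n.startswith(p) for p in prefixes)
        and any(n.endswith(s) for s in suffixes)
    )
    if split_parts:
        if CONSOLIDATED in names:
            logger.warning("Both split parts and %s found; ignoring %s.", CONSOLIDATED, CONSOLIDATED)
        return split_parts
    # No split parts: use the consolidated file if it exists.
    return [CONSOLIDATED] if CONSOLIDATED in names else []


def demo() -> dict[str, list[str]]:
    """Run the helper against three empty-file layouts in temporary directories."""
    layouts = {
        "both": [
            "model-00001-of-00002.safetensors",
            "model-00002-of-00002.safetensors",
            CONSOLIDATED,
        ],
        "consolidated_only": [CONSOLIDATED, "tokenizer.json"],
        "empty": ["README.md"],
    }
    results: dict[str, list[str]] = {}
    for label, files in layouts.items():
        with tempfile.TemporaryDirectory() as tmp:
            for name in files:
                (Path(tmp) / name).touch()
            results[label] = select_model_parts(Path(tmp), ["model"], [".safetensors"])
    return results


if __name__ == "__main__":
    for label, parts in demo().items():
        print(label, parts)
```

With this layout, a directory holding both kinds of file yields only the split parts, a directory holding only consolidated.safetensors yields that single file, and a directory with no weights yields an empty list. Whether split parts should be preferred is still a judgment call; the sketch only makes the choice explicit and testable.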