Allow PredictionWriter to create different kinds of saved output #265

Open
sjfleming opened this issue Nov 7, 2024 · 1 comment

Comments

@sjfleming
Contributor

Perhaps write_prediction could be an input argument to PredictionWriter, so that a different writer function could be specified for different use cases? Not sure whether this would work.

Currently, each batch is written to a separate CSV output file by write_prediction():

def write_prediction(
    prediction: torch.Tensor,
    ids: np.ndarray,
    output_dir: Path | str,
    postfix: int | str,
) -> None:
    """
    Write prediction to a CSV file.

    Args:
        prediction:
            The prediction to write.
        ids:
            The IDs of the cells.
        output_dir:
            The directory to write the prediction to.
        postfix:
            A postfix to add to the CSV file name.
    """
    if not os.path.exists(output_dir):
        os.makedirs(output_dir, exist_ok=True)
    df = pd.DataFrame(prediction.cpu())
    df.insert(0, "db_ids", ids)
    output_path = os.path.join(output_dir, f"batch_{postfix}.csv")
    df.to_csv(output_path, header=False, index=False)

Would it be possible, for instance, to keep appending every batch to a single h5 file instead? Perhaps PredictionWriter could be made to accommodate other kinds of output writers.
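
To make the "writer function as an input argument" idea concrete, here is a minimal sketch, not the project's actual API. SimplePredictionWriter, write_fn, write_prediction_csv and write_prediction_npz are hypothetical names, the class is a plain stand-in rather than the real PredictionWriter, and predictions are taken as NumPy arrays (for example after prediction.cpu().numpy()) to keep the example light on dependencies.

```python
import csv
from pathlib import Path
from typing import Callable, Union

import numpy as np

# Hypothetical alias: any callable with the same argument order as write_prediction.
WriteFn = Callable[[np.ndarray, np.ndarray, Union[Path, str], Union[int, str]], None]


def write_prediction_csv(prediction, ids, output_dir, postfix) -> None:
    """Write one CSV file per batch, mirroring the current behavior."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    with open(output_dir / f"batch_{postfix}.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for cell_id, row in zip(np.asarray(ids).tolist(), np.asarray(prediction).tolist()):
            writer.writerow([cell_id, *row])


def write_prediction_npz(prediction, ids, output_dir, postfix) -> None:
    """Alternative writer: one .npz archive per batch."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    np.savez(output_dir / f"batch_{postfix}.npz", db_ids=ids, prediction=prediction)


class SimplePredictionWriter:
    """Hypothetical stand-in for PredictionWriter that delegates to write_fn."""

    def __init__(self, output_dir: Union[Path, str], write_fn: WriteFn = write_prediction_csv) -> None:
        self.output_dir = Path(output_dir)
        self.write_fn = write_fn

    def write_batch(self, prediction: np.ndarray, ids: np.ndarray, batch_idx: int) -> None:
        # The batch index is used as the file postfix, as in batch_{postfix}.csv.
        self.write_fn(prediction, ids, self.output_dir, batch_idx)
```

With this shape, SimplePredictionWriter("out") keeps the CSV behavior and SimplePredictionWriter("out", write_fn=write_prediction_npz) switches the format without touching the writer class. A per-call function fits formats that produce one file per batch; a writer that must keep appending to a single file probably needs state, which a plain function does not carry as naturally.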

@sjfleming
Contributor Author

Maybe the more sensible thing to do is to have a separate PredictionWriter class, like PredictionWriterH5, for that kind of thing.
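
A rough sketch of what such a class might look like, under stated assumptions: it uses h5py with resizable datasets, the IDs are numeric (h5py does not store NumPy unicode arrays directly), and the dataset names "prediction" and "db_ids" are chosen only for illustration. The class below is a plain Python object; hooking it into the framework's prediction callbacks is left out.

```python
from pathlib import Path
from typing import Union

import h5py
import numpy as np


class PredictionWriterH5:
    """Hypothetical writer that appends every batch to one HDF5 file."""

    def __init__(self, output_path: Union[Path, str]) -> None:
        self.output_path = Path(output_path)

    def write_batch(self, prediction: np.ndarray, ids: np.ndarray) -> None:
        prediction = np.asarray(prediction)
        ids = np.asarray(ids)
        self.output_path.parent.mkdir(parents=True, exist_ok=True)

        # Mode "a" opens the file for read/write and creates it if it is missing.
        with h5py.File(self.output_path, "a") as f:
            if "prediction" not in f:
                # maxshape=None along axis 0 lets the datasets grow batch by batch.
                f.create_dataset(
                    "prediction",
                    shape=(0, prediction.shape[1]),
                    maxshape=(None, prediction.shape[1]),
                    dtype=prediction.dtype,
                    chunks=True,
                )
                f.create_dataset(
                    "db_ids",
                    shape=(0,),
                    maxshape=(None,),
                    dtype=ids.dtype,
                    chunks=True,
                )

            n_old = f["prediction"].shape[0]
            n_new = n_old + prediction.shape[0]

            # Grow both datasets, then fill the newly added rows.
            f["prediction"].resize(n_new, axis=0)
            f["db_ids"].resize(n_new, axis=0)
            f["prediction"][n_old:n_new] = prediction
            f["db_ids"][n_old:n_new] = ids
```

Calling write_batch once per batch leaves a single file whose rows follow the order in which batches arrived. One caveat with this design: if prediction runs in several processes, they should not all append to the same file at once with a standard HDF5 build, so per-rank files or a single writing process would likely be needed.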
