Add RETURNN compute perplexity job #563

Draft
wants to merge 2 commits into base: main
Conversation

christophmluscher (Contributor)

No description provided.

self,
returnn_config: ReturnnConfig,
returnn_model: Union[PtCheckpoint, Checkpoint],
eval_dataset: tk.Path,
Contributor:

Suggested change
eval_dataset: tk.Path,
eval_dataset: Union[tk.Path, Dict[str, Any]],

In principle any dataset is valid? Actually, does eval_datasets = {"eval": "/path/to/file"} constitute a correct dataset definition?

Contributor Author:

I have not tested whether any dataset is valid. I am assuming that any dataset with text should work, but this is meant to be used with LmDataset.

No, that is not a correct dataset definition. It needs to be something like: {"eval": {"class": "LmDataset", ...}}
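For illustration, a minimal sketch of such a mapping; the LmDataset parameter names below are assumptions based on typical RETURNN setups, not taken from this PR, and should be checked against the RETURNN documentation:

```python
# Hypothetical eval_datasets mapping with an LmDataset definition (illustrative only).
eval_datasets = {
    "eval": {
        "class": "LmDataset",
        "corpus_file": "/path/to/eval.txt.gz",          # plain-text corpus, one sentence per line
        "orth_symbols_map_file": "/path/to/vocab.txt",   # vocabulary / symbol map
        "word_based": True,
        "unknown_symbol": "<unk>",
    },
}
```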

# TODO verify paths
if isinstance(returnn_model, PtCheckpoint):
    model_path = returnn_model.path
    self.add_input(returnn_model.path)
Contributor:

I think it should already be an input to the Job, since we pass the PtCheckpoint/Checkpoint object to the Job.

Contributor Author:

Are you sure?
PtCheckpoint/Checkpoint are normal Python classes, which I think are not covered by extract_paths from sisyphus: https://github.com/rwth-i6/sisyphus/blob/master/sisyphus/tools.py#L74
Or am I missing something?

Contributor:

> Are you sure?

Never without actually testing ;)

But I am very confident, because:

  • in recognition we also just pass the checkpoint object and it works

> are not covered by extract_paths from sisyphus

  • extract_paths should arrive in the last else branch and then call get_object_state, which should then, via get_members_, descend into the dict of the Checkpoint object and return the underlying tk.Path object.
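To illustrate the traversal described above, here is a simplified stand-in (not the actual sisyphus code, just the idea of how the attributes of a Checkpoint/PtCheckpoint object get descended into):

```python
from sisyphus import tk

def collect_paths(obj, found=None):
    """Simplified sketch of what sisyphus' extract_paths/get_object_state achieve:
    recursively collect tk.Path objects, descending into containers and into the
    __dict__ of plain Python objects such as Checkpoint/PtCheckpoint."""
    found = set() if found is None else found
    if isinstance(obj, tk.Path):
        found.add(obj)
    elif isinstance(obj, (list, tuple, set)):
        for item in obj:
            collect_paths(item, found)
    elif isinstance(obj, dict):
        for value in obj.values():
            collect_paths(value, found)
    elif hasattr(obj, "__dict__"):
        # a Checkpoint/PtCheckpoint ends up here: its attributes
        # (e.g. the underlying checkpoint/index tk.Path) are inspected recursively
        for value in vars(obj).values():
            collect_paths(value, found)
    return found
```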

Contributor Author:

Ah yes, thanks for the explanation :)

returnn/training.py (outdated review thread, resolved)
shutil.move("returnn_log", self.out_returnn_log.get_path())

def gather(self):
    for data_key in self.out_perplexities.keys():
Contributor:

tk.Variable has no keys()

Contributor Author:

Thanks, I think that is left over from an earlier version...
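A possible direction for the fix discussed here, as a hedged sketch (attribute layout, file names, and the score-file format are assumptions, not the PR's actual code): keep one tk.Variable per evaluation dataset in a dict, so gather() iterates over dataset keys instead of calling .keys() on a single tk.Variable.

```python
from typing import Any, Dict
from sisyphus import Job, tk

class ComputePerplexityJobSketch(Job):
    """Illustrative sketch only, not the implementation from this PR."""

    def __init__(self, eval_datasets: Dict[str, Dict[str, Any]]):
        self.eval_datasets = eval_datasets
        # one output variable per dataset instead of a single tk.Variable
        self.out_perplexities = {key: self.output_var(f"ppl.{key}") for key in eval_datasets}

    def gather(self):
        for data_key, ppl_var in self.out_perplexities.items():
            # the per-dataset score-file name/format is an assumption for illustration
            with open(f"ppl.{data_key}.txt", "rt") as f:
                ppl_var.set(float(f.read().strip()))
```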

JackTemaki (Contributor) left a comment:

I have to say I do not like the concept of this job, because it does not really add any value on top of a ReturnnForwardJob. Also, currently we are not doing anything with the perplexities, so I wonder what the task of the Job is.

I would suggest just running a Forward Job and then passing the learning_rate files to an "ExtractPerplexities" job or so. Also, in the end we cannot guarantee anyway that people have the right settings in their config to calculate the perplexities correctly, so I feel a ComputePerplexityJob as it is here is a little misleading / fakes confidence about doing the right thing.

Another addition that you could make to an extract-PPL job is to pass the corpus and a BPE vocab, so you could directly compute word-level PPL from the BPE-level PPL.
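The BPE-to-word-level conversion only needs the token counts, since both perplexities normalize the same total log-probability. As a hedged sketch (not part of this PR):

```python
def word_level_ppl(bpe_ppl: float, num_bpe_tokens: int, num_words: int) -> float:
    """Convert subword-level perplexity to word-level perplexity.

    Both perplexities share the same total log-probability; only the
    normalization count differs, hence the exponent num_bpe_tokens / num_words.
    """
    return bpe_ppl ** (num_bpe_tokens / num_words)

# e.g. a BPE-level PPL of 40.0 with 1.2 subword tokens per word corresponds to
# a word-level PPL of 40.0 ** 1.2 ≈ 83.7
```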

michelwi (Contributor) commented Dec 9, 2024:

> it does not really add any value on top of a ReturnnForwardJob.

Maybe inherit from ReturnnForwardJob and add some checks of the returnn-config or learning-rate file parsing as a value add?
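As a rough sketch of the learning-rate-file parsing mentioned here (assuming the usual RETURNN learning_rates file format, i.e. a Python literal with EpochData entries; names and error keys are illustrative, not from this PR):

```python
def parse_returnn_learning_rate_file(path: str) -> dict:
    """Sketch: read a RETURNN learning-rate file, which is Python code of the form
    {1: EpochData(learningRate=..., error={"dev_score_...": ...}), ...}."""
    with open(path, "rt") as f:
        text = f.read()

    def EpochData(learningRate, error):
        # stand-in so the literal can be evaluated without RETURNN installed
        return {"learning_rate": learningRate, "error": error}

    return eval(text, {"EpochData": EpochData, "nan": float("nan"), "inf": float("inf")})
```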

Co-authored-by: michelwi <[email protected]>
christophmluscher (Contributor Author):

Thanks for the feedback. I think @JackTemaki makes a good point: automatically adding the word-level PPL for subword-level models would be a useful addition.

In summary:

  • Take output from ReturnnForwardJob
  • Word-level PPL for subword-level models

What do you think of @michelwi's suggestion:

> Maybe inherit from ReturnnForwardJob and add some checks of the returnn-config or learning-rate file parsing as a value add?
Inheriting would mean one less job, but depending on the pipeline you could end up rerunning the forward part several times. So I would go with a separate job instead. What are your thoughts on this?

michelwi (Contributor):

> Inheriting would mean one less job, but depending on the pipeline you could end up rerunning the forward part several times. So I would go with a separate job instead. What are your thoughts on this?

Forwarding should not be done more than needed. I don't currently see how we would do unnecessary forwardings (i.e., I think each "compute PPL from LR file" would go with exactly one forwarding), but I have no hard feelings against separate jobs if that makes more sense.

christophmluscher (Contributor Author):

I was thinking about a situation where you compute the PPL of a model on a dataset and additionally do some other step, for example some sort of prior computation... but maybe that is a bit far-fetched 🤔
