Refactor Prescriptors and project to handle new prescription #89

danyoungday · 2024-06-04T17:21:11Z

Refactored the Prescriptors to be more user-oriented instead of being a convenient wrapper around our ESP-like training. Reran evolution and used the new prescriptors in the app and experiments. Removed references to ESP.

New prescriptor hierarchy:

Prescriptor: Abstract class with abstract predict method - needs to be moved to SDK

LandUsePrescriptor: Implementation of Prescriptor specific to our project. Wraps Candidate from evolution

Candidate: PyTorch NN trained with NSGA-II wrapped by LandUsePrescriptor

PrescriptorManager: Takes in many Prescriptors and a single Predictor and allows us to compare the Prescriptors
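The hierarchy above can be pictured with a minimal sketch. This is not the project's code: plain dicts stand in for DataFrames, `Candidate` is a toy scaling function rather than a PyTorch network, the predictor is assumed to be a simple callable, and the abstract method is named `prescribe` here (the description calls it `predict`, while a later comment refers to `prescribe`).

```python
from abc import ABC, abstractmethod


class Prescriptor(ABC):
    """Abstract interface: map a context to prescribed actions."""

    @abstractmethod
    def prescribe(self, context: dict) -> dict:
        ...


class Candidate:
    """Toy stand-in for the evolved network (assumption: a fixed scaling)."""

    def __init__(self, scale: float):
        self.scale = scale

    def forward(self, features: list) -> list:
        return [self.scale * x for x in features]


class LandUsePrescriptor(Prescriptor):
    """Project-specific Prescriptor that wraps a Candidate."""

    def __init__(self, candidate: Candidate):
        self.candidate = candidate

    def prescribe(self, context: dict) -> dict:
        # Fix the column order so the candidate always sees the same layout.
        names = sorted(context)
        outputs = self.candidate.forward([context[n] for n in names])
        return dict(zip(names, outputs))


class PrescriptorManager:
    """Holds many Prescriptors and one predictor so they can be compared."""

    def __init__(self, prescriptors: dict, predictor):
        self.prescriptors = prescriptors
        self.predictor = predictor  # assumed here to be a plain callable

    def prescribe(self, cand_id: str, context: dict) -> dict:
        return self.prescriptors[cand_id].prescribe(context)

    def predict_metrics(self, actions: dict) -> float:
        return self.predictor(actions)


manager = PrescriptorManager(
    {"a": LandUsePrescriptor(Candidate(0.5)), "b": LandUsePrescriptor(Candidate(2.0))},
    predictor=lambda actions: sum(actions.values()),
)
actions_a = manager.prescribe("a", {"crop": 0.4, "forest": 0.6})
```

Because the manager only depends on the `Prescriptor` interface, any other implementation (for example a heuristic) could be registered in the same dict.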


danyoungday · 2024-06-04T17:22:04Z

use_cases/eluc/Dockerfile

@@ -22,6 +22,9 @@ RUN pip install --no-cache-dir --upgrade pip && \
 # Copy source files over
 COPY . .

+# Python setup script - downloads data and processes it
+RUN python -m app.process_data


Changed our Dockerfile to download the data from HuggingFace before the app starts, so that we don't have to push the CSV to the git repo.

ofrancon: How fast (or slow) is it?

danyoungday: It takes about a minute. We could also upload the preprocessed app dataset to HuggingFace, which would remove most of this time.

danyoungday · 2024-06-04T17:22:44Z

use_cases/eluc/app/app.py

-}
-prescriptor = TorchPrescriptor(None, encoder, None, 1, candidate_params)
+# Load prescriptors
+prescriptor_manager = utils.load_prescriptors()


The app now uses a single PrescriptorManager object that holds all the individual LandUsePrescriptors. This will also let us add Heuristics in the future, since they implement Prescriptor too.

danyoungday · 2024-06-04T17:23:08Z

use_cases/eluc/app/app.py

@@ -544,7 +533,7 @@ def compute_land_change(sliders, year, lat, lon, locked):
        warnings.append(html.P("WARNING: Negative values detected. Please lower the value of a locked slider."))

    # Compute total change
-    change = prescriptor.compute_percent_changed(context_actions_df)
+    change = prescriptor_manager.compute_percent_changed(context_actions_df)


compute_percent_changed is now part of PrescriptorManager

danyoungday · 2024-06-04T17:23:33Z

use_cases/eluc/app/data/pareto.csv

For now we hard-code which prescriptors to use and where they sit on the Pareto front.

ofrancon: That's fine for now.
Just wondering: could that be a file we push to HF and that the app downloads when it starts? That way, editing this file and restarting the app would be enough to update the prescriptors.

danyoungday: Yes, this can go in the special repository that handles the app on the HuggingFace space.

danyoungday · 2024-06-04T17:24:04Z

use_cases/eluc/app/utils.py

+
+    prescriptor_manager = PrescriptorManager(prescriptors, None)
+
+    return prescriptor_manager


Reads the hard-coded Pareto file (pareto.csv) and downloads the corresponding models from HuggingFace.
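A rough sketch of that loading step, under stated assumptions: the `id` column name, the sample rows, and the injected `load_model` callable are all hypothetical, and the real `utils.load_prescriptors` takes no arguments and wraps the result in a PrescriptorManager.

```python
import csv
import io
from typing import Callable


def load_prescriptors(pareto_csv: str, load_model: Callable[[str], object]) -> dict:
    """Build an {id: prescriptor} dict from a hard-coded Pareto listing.

    `pareto_csv` is CSV text with a hypothetical `id` column. `load_model`
    is whatever fetches and deserializes one model, for example a download
    from HuggingFace; it is injected so the sketch runs offline.
    """
    reader = csv.DictReader(io.StringIO(pareto_csv))
    return {row["id"]: load_model(row["id"]) for row in reader}


# Made-up rows, for illustration only.
PARETO = "id,eluc,change\n1_1,-20.0,0.30\n52_7,-5.0,0.05\n"
prescriptors = load_prescriptors(PARETO, load_model=lambda cand_id: f"model:{cand_id}")
```

Keeping the listing as data rather than code is what would make it possible to move the file elsewhere later without touching the loader.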

danyoungday · 2024-06-04T17:25:39Z

use_cases/eluc/experiments/prescriptor_experiments.ipynb

Reran experiments with new prescriptors and removed references to ESP

danyoungday · 2024-06-04T17:27:43Z

use_cases/eluc/prescriptors/nsga2/land_use_prescriptor.py

New Prescriptor logic. Handles abstract prescribe method as well as save and load. Additionally has a special torch_prescribe method to be used during evolution so that we can avoid the overhead of converting the entire evaluation dataset from pandas to PyTorch.

danyoungday · 2024-06-04T17:29:36Z

use_cases/eluc/prescriptors/nsga2/trainer.py

-            eluc_df, change_df = self.prescriptor.predict_metrics(context_actions_df)
+            prescriptor = LandUsePrescriptor(candidate, self.encoder, self.batch_size)
+            context_actions_df = prescriptor.torch_prescribe(self.context_df, self.encoded_context_dl)
+            eluc_df, change_df = prescriptor_manager.predict_metrics(context_actions_df)


This is where we use the new prescriptor logic during evolution. We create a LandUsePrescriptor wrapped around whichever Candidate we are evaluating. We also create a dummy PrescriptorManager that wraps our Predictor. Then we can call torch_prescribe on the evaluation dataset (which is in the PyTorch format so we don't have to convert) and then predict our metrics using the PrescriptorManager.

ofrancon: That comment would be useful in the code itself.

danyoungday · 2024-06-04T17:31:49Z

use_cases/eluc/prescriptors/prescriptor.py

+
+            if not Path(local_dir).exists() or not Path(local_dir).is_dir():
+                hf_args["local_dir"] = local_dir
+                snapshot_download(repo_id=path_or_url, **hf_args)


This needs to be rewritten according to our Persistor logic. Currently it's hard copy/pasted from Predictor

ofrancon: Right. We also need to dissociate the prescriptor interface from the way the models are persisted and loaded. For another PR.
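One possible shape for that separation, offered only as a sketch: the `Persistor` name comes from the comment above, but every method, class, and file name below is an assumption rather than the project's actual Persistor API. The download step is an injected callable standing in for a call such as `snapshot_download`.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable


class Persistor(ABC):
    """Knows how to store and retrieve state; knows nothing about prescribing."""

    @abstractmethod
    def save(self, state: dict, path: Path) -> None:
        ...

    @abstractmethod
    def load(self, path: Path) -> dict:
        ...


class JsonFilePersistor(Persistor):
    """Stores state as a JSON file inside a directory."""

    def save(self, state: dict, path: Path) -> None:
        path.mkdir(parents=True, exist_ok=True)
        (path / "state.json").write_text(json.dumps(state))

    def load(self, path: Path) -> dict:
        return json.loads((path / "state.json").read_text())


class DownloadOnMissPersistor(Persistor):
    """Fetches into the local directory only when it is missing."""

    def __init__(self, inner: Persistor, fetch: Callable[[Path], None]):
        self.inner = inner
        self.fetch = fetch  # hypothetical hook, e.g. a remote download

    def save(self, state: dict, path: Path) -> None:
        self.inner.save(state, path)

    def load(self, path: Path) -> dict:
        # Path.is_dir() is already False for a missing path.
        if not path.is_dir():
            self.fetch(path)
        return self.inner.load(path)


fetch_calls = []


def fake_fetch(path: Path) -> None:
    """Pretend download: writes a tiny state file and records the call."""
    fetch_calls.append(path)
    JsonFilePersistor().save({"weights": [1, 2]}, path)


with tempfile.TemporaryDirectory() as tmp:
    persistor = DownloadOnMissPersistor(JsonFilePersistor(), fake_fetch)
    first = persistor.load(Path(tmp) / "model")   # triggers the fetch
    second = persistor.load(Path(tmp) / "model")  # served from the local copy
```

With this split, a prescriptor would only need to turn itself into a state dict and back, and the choice of local disk versus remote hub would live entirely in the persistor.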

danyoungday · 2024-06-04T17:32:22Z

use_cases/eluc/prescriptors/prescriptor_manager.py

Wraps a dict of Prescriptor objects and a single Predictor so we can compare them and use them in the app.

ofrancon

lgtm, good progress!

ofrancon · 2024-06-05T01:25:32Z

use_cases/eluc/Dockerfile

How fast (or slow) is it?

ofrancon · 2024-06-05T01:31:00Z

use_cases/eluc/app/data/pareto.csv

That's fine for now.
Just wondering: could that be a file we push to HF and that the app downloads when it starts? That way, editing this file and restarting the app would be enough to update the prescriptors.

ofrancon · 2024-06-05T01:41:23Z

use_cases/eluc/prescriptors/nsga2/trainer.py

That comment would be useful in the code itself

ofrancon · 2024-06-05T01:44:18Z

use_cases/eluc/prescriptors/prescriptor.py

Right. We also need to dissociate the prescriptor interface from the way the models are persisted and loaded. For another PR.

ofrancon · 2024-06-05T01:46:22Z

use_cases/eluc/prescriptors/prescriptor_manager.py

+
+        return eluc_df, change_df
+
+    # TODO: Move this to its own predictor


Right, compute_percent_changed could be a Predictor. Means the constructor of PrescriptorManager should probably take a list of Predictor objects instead of a single one. And to compute the metrics we loop over the predictors. For another PR.
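The suggestion above could look roughly like the following. Everything here is illustrative: the `Predictor` interface, the `name` attribute, the before/after dict layout, and in particular the percent-change formula are assumptions, not the project's actual computation.

```python
from abc import ABC, abstractmethod


class Predictor(ABC):
    """Assumed minimal interface: one named metric per predictor."""

    name: str

    @abstractmethod
    def predict(self, context_actions: dict) -> float:
        ...


class PercentChangePredictor(Predictor):
    """Illustrative formula: share of land moved into other uses."""

    name = "change"

    def predict(self, context_actions: dict) -> float:
        before, after = context_actions["before"], context_actions["after"]
        moved = sum(max(after[k] - before[k], 0.0) for k in before)
        total = sum(before.values())
        return moved / total if total else 0.0


class FixedPredictor(Predictor):
    """Stand-in for a trained model: always returns the same value."""

    def __init__(self, name: str, value: float):
        self.name = name
        self.value = value

    def predict(self, context_actions: dict) -> float:
        return self.value


class PrescriptorManager:
    """Takes a list of predictors and loops over them to compute metrics."""

    def __init__(self, prescriptors: dict, predictors: list):
        self.prescriptors = prescriptors
        self.predictors = predictors

    def predict_metrics(self, context_actions: dict) -> dict:
        return {p.name: p.predict(context_actions) for p in self.predictors}


manager = PrescriptorManager({}, [PercentChangePredictor(), FixedPredictor("eluc", -3.0)])
metrics = manager.predict_metrics(
    {"before": {"crop": 0.5, "forest": 0.5}, "after": {"crop": 0.25, "forest": 0.75}}
)
```

Adding a new metric then means appending another predictor to the list, with no change to the manager itself.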

danyoungday · 2024-06-07T23:02:12Z

use_cases/eluc/data/eluc_data.py

Refactored structure of ELUCData to be loaded from classmethods rather than being 2 different classes with 2 different loading functions.

danyoungday · 2024-06-07T23:02:48Z

use_cases/eluc/data/eluc_encoder.py

Moved ELUCEncoder to its own file. Made constructor load fields. Classmethods to load from pandas or json.
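The pattern described in these two comments (a constructor that takes already-loaded fields, plus classmethods as alternate entry points) might be sketched as below. This is a hypothetical stand-in, not ELUCEncoder itself: the min-max scaling, the method names, and the use of plain row dicts in place of a pandas DataFrame are all assumptions.

```python
import json


class MinMaxEncoder:
    """Hypothetical encoder: scales each named field to the [0, 1] range."""

    def __init__(self, fields: dict):
        # fields maps column name -> {"min": float, "max": float}
        self.fields = fields

    @classmethod
    def from_rows(cls, rows: list) -> "MinMaxEncoder":
        """Fit from a list of {column: value} records (stand-in for a DataFrame)."""
        columns = rows[0].keys()
        fields = {
            c: {"min": min(r[c] for r in rows), "max": max(r[c] for r in rows)}
            for c in columns
        }
        return cls(fields)

    @classmethod
    def from_json(cls, text: str) -> "MinMaxEncoder":
        """Rebuild from previously saved fields."""
        return cls(json.loads(text))

    def to_json(self) -> str:
        return json.dumps(self.fields)

    def encode(self, row: dict) -> dict:
        out = {}
        for column, value in row.items():
            lo, hi = self.fields[column]["min"], self.fields[column]["max"]
            # Constant columns have no range, so map them to 0.0.
            out[column] = (value - lo) / (hi - lo) if hi > lo else 0.0
        return out


encoder = MinMaxEncoder.from_rows([{"lat": 0.0, "crop": 2.0}, {"lat": 10.0, "crop": 4.0}])
restored = MinMaxEncoder.from_json(encoder.to_json())
encoded = restored.encode({"lat": 5.0, "crop": 3.0})
```

Both classmethods end in the same constructor call, so there is a single place where the object's invariants are established regardless of where the fields came from.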

danyoungday added 13 commits May 30, 2024 16:58

Refactored prescriptors to be more user-oriented vs. train oriented. Still have to test them

1f1001a

Added saving loading and frompretrained to prescriptor

9db5c04

Updated heuristics

a184505

Removed references to ESP

33059f2

Implemented saving and loading for heuristics

7acd0d2

ignore esp files

1e5c73b

Updated experiments to work with new prescriptor architecture

8bc07d0

reran training with fixed distance calculation then reran experiments

b917f17

Modified app to use new prescriptors and performed minor refactoring of prescriptors

1cb7620

Merge branch 'refactor-app' into hf-prescriptors

d0596b3

renamed indices so that we don't have duplicate columns

f5ae448

Fixed test to download data and use it

babf912

Linted to reach threshold

926ffe0

danyoungday requested a review from ofrancon June 4, 2024 17:21

danyoungday self-assigned this Jun 4, 2024

danyoungday commented Jun 4, 2024

ofrancon approved these changes Jun 5, 2024

Base automatically changed from refactor-app to main June 6, 2024 22:45

danyoungday added 3 commits June 6, 2024 15:47

Merge branch 'main' into hf-prescriptors

7b53ef1

Added some documentation to show where prescriptor logic is used in training

3bdde8c

Modified ELUCData by consolidating yucky classes into single clean class

0f58c24

danyoungday added 3 commits June 7, 2024 14:27

Updated documentation for new data file

04c26e6

Refactored data and encoder, updated all things to work with new data

2f34f9d

Linted files to reach 9.7

8a96046

danyoungday commented Jun 7, 2024

danyoungday merged commit 4aa17b2 into main Jun 7, 2024
1 check passed

danyoungday deleted the hf-prescriptors branch June 7, 2024 23:04
