Add pydra tasks, workflow and update CLI #57

maestroque · 2024-07-23T10:46:42Z

Closes #

Proposed Changes

Create two pydra tasks, one to compute metrics and one to export metrics
Integrate them in a single pydra workflow, utilizing pydra tasks from physutils
Inspiration is taken from the current CLI implementation, aiming to replace and enhance it

Change Type

Checklist before review

I added everything I wanted to add to this PR.
[Code or tests only] I wrote/updated the necessary docstrings.
[Code or tests only] I ran and passed tests locally.
[Documentation only] I built the docs locally.
My contribution is harmonious with the rest of the code: I'm not introducing repetitions.
My code respects the adopted style, especially linting conventions.
The title of this PR is explanatory on its own, enough to be understood as part of a changelog.
I added or indicated the right labels.

I added information regarding the timeline of completion for this PR.

Please, comment on my PR while it's a draft and give me feedback on the development!

maestroque · 2024-07-23T10:47:10Z

Note that the PR is currently stemming from integrate-physutils

maestroque · 2024-07-23T12:28:23Z

I would also appreciate some guidance on retroicor, if possible. In the current workflow implementation, the metrics are exported like this:

for metric in metrics:
    if metric == "retroicor_card":
        args = select_input_args(retroicor, kwargs)
        args["card"] = True
        retroicor_regrs = retroicor(physio, **args)
        for vslice in range(len(args["slice_timings"])):
            for harm in range(args["n_harm"]):
                key = f"rcor-card_s-{vslice}_hrm-{harm}"
                regr[f"{key}_cos"] = retroicor_regrs[vslice][:, harm * 2]
                regr[f"{key}_sin"] = retroicor_regrs[vslice][:, harm * 2 + 1]
    elif metric == "retroicor_resp":
        # etc. etc.

Shall I keep it this way or is more research needed? I am not familiar with how retroicor is used.
@m-miedema @me-pic @smoia
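
For reviewers less familiar with the export loop above, here is a minimal sketch of the naming convention it produces. It only reproduces the key/column bookkeeping for the cardiac case; retroicor_column_map is a hypothetical helper, not part of phys2denoise, and it does not call retroicor itself.

```python
def retroicor_column_map(n_slices, n_harm):
    """Map each exported cardiac regressor name to (slice index, column index).

    Hypothetical helper: mirrors the loop quoted above, where every harmonic
    contributes one cosine column (even index) and one sine column (odd index).
    """
    columns = {}
    for vslice in range(n_slices):
        for harm in range(n_harm):
            key = f"rcor-card_s-{vslice}_hrm-{harm}"
            columns[f"{key}_cos"] = (vslice, harm * 2)
            columns[f"{key}_sin"] = (vslice, harm * 2 + 1)
    return columns


layout = retroicor_column_map(n_slices=2, n_harm=2)
for name, (vslice, col) in layout.items():
    print(f"{name}: slice {vslice}, column {col}")
```

So each slice yields 2 * n_harm regressors, and the total number of exported columns grows with the number of slice timings.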

m-miedema · 2024-07-23T16:02:57Z

Keep it this way for now -- the handling of these derivatives is something we'll need to improve but we can circle back after the workflow is in place!

codecov · 2024-08-25T14:00:32Z

Codecov Report

Attention: Patch coverage is 0% with 48 lines in your changes missing coverage. Please review.

Project coverage is 49.84%. Comparing base (f074cad) to head (d150e76).
Report is 2 commits behind head on master.

Files	Patch %	Lines
phys2denoise/tasks.py	0.00%	48 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master      #57      +/-   ##
==========================================
- Coverage   53.85%   49.84%   -4.02%     
==========================================
  Files           8        9       +1     
  Lines         596      644      +48     
==========================================
  Hits          321      321              
- Misses        275      323      +48

Files	Coverage Δ
phys2denoise/workflow.py	`0.00% <ø> (ø)`
phys2denoise/tasks.py	`0.00% <0.00%> (ø)`

maestroque · 2024-08-25T16:17:30Z

I have run into a problem when using loguru inside pydra tasks. For example, take this task:

@pydra.mark.task
def compute_metrics(phys, metrics):
    if isinstance(metrics, list) or isinstance(metrics, str):
        for metric in metrics:
            if metric not in _available_metrics:
                # print(f"Metric {metric} not available. Skipping")
                logger.warning(f"Metric {metric} not available. Skipping")
                continue

            args = select_input_args(metric, {})
            phys = globals()[metric](phys, **args)
            logger.info(f"Computed {metric}")
    return phys

With the loguru calls included, defining the task as in task2 = compute_metrics(phys=fake_physio, metrics=["respiratory_variance"]) raises the following error:

>       task2 = compute_metrics(phys=fake_physio, metrics=["respiratory_variance"])

phys2denoise/tests/test_tasks.py:15:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pydra-p2d/lib/python3.10/site-packages/pydra/mark/functions.py:47: in decorate
    return FunctionTask(func=func, **kwargs)
pydra-p2d/lib/python3.10/site-packages/pydra/engine/task.py:146: in __init__
    fields.append(("_func", attr.ib(default=cp.dumps(func), type=bytes)))
pydra-p2d/lib/python3.10/site-packages/cloudpickle/cloudpickle.py:1479: in dumps
    cp.dump(obj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <cloudpickle.cloudpickle.Pickler object at 0x7fe5ea17e140>, obj = <function compute_metrics at 0x7fe5e9fb3910>

    def dump(self, obj):
        try:
>           return super().dump(obj)
E           TypeError: cannot pickle 'EncodedFile' object

pydra-p2d/lib/python3.10/site-packages/cloudpickle/cloudpickle.py:1245: TypeError

Without those calls, the error is not raised.
This might be related to how pydra serializes tasks for parallel execution. So we could either avoid logging inside tasks altogether (which, to be honest, is not a viable workaround in my opinion), or dig deeper. I have been looking for online resources about pydra + loguru, but with no luck so far.

The good news is that this seems to be specific to loguru, because stdlib logging calls seem to work. I'm planning to work with those for now and see how we move on.
@smoia @me-pic @m-miedema
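
A minimal sketch of the stdlib logging workaround, under a few assumptions: AVAILABLE_METRICS and the toy respiratory_variance below are hypothetical stand-ins for the real registry and metric, and the pydra decorator is left out so the snippet runs on its own. One thing that may help explain the difference is that stdlib loggers have been picklable by name since Python 3.7; whether that is the actual reason the cloudpickle error goes away has not been confirmed here.

```python
import logging
import pickle


def respiratory_variance(phys):
    # Hypothetical stand-in for the real metric: just records that it ran.
    return phys + ["respiratory_variance"]


# Hypothetical registry, standing in for _available_metrics.
AVAILABLE_METRICS = {"respiratory_variance": respiratory_variance}


def compute_metrics_plain(phys, metrics):
    # Fetch the stdlib logger by name inside the function body.
    log = logging.getLogger("phys2denoise.tasks")
    # Wrap a single metric name in a list so it is not iterated per character.
    if isinstance(metrics, str):
        metrics = [metrics]
    for metric in metrics:
        func = AVAILABLE_METRICS.get(metric)
        if func is None:
            log.warning("Metric %s not available. Skipping", metric)
            continue
        phys = func(phys)
        log.info("Computed %s", metric)
    return phys


result = compute_metrics_plain([], ["respiratory_variance", "not_a_metric"])
print(result)

# A stdlib logger round-trips through pickle as a reference to its name.
log = logging.getLogger("phys2denoise.tasks")
restored = pickle.loads(pickle.dumps(log))
print(restored is log)
```

Note that the sketch wraps a plain string in a list, whereas the task quoted above would loop over the characters of a string passed as metrics.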

m-miedema · 2024-08-29T20:29:19Z

I currently need to install nest_asyncio manually for testing, but after that the tests pass, with the exception of the caplog logging assertion (this is also true for me in physiopy/physutils#7). I'm still taking a closer look at the rest and will have a full review for you tomorrow!

phys2denoise/workflow.py

smoia · 2024-08-30T08:17:51Z

phys2denoise/tests/data/ECG.csv

I can help with setting things up, but it would be FAR better if we upload all the data in our OSF repository and remove it from here.
Do you need help with it?

To be honest, I just used the files from peakdet, because they are also uploaded in the tests there. For the tests here I mostly use fake_physio.phys, which is created on the spot using a util function. I can delete ECG.csv if you want.

maestroque · 2024-08-30T12:47:09Z

phys2denoise/cli/run.py

Move the main workflow into workflow.py, as that's the entry point of the CLI. This file should only contain the parser.

Done @smoia

m-miedema · 2024-08-30T20:55:27Z

@m-miedema what failed tests are you getting? Everything should pass now

I am getting the attached failure:

Let me know if there's something you can point me to that's wrong on my end!

maestroque · 2024-09-02T21:19:01Z

@m-miedema It seems that we are on the same version of the code, and it passes locally for me, so I cannot recreate it. Also, we cannot see the CI results until physiopy/physutils#7 is merged and released.

Could you try to add the following to test/__init__.py to see if this fixes it?

from loguru import logger
import sys

logger.add(sys.stderr)

me-pic · 2024-09-03T12:56:05Z

@m-miedema can't replicate the error you are getting either

me-pic · 2024-09-03T12:58:17Z

@maestroque Not sure if this should even be addressed in this PR, but in the chest_belt.py script we are using np.math, which I believe is deprecated and might eventually cause some issues...

me-pic

@maestroque Thanks for your work on this PR! Good job overall 🎉 🎉

Would it be possible to add some tests to cover the CLI? If you need some examples, you could refer to the tests in the giga_connectome package.

maestroque · 2024-09-03T13:15:51Z

Yes, you are right! There is an open issue about NumPy v2 compatibility that covers this: #62
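
For context, a minimal sketch of the usual replacement. As far as I know, np.math was only an alias of the standard-library math module (deprecated in NumPy 1.25 and dropped in NumPy 2.0), so calls can go through math directly. Which math functions chest_belt.py actually uses is not shown in this thread; factorial below is only an illustration.

```python
import math

# Before (deprecated alias): np.math.factorial(5)
# After: call the standard-library module directly.
value = math.factorial(5)
print(value)
```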

maestroque · 2024-09-03T13:36:34Z

Sure, on it
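
A rough sketch of what a parser-level CLI test could look like. build_parser here is a hypothetical stand-in, not the real phys2denoise parser: only a few of the short flags used in this thread are declared, and their types, dest names and nargs are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the real parser (flags and types assumed)."""
    parser = argparse.ArgumentParser(prog="phys2denoise")
    parser.add_argument("-in", dest="filename", required=True)
    parser.add_argument("-e", dest="export", nargs="+", default=[])
    parser.add_argument("-nscans", dest="nscans", type=int)
    parser.add_argument("-tr", dest="tr", type=float)
    parser.add_argument("--debug", action="store_true")
    return parser


def test_parser_reads_cli_arguments():
    # Parse an argument list directly instead of spawning a subprocess.
    args = build_parser().parse_args(
        ["-in", "resp.phys", "-e", "rv", "-nscans", "400", "-tr", "1.5", "--debug"]
    )
    assert args.filename == "resp.phys"
    assert args.export == ["rv"]
    assert args.nscans == 400
    assert args.tr == 1.5
    assert args.debug is True
```

Testing the parser on an argument list keeps the test fast; an end-to-end test that runs the workflow on a small fake physio file would complement it.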

me-pic · 2024-09-03T17:41:50Z

phys2denoise/workflow.py

+# def phys2denoise(
+#     filename,
+#     outdir=".",
+#     metrics=[
+#         crf,
+#         respiratory_pattern_variability,
+#         respiratory_variance,
+#         respiratory_variance_time,
+#         rrf,
+#         "retroicor_card",
+#         "retroicor_resp",
+#     ],
+#     debug=False,
+#     quiet=False,
+#     **kwargs,
+# ):


Can we delete the commented-out code?

Oops, yeah I thought I did

m-miedema · 2024-09-03T22:09:07Z

This does not fix it - I'm not sure what's going on but since you and @me-pic are not able to replicate it, I continued on to test the rest of the CLI (see my next comment).

m-miedema · 2024-09-03T22:13:23Z

I'm having trouble using the CLI to calculate metrics when using a .phys object as my input. For example, I thought I'd run a quick test using the physio objects generated in the OHBM tutorial here but the CLI returns e.g. "Metric rv not computed. Skipping" without more useful information (even when I run in --debug mode - which as a side-note, doesn't actually change the output for me). Has anyone else successfully output metrics in this case?

maestroque · 2024-09-03T22:34:49Z

@m-miedema I need you to provide the precise logs you get in order to understand the problem. I haven't had this issue personally

m-miedema · 2024-09-04T13:35:55Z

Certainly! If you've been able to get the CLI to run on a Physio object with peaks/troughs, could you share the object and the call with me and I can try it? So far I've tried a few different ways, but in general this is the type of call and output:

phys2denoise -in '.\sub-007_ses-05_task-rest_run-01_resp_peaks.phys' -e 'rv' -nscans 400 -tr 1.5 -sr 40 -lags 0 -win 6 --debug -out .

2024-09-04 09:32:47.905 | INFO     | phys2denoise.workflow:phys2denoise:208 - Running phys2denoise version: 0+untagged.276.gabd4c93.dirty
2024-09-04 09:32:47.914 | DEBUG    | phys2denoise.workflow:phys2denoise:233 - Metrics: []
2024-09-04 09:32:47.914 | DEBUG    | phys2denoise.workflow:phys2denoise:233 - Metrics: []
2024-09-04 09:32:48.866 | DEBUG    | physutils.tasks:wrapped_func:27 - Creating pydra task for transform_to_physio
2024-09-04 09:32:48.866 | DEBUG    | physutils.tasks:wrapped_func:27 - Creating pydra task for transform_to_physio
2024-09-04 09:32:51.227 | DEBUG    | physutils.io:load_physio:185 - Instantiating Physio object from a file
2024-09-04 09:32:51.227 | DEBUG    | physutils.physio:__init__:293 - Initializing new Physio object
Metric rv not computed. Skipping

m-miedema · 2024-09-04T14:35:40Z

One thing I think we should strongly consider as a future direction for the CLI is to set up a heuristic file with more metric-specific parameters. For example, here we can calculate different metrics, but we cannot specify a different window size for each of them in the same call. I'm putting this comment here, along with a new issue, so as not to lose track of it. If others think this is a useful idea, I can follow up in the future :)
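
To make the idea concrete, here is one possible shape for such a heuristic, sketched as a plain dictionary. The layout, the params_for helper and the parameter values are all hypothetical; nothing like this exists in the CLI yet.

```python
# Hypothetical heuristic: shared defaults plus per-metric overrides.
heuristic = {
    "defaults": {"window": 6, "lags": [0]},
    "metrics": {
        "respiratory_variance": {"window": 6},
        "respiratory_variance_time": {"window": 10},
    },
}


def params_for(metric, heuristic):
    """Merge the shared defaults with the overrides for one metric."""
    params = dict(heuristic.get("defaults", {}))
    params.update(heuristic.get("metrics", {}).get(metric, {}))
    return params


print(params_for("respiratory_variance_time", heuristic))
print(params_for("crf", heuristic))
```

A metric without an entry simply falls back to the defaults, so a single call could use a different window per metric.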

m-miedema · 2024-09-04T14:37:25Z

@maestroque I think it would be very helpful to make the "Metric rv not computed. Skipping" message slightly more verbose so that the user knows it is stemming from the export argument, rather than the computational argument. As it stands it's quite confusing! E.g. "Metric X not computed, skipping the export of metric X." or even better to throw a warning when a metric is provided as an export argument but not a computational argument.
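
A minimal sketch of the kind of check I have in mind. check_export_request is a hypothetical helper, and the message wording is only a suggestion.

```python
import warnings


def check_export_request(computed, export):
    """Return the metrics that can be exported, warning about the rest.

    Hypothetical helper: `computed` holds the metrics that were calculated,
    `export` holds the metrics requested through the export argument.
    """
    exportable = []
    for metric in export:
        if metric in computed:
            exportable.append(metric)
        else:
            warnings.warn(
                f"Metric {metric} was requested for export but was not computed; "
                f"skipping the export of {metric}.",
                stacklevel=2,
            )
    return exportable


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    exportable = check_export_request(computed=["rvt"], export=["rv", "rvt"])

print(exportable)
print([str(w.message) for w in caught])
```

With a message like this, the call above would have pointed straight at the export argument instead of suggesting that the computation itself failed.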

m-miedema · 2024-09-04T14:53:31Z

I have also opened a new issue to address this: I'm finding that the number of time points in the exported resampled metric files doesn't match the -nscans argument in the CLI. I think users would expect them to match, so it's something we should dig into another time.

m-miedema

I'm looking forward to seeing more documentation and to resolving some of the logging integration with pydra (I will open an issue if I'm still having logging-related failures in physutils and phys2denoise local testing following the merge). Please be sure to provide an example of running the CLI and the expected outputs in the documentation, including logs. Other than my minor point about the calculated vs. exported metric message, I won't suggest any other changes to address at this point. Thanks for all your hard work @maestroque !

maestroque · 2024-09-21T09:38:57Z

Cleaned up, this should be ready to merge once physutils is

m-miedema · 2024-09-26T18:22:00Z

@maestroque thanks for updating this one!

me-pic · 2024-10-26T17:50:45Z

phys2denoise/tests/test_tasks_integration.py

+    wf.set_output([("result", wf.compute_metrics.lzout.out)])
+
+    with Submitter(plugin="cf") as sub:
+        sub(wf)


The test is failing here. I'm getting the following error:
File "/home/user/Documents/physio/phys2denoise/phys2denoise/metrics/chest_belt.py", line 288, in respiratory_variance
    data = physio.check_physio(data, ensure_fs=True, copy=True)
File "/home/user/Documents/physio/phys2denoise/env/lib/python3.9/site-packages/physutils/physio.py", line 149, in check_physio
    if ensure_fs and np.isnan(data.fs):
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I've tried to investigate a bit more, and it turns out that the value of data.fs when it crashes is NOTHING:

(Pdb) data
Physio(size=18750, fs=_Nothing.NOTHING)
(Pdb) data.fs
NOTHING
(Pdb) type(data.fs)
<enum '_Nothing'>

I remember this issue, however afaik it was fixed. Can you try replacing the isnan check with pandas.isna()?

Thank you @maestroque for your comment! pandas.isna() is indeed able to evaluate the NOTHING value rather than just throwing an error. However, pandas.isna(NOTHING) returns False, so the value error is not raised. That causes other problems afterward, for example when instantiating the Physio object:

File "/home/user/Documents/physio/phys2denoise/env/lib/python3.9/site-packages/physutils/physio.py", line 305, in __init__
    self._fs = np.float64(fs)
TypeError: float() argument must be a string or a number, not '_Nothing'

Follow-up question: are we expecting the value of data.fs to be NOTHING? Just wondering if the problem might come from something earlier, e.g. before Workflow or generate_physio is called.

The error seems to come from generate_physio in physutils. Specifically, when that function calls load_physio, it passes fs=fs, which is unnecessary and causes the issue. I will open an issue + PR in physutils.

All tests are passing with the changes made in physutils PR #11
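
For the record, a small sketch of a defensive check that would treat a sentinel such as NOTHING as a missing sampling rate. This is not the fix adopted in physutils (that one stops passing fs=fs to load_physio); _Nothing and fs_is_missing below are hypothetical stand-ins written with the standard library only.

```python
import math
from numbers import Real


class _Nothing:
    """Stand-in for the NOTHING sentinel seen in the debugger output."""

    def __repr__(self):
        return "NOTHING"


NOTHING = _Nothing()


def fs_is_missing(fs):
    """Return True when fs is not a usable sampling rate."""
    # Anything that is not a real number (None, sentinels, strings) is missing.
    if isinstance(fs, bool) or not isinstance(fs, Real):
        return True
    return math.isnan(fs)


for candidate in (NOTHING, None, float("nan"), 40.0):
    print(repr(candidate), fs_is_missing(candidate))
```

Unlike np.isnan, this does not raise on non-numeric input, and unlike pandas.isna it reports the sentinel as missing.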

maestroque added 2 commits July 22, 2024 17:44

Initial export_metrics pydra task

1718cbf

Add initial compute and export metrics pydra tasks

7625806

maestroque requested review from smoia, m-miedema and me-pic July 23, 2024 10:46

maestroque self-assigned this Jul 23, 2024

github-actions bot added the Testing (for testing features, writing tests or producing testing code) and Internal (changes affect the internal API; no version increase, but produces a changelog) labels Jul 23, 2024

maestroque added the Minormod-breaking label (for development only: increments the minor version, 0.+1.0, but breaks compatibility) and removed the Internal label Jul 23, 2024

Pull updates from master

d150e76

github-actions bot added the Internal label Aug 25, 2024

maestroque removed the Internal label Aug 25, 2024

github-actions bot added the Internal label Aug 25, 2024

maestroque added 5 commits August 26, 2024 00:15

Add functional compute_metrics pydra task and unit tests

4e38eaf

Add functional export_metrics task and tests

aef61d2

Initial pydra workflow integration test

1d27cae

Add workflow integration test

e053e1d

Improve integration test

bf2ae2e

maestroque mentioned this pull request Aug 28, 2024

Issues integrating loguru into pydra tasks nipype/pydra#763

Open

maestroque added 3 commits August 28, 2024 17:44

Add buildable pydra worfklow to be used in CLI and initial CLI arguments

d347bba

Add initial CLI implementation

7180576

Minor CLI optimizations

c27e8d5

maestroque changed the title ~~WIP: Add pydra tasks and workflow~~ Add pydra tasks, workflow and update CLI Aug 28, 2024

smoia reviewed Aug 30, 2024

smoia reviewed Aug 30, 2024

maestroque commented Aug 30, 2024

maestroque and others added 3 commits September 3, 2024 00:05

Fix CLI calling method

a03381d

[pre-commit.ci] auto fixes from pre-commit.com hooks

9e3d0d4

for more information, see https://pre-commit.ci

Remove nest_asyncio dependency

85be699

me-pic requested changes Sep 3, 2024

maestroque mentioned this pull request Sep 3, 2024

Add BIDS reading support and prepare input loading for pydra workflow physiopy/physutils#7

Merged

18 tasks

Add auto mode for input files in CLI

abd4c93

me-pic reviewed Sep 3, 2024

m-miedema reviewed Sep 4, 2024

Cleanup

9d15abd

Integrate physutils changes

a71a745

me-pic reviewed Oct 26, 2024

me-pic mentioned this pull request Oct 28, 2024

fs paramater specification causing problem in load_physio physiopy/physutils#10

Open
