ENH: Enable models for sparsely sampled fMRI series #376
I think I'm probably missing something, but if we have to give up on a value that neatly bins `TR` and `TA`, why not just use `SR` at that point? E.g., if dt = 50 and SR = 10, then this will give us dt = 10, so we're at the SR anyway. If dt = 50 and SR = 17, we get dt = 25, which both prevents neat binning and is bigger than we wanted. Is the idea that at least every `dt//SR`-th bin will align nicely this way?
25 also permits neat binning...
Oh, duh. The inversion tripped me up. That is indeed the thing I was missing. :)
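For readers following the arithmetic, here is a minimal sketch of one way to pick a sampling interval that bins both `TR` and `TA` evenly while staying at or below the interval implied by the requested sampling rate. This is not the code under review: the function name `pick_interval_ms`, the use of integer milliseconds, and the "largest suitable divisor" rule are all assumptions made for illustration.

```python
from math import gcd


def pick_interval_ms(tr_ms: int, ta_ms: int, sr_hz: float) -> int:
    """Hypothetical helper: largest interval (ms) that divides gcd(TR, TA)
    and is no longer than the interval implied by the sampling rate."""
    base = gcd(tr_ms, ta_ms)      # longest interval that bins TR and TA evenly
    requested = 1000.0 / sr_hz    # interval implied by the sampling rate, in ms
    # Walk down from the base interval to the first divisor that is fine enough.
    for candidate in range(base, 0, -1):
        if base % candidate == 0 and candidate <= requested:
            return candidate
    return 1  # only reached when sr_hz > 1000, i.e. requested < 1 ms


print(pick_interval_ms(2000, 500, 10))  # requested interval is 100 ms
print(pick_interval_ms(2000, 500, 17))  # requested interval is about 58.8 ms
```

Under these assumptions, a 10 Hz request is met exactly (100 ms), while a 17 Hz request yields a 50 ms interval, so the effective rate (20 Hz) differs from the one set on the collection.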
That said, I'm still not sure we shouldn't just use the `SR`. Since the user could have set the `sr` on the collection manually, the compromise solution here risks explicitly ignoring their intentions. This isn't an entirely benign decision, because in subsequent transformations, if a mismatch in sampling rates is detected, that will trigger a resampling in at least one variable, when potentially none might have been needed. They could also have code that depends on knowing what the SR is.

I think maybe we need to come up with a general policy about this, because it crops up in other places too (e.g., the issue about what to do if event files contain very short durations). I.e., the general question is, in a situation where the user appears to be picking sampling rates that are objectively inefficient or result in loss of information, should we be paternalistic and fix it for them, or just let them know they can probably do better?
One thing we could also do, though it's a bit annoying, is internally maintain a flag that indicates whether or not the user has explicitly overridden the default sampling rate. If `SR=10` purely because that's the default, overriding it seems fine. If the user explicitly set it themselves, maybe we don't want to pick a different value for them.
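A minimal sketch of the flag idea, assuming a hypothetical `CollectionSketch` class (not the pybids collection class) and a default of 10 taken from the `SR=10` example above. Passing `None` stands for "the caller did not choose a rate".

```python
from typing import Optional

DEFAULT_SR = 10.0  # assumed default, taken from the SR=10 example in the thread


class CollectionSketch:
    """Hypothetical stand-in for a variable collection; not the pybids class."""

    def __init__(self, sampling_rate: Optional[float] = None):
        # None means "caller did not choose", so the default may be overridden.
        self.sr_is_explicit = sampling_rate is not None
        self.sampling_rate = (
            DEFAULT_SR if sampling_rate is None else float(sampling_rate)
        )

    def effective_rate(self, preferred: float) -> float:
        """Return the rate a transformation should use.

        `preferred` is the rate the transformation would pick on its own.
        """
        return self.sampling_rate if self.sr_is_explicit else preferred


print(CollectionSketch().effective_rate(20.0))    # default may be overridden
print(CollectionSketch(10).effective_rate(20.0))  # explicit choice is kept
```

Note that an explicit `10` is respected even though it equals the default, which is the distinction the flag exists to capture.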
It feels like we're trying to thread a needle where we make smart default choices based on our expertise on the one hand, and on the other, load a gun, cock it and point it at the foot of anyone who looks like they might know what they're doing...

Currently we reserve the footgun for people who use `ToDense` to get themselves a sampling interval that is relatively prime to `gcd(TR, TA)`. If someone knows enough to manipulate the default sampling rate, they can learn that `Convolve` with a sparse acquisition paradigm will choose a slightly different interval than with a continuous acquisition paradigm unless the variable is already dense.

But perhaps there's a better general approach here? I admit that I chose this simply because building a boxcar function and taking a mean is easier than learning how to do this with `interp1d`.
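To make the "boxcar and mean" approach concrete, here is a minimal sketch that averages a densely sampled signal over each acquisition window. It is an illustration, not the code in this PR: the helper name `boxcar_mean` and its millisecond arguments are assumptions, and it requires the dense interval to divide both `TR` and `TA`.

```python
def boxcar_mean(dense, dt_ms, tr_ms, ta_ms):
    """Hypothetical helper: average dense samples over each acquisition window.

    Assumes dt_ms divides both tr_ms and ta_ms, so every window holds a
    whole number of dense samples.
    """
    if tr_ms % dt_ms or ta_ms % dt_ms:
        raise ValueError("dt_ms must divide both tr_ms and ta_ms")
    step = tr_ms // dt_ms    # dense samples between successive volume onsets
    width = ta_ms // dt_ms   # dense samples inside one acquisition window
    out = []
    for start in range(0, len(dense) - width + 1, step):
        window = dense[start:start + width]
        out.append(sum(window) / width)
    return out


dense = [0, 1, 2, 3, 4, 5, 6, 7]            # one value every 500 ms
print(boxcar_mean(dense, 500, 2000, 1000))  # windows are [0, 1] and [4, 5]
```

The divisibility check is where the choice of interval matters: an interval that does not divide `gcd(TR, TA)` leaves windows that straddle dense samples.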
The general issue potentially affects anyone who sets the `sampling_rate` explicitly in `BIDSLayout.get_collections` or `load_variables`. A sophisticated user could certainly know enough about their dataset to think "I'll use a sampling rate of 50, because that's high enough to account for my short-duration events, but still manageable computationally". Thereafter, I think that user could reasonably expect that if they call `ToDense` or `Convolve` without explicitly specifying a different sampling rate, the resulting variable will have an effective sampling rate of 50.

I think making decisions for the user internally is fine if there's an explicit argument in the transformation that implies as much. I.e., if `ToDense` has default `sampling_rate='auto'`, then the docstring can just explain that we will act intelligently unless a numeric value is passed, and the result may not equal what's currently set in the collection. That seems fine. The issue is that in this case, there's no provision to specify the sampling rate in `Convolve`. I was against doing that because I don't think the spec should have that argument, but I guess we can make it a pybids-only argument, if only for the sake of making it clear to a sophisticated user what's going on.
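A rough sketch of how a `sampling_rate='auto'` argument could be resolved inside a transformation. The helper `resolve_sampling_rate` and its `auto_rate` parameter are hypothetical names for illustration; nothing here is an existing pybids signature.

```python
from typing import Union


def resolve_sampling_rate(sampling_rate: Union[str, float],
                          auto_rate: float) -> float:
    """Hypothetical helper for a transformation accepting sampling_rate='auto'.

    `auto_rate` is whatever rate the transformation would choose by itself.
    """
    if isinstance(sampling_rate, str):
        if sampling_rate == 'auto':
            return auto_rate          # library picks; may differ from collection
        raise ValueError(f"unsupported sampling_rate: {sampling_rate!r}")
    return float(sampling_rate)       # a numeric value is always honoured


print(resolve_sampling_rate('auto', 20.0))  # library's choice
print(resolve_sampling_rate(50, 20.0))      # user's explicit choice
```

The design point is that the numeric branch never second-guesses the caller, so any deviation from the collection's rate is traceable to the documented `'auto'` default.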
This is OK for upsampling but not for downsampling: depending on the frequency content of the signal, downsampling will introduce aliasing.
I'm not sure avoiding aliasing is in scope... or at least, I'm not sure we want to implicitly start low-pass filtering the input signal without the user's explicit consent. I'd be okay adding an explicit `TemporalFilter` transformation (actually I think this is already in the spec and just not yet implemented), and the user can then call that themselves before the resampling step if they like. But doing some magic internally to pick a suitable filter or otherwise avoid aliasing means (a) reproducibility across packages is limited, and (b) we're taking on the responsibility for producing sensible results, and to my mind this really falls on the user.
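For context on the aliasing concern, here is a small self-contained illustration using only the standard library. It is a toy example with made-up frequencies, not tied to any pybids code: a 9 Hz sine sampled at 100 Hz is downsampled to 10 Hz by keeping every 10th sample, with no low-pass filter applied first.

```python
import math


def sample_sine(freq_hz, rate_hz, n):
    """Return n samples of a unit sine at freq_hz, sampled at rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]


dense = sample_sine(9, 100, 100)  # 9 Hz sine, 100 Hz sampling, 1 second
naive = dense[::10]               # keep every 10th sample: 10 Hz, unfiltered
alias = sample_sine(1, 10, 10)    # 1 Hz sine sampled directly at 10 Hz

# After naive downsampling, the 9 Hz signal matches an inverted 1 Hz sine.
matches = all(math.isclose(a, -b, abs_tol=1e-9) for a, b in zip(naive, alias))
print(matches)
```

Because 9 Hz lies above the 5 Hz Nyquist limit of the 10 Hz output, it shows up as a 1 Hz component; whether to filter beforehand is exactly the policy question debated above.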