Addition of analysis of non-FISH signal in regions defined by FISH signal #337

Open
1 of 9 tasks
vreuter opened this issue Jun 26, 2024 · 13 comments

Comments

@vreuter
Collaborator

vreuter commented Jun 26, 2024

  • Same idea as final step initially developed for DNA DSB project
  • Keep 1 row per spot, 1 column group per molecule of interest x measurement (e.g., mean RAD51, standard deviation RAD51)
  • Use, by default, just the central plane +/- 1 $z$-slice
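A minimal sketch of that output layout, assuming each spot's ROI is available as a NumPy array indexed `(z, y, x)`. The names `central_slab` and `spot_row` and the `<molecule>_<statistic>` column naming are hypothetical, not taken from looptrace.

```python
import numpy as np

# Hypothetical statistic set; the bullets above name mean and standard deviation.
STATISTICS = {"mean": np.mean, "std": np.std}  # np.std is the population std (ddof=0)


def central_slab(volume, z_center, margin=1):
    """Return the slices of a (z, y, x) array within `margin` of `z_center`."""
    lower = max(0, z_center - margin)
    upper = min(volume.shape[0], z_center + margin + 1)
    return volume[lower:upper]


def spot_row(spot_id, roi_by_molecule, z_center, margin=1):
    """Build one flat record per spot, with one column per molecule x statistic."""
    row = {"spot_id": spot_id}
    for molecule, volume in roi_by_molecule.items():
        slab = central_slab(volume, z_center, margin)
        for name, statistic in STATISTICS.items():
            row[f"{molecule}_{name}"] = float(statistic(slab))
    return row


# Toy ROI: 7 z-slices of 2 x 2 pixels, holding the values 0..27.
roi = np.arange(7 * 2 * 2).reshape(7, 2, 2)
row = spot_row(spot_id=0, roi_by_molecule={"RAD51": roi}, z_center=3)
print(sorted(row))  # ['RAD51_mean', 'RAD51_std', 'spot_id']
```

Near the top or bottom of the stack the slab is simply clipped, so it can hold fewer than `2 * margin + 1` slices. Whether to clip or to discard such spots is a design choice left open here.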

Checklist:

@vreuter
Collaborator Author

vreuter commented Nov 20, 2024

Supersedes #138

vreuter added a commit to vreuter/looptrace that referenced this issue Nov 24, 2024
vreuter added a commit to vreuter/looptrace that referenced this issue Nov 24, 2024
@vreuter vreuter modified the milestones: v0.11, v0.11.1 Nov 25, 2024
@vreuter
Collaborator Author

vreuter commented Nov 25, 2024

For the locus-specific spots, consider whether or not we're also re-measuring pixel values in regional spots (particularly if the same ROI diameter is used for regional and for locus-specific spots), as well as the units of measure! In particular, we need to ensure that the ROI size is specified with units, and that we convert as necessary based on which of the fields of the locus spot records are being accessed in order to define the location of each region. We should probably prefer to continue to work in pixels for the specification of the ROI size/diameter, since we'll use the region center and size in order to define a region (in pixels) from which to extract pixel values and over which to compute statistics.

@ines-prlesi

At first glance, I think it would be great to have an "average aggregation" of the IF signal, modified so that we can specify a +/- N step relative to the z center rather than using the whole z stack at the given ROI. The idea would be that we then get the mean intensity of the signal over +/- 2-3 slices in z.
However, depending on what the images look like and the step size (@TLSteinacker you must know this better than me), maybe it's not even necessary, and then taking the "exact aggregation" would be enough (since this is what I am using for my analyses with the reparafil pipeline and it gives reliable results).

@vreuter
Collaborator Author

vreuter commented Nov 25, 2024

Thanks @ines-prlesi

Other ideas from discussion:

  • If a user's specifying a $z$ depth, this could be interpreted either as an absolute value or as a percentage of the total available $z$ depth
  • If $z$ depth is interpreted absolutely, then care should be taken that the space and its units are explicit (i.e., image space with units in pixels, or physical space with units such as nanometers)
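To make the two interpretations concrete, here is a rough sketch of how a $z$-depth value could be resolved to a number of slices. The function name `resolve_z_depth`, the `%`/`px`/`nm` suffixes, the `nm_per_slice` parameter, and the choice to round to the nearest slice are all assumptions for illustration only.

```python
def resolve_z_depth(spec, num_slices, nm_per_slice=None):
    """Resolve a z-depth string to a whole number of slices.

    Accepted forms (hypothetical): "25%" (share of the available depth),
    "3 px" (image space), "600 nm" (physical space, needs nm_per_slice).
    """
    text = spec.strip()
    if text.endswith("%"):
        fraction = float(text[:-1]) / 100
        if not 0 < fraction <= 1:
            raise ValueError(f"Percentage must be in (0, 100]: {spec!r}")
        depth = round(fraction * num_slices)
    elif text.endswith("px"):
        depth = int(text[:-2])
    elif text.endswith("nm"):
        if nm_per_slice is None:
            raise ValueError("A physical depth needs the z step size (nm_per_slice)")
        depth = round(float(text[:-2]) / nm_per_slice)
    else:
        # Refuse bare numbers so the space and units are always explicit.
        raise ValueError(f"z depth must carry a unit (%, px, or nm): {spec!r}")
    if not 1 <= depth <= num_slices:
        raise ValueError(f"Resolved depth {depth} is outside 1..{num_slices}")
    return depth


print(resolve_z_depth("25%", num_slices=20))  # 5
print(resolve_z_depth("3 px", num_slices=20))  # 3
print(resolve_z_depth("600 nm", num_slices=20, nm_per_slice=300.0))  # 2
```

Rejecting a bare number is the point of the sketch: a value without a suffix could be read in either space, so the parser fails rather than guesses.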

@TLSteinacker

@vreuter thanks for initiating this discussion
@ines-prlesi what parameters are you currently using? have you tested different ones and how do they change the relative outcome?
I agree that the whole z would not be good, and think that either a selection of z slices (~2-3) or the calculation of the sphere might be preferable.

@vreuter
Collaborator Author

vreuter commented Nov 25, 2024

Thanks @TLSteinacker, what's implemented for the DNA DSB project doesn't give the user any flexibility, but rather computes all three of these (central slice, average, and max-projection). The reason I'd like to parameterize this, though, is that there are 5 numbers computed for each method (min, max, median, mean, and standard deviation), so the number of values is already relatively large, and for reasons I think we've discussed together or in pairs, the most used of these methods is the central-plane one.

So the "sphere" is a new idea, but of the previous three I think there's consensus to keep the central-plane one while providing flexibility to add a +/- $n$ value, so that in total the 2D ROI area $\times$ ($2n + 1$) layers of $z$ form the volume over which to compute the summary statistics.
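As a sketch of that computation, assuming a `(z, y, x)` NumPy image, a spot center given in pixels, and a square $xy$ footprint: the helpers `extract_block` and `summarize` are hypothetical names, and clipping at the image border is one possible policy among several.

```python
import numpy as np


def _bounds(center, size, limit):
    """Clip the half-open interval of `size` pixels around `center` to [0, limit)."""
    start = center - size // 2
    return max(0, start), max(0, min(limit, start + size))


def extract_block(image, center, xy_diameter, n):
    """Cut out xy_diameter x xy_diameter pixels over 2n + 1 z-layers around center."""
    z, y, x = center
    z0, z1 = _bounds(z, 2 * n + 1, image.shape[0])
    y0, y1 = _bounds(y, xy_diameter, image.shape[1])
    x0, x1 = _bounds(x, xy_diameter, image.shape[2])
    return image[z0:z1, y0:y1, x0:x1]


def summarize(block):
    """Compute the five summary statistics over every pixel value in the block."""
    if block.size == 0:
        raise ValueError("ROI lies entirely outside the image")
    return {
        "min": float(block.min()),
        "max": float(block.max()),
        "median": float(np.median(block)),
        "mean": float(block.mean()),
        "std": float(block.std()),  # population standard deviation (ddof=0)
    }


# Toy image: 5 z-slices of 6 x 6 pixels holding the values 0..179.
image = np.arange(5 * 6 * 6, dtype=float).reshape(5, 6, 6)
block = extract_block(image, center=(2, 3, 3), xy_diameter=3, n=1)
stats = summarize(block)
print(block.shape)  # (3, 3, 3)
print(stats["min"], stats["mean"], stats["max"])  # 50.0 93.0 136.0
```

With `n=0` this reduces to the central-plane method alone.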

@vreuter
Collaborator Author

vreuter commented Nov 25, 2024

@ines-prlesi @TLSteinacker how about having either of the following be valid, then...

  • Option for sphere, with a single parameter value representing either the radius or the diameter of a sphere in $z$
  • Option for rectangular prism, with single value representing either the "height" of the box (number of $z$ units), or the half-width?

?

At least one would be required, and then either we could allow one-and-only-one, or allow both and then use both methods, differentiating column names by, e.g., a __box or __sphere suffix.

Here could be the format of a valid specification...

{
    "shape": shape,
    dimension: "n px"
}

where $n$ is a positive integer, $shape \in \{"sphere", "box"\}$, and $dimension \in \{"diameter", "sideLength"\}$

This would of course be a particular choice balancing expressiveness/flexibility, user burden, and clarity.
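For illustration, a validator for that format could look roughly like this. It assumes that "diameter" goes with "sphere" and "sideLength" with "box" (the proposal does not say so explicitly), and `parse_roi_spec` is a hypothetical name.

```python
import re

# Assumed pairing of shape and dimension key; not stated in the proposal.
DIMENSION_KEY_BY_SHAPE = {"sphere": "diameter", "box": "sideLength"}
_PIXELS = re.compile(r"([1-9][0-9]*) px")


def parse_roi_spec(spec):
    """Validate a {"shape": ..., <dimension>: "n px"} mapping; return (shape, n)."""
    shape = spec.get("shape")
    if shape not in DIMENSION_KEY_BY_SHAPE:
        raise ValueError(f"shape must be one of {sorted(DIMENSION_KEY_BY_SHAPE)}")
    key = DIMENSION_KEY_BY_SHAPE[shape]
    if set(spec) != {"shape", key}:
        raise ValueError(f"A {shape} needs exactly the keys 'shape' and {key!r}")
    match = _PIXELS.fullmatch(str(spec[key]))
    if match is None:
        raise ValueError(f"{key} must look like '<positive integer> px'")
    return shape, int(match.group(1))


print(parse_roi_spec({"shape": "sphere", "diameter": "5 px"}))  # ('sphere', 5)
print(parse_roi_spec({"shape": "box", "sideLength": "7 px"}))  # ('box', 7)
```

Allowing both shapes at once, as floated above, would mean accepting a list of such mappings and suffixing the output columns per shape.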

WDYT? Do you prefer something else to differently balance these tradeoffs?

@TLSteinacker

Sounds like a good plan to me! Regarding ' with single value representing either the "height" of the box (number of z units), or the half-width?', would there not be one value for the xy size and one for z? Or is xy anyway already specified somewhere else?

@vreuter
Collaborator Author

vreuter commented Nov 25, 2024

> would there not be one value for the xy size and one for z? Or is xy anyway already specified somewhere else?

Indeed, sorry, I was operating with the "diameter (xy) is already set" model in my head since that's currently the case, but you're right @TLSteinacker we'd need separate values for xy.

Another thing just occurred to me... we began by talking about +/- a margin from the central plane in $z$, with a diameter in the $xy$ plane. I then started talking about a sphere, but actually we'd have an ellipsoid, since there's no constraint that the $z$ depth match the diameter in $xy$. In fact, I acknowledge it's a bit nonsensical to talk this way anyhow, since the $xy$ units are pixels in a way that the $z$ units are not. So how about this...

{
  "x": "<x> px", 
  "y": "<y> px", 
  "z": <z>, 
  "shape": shape
}

where the $x$, $y$, $z$ values are all populated by positive integers, with the "px" added to be clear that it's pixels, not a fixed physical unit. $z$ is left as a scalar to note the difference with the other two. $shape$ must be either "ellipsoid" or "box", and then the values are used accordingly to construct the actual volume of pixels over which summary stats of the pixel values are calculated. This leaves flexibility to define $x$ and $y$ separately in the event (even if rare) that the regions are known/expected to have more rectangular form in $xy$, at the small user burden of adding an extra key-value pair to specify. I'd favor interpreting these (implicitly, without additional specification) as side lengths (not half-widths) in the "box" case, and analogously, axis lengths in the "ellipsoid" case.
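A sketch of how the values could be turned into the actual set of pixels, treating $x$, $y$, $z$ as full side lengths ("box") or full axis lengths ("ellipsoid"). The function `roi_mask` is hypothetical, and the inclusion rule for the ellipsoid (a voxel counts if its center lies inside the inscribed ellipsoid) is an assumption.

```python
import numpy as np


def roi_mask(x, y, z, shape):
    """Boolean (z, y, x) mask selecting the voxels of a box or ellipsoid ROI."""
    if min(x, y, z) < 1:
        raise ValueError("x, y, and z must be positive integers")
    if shape == "box":
        return np.ones((z, y, x), dtype=bool)
    if shape == "ellipsoid":
        zz, yy, xx = np.ogrid[:z, :y, :x]
        # Offset of each voxel center from the middle of the bounding box,
        # scaled by the semi-axis length along that dimension.
        dz = (zz - (z - 1) / 2) / (z / 2)
        dy = (yy - (y - 1) / 2) / (y / 2)
        dx = (xx - (x - 1) / 2) / (x / 2)
        return dz**2 + dy**2 + dx**2 <= 1.0
    raise ValueError(f"shape must be 'ellipsoid' or 'box', got {shape!r}")


box = roi_mask(x=4, y=3, z=2, shape="box")
ball = roi_mask(x=3, y=3, z=3, shape="ellipsoid")
print(box.shape, int(box.sum()))  # (2, 3, 4) 24
print(int(ball.sum()))  # 19: the 8 corner voxels of the 3 x 3 x 3 cube are excluded

# The mask then selects the pixel values to summarize.
block = np.arange(27, dtype=float).reshape(3, 3, 3)
values = block[ball]
print(values.size, values.mean())  # 19 13.0
```

The summary statistics are then taken over `values`, so the same downstream code serves both shapes.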

WDYT? @ines-prlesi @TLSteinacker

@TLSteinacker

Cool, thanks for clarifying and for pointing out the ellipsoid! would it be helpful to specify z as 'slices' to avoid any ambiguity?

I agree that FWHM would be confusing and would prefer side/axis lengths

@vreuter
Collaborator Author

vreuter commented Nov 26, 2024

> would it be helpful to specify z as 'slices' to avoid any ambiguity?

Yes, I think that makes it clearer, and it saves a config file reader from thinking that the config file author has mistakenly forgotten the units of $z$ (and then, much worse, erroneously "correcting" it by adding something which would in fact be wrong). So indeed, we can make it so that the $z$ specification must include "slices" for clarity. @ines-prlesi is that OK with you? The "px" suffix is how I've defined pixels, and those parameter values will parse as such.
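Roughly, parsing with mandatory suffixes might look like the following. This is only a sketch: `parse_roi_dimensions` and the exact `"<n> px"` / `"<n> slices"` string forms are assumptions, not the parser that exists in the codebase.

```python
import re

_QUANTITY = re.compile(r"(?P<value>[1-9][0-9]*) (?P<unit>px|slices)")
EXPECTED_UNIT = {"x": "px", "y": "px", "z": "slices"}


def parse_roi_dimensions(spec):
    """Read x, y, z from a spec mapping, insisting on the unit for each axis."""
    sizes = {}
    for axis, unit in EXPECTED_UNIT.items():
        raw = spec[axis]  # a missing axis surfaces as KeyError
        match = _QUANTITY.fullmatch(raw) if isinstance(raw, str) else None
        if match is None or match["unit"] != unit:
            raise ValueError(
                f"{axis!r} must look like '<positive integer> {unit}', got {raw!r}"
            )
        sizes[axis] = int(match["value"])
    return sizes


spec = {"x": "5 px", "y": "5 px", "z": "3 slices", "shape": "box"}
print(parse_roi_dimensions(spec))  # {'x': 5, 'y': 5, 'z': 3}
```

A bare `"z": 3` or a mistaken `"z": "3 px"` is rejected, which is exactly the "forgotten units" confusion this is meant to prevent.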

Increasingly, computations will be done with values already carrying units of measure (and therefore, implicitly, the "dimension" of the quantity, e.g. length, area, etc.), just for safety, so that things like the error with regional pairwise distances having been in pixels rather than nanometers can't happen. I alluded to this in a previous weekly meeting update/spiel, but I'm just reiterating it here. We use the config files (or, in the future, the actual image files) to define what a pixel is in physical-space terms, and increasingly we'll be more precise about using that throughout the computations and attaching units to other such parameters.
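As a toy illustration of the idea (not the types used in the codebase), lengths can be wrapped so that pixels and nanometers cannot be mixed by accident, with conversion only possible through an explicit pixel size. The classes `Pixels`, `Nanometers`, and `PixelSize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pixels:
    value: float

    def __add__(self, other):
        if not isinstance(other, Pixels):
            return NotImplemented  # mixing units becomes a TypeError
        return Pixels(self.value + other.value)


@dataclass(frozen=True)
class Nanometers:
    value: float

    def __add__(self, other):
        if not isinstance(other, Nanometers):
            return NotImplemented
        return Nanometers(self.value + other.value)


@dataclass(frozen=True)
class PixelSize:
    """Physical extent of one pixel, e.g. as read from a config file."""

    nm_per_px: float

    def to_nanometers(self, length: Pixels) -> Nanometers:
        return Nanometers(length.value * self.nm_per_px)

    def to_pixels(self, length: Nanometers) -> Pixels:
        return Pixels(length.value / self.nm_per_px)


scale = PixelSize(nm_per_px=100.0)  # made-up value for the example
print(scale.to_nanometers(Pixels(3)))  # Nanometers(value=300.0)
print(scale.to_pixels(Nanometers(250.0)))  # Pixels(value=2.5)

try:
    Pixels(3) + Nanometers(250.0)
except TypeError:
    print("cannot add pixels to nanometers")
```

The payoff is that a distance computed in pixels cannot silently be reported as nanometers; the conversion has to be written out.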

@TLSteinacker

sounds good @vreuter, thanks!!

@ines-prlesi

Sorry @vreuter and @TLSteinacker, I'm still new to GitHub so I missed the notifications; however, I read the whole thread and think that the final conclusion is good!

@vreuter vreuter modified the milestones: v0.11.1, v0.12, v0.13 Nov 27, 2024