Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add flux normaliser processor #1878

Open
wants to merge 34 commits into
base: master
Choose a base branch
from
Open

Add flux normaliser processor #1878

wants to merge 34 commits into from

Conversation

hrobarts
Copy link
Contributor

@hrobarts hrobarts commented Jul 25, 2024

Description

New FluxNormaliser processor to normalise data

  • to a float value
  • to an array of floats with size matching the number of angles in the data
  • to the mean value calculated from a specified region of interest in the data itself

Example Usage

Changes

Testing you performed

Demo scripts in https://github.com/TomographicImaging/CIL-Demos/blob/flux_normaliser_demo/misc/flux_normaliser.ipynb
Tests in test_DataProcessor.py TestFluxNormaliser class

Related issues/links

#1877

Checklist

  • I have performed a self-review of my code
  • I have added docstrings in line with the guidance in the developer guide
  • I have updated the relevant documentation
  • I have implemented unit tests that cover any new or modified functionality
  • CHANGELOG.md has been updated with any functionality change
  • Request review from all relevant developers
  • Change pull request label to 'Waiting for review'

Contribution Notes

Please read and adhere to the developer guide and local patterns and conventions.

  • The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in CIL (the Work) under the terms and conditions of the Apache-2.0 License
  • I confirm that the contribution does not violate any intellectual property rights of third parties

@hrobarts
Copy link
Contributor Author

hrobarts commented Aug 1, 2024

Some examples:
Normalise all data with a scalar value

processor = FluxNormaliser(flux=10)
processor.set_input(data)
data_norm = processor.get_output()
show2D([data, data_norm], ['Data','Normalised data'])

image

Normalise data with a scalar value per projection

processor = FluxNormaliser(flux=np.arange(1,2,(2-1)/(data.get_dimension_size('angle'))))
processor.set_input(data)
data_norm = processor.get_output()
show2D([data, data_norm], slice_list=('vertical',80))

image

Normalise data using a region in the background of each projection.

roi = {'vertical':(0,14), 'horizontal':(0,14)}
processor = FluxNormaliser(roi=roi)
processor.set_input(data)
processor.show_roi()

The default output of show_roi() shows the roi plotted on the dataset for the first and last projection and the projection which has the maximum variation from the mean intensity (Z-score) in the roi, as well as a plot of the mean intensity per projection.
image
In this case we can see that although the roi is empty for most projections, there are some cases where there is material in the roi like index 176.
We can also use processor.show_roi(angle_index=250) with a specific angle_index if we want to check a different projection which looks like an outlier
image
In this case, if we change the roi we can make sure it's empty for all projections

roi = {'vertical':(0,10), 'horizontal':(0,10)}
processor = FluxNormaliser(roi=roi)
processor.set_input(data)
processor.show_roi()

image

data_norm = processor.get_output()

In this example with synchrotron data, the background has a gradient so the Z-score might not be the best metric to use.

roi = {'vertical':(10,135), 'horizontal':(125,150)}
processor = FluxNormaliser(roi=roi)
processor.set_input(data)
processor.show_roi()

image

roi = {'vertical':(10,135), 'horizontal':(135,150)}
processor = FluxNormaliser(roi=roi)
processor.set_input(data)
processor.show_roi()

image

data_norm = processor.get_output()
show2D([data, data_norm], slice_list=('vertical',80))

image

The gradient is opposite if we use the roi on the other side

roi = {'vertical':(10,135), 'horizontal':(10,32)}
processor = FluxNormaliser(roi=roi)
processor.set_input(data)
processor.show_roi()

image

data_norm = processor.get_output()
show2D([data, data_norm], slice_list=('vertical',80))

image

@hrobarts
Copy link
Contributor Author

Updated to plot the mean, minimum and maximum in the roi
image

@hrobarts
Copy link
Contributor Author

Updated so preview_configuration() still works on data with one angle
image

Also works on 2D data where the roi is specified with one of horizontal or vertical. Plot the roi on the sinogram
image

Also works if the dataorder is different
image

Also works if the data has multiple channels, the flux to normalise to is the mean in the roi averaged over the channels. preview_configuration() shows the central channel by default
image
or the channel can be specified processor.preview_configuration(channel=0)
image

@hrobarts hrobarts linked an issue Aug 20, 2024 that may be closed by this pull request
@hrobarts hrobarts marked this pull request as ready for review August 20, 2024 14:15
@hrobarts hrobarts requested a review from gfardell August 20, 2024 14:50
@hrobarts
Copy link
Contributor Author

@gfardell we discussed whether the preview_configuration should just be an option in logging. At the moment I've left it as a function the users can access - what do you think?

Copy link
Member

@lauramurgatroyd lauramurgatroyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great and super useful! Excellent documentation in this PR to go along with it as well. as I have said in my comments, we need to make sure it lives somewhere useful!

I added some comments, mostly to improve the unit tests a bit and add some docs

Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/test/test_DataProcessor.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/test/test_DataProcessor.py Show resolved Hide resolved
Wrappers/Python/test/test_DataProcessor.py Outdated Show resolved Hide resolved
@hrobarts
Copy link
Contributor Author

Hi @lauramurgatroyd thank you for your review! I have made lots of updates

  • More efficiently access data in process (following conversation with @gfardell )
  • Error if horizontal and vertical are not in the last two dimensions of the data
  • Change norm_value to target and allow it to be passed as a number or string mean or first which gets the target from the mean or first value of the flux array (e.g. this could be the mean value in the roi across all projections)
  • Moved _calculate_flux and _calculate_target to separate functions out of check_input
  • New tests to test this functionality

I have added the notebook where I created the examples above to CIL-Demos/misc in this PR TomographicImaging/CIL-Demos#187

CHANGELOG.md Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
equal to the number of projections in the dataset. If flux=None, calculate
flux from the roi, default is None.

roi: dict (optional)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
roi: dict (optional)
roi: dict (optional), default=None

Please add defaults like this, as in docs guidance: https://tomographicimaging.github.io/CIL/nightly/developer_guide/


Parameters:
-----------
flux: float or list of floats (optional)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
flux: float or list of floats (optional)
flux: float or list of floats (optional), default=None

and/ or 'horizontal'. If an axis is not specified in the roi dictionary,
the full range will be used, default is None.

target: string or float
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
target: string or float
target: string or float, default='mean'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please list string options like: {'hi', 'ho'} in this example https://numpydoc.readthedocs.io/en/latest/example.html#example

Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Wrappers/Python/cil/processors/FluxNormaliser.py Outdated Show resolved Hide resolved
Comment on lines +159 to +168
if d in self.roi:
# check indices are ints
if not all(isinstance(i, int) for i in self.roi[d]):
raise TypeError("roi values must be int, found {} and {}"
.format(str(type(self.roi[d][0])), str(type(self.roi[d][1]))))
# check indices are in range
elif (self.roi[d][0] >= self.roi[d][1]) or (self.roi[d][0] < 0) or self.roi[d][1] > dataset.get_dimension_size(d):
raise ValueError("roi values must be start > stop and between 0 and {}, found start={} and stop={} for direction '{}'"
.format(str(dataset.get_dimension_size(d)), str(self.roi[d][0]), str(self.roi[d][1]), d ))
# create slice
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can make this some utility for validating roi throughout CIL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(could make a separate issue)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Laura, this is a good point, I'll make an issue. I think the requirement for the ROI can vary between processors but there are definitely some common things.


return True
# check if flux array contains 0s
if 0 in self.flux:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this catch 0.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've added some tests.

Copy link
Member

@lauramurgatroyd lauramurgatroyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great and nice examples in CIL-Demos misc.
As discussed this needs:

  • an example of 'first' in the demo
  • the demo being merged to main

Copy link
Member

@gfardell gfardell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great. I'd be happy to merge it in as it is. But I think it would be quick to add a numba implementation of the actual calculation. We can look at it together this week.

Comment on lines +40 to +45
roi: dict, optional
Dictionary describing the region of interest containing the background
in the image. The roi is specified as `{'axis_name1':(start,stop),
'axis_name2':(start,stop)}`, where the key is the axis name 'vertical'
and/ or 'horizontal'. If an axis is not specified in the roi dictionary,
the full range will be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just acquisition data then we know the axis names so I would remove the axis_name1 axis_name2


if 'horizontal' in dataset.dimension_labels:
self.h_axis = dataset.get_dimension_axis('horizontal')
self.h_size = dataset.get_dimension_size('horizontal')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know these methods existed, I'd always just found the index of dimension labels and then used that in shape!

Comment on lines +251 to +255
angle: float, optional
Angle to plot, default=None displays the data with the minimum
and maximum pixel values in the roi, otherwise the angle to display
can be specified as a float and the closest angle will be displayed.
For 2D data, the roi is plotted on the sinogram.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elsewhere I think we use 'angle' as the name of the axis, then the argument is the index of that projection in a list, not the value. It might be worth just discussing this further, not because it's wrong but it could just be a change of expected behaviour. Would it cause any issues if they specified an index?

It would also have to be careful about units - especially if we decide to convert things in the future and store in a defined way.

Comment on lines +401 to +406
for i in range(num_proj):
arr_proj = data.array.flat[i*proj_size:(i+1)*proj_size]
if len(self.flux.flat) > 1:
f = self.flux.flat[i]
arr_proj *= self.target_value/f
out.array.flat[i*proj_size:(i+1)*proj_size] = arr_proj
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be trivial to add a numba implementation, and here it would speed things up a lot. So I think we should look at that before we merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a FluxNormaliser processor
4 participants