-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Photutils tutorial notebook #2: source detection #3
Photutils tutorial notebook #2: source detection #3
Conversation
# Conflicts: # .gitignore # photutils/01_photutils_background_estimation.ipynb # photutils/02_photutils_source_detection.ipynb
Thanks @laurenmarietta - Tom is @Onoddil, so definitely makes sense for him to have a look at this as a "user". I'll also plan to take a look at it from my perspective in more detail later. On the specific concerns:
I agree this is rather odd. I would have expected <5% difference, not ~90%! I'll try to have a look/ask around some other experts and see if we're missing something.
I think the length is fine. I agree the segmentation seems a bit out-of-context, but I think it might be even worse to have a separate one for segmentation... An interesting experiment to tie it together might be to do a comparison of the segmentation and other methods? My prediction is that it'll work better on galaxies and worse for stars... but you never know! |
@laurenmarietta - Ah, I think I have part of the answer. If you look at the Two other things to just mention that I noticed on the way:
|
So I don't think the issue can be the elliptical nature of the Gaussian as, without specifying, DAOStarFinder has ratio=1.0 as its default value, assuming circular Gaussians. If I run IRAFFinder with iraffind = IRAFStarFinder(fwhm=5.0, threshold=20.*std, minsep_fwhm=0.0, sharplo=0.2, I get: id xcentroid ... flux mag bringing me into ~96% agreement with the DAOStarFinder. From a quick look at the reason seems to be minsep_fwhm, as if I remove that and run iraffind = IRAFStarFinder(fwhm=5.0, threshold=20.*std, sharplo=0.2, I get id xcentroid ... flux mag which seems to agree with @eteq on the 30% difference. I therefore think the issue the default requirement that objects have to be ~13 pixels separated, which is just not going to be the case in the Hubble EXtremely Deep Field! Re: @eteq 's 3 vs 5 pixel FWHM: if you decrease the FWHM IRAFStarFinder gains sources (likely missed stars), but DAOStarFinder loses sources (likely dropped galaxies larger than the new, smaller FWHM). However, for this test case I'm not sure that matters; we should simply go with the FWHM that gives a good selection of all sources, and perhaps briefly mention the tuning of the parameters to the science being done if that isn't already mentioned. Just to throw a slight spanner in the self-similarity works if you run IRAFStarFinder(fwhm=5.0, minsep_fwhm=0.0, threshold=20.*std) (i.e., the IRAF default sharpness etc., just without the minimum distance requirement) you only get 280 sources, so at that point the large differences between default roundness and sharpness are likely dropping galaxies, as daofind allows roundness to be in the domain [-1, 1] but iraffind limits roundness to [0, 0.2] and iraffind requiring a lower sharpness of 0.5 to daofind's 0.2. In summary, for this comment is a bit disjointed by this point: for a self-similar comparison you can set IRAFStarFinder to the same settings as DAOStarFinder, but require minsep_fwhm = 0.0, or you can justify the differences with a combination of the 13 pixel separation requirement and stricter limits on sharpness and roundness (where roundness is likely the galaxy/star divide issue), and consider a FWHM that gives the best balance between stars and galaxies for both finders. |
After what feels like a year, I have finally had the time to take another look at this notebook. I've made just a couple changes - thanks, Erik, for that clever code to show cutouts of the sources. And thanks to both of you for your sleuthing regarding the differences between If either of you spot anything else that could use clarification or improvement, please let me know. |
Managed to have a read of this notebook, but haven't got iPython to play nicely so I've not managed to verify the plot outputs (I'll have another look at these notebooks once I'm back in the US). The flow and words etc. look good though; I really just have one minor comment. I'm not sure if it was deliberate or a slight formatting issue, but any of the "inline" comments (e.g., https://github.com/eteq/csi-stsci-notebooks/pull/3/files#diff-fbc70e2f8789c7a80f84a34d56d11e00R39 or any of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@laurenmarietta - Overall this looks really good! I saw several minor things inline, but actually I think this is almost there.
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's also define some `matplotlib` parameters, such as title font size and the dpi, to make sure our plots look nice. To make it quick, we'll do this by loading a [shared style file](photutils_notebook_style.mplstyle) into `pyplot`. We will use this style file for all the notebook tutorials. (See [here](https://matplotlib.org/users/customizing.html) to learn more about customizing `matplotlib`.)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
"Let's also define some `matplotlib` parameters, such as title font size and the dpi, to make sure our plots look nice. To make it quick, we'll do this by loading a [shared style file](photutils_notebook_style.mplstyle) into `pyplot`. We will use this style file for all the notebook tutorials. (See [here](https://matplotlib.org/users/customizing.html) to learn more about customizing `matplotlib`.)" | |
"Let's also define some `matplotlib` parameters, such as title font size and the dpi, to make sure our plots look nice. To make it quick, we'll do this by loading a [style file shared with the other photutils tutorials](photutils_notebook_style.mplstyle) into `pyplot`. We will use this style file for all the notebook tutorials. (See [here](https://matplotlib.org/users/customizing.html) to learn more about customizing `matplotlib`.)" |
"source": [ | ||
"As described in the introduction, we will be using Hubble eXtreme Deep Field (XDF) data. Since this file is too large to store on GitHub, we will just use `astropy` to directly download the file from the STScI archive: https://archive.stsci.edu/prepds/xdf/ \n", | ||
"\n", | ||
"(Generally, the best package for web queries of astronomical data is `astroquery`; however, the dataset we are using is a High Level Science Product (HLSP) and thus is not located within a catalog that could be queried with `astroquery`.)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps also have the "astroquery" text be a link?
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"unit = u.ct / u.s\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"unit = u.ct / u.s\n", | |
"unit = u.electron / u.s\n", |
It might be clearer to leave it as counts, but since electron is a unit here, you might want to do that?
(There's a subtlty here: "ADUs" and "electrons" are generally related via the gain, but "counts" I think are not actually clearly defined to be one or the other. So it's probably better to specify electrons specifically, but maybe I'm mis-understanding what "counts" means here?)
"cbar.ax.set_yticklabels(labels)\n", | ||
"\n", | ||
"# Define labels\n", | ||
"cbar.set_label(r'Flux Count Rate ($e^{-1}/s$)', rotation=270, labelpad=30)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be neat to use the unit itself to render the latex. That is, you could do something like:
"cbar.set_label(r'Flux Count Rate ($e^{-1}/s$)', rotation=270, labelpad=30)\n", | |
"cbar.set_label(r'Flux Count Rate (${}$)'.format(xdf_image.unit.to_string('latex')), rotation=270, labelpad=30)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(same comment applies below in several places)
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Take this example as reminder to be critical when selecting a source detection algorithm, and when defining algorithm parameters!" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by this wording. Is it "critical" as in "pay attention to the details"? If so, maybe a better option would be "reminder to be careful" or even "examine your results and play with the parameters" or something like that?
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"For more general source detection cases that do not require rigorous comparison with models, `photutils` offers the `find_peaks` function. \n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor):
"For more general source detection cases that do not require rigorous comparison with models, `photutils` offers the `find_peaks` function. \n", | |
"For more general source detection cases that do not require comparison with models, `photutils` offers the `find_peaks` function. \n", |
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Next, how many of these sources match? We can answer this question by using [sets](https://docs.python.org/2/library/sets.html) to compare the centroids of the different sources (rounding to the first decimal place)." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems to link to the python 2 version of the docs. I think you want just the version-less one?: https://docs.python.org/library/stdtypes.html#set-types-set-frozenset
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"*Remember that you can double-click on the plot to zoom in and look around!*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this. Is that actually true in general for most browsers? (I haven't had a chance to check this in the live notebook, but will check later to be sure)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for me. You double click to "zoom in", the page doesn't increase in width, but the figure becomes larger within its box (although the height increases a bit), and then you can scroll sideways (OS X + magic mouse and trackpad at least; not sure how this interacts with a Windows desktop though) to see the full figure.
"source": [ | ||
"## Conclusions\n", | ||
"\n", | ||
"The `photutils` package provides users with a variety of methods for detecting sources in their data, from familar algorithms such as `DAOFind` and `starfind`, to more complex and customizable image segmentation algorithms. These methods allow for easy creation of a diverse array of apertures that can be used for photometric analysis.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there should also be a "and if you want to write your own algorithm, you can do that too!" at the end here? One or both of these could be mentioned:
- The photutils docs are not as fleshed out on this as might be desirable, but the basic idea is that anything that subclasses this: https://photutils.readthedocs.io/en/latest/api/photutils.detection.StarFinderBase.html#photutils.detection.StarFinderBase can be used in the exact same way.
- Photutils is open developed so PRs implementing new star finders are always welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This notebook looks great overall, @laurenmarietta. As with @eteq mostly minor comments, primarily focussed on slightly weird formatting errors I get on my machine or deprecation warnings and code changes to make them go away, but a few small improvements I think will help to make the notebook even better.
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-warning\">**Important:** Before proceeding, please be sure to update your versions of `astropy`, `matplotlib`, and `photutils`, or this notebook may not work properly. Or, if you don't want to handle packages individually, you can always use (and keep updated!) the [AstroConda](https://astroconda.readthedocs.io) distribution.</div>\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my Jupyter notebook, the inline formatting didn't show (I saw the full URL link, and `Astropy`
, etc.) without new lines and empty lines before and after the div
breaks. However if this is just me and my slightly odd notebook setup then please feel free to ignore this issue (but let me know so I can see about fixing it)
"<div class=\"alert alert-block alert-warning\">**Important:** Before proceeding, please be sure to update your versions of `astropy`, `matplotlib`, and `photutils`, or this notebook may not work properly. Or, if you don't want to handle packages individually, you can always use (and keep updated!) the [AstroConda](https://astroconda.readthedocs.io) distribution.</div>\n", | |
"<div class=\"alert alert-warning\">\n", | |
" \n", | |
"**Important:** Before proceeding, please be sure to update your versions of `astropy`, `matplotlib`, and `photutils`, or this notebook may not work properly. Or, if you don't want to handle packages individually, you can always use (and keep updated!) the [AstroConda](https://astroconda.readthedocs.io) distribution.\n", | |
"\n", | |
"</div>" |
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"As explained in the [previous notebook](01_photutils_background_estimation.ipynb) on background estimation, it is important to **mask** these data, as a large portion of the values are equal to zero. We will mask out the non-data, so all of those pixels that have a value of zero don't interfere with our statistics and analyses of the data. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe this could just do with a slight clarification to repeat why we're masking the dataset, highlighting that it's part of the image that we don't care about that are gone, instead of just data that happen to have a value of zero:
"As explained in the [previous notebook](01_photutils_background_estimation.ipynb) on background estimation, it is important to **mask** these data, as a large portion of the values are equal to zero. We will mask out the non-data, so all of those pixels that have a value of zero don't interfere with our statistics and analyses of the data. " | |
"As explained in the [previous notebook](01_photutils_background_estimation.ipynb) on background estimation, it is important to **mask** these data, as a large portion of the values are equal to zero. We will mask out the non-data portions of the image, so all of those pixels that have a value of zero don't interfere with our statistics and analyses of the data. " |
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"mean, median, std = sigma_clipped_stats(xdf_image.data, sigma=3.0, iters=5, mask=xdf_image.mask)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get an AstropyDeprecationWarning
now, as iters
is being removed in the future and maxiters
should be used instead.
"mean, median, std = sigma_clipped_stats(xdf_image.data, sigma=3.0, iters=5, mask=xdf_image.mask)" | |
"mean, median, std = sigma_clipped_stats(xdf_image.data, sigma=3.0, maxiters=5, mask=xdf_image.mask)" |
"# Plot the data\n", | ||
"norm_image = ImageNormalize(vmin=1e-4, vmax=5e-2, stretch=LogStretch())\n", | ||
"xdf_image_clipped = np.clip(xdf_image, 1e-4, None) # clip to plot with logarithmic stretch\n", | ||
"fitsplot = ax1.imshow(np.ma.masked_where(xdf_image.mask, xdf_image_clipped), norm=norm_image)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not particularly relevant to the quality of the notebook but I was a little bit put off by the large swathes of yellow the figure produces by default, but I couldn't find a nice way to set cmap = plt.get_cmap('viridis'); cmap.set_bad(color='k')
(i.e., use the masked data from xdf_image
to set the plot to black) that interacted with ImageNormalize
.
It would be great if there was a way to get the masked data set to black while still keeping the rest of the figure the same, however; but this is a really minor thing so don't worry too much if you also can't figure out how to make that happen. This comment applies anywhere an imshow
exists, as I suspect it happens a lot and it would be nice if the plots could be a little bit less "in your face" with the colour scheme...
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I ran into some issues with this block where the lack of line breaks revealed the underlying formatting, instead of correctly displaying the formatted data (sorry about the odd formatting of this comment, it seems you can't suggest changes across multiple lines...). Note that I also am unable to get the **
bolding to work within a <h3>
block, but if that's just me again, please ignore this:
csi-stsci-notebooks/photutils/02_photutils_source_detection.ipynb
Lines 317 to 320 in 348bf08
"<div class=\"alert alert-block alert-info\">\n", | |
"<h3>**Exercises:**</h3><br>\n", | |
"Re-run the `DAOStarFinder` algorithm with a smaller threshold (like 5σ), and plot the sources that it finds. Do the same, but with a larger threshold (like 100σ). How did changing the threshold affect the results?\n", | |
"</div>" |
"<div class=\"alert alert-block alert-info\">\n", | |
"<div class=\"alert alert-block alert-info\">\n", | |
"\n", | |
"<h3>Exercises:</h3>\n", | |
"\n", | |
"Re-run the `DAOStarFinder` algorithm with a smaller threshold (like 5σ), and plot the sources that it finds. Do the same, but with a larger threshold (like 100σ). How did changing the threshold affect the results?\n", | |
"\n", | |
"</div>" |
"# Create a segmentation image\n", | ||
"segm = detect_sources(xdf_image.data, threshold, npixels)\n", | ||
"\n", | ||
"print('Found {} sources'.format(segm.max))" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got an AstropyDeprecationWarning
again:
"print('Found {} sources'.format(segm.max))" | |
"print('Found {} sources'.format(segm.max_label))" |
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", | ||
"<h3>**Exercises:**</h3><br>\n", | ||
"Recompute the `SegmentationImage`, but alter the threshold and the minimum number of pixels in a source. How does changing the threshold affect the results? What about changing the number of pixels?\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, for this to render properly for me the **
and the <br>
need removing, and line returns need adding after each line
"Recompute the `SegmentationImage`, but alter the threshold and the minimum number of pixels in a source. How does changing the threshold affect the results? What about changing the number of pixels?\n", | |
"<div class=\"alert alert-block alert-info\">\n", | |
" \n", | |
"<h3>Exercises:</h3>\n", | |
"\n", | |
"Recompute the `SegmentationImage`, but alter the threshold and the minimum number of pixels in a source. How does changing the threshold affect the results? What about changing the number of pixels?\n", | |
"\n", | |
"</div>" |
"catalog = source_properties(xdf_image.data, segm)\n", | ||
"table = catalog.to_table()\n", | ||
"print(table)\n", | ||
"print(table.colnames)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this section could do with a little bit of context about what we've calculated and some small examples of what the user can do with them? It feels a little bit "floaty" currently, just a "hey neat we can do this thing" side note but doesn't give the bigger picture context that makes the rest of the notebook work really well.
"metadata": {}, | ||
"source": [ | ||
"<div class=\"alert alert-block alert-info\">\n", | ||
"<h3>**Exercises:**</h3><br>\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final instance of "I can't render this as currently formatted" I promise! Please add line breaks, remove the **
and the <br>
, etc.
"The `photutils` package provides users with a variety of methods for detecting sources in their data, from familar algorithms such as `DAOFind` and `starfind`, to more complex and customizable image segmentation algorithms. These methods allow for easy creation of a diverse array of apertures that can be used for photometric analysis.\n", | ||
"\n", | ||
"\n", | ||
"** To continue with this `photutils` tutorial, go on to the [aperture photometry](03_photutils_aperture_photometry.ipynb) or [PSF photometry notebook](04_photutils_psf_photometry.ipynb).**" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The space between the **
and To
here causes this to fail to render as bold for me. If the choice was to show **
at each end of the sentence then ignore me, however.
Thanks to you both for these wonderful reviews :) First things first, per the new plan for finishing up these notebooks, I am actually going to close this PR - any future discussion will take place on the spacetelescope/notebooks repo, on the new PR thread: spacetelescope/notebooks#101 I appreciated the very helpful feedback - also so many fun tips and shortcuts, I learn every time I get another review. I did implement all your suggestions, with the following comments:
|
The second notebook tutorial for
photutils
, demonstrating methods for source detection.Some preliminary concerns I have about this notebook:
IRAFStarFinder
andDAOStarFinder
have such different results (because IRAF doesn't allow for elliptical Gaussians) is correct, since I have only the most superficial understanding of how those algorithms work. I would appreciate some feedback from someone who understands them well.@eteq I know you mentioned that Tom might be reviewing these notebooks now - if you know his GitHub account and want to tag/assign him, please do!