-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for a one-dimensional, unbinned, spectrum format #138
Comments
The definition for binned spectral data that you're referring to doesn't exist any longer in the current version of the specs, as far as I can tell (your link is to v0.1, latest is v0.2). |
Dear @lmohrmann, thanks for replying and sorry if it took me a while to come back to this.
Imagine one wants to perform a more sophisticated unbinned likelihood analysis with some methods actually not available in (or yet covered by) You might want also to consider a simple separation of processes: in a first step one can reduce a large amount of observations to spectral data and in a second, separate, step load them for the statistical analysis (this is what is currently possible with binned data). I spoke privately with @adonath about this, and he supports the integration of such formats in Starting from the MAGIC DL3 files in the joint-crab project, I created some prototypical unbinned spectral data in this repository As for the OGIP files I created a file for the ON and OFF data and one for the effective area and energy migration. The IRFs format are just OGIP files with a very fine binning. I'd like to keep iterating with you on this, thanks. |
Thank @cosimoNigro, for the prototyping on this issue. Just a few more thought from my side: For consistency we decided to only work with binned datasets in Gammapy for now, but long-term we would like to support unbinned analyses in Gammapy as well (side comment: basically because it's a requirement for the CTA science tools, personally I think the use-cases are very limited and in practice a fine energy binning will work just as good...). So far we have not decided how unbinned analyses will be integrated into our API. The question is how we can support the unbinned case in our data reduction chain already, so that it can be used for @cosimoNigro use cases. For this I see two options:
Currently I'd prefer option 1, because unbinned analyses are not a priority for Gammapy now and option 2 is much more work. Concerning the data formats: I basically agree with @lmohrmann, that we don't need any specialised data format here, because the building blocks are already defined:
So in summary my proposal would be the following:
What do you think @registerrier and @cosimoNigro? |
And we should add a short paragraph to https://gamma-astro-data-formats.readthedocs.io/en/latest/spectra/ogip/index.html#, explaining this option of writing the event lists. |
Dear @cosimoNigro & @adonath , Is there any difference with fine bins 1D spectra? What do you gain specifically by keeping the events? What I understand from the paper you link, is that you get a better sensitivity by using a background model rather than with the I am not sure another data format is needed then, it is simply that you want to perform a background model fit on your ON and OFF data simultaneously. Right? |
Dear all, This proposal for a format of such type of DL4 data is interesting and deserve indeed some discussion. I would like to replace it into the history of the gadf and its current usage. Nowadays, one can claim that the DL3 format has some maturity, well described and used by more and more experiments. CTA has adopted it for its latest Data Challenge. This does not mean that this format is frozen, evolutions are always possible. Concerning the DL4 format, few discussion have been made. It appears also that some Science Tools has different views regarding what should be a DL4, and thus their formats are different. They are interrogation about what should be stored or not, which kind of intermediate products should be kept. The discussions have not led to a 'common DL4 format', even for the most basic case (see the latest post of @registerrier about the OGIP format). For the DL5 format, I think that the situation is not clear also. The latest publications of the current experiments or CTA do not share or use systematically the same format. It deserves also additional discussions to be more precise and complete all types of products (and Science Cases). To come back to your proposal about such DL4 product, my feeling is that a F2F discussion with developers of Science Tools (hess, magic, veritas, ctools, gammapy) might help to identify all the issues and to propose formats for different use cases. It is hard to discuss the unbinned 1D spectrum format alone, the binned one needs an agreement also, as for the sky maps format. To be planned for a CTA meeting? |
Dear @adonath, @regis and @bkhelifi, thanks for your comments.
certain science cases demand the knowledge of event-wise observables. For example in search for Lorentz Invariance Violation (LIV) you would study the correlation between the energy and the arrival time of individual photons (see Martinez et al. (2009)). LIV searches are performed mostly for AGN and GRB, so in that case (as for any point-like source) you would accept
Also, as @adonath pointed out, the unbinned analyses are a requirements for the CTA science tools. If you are against the definition of an @bkhelifi I understand it would be appropriate to discuss the DL4 and DL5 implementation in person, but I think I am asking for a small addition to the spectral data format that is easy to implement in the actual tools and that can be eventually expanded in the future. Thanks |
Hi, |
Thanks for the clarification @bkhelifi! For some reason I had this in mind incorrectly. I'll study the current status of the requirements once again. |
As mentionned by Bruno, SUSS requirements are not yet on Jama as not fully validated. |
Thanks again @bkhelifi and @cboisson for the clarification. If there is not even a requirement for unbinned analysis in the science tools, then I really think there is no point in talking about special DL4 data formats now. It seems too early. @cosimoNigro To solve this issue here pragmatically I would propose, that we avoid the data format discussion for now. Instead we still add |
@adonath thanks for your offer to implement it in I believe the addition of what I am proposing to the data formats (and not only to
I'd like to have these informations clarified here, as they pertain to a new kind of data we will make possible to store, besides the OGIP. Other higher-level tools (implementing more complex statistical analysis or astrophysical modelling) might want to refer here for their inputs. I also agree these specification can go into a Now what I want to understand is if there is somebody explicitly against this addition to the data formats, if yes: |
Hello all,
the format currently foresees binned spectral data, the so-called OGIP files, in turn based on NASA's definition.
The OGIP format provides all the input necessary for a classical one-dimensional binned spectral analysis.
Me and @javierrico would like to ask whether there might be interest in adding an unbinned one-dimensional spectral format.
Unbinned one-dimensional likelihoods have been (and are being) used - for example - in dark matter searches, see the likelihood definition in Aleksić et al. (2012) and its later usage in Aleksić et al. (2014). Additionally, any point like analysis (e.g. extragalactic) can profit of this technique.
In my opinion it will not be a big effort to introduce such definition in the format.
Prototypical 1D unbinned spectral data should contain:
an ON event list, containing all the events in the source region with their reconstructed energies;
an OFF event list, containing all the events in the background region with their reconstructed energies;
A column, or a HEADER keyword, with the ratio between the ON and OFF region exposures.
[ alternatively to 1) and 2) a single Event list with reconstructed energies and the ON / OFF columns could be used; ]
unbinned effective area as a function of true energy, at the offset of the observation, corrected with the PSF (if full enclosure IRFs are provided);
unbinned energy migration, at the offset of the observation (alternatively resolution and bias as a function of true energy).
From the point of view of the science tools, I believe
gammapy
andctools
already produce data with such informations, they are simply not stored.Such data should be the intermediate products obtained from DL3 right after the signal / background separation in SpectrumExtraction and csphagen, respectively.
They could be dumped in a FITS file before being binned to produce the OGIP files.
Any thoughts?
Thank you.
The text was updated successfully, but these errors were encountered: