From f1a5495e5976c61d191f8c8c26f6f1838e8de5ca Mon Sep 17 00:00:00 2001 From: Mathieu Servillat Date: Tue, 22 Oct 2024 23:46:18 +0200 Subject: [PATCH] typos, rewording and reorganisation of the text --- VOHE-Note.tex | 240 +++++++++++++++++++++++++++++++------------------- 1 file changed, 147 insertions(+), 93 deletions(-) diff --git a/VOHE-Note.tex b/VOHE-Note.tex index dee3253..db93fe7 100644 --- a/VOHE-Note.tex +++ b/VOHE-Note.tex @@ -10,7 +10,43 @@ % interest groups. \ivoagroup{DM} -\author{HE club} +\author{ +Mathieu Servillat and the HE group +} + +% 1st ASOV meeting +% Ada Nebot +% Bruno Khélifi +% Catherine Boisson +% François Bonnarel +% Laurent Michel +% Mathieu Servillat +% Mireille Louys +% Pierre Cristofari + +% 2nd ASOV meeting (including above authors) +% Fabian Schussler +% Ian Evans +% Janet Evans +% Jutta Schnabel +% Karl Kosack +% Mark Cresitello-Dittmar +% Matthias Fuessling +% Régis Terrier + +% Note contributors +% Mathieu Servillat +% François Bonnarel +% Bruno Khélifi +% Laurent Michel +% Mark Cresitello-Dittmar +% Karl Kosack +% Matthias Fuessling +% Ian Evans +% Tess Jaffe + +% IVOA HE group + \editor{Mathieu Servillat} @@ -58,16 +94,16 @@ \section{Introduction} % We should introduce the purpose of the note in distribution and access of event list data products. Science cases should be focused to highlight that. -High Energy (HE) astronomy typically includes X-ray astronomy, gamma-ray astronomy of the GeV range (HE), the TeV range -(very high energy, VHE) up to the ultra high energy (UHE) above 100 TeV, the VHE neutrino -astronomy, and studies of cosmic rays. This domain is now sufficiently developed to provide high level data such as catalogs, images, including full-sky surveys for some missions, and sources properties in the shape of spectra and time series. -Such high level, HE observations have been included in the VO, via data access endpoints provided by observatories or by agencies and indexed in the VO Registry. 
+High Energy (HE) astronomy typically includes X-ray astronomy, gamma-ray astronomy,
+% of the GeV range (HE), the TeV range (very high energy, VHE) up to the ultra high energy (UHE) above 100 TeV, the VHE
+neutrino astronomy, and studies of cosmic rays. This domain is now sufficiently developed to provide high level data such as catalogs, images, including full-sky surveys for some missions, and source properties in the form of spectra and time series.
+Some high level HE observations have been included in the VO, via data access endpoints provided by observatories or by agencies and indexed in the VO Registry.
 %Some high-energy (HE) data is already available via the VO. Images, time series, and spectra may be described with Obscore and access.
-However, after browsing this data, users may want to download lower level data and reapply data reduction steps relevant to their Science objectives. A common scenario is to download HE "event" lists, i.e. lists of detected events on a HE detector, that are expected to be detection of particles (e.g. a HE photon), and the corresponding calibration files, including Instrument Response Functions (IRFs). The findability and accessibility of these data via the VO is the focus of this note.
+However, after browsing this data, users may want to download lower level data and reapply data reduction steps relevant to their science objectives. A common scenario is to download HE "event" lists, i.e. lists of detected events on a HE detector, each expected to be the detection of a particle (e.g. a HE photon, or a neutrino), and the corresponding calibration files, including Instrument Response Functions (IRFs). The findability and accessibility of these data via the VO is the focus of this note.

-We first report typical use cases for data access and analysis of data from current HE observatories. From those use
-cases, we note that some existing IVOA recommendations are of interest to the domain.
They should be further explored
+We report typical use cases for data access and analysis of data from current HE observatories. From those use
+cases, we note that some existing IVOA recommendations are of interest to the domain. They should be further explored
and tested by HE observatories. We then discuss how standards could evolve to better integrate specific aspects of HE
data, and if new standards should be developed.

@@ -75,14 +111,14 @@ \subsection{Objectives of the document}

The main objective of the document is to analyse how HE data can be better integrated into the VO.

-We first identify and expose the specificities of HE data from several HE observatories. Then we intend to illustrate how HE data is or can be handled using current IVOA standards. Finally, we explore several topics that could lead to HE specific recommandations.
+We first identify and expose the specificities of HE data as provided by several HE observatories. Then we intend to illustrate how HE data is or can be handled using current IVOA standards. Finally, we explore several topics that could lead to HE specific recommendations.

A related objective is to provide a context and a list of topics to be further discussed within the IVOA by a dedicated HE Interest Group (HEIG).

\subsection{Scope of the document}

-This document mainly focuses on HE data discovery through the VO, with the identification of common use cases in the HE astrophysics domain, which provides an insight of the specific metadata to be expose through the VO for HE data.
+This document mainly focuses on HE data discovery through the VO, with the identification of common use cases in the HE astrophysics domain, which provides an insight into the specific metadata to be exposed through the VO for HE data.

Some existing IVOA recommendations are discussed in this document within the HE context and will be studied in depth in the HEIG.
@@ -113,8 +149,8 @@ \section{High Energy observatories and experiments}

%XMM use case scenario
%Attached data? DataLink?

-There are various observatories, either from ground, space or deep-sea based, that distribute high-energy data with
-different level of involvement in the VO. We list here the observatories currently represented in the VO HE group.
+There are various observatories, either ground, space or deep-sea based, that distribute high-energy data with
+different levels of involvement in the VO. We list here the observatories currently represented in the VO HE group.
There are also other observatories that are connected to the VO in some way, and may join the group discussions at
IVOA.

@@ -123,9 +159,9 @@ \subsection{Gamma-ray programs}

\subsubsection{H.E.S.S}
\label{sec:hess}

-The High Energy Steresopic System (H.E.S.S.) experiment is an array of Imaging Atmospheric Cherenkov Telescopes (IACT)
+The High Energy Stereoscopic System (H.E.S.S.) experiment is an array of Imaging Atmospheric Cherenkov Telescopes (IACT)
located in Namibia that investigates cosmic very high energy (VHE) gamma rays in the energy range from 10s of GeV to
-100 of TeV. It is constituted of four telescopes officially inaugurated in 2004, and a much larger fifth telescope
+100s of TeV. It comprises four telescopes officially inaugurated in 2004, and a much larger fifth telescope
operational since 2012, extending the energy coverage towards lower energies and further improving sensitivity.

The H.E.S.S. collaboration operates the telescopes as a private experiment and publishes mainly high level data,
@@ -142,7 +178,7 @@ \subsubsection{H.E.S.S}
%% Need to describe the IRFs like for Chandra?
In September 2018, the H.E.S.S.
collaboration released, for the first and only time, a small subset of its
-archival data using the GADF format (see~\ref{sec:GADF}) serialized into the Flexible Image Transport System (FITS) format,
+archival data using the GADF format (see~\ref{sec:GADF}) serialised into the Flexible Image Transport System (FITS) format,
an open file format widely used in astronomy. The release consists of Cherenkov event-lists and IRFs for observations of
various well-known gamma-ray sources \citep{hess-zenodo.1421098}.

@@ -217,22 +253,22 @@ \subsubsection{Chandra}\label{sec:chandra}

% According to https://heasarc.gsfc.nasa.gov/docs/heasarc/caldb/docs/memos/cal_gen_92_002/cal_gen_92_002.html#tth_sEc2.1,
% RMF, ARF and PSF do not depend on spectral models

-Finally, the CXC distributes the CIAO data analysis package to allow users to recalibrate and analyze their data. A key
+Finally, the CXC distributes the CIAO data analysis package to allow users to recalibrate and analyse their data. A key
aspect of CIAO is to provide users the ability to create instrument responses (RMFs, ARFs, PSFs, etc.) for their
-observations. The Sherpa modeling and fitting package supports N-dimensional model fitting and optimization in Python,
+observations. The Sherpa modeling and fitting package supports N-dimensional model fitting and optimisation in Python,
and supports advanced Bayesian Markov chain Monte Carlo analyses.

\subsubsection{XMM-Newton}

The European Space Agency's (ESA) X-ray Multi-Mirror Mission (XMM-Newton) \footnote{https://www.cosmos.esa.int/web/xmm-newton}
was launched by an Ariane 504 on December 10th 1999. XMM-Newton is ESA's second cornerstone of the Horizon 2000 Science Programme.
-It carries 3 high throughput X-ray telescopes with an unprecedented effective area, 2 reflexion grating spectrometers and an optical monitor.
+It carries 3 high throughput X-ray telescopes with an unprecedented effective area, 2 reflection grating spectrometers and an optical monitor.
The large collecting area and ability to make long uninterrupted exposures provide highly sensitive observations. The XMM-Newton mission is helping scientists to solve a number of cosmic mysteries, ranging from the enigmatic black holes to the origins of the Universe itself. Observing time on XMM-Newton is being made available to the scientific community, applying for observational periods on a competitive basis. -One of the mission's ground segment modules, the SSC /footnote{http://xmmssc.irap.omp.eu/}, is in charge of maximizing the scientific return of +One of the mission's ground segment modules, the SSC \footnote{http://xmmssc.irap.omp.eu/}, is in charge of maximising the scientific return of this space observatory by exhaustively analyzing the content of the instruments' fields of view. During the development phase (1996-1999), the SSC, in collaboration with the SOC (ESAC), designed and produced the scientific analysis software (SAS). @@ -241,31 +277,31 @@ \subsubsection{XMM-Newton} The general pipeline is operated as ESAC (Villafranca, Spain) since 2012, except for the part concerning cross-correlation with astronomical archives which runs in Strasbourg. The information thus produced is intended for the guest observer and, after a proprietary period of one year, -for the international community. +for the international community. In parallel, the SSC regularly compiles an exhaustive catalog of all X-ray sources detected by EPIC cameras. -The SSC validates these catalogs, enriches them with multi-wavelength data and exploits them in several scientific programs. +The SSC validates these catalogs, enriches them with multi-wavelength data and exploits them in several scientific programs. 
The XMM catalog is published through various WEB applications: XSA \footnote{https://www.cosmos.esa.int/web/xmm-newton/xsa}, XCatDB \footnote{https://xcatdb.unistra.fr/4xmm}, IRAP \footnote{http://xmm-catalog.irap.omp.eu/} and HEASARCH \footnote{http://heasarc.gsfc.nasa.gov/db-perl/W3Browse/w3browse.pl}. -It is also published in the VO, mainly as TAP services. -It is to be noted that the TAP service operated in Strasbourg (\url{https://xcatdb.unistra.fr/xtapdb} - to be deployed in 10/2024) returns responses where data is mapped on the MANGO model with MIVOT. +It is also published in the VO, mainly as TAP services. +It is to be noted that the TAP service operated in Strasbourg (\url{https://xcatdb.unistra.fr/xtapdb} - to be deployed in 10/2024) returns responses where data is mapped on the MANGO model with MIVOT (see section \ref{sec:vorecs}) -\todo[inline]{To be validated by ADA.} +%\todo[inline]{To be validated by ADA.} \subsubsection{SVOM} -SVOM (Space-based multi-band astronomical Variable Objects Monitor) \footnote{https://www.svom.eu/en/home/} +SVOM (Space-based multi-band astronomical Variable Objects Monitor) \footnote{https://www.svom.eu/en/home/} is a Sino-French mission dedicated -to the study of the transient high-energy sky, and in particular to the detection, localization and +to the study of the transient high-energy sky, and in particular to the detection, localisation and study of Gamma Ray Bursts (GRBs). Gamma-ray bursts are sudden, intense flashes of X-ray and gamma-ray light. -They are associated with the cataclysmic formation of black holes, either by the merger of two compact stars +They are associated with the cataclysmic formation of black holes, either by the merger of two compact stars (neutron star or black hole) or by the sudden explosion of a massive star, twenty to one hundred times the mass of our Sun. The birth of a black hole is accompanied by the ejection of jets of matter that reach speeds close to the speed of light. 
These jets of matter then decelerate in the circumstellar medium, sweeping away everything in their path. -Gamma-ray bursts can be observed at the very edge of the universe, acting as lighthouses that illuminate +Gamma-ray bursts can be observed at the very edge of the universe, acting as lighthouses that illuminate the dark ages of its creation. Although they have been studied extensively over the past fifteen years, gamma-ray bursts are still poorly understood phenomena. To better understand them, China and France have decided to join forces with the SVOM satellite, which is specifically dedicated to the study of gamma-ray bursts. @@ -306,6 +342,7 @@ \subsection{KM3Net and neutrino detection} expectations for actual neutrino observation, especially correctly interpreting the observation time interval and re-weighting and limiting any probabilistic measures to a dedicated study must be facilitated for proper use of neutrino data. + \section{Common practices in the High Energy community} \label{sec:vhespec} @@ -332,17 +369,18 @@ \subsubsection{Data levels}\label{sec:datalevels} \item[1] An event-list with calibrated temporal and spatial characteristics, e.g. sky coordinates for a given epoch, event arrival time with time reference, and a proxy for particle energy. \item[2] Binned and/or filtered event list suitable for preparation of science images, spectra or light-curves. For some instruments, corresponding instrument responses associated with the event-list, calculated but not yet applied (e.g, exposure maps, sensitivity maps, spectral responses). \item[3] Calibrated maps, or spectral energy distributions for a source, or light-curves in physical units, or adjusted source models. - \item[4] An additional data level may correspond to catalogs, e.g. a source catalog pointing to several data products (e.g. collection of high level products) with each one corresponding to a source, catalog of source models generated with an uniform analyse. 
\end{itemize}

+An additional higher data level may correspond to catalogs, e.g. a source catalog pointing to several data products (e.g. a collection of high level products), each one corresponding to a source, or a catalog of source models generated with a uniform analysis.
+
However, the definitions of these data levels can vary significantly from facility to facility. For example, in the VHE
Cherenkov astronomy domain (e.g. CTA), the data levels listed above are labelled
DL3\footnote{lower level data (DL0--DL2), that are specific to the used instrumentation (IACT, WCD), are reconstructed and filtered, which
-constitute the events lists called DL3.} to DL5. However, for Chandra X-ray data, the first two levels correspond to L1 and L2 data products (excluding the responses), while transmission-grating data products are designated L1.5 and source catalog and associated data products are all designated L3.
+constitute the event lists called DL3.} to DL5. For Chandra X-ray data, the first two levels correspond to L1 and L2 data products (excluding the responses), while transmission-grating data products are designated L1.5 and source catalog and associated data products are all designated L3.

\subsubsection{Background signal}

-Observations in HE may contain a high background component, that may be due to instrument noises, or to unresolved astrophysical sources, emission from extended regions or other terrestrial sources producing particles similar to the signal. The characterization and estimation of this background may be particularly important to then apply corrections during the analysis of a source signal.
+Observations in HE may contain a high background component, which may be due to instrument noise, unresolved astrophysical sources, emission from extended regions, or other terrestrial sources producing particles similar to the signal.
The characterisation and estimation of this background may be particularly important to then apply corrections during the analysis of a source signal. In the VHE domain with the IACT, WCD and neutrino techniques, the main source of background at the DL3 level is created by cosmic-ray induced events. The case of unresolved astrophysical sources, emission from extended regions are treated as models of gamma-ray or neutrino emission. @@ -353,7 +391,7 @@ \subsubsection{Time intervals} Depending on the stability of the instruments and observing conditions, a HE observation can be decomposed into several intervals of time that will be further analysed. -For example, Stable Time Intervals (STI) are defined in Cherenkov astronomy to characterize periods of time during which the instrument response is stable. In the X-ray domain, Good Time Intervals (GTI) are computed to exclude time periods where data are missing or invalid, and may be used to reject periods impacted by high radiation, e.g. due to space weather. In contrast, for neutrino physics, relevant observation periods can cover up to several years due to the low statistics of the expected signal and a continuous observational coverage of the full field of view. +For example, Stable Time Intervals (STI) are defined in Cherenkov astronomy to characterise periods of time during which the instrument response is stable. In the X-ray domain, Good Time Intervals (GTI) are computed to exclude time periods where data are missing or invalid, and may be used to reject periods impacted by high radiation, e.g. due to space weather. In contrast, for neutrino physics, relevant observation periods can cover up to several years due to the low statistics of the expected signal and a continuous observational coverage of the full field of view. 
\subsubsection{Instrument Response Functions} @@ -364,17 +402,12 @@ \subsubsection{Instrument Response Functions} estimation of the former (such as the real flux of particles arriving at the instrument, the spectral distribution of the particle flux, and the temporal variability and morphology of the source). -Note that the small number of particles -detected in many types of HE observations (i.e. within a Poisson regime) and the fact that the IRFs may not be directly invertible, -techniques such as forward-folding fitting \citep{mattox:1996} are needed to estimate the physical properties of the -source from the observables. - The instrumental responses typically vary with the true energy of the event, the arrival direction of the event into the detector. A further complication of ground-based detectors like IACTs and WCTs is that the instrumental responses also vary with: \begin{itemize} \item The horizontal coordinates of the atmosphere, i.e. the response to a photon at low elevation is different from that at zenith due to a larger air column density, and different azimuths are affected by different magnetic field strengths and directions that modify the air-shower properties. \item The atmosphere density, which can have an effect on the response that changes throughout a year, depending on the site of observation. - \item The brightness of the sky (for IACTs), i.e. the response is worse when the moon is up, or when there is a strong nigh-sky-background level from e.g. the Milky Way or Zodiacal light. + \item The brightness of the sky (for IACTs), i.e. the response is worse when the moon is up, or when there is a strong night-sky-background level from e.g. the Milky Way or Zodiacal light. \end{itemize} Since these are not aligned with a sky coordinate system, field-rotation during an observation must also be taken into account. 
Therefore the treatment of the temporal variation of IRFs is important, and is often taken into account in analysis by
averaging over some short time period, such as the duration of the observation, or intervals within.

@@ -385,17 +418,28 @@ \subsubsection{Granularity of data products}
for the observed and/or estimated physical parameters (e.g. arrival time, position on detector or in the sky, energy or
pulse height, and additional properties such as errors or flags that are project-dependent).

-The list of columns present in the event-list is for example described in the data format in use in the HE domain,
-such as OGIP or GADF as introduced below. The data formats in use generally describe the event-list data together
+The list of columns in the event-list is for example defined in the data format,
+such as OGIP or GADF as introduced further below (\ref{sec:data_formats}). The data formats in use generally describe the event-list data together
with the IRFs (Effective Area, Energy Dispersion, Point Spread Function, Background) and other relevant information, such
-as: Stable or Good Time Interval, dead time, ...
+as: Stable and/or Good Time Intervals, dead time, etc.
+
+Such time intervals are used to define the granularity of the data products; e.g., it may be practical to list together all events that will be analysed with the same IRFs. In H.E.S.S., such an event-list corresponds to a run of 30 min of data acquisition.
+
+Where feasible, the efficient granularity for distributing HE data products seems to be the full combination of data (event-list) and associated IRFs, packed or linked together, with further calibration files, so that the package becomes self-describing.
+
+%It seems appropriate to distribute the metadata in the VO ecosystem together with a link to the data file in community format for finer analysis.
+%In order to allow for multi-wavelength data discovery of HE data products and compare observations across different regimes, + + +\subsection{Statistical challenges} + +In order to produce advanced astrophysics data products such as light curves or spectra, assumptions +about the noise, the source morphology and its expected energy distribution must be introduced. This is one of the main +drivers for enabling a full and well described access to event-list data, as HE scientific analyses generally start at this data level. + +\subsubsection{Low count statistics} -In order to allow for multi-wavelength data discovery of HE data products and compare observations across different regimes, -it seems appropriate to distribute the metadata in the VO ecosystem together with an access link to the data file in -community format for finer analysis. Where feasible, the efficient granularity for distributing HE data products seems -to be the full combination of data and associated IRFs. -\subsection{Work flow specificities} \subsubsection{Event selection} @@ -403,11 +447,16 @@ \subsubsection{Event selection} analysis use case, i.e. the source targeted or the science objectives. The selection can be performed on the event characteristics, e.g. time, energy or more specific indicators (patterns, shape, IRFs properties, ...). -\subsubsection{Assumptions and probabilistic approach} +\subsubsection{Event binning} + + +\subsubsection{The unfolding problem} + +Due to the small number of particles +detected in many types of HE observations (i.e. within a Poisson regime) and the fact that the IRFs may not be directly invertible, +techniques such as forward-folding fitting \citep{mattox:1996} are needed to estimate the physical properties of the +source from the observables. -In order to produce advanced astrophysics data products like light curves, spectra or source morphology, assumptions -about the noise, the source morphology and its expected energy distribution must be introduced. 
This is one of the main -driver for enabling a full and well described access to event-list data, as scientific analyses generally start at this data level. \subsection{Data formats} @@ -415,13 +464,13 @@ \subsection{Data formats} \subsubsection{{OGIP}}\label{sec:ogip} -NASA's HEASARC FITS Working Group was part of the Office of Guest Investigator Programs, or OGIP, and created in the 1990's the multi-mission standards for the format of FITS data files in NASA high-energy astrophysics. Those so-called OGIP recommendations\footnote{\url{https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/ofwg_recomm.html}} include standards on keyword usage in metadata, on the storage of spatial, temporal, and spectral (energy) information, and representation of response functions, etc. These standards predate the IVOA but include such VO concepts as data models, vocabularies, provenance, as well as the corresponding FITS serialization specification. +NASA's HEASARC FITS Working Group was part of the Office of Guest Investigator Programs, or OGIP, and created in the 1990's the multi-mission standards for the format of FITS data files in NASA high-energy astrophysics. Those so-called OGIP recommendations\footnote{\url{https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/ofwg_recomm.html}} include standards on keyword usage in metadata, on the storage of spatial, temporal, and spectral (energy) information, and representation of response functions, etc. These standards predate the IVOA but include such VO concepts as data models, vocabularies, provenance, as well as the corresponding FITS serialisation specification. The purpose of these standards was to allow all mission data archived by the HEASARC to be stored in the same data format and be readable by the same software tools. \S~\ref{sec:chandra} above, for example, describes the Chandra mission products, but many other projects do so as well. 
Because of the OGIP standards, the same software tools can be used on all of the HE mission data that follow them. There are now some thirty plus different mission datasets archived by NASA following -these standards and different software tools that can analyze any of them. +these standards and different software tools that can analyse any of them. Now that the IVOA is defining data models for spectra and time series, we should be careful to include the existing OGIP standards as special cases of what are developed to be more general standards for all of astronomy. Standards about @@ -442,8 +491,8 @@ \subsubsection{GADF and VODF} model, to cover use cases of both VHE gamma-ray and neutrino astronomy, and to provide more support for validation and versioning. VODF will provide a standard set of file formats for data starting at the reconstructed event level (DL3, i.e. first item in the section \ref{sec:datalevels}) as well as higher-level products (i.e. sky images, light curves, and spectra) -and source catalogues (items 2-4 in section \ref{sec:datalevels}), as well as N-dimensional binned data cubes. With these -standards, common science tools can be used to analyze data from multiple high-energy instruments, including +and source catalogues (see section \ref{sec:datalevels}), as well as N-dimensional binned data cubes. With these +standards, common science tools can be used to analyse data from multiple high-energy instruments, including facilitating the ability to do combined likelihood fits of models across a wide energy range directly from events or binned products. VODF aims to follow or be compatible with existing IVOA standards as much as possible. @@ -463,13 +512,13 @@ \subsection{Tools for data extraction and visualisation} %??? naïve question : what would be the benefit to convert science ready event table data to VOTable? %Would TOPcat, Aladin, etc. allow more preview steps , xmatch, multi-wavelength analysis ? 
-High energy data are typically multi-dimensional ({\em e.g.\/}, 2 spatial dimensions, time, energy, possibly polarization) and may be complex and diverse at lower levels. Therefore one may commonly find specific tools to process the data for a given facility, {\em e.g.\/}, CIAO for Chandra, SAS for XMM-Newton, or Gammapy for gamma-ray data, with a particular focus on Cherenkov data as foreseen by CTA. +High energy data are typically multi-dimensional ({\em e.g.\/}, 2 spatial dimensions, time, energy, possibly polarisation) and may be complex and diverse at lower levels. Therefore one may commonly find specific tools to process the data for a given facility, {\em e.g.\/}, CIAO for Chandra, SAS for XMM-Newton, or Gammapy for gamma-ray data, with a particular focus on Cherenkov data as foreseen by CTA. However, many tools in a high energy astrophysics data analysis package may perform common tasks in a mission-independent way and can work well with similar data from other facilities. For example, one commonly needs to be able to filter and project the multi-dimensional data to select specific data subsets with manageable sizes and eliminate extraneous data. Some tool sets include built-in generic filtering and binning capabilities so that a general purpose region filtering and binning syntax is available to the end user. A high energy astrophysics data analysis package typically includes tools that apply or re-apply instrumental calibrations to the data, and as described above these may be observatory-specific. More general algorithms ({\em e.g.\/}, source detection) and utility tools ({\em e.g.\/}, extract an observed spectrum from a region surrounding a source) are applied to calibrated data to extract data subsets that can then be fed into modeling tools ({\em e.g.\/}, Xspec, Sherpa, or Gammapy) together with the appropriate instrumental responses (IRFs, or RMFs and ARFs) to derive physical quantities. 
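The generic filter-and-bin step described above can be sketched as follows; the region radius, energy grid and event columns are illustrative assumptions, not any particular mission's convention:

```python
import numpy as np

rng = np.random.default_rng(42)

# Mock calibrated event list: sky offsets from a target (deg) and energies (TeV).
n_events = 1000
dx = rng.normal(0.0, 0.3, n_events)
dy = rng.normal(0.0, 0.3, n_events)
energy = rng.lognormal(mean=0.0, sigma=1.0, size=n_events)

# Generic region filter: keep events within 0.5 deg of the source position.
in_region = np.hypot(dx, dy) < 0.5

# Generic binning: counts spectrum in logarithmic energy bins.
edges = np.logspace(-1, 2, 16)  # 0.1 to 100 TeV, 15 bins
counts, _ = np.histogram(energy[in_region], bins=edges)

# The binned counts, together with the matching instrumental responses,
# are what modeling tools such as Sherpa or Gammapy would then fit.
print(counts.shape, counts.sum())
```

The same two primitives (a region filter and a histogram over one event column) underlie the built-in filtering and binning syntax that the mission packages expose to end users.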
Since instrumental responses are often designed to be compliant with widely adopted standards, the tools that apply these responses in many cases will interoperate with other datasets that use the same standards. -Most data analysis packages provide a visualization capability for viewing images, interacting with astronomy databases, overlaying data, or interacting via SAMP to tie several application functions together {\em (e.g.\/}, TopCat, Aladin, ds9, ESASky, Firefly) to simultaneously support both analysis and visualization of the data at hand. In addition, many packages offer a scripting interface ({\em e.g.\/}, Python, Jupyter notebooks) that enable customized job creation to perform turn-key analysis or process bulk data in batch mode. +Most data analysis packages provide a visualisation capability for viewing images, interacting with astronomy databases, overlaying data, or interacting via SAMP to tie several application functions together {\em (e.g.\/}, TopCat, Aladin, ds9, ESASky, Firefly) to simultaneously support both analysis and visualisation of the data at hand. In addition, many packages offer a scripting interface ({\em e.g.\/}, Python, Jupyter notebooks) that enable customised job creation to perform turn-key analysis or process bulk data in batch mode. To allow users of data to use pre-existing tools, often packages will support file I/O using several formats, for example, including FITS images and binary tables (for event files), VO formats, and several ASCII representations ({\em e.g.\/}, space, comma, or tab-separated columns). @@ -519,21 +568,20 @@ \subsection{UC4: Multi-wavelength and multi-messenger science} cases for the development of the VO. One use case is associated to independent analyses of the multi-wavelength and multi-messenger data. HE data are -analyzed to produce DL5/L3 quantities from DL3/L1 stored in the VO. And the multi-wavelength and multi-messenger +analysed to produce DL5/L3 quantities from DL3/L1 stored in the VO. 
And the multi-wavelength and multi-messenger
DL5/L3 data stored are retrieved into the VO and associated to realise astrophysical interpretations.

The other growing use case is associated to joint statistical analyses of multi-instrument data at different levels
(DL3/L1 and DL5/L3) by adapted open science analysis tools.

For both use cases, any type of data should be findable on the VO and retrievable. And the data should have a
-standardized open format (OGIP, GADF, VODF).
+standardised open format (OGIP, GADF, VODF).

Such use case is already in use with small data sets shared by VHE experiments. In
\citep{2019A&A...625A..10N, 2022A&A...667A..36A}, groups of astronomers working on the Gammapy library had successfully
-analyzed DL3 data taken on the Crab nebula by different facilities (MAGIC, H.E.S.S., FACT, VERITAS, Fermi-LAT and HAWC).
+analysed DL3 data taken on the Crab nebula by different facilities (MAGIC, H.E.S.S., FACT, VERITAS, Fermi-LAT and HAWC).
A real statistical joint analysis has been performed to derive an emitting model of the Crab pulsar wind nebula over
-than five decades in energy. Such analysis type can be now retrieved in the literacy. It can be found also joint analyses
-using X-ray and VHE data \citep{giunti2022}. A proof of concept of joint analysis of VHE gamma-ray and VHE neutrino,
+than five decades in energy. Such analyses can now be found in the literature. One can also find joint analyses using X-ray and VHE data \citep{giunti2022}. A proof of concept of joint analysis of VHE gamma-ray and VHE neutrino,
using simulated data, has been also published \citep{unbehaun2024}.

\subsection{UC5: Extended source searches}

@@ -557,7 +605,7 @@ \subsection{UC5: Extended source searches}
%This was proposed in \citep{2019A&A...625A..10N} by a group of HE astronomers of various HE facilities.
%%This work used event-list data products as an input from different facilities (MAGIC, H.E.S.S., FACT, VERITAS, etc...).
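The joint statistical analysis idea behind this use case can be illustrated with a minimal sketch (this is not the actual Gammapy implementation): each instrument contributes its own Poisson log-likelihood for its binned counts, and the sum is maximised over a shared source parameter. All numbers and names below are invented for illustration.

```python
import numpy as np

def poisson_loglike(counts, expected):
    # Poisson log-likelihood, up to a model-independent constant (log k! dropped).
    expected = np.asarray(expected, dtype=float)
    return float(np.sum(counts * np.log(expected) - expected))

# Illustrative binned counts from two instruments observing the same source,
# and "exposure" factors converting a shared flux into predicted counts.
counts_a = np.array([12, 9, 4])       # instrument A, 3 energy bins
counts_b = np.array([30, 18])         # instrument B, 2 energy bins
expos_a = np.array([10.0, 8.0, 5.0])  # predicted counts per unit flux
expos_b = np.array([25.0, 16.0])

def joint_loglike(flux):
    # A joint fit simply sums the per-instrument log-likelihoods
    # evaluated with the same shared source flux.
    return (poisson_loglike(counts_a, flux * expos_a)
            + poisson_loglike(counts_b, flux * expos_b))

# Crude grid scan for the best-fit shared flux.
grid = np.linspace(0.5, 2.0, 301)
best_flux = grid[np.argmax([joint_loglike(f) for f in grid])]
```

Real joint analyses add instrument responses, backgrounds and nuisance parameters, but the combination step is structurally this sum of independent likelihood terms, which is why a common DL3 format makes multi-instrument fits practical.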
data for the Crab Nebula computed from the Maximum likelihood functions of each event depending on the IRFs properties.
%In this work, the authors implemented a prototypical data format (GADF) for a small set of MAGIC, VERITAS, FACT, and
-%H.E.S.S. Crab nebula observations, and they analyzed them with the open-source Gammapy software package. By combining
+%H.E.S.S. Crab nebula observations, and they analysed them with the open-source Gammapy software package. By combining
%data from Fermi-LAT, and from four of the currently operating imaging atmospheric Cherenkov telescopes, they produced a
%joint maximum likelihood fit of the Crab nebula spectrum.
%
@@ -568,23 +616,26 @@ \subsection{UC5: Extended source searches}
\section{IVOA standards of interest for HE}

\subsection{IVOA Recommendations}
+\label{sec:vorecs}

\subsubsection{ObsCore and TAP}

-Event-list datasets can be described in ObsCore using a dataproduct\_type set to "event". However, this is not widely used in current services, and we observe only a few services with event-list datasets declared in the VO Registry, and mainly the H.E.S.S. public data release (see \ref{sec:hess}).
+Event-list datasets can be described in ObsCore using a dataproduct\_type set to "event", and distributed via a TAP service. However, this is not widely used in current services, and we observe only a few services with event-list datasets declared in the VO Registry, mainly the H.E.S.S. public data release (see \ref{sec:hess}).

As services based on the Table Access Protocol \citep{2019ivoa.spec.0927D} and ObsCore are well developed within the VO, it would be a straightforward option to discover HE event-list datasets, as well as multi-wavelength and multi-messenger associated data.

-Here is the evaluation of the ObsCore metadata for distributing high energy data set, some features being re-usable as such, and some other features requested for addition or re-interpretation.
+Extensions of ObsCore have been proposed for other astronomy domains (radio, time domain), and a similar extension would also be relevant for the HE domain. The ObsCore description of HE datasets is further discussed in section \ref{sec:obscore_he}.
+
+%Here is the evaluation of the ObsCore metadata for distributing high energy data set, some features being re-usable as such, and some other features requested for addition or re-interpretation.

\subsubsection{DataLink}
%\todo[inline]{To be completed (e.g. François)} proposed below by FB (2024-01-31)

-DataLink specification \citep{2023ivoa.spec.1215B} defines a \{links\} endpoint providing the possibility to link several
+The DataLink specification \citep{2023ivoa.spec.1215B} defines a \{links\} endpoint providing the possibility to link several
access items to each row of the main response table. These links are described and stored in a second
-table. In the case of an ObsCore response each dataset can be linked this way (via the via the access\_url
+table. In the case of an ObsCore response each dataset can be linked this way (via the access\_url
FIELD content) to previews, documentation pages, calibration data as well as to the dataset itself.
Some dynamical links to web services may also be provided. In that case the service input parameters are
described with the help of a "service descriptor" feature as described in the same DataLink specification.

@@ -602,27 +653,29 @@ \subsubsection{MOCs}

\subsubsection{MIVOT}

-Model Instances in VOTables (MIVOT \cite{2023ivoa.spec.0620M}) defines a syntax to map VOTable data to any model serialized in VO-DML.
+Model Instances in VOTables (MIVOT \cite{2023ivoa.spec.0620M}) defines a syntax to map VOTable data to any model serialised in VO-DML.
The annotation operates as a bridge between the data and the model.
It associates the column/param metadata from the VOTable to the data model elements (class, attributes, types, etc.) [...].
-The data model elements are grouped in an independent annotation block complying with the MIVOT XML syntax.
-his annotation block is added as an extra resource element at the top of the VOTable result resource.
+The data model elements are grouped in an independent annotation block complying with the MIVOT XML syntax.
+This annotation block is added as an extra resource element at the top of the VOTable result resource.
The MIVOT syntax allows to describe a data structure as a hierarchy of classes.
It is also able to represent relations and composition between them. It can also build up data model objects by
aggregating instances from different tables of the VOTable.
+
In the case of HE data, this annotation pattern, used together with the MANGO model, will help to make machine-readable quantities that are currently not considered in the VO, such as the hardness ratio, the energy bands, the flags associated with measurements or extended sources.

\subsubsection{Provenance}

-Provenance information of VHE data product is a crucial information to provide, especially given the complexity of the data preparation and analysis workflow in the VHE domain. Such complexity comes from the specificities of the VHE data as exposed in sections \ref{sec:vhespec}.
+Provenance information of a VHE data product is crucial to provide, especially given the complexity of the data preparation and analysis workflow in the VHE domain. Such complexity comes from the specificities of the VHE data as exposed in section \ref{sec:vhespec}.
The development of the IVOA Provenance Data Model \citep{2020ivoa.spec.0411S} has been conducted with those use cases in mind. The Provenance Data Model proposes to structure this information as activities and entities (as in the W3C PROV recommendation), and adds the concepts of descriptions and configuration of each step, so that the complexity of provenance of VHE data can be exposed.
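The activity/entity structure promoted by the Provenance Data Model can be illustrated with a minimal, hypothetical sketch: datasets are entities, processing steps are activities, and the `used` / `generated` relations (cf. W3C PROV's `used` and `wasGeneratedBy`) let one trace a DL3 product back to its inputs. All identifiers and type names below are invented.

```python
# Entities (datasets) and activities (processing steps) as plain records.
# Identifiers and types are hypothetical, for illustration only.
entities = {
    "dl0:run123": {"type": "raw-telescope-data"},
    "dl3:run123-events": {"type": "event-list"},
    "dl3:run123-irf": {"type": "irf"},
}
activities = {
    "act:calibrate-reconstruct": {
        "description": "DL0->DL3 reconstruction",  # cf. ActivityDescription
        "used": ["dl0:run123"],
        "generated": ["dl3:run123-events", "dl3:run123-irf"],
    },
}

def derived_from(entity_id):
    """Trace one step back: which entities were used to generate this one?"""
    for act in activities.values():
        if entity_id in act["generated"]:
            return act["used"]
    return []
```

Chaining `derived_from` over several activities reproduces the backward provenance walk that the model is designed to support; the real model adds agents, descriptions and per-step configuration on top of this skeleton.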
\subsubsection{Measurements}

The Measurements model \citep{2022ivoa.spec.1004R} describes measured or determined astronomical data and their associated errors.
-This model is highly compatible with the primary measured properties of High Energy data ( Time, Spatial Coordinates, Energy ).
+This model is highly compatible with the primary measured properties of High Energy data (Time, Spatial Coordinates, Energy).
+
However, since HE data is typically very sparse, derived properties are often expressed as probability distributions, which are not well represented by the IVOA models. This is one area where input from the HE community can help to improve the IVOA models to better represent HE data.

@@ -641,29 +694,28 @@ \subsubsection{Dataset}

\subsubsection{Cube}

-The Cube model\footnote{https://www.ivoa.net/documents/CubeDM} describes multi-dimensional sparse data cubes and images. This model is specifically designed to
-represent Event list data and provides the framework for specializing to represent data products such as Spectra and Time Series
+The Cube model\footnote{https://www.ivoa.net/documents/CubeDM} describes multi-dimensional sparse data cubes and images. This submodel is specifically designed to
+represent Event list data and provides the framework for specialising to represent data products such as Spectra and Time Series
as slices of a multi-dimensional cube. The image modeling provides the structure necessary to represent important HE image products.

-\subsubsection{Mango}
-MANGO is a model (draft: \footnote{https://github.com/ivoa-std/MANGO}) that has been developed to reveal
+\subsubsection{MANGO}
+MANGO is a draft model\footnote{https://github.com/ivoa-std/MANGO} that has been developed to reveal
and describe complex quantities that are usually distributed in query response tables.
The use cases on which MANGO is built were collected in 2019 from different scientific fields, including HE.
-The model focuses on the case of the epoch propagation, the state description
- and photometry.
- Some features of MANGO are useful for the HE domain:
-
+The model focuses on the cases of epoch propagation, state description and photometry.
+
+Some features of MANGO are useful for the HE domain:
% \begin{itemize}[noitemsep,topsep=0pt,parsep=0pt,partopsep=0pt] << these require the enumitem package?
\begin{itemize}
  \item Hardness ratio support
  \item Energy band description
  \item Machine-readable description of state values
  \item Ability to group quantities (e.g., position with detection likelihood)
-  \item MANGO instance association (e.g., source with detections)
-\end{itemize}
+  \item MANGO instance association (e.g., source with detections)
+\end{itemize}

-\section{Topics for discussions in an Interest Group}
+\section{Topics for discussion in an Interest Group}

\subsection{Definition of a HE event in the VO}
\label{sec:event-bundlle-or-list}

@@ -697,7 +749,7 @@ \subsubsection{Proposed definition to be discussed}
high-energy particles, where an event is generally characterised by a spatial position, a time and a spectral value
(e.g. an energy, a channel, a pulse height).
  \item Propose definitions for a product-type \textbf{event-bundle}: An event-bundle dataset is a complex object
-containing an event-list and multiple files or other substructures that are products necessary to analyze the
+containing an event-list and multiple files or other substructures that are products necessary to analyse the
event-list. Data in an event-bundle may thus be used to produce higher level data products such as images or spectra.
\end{itemize}

@@ -706,18 +758,17 @@ \subsubsection{Proposed definition to be discussed}

The precise content of an event-bundle remains to be better defined, and may vary significantly from one facility to another.
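The two proposed product types can be sketched as data structures. This is a deliberately simplified illustration: the column set, GTI representation and IRF file names are hypothetical, and a real bundle would be a FITS or similar container rather than an in-memory object.

```python
# Minimal sketch of the two proposed product types. An "event-list" is a
# table of detected events; an "event-bundle" wraps the list together with
# the context products needed to analyse it (GTIs, IRFs, ...).
event_list = [
    # (time [s], ra [deg], dec [deg], energy [TeV]) -- illustrative columns
    (100.0, 83.63, 22.01, 1.2),
    (103.5, 83.61, 22.03, 0.8),
    (110.2, 83.64, 21.99, 4.5),
]

event_bundle = {
    "dataproduct_type": "event-bundle",
    "event_list": event_list,
    "gti": [(95.0, 120.0)],                 # good time intervals
    "irf": {"aeff": "aeff_run123.fits",     # hypothetical file names
            "edisp": "edisp_run123.fits",
            "psf": "psf_run123.fits"},
}

# The defining property of an event-bundle: it carries everything
# needed to analyse the event-list it contains.
assert event_bundle["event_list"] and event_bundle["gti"] and event_bundle["irf"]
```

The open question discussed in the text is precisely which of the `gti`/`irf`-like members are mandatory, and how they are named and linked, for each facility.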
For example, Chandra primary products distributed via the Chandra Data Archive include around half a dozen different
-types of products necessary to analyze Chandra data (for example, L2 event-list, Aspect solution,
+types of products necessary to analyse Chandra data (for example, L2 event-list, Aspect solution,
bad pixel map, spacecraft ephemeris, V\&V Report).
-{\bf the following is not clear for BKH: It is also possible to retrieve secondary products,
-containing more products that are needed to recalibrate the data with updated calibrations}.
+% {\bf the following is not clear for BKH: It is also possible to retrieve secondary products, containing more products that are needed to recalibrate the data with updated calibrations}.

For VHE gamma rays and neutrinos, the DL3 event lists must be associated with their corresponding IRF files.
The links between the event-list and these IRFs should be well defined in the event-bundle.

\subsection{ObsCore metadata description of an event-list}
-\label{sec:obscore}
+\label{sec:obscore_he}

%%%% texte by Mireille to be checked and merged : start %%
%\include{ObscoreReviewforVOHEcontext_Mireille Louys}

@@ -730,7 +781,7 @@ \subsubsection{Usage of the mandatory terms in ObsCore}
ObsCore \citep{2017ivoa.spec.0509L} can provide a metadata profile for a data product of type event-list and a qualified access to the distributed file using the Access class from ObsCore (URL, format, file size).

-In the ObsCore representation, the event-list data product is described in terms of curation, coverage and access. However, several properties are simply set to NULL following the recommendation: Resolutions, Polarization States, Observable Axis Description, Axes lengths (set to -1)...
+In the ObsCore representation, the event-list data product is described in terms of curation, coverage and access.
However, several properties are simply set to NULL following the recommendation: Resolutions, Polarisation States, Observable Axis Description, Axes lengths (set to -1)...

We also note that some properties are energy dependent, such as the Spatial Coverage, Spatial Extent, PSF.

@@ -750,7 +801,7 @@ \subsubsection{Usage of the mandatory terms in ObsCore}

\end{itemize}

-\subsubsection{Metadata re-interpretation for the VOHE context}
+\subsubsection{Metadata re-interpretation for the HE context}

\paragraph{observation\_id}
In the current definition of ObsCore, the data product collects data from one or several observations. The same happens in HE context.

@@ -784,7 +835,7 @@ \subsubsection{Metadata addition required}
\paragraph{Adding MIME-type to access\_format table}
As seen in section \ref{sec:data_formats} current HE experiments and observatories use their community defined data format for data dissemination. They encapsulate the event-list table together with ancillary data dedicated to calibration and observing configurations and parameters.
-Even if the encapsulation is not standardized between the various projects, it is useful for a client application to rely on the access\_format property in order to send it to an appropriate visualizing tool.
+Even if the encapsulation is not standardised between the various projects, it is useful for a client application to rely on the access\_format property in order to send it to an appropriate visualising tool.
Therefore these can be included in the MIME-type table of ObsCore section 4.7, with suggestions for new terms like:

\begin{itemize}

@@ -841,16 +892,19 @@ \subsection{Event-list Context Data Model}

The event-list concept may include, or may be surrounded by other connected concepts. Indeed, an event-list dataset alone cannot be scientifically analysed without the knowledge of some contextual data and metadata, starting with the good/stable time intervals, and the corresponding IRFs.
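The role of good time intervals as contextual metadata can be made concrete with a small sketch: without the GTIs one can neither select the scientifically usable events nor compute the livetime needed to turn counts into fluxes. The times and intervals below are invented for illustration.

```python
import numpy as np

# Illustrative event arrival times (s) and good time intervals (start, stop).
times = np.array([10.0, 55.0, 130.0, 260.0, 400.0])
gtis = [(0.0, 100.0), (200.0, 300.0)]

# Keep only events whose arrival time falls inside some GTI.
in_gti = np.zeros(times.shape, dtype=bool)
for start, stop in gtis:
    in_gti |= (times >= start) & (times < stop)

# The good exposure (livetime) is the summed length of the GTIs;
# it is required to normalise any count rate derived from the events.
livetime = sum(stop - start for start, stop in gtis)
```

Here only the events at 10, 55 and 260 s survive, and the livetime is 200 s rather than the 400 s elapsed time, which changes any derived rate by a factor of two: this is why the GTIs must travel with the event-list.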
-The aim of the Event-list Context Data Model is to name and indicate the relations between the event-list and its contextual information. It is presented in Figure~\ref{fig:EventListContext}.
+The aim of an Event-list Context Data Model is to name and identify the relations between the event-list and its contextual information. A first attempt is presented in Figure~\ref{fig:EventListContext}.
+
+Such a model could help to define specific HE data attributes that would be relevant for an ObsCore description of an HE dataset, and could thus be included in a proposed extension.

\subsection{Use of Datalink for HE products}
\label{sec:datalink}
+
There are two options to provide an access to a full event-bundle package.

-In the first option, the "event-bundle" dataset (\ref{sec:event-bundlle-or-list}) exposed in the discovery service contains all the relevant information, e.g. several frames in the FITS file, one corresponding to the event-list itself, and the others providing good/stable time intervals, or any IRF file. This is what was done in the current GADF data format (see \ref{sec:GADF}). In this option, the content of the event-list package should be properly defined in its description: what information is included and where is it in the dataset structure? Obviously the Event-list Context Data Model (see \ref{sec:EventListContext}) would be useful to provide that.
+In the first option, the "event-bundle" dataset (\ref{sec:event-bundlle-or-list}) exposed in the discovery service contains all the relevant information, e.g. several frames in the FITS file, one corresponding to the event-list itself, and the others providing good/stable time intervals, or any IRF file. This is what was done in the current GADF data format (see \ref{sec:GADF}). In this option, the content of the event-list package should be properly defined in its description: what information is included and where is it in the dataset structure?
+The Event-list Context Data Model (see \ref{sec:EventListContext}) would be useful to provide that information.

-In the second option we provide links to the relevant information from the base "event-list" (\ref{sec:event-bundlle-or-list}) exposed in the discovery service. This could be done using Datalink and a list of links to each contextual information such as the IRFs. The Event-list Context Data Model (see \ref{sec:EventListContext}) would provide the concepts and vocabulary to characterise the IRFs and other information relevant to the analysis of an event-list. These specific concepts and terms describing the various flavors of IRFs and GTI will be given in the semantics and content\_qualifier FIELDS of the DataLink response to qualify the links. The different links can point to different
+In the second option, we would provide links to the relevant information from the base "event-list" (\ref{sec:event-bundlle-or-list}) exposed in the discovery service. This could be done using Datalink and a list of links to each contextual information such as the IRFs. The Event-list Context Data Model (see \ref{sec:EventListContext}) would provide the concepts and vocabulary to characterise the IRFs and other information relevant to the analysis of an event-list. These specific concepts and terms describing the various flavours of IRFs and GTI will be given in the semantics and content\_qualifier FIELDS of the DataLink response to qualify the links. The different links can point to different
dereferenceable URLs or alternatively to different fragments of the same dereferenceable URL, as stated by the DataLink specification.
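The second option can be sketched as a DataLink-style links table. The `#this`, `#calibration` and `#auxiliary` terms below are existing DataLink core vocabulary terms, but the `content_qualifier` values distinguishing IRF flavours (`irf:aeff`, `irf:edisp`, `gti`, ...) are purely illustrative placeholders for terms an HE vocabulary would have to define; URLs and identifiers are invented.

```python
# Sketch of a DataLink-style {links} response for one event-list dataset:
# each row attaches one related resource, qualified by a semantics term.
links = [
    {"ID": "ivo://obs/run123", "access_url": "https://ex.org/run123_events.fits",
     "semantics": "#this"},
    {"ID": "ivo://obs/run123", "access_url": "https://ex.org/run123_aeff.fits",
     "semantics": "#calibration", "content_qualifier": "irf:aeff"},
    {"ID": "ivo://obs/run123", "access_url": "https://ex.org/run123_edisp.fits",
     "semantics": "#calibration", "content_qualifier": "irf:edisp"},
    {"ID": "ivo://obs/run123", "access_url": "https://ex.org/run123_gti.fits",
     "semantics": "#auxiliary", "content_qualifier": "gti"},
]

def related(rows, semantics):
    """All links of a dataset carrying a given semantics term."""
    return [row["access_url"] for row in rows if row["semantics"] == semantics]
```

A client would first fetch the `#this` link (the event-list itself), then resolve the `#calibration` rows to retrieve the IRFs needed for analysis; this is exactly the event-bundle content of the first option, but assembled on demand from individual links.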