Commit

Merge pull request #233 from KIT-CMS/docs
update docs for latest KingMaker update, fix lots of typos
winterchristian authored Sep 14, 2023
2 parents 87ed363 + 364c2dc commit 15c4d2b
Showing 6 changed files with 139 additions and 109 deletions.
64 changes: 32 additions & 32 deletions docs/sphinx_source/contrib.rst

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions docs/sphinx_source/friend_trees.rst
@@ -1,24 +1,24 @@
FriendTree Generation
===========================

CROWN can be used, to generate FriendTrees based on a CROWN ntuple. The concept of FriendTrees is explained here: https://root.cern/manual/trees/#widening-a-ttree-through-friends. They allow to extend an existing ntuple with new quantities. Common use cases are new high level variables like neural network outputs or additional correction factors.
CROWN can be used to generate FriendTrees based on a CROWN ntuple. The concept of FriendTrees is explained here: https://root.cern/manual/trees/#widening-a-ttree-through-friends. They make it possible to extend an existing ntuple with new quantities. Common use cases are new high-level variables like neural network outputs or additional correction factors.

.. image:: ../images/root_friends.png
:width: 900
:align: center
:alt: Sketch of how Friend trees work

The the example depicated above, two additional friends to the main NTuple are created. During analysis, the quantities stored in the friend trees can be added by using the ``AddFriend`` method. The quantities are then available in the TTree as if they were part of the original NTuple.
In the example depicted above, two additional friends to the main NTuple are created. During analysis, the quantities stored in the friend trees can be added by using the ``AddFriend`` method. The quantities are then available in the TTree as if they were part of the original NTuple.
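
As a rough, library-free illustration of the idea (this is not how ROOT implements friends, and every name below is hypothetical), a friend can be pictured as an extra set of columns that shares the entry index of the main ntuple:

```python
# Conceptual sketch only: plain dictionaries stand in for TTrees.
# The column names (pt, nn_score, sf_weight) are made up for illustration.


def add_friend(main, friend):
    """Return a widened view of `main` that also exposes the columns of `friend`.

    Both arguments map column names to lists holding one value per entry.
    """
    n_entries = len(next(iter(main.values())))
    for name, values in friend.items():
        if len(values) != n_entries:
            raise ValueError(
                f"friend column {name!r} has {len(values)} entries, expected {n_entries}"
            )
    # On a name clash the main column wins; this is an arbitrary choice of the sketch.
    return {**friend, **main}


ntuple = {"pt": [25.1, 40.3, 33.7]}
nn_friend = {"nn_score": [0.91, 0.12, 0.55]}
sf_friend = {"sf_weight": [0.98, 1.02, 1.00]}

# Two friends are attached, mirroring the picture above.
view = add_friend(add_friend(ntuple, nn_friend), sf_friend)
print(sorted(view))
```

The entry-count check in the sketch reflects why friends only make sense when every file describes the same events in the same order.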

A FriendTree is generated using a FriendTreeConfiguration. Such a configuration has some major differences, compared to a regular configuration:

1. The input file is a CROWN ntuple, not a ROOT file.
2. Only one scope per user is allowed.
3. No global scope is required
4. The available inputs have to be specified. The available inputs can be provided by using a CROWN ntuple as input, or a json file. The ntuple can be used for debugging proposes, when running a production, it is recommended to use a json file. The basic structure this quantities map is listed below. Such a json can then be used for multiple eras, sampletypes and scopes.
4. The available inputs have to be specified. They can be provided either as a CROWN ntuple or as a JSON file. The ntuple can be used for debugging purposes; when running a production, it is recommended to use a JSON file. The basic structure of this quantities map is listed below. Such a JSON file can then be used for multiple eras, sample types and scopes.


.. code-block:: json
.. code-block:: JSON
{
"era_1": {
@@ -50,21 +50,21 @@ Writing a FriendTreeConfiguration

The basic structure of a FriendTreeConfiguration is identical to a regular configuration. When creating a new FriendTree executable, an additional argument has to be provided:

* ``DQUANTITIESMAP`` - The path to the quantities map json file or the crown ntuple root file.
* ``DQUANTITIESMAP`` - The path to the quantities map JSON file or the crown ntuple root file.

All other parameters are identical to the regular configuration. Setting up producers, outputs and new systematic shifts works the same way as before. The configuration has to be of type ``FriendTreeConfiguration``. During the configuration, the available inputs are checked for consistency, to catch any possible misconfiguration early. In addition, as for CROWN ntuples, only required shifts are executed.

FriendTrees with multiple input friend trees
--------------------------------------------

Starting from version 0.3 of CROWN, it is also possible to use multiple input friend trees. A typical usecase for this feature is the evaluation of Classifiers, and storing the output of the classifier in the friend tree. This way, the classifier can utilize quantities from both the main ntuple, and from additional friend trees. The interface for configuring such a FriendTree executable is similar to the regular FriendTree configuration, with the following differences:
Starting from version 0.3 of CROWN, it is also possible to use multiple input friend trees. A typical use case for this feature is the evaluation of Classifiers, and storing the output of the classifier in the friend tree. This way, the classifier can utilize quantities from both the main ntuple and from additional friend trees. The interface for configuring such a FriendTree executable is similar to the regular FriendTree configuration, with the following differences:

* The information for all input files has to be provided. This means that the ``DQUANTITIESMAP`` has to be extended. It is possible to
1. provide a single json file, that contains the input information for all input files (the crown ntuple + all additional files)
2. provide a list of json files, each containing the input information for one input file
1. provide a single JSON file, that contains the input information for all input files (the crown ntuple + all additional files)
2. provide a list of JSON files, each containing the input information for one input file
3. provide a list of root files (crown ntuple + all additional files)

During the execution, all inputfiles have to be provided, resulting in a command line like this:
During the execution, all input files have to be provided, resulting in a command line like this:

.. code-block:: bash
@@ -73,5 +73,5 @@ During the execution, all inputfiles have to be provided, resulting in a command
Before execution, the input files are checked for consistency. This means that the following checks are performed:

* All inputfiles have to contain the same number of entries
* All inputfiles have to be readable (no missing files)
* All input files have to contain the same number of entries
* All input files have to be readable (no missing files)
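
The two checks listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not CROWN's actual implementation: ``check_inputs`` is a hypothetical helper, and the entry count is obtained through a caller-supplied function (with ROOT it could wrap a call that returns the number of entries of the tree).

```python
import os


def check_inputs(paths, count_entries):
    """Validate a list of input files before running.

    `count_entries` is a caller-supplied function (hypothetical) that returns
    the number of entries stored in a given file.
    """
    if not paths:
        raise ValueError("no input files given")

    # Check 1: every file must exist and be readable.
    missing = [p for p in paths if not (os.path.isfile(p) and os.access(p, os.R_OK))]
    if missing:
        raise FileNotFoundError(f"unreadable or missing input files: {missing}")

    # Check 2: every file must contain the same number of entries.
    counts = {p: count_entries(p) for p in paths}
    if len(set(counts.values())) > 1:
        raise ValueError(f"input files differ in number of entries: {counts}")

    return next(iter(counts.values()))
```

Passing the counting function in as an argument keeps the sketch independent of any particular I/O library.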
6 changes: 3 additions & 3 deletions docs/sphinx_source/index.rst
@@ -1,18 +1,18 @@
Welcome to The CROWN documentation!
####################################

The The **C** ++-based **RO** OT **W** orkflow for **N** -tuples (CROWN) is a fast new way to convert NanoAOD samples into flat :code:`TTrees` to be used in further analysis. The main focus of the framework is to provide a fast and clean way of selecting events, calculating quantities and weights. The framework has minimal dependencies and only uses ROOT and it's Dataframe as a backend.
The **C** ++-based **RO** OT **W** orkflow for **N** -tuples (CROWN) is a fast new way to convert NanoAOD samples into flat :code:`TTrees` to be used in further analysis. The main focus of the framework is to provide a fast and clean way of selecting events and calculating quantities and weights. The framework has minimal dependencies and only uses ROOT and its Dataframe as a backend.

.. note::
In order to get started, go here: :ref:`Getting started`.
To get started, go here: :ref:`Getting started`.

.. note::
To read about recent changes and new features, go here: :ref:`changelog`.


Available Analyses
*******************
The following analyses configurations are currently available in CROWN. If you want to add your own analysis configuration, contact the developers.
The following analysis configurations are currently available in CROWN. If you want to add your analysis configuration, contact the developers.

.. list-table:: Available Analyses Configurations for CROWN
:widths: 25 150
36 changes: 18 additions & 18 deletions docs/sphinx_source/introduction.rst
@@ -1,7 +1,7 @@
Introduction
=============

The The **C** ++-based **RO** OT **W** orkflow for **N** -tuples (CROWN) is a fast new way to convert NanoAOD samples into flat :code:`TTrees` to be used in further analysis. The main focus of the framework is to provide a fast and clean way of selecting events, calculating quantities and weights. The framework has minimal dependencies and only uses ROOT and it's Dataframe as a backend.
The **C** ++-based **RO** OT **W** orkflow for **N** -tuples (CROWN) is a fast new way to convert NanoAOD samples into flat :code:`TTrees` to be used in further analysis. The main focus of the framework is to provide a fast and clean way of selecting events and calculating quantities and weights. The framework has minimal dependencies and only uses ROOT and its Dataframe as a backend.


Design Idea
@@ -20,13 +20,13 @@ Getting started

.. warning::
The Framework depends on the scale factors provided by CMS. These are directly included in the repository via a git submodule. Since the scale factors are added from the CERN gitlab, access to the CERN gitlab repository (https://gitlab.cern.ch/cms-nanoAOD/jsonpog-integration), is needed. Since the repository is added via SSH, your SSH key must be added to the CERN gitlab instance ( A tutorial on how to do this can be found here: https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account).
For the instructions to work, you also have to add the SSH key to your github.com account. The instructions to do this can be found here: https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/
For the instructions to work, you also have to add the SSH key to your GitHub.com account. The instructions to do this can be found here: https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/



After making sure, that the access right are given, setting up the framework is straight forward.
After making sure that the access rights are given, setting up the framework is straightforward.

First clone the Repository
First, clone the Repository

.. code-block:: console
@@ -38,7 +38,7 @@ and source the current LCG stack (at the moment we use a nightly build)
source init.sh
after this, the framework should be installed, but without any analysis, other than the example analysis. If you want to setup a specific anlysis, you can do so by adding the name of the analysis to your ``init.sh`` command. Currently, supported analyses are:
after this, the framework should be installed, but without any analysis, other than the example analysis. If you want to set up a specific analysis, you can do so by adding the name of the analysis to your ``init.sh`` command. Currently, supported analyses are:

.. list-table:: Available Analyses
:widths: 25 150
@@ -51,7 +51,7 @@ after this, the framework should be installed, but without any analysis, other t
* - ``earlyrun3``
- https://github.com/khaosmos93/CROWN-config-earlyRun3

So in order to setup the `tau` Analysis, you can do so by running
To set up the `tau` analysis, run

.. code-block:: console
@@ -60,20 +60,20 @@ So in order to setup the `tau` Analysis, you can do so by running
Running the framework
**********************

In order to create a new executable, first create a build directory
To create a new executable, first create a build directory

.. code-block:: console
mkdir build && cd build
and then run `cmake` to setup the Makefiles. A python configuration is needed in order to specify the code, that should be generated. Configurations are located in the :code:`analysis_configuations` directory. Within this folder, a subfolder for each type of analysis is created. Within the analysis folder, multiple Configurations belonging to the same analysis can be located. For example in the `tau` analysis, a main configuration `config.py` as well as several smaller Configurations exist.
and then run `cmake` to set up the Makefiles. A Python configuration is needed to specify the code that should be generated. Configurations are located in the :code:`analysis_configurations` directory. Within this folder, a subfolder for each type of analysis is created. Within the analysis folder, multiple configurations belonging to the same analysis can be located. For example, in the `tau` analysis, a main configuration `config.py` as well as several smaller configurations exist.

.. Note::
You have to provide both
1. the analysis that you want to run e.g. `-DANALYSIS=template_analysis`
2. the configuration that should be used `-DCONFIG=min_config`.

For the cmake command a minimal set of options has to be provided, in this case we use the template analysis with the minimal example
For the cmake command, a minimal set of options has to be provided. In this case, we use the template analysis with the minimal example

.. code-block:: console
@@ -82,12 +82,12 @@ For the cmake command a minimal set of options has to be provided, in this case
The options that are currently available are:

* :code:`-DANALYSIS=template_analysis`: The analysis to be used. This is the name of the folder in the :code:`analysis_configurations` directory.
* :code:`-DCONFIG=min_config`: The configuration to be used. This is the name of the python configuration file. The file has to be located in the directory of the analysis and the path is provided in the python import syntax so e.g. :code:`subfolder.myspecialconfig`
* :code:`-DSAMPLES=emb`: The samples to be used. This is a single sample or a comma separated list of sample names.
* :code:`-DERAS=2018`: The era to be used. This is a single era or a comma separated list of era names.
* :code:`-DSCOPES=et`: The scopes to be run. This is a single scope or a comma separated list of scopes. The global scope is always run.
* :code:`-DCONFIG=min_config`: The configuration to be used. This is the name of the python configuration file. The file has to be located in the directory of the analysis and the path is provided in the Python import syntax e.g. :code:`subfolder.myspecialconfig`
* :code:`-DSAMPLES=emb`: The samples to be used. This is a single sample or a comma-separated list of sample names.
* :code:`-DERAS=2018`: The era to be used. This is a single era or a comma-separated list of era names.
* :code:`-DSCOPES=et`: The scopes to be run. This is a single scope or a comma-separated list of scopes. The global scope is always run.
* :code:`-DTHREADS=20`: The number of threads to be used. Defaults to single threading.
* :code:`-DSHIFTS=all`: The shifts to be used. Defaults to all shifts. If set to :code:`all`, all shifts are used, if set to :code:`none`, no shifts are used, so only nominal is produced. If set to a comma separated list of shifts, only those shifts are used. If set to only a substring matching multiple shifts, all shifts matching that string will be produced e.g. :code:`-DSHIFTS=tauES` will produce all shifts containing :code:`tauES` in the name.
* :code:`-DSHIFTS=all`: The shifts to be used. Defaults to all shifts. If set to :code:`all`, all shifts are used, if set to :code:`none`, no shifts are used, so only nominal is produced. If set to a comma-separated list of shifts, only those shifts are used. If set to only a substring matching multiple shifts, all shifts matching that string will be produced e.g. :code:`-DSHIFTS=tauES` will produce all shifts containing :code:`tauES` in the name.
* :code:`-DDEBUG=true`: If set to true, the code generation will run with debug information and the executable will be compiled with debug flags
* :code:`-DOPTIMIZED=true`: If set to true, the compiler will run with :code:`-O3`, resulting in slower build times but faster runtimes. Should be used for developments, but not in production.
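
The selection rules described for :code:`-DSHIFTS` can be illustrated with a small sketch. It is not CROWN's code: ``select_shifts`` and the shift names are hypothetical, and the sketch assumes that every entry of a comma-separated list is matched as a substring of the shift name.

```python
def select_shifts(option, available):
    """Return the shifts from `available` selected by a -DSHIFTS style value.

    Assumption of this sketch: each comma-separated entry is treated as a
    substring, so a full shift name selects itself and a shared stem such
    as "tauES" selects every shift that contains it.
    """
    option = option.strip()
    if option == "all":
        return list(available)
    if option == "none":
        return []  # only the nominal result would be produced
    patterns = [p.strip() for p in option.split(",") if p.strip()]
    return [s for s in available if any(p in s for p in patterns)]


# Hypothetical shift names, for illustration only.
shifts = ["tauES_up", "tauES_down", "jes_up", "jes_down"]

print(select_shifts("tauES", shifts))
print(select_shifts("jes_up,tauES_down", shifts))
```

In this sketch the result keeps the order of the available shifts rather than the order in which they were requested.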

@@ -97,10 +97,10 @@ Compile the executable using
make install -j 20
The recommendded build system is using regular UNIX build files, however, as an additional option, the ninja build system (https://ninja-build.org/) can be used for CROWN. In order to use ninja, set :code:`export CMAKE_GENERATOR="Ninja"` in the :code:`init.sh` as env variable, and then use the :code:`ninja install -j 20` command to compile the executable. Since CROWN profits from the parallelization of the build process, the number of threads can and should be set using the :code:`-j` option.
The recommended build system is using regular UNIX build files, however, as an additional option, the ninja build system (https://ninja-build.org/) can be used for CROWN. To use ninja, set :code:`export CMAKE_GENERATOR="Ninja"` in the :code:`init.sh` as env variable, and then use the :code:`ninja install -j 20` command to compile the executable. Since CROWN profits from the parallelization of the build process, the number of threads can and should be set using the :code:`-j` option.


After the compilation, the CROWN executable can be found in the :code:`build/bin` folder. The executable can be used via, with a single output file followed by an arbitrary number of input files.
After the compilation, the CROWN executable can be found in the :code:`build/bin` folder. The executable is called with a single output file followed by an arbitrary number of input files.

.. code-block:: console
@@ -116,7 +116,7 @@ The Web documentation at readthedocs is updated automatically. However, if you w
mkdir build_docs && cd build_docs
then run :code:`cmake` to setup the documentation building process
then run :code:`cmake` to set up the documentation building process

.. code-block:: console
@@ -128,7 +128,7 @@ and build the documentation using
make
The resulting documentation can than be found in
The resulting documentation can then be found in

.. code-block:: console