Merge remote-tracking branch 'upstream/hotfixes' into release
fit-alessandro-berti committed Nov 24, 2024
2 parents 61a7bb4 + bf0087b commit 7a2e3ac
Showing 6 changed files with 305 additions and 576 deletions.
441 changes: 81 additions & 360 deletions docs/source/api.rst

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/source/examples.rst
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Examples
============
===========

Filtering XYZ
-------------
@@ -8,4 +8,4 @@ Decision Point Analysis
-----------------------

Computing a DFG with Performance Overlay
----------------------------------------
----------------------------------------
263 changes: 127 additions & 136 deletions docs/source/getting_started.rst

Large diffs are not rendered by default.

16 changes: 8 additions & 8 deletions docs/source/index.rst
@@ -1,9 +1,9 @@
Welcome to pm4py's Documentation!
Welcome to PM4Py's Documentation!
===================================

``pm4py`` is a Python library implementing a variety of `process mining <https://en.wikipedia.org/wiki/Process_mining>`_ algorithms.
``PM4Py`` is a Python library implementing a variety of `process mining <https://en.wikipedia.org/wiki/Process_mining>`_ algorithms.

A simple example of ``pm4py`` in action:
A simple example of ``PM4Py`` in action:

.. code-block:: python
@@ -13,11 +13,11 @@ A simple example of ``pm4py`` in action:
log = pm4py.read_xes('<path-to-xes-log-file.xes>')
process_model = pm4py.discover_bpmn_inductive(log)
pm4py.view_bpmn(process_model)
In this documentation, you can find all relevant information to set up ``pm4py`` and start your process mining journey.
Please consult the Contents listed below to navigate the documentation.
In this documentation, you can find all relevant information to set up ``PM4Py`` and start your process mining journey.
Please consult the contents listed below to navigate the documentation.

Happy #processmining!
Happy #ProcessMining!


Contents
@@ -29,4 +29,4 @@ Contents
install
getting_started
api
release_notes
release_notes
18 changes: 10 additions & 8 deletions docs/source/install.rst
@@ -1,32 +1,34 @@
```rst
Installation
============
===========
pip
---
To use ``pm4py`` on any OS, install it using ``pip``:
To use ``PM4Py`` on any OS, install it using ``pip``:

.. code-block:: console
(.venv) $ pip install pm4py
``pmp4y`` uses the ``Graphviz`` library for rendering visualizations.
``PM4Py`` uses the ``Graphviz`` library for rendering visualizations.
Please install `Graphviz <https://graphviz.org/download/>`_.

After installation, GraphViz is located in the ``program files`` directory.
The ``bin\`` folder of the GraphViz directory needs to be added manually to the ``system path``.
After installation, Graphviz is located in the ``program files`` directory.
The ``bin\`` folder of the Graphviz directory needs to be added manually to the ``system path``.
In order to do so, please follow `this instruction <https://stackoverflow.com/questions/44272416/how-to-add-a-folder-to-path-environment-variable-in-windows-10-with-screensho>`_.

Docker
------
To install pm4py via Docker, use:
To install PM4Py via Docker, use:

.. code-block:: console
$ docker pull pm4py/pm4py-core:latest
To run pm4py via docker, use:
To run PM4Py via Docker, use:

.. code-block:: console
$ docker run -it pm4py/pm4py-core:latest bash
$ docker run -it pm4py/pm4py-core:latest bash
```
139 changes: 77 additions & 62 deletions docs/source/release_notes.rst
@@ -1,114 +1,129 @@
Release Notes
=============

PM4Py 2.7.0 - Release Notes
----------------------------

pm4py 2.7.0 - Release Notes
---------------------------
The major changes in PM4Py 2.7.0 are as follows:

The major changes in pm4py 2.7.0 are as follows:

1. We added an initial integration to ChatGPT
1. We added an initial integration to ChatGPT.

2. We added some connectors for workstation-supported processes (Outlook mail and calendar; web browsers).

PM4Py 2.6.0 - Release Notes
----------------------------

pm4py 2.6.0 - Release Notes
---------------------------

The major changes in pm4py 2.6.0 are as follows:

1. We added the ILP Miner as process discovery algorithm
The major changes in PM4Py 2.6.0 are as follows:

2. We added two log filters: "timestamp grouping" and "consecutive activities"
1. We added the ILP Miner as a process discovery algorithm.

3. We added the insertion of the case arrival/finish rate and of the waiting/service/sojourn times
in the simplified interface
2. We added two log filters: "timestamp grouping" and "consecutive activities."

4. We added a baseline clustering algorithm, based on the pre-existing feature extraction
3. We added the insertion of the case arrival/finish rate and of the waiting/service/sojourn times in the simplified interface.

5. We added the extraction of the "target vector" from event logs for machine learning purposes
4. We added a baseline clustering algorithm, based on the pre-existing feature extraction.

5. We added the extraction of the "target vector" from event logs for machine learning purposes.

pm4py 2.5.0 - Release Notes
---------------------------
PM4Py 2.5.0 - Release Notes
----------------------------

The major changes in pm4py 2.5.0 are as follows:
The major changes in PM4Py 2.5.0 are as follows:

1. We added the Cardoso and extended Cardoso simplicity metrics to pm4py
1. We added the Cardoso and extended Cardoso simplicity metrics to PM4Py.

2. We added discovery of Stochastic Arc Weight nets based on OCEL logs.

3. We added Murata-based Petri net simplification to the simplified interface (implicit place removal)
3. We added Murata-based Petri net simplification to the simplified interface (implicit place removal).

PM4Py 2.4.0 - Release Notes
----------------------------

pm4py 2.4.0 - Release Notes
---------------------------

Today, we released pm4py 2.4.0.
We have adopted our release policy slightly, i.e., as of now, the pm4py versioning follows the MAJOR.MINOR.FIX pattern.
We will also report all MAJOR and MINOR releases in the release notes.
Today, we released PM4Py 2.4.0. We have slightly adapted our release policy; as of now, the PM4Py versioning follows the MAJOR.MINOR.FIX pattern. We will also report all MAJOR and MINOR releases in the release notes.

As today's release is a minor release, we report on the main changes here.

1. We added the Murata algorithm (Berthelot implementation) to remove the structurally redundant places, which is now available in the simplified interface.

2. We added the reduction of invisible transitions in Petri nets to the simplified interface.

3. We added support for calcuating stochastic languages of process models
3. We added support for calculating stochastic languages of process models.

4. We adde support for calculating EMD between two stochastic languages
4. We added support for calculating EMD between two stochastic languages.

5. we added a visualization of alignments in simplified interface
5. We added a visualization of alignments in the simplified interface.

6. We added visualization of footprint table in simplified interface
6. We added visualization of the footprint table in the simplified interface.

7. We added a conversion of Petri net objects to networkX DiGraphs
7. We added a conversion of Petri net objects to NetworkX DiGraphs.

8. We added support for stochastic Petri nets
8. We added support for stochastic Petri nets.

9. We added support for stochastic arc-weight nets (the paper describing this class of nets is submitted to the Petri nets 2023 conference)
9. We added support for stochastic arc-weight nets (the paper describing this class of nets is submitted to the Petri Nets 2023 conference).

pm4py 2.3.0 - Release Notes
---------------------------
Finally, pm4py 2.3.0 has arrived!
The 2.3.0 release contains various significant updates and improvements concerning its predecessors.
The release consists of approximately 550 commits and 47.000 LoC!
The main changes are as follows:
PM4Py 2.3.0 - Release Notes
----------------------------

1. *Flexible parameter passing in the simplified method invocation*, e.g., :meth:`pm4py.discovery.discover_petri_net_inductive`;
For example, in ```pm4py``` 2.2.X, the columns used in process discovery were fixed (i.e., case:concept:name, concept:name, time:timestamp). Hence, changing the perspective implied changing column headers.
In pm4py 2.3.X, the columns used in process discovery are now part of the function arguments.
A simple comparison:
* Discovering a Petri net in pm4py 2.2.X:
``pm4py.discover_petri_net_inductive(dataframe, noise_threshold=0.2)``
Finally, PM4Py 2.3.0 has arrived! The 2.3.0 release contains various significant updates and improvements compared to its predecessors. The release consists of approximately 550 commits and 47,000 LoC! The main changes are as follows:

* Discovering a Petri net in pm4py 2.3.X:
``pm4py.discover_petri_net_inductive(dataframe, noise_threshold=0.2, activity_key="activity", timestamp_key="timestamp", case_id_key="case")``
1. *Flexible parameter passing in the simplified method invocation*, e.g., :meth:`pm4py.discovery.discover_petri_net_inductive`;

For example, in ``pm4py`` 2.2.X, the columns used in process discovery were fixed (i.e., case:concept:name, concept:name, time:timestamp). Hence, changing the perspective implied changing column headers.

In PM4Py 2.3.X, the columns used in process discovery are now part of the function arguments.

A simple comparison:

* Discovering a Petri net in PM4Py 2.2.X:

``pm4py.discover_petri_net_inductive(dataframe, noise_threshold=0.2)``

* Discovering a Petri net in PM4Py 2.3.X:

``pm4py.discover_petri_net_inductive(dataframe, noise_threshold=0.2, activity_key="activity", timestamp_key="timestamp", case_id_key="case")``

2. *Dataframes are primary citizens*;
pm4py used to support both Pandas ``Dataframes`` and our custom-defined event log object. We have decided to adapt all algorithms to work on Dataframes. As such, event data is expected to be represented as a Dataframe in pm4py (i.e., we are dropping the explicit use of our custom event log object). There are two main reasons for this design decision:
1. *Performance*; Generally, Pandas Dataframes are performing significantly better on most operations compared to our custom event log object
2. *Practice*; Most real event data is of tabular form.

Of course, pm4py still supports importing .xes files. However, when importing an event log using :meth:`pm4py.read.read_xes`, the object is directly converted into a Dataframe.
A general drawback of this design decision is that pm4py no longer appropriately supports nested objects (generally supported by the .xes standard). However, as indicated in point b), such nested objects are rarely used in practice.


PM4Py used to support both Pandas ``Dataframes`` and our custom-defined event log object. We have decided to adapt all algorithms to work on Dataframes. As such, event data is expected to be represented as a Dataframe in PM4Py (i.e., we are dropping the explicit use of our custom event log object). There are two main reasons for this design decision:

1. *Performance*; Generally, Pandas Dataframes perform significantly better on most operations compared to our custom event log object.

2. *Practice*; Most real event data is of tabular form.

Of course, PM4Py still supports importing .xes files. However, when importing an event log using :meth:`pm4py.read.read_xes`, the object is directly converted into a Dataframe.

A general drawback of this design decision is that PM4Py no longer appropriately supports nested objects (generally supported by the .xes standard). However, as indicated in point 2, such nested objects are rarely used in practice.

3. *Typing Information in the simplified interface*;
All methods in the simplified interface are guaranteed to have typing information on their input and output objects.

All methods in the simplified interface are guaranteed to have typing information on their input and output objects.

4. *Variant Representation*;
In pm4py 2.3.X, trace variants are represented as a tuple of Strings (representing activity names) instead of a String where a ‘,’ symbol indicates activity separation. For example, a variant of the form <A,B,C> is now represented as a tuple (‘A’,’B’,’C’) and was previously represented as ‘A,B,C’. This fix allows activity names to contain a ‘,’ symbol.

In PM4Py 2.3.X, trace variants are represented as a tuple of strings (representing activity names) instead of a string where a ‘,’ symbol indicates activity separation. For example, a variant of the form <A,B,C> is now represented as a tuple (‘A’, ‘B’, ‘C’) and was previously represented as ‘A,B,C’. This fix allows activity names to contain a ‘,’ symbol.

5. *Inductive Miner Revised*;
We have re-implemented and restructured the code of the inductive miner. The new version is closer to the reference implementation in ProM and is more performant than the previous version.

We have re-implemented and restructured the code of the inductive miner. The new version is closer to the reference implementation in ProM and is more performant than the previous version.

6. *Business Hours Revised*;
The business hours functionality in pm4py has been revised completely. In pm4py 2.2.X, one could only specify the working days and hours, which were fixed. In pm4py 2.3.X, one can define week-day-based activity slots (e.g., to model breaks). One slot, i.e., one tuple consists of one start and one end time given in seconds since week start, e.g. [(7 * 60 * 60, 17 * 60 * 60), ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60), ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60),] meaning that business hours are Mondays 07:00 - 17:00 and Tuesdays 07:00 - 12:00 and 13:00 - 17:00

The business hours functionality in PM4Py has been completely revised. In PM4Py 2.2.X, one could only specify the working days and hours, which were fixed. In PM4Py 2.3.X, one can define weekday-based activity slots (e.g., to model breaks). One slot, i.e., one tuple, consists of one start and one end time given in seconds since week start, e.g.,

```
[
(7 * 60 * 60, 17 * 60 * 60),
((24 + 7) * 60 * 60, (24 + 12) * 60 * 60),
((24 + 13) * 60 * 60, (24 + 17) * 60 * 60),
]
```

meaning that business hours are Mondays 07:00 - 17:00 and Tuesdays 07:00 - 12:00 and 13:00 - 17:00.

7. *Auto-Generated Docs*;
As you may have noticed, this website serves as the new documentation hub for pm4py. It contains all previously available information on the project website related to ‘installation’ and ‘getting started’. For the simplified interface, we have merged the general documentation with the API docs to improve the overall understanding of working with pm4py. The docs are now generated directly from the pm4py source. Hence, feel free to share a pull request if you find any issues.

As you may have noticed, this website serves as the new documentation hub for PM4Py. It contains all previously available information on the project website related to ‘installation’ and ‘getting started’. For the simplified interface, we have merged the general documentation with the API docs to improve the overall understanding of working with PM4Py. The docs are now generated directly from the PM4Py source. Hence, feel free to share a pull request if you find any issues.
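
Two of the changes listed above, the tuple-based variant representation and the business-hour slots given in seconds since week start, can be sketched in plain Python. This is a minimal illustration, not PM4Py code: the helpers ``legacy_variant_to_tuple``, ``slot`` and ``describe_slot`` and the activity names are hypothetical, and the sketch assumes that day index 0 stands for Monday and that a slot never crosses midnight.

```python
# Illustrative sketch only: none of these helpers are part of the PM4Py API.

HOUR = 60 * 60
DAY = 24 * HOUR
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def legacy_variant_to_tuple(variant):
    """Split a 2.2.X-style comma-separated variant string into a tuple."""
    return tuple(variant.split(","))


def slot(day_index, start_hour, end_hour):
    """Build one (start, end) slot in seconds since week start (0 = Monday)."""
    offset = day_index * DAY
    return (offset + start_hour * HOUR, offset + end_hour * HOUR)


def describe_slot(business_slot):
    """Render a slot as 'Weekday HH:MM - HH:MM' (assumes no midnight crossing)."""
    start, end = business_slot
    day_index, start_in_day = divmod(start, DAY)
    end_in_day = end - day_index * DAY

    def clock(seconds):
        return f"{seconds // HOUR:02d}:{seconds % HOUR // 60:02d}"

    return f"{DAY_NAMES[day_index]} {clock(start_in_day)} - {clock(end_in_day)}"


# Variants: a tuple keeps an activity name that contains a comma intact,
# while the old string form cannot be split back into the same activities.
variant = ("Check, approve", "Pay")
legacy_string = ",".join(variant)
round_trip = legacy_variant_to_tuple(legacy_string)

# Business hours: the same three slots as in the release note above.
slots = [slot(0, 7, 17), slot(1, 7, 12), slot(1, 13, 17)]

if __name__ == "__main__":
    print(legacy_variant_to_tuple("A,B,C"))
    print(round_trip)
    for business_slot in slots:
        print(describe_slot(business_slot))
```

Here ``round_trip`` has three elements although ``variant`` has only two activities, which is the ambiguity the tuple representation removes. Building the slots from a day index and hours avoids repeating the ``(24 + h) * 60 * 60`` arithmetic by hand.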

Happy #processmining!

The #pm4py development team.
The #PM4Py development team.
