feature request: `Sequences per Category` and `Categories per Sequence` as additional coherence charts + metrics #32

mplatzer · 2024-12-02T20:59:17Z

For sequential data, we currently report coherence only via auto-correlation. We also want to measure longer-range temporal coherence by introducing two additional metrics / charts:

Sequences per Category: For each discretized column, we calculate the share of sequences that contain a given category at least once. This is displayed as a chart like a univariate (categorical) distribution. We then normalize these shares to sum to 1 and again calculate half the L1 distance (= TVD) as the metric, to be consistent with the other accuracy metrics and to be bounded to [0, 1].
Categories per Sequence: For each sequence, we calculate the number of distinct (discretized) categories, divided by the total number of discretized categories. Again, we show this as a chart like a univariate (numerical) distribution. We then normalize again to sum to 1 and calculate half the L1 distance as the metric.
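To make the two definitions concrete, here is a minimal pandas sketch. All names (`tvd`, `sequences_per_category`, `categories_per_sequence`, the `id` / `cat` columns) are hypothetical and not taken from the code base. It assumes the column is already discretized and that the total number of categories is passed in. For simplicity it compares exact ratio values rather than binning them as a numerical chart would.

```python
import pandas as pd


def tvd(p: pd.Series, q: pd.Series) -> float:
    """Half the L1 distance between p and q, each normalized to sum to 1."""
    idx = p.index.union(q.index)
    p = p.reindex(idx, fill_value=0.0)
    q = q.reindex(idx, fill_value=0.0)
    return float(0.5 * (p / p.sum() - q / q.sum()).abs().sum())


def sequences_per_category(df: pd.DataFrame, key: str, col: str) -> pd.Series:
    """Share of sequences that contain each category at least once."""
    return df.groupby(col)[key].nunique() / df[key].nunique()


def categories_per_sequence(
    df: pd.DataFrame, key: str, col: str, n_categories: int
) -> pd.Series:
    """Distribution of (distinct categories per sequence) / (total categories)."""
    ratio = df.groupby(key)[col].nunique() / n_categories
    return ratio.value_counts(normalize=True)


# Toy data (illustrative only): three sequences each, three possible categories.
trn = pd.DataFrame({"id": [1, 1, 2, 2, 3], "cat": ["a", "b", "a", "a", "c"]})
syn = pd.DataFrame({"id": [1, 1, 2, 2, 3], "cat": ["a", "a", "a", "b", "b"]})

spc = tvd(
    sequences_per_category(trn, "id", "cat"),
    sequences_per_category(syn, "id", "cat"),
)
cps = tvd(
    categories_per_sequence(trn, "id", "cat", n_categories=3),
    categories_per_sequence(syn, "id", "cat", n_categories=3),
)
print(spc, cps)
```

Whether the reported value is the TVD itself or an accuracy derived from it (for example `1 - TVD`) is not specified in this request, so the sketch returns the raw distance.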

These charts shall be two sub-sections of the Coherence section of the report. The metrics are calculated for each attribute and displayed as part of the chart title. The column-level coherence metric in the accuracy table is then the average across auto-correlation, sequences-per-category, and categories-per-sequence. The overall coherence metric remains the average coherence across all columns.
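The aggregation described above could be sketched as follows. The function names, the dictionary layout, and the example values are assumptions made for illustration; only the averaging rule comes from the description.

```python
from statistics import mean


def column_coherence(
    autocorrelation: float,
    sequences_per_category: float,
    categories_per_sequence: float,
) -> float:
    """Column-level coherence: plain average of the three per-column metrics."""
    return mean([autocorrelation, sequences_per_category, categories_per_sequence])


def overall_coherence(per_column: dict[str, dict[str, float]]) -> float:
    """Overall coherence: average of the column-level values across all columns."""
    return mean([column_coherence(**m) for m in per_column.values()])


# Made-up metric values for two hypothetical columns.
metrics = {
    "product": {
        "autocorrelation": 0.9,
        "sequences_per_category": 0.8,
        "categories_per_sequence": 0.7,
    },
    "channel": {
        "autocorrelation": 1.0,
        "sequences_per_category": 0.9,
        "categories_per_sequence": 0.8,
    },
}
overall = overall_coherence(metrics)
```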

For the calculation of the metric we shall consider a maximum of 100 (randomly selected) events per sequence.

mplatzer changed the title ~~feature request: Users per Category and Categories per User as additional Coherence charts/metrics~~ feature request: Sequences per Category and Categories per Sequence as additional coherence charts + metrics Dec 2, 2024
