FEAT: Sigo Re-identification (#21)

* docs: re-identification issue
* test: re-identification function
* test: package re-identification
* test: re-identification dataset
* test: doc TestReidentification
* test: similarity metric on 2 dataset
* refactor: package reidentification
* feat: add reidentification in sigo
* docs: example of trees reidentification
* docs: 2nde example reidentification
* docs: 2nde example reidentification
* docs: 3rd example reidentification
* docs: prevent re-identification
* docs: prevent reidentification
* docs: prevent reidentification
* feat: sigo reidentification
* feat: sigo reidentification
* docs: add comments to functions
* feat: sigo reidentification
* refactor: define distance functions
* feat: sigo reidentification
* feat: sigo reidentification
* feat: add function to scale data
* feat: scale data before compute distance
* docs: add README part reidentification
* feat: add flag threshold reidentification
* feat: add threshold to reidentification
* docs: update README reidentification example
* docs: update README reidentification example
* docs: README explain reidentification
* feat: add similarity score in output
* docs: README add reidentification section
* feat: add logs
* feat: add logs
* feat: add logs
* docs: add reidentification tests
* test: add benchmarck tests
* feat: add reidentification in anonymizer
* feat: add reidentification in anonymizer
* feat: add arguments to anonymizer
* feat: add checks for re-identification
* test: add unit tests
* docs: update README section reidentification
* docs: remove old tests in example
* docs: update README
* docs: update README
* docs: update README
* docs: use LaTeX style syntax for formulas
* docs: use LaTeX style syntax for formulas
* docs: use LaTeX style syntax for formulas
* docs: use LaTeX style syntax for formulas
* docs: use LaTeX style syntax for formulas
* test: add zerolog on benchmarck test

---------

Co-authored-by: Youen Péron <[email protected]>
CGI-FR · Apr 18, 2024 · ce175b4 · ce175b4
1 parent f450b64
commit ce175b4
Showing 32 changed files with 1,312 additions and 104 deletions.
diff --git a/README.md b/README.md
@@ -258,6 +258,186 @@ DataSet after sequencing:
 
 Dates can be easily transformed into a sequence of floats, but one can imagine categories like colors, origin (if not a sensitive value), or even genders.
 
+## REIDENTIFICATION
+
+As information technologies make it increasingly easy to link data from different sources, it is almost impossible to guarantee an anonymization with zero risk of re-identification.
+
+**Definition of re-identification:** a process (or algorithm) that takes an anonymized dataset and related knowledge as input and seeks to match the anonymized data with real-world individuals.
+
+Let's take as an example a very simple dataset that you can find in the `original.json` file in `examples/re-identification`.
+
+```json
+{"id": 1, "x": 5, "y": 6, "z":"a"}
+{"id": 2, "x": 3, "y": 7, "z":"a"}
+{"id": 3, "x": 4, "y": 4, "z":"c"}
+{"id": 4, "x": 2, "y": 10, "z":"b"}
+{"id": 5, "x": 8, "y": 4, "z":"a"}
+{"id": 6, "x": 8, "y": 10, "z":"a"}
+...
+```
+
+Suppose that `x` and `y` are the two quasi-identifiers and that `z` is the sensitive attribute. We anonymize the dataset with `sigo`, using its default settings **k=3** and **l=1** and the **meanAggregation** method:
+
+```console
+sigo -q x,y -s z -a meanAggregation -i cluster < original.json > anonymized.json
+```
+
+```json
+{"id":1,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":2,"x":3,"y":7,"z":"a","cluster":1}
+{"id":3,"x":3,"y":7,"z":"c","cluster":1}
+{"id":4,"x":3,"y":7,"z":"b","cluster":1}
+{"id":5,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":6,"x":7,"y":6.67,"z":"a","cluster":2}
+...
+```
+
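As a sanity check on the output above, the aggregated values can be recomputed by hand. The sketch below assumes that `meanAggregation` replaces each quasi-identifier with its mean over the cluster, rounded to two decimals; this is inferred from the sample output, not taken from sigo's source. `clusterMean` and `round2` are illustrative names, not sigo functions.

```go
package main

import (
	"fmt"
	"math"
)

// round2 rounds v to two decimal places, the precision used in the sample output.
func round2(v float64) float64 {
	return math.Round(v*100) / 100
}

// clusterMean returns the per-attribute mean of the given points.
// Each point holds the quasi-identifier values of one record (here x and y).
func clusterMean(points [][]float64) []float64 {
	if len(points) == 0 {
		return nil
	}
	mean := make([]float64, len(points[0]))
	for _, p := range points {
		for i, v := range p {
			mean[i] += v
		}
	}
	for i := range mean {
		mean[i] = round2(mean[i] / float64(len(points)))
	}
	return mean
}

func main() {
	cluster1 := [][]float64{{3, 7}, {4, 4}, {2, 10}} // original records 2, 3, 4
	cluster2 := [][]float64{{5, 6}, {8, 4}, {8, 10}} // original records 1, 5, 6
	fmt.Println(clusterMean(cluster1)) // [3 7]
	fmt.Println(clusterMean(cluster2)) // [7 6.67]
}
```

Both results match the `x` and `y` values reported for clusters 1 and 2 in the anonymized sample.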
+**Objective:** for each individual in the original dataset (the open data), determine whether an anonymized individual is similar to them. We assume the worst-case scenario, i.e. the attacker holds the original dataset without the sensitive data.
+
+The data that the attacker has is in the `openData.json` file in `examples/re-identification`.
+
+```json
+{"id": 1, "x": 5, "y": 6}
+{"id": 2, "x": 3, "y": 7}
+{"id": 3, "x": 4, "y": 4}
+{"id": 4, "x": 2, "y": 10}
+{"id": 5, "x": 8, "y": 4}
+{"id": 6, "x": 8, "y": 10}
+...
+```
+
+![image](examples/re-identification/intro.png)
+
+Our re-identification method consists in finding, for each individual, the closest (most similar) individuals.
+
+This approach relies heavily on the concepts of distance and similarity (more details in `examples/re-identification/concept`).
+
+### Approach
+
+- 1st step:
+
+Merge the anonymized data with the open data, then add a binary attribute `original` that indicates the origin of each individual (`0` for the anonymized data, `1` for the external data to be re-identified).
+
+![image](examples/re-identification/step1.png)
+
+The files are located in the folder `examples/re-identification`.
+
+```console
+cat anonymized.json openData.json > merged.json
+```
+
+``` json
+...
+{"original":0,"id":22,"x":16.67,"y":18.33,"z":"c"}
+{"original":0,"id":23,"x":16.67,"y":18.33,"z":"b"}
+{"original":0,"id":24,"x":19.67,"y":17.67,"z":"b"}
+{"original":1,"id": 1, "x": 5, "y": 6}
+{"original":1,"id": 2, "x": 3, "y": 7}
+{"original":1,"id": 3, "x": 4, "y": 4}
+...
+```
+
+- 2nd step:
+
+Use sigo to form clusters of similar individuals.
+
+```console
+sigo -q x,y -s original -k 6 -l 2 -i cluster < merged.json
+```
+
+``` json
+{"id":2,"original":0,"x":3,"y":7,"z":"a","cluster":1}
+{"id":2,"original":1,"x":3,"y":7,"z":null,"cluster":1}
+{"id":3,"original":0,"x":3,"y":7,"z":"c","cluster":1}
+{"id":3,"original":1,"x":4,"y":4,"z":null,"cluster":1}
+{"id":4,"original":1,"x":2,"y":10,"z":null,"cluster":1}
+{"id":4,"original":0,"x":3,"y":7,"z":"b","cluster":1}
+{"id":1,"original":1,"x":5,"y":6,"z":null,"cluster":2}
+{"id":1,"original":0,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":5,"original":0,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":5,"original":1,"x":8,"y":4,"z":null,"cluster":2}
+{"id":6,"original":0,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":6,"original":1,"x":8,"y":10,"z":null,"cluster":2}
+...
+```
+
+- 3rd step:
+
+    - If all the anonymized individuals of a cluster have the same sensitive value, the original individuals of that cluster can be re-identified with this value.
+    - If the sensitive values differ but the anonymized individuals are identical, the original individuals cannot be re-identified.
+    - If the sensitive values differ and the anonymized individuals are not identical, a distance computation is performed to try to re-identify the individuals.
+
+![image](examples/re-identification/step3.png)
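The three rules of the 3rd step can be sketched as a small decision function over one cluster. This is a simplified illustration, not sigo's implementation: the `Record` type and `reidentify` function are hypothetical, the third rule is reduced to a plain nearest-neighbour search with Euclidean distance, and the data scaling, similarity score and threshold mentioned in the commit message are deliberately left out.

```go
package main

import (
	"fmt"
	"math"
)

// Record is a hypothetical in-memory form of one line of merged.json.
type Record struct {
	ID        int
	Original  bool      // true: open data record, false: anonymized record
	QI        []float64 // quasi-identifier values (x, y)
	Sensitive string    // only known for anonymized records
}

// euclidean returns the Euclidean distance between two points of equal dimension.
func euclidean(a, b []float64) float64 {
	sum := 0.0
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return math.Sqrt(sum)
}

// reidentify applies the three rules to one cluster. It returns the guessed
// sensitive value per open data ID; an ID without entry was not re-identified.
func reidentify(cluster []Record) map[int]string {
	var anon, openData []Record
	for _, r := range cluster {
		if r.Original {
			openData = append(openData, r)
		} else {
			anon = append(anon, r)
		}
	}
	guesses := map[int]string{}
	if len(anon) == 0 {
		return guesses
	}
	sameSensitive, sameQI := true, true
	for _, r := range anon[1:] {
		sameSensitive = sameSensitive && r.Sensitive == anon[0].Sensitive
		sameQI = sameQI && euclidean(r.QI, anon[0].QI) == 0
	}
	switch {
	case sameSensitive: // rule 1: a single sensitive value in the cluster
		for _, o := range openData {
			guesses[o.ID] = anon[0].Sensitive
		}
	case sameQI: // rule 2: identical anonymized records, nothing to tell them apart
	default: // rule 3: take the value of the nearest anonymized record
		for _, o := range openData {
			best, value, tie := math.Inf(1), "", false
			for _, a := range anon {
				d := euclidean(o.QI, a.QI)
				switch {
				case d < best:
					best, value, tie = d, a.Sensitive, false
				case d == best && a.Sensitive != value:
					tie = true // equally close records disagree: no guess
				}
			}
			if !tie {
				guesses[o.ID] = value
			}
		}
	}
	return guesses
}

func main() {
	cluster2 := []Record{
		{ID: 1, QI: []float64{7, 6.67}, Sensitive: "a"},
		{ID: 5, QI: []float64{7, 6.67}, Sensitive: "a"},
		{ID: 6, QI: []float64{7, 6.67}, Sensitive: "a"},
		{ID: 1, Original: true, QI: []float64{5, 6}},
		{ID: 5, Original: true, QI: []float64{8, 4}},
		{ID: 6, Original: true, QI: []float64{8, 10}},
	}
	fmt.Println(reidentify(cluster2)) // map[1:a 5:a 6:a]
}
```

On cluster 2 of the example, rule 1 applies and the three open data records receive `"a"`, as in the README output. On cluster 1, where the anonymized records are identical but carry `a`, `c` and `b`, rule 2 applies and nothing is re-identified.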
+
+Below is the use of `sigo` for re-identification.
+
+```console
+sigo -q x,y -s original -k 6 -l 2 -a reidentification --args z < merged.json
+```
+
+```json
+{"id":1,"original":1,"x":5,"y":6,"z":"a"}
+{"id":1,"original":0,"x":7,"y":6.67,"z":"a"}
+{"id":2,"original":0,"x":3,"y":7,"z":"a"}
+{"id":2,"original":1,"x":3,"y":7,"z":null}
+{"id":3,"original":0,"x":3,"y":7,"z":"c"}
+{"id":3,"original":1,"x":4,"y":4,"z":null}
+{"id":4,"original":1,"x":2,"y":10,"z":null}
+{"id":4,"original":0,"x":3,"y":7,"z":"b"}
+{"id":5,"original":0,"x":7,"y":6.67,"z":"a"}
+{"id":5,"original":1,"x":8,"y":4,"z":"a"}
+{"id":6,"original":0,"x":7,"y":6.67,"z":"a"}
+{"id":6,"original":1,"x":8,"y":10,"z":"a"}
+{"id":7,"original":1,"x":3,"y":16,"z":"a"}
+{"id":7,"original":0,"x":4.33,"y":17.67,"z":"a"}
+{"id":8,"original":1,"x":7,"y":19,"z":null}
+{"id":8,"original":0,"x":8,"y":15.67,"z":"a"}
+{"id":9,"original":0,"x":4.33,"y":17.67,"z":"a"}
+{"id":9,"original":1,"x":6,"y":18,"z":"a"}
+{"id":10,"original":0,"x":4,"y":18,"z":"b"}
+{"id":10,"original":1,"x":4,"y":19,"z":"b"}
+{"id":11,"original":1,"x":7,"y":14,"z":null}
+{"id":11,"original":0,"x":8,"y":15.67,"z":"c"}
+{"id":12,"original":0,"x":8,"y":15.67,"z":"c"}
+{"id":12,"original":1,"x":10,"y":14,"z":null}
+{"id":13,"original":1,"x":15,"y":5,"z":null}
+{"id":13,"original":0,"x":16,"y":6,"z":"c"}
+{"id":14,"original":1,"x":15,"y":7,"z":null}
+{"id":14,"original":0,"x":16,"y":6,"z":"b"}
+{"id":15,"original":1,"x":11,"y":9,"z":null}
+{"id":15,"original":0,"x":12.33,"y":6,"z":"b"}
+{"id":16,"original":1,"x":12,"y":3,"z":null}
+{"id":16,"original":0,"x":12.33,"y":6,"z":"a"}
+{"id":17,"original":0,"x":16,"y":6,"z":"c"}
+{"id":17,"original":1,"x":18,"y":6,"z":null}
+{"id":18,"original":0,"x":12.33,"y":6,"z":"c"}
+{"id":18,"original":1,"x":14,"y":6,"z":null}
+{"id":19,"original":0,"x":19.67,"y":17.67,"z":"b"}
+{"id":19,"original":1,"x":20,"y":20,"z":"b"}
+{"id":20,"original":0,"x":16.67,"y":18.33,"z":"c"}
+{"id":20,"original":1,"x":18,"y":19,"z":null}
+{"id":21,"original":0,"x":19.67,"y":17.67,"z":"b"}
+{"id":21,"original":1,"x":20,"y":18,"z":"b"}
+{"id":22,"original":0,"x":16.67,"y":18.33,"z":"c"}
+{"id":22,"original":1,"x":18,"y":18,"z":null}
+{"id":23,"original":1,"x":14,"y":18,"z":null}
+{"id":23,"original":0,"x":16.67,"y":18.33,"z":"b"}
+{"id":24,"original":1,"x":19,"y":15,"z":"b"}
+{"id":24,"original":0,"x":19.67,"y":17.67,"z":"b"}
+```
+
+### Usage for sigo reidentification
+
+The following flags must be set for re-identification:
+
+- `--quasi-identifier,-q <strings>`: lists the quasi-identifier attributes of the datasets.
+- `--sensitive,-s original`: tells sigo to distinguish the anonymized data from the original data to be re-identified.
+- `--l-value,-l 2`: sets the l-diversity parameter to 2 so that each cluster contains at least one anonymized record and one original record.
+- `--anonymizer,-a reidentification`: runs the re-identification.
+- `--args <strings>`: lists the arguments passed to the re-identification method, i.e. the sensitive attributes of the dataset.
+
+The other `sigo` flags can be used in addition.
+
 ## Contributors
 
 - CGI France ✉[Contact support](mailto:[email protected])

diff --git a/cmd/sigo/main.go b/cmd/sigo/main.go
@@ -60,6 +60,7 @@ type pdef struct {
 	method    string
 	cmdLine   []string
 	config    string
+	args      []string
 }
 
 func main() {
@@ -94,25 +95,25 @@ There is NO WARRANTY, to the extent permitted by law.`, version, commit, buildDa
 		BoolVar(&logs.jsonlog, "log-json", false, "output logs in JSON format")
 	rootCmd.PersistentFlags().StringVar(&logs.colormode, "color", "auto", "use colors in log outputs : yes, no or auto")
 	// nolint: gomnd
-	rootCmd.PersistentFlags().
-		IntVarP(&definition.k, "k-value", "k", 3, "k-value for k-anonymization")
-	rootCmd.PersistentFlags().
-		IntVarP(&definition.l, "l-value", "l", 1, "l-value for l-diversity")
+	rootCmd.PersistentFlags().IntVarP(&definition.k, "k-value", "k", 3, "k-value for k-anonymization")
+	rootCmd.PersistentFlags().IntVarP(&definition.l, "l-value", "l", 1, "l-value for l-diversity")
 	rootCmd.PersistentFlags().
 		StringSliceVarP(&definition.qi, "quasi-identifier", "q", []string{}, "list of quasi-identifying attributes")
 	rootCmd.PersistentFlags().
 		StringSliceVarP(&definition.sensitive, "sensitive", "s", []string{}, "list of sensitive attributes")
 	rootCmd.PersistentFlags().
-		StringVarP(&definition.method, "anonymizer", "a", "",
-			"anonymization method used. Select one from this list "+
-				"['general', 'meanAggregation', 'medianAggregation', 'outlier', 'laplaceNoise', 'gaussianNoise', 'swapping']")
+		StringVarP(&definition.method, "anonymizer", "a", "NoAnonymizer", "anonymization method used. "+
+			"Select one from this list ['general', 'meanAggregation', 'medianAggregation', 'outlier', "+
+			"'laplaceNoise', 'gaussianNoise', 'swapping', 'reidentification']")
 	rootCmd.PersistentFlags().
 		StringVarP(&logs.info, "cluster-info", "i", "", "display cluster for each jsonline flow")
 	rootCmd.PersistentFlags().BoolVarP(&logs.profiling, "profiling", "p", false,
 		"start sigo with profiling and generate a cpu.pprof file (debug)")
 	rootCmd.PersistentFlags().BoolVar(&definition.entropy, "entropy", false, "use entropy model for l-diversity")
 	rootCmd.PersistentFlags().
 		StringVarP(&definition.config, "configuration", "c", "sigo.yml", "name and location of the configuration file")
+	rootCmd.PersistentFlags().
+		StringSliceVar(&definition.args, "args", []string{}, "list of arguments for anonymizer method")
 
 	if err := rootCmd.Execute(); err != nil {
 		log.Err(err).Msg("Error when executing command")
@@ -123,8 +124,7 @@ There is NO WARRANTY, to the extent permitted by law.`, version, commit, buildDa
 func run(definition pdef, logs logs) {
 	initLog(logs, definition.entropy)
 
-	// if the configuration file is present in the current directory
-	if sigo.Exist(definition.config) {
+	if sigo.Exist(definition.config) { // if the configuration file is present in the current directory
 		if err := definition.initConfig(); err != nil {
 			log.Err(err).Msg("Cannot load configuration definition from file")
 			log.Warn().Int("return", 1).Msg("End SIGO")
@@ -133,14 +133,10 @@ func run(definition pdef, logs logs) {
 	}
 
 	log.Info().
-		Str("configuration", definition.config).
-		Int("k-anonymity", definition.k).
-		Int("l-diversity", definition.l).
-		Strs("Quasi-Identifiers", definition.qi).
-		Strs("Sensitive", definition.sensitive).
-		Str("Method", definition.method).
-		Str("Cluster-Info", logs.info).
-		Msg("Start SIGO")
+		Str("configuration", definition.config).Int("k-anonymity", definition.k).
+		Int("l-diversity", definition.l).Strs("Quasi-Identifiers", definition.qi).
+		Strs("Sensitive", definition.sensitive).Str("Method", definition.method).
+		Str("Cluster-Info", logs.info).Msg("Start SIGO")
 
 	source, err := infra.NewJSONLineSource(os.Stdin, definition.qi, definition.sensitive)
 	if err != nil {
@@ -165,8 +161,19 @@ func run(definition pdef, logs logs) {
 		cpuProfiler = profile.Start(profile.ProfilePath("."))
 	}
 
+	methodName := []string{
+		"NoAnonymizer", "general", "meanAggregation", "medianAggregation", "outlier",
+		"laplaceNoise", "gaussianNoise", "swapping", "reidentification",
+	}
+
+	if !sigo.Find(methodName, definition.method) {
+		log.Error().Str("method", definition.method).Msg("Unknown anonymization method")
+		log.Warn().Int("return", 1).Msg("End SIGO")
+		os.Exit(1)
+	}
+
 	err = sigo.Anonymize(source, sigo.NewKDTreeFactory(), definition.k, definition.l,
-		len(definition.qi), newAnonymizer(definition.method), sink, debugger)
+		len(definition.qi), newAnonymizer(definition.method, definition.args), sink, debugger)
 	if err != nil {
 		panic(err)
 	}
@@ -226,7 +233,8 @@ func initLog(logs logs, entropy bool) {
 	log.Info().Msgf("%v %v (commit=%v date=%v by=%v)", name, version, commit, buildDate, builtBy)
 }
 
-func newAnonymizer(name string) sigo.Anonymizer {
+//nolint: cyclop
+func newAnonymizer(name string, args []string) sigo.Anonymizer {
 	switch name {
 	case "general":
 		return sigo.NewGeneralAnonymizer()
@@ -242,6 +250,14 @@ func newAnonymizer(name string) sigo.Anonymizer {
 		return sigo.NewNoiseAnonymizer("gaussian")
 	case "swapping":
 		return sigo.NewSwapAnonymizer()
+	case "reidentification":
+		if len(args) == 0 {
+			log.Error().Msg("The list of arguments is empty")
+			log.Warn().Int("return", 1).Msg("End SIGO")
+			os.Exit(1)
+		}
+
+		return sigo.NewReidentification(args)
 	default:
 		return sigo.NewNoAnonymizer()
 	}

diff --git a/examples/cars/README.md b/examples/cars/README.md
@@ -45,11 +45,14 @@ To calculate the correlation between each variable of the dataset we use the pea
 Pearson correlation measures the strength of the linear relationship between two continuous variables. It has a value between -1 to 1, with a value of -1 meaning a total negative linear correlation, 0 being no correlation, and + 1 meaning a total positive correlation.
 
 Pearson Correlation Coefficient :
-![equation](https://latex.codecogs.com/svg.image?%5Crho(x,y)%20=%20%5Cfrac%7B%5Csum%20%5Cleft%20%5B%20%5Cleft%20(%20x_%7Bi%7D%20-%20%5Cbar%7Bx%7D%20%5Cright%20)%20*%20%5Cleft%20(%20y_%7Bi%7D%20-%20%5Cbar%7By%7D%20%5Cright%20)%20%20%5Cright%20%5D%7D%7B%5Csigma_%7Bx%7D%20*%20%5Csigma_%7By%7D%7D)
+$$ \rho \left( x, y \right) = \frac{\sum_{i=1}^{n} \left[ \left( x_i - \overline{x} \right) \times \left( y_i - \overline{y} \right) \right]}{n \times \sigma_x \times \sigma_y} $$
 
 With,
 
-![equation](https://latex.codecogs.com/svg.image?%5Cinline%20%5C%5C%5Cbar%7Bx%7D%20%5Ctext%7B%20:%20mean%20of%20x%20variable.%7D%20%5C%5C%5Cbar%7By%7D%20%5Ctext%7B%20:%20mean%20of%20y%20variable.%7D%20%5C%5C%5Csigma_x%20%5Ctext%7B%20:%20standart%20deviation%20of%20x%20variable.%7D%20%5C%5C%5Csigma_y%20%5Ctext%7B%20:%20standart%20deviation%20of%20y%20variable.%7D)
+$$ \begin{aligned} \overline{x} &: \text{mean of the } x \text{ variable.} \\
+\overline{y} &: \text{mean of the } y \text{ variable.} \\
+\sigma_x &: \text{standard deviation of the } x \text{ variable.} \\
+\sigma_y &: \text{standard deviation of the } y \text{ variable.} \end{aligned} $$
 
 ```python
 import pandas as pd

diff --git a/examples/cars/micro-aggregation/README.md b/examples/cars/micro-aggregation/README.md
@@ -112,7 +112,7 @@ The data is recorded in the carsn.json file, below you will find an overview of
 | Weight_in_lbs    |     -0.927248    |  0.951283 |   0.985833   |  0.964514  |    1.000000   |   -0.500590  |
 | Acceleration     |     0.546557     | -0.501903 |   -0.528976  |  -0.669607 |   -0.500590   |   1.000000   |
 
-The correlation after anonymization is in the range ![equation](https://latex.codecogs.com/svg.image?%5Cinline%20%5Cleft%20%5B%20%5Cpm%200.003%20;%20%5Cpm%200.163%20%5Cright%20%5D)
+The correlation after anonymization is in the range $\left[ \pm 0.003; \pm 0.163\right]$.
 
 ### Bibliography
 

diff --git a/examples/cars/random-noise/README.md b/examples/cars/random-noise/README.md
@@ -161,7 +161,7 @@ masking:
 | Weight_in_lbs    |     -0.711799    |  0.665301 |   0.672666   |  0.638299  |    1.000000   |   -0.170356  |
 | Acceleration     |     0.269291     | -0.175330 |   -0.239672  |  -0.332941 |   -0.170356   |   1.000000   |
 
-The correlation after anonymization is in the range ![equation](https://latex.codecogs.com/svg.image?%5Cleft%20%5B%20%5Cpm%200.08%20;%20%5Cpm%200.36%20%5Cright%20%5D)
+The correlation after anonymization is in the range $\left[ \pm 0.08; \pm 0.36\right]$.
 
 ### Bibliography
 

diff --git a/examples/cars/top-bottom-coding/README.md b/examples/cars/top-bottom-coding/README.md
@@ -114,7 +114,7 @@ The data is recorded in the carsn.json file, below you will find an overview of
 | Weight_in_lbs    |     -0.896857    |  0.930259 |   0.963663   |  0.917956  |    1.000000   |   -0.507366  |
 | Acceleration     |     0.534477     | -0.554863 |   -0.567617  |  -0.714646 |   -0.507366   |   1.000000   |
 
-The correlation after anonymization is in the range ![equation](https://latex.codecogs.com/svg.image?%5Cinline%20%5Cleft%20%5B%20%5Cpm%200.021%20;%20%5Cpm%200.111%20%5Cright%20%5D)
+The correlation after anonymization is in the range $\left[ \pm 0.021; \pm 0.111\right]$.
 
 ### Bibliography
 

diff --git a/examples/re-identification/anonymized.json b/examples/re-identification/anonymized.json
@@ -0,0 +1,24 @@
+{"id":1,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":2,"x":3,"y":7,"z":"a","cluster":1}
+{"id":3,"x":3,"y":7,"z":"c","cluster":1}
+{"id":4,"x":3,"y":7,"z":"b","cluster":1}
+{"id":5,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":6,"x":7,"y":6.67,"z":"a","cluster":2}
+{"id":7,"x":4.33,"y":17.67,"z":"a","cluster":3}
+{"id":8,"x":8,"y":15.67,"z":"a","cluster":4}
+{"id":9,"x":4.33,"y":17.67,"z":"a","cluster":3}
+{"id":10,"x":4.33,"y":17.67,"z":"b","cluster":3}
+{"id":11,"x":8,"y":15.67,"z":"c","cluster":4}
+{"id":12,"x":8,"y":15.67,"z":"c","cluster":4}
+{"id":13,"x":16,"y":6,"z":"c","cluster":6}
+{"id":14,"x":16,"y":6,"z":"b","cluster":6}
+{"id":15,"x":12.33,"y":6,"z":"b","cluster":5}
+{"id":16,"x":12.33,"y":6,"z":"a","cluster":5}
+{"id":17,"x":16,"y":6,"z":"c","cluster":6}
+{"id":18,"x":12.33,"y":6,"z":"c","cluster":5}
+{"id":19,"x":19.67,"y":17.67,"z":"b","cluster":8}
+{"id":20,"x":16.67,"y":18.33,"z":"c","cluster":7}
+{"id":21,"x":19.67,"y":17.67,"z":"b","cluster":8}
+{"id":22,"x":16.67,"y":18.33,"z":"c","cluster":7}
+{"id":23,"x":16.67,"y":18.33,"z":"b","cluster":7}
+{"id":24,"x":19.67,"y":17.67,"z":"b","cluster":8}
diff --git a/examples/re-identification/anonymizedNoise.json b/examples/re-identification/anonymizedNoise.json
@@ -0,0 +1,24 @@
+{"id":1,"x":5.938120851374984,"y":4.533965986177158,"z":"a"}
+{"id":2,"x":2.680835241693462,"y":8.19425223836935,"z":"a"}
+{"id":3,"x":3.4851832584055824,"y":5.668937832579665,"z":"c"}
+{"id":4,"x":2.191579318404636,"y":5.768462278793522,"z":"b"}
+{"id":5,"x":7.849763705115755,"y":6.176648218859225,"z":"a"}
+{"id":6,"x":7.236515404399387,"y":4.889286506988984,"z":"a"}
+{"id":7,"x":3.5463258052174105,"y":16.62968221276086,"z":"a"}
+{"id":8,"x":7.166427916409467,"y":14.853870793985047,"z":"a"}
+{"id":9,"x":5.313935236579784,"y":18.671614394540253,"z":"a"}
+{"id":10,"x":3.525701395319161,"y":18.180632263551825,"z":"b"}
+{"id":11,"x":7.596790801515696,"y":14.387014064867332,"z":"c"}
+{"id":12,"x":8.371563137890895,"y":16.27598200732568,"z":"c"}
+{"id":13,"x":15.9401941749371,"y":5.703704179313746,"z":"c"}
+{"id":14,"x":15.912168810000395,"y":6.760290278787764,"z":"b"}
+{"id":15,"x":12.056370844747219,"y":7.593606438643265,"z":"b"}
+{"id":16,"x":12.399920711511685,"y":3.289488817191873,"z":"a"}
+{"id":17,"x":17.281844840528116,"y":5.019956467119892,"z":"c"}
+{"id":18,"x":11.670023549374136,"y":8.978933759942723,"z":"c"}
+{"id":19,"x":19.92395414411107,"y":16.401105715881105,"z":"b"}
+{"id":20,"x":14.35294166396883,"y":18.172668663775376,"z":"c"}
+{"id":21,"x":19.987901071971024,"y":19.864786326688186,"z":"b"}
+{"id":22,"x":17.20752528905433,"y":18.546736879113336,"z":"c"}
+{"id":23,"x":14.442379076836225,"y":18.044346789802102,"z":"b"}
+{"id":24,"x":19.476769170963276,"y":18.014504697069313,"z":"b"}