Refactor #12

Closed
wants to merge 6 commits into from
11 changes: 7 additions & 4 deletions .gitignore
@@ -4,7 +4,10 @@ bin
*.log
.DS_Store

pkg/analyse/testdata/data*/data_output.yaml
pkg/analyse/testdata/benchmark/**/*.jsonl
pkg/analyse/testdata/benchmark/**/*.yaml
pkg/analyse/testdata/data2/data_output.yaml
testdata/data*/data_output.yaml

testdata/benchmark/**/*.jsonl
testdata/benchmark/**/*.yaml

test/suites/cli/rimo.schema.json
test/suites/testdata/data1/output/data.yaml
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -16,4 +16,4 @@ Types of changes

## [0.1.0]

- `Added` rimo analyse command
- Added `rimo analyse` command
649 changes: 649 additions & 0 deletions LICENSE.md

Large diffs are not rendered by default.

55 changes: 32 additions & 23 deletions README.md
@@ -3,50 +3,45 @@
## Description

Rimo contains tools that helps creating a *masking.yaml* for [PIMO](https://github.com/CGI-FR/PIMO).
It works as a 6 steps process :
![rimo steps](.github/img/rimo_steps.png "rimo steps")

<!-- It works as a 6 steps process : -->
<!-- ![rimo steps](.github/img/rimo_steps.png "rimo steps") -->
<!--
1. `gather` : orchestrate LINO to extract table of the database into a *.jsonl* file
2. `analyse` : extract meaningful information on database from *.jsonl*
3. `export` : dump data into an *Excel* file which serves as a configuration means
4. `import` : load, store and verify inputted data of the *Excel* file into a *.yaml* file
5. `build` : create a *pimo_masking.yaml* from *.yaml*
6. `script` : build a bash script to execute pipeline for PIMO
6. `script` : build a bash script to execute pipeline for PIMO -->

<!-- ## Installation
`rimo` command line work in relative project's directory, like `git` or `docker` -->

## Usage

### `rimo analyser`
### `rimo analyse`

```console
// to be defined
rimo analyse [inputDir] [outputDir]
```

- `input` : path to a directory or file containing to *jsonl* files
- `output` : optional path to output. If none, same name and directory as input. If directory, same name as input.

**input.jsonl** is a JSON single line that contains a pair of (column_name, value) for every row of the database table

**output.yaml** contain various metrics on table's columns and a small default configuration for PIMO. An example can be found in *src/unit_test/testcase_output.yaml*.
- `inputDir` : path to a directory containing *jsonl* files.
- `output` : path to a directory where *rimo.yaml* will be created.

### `rimo exporter`
**inputDir** must contain .jsonl files named basename_tablename.jsonl and respecting this format :

```console
// to be defined
```json
{"colName1": value1, "colName2": value2 }
{"colName1": value2, "colName2": value2 }
...
```

- `input` : path to a *yaml* file
- `output` : optional path to output. If none, same name and directory as input. If directory, same name as input.

**input.yaml** contains information on database generated by `analyser`
such files can be generated using [LINO](https://github.com/CGI-FR/LINO)

**output.xlsx** is an Excel file where input is dumped.
**outputDir** will generate basename.yaml in output directory containing various metrics. An example can be found in *testdata/data1/data_expected.yaml*.
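
The naming convention described above (`basename_tablename.jsonl` in, `basename.yaml` out) can be illustrated with a minimal Go sketch. `splitJSONLName` and the file name `shop_customers.jsonl` are hypothetical and not taken from rimo's code; splitting at the first underscore is also an assumption, since the README does not say how a basename containing underscores is handled.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errBadName = errors.New("file name does not match basename_tablename.jsonl")

// splitJSONLName is a hypothetical helper (not part of rimo's API).
// It splits "basename_tablename.jsonl" at the FIRST underscore, which is
// an assumption about how the convention is meant to be read.
func splitJSONLName(path string) (string, string, error) {
	name := filepath.Base(path)
	if filepath.Ext(name) != ".jsonl" {
		return "", "", fmt.Errorf("%w: %s", errBadName, name)
	}

	stem := strings.TrimSuffix(name, ".jsonl")

	base, table, found := strings.Cut(stem, "_")
	if !found || base == "" || table == "" {
		return "", "", fmt.Errorf("%w: %s", errBadName, name)
	}

	return base, table, nil
}

func main() {
	// Illustrative input path; any file following the convention works.
	base, table, err := splitJSONLName("input/shop_customers.jsonl")
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	// The output file would then be named after the basename only.
	fmt.Printf("base=%s table=%s output=%s\n", base, table, base+".yaml")
}
```

Under these assumptions, every table file sharing the same basename would contribute to a single `basename.yaml` in the output directory.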

## Tests

To implement : Venom tests
Run `neon test-int` to execute unit-test and Venom test.

## Project status

@@ -55,8 +50,22 @@ In active development
## Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
`testcase_data.jsonl` in *tests/data* is generated using Faker and does contain any real information.

## License

TODO
Copyright (C) 2023 CGI France

This file is part of RIMO.

RIMO is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

RIMO is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with RIMO. If not, see <http://www.gnu.org/licenses/>.
6 changes: 3 additions & 3 deletions build.yml
@@ -231,13 +231,13 @@ targets:
doc: "generate bench data"
depends: ["info"]
steps:
- $: ./pkg/analyse/testdata/benchmark/buildBenchData.sh
- $: ./testdata/benchmark/buildBenchData.sh

benchmark:
doc: "Run all benchmarks"
depends: ["info", "refresh", "lint", "bench-data"]
steps:
- $: go test -bench=. -benchmem -coverprofile=./={BUILD_DIR}/coverage.txt -covermode=atomic ./...
- $: go test -bench=. -benchmem -coverprofile=./={BUILD_DIR}/coverage_benchmark.txt -covermode=atomic ./...

test:
doc: "Run all tests with coverage"
@@ -256,7 +256,7 @@ targets:

test-int:
doc: "Run all integration tests"
depends: ["info", "refresh", "lint", "test", "release"]
depends: ["info", "refresh", "lint", "test", "benchmark", "release"]
steps:
- $: venom run test/suites/*

112 changes: 34 additions & 78 deletions cmd/rimo/main.go
@@ -1,12 +1,29 @@
// Copyright (C) 2023 CGI France
//
// This file is part of RIMO.
//
// RIMO is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// RIMO is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with RIMO. If not, see <http://www.gnu.org/licenses/>.

package main

import (
"errors"
"fmt"
"os"
"path/filepath"

"github.com/cgi-fr/rimo/pkg/analyse"
"github.com/cgi-fr/rimo/pkg/io"
"github.com/cgi-fr/rimo/pkg/model"
"github.com/rs/zerolog"
"github.com/rs/zerolog/log"
@@ -38,62 +55,47 @@ func main() { //nolint:funlen
}

rimoSchemaCmd := &cobra.Command{ //nolint:exhaustruct
Use: "schema",
Short: "Export rimo json schema",
Use: "jsonschema",
Short: "Return rimo jsonschema",
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
// Print current working directory
cwd, err := os.Getwd()
if err != nil {
log.Fatal().Msgf("error getting current working directory: %v", err)
}

err = model.ExportSchema()
jsonschema, err := model.GetJSONSchema()
if err != nil {
log.Fatal().Msgf("error generating rimo schema: %v", err)
os.Exit(1)
}

log.Info().Msgf("rimo schema successfully exported in %s", cwd)
fmt.Println(jsonschema)
},
}

rimoAnalyseCmd := &cobra.Command{ //nolint:exhaustruct
Use: "analyse [inputPath] [outputPath]",
Use: "analyse [inputDir] [outputDir]",
Short: "Generate a rimo.yaml from a directory of .jsonl files",
Args: cobra.ExactArgs(2), //nolint:gomnd
Run: func(cmd *cobra.Command, args []string) {
inputPath := args[0]
if err := CheckDir(inputPath); err != nil {
log.Fatal().Msgf("error checking input directory: %v", err)
}
inputDir := args[0]
outputDir := args[1]

outputPath := args[1]
if err := CheckDir(outputPath); err != nil {
log.Fatal().Msgf("error checking output directory: %v", err)
// List .jsonl files in input directory
if err := io.ValidateDirPath(inputDir); err != nil {
log.Fatal().Msgf("error validating input directory: %v", err)
}

// List .jsonl files in input directory
inputList, err := FilesList(inputPath, ".jsonl")
inputList, err := FilesList(inputDir, ".jsonl")
if err != nil {
log.Fatal().Msgf("error listing files: %v", err)
}
if len(inputList) == 0 {
log.Fatal().Msgf("no .jsonl files found in %s", inputPath)
}

// Output path : outpathPath + basename + .yaml
basename, err := analyse.GetBaseName(inputList[0])
if err != nil {
log.Fatal().Msgf("error getting basename: %v", err)
if len(inputList) == 0 {
log.Fatal().Msgf("no .jsonl files found in %s", inputDir)
}
outputPath = filepath.Join(outputPath, basename+".yaml")

err = analyse.Analyse(inputList, outputPath)
err = analyse.Orchestrator(inputList, outputDir)
if err != nil {
log.Fatal().Msgf("error generating rimo.yaml: %v", err)
}

log.Info().Msgf("Successfully generated rimo.yaml at %s", outputPath)
log.Info().Msgf("Successfully generated rimo.yaml in %s", outputDir)
},
}

@@ -106,52 +108,6 @@ func main() { //nolint:funlen
}
}

var (
ErrNotExist = errors.New("path does not exist")
ErrNotDir = errors.New("path is not a directory")
ErrNotFile = errors.New("path is not a file")
)

func CheckFile(path string) error {
// Get absPath
path, err := filepath.Abs(path)
if err != nil {
return fmt.Errorf("error getting absolute path: %w", err)
}

fileInfo, err := os.Stat(path)
// Check if the file exists
if os.IsNotExist(err) {
return fmt.Errorf("%w: %s", ErrNotExist, path)
}
// Check if the file is a regular file
if !fileInfo.Mode().IsRegular() {
return fmt.Errorf("%w: %s", ErrNotFile, path)
}

return nil
}

func CheckDir(path string) error {
// Get absPath
path, err := filepath.Abs(path)
if err != nil {
return fmt.Errorf("error getting absolute path: %w", err)
}

fileInfo, err := os.Stat(path)
// Check if path exists
if os.IsNotExist(err) {
return fmt.Errorf("%w: %s", ErrNotExist, path)
}
// Check if path is a directory
if !fileInfo.Mode().IsDir() {
return fmt.Errorf("%w: %s", ErrNotDir, path)
}

return nil
}

func FilesList(path string, extension string) ([]string, error) {
pattern := filepath.Join(path, "*"+extension)
