Initial public version (#1)
* Public initial version

* Initial readme
abdolence authored Aug 4, 2024
1 parent 01b9d29 commit 381c66a
Showing 23 changed files with 3,060 additions and 0 deletions.
21 changes: 21 additions & 0 deletions .github/workflows/security-audit.yml
@@ -0,0 +1,21 @@
name: security audit
on:
  push:
    paths:
      - '**/Cargo.toml'
      - '**/Cargo.lock'
  schedule:
    - cron: '5 4 * * 6'
concurrency:
  group: ${{ github.workflow }}-${{ github.ref_protected && github.run_id || github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  security_audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: stable
          components: rustfmt, clippy
      - run: cargo install cargo-audit && cargo audit || true && cargo audit
45 changes: 45 additions & 0 deletions .github/workflows/tests.yml
@@ -0,0 +1,45 @@
name: tests & formatting
on:
  push:
    branches:
      - master
  pull_request:
  workflow_dispatch:
env:
  GCP_PROJECT: latestbit
  GCP_PROJECT_ID: 288860578009
concurrency:
  group: ${{ github.workflow }}-${{ github.ref_protected && github.run_id || github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: 'read'
      id-token: 'write'
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: stable
          components: rustfmt, clippy
      - name: Authenticate to Google Cloud development
        id: auth
        uses: google-github-actions/auth@v2
        if: github.ref == 'refs/heads/master'
        with:
          workload_identity_provider: 'projects/${{ env.GCP_PROJECT_ID }}/locations/global/workloadIdentityPools/lb-github-identity-pool/providers/lb-github-identity-pool-provider'
          service_account: 'lb-github-service-account@${{ env.GCP_PROJECT }}.iam.gserviceaccount.com'
          create_credentials_file: true
          access_token_lifetime: '240s'
      - name: 'Set up Cloud SDK'
        uses: google-github-actions/setup-gcloud@v2
        if: github.ref == 'refs/heads/master'
      - name: 'Checking formatting and clippy'
        run: cargo fmt -- --check && cargo clippy -- -Dwarnings
      - name: 'Run tests without access to GCP'
        run: cargo test
        if: github.ref != 'refs/heads/master'
      - name: 'Run all tests'
        run: cargo test --features "ci-gcp"
        if: github.ref == 'refs/heads/master'
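The workflow above runs plain `cargo test` off `master` and `cargo test --features "ci-gcp"` on `master`. This commit excerpt does not show how the crate consumes that feature, so the following is only a minimal sketch of one common way to gate cloud-dependent tests behind a Cargo feature. The names `gcp_tests_enabled` and `gated_tests` are hypothetical and do not come from the repository.

```rust
/// Hypothetical helper: reports whether the crate was compiled with the
/// `ci-gcp` Cargo feature (declared in Cargo.toml as `ci-gcp = []`).
pub fn gcp_tests_enabled() -> bool {
    cfg!(feature = "ci-gcp")
}

#[cfg(test)]
mod gated_tests {
    // Compiled only when `--features ci-gcp` is passed to `cargo test`,
    // so it never runs on branches without GCP credentials.
    #[cfg(feature = "ci-gcp")]
    #[test]
    fn needs_gcp_access() {
        // A test that calls real GCP APIs would go here (assumption).
        assert!(super::gcp_tests_enabled());
    }

    // Always compiled; requires no cloud access.
    #[test]
    fn runs_everywhere() {
        assert_eq!(2 + 2, 4);
    }
}
```

With this layout, a pull request build compiles and runs only `runs_everywhere`, while a build on `master` would also compile `needs_gcp_access`.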
8 changes: 8 additions & 0 deletions .gitignore
@@ -0,0 +1,8 @@
/target/
Cargo.lock
**/*.rs.bk
.idea/
*.tmp
*.orig
*.swp
tmp/
76 changes: 76 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,76 @@
# Contributing

Welcome! Please read this document to understand what you can do:
* [Analyze Issues](#analyze-issues)
* [Report an Issue](#report-an-issue)
* [Contribute Code](#contribute-code)

## Analyze Issues

Analyzing issue reports can be a lot of effort. Any help is welcome!
Go to the GitHub issue tracker and find an open issue which needs additional work or a bugfix (e.g. issues labeled with "help wanted" or "bug").
Additional work could include further information, a gist, or a hint that helps in understanding the issue.

## Report an Issue

If you find a bug - you are welcome to report it.
You can go to the GitHub issue tracker to report the issue.

### Quick Checklist for Bug Reports

Issue report checklist:
* Real, current bug for the latest/supported version
* No duplicate
* Reproducible
* Minimal example

### Issue handling process

When an issue is reported, a committer will look at it and either confirm it as a real issue, close it if it is not an issue, or ask for more details.
An issue that is about a real bug is closed as soon as the fix is committed.


### Reporting Security Issues

If you find or suspect a security issue, please act responsibly and do not report it in the public issue tracker, but directly to us, so we can fix it before it can be exploited.
For details please check our [Security policy](SECURITY.md).

## Contribute Code

You are welcome to contribute code in order to fix bugs or to implement new features.

There are two important things to know:

1. You must be aware of the Apache License (which describes contributions) and **agree to the Contributor License Agreement**. This is common practice in all major Open Source projects.
   The same terms apply to company contributors. See the respective section below for details.
2. **Not all proposed contributions can be accepted**. Some features may e.g. just fit a third-party add-on better. The code must fit the overall direction and really improve it. The more effort you invest, the better you should clarify in advance whether the contribution fits: the best way would be to just open an issue to discuss the feature you plan to implement (make it clear you intend to contribute).

### Contributor License Agreement

When you contribute (code, documentation, or anything else), you have to be aware that your contribution is covered by the same [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

This applies to all contributors, including those contributing on behalf of a company.

### Contribution Content Guidelines

These are some of the rules we try to follow:

- Apply a clean coding style adapted to the surrounding code, even though we are aware the existing code is not fully clean
- Use variable naming conventions like in the other files you are seeing
- No println() - use logging service if needed
- Comment your code where it gets non-trivial
- Keep an eye on performance and memory consumption, properly destroy objects when not used anymore
- Avoid incompatible changes if possible, especially do not modify the name or behavior of public API methods or properties

### How to contribute - the Process

1. Make sure the change would be welcome (e.g. a bugfix or a useful feature); best do so by proposing it in a GitHub issue
2. Fork the repository, create a branch, and make your change
3. Commit and push your changes on that branch
4. In the commit message:
   - Describe the problem you fix with this change.
   - Describe the effect that this change has from a user's point of view. App crashes and lockups are pretty convincing, for example, but not all bugs are that obvious, so the effect should be mentioned in the text.
   - Describe the technical details of what you changed. It is important to describe the change in the most understandable way so the reviewer is able to verify that the code behaves as you intend.
5. Create a Pull Request
6. Once the change has been approved, we will inform you in a comment
7. We will close the pull request; feel free to delete the now-obsolete branch
13 changes: 13 additions & 0 deletions COPYRIGHT.md
@@ -0,0 +1,13 @@
Copyright 2022 Abdulla Abdurakhmanov ([email protected])

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
50 changes: 50 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
[package]
name = "redacter"
version = "0.1.0"
edition = "2021"
authors = ["Abdulla Abdurakhmanov <[email protected]>"]
license = "Apache-2.0"
homepage = "https://github.com/abdolence/redacter-rs"
repository = "https://github.com/abdolence/redacter-rs"
documentation = "https://docs.rs/redacter"
readme = "README.md"
include = ["Cargo.toml", "src/**/*.rs", "README.md", "LICENSE"]
rust-version = "1.77.0"

[features]
default = []
ci-gcp = [] # For testing on CI/GCP
ci-aws = [] # For testing on CI/AWS
ci = ["ci-gcp", "ci-aws"]


[dependencies]
rsb_derive = "0.5"
rvstruct = "0.3"
chrono = { version = "0.4", features = ["serde"] }
serde = { version = "1.0", features = ["derive"] }
console = { version = "0.15" }
indicatif = { version = "0.17" }
clap = { version = "4.1", features = ["derive"] }
tokio = { version = "1.14", features = ["fs", "rt-multi-thread", "sync", "rt", "macros"] }
tokio-util = { version = "0.7", features = ["compat"] }
gcloud-sdk = { version = "0.25.4", features = ["google-privacy-dlp-v2", "google-rest-storage-v1"] }
futures = "0.3"
sha2 = "0.10"
async-trait = "0.1"
hex = "0.4"
thiserror = "1"
sync_wrapper = { version = "1", features = ["futures"] }
async-recursion = "1"
mime = "0.3"
mime_guess = "2"
zip = "2"
globset = "0.4"
tempfile = "3"
csv-async = { version = "1", default-features = false, features = ["tokio", "tokio-stream"] }
aws-config = { version = "1", features = ["behavior-version-latest"] }
aws-sdk-s3 = { version = "1" }


[dev-dependencies]
cargo-husky = { version = "1.5", default-features = false, features = ["run-for-all", "prepush-hook", "run-cargo-fmt"] }
48 changes: 48 additions & 0 deletions README.md
@@ -0,0 +1,48 @@
[![Cargo](https://img.shields.io/crates/v/redacter.svg)](https://crates.io/crates/redacter)
![tests and formatting](https://github.com/abdolence/redacter-rs/workflows/tests%20&amp;%20formatting/badge.svg)
![security audit](https://github.com/abdolence/redacter-rs/workflows/security%20audit/badge.svg)

# Redacter

Redacter is a command-line tool that securely copies and redacts files across various sources and destinations,
using Data Loss Prevention (DLP) capabilities.

## Features

* **Copy & Redact:** copy files while applying DLP redaction to protect sensitive information.
* **Multiple Sources & Destinations:** interact with:
  * Local filesystem
  * Google Cloud Storage (GCS)
  * Amazon Simple Storage Service (S3)
  * Zip files
* **GCP DLP Integration:** Leverage the power of GCP's DLP API for accurate and customizable redaction.
* **CLI:** Easy-to-use command-line interface for streamlined workflows.
* Built with Rust to ensure speed, safety, and reliability.

## Installation

**Cargo:**

```sh
cargo install redacter
```

## Command line options

TBD

## Google authentication

The tool looks for credentials in the following places, preferring the first location found:

- A JSON file whose path is specified by the GOOGLE_APPLICATION_CREDENTIALS environment variable.
- A JSON file in a location known to the gcloud command-line tool using `gcloud auth application-default login`.
- On Google Compute Engine, it fetches credentials from the metadata server.
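The lookup order above can be sketched as a simple first-match chain. This is an illustration only, not the actual logic of the `gcloud-sdk` crate: the `CredentialSource` enum and `pick_credential_source` function are hypothetical, and the caller is assumed to have already probed the environment variable, the gcloud file location, and the Compute Engine environment.

```rust
use std::path::PathBuf;

/// Hypothetical description of where credentials were found.
#[derive(Debug, PartialEq)]
pub enum CredentialSource {
    /// JSON file named by GOOGLE_APPLICATION_CREDENTIALS.
    EnvFile(PathBuf),
    /// JSON file written by `gcloud auth application-default login`.
    GcloudDefaultFile(PathBuf),
    /// Metadata server on Google Compute Engine.
    MetadataServer,
}

/// Returns the first available source, in the order the README lists them.
pub fn pick_credential_source(
    env_path: Option<PathBuf>,
    gcloud_default_file: Option<PathBuf>,
    on_compute_engine: bool,
) -> Option<CredentialSource> {
    env_path
        .map(CredentialSource::EnvFile)
        .or_else(|| gcloud_default_file.map(CredentialSource::GcloudDefaultFile))
        .or_else(|| on_compute_engine.then_some(CredentialSource::MetadataServer))
}
```

A caller could pass `std::env::var_os("GOOGLE_APPLICATION_CREDENTIALS").map(PathBuf::from)` as the first argument. A `None` result means no credentials were found in any of the three places.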

## Licence

Apache Software License (ASL)

## Author

Abdulla Abdurakhmanov
9 changes: 9 additions & 0 deletions SECURITY.md
@@ -0,0 +1,9 @@
# Security Policy

## Reporting a Vulnerability

Please follow the general guidelines defined here:
https://cheatsheetseries.owasp.org/cheatsheets/Vulnerability_Disclosure_Cheat_Sheet.html

## Contacts
E-mail: [email protected]
118 changes: 118 additions & 0 deletions src/args.rs
@@ -0,0 +1,118 @@
use crate::common_types::GcpProjectId;
use crate::errors::AppError;
use crate::redacters::{GcpDlpRedacterOptions, RedacterOptions, RedacterProviderOptions};
use clap::*;
use std::fmt::Display;

#[derive(Parser, Debug)]
#[command(author, about)]
pub struct CliArgs {
    #[command(subcommand)]
    pub command: CliCommand,
}

#[derive(Subcommand, Debug)]
pub enum CliCommand {
    #[command(about = "Copy and redact files from source to destination")]
    Cp {
        #[arg(
            help = "Source directory or file such as /tmp, /tmp/file.txt or gs://bucket/file.txt and other supported providers"
        )]
        source: String,
        #[arg(
            help = "Destination directory or file such as /tmp, /tmp/file.txt or gs://bucket/file.txt and other supported providers"
        )]
        destination: String,
        #[arg(short = 'm', long, help = "Maximum size of files to copy in bytes")]
        max_size_limit: Option<u64>,
        #[arg(
            short = 'f',
            long,
            help = "Filter by name using glob patterns such as *.txt"
        )]
        filename_filter: Option<globset::Glob>,

        #[command(flatten)]
        redacter_args: Option<RedacterArgs>,
    },
}

#[derive(ValueEnum, Debug, Clone)]
pub enum RedacterType {
    GcpDlp,
}

impl std::str::FromStr for RedacterType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "gcp-dlp" => Ok(RedacterType::GcpDlp),
            _ => Err(format!("Unknown redacter type: {}", s)),
        }
    }
}

impl Display for RedacterType {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            RedacterType::GcpDlp => write!(f, "gcp-dlp"),
        }
    }
}

#[derive(Args, Debug, Clone)]
#[group(required = false)]
pub struct RedacterArgs {
    #[arg(short = 'd', long, value_enum, help = "Redacter type")]
    redact: Option<RedacterType>,

    #[arg(
        long,
        help = "GCP project id that will be used to redact and bill API calls"
    )]
    pub gcp_project_id: Option<GcpProjectId>,

    #[arg(
        long,
        help = "Allow unsupported types to be copied without redaction",
        default_value = "false"
    )]
    pub allow_unsupported_copies: bool,

    #[arg(
        long,
        help = "Disable CSV headers (if they are not present)",
        default_value = "false"
    )]
    pub csv_headers_disable: bool,

    #[arg(long, help = "CSV delimiter (default is ',')")]
    pub csv_delimiter: Option<char>,
}

impl TryInto<RedacterOptions> for RedacterArgs {
    type Error = AppError;

    fn try_into(self) -> Result<RedacterOptions, Self::Error> {
        let provider_options = match self.redact {
            Some(RedacterType::GcpDlp) => match self.gcp_project_id {
                Some(project_id) => Ok(RedacterProviderOptions::GcpDlp(GcpDlpRedacterOptions {
                    project_id,
                })),
                None => Err(AppError::RedacterConfigError {
                    message: "GCP project id is required for GCP DLP redacter".to_string(),
                }),
            },
            None => Err(AppError::RedacterConfigError {
                message: "Redacter type is required".to_string(),
            }),
        }?;
        Ok(RedacterOptions {
            provider_options,
            allow_unsupported_copies: self.allow_unsupported_copies,
            csv_headers_disable: self.csv_headers_disable,
            csv_delimiter: self.csv_delimiter.map(|c| c as u8),
        })
    }
}
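The conversion above maps the delimiter with `c as u8`. For a `char` outside the single-byte range, that cast keeps only the low byte of the code point instead of reporting an error. A minimal sketch of a checked alternative follows. The helper name `delimiter_to_byte` and the `String` error type are hypothetical, and restricting delimiters to ASCII is an assumption about what the CSV writer accepts, not something this commit states.

```rust
/// Hypothetical helper: convert a CSV delimiter to a single byte,
/// rejecting characters that do not fit instead of truncating them.
pub fn delimiter_to_byte(c: char) -> Result<u8, String> {
    if c.is_ascii() {
        // Lossless: every ASCII character fits in one byte.
        Ok(c as u8)
    } else {
        Err(format!(
            "CSV delimiter must be a single ASCII character, got {c:?}"
        ))
    }
}
```

In `try_into`, such a helper could be applied with `self.csv_delimiter.map(delimiter_to_byte).transpose()`, with the error mapped into `AppError::RedacterConfigError` so that an invalid delimiter is reported like the other configuration errors.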