- This model is a GPT model trained to generate synthetic Azure logs. The approach can produce logs that are realistic enough for some downstream tasks, e.g. generating baseline training data or generating attack behavior to test detectors.
- To run this example, additional requirements must be installed into your environment. A supplementary requirements file has been provided in this example directory.
pip install -r requirements.txt
Architecture Type:
- Transformer
Network Architecture:
- GPT
Input Format:
- JSON
Input Parameters:
- Azure AD Logs
Other Properties Related to Input:
- N/A
Output Format:
- Text file with synthetic logs
Output Parameters:
- N/A
Other Properties Related to Output:
- N/A
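As a rough illustration of producing the text-file output described above, the sketch below keeps only the generated strings that parse as JSON objects and writes them one per line. The one-log-per-line layout, the filtering step, and the helper names `keep_valid_json` and `write_logs` are assumptions made for this example, not something the model card specifies.

```python
import json
from typing import Iterable, List


def keep_valid_json(lines: Iterable[str]) -> List[str]:
    """Keep only strings that parse as JSON objects (hypothetical helper)."""
    valid = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Truncated or malformed generations are dropped.
            continue
        if isinstance(record, dict):
            # Re-serialize so each log occupies exactly one line.
            valid.append(json.dumps(record))
    return valid


def write_logs(path: str, logs: Iterable[str]) -> None:
    """Write one synthetic log per line (an assumed layout) to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for log in logs:
            handle.write(log + "\n")


# Made-up generations: one complete object, one truncated, one blank, one non-object.
generated = ['{"a": 1}', '{"a": ', "", "[1, 2]"]
cleaned = keep_valid_json(generated)
```

Calling `write_logs("synthetic_logs.txt", cleaned)` would then produce the text file; the file name is only a placeholder.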
Runtime(s):
- Morpheus
Supported Hardware Platform(s):
- Ampere/Turing
Supported Operating System(s):
- Linux
- v1
Link:
Properties (Quantity, Dataset Descriptions, Sensor(s)):
- 3239 Azure AD logs
Dataset License:
Link:
- N/A
Properties (Quantity, Dataset Descriptions, Sensor(s)):
- N/A
Dataset License:
- N/A
Engine:
- N/A
Test Hardware:
- A100
- Not Applicable
- Not Applicable
- Not Applicable
- English: 100%
- Not Applicable
- Not Applicable
- Not Applicable
- Not Applicable
- Not Applicable
- Not Applicable
Individuals from the following adversely impacted (protected classes) groups participate in model design and testing.
- Not Applicable
- Not Applicable
- The model is primarily designed for testing purposes and serves as a small pre-trained model used to generate Azure AD logs.
- This model is intended for developers who want to build a GPT-based synthetic log generator.
- The intended beneficiaries of this model are developers who aim to generate synthetic Azure logs.
- The model outputs synthetic Azure AD logs.
- This model is an example of a GPT model. It requires raw log messages as input for training and a prompt for inference. The model is trained as shown in the training notebook. During inference, the trained model is prompted with the first key of the log type and generates synthetic logs.
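The inference step described above (prompting with the first key of a log) could be sketched roughly as follows. This is a minimal illustration, not the code from the training notebook: `build_prompt`, `generate_logs`, and `stub_generate` are hypothetical names, the log fields are invented for the example, and the stub stands in for the trained GPT model so the snippet runs on its own.

```python
import json
from typing import Callable, List


def build_prompt(raw_log: str) -> str:
    """Return the opening of a JSON log up to and including its first key."""
    record = json.loads(raw_log)
    first_key = next(iter(record))  # JSON key order is preserved by json.loads
    return "{" + json.dumps(first_key) + ": "


def generate_logs(prompt: str, generate: Callable[[str], str], count: int) -> List[str]:
    """Call a text generator `count` times with the same prompt."""
    return [generate(prompt) for _ in range(count)]


def stub_generate(prompt: str) -> str:
    # Placeholder for the trained model: completes the prompt with fixed,
    # made-up field values so the example is runnable without a model.
    return prompt + '"2022-01-01T00:00:00Z", "category": "SignInLogs"}'


# Hypothetical raw log; real Azure AD logs contain many more fields.
example_log = '{"time": "2022-08-01T12:00:00Z", "category": "SignInLogs"}'
prompt = build_prompt(example_log)
synthetic = generate_logs(prompt, stub_generate, count=2)
```

In practice, `stub_generate` would be replaced by a call into the trained model's text-generation routine, and sampling would make each generated log differ.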
Name the adversely impacted groups (protected classes) this has been tested to deliver comparable outcomes regardless of:
- Not Applicable
- This model is trained on synthetic logs for demonstration purposes. Separate training is needed for other log types.
- Intact raw logs
- N/A
- None
- No
- N/A
- No
- This model is provided as an example of synthetic log generation. Users can create their own models for their use cases and downstream tasks.
- It was trained on a small dataset, mainly for demonstration purposes.
- No
- N/A
- No
- No
- No
- No
- No
- Neither
- N/A
Protected classes used to create this model? (The following were used in the model's training:)
- N/A
- The dataset is initially reviewed upon addition, and subsequent reviews are conducted as needed or upon request for any changes.
- N/A
- N/A
- N/A
- No
- Yes
- N/A
Is data compliant with data subject requests for data correction or removal, if such a request was made?
- N/A