Update limitations of event generator tool #5842

Open
wants to merge 7 commits into
base: enhacement/24586-benchmark-tests

Conversation

santipadilla
Member

Description

Related issue.

This PR adds information about the limitations of the event generator tool.

It covers event volume limitations, the maximum generation rate, the number of parallel event threads, other potential limitations, and the results of the testing performed.

@santipadilla santipadilla self-assigned this Nov 19, 2024
Member

@rafabailon rafabailon left a comment

LGTM

Member

@Rebits Rebits left a comment

Good job, but some changes are required.

Comment on lines 91 to 110
## Testing performed

| Operations | Rate (EPS) | Expected Time (s) | Observed Time (s) | Difference (s) |
|------------|------------|-------------------|-------------------|----------------|
| 1,000 | 100 | 10 | 10.041 | +0.041 |
| 5,000 | 500 | 10 | 10.048 | +0.048 |
| 10,000 | 1,000 | 10 | 10.042 | +0.042 |
| 20,000 | 1,000 | 20 | 20.039 | +0.039 |
| 50,000 | 2,000 | 25 | 25.074 | +0.074 |
| 100,000 | 5,000 | 20 | 20.047 | +0.047 |
| 100,000 | 10,000 | 10 | 10.035 | +0.035 |
| 200,000 | 20,000 | 10 | 10.057 | +0.057 |
| 250,000 | 25,000 | 10 | 10.482 | +0.482 |
| 300,000 | 30,000 | 10 | 12.718 | +2.718 |
| 500,000 | 50,000 | 10 | 16.229 | +6.229 |
| 1,000,000 | 100,000 | 10 | 33.802 | +23.802 |

- At configured rates up to 20,000 EPS, the observed execution times are within 0.1 s of the expected times.
- At 25,000 EPS the deviation grows to about 0.5 s, and from 30,000 EPS upward the observed times clearly exceed the expected times (+2.718 s at 30,000 EPS, +23.802 s at 100,000 EPS).
- The deviation increases as both the operations and the rate increase, indicating system limitations.
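
The figures in the table can be cross-checked with a few lines of Python. This is only a sketch: the helper names `expected_time` and `effective_eps` are made up for illustration and are not part of the tool. It takes the four rows where the observed time drifts and derives the rate that was actually achieved, which stays roughly between 24,000 and 31,000 EPS regardless of the configured rate.

```python
# Rows copied from the table above: (operations, configured EPS, observed seconds).
RESULTS = [
    (250_000, 25_000, 10.482),
    (300_000, 30_000, 12.718),
    (500_000, 50_000, 16.229),
    (1_000_000, 100_000, 33.802),
]


def expected_time(operations: int, rate: int) -> float:
    """Ideal run time in seconds if the configured rate were met exactly."""
    return operations / rate


def effective_eps(operations: int, observed_seconds: float) -> float:
    """Average rate actually achieved over the whole run."""
    return operations / observed_seconds


for operations, rate, observed in RESULTS:
    delta = observed - expected_time(operations, rate)
    achieved = effective_eps(operations, observed)
    print(f"{rate:>7} EPS configured -> {achieved:,.0f} EPS achieved (+{delta:.3f} s)")
```

Reading the results as achieved EPS rather than as a time difference makes it easier to compare runs on different hosts.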
Member

@Rebits Rebits Nov 22, 2024

What operating systems and hardware have been tested so far? We need to determine the minimum requirements to ensure the tool performs effectively in real-world scenarios.

Additionally, the table could be updated to reflect EPS (Events Per Second) based on different host resource configurations, providing clear information about potential limitations.

Member Author

Fixed here.

Comment on lines 24 to 89
# Limitations

## Event volume limitations

### Configured operations limitation

- Each event generator instance is limited by the operations parameter, which specifies the total number of events to generate before stopping.
- For example, if operations is set to 5, the generator will produce 5 events and then stop. The event volume per generator is therefore capped
by this parameter.

### System resource limitations

- Disk space: For logEventGenerator, which simulates log file entries with rotation, the total volume of logs generated is limited by disk space. If the logs are large or if log rotation occurs frequently, disk space may become a limiting factor.

- File system limits: For syscheckEventGenerator, which simulates file system operations (create, modify, delete), creating a large number of files can hit file system limits.
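
A rough pre-flight check for the disk space limitation could look like the sketch below. The function name `fits_on_disk` and the average event size are assumptions made for illustration; the estimate also ignores log rotation, so it is a worst case in which every generated event stays on disk.

```python
import shutil


def fits_on_disk(operations: int, avg_event_bytes: int,
                 path: str = ".", reserve_bytes: int = 0) -> bool:
    """Return True if the estimated log volume fits in the free space at `path`.

    Assumption: every event is kept on disk (no rotation or cleanup).
    """
    needed = operations * avg_event_bytes
    free = shutil.disk_usage(path).free  # free bytes on the file system holding `path`
    return needed + reserve_bytes <= free


# Hypothetical run: one million events of about 200 bytes each, keeping 1 GiB spare.
print(fits_on_disk(1_000_000, 200, reserve_bytes=1024 ** 3))
```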

## Maximum generation rate

- The rate parameter specifies the number of events per second.
- The start() method in EventGenerator calculates the sleep time between events to maintain the specified rate.
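
The relationship between the rate, the operations count, and the sleep time can be sketched as follows. This is not the tool's actual start() implementation: `run_generator`, `emit`, and the injectable `sleep` are hypothetical names, and the sketch assumes a fixed pause of 1/rate seconds after each event.

```python
import time


def run_generator(operations: int, rate: float,
                  emit=lambda index: None, sleep=time.sleep) -> int:
    """Emit `operations` events, pausing 1/rate seconds after each one."""
    interval = 1.0 / rate  # seconds between events for the requested EPS
    emitted = 0
    for index in range(operations):
        emit(index)      # stand-in for writing a log line or touching a file
        emitted += 1
        sleep(interval)  # assumption: fixed pause, no compensation for emit() cost
    return emitted


# 5 operations at 100 EPS: five events, then the generator stops.
print(run_generator(5, 100))
```

Because the pause does not account for the time spent in `emit`, a loop like this always runs slightly slower than the configured rate, and the gap widens as the interval shrinks.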

### Processing overhead

- Each event involves processing tasks such as file I/O and data formatting.
- At very high rates, the processing time per event may exceed the interval calculated for the desired rate, leading to lower actual rates.
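
The effect of processing overhead can be illustrated with a simplified model. The function `achievable_rate` and the 40 microsecond cost are assumptions for the sake of the example, not measurements of the tool: a single thread cannot emit events faster than one per processing time, so the achievable rate is capped at the inverse of that time.

```python
def achievable_rate(target_rate: float, per_event_seconds: float) -> float:
    """Return the EPS one thread can sustain under a simplified model.

    Assumption: the generator sleeps only for whatever remains of the
    1/target_rate interval after the event's own processing time.
    """
    if per_event_seconds <= 0:
        return target_rate
    return min(target_rate, 1.0 / per_event_seconds)


# Hypothetical cost of 40 microseconds per event (not a measured value).
print(achievable_rate(1_000, 40e-6))    # interval (1 ms) is longer than the cost
print(achievable_rate(100_000, 40e-6))  # cost is longer than the interval (10 us)
```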

### System performance

- High rates may lead to increased CPU usage, potentially causing system slowdowns.
- Rapid file writes and modifications can saturate disk I/O bandwidth.

## Number of parallel event threads

- Each event generator instance runs in its own thread.
- The main script initiates and manages these threads.
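
The one-thread-per-generator model can be sketched with the standard `threading` module. The helper `run_in_parallel` is a hypothetical name, and the callables stand in for each generator's start() call; the real main script may manage its threads differently.

```python
import threading


def run_in_parallel(workers) -> int:
    """Start one thread per callable, wait for all of them, return the count."""
    threads = [threading.Thread(target=worker) for worker in workers]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # block until every generator has finished
    return len(threads)


# Three stand-in generators that each just report that they ran.
done = []
print(run_in_parallel([lambda: done.append("ran")] * 3), len(done))
```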

### Limitations

- System thread limits: The operating system limits the maximum number of threads that can be created.
- Resource consumption: Each thread consumes system resources (memory and CPU time).
- Context switching overhead: A high number of threads can lead to increased context switching, reducing overall performance.

### Recommendations

- Keep the number of threads at a level that the system can handle efficiently.
- Monitor system performance to avoid overloading the CPU with excessive threading.

## Other potential limitations

### Disk I/O limitations

- The speed at which the disk can handle read/write operations may become a bottleneck.
- High-frequency file operations can lead to increased disk latency and I/O wait times.
- Solid-state drives (SSDs) offer faster I/O performance compared to hard disk drives (HDDs).

### CPU limitations

- The generation of events involves processing that consumes CPU resources.
- High rates and multiple threads can lead to high CPU utilization, impacting other system operations.

### Memory usage

- Each thread and its associated data structures consume memory.
- Prolonged operation with many threads may lead to increased memory usage.
Member

This documentation is too generic and does not adequately address the specific requirements for tool usage. To improve it, we need precise documentation that highlights real hardware limitations in practical environments. It should include actual data on the maximum EPS (Events Per Second) achieved under various resource configurations. This information will serve as a reference to determine whether the EPS usage in a given environment is feasible.

Member Author

Fixed here.

3 participants