
Enhance performance statistics module #5049

Open
fdalmaup opened this issue Feb 29, 2024 · 0 comments
fdalmaup commented Feb 29, 2024

Description

The developed tool deps/wazuh_testing/wazuh_testing/tools/performance/statistic.py, which retrieves statistics from the Wazuh API, needs certain improvements to avoid malfunctions and errors in the fetched data.

  • API Token management
    The API authentication process is a child process of the API in charge of handling the authentication requests made to it; requests are queued before being attended and answered. If any request is delayed, its processing time carries over to the following requests, making the API susceptible to errors. This happened on a worker of a loaded environment (Compare wazuh-db stats in loaded env wazuh#21937 (comment)), in which the API received three consecutive calls:
2024/02/23 18:38:58 INFO: wazuh 127.0.0.1 "GET /security/user/authenticate" with parameters {} and body {} done in 200.024s: 500
2024/02/23 18:38:58 ERROR: JSON couldn't be loaded
2024/02/23 18:38:58 INFO: wazuh 127.0.0.1 "GET /security/user/authenticate" with parameters {} and body {} done in 209.895s: 500
2024/02/23 18:38:58 ERROR: JSON couldn't be loaded
2024/02/23 18:38:58 INFO: wazuh 127.0.0.1 "GET /security/user/authenticate" with parameters {} and body {} done in 205.954s: 500

The process could be enhanced by keeping in mind the time it takes for the JWT token to expire and requesting a new one only when needed.

import logging

import requests

# Try to get the authentication token, retrying up to max_retries times
for _ in range(max_retries):
    try:
        token_response = requests.get(API_URL + TOKEN_ENDPOINT, verify=False,
                                      auth=requests.auth.HTTPBasicAuth("wazuh", "wazuh"))
        if token_response.status_code == 200:
            break
        logging.error("Retrying get API data, status code {}".format(token_response.status_code))
    except requests.exceptions.RequestException as e:
        logging.error(f"Error getting token from API: {str(e)}")

# Fetch the daemon statistics, retrying up to max_retries times
for _ in range(max_retries):
    try:
        daemons_response = requests.get(API_URL + DAEMONS_ENDPOINT, verify=False,
                                        headers={'Authorization': 'Bearer ' + token_response.json()['data']['token']})
        if daemons_response.status_code == 200:
            break
    except requests.exceptions.RequestException as e:
        logging.error(f"Error fetching {self.daemon} data from API: {str(e)}")
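One way the token handling could be enhanced, as suggested above, is to cache the JWT and request a fresh one only when the cached one is about to expire. A minimal sketch (the `make_token_provider` helper and the `token_ttl` default are assumptions for illustration, not part of the tool; the real lifetime should match the API's token expiration setting):

```python
import time


def make_token_provider(fetch_token, token_ttl=900):
    """Wrap `fetch_token` (a callable returning a fresh JWT string) so that a
    new token is requested only when the cached one is about to expire.

    `token_ttl` is an assumed lifetime in seconds; in the real tool it should
    match the API's configured token expiration.
    """
    state = {'token': None, 'expiry': 0.0}

    def get_token():
        if state['token'] is None or time.time() >= state['expiry']:
            state['token'] = fetch_token()
            # Refresh slightly before the nominal expiry to avoid racing it.
            state['expiry'] = time.time() + token_ttl - 30
        return state['token']

    return get_token
```

Here `fetch_token` would wrap the `requests.get` call against the authentication endpoint shown above, so each statistics poll reuses the cached token instead of queueing a new authentication request.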

  • Retrieved statistics management and data visualization
    The tool contains a method to write the data collected from the .state files into a CSV file. It explicitly defines the fields to fetch from the obtained data and write to the CSV file, so it is prone to errors whenever a modification is made on the Core or Framework side. This must be analyzed so that the tool remains maintainable over time and robust to changes in the origin of the data.

    The data visualization script suffers a similar caveat: the plots are generated using hardcoded column names, and plotting fails if any of those names changes or a column is no longer present. We should find a way of generating the graphics dynamically to prevent the pipeline from failing every time there is a minimal modification to the statistics files.
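As a hedged sketch of the dynamic approach for the CSV side: derive the header from whatever keys the fetched statistics actually contain instead of a hardcoded field list (the `flatten` and `write_stats_csv` helpers and the column-naming scheme are illustrative assumptions, not the tool's real schema):

```python
import csv


def flatten(data, prefix=''):
    """Flatten nested statistics dicts into 'parent_child' column names."""
    flat = {}
    for key, value in data.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


def write_stats_csv(path, samples):
    """Write statistics samples to CSV, deriving columns from the data itself."""
    rows = [flatten(sample) for sample in samples]
    # Taking the union of all keys keeps the writer robust to fields being
    # added or removed between samples or releases.
    fieldnames = sorted({key for row in rows for key in row})
    with open(path, 'w', newline='') as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval='')
        writer.writeheader()
        writer.writerows(rows)
```

The visualization script could apply the same idea, iterating over whatever numeric columns the CSV exposes rather than a fixed list, so a renamed or removed statistic degrades a single plot instead of failing the pipeline.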

Sample artifacts that can be useful to test modifications (they already contain the resulting CSV files under <node>/data/stats/<target>):

artifacts4.7.3-I.zip
artifacts4.7.3-II.zip
artifacts4.7.3-III.zip
