Implement cache for files read operation
Implements [0].

Added new operations:
* GET files/configuration
* PUT files/configuration?readCache=<true|false>

Also,

- Metrics measurements are moved to the file manager and operated from
  SafeFile. This way, metrics references are lazily created once,
  when the file manager is created. This avoids the error of creating
  those references every time a SafeFile is dynamically created for a
  newly managed file. SafeFile now holds a pointer to the file manager
  to access the increments for every counter.
- Move the close delay (in microseconds) to the write interface instead
  of storing it as a SafeFile member (also removed from the JSON
  representation). This allows changing the delay for different
  operations on the same file, and accommodates the fact that both reads
  and writes can be performed on it.
- Fixed short/long-term files close delay configuration on the command
  line: a zero value was not accepted due to a bug in the numeric check.
- Short-term/long-term identification has changed from the original
  procedure (comparing the original target value with the final replaced
  one) to a new one: check whether there are variable patterns
  @{varname}, regardless of whether they are replaced or not. This is
  because a missing variable is left unreplaced instead of being
  replaced by an empty value, and it seems more intuitive that
  "target@{var_not_replaced}" is also short term. It also provides a
  trick to force short-termness: "/my/file@{shortterm}.txt".
- README.md updated with benchmark and helper scripts. Also improved
  the file targets description regarding how to force short-term mode.
- Update benchmark script to include cache configuration mode. True
  by default due to performance impact of not having it.
- Remove delays in unit tests and component tests related to file
  manager, to speed up testing and make it more robust.
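
The short-term/long-term rule described above can be sketched as follows. This is a minimal illustration in Python, not the C++ implementation: the function names and the regular expression are assumptions (in particular, the exact grammar accepted for variable names is not stated in this commit).

```python
import re

# Matches variable patterns such as @{varname}. The accepted name grammar is an
# assumption here: any non-empty run of characters other than braces.
VARIABLE_PATTERN = re.compile(r"@\{[^{}]+\}")


def is_short_term(configured_target: str) -> bool:
    """New rule: short term when the configured path holds any @{varname} pattern."""
    return VARIABLE_PATTERN.search(configured_target) is not None


def old_is_short_term(configured_target: str, resolved_target: str) -> bool:
    """Previous rule: short term only when substitution actually changed the path."""
    return configured_target != resolved_target
```

With the previous rule, a path whose only variable is missing stays unchanged after substitution and was therefore treated as long term; with the new rule, the mere presence of the pattern is enough.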

[0] #63
testillano committed Aug 15, 2022
1 parent 5c7202f commit 93d539a
Showing 15 changed files with 332 additions and 148 deletions.
9 changes: 8 additions & 1 deletion README.md
@@ -360,6 +360,10 @@ Input Global variable(s) configuration
(or set 'H2AGENT_GLOBAL_VARIABLE' to be non-interactive) [global-variable.json]:
global-variable.json

Input File manager configuration to enable read cache (true|false)
(or set 'H2AGENT__FILE_MANAGER_ENABLE_READ_CACHE_CONFIGURATION' to be non-interactive) [true]:
true

Input Server configuration to ignore request body (true|false)
(or set 'H2AGENT__SERVER_TRAFFIC_IGNORE_REQUEST_BODY_CONFIGURATION' to be non-interactive) [false]:
false
@@ -1741,7 +1745,7 @@ The **target** of information is classified after parsing the following possible
You could, for example, simulate a database where a *DELETE* for an specific entry could infer through its provision an *out-state* for a foreign method like *GET*, so when getting that *URI* you could obtain a *404* (assumed this provision for the new *working-state* = *in-state* = *out-state* = "id-deleted"). By default, the same `uri` is used from the current event to the foreign method, but it could also be provided optionally giving more flexibility to generate virtual events with specific states.
- txtFile.`<path>` *[string]*: dumps source (as string) over text file with the path provided. The path can be relative (to the execution directory) or absolute, and **admits variables substitution**. Note that paths to missing directories will fail to open (the process does not create tree hierarchy). It is considered long term file (file is closed 1 second after last write, by default) when a constant path is configured, because this is normally used for specific log files. On the other hand, when any substitution took place on the path provided it is considered as a dynamic name, so understood as short term file (file is opened, written and closed without delay, by default). Delays in microseconds are configurable on process startup. Check [command line](#command-line) for `--long-term-files-close-delay-usecs` and `--short-term-files-close-delay-usecs` options.
- txtFile.`<path>` *[string]*: dumps source (as string) over text file with the path provided. The path can be relative (to the execution directory) or absolute, and **admits variables substitution**. Note that paths to missing directories will fail to open (the process does not create tree hierarchy). It is considered long term file (file is closed 1 second after last write, by default) when a constant path is configured, because this is normally used for specific log files. On the other hand, when any substitution may took place in the path provided (it has variables in the form `@{varname}`) it is considered as a dynamic name, so understood as short term file (file is opened, written and closed without delay, by default). **Note:** you can force short term type inserting a variable, for example with empty value: `txtFile./path/to/short-term-file.txt@{empty}`. Delays in microseconds are configurable on process startup. Check [command line](#command-line) for `--long-term-files-close-delay-usecs` and `--short-term-files-close-delay-usecs` options.
- binFile.`<path>` *[string]*: same as `txtFile` but writting binary data.
@@ -2364,6 +2368,9 @@ Usage: schema [-h|--help] [--clean] [file]; Cleans/gets/updates current schema c
Usage: global_variable [-h|--help] [--clean] [name|file]; Cleans/gets/updates current agent global variable configuration
(http://localhost:8074/admin/v1/global-variable).
Usage: files [-h|--help]; Gets the files processed.
Usage: files_configuration [-h|--help]; Manages files configuration (gets current status by default).
[--enable-read-cache] ; Enables cache for read operations.
[--disable-read-cache] ; Disables cache for read operations.
Usage: configuration [-h|--help]; Gets agent general configuration.
Usage: server_configuration [-h|--help]; Manages agent server configuration (gets current status by default).
[--traffic-server-ignore-request-body] ; Ignores request body on server receptions.
12 changes: 10 additions & 2 deletions ct/src/conftest.py
@@ -540,7 +540,7 @@ def send(content, responseBodyRef = VALID_GLOBAL_VARIABLES__RESPONSE_BODY, respo
}
'''

FILE_GENERATION_PROVISION='''
FILE_MANAGER_PROVISION='''
{
"requestMethod": "GET",
"requestUri":"/app/v1/foo/bar",
@@ -550,9 +550,17 @@ def send(content, responseBodyRef = VALID_GLOBAL_VARIABLES__RESPONSE_BODY, respo
"source": "eraser",
"target": "txtFile./tmp/example.txt"
},
{
"source": "value./tmp/example.txt",
"target": "var.file"
},
{
"source": "value.hello",
"target": "txtFile./tmp/example.txt"
"target": "txtFile.@{file}"
},
{
"source": "txtFile./tmp/example.txt",
"target": "response.body.string"
}
]
}
25 changes: 13 additions & 12 deletions ct/src/files_operation/files_test.py
@@ -1,32 +1,33 @@
import pytest
import json
import time
from conftest import ADMIN_FILES_URI, string2dict, FILE_GENERATION_PROVISION
from conftest import ADMIN_FILES_URI, string2dict, FILE_MANAGER_PROVISION


@pytest.mark.admin
def test_001_i_want_to_get_process_files(h2ac_admin, admin_server_provision, h2ac_traffic):

# Provision
admin_server_provision(string2dict(FILE_GENERATION_PROVISION))
admin_server_provision(string2dict(FILE_MANAGER_PROVISION))

# Check file before traffic: skipped because the test could re-run
#response = h2ac_admin.get(ADMIN_FILES_URI)
#assert response["status"] == 204

# Send GET
response = h2ac_traffic.get("/app/v1/foo/bar")

# Check file
response = h2ac_admin.get(ADMIN_FILES_URI)
responseBodyRef = [{ "bytes":0, "path": "/tmp/example.txt", "state": "opened", "closeDelayUsecs": 1000000 }]
h2ac_admin.assert_response__status_body_headers(response, 200, responseBodyRef)

# Wait 2 seconds (long-term file closes in 1 second by default)
time.sleep(2)

h2ac_admin.assert_response__status_body_headers(response, 200, "hello")

# # Check file
# response = h2ac_admin.get(ADMIN_FILES_URI)
# responseBodyRef = [{ "bytes":0, "path": "/tmp/example.txt", "state": "opened" }]
# h2ac_admin.assert_response__status_body_headers(response, 200, responseBodyRef)
#
# # Wait 2 seconds (long-term file closes in 1 second by default)
# time.sleep(2)
#
# Check file
response = h2ac_admin.get(ADMIN_FILES_URI)
responseBodyRef = [{ "bytes":5, "path": "/tmp/example.txt", "state": "closed", "closeDelayUsecs": 1000000 }]
responseBodyRef = [{ "bytes":5, "path": "/tmp/example.txt", "state": "closed" }]
h2ac_admin.assert_response__status_body_headers(response, 200, responseBodyRef)

21 changes: 21 additions & 0 deletions src/http2/MyAdminHttp2Server.cpp
@@ -417,6 +417,10 @@ void MyAdminHttp2Server::receiveGET(const std::string &uri, const std::string &p
responseBody = getHttp2Server()->dataConfigurationAsJsonString();
statusCode = 200;
}
else if (pathSuffix == "files/configuration") {
responseBody = getFileManager()->configurationAsJsonString();
statusCode = 200;
}
else if (pathSuffix == "files") {
responseBody = getFileManager()->asJsonString();
statusCode = ((responseBody == "[]") ? 204:200);
@@ -606,6 +610,23 @@ void MyAdminHttp2Server::receivePUT(const std::string &pathSuffix, const std::st
ert::tracing::Logger::error("Cannot keep requests history if data storage is discarded", ERT_FILE_LOCATION);
}
}
else if (pathSuffix == "files/configuration") {
std::string readCache;

if (!queryParams.empty()) { // https://stackoverflow.com/questions/978061/http-get-with-request-body#:~:text=Yes.,semantic%20meaning%20to%20the%20request.
std::map<std::string, std::string> qmap = h2agent::model::extractQueryParameters(queryParams);
auto it = qmap.find("readCache");
if (it != qmap.end()) readCache = it->second;

success = (readCache == "true" || readCache == "false");
}

if (success) {
bool b_readCache = (readCache == "true");
getFileManager()->enableReadCache(b_readCache);
LOGWARNING(ert::tracing::Logger::warning(ert::tracing::Logger::asString("File read cache: %s", b_readCache ? "true":"false"), ERT_FILE_LOCATION));
}
}

statusCode = success ? 200:400;
}
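
As a reading aid, the validation performed by the new `PUT files/configuration` branch can be sketched in Python. This is an approximation under stated assumptions: `parse_qs` stands in for `h2agent::model::extractQueryParameters`, the function name is hypothetical, and `success` is assumed to start as false (its declaration is outside this hunk). When a key is repeated, the sketch arbitrarily takes the first value.

```python
from urllib.parse import parse_qs


def handle_put_files_configuration(query_params: str):
    """Return (status_code, read_cache); read_cache is None when the request is rejected."""
    read_cache = None
    if query_params:
        # parse_qs is a stand-in for the agent's own query parameter extraction.
        values = parse_qs(query_params).get("readCache", [])
        # Only the exact literals "true" and "false" are accepted.
        if values and values[0] in ("true", "false"):
            read_cache = values[0] == "true"
    status_code = 200 if read_cache is not None else 400
    return status_code, read_cache
```

A missing query string, a missing `readCache` key, or any other value yields 400 in this sketch, which matches the `statusCode = success ? 200:400` line above under the stated assumption.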
4 changes: 2 additions & 2 deletions src/main.cpp
@@ -620,7 +620,7 @@ int main(int argc, char* argv[])
if (cmdOptionExists(argv, argv + argc, "--long-term-files-close-delay-usecs", value))
{
int iValue = toNumber(value);
if (iValue < 1)
if (iValue < 0)
{
usage(EXIT_FAILURE, "Invalid '--long-term-files-close-delay-usecs' value. Must be greater or equal than 0.");
}
@@ -630,7 +630,7 @@ int main(int argc, char* argv[])
if (cmdOptionExists(argv, argv + argc, "--short-term-files-close-delay-usecs", value))
{
int iValue = toNumber(value);
if (iValue < 1)
if (iValue < 0)
{
usage(EXIT_FAILURE, "Invalid '--short-term-files-close-delay-usecs' value. Must be greater or equal than 0.");
}
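
The effect of relaxing the check from `< 1` to `< 0` can be illustrated with a small Python sketch. The function name is hypothetical, and the handling of non-numeric input is an assumption, since the behavior of `toNumber` is not shown in this diff.

```python
def parse_close_delay_usecs(value: str, option: str) -> int:
    """Parse a close delay in microseconds, accepting zero and rejecting negatives."""
    try:
        delay = int(value)
    except ValueError:
        # Assumption: non-numeric input is rejected; toNumber may behave differently.
        raise ValueError(f"Invalid '{option}' value. Must be an integer.") from None
    if delay < 0:  # the previous check was `delay < 1`, which also rejected zero
        raise ValueError(f"Invalid '{option}' value. Must be greater than or equal to 0.")
    return delay
```

A zero delay is meaningful here: it is the documented default for short-term files, which are closed without delay.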
12 changes: 8 additions & 4 deletions src/model/AdminServerProvision.cpp
@@ -718,8 +718,10 @@ bool AdminServerProvision::processTargets(std::shared_ptr<Transformation> transf
}
else {
// assignments
bool shortTerm = (target != transformation->getTarget()); // something was replaced in target: path is considered arbitrary and dynamic: short term files
file_manager_->write(target/*path*/, targetS/*data*/, true/*text*/, (shortTerm ? configuration_->getShortTermFilesCloseDelayUsecs():configuration_->getLongTermFilesCloseDelayUsecs()));
bool longTerm =(transformation->getTargetPatterns().empty()); // path is considered fixed (long term files), instead of arbitrary and dynamic (short term files)
// even if @{varname} is missing (empty value) we consider the intention to allow force short term
// files type.
file_manager_->write(target/*path*/, targetS/*data*/, true/*text*/, (longTerm ? configuration_->getLongTermFilesCloseDelayUsecs():configuration_->getShortTermFilesCloseDelayUsecs()));
}
}
else if (transformation->getTargetType() == Transformation::TargetType::TBinFile) {
@@ -733,8 +735,10 @@ bool AdminServerProvision::processTargets(std::shared_ptr<Transformation> transf
}
else {
// assignments
bool shortTerm = (target != transformation->getTarget()); // something was replaced in target: path is considered arbitrary and dynamic: short term files
file_manager_->write(target/*path*/, targetS/*data*/, false/*binary*/, (shortTerm ? configuration_->getShortTermFilesCloseDelayUsecs():configuration_->getLongTermFilesCloseDelayUsecs()));
bool longTerm =(transformation->getTargetPatterns().empty()); // path is considered fixed (long term files), instead of arbitrary and dynamic (short term files)
// even if @{varname} is missing (empty value) we consider the intention to allow force short term
// files type.
file_manager_->write(target/*path*/, targetS/*data*/, false/*binary*/, (longTerm ? configuration_->getLongTermFilesCloseDelayUsecs():configuration_->getShortTermFilesCloseDelayUsecs()));
}
}
}
73 changes: 66 additions & 7 deletions src/model/FileManager.cpp
@@ -44,6 +44,51 @@ namespace h2agent
namespace model
{


void FileManager::enableMetrics(ert::metrics::Metrics *metrics) {

metrics_ = metrics;

if (metrics_) {
ert::metrics::counter_family_ref_t cf = metrics->addCounterFamily("FileSystem_observed_operations_total", "H2agent file system operations");
observed_open_operation_counter_ = &(cf.Add({{"operation", "open"}}));
observed_close_operation_counter_ = &(cf.Add({{"operation", "close"}}));
observed_write_operation_counter_ = &(cf.Add({{"operation", "write"}}));
observed_empty_operation_counter_ = &(cf.Add({{"operation", "empty"}}));
observed_delayed_close_operation_counter_ = &(cf.Add({{"operation", "delayedClose"}}));
observed_instant_close_operation_counter_ = &(cf.Add({{"operation", "instantClose"}}));
observed_error_open_operation_counter_ = &(cf.Add({{"success", "false"}, {"operation", "open"}}));
}
}

void FileManager::incrementObservedOpenOperationCounter() {
if (metrics_) observed_open_operation_counter_->Increment();
}

void FileManager::incrementObservedCloseOperationCounter() {
if (metrics_) observed_close_operation_counter_->Increment();
}

void FileManager::incrementObservedWriteOperationCounter() {
if (metrics_) observed_write_operation_counter_->Increment();
}

void FileManager::incrementObservedEmptyOperationCounter() {
if (metrics_) observed_empty_operation_counter_->Increment();
}

void FileManager::incrementObservedDelayedCloseOperationCounter() {
if (metrics_) observed_delayed_close_operation_counter_->Increment();
}

void FileManager::incrementObservedInstantCloseOperationCounter() {
if (metrics_) observed_instant_close_operation_counter_->Increment();
}

void FileManager::incrementObservedErrorOpenOperationCounter() {
if (metrics_) observed_error_open_operation_counter_->Increment();
}

void FileManager::write(const std::string &path, const std::string &data, bool textOrBinary, unsigned int closeDelayUs) {

std::shared_ptr<SafeFile> safeFile;
@@ -56,11 +101,11 @@ void FileManager::write(const std::string &path, const std::string &data, bool t
std::ios_base::openmode mode = std::ofstream::out | std::ios_base::app; // for text files
if (!textOrBinary) mode |= std::ios::binary;

safeFile = std::make_shared<SafeFile>(path, io_service_, metrics_, closeDelayUs, mode);
safeFile = std::make_shared<SafeFile>(this, path, io_service_, mode);
add(path, safeFile);
}

safeFile->write(data);
safeFile->write(data, closeDelayUs);
}

bool FileManager::read(const std::string &path, std::string &data, bool textOrBinary) {
@@ -77,11 +122,11 @@ bool FileManager::read(const std::string &path, std::string &data, bool textOrBi
else {
if (!textOrBinary) mode |= std::ios::binary;

safeFile = std::make_shared<SafeFile>(path, io_service_, metrics_, 0, mode);
safeFile = std::make_shared<SafeFile>(this, path, io_service_, mode);
add(path, safeFile);
}

data = safeFile->read(result, mode);
data = safeFile->read(result, mode, read_cache_);

return result;
}
@@ -95,7 +140,7 @@ void FileManager::empty(const std::string &path) {
safeFile = it->second;
}
else {
safeFile = std::make_shared<SafeFile>(path, io_service_, metrics_);
safeFile = std::make_shared<SafeFile>(this, path, io_service_);
add(path, safeFile);
}

@@ -111,9 +156,17 @@ bool FileManager::clear()
return result;
}

std::string FileManager::asJsonString() const {
nlohmann::json FileManager::getConfigurationJson() const {

return ((size() != 0) ? getJson().dump() : "[]"); // server data is shown as an array
nlohmann::json result;
result["readCache"] = read_cache_ ? "enabled":"disabled";

return result;
}

std::string FileManager::configurationAsJsonString() const {

return (getConfigurationJson().dump());
}

nlohmann::json FileManager::getJson() const {
@@ -129,6 +182,12 @@ nlohmann::json FileManager::getJson() const {
return result;
}

std::string FileManager::asJsonString() const {

return ((size() != 0) ? getJson().dump() : "[]"); // server data is shown as an array
}


}
}
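
The reshaped responsibilities in this file (counters owned by the manager, close delay passed per write, read cache flag passed per read) can be modelled with a toy Python sketch. It is not the real SafeFile: content is kept in memory, and how SafeFile actually uses the cache flag is not shown in this diff, so the caching and invalidation behavior below, as well as the constructor default, are assumptions.

```python
import json
from collections import Counter


class SafeFile:
    """Toy stand-in for SafeFile: content lives in memory instead of on disk."""

    def __init__(self, manager, path):
        self.manager = manager          # back-reference used to reach the shared counters
        self.path = path
        self.content = ""
        self.cached = None              # last content served with the cache enabled
        self.last_close_delay_us = None

    def write(self, data, close_delay_us):
        self.content += data
        self.cached = None              # assumption: a write invalidates the cache
        self.last_close_delay_us = close_delay_us  # the delay now arrives per write
        self.manager.counters["write"] += 1

    def read(self, use_cache):
        if use_cache and self.cached is not None:
            return self.cached          # cache hit: no file access
        self.manager.counters["open"] += 1  # assumption: a miss costs one open
        if use_cache:
            self.cached = self.content
        return self.content


class FileManager:
    def __init__(self, read_cache=False):
        self.counters = Counter()       # created once, shared by every SafeFile
        self.read_cache = read_cache
        self.files = {}

    def _file(self, path):
        if path not in self.files:
            self.files[path] = SafeFile(self, path)
        return self.files[path]

    def enable_read_cache(self, enable):
        self.read_cache = enable

    def write(self, path, data, close_delay_us):
        self._file(path).write(data, close_delay_us)

    def read(self, path):
        return self._file(path).read(self.read_cache)

    def configuration_as_json_string(self):
        state = "enabled" if self.read_cache else "disabled"
        return json.dumps({"readCache": state}, separators=(",", ":"))
```

Keeping the counters on the manager means a new file object only needs the back-reference, and passing the delay to each write lets the same file be written with different delays.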
