AKTIN search broker components: Asynchronous distribution of search queries across federated data warehouses


AKTIN Broker

A content-agnostic data exchange middleware for federated data warehouses.

This software is not designed for standalone use. Rather, it is meant to be used within federated data warehouse environments, such as research networks or multi-center clinical study registries, in conjunction with local workflows and processing infrastructure.

The broker infrastructure contains two main components: The central component broker-server is used to publish information to be distributed. The local component broker-client is used by multiple parties to retrieve the published information and subsequently report status updates and response-data.

The middleware infrastructure is content-agnostic in the sense that it can be used with any format or kind of data. A typical scenario is submitting a query for data extraction (e.g. SQL) to the broker (server) to be received by multiple clients which in turn process the query and return status/progress information and resulting extracted data. Other use cases include distribution of metadata/terminology and processing of material transfer requests.

As all communication is based on RESTful HTTP endpoints, the broker-client is optional and can be replaced with simple HTTP calls.

Example for a typical scenario:

Analyst          Broker-Server        Broker-Client-1        Broker-Client-2      ....

submit query --->   publish query
                        |
                        |    <-----     ask for queries
                        |                    |
                        |    ----->     receive new query
                        |                    |
                        |    <-----     report status
                        |                    | (internal workflow 
                        |                    | and query execution)
                        |    <-----     report results
                        |
                (collect responses)
                        |
get status update <---  |
                        |
                        |
                        |    <-----------------------------     ask for queries
                        |                                            |
                        |    ----------------------------->     receive new query
                        |                                            |
                        |    <-----------------------------     report status
                        |                                            | (internal workflow 
                        |                                            | and query execution)
                        |    <-----------------------------     report results
                        |
retrieve results <---   |

Getting started

The easiest way to use the software is to download and run the pre-built binary distribution of broker-admin-dist.

Running the broker

To run the central broker component, first unpack the binary distribution broker-admin-dist-1.x.zip. Running the application requires a Java 8 runtime environment or newer. We recommend using the latest OpenJDK version (currently OpenJDK 15). Open a command shell in the extracted folder and run the script run_broker.sh for Linux/MingW/GitBash or run_broker.bat for Windows. To change startup options (e.g. the port), edit the startup script.
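Since the broker needs Java 8 or newer, it can help to check the installed runtime before starting it. The following is a minimal sketch, not part of the distribution: the function names java_major, installed_java_version and java_is_supported are hypothetical, and the sketch assumes the usual version string formats ("1.8.0_292" for legacy runtimes, "15.0.1" for newer ones).

```shell
# Hypothetical helper: extract the major Java version from a version string.
# Legacy runtimes report "1.8.0_292" (major 8); newer ones report "15.0.1".
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 | cut -d- -f1 ;;
    esac
}

# Read the version string of the installed runtime.
# Note that `java -version` prints to stderr, hence the 2>&1.
installed_java_version() {
    java -version 2>&1 | sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Succeed only if the given version string satisfies the Java 8 minimum.
java_is_supported() {
    [ "$(java_major "$1")" -ge 8 ]
}
```

A possible use in the extracted folder would be: java_is_supported "$(installed_java_version)" && ./run_broker.sh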

Running the broker from your IDE for development

To test the broker from an IDE, simply run the class broker-admin/src/test/java/org.aktin.broker.admin.standalone.TestServer. The port can be specified as a command line argument; by default, the server listens on localhost:8080. Running the TestServer (instead of directly using the production src/main/java/org.aktin.broker.admin.standalone.HttpServer) ensures that system state and configuration files go to the target/ folder instead of cluttering the project directories, and that example data is already available. In this test environment, the admin password is set to 'test'. Client API-keys can be found under broker-admin/src/test/resources/api-keys.properties. For examples of using the broker, see below.

Building from source code

To build the project from its source code, you need a Java Development Kit (e.g. OpenJDK 15; the minimum is Java 8) and the build tool Apache Maven. Download or clone the repository source code and run mvn clean install from a command shell in the main directory. After the build process has completed, you can find the broker binary distribution in the subfolder broker-admin-dist/target. To run the broker, see the section Getting started above.

Examples for using the broker

Below, you will find examples for typical use cases. For accessing the broker from Java/JRE-Applications, you can use the broker-client dependency (https://mvnrepository.com/artifact/org.aktin/broker-client) which communicates with the broker via HTTP. To demonstrate the simplicity of the RESTful API, the command line tool curl is used in the following examples for direct HTTP communication.

Submitting a request to the broker

This example performs authentication at the broker and creates a request containing query syntax which is to be distributed to all nodes.

# Admin authentication and store token in shell variable
TOKEN=`curl -s -H "Content-Type: application/xml" -X POST \
    -d '<credentials><username>admin</username><password>CHANGEME</password></credentials>' \
    http://localhost:8080/auth/login`

# Create a file containing the query syntax
echo "SELECT * FROM fhir_observation" > query1.sql

# submit the query
curl -i -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/sql" -X POST \
     -d @query1.sql http://localhost:8080/broker/request

# The response contains a Location header for the newly created request
# (assumed here to end in /request/1). Use this location to publish the request.
curl -si -H "Authorization: Bearer $TOKEN" -X POST http://localhost:8080/broker/request/1/publish

Documentation on the RESTful API of the request administration endpoint can be found in RequestAdminEndpoint.java.
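The example above publishes request 1 directly, although the request location is actually announced in the Location header of the submit response. A minimal sketch of reading that header instead of hard-coding the ID could look as follows. The function names location_from_headers and submit_and_publish are hypothetical, and the sketch assumes that the broker returns an absolute URL in the Location header.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of the Location header (header names are case-insensitive).
location_from_headers() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Submit a query file and publish it at the location the broker returned.
# -D - dumps the response headers to stdout; -o /dev/null discards the body.
# Assumption: the Location header holds an absolute URL.
submit_and_publish() {
    token=$1; query_file=$2
    location=$(curl -s -D - -o /dev/null -X POST \
        -H "Authorization: Bearer $token" -H "Content-Type: application/sql" \
        --data-binary "@$query_file" http://localhost:8080/broker/request \
        | location_from_headers)
    [ -n "$location" ] || return 1
    curl -si -X POST -H "Authorization: Bearer $token" "$location/publish"
}
```

With the token from the authentication step, this would be called as submit_and_publish "$TOKEN" query1.sql.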

Retrieving a request at the client side

In this example, the client authenticates via API-key and retrieves the published request.

# list available requests
curl -is -H "Authorization: Bearer xxxApiKey123" http://localhost:8080/broker/my/request

# retrieve first request
curl -is -H "Authorization: Bearer xxxApiKey123" http://localhost:8080/broker/my/request/1

# update status to retrieved
curl -is -H "Authorization: Bearer xxxApiKey123" -X POST \
     http://localhost:8080/broker/my/request/1/status/retrieved

Supplying results to a request from the client side

This example demonstrates how the client can supply arbitrary data as a result to a request.

# Create a file containing the query results
printf 'a;b\n1;2\n3;4\n' > result1.csv

# submit the file contents (--data-binary keeps the line breaks, which -d would strip)
curl -i -H "Authorization: Bearer xxxApiKey123" -H "Content-Type: text/csv" -X PUT \
     --data-binary @result1.csv http://localhost:8080/aggregator/my/request/1/result

# update status to completed
curl -is -H "Authorization: Bearer xxxApiKey123" -X POST \
     http://localhost:8080/broker/my/request/1/status/completed

Updating the client status for a request

A different client may also reject a query. (Note the different API-key, which indicates a different client.)

curl -is -H "Authorization: Bearer xxxApiKey567" -X POST \
     http://localhost:8080/broker/my/request/1/status/rejected

Possible request states are documented in broker-api/RequestStatus.
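The client-side steps shown so far (retrieve, report status, supply results, or reject) can be combined into one small workflow. The following is only a sketch under stated assumptions: the names BROKER, report_status and process_request are hypothetical, the result is uploaded as text/csv as in the example above, and a failing local command is simply mapped to the rejected status for illustration. RequestStatus may define a more fitting state for execution failures.

```shell
BROKER=http://localhost:8080   # same base URL as in the examples above

# Hypothetical helper: report status $3 for request $2 using API key $1.
report_status() {
    curl -s -o /dev/null -X POST -H "Authorization: Bearer $1" \
        "$BROKER/broker/my/request/$2/status/$3"
}

# Hypothetical workflow: fetch request $2, feed it on stdin to the local
# command given as the remaining arguments, then upload the command output
# as the result, or report a rejection if the command fails.
process_request() {
    key=$1; id=$2; shift 2
    curl -sf -H "Authorization: Bearer $key" -o request.in \
        "$BROKER/broker/my/request/$id" || return 1
    report_status "$key" "$id" retrieved
    if "$@" < request.in > result.out; then
        # --data-binary uploads the file unchanged, including line breaks
        curl -s -o /dev/null -X PUT -H "Authorization: Bearer $key" \
            -H "Content-Type: text/csv" --data-binary @result.out \
            "$BROKER/aggregator/my/request/$id/result"
        report_status "$key" "$id" completed
    else
        report_status "$key" "$id" rejected
    fi
}
```

For example, process_request xxxApiKey123 1 ./run_query.sh would pass the request content to a (hypothetical) local script run_query.sh and upload whatever that script prints.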

Download the collection of results for a query

Once one or more clients have responded to a request, e.g. by supplying result data, the submitter can retrieve the results either via individual REST calls or, more conveniently, as a ZIP bundle:

# assuming authentication was already done (see first example)
# retrieve a download ID for the bundle containing all results and status updates
BUNDLE_ID=`curl -s -H "Authorization: Bearer $TOKEN"  http://localhost:8080/broker/export/request-bundle/1`

# download the results bundle
curl -s -H "Authorization: Bearer $TOKEN" --output results.zip \
     http://localhost:8080/broker/download/$BUNDLE_ID 

Using the web frontend

The broker includes a minimal web frontend, which can be used to view the status of connected nodes, manage requests and responses, and download results. This frontend serves only as a built-in minimal user interface. For a better user experience, external customized frontends should be used.

To access the frontend, go to http://localhost:8080/admin/html/index.html once the broker is running. For login, use the username admin. The admin password is specified in the startup script and defaults to CHANGEME.

The minimal web frontend currently only supports the org.aktin.broker.auth.cred.CredentialTokenAuthProvider. Without this auth provider, the frontend will not function. Nevertheless, the broker will be fully functional.

Using the broker-client library

To use broker functionality in Java client applications, you can use the broker-client dependency (https://mvnrepository.com/artifact/org.aktin/broker-client).

The client library comes with two runnable implementations:

  1. org.aktin.broker.client.live.sysproc.CLI, a command line implementation which executes custom OS processes to handle requests automatically.
  2. org.aktin.broker.client.live.util.AdminListener, which connects to the server, listens for changes and updates, and prints them. For more information, see broker-client/README.md.

For a code example of other clients, see the following implementations:

Publishing node resources to the broker (e.g. statistics, health status, etc.)

A node can publish and update arbitrary resources to the broker. This functionality is commonly used to upload daily statistics, ETL errors or health information.

# Create a file containing some health info
date > health.txt
free >> health.txt
df -lh / >> health.txt

# --data-binary uploads the file unchanged; -d would strip its line breaks
curl -i -H "Authorization: Bearer xxxApiKey123" -H "Content-Type: text/plain" -X PUT \
     --data-binary @health.txt http://localhost:8080/broker/my/node/health

The last part of the URL /broker/my/node/health can be changed to any other name, e.g. /broker/my/node/stats.

If someone with admin privileges is connected to the broker via websocket, they will receive a real-time notification: resource update 0 health.

On the admin side, the node resource can be downloaded via:

curl -s -H "Authorization: Bearer xxxAdmin1234" --output health-0.txt \
     http://localhost:8080/broker/node/0/health

Real-time communication via websocket connection

For non-real-time applications, the node/client is typically configured to ask (poll) for new requests in pre-defined intervals. In this case, only short-lived HTTPS connections are used from client to server.
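A polling client of this kind can be sketched in a few lines of shell. This is an illustration under stated assumptions, not the behavior of broker-client: the function name poll_requests and its parameters are hypothetical, the listing is passed unparsed to a caller-supplied handler because its format is not described here, and the URL is the localhost one from the examples above.

```shell
# Hypothetical polling loop: every $2 seconds, list the requests available
# to the client with API key $1 and pass the raw listing to handler $3.
# The optional $4 limits the number of polls (0 or omitted = run forever).
poll_requests() {
    api_key=$1; interval=$2; handler=$3; max_polls=${4:-0}
    polls=0
    while [ "$max_polls" -eq 0 ] || [ "$polls" -lt "$max_polls" ]; do
        # -f makes curl fail on HTTP errors, so the handler only sees
        # listings from successful responses
        if listing=$(curl -sf -H "Authorization: Bearer $api_key" \
                http://localhost:8080/broker/my/request); then
            "$handler" "$listing"
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
}
```

For example, after defining print_listing() { echo "$1"; }, the call poll_requests xxxApiKey123 60 print_listing would print the available requests once per minute. Each poll is a separate short-lived connection, which matches the non-real-time mode described above.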

For real-time applications, each node needs to be notified immediately about new requests. For this purpose, websocket connections can be used:

A websocket connection is established from node to broker and kept open. Over this connection, the broker will send a notification once a new request is published or an existing request is closed. See MyBrokerWebsocket.java.

When such a notification is received by the client, it will react immediately by retrieving and processing the request.

The websocket is only used for notifications and optional ping-pong messages. All content and status updates are transferred via traditional HTTPS connections initiated by the client as explained above.

Using other authentication methods and authentication providers

Multiple authentication/authorization providers are supported. The default method is API key authentication.

OpenID Connect, OAuth, Keycloak

The client now retrieves a Keycloak token. The broker no longer uses the introspection endpoint but checks the token itself.

To get it running, the following is needed:

  1. keycloak.json: Get this file from Keycloak. Go to your client there, click on the "installation" tab and select the "keycloak oidc json" format. Copy the file to the target folder (where sysproc.properties is copied to as well).
  2. Add the OpenIdAuthProvider to the list of providers: in run_broker.sh, add -Dbroker.auth.provider=org.aktin.broker.auth.apikey.ApiKeyPropertiesAuthProvider,org.aktin.broker.auth.cred.CredentialTokenAuthProvider,org.aktin.broker.auth.openid.OpenIdAuthProvider or, if wanted, add it to the defaultconfig.
  3. openid-config.properties (broker-admin): Adapt this file to contain the correct URLs of your Keycloak and copy it to the place where you extracted the broker admin dist zip (place it where api-keys.properties is).
