Milestone 2 - Overview #9
mspiekermann started this conversation in Planning
This issue was created to provide an overview of the topics and features that will be addressed over the next weeks for the Milestone 2 release.
The linked issues relate to the corresponding topics and provide status information and additional descriptions of what needs to be done. Some of them nevertheless still need to be refined in scope and are not committed as-is for the release. Derived issues will be created and labeled accordingly.
DataPlane enhancement
The EDC will provide a scalable and extensible DataPlane architecture that leverages the existing TransferExtensions and enables end-to-end data transfer in various use-case scenarios.
Concept of DataPlane must allow extensions for different transfer protocols
Support for transferring data files via HTTP endpoints
Enable data transfer via HTTP endpoints on the provider and consumer side and implement the DataPlane Agent for HTTP (using the existing HTTP Transfer Extensions) that routes the data between these endpoints. This could either be an existing endpoint registered at the EDC or an endpoint that is dynamically provisioned for the data transfer.
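Conceptually, such an agent streams bytes from a resolved source endpoint to a sink endpoint. The sketch below is a simplified illustration, not the actual EDC API; the names `DataSource`, `DataSink`, and `HttpDataPlaneAgent` are hypothetical placeholders.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of a DataPlane agent for HTTP: it routes bytes from a
// resolved source endpoint to a sink endpoint. Real EDC interfaces differ.
interface DataSource {
    InputStream openStream() throws IOException;
}

interface DataSink {
    OutputStream openStream() throws IOException;
}

class HttpDataPlaneAgent {
    // Copy all bytes from source to sink; in a real agent these would wrap
    // HTTP connections (an existing registered endpoint or a provisioned one).
    long transfer(DataSource source, DataSink sink) throws IOException {
        try (InputStream in = source.openStream(); OutputStream out = sink.openStream()) {
            byte[] buffer = new byte[8192];
            long total = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
            return total;
        }
    }
}
```

Keeping the source and sink behind small interfaces is what makes the transfer-protocol extensions pluggable: an S3 or Azure sink can be substituted without touching the routing loop.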
Support for Azure OS and S3 endpoints
Enable data transfer for Azure OS and S3 endpoints for large files (binary and text-based), leveraging the existing BLOB Transfer Extensions that route the data between these endpoints. This could either be an existing endpoint registered at the EDC or an endpoint that is dynamically provisioned for the data transfer.
Demo “in-process” deployment
DataPlane scalability
Enable Kubernetes deployment for DataPlane Agents to ensure automated scaling and infrastructure integration within existing environments.
#463 #418 #364
Pull DataFlow
Right now, the EDC follows a push pattern to transfer data between two endpoints after the negotiation process. In addition, a pull pattern, where the consumer actively fetches the data from the provider, shall be supported.
#463
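A pull flow can be pictured as the provider handing the consumer an endpoint reference with a short-lived token, after which the consumer fetches the content itself. A minimal sketch, with all names (`EndpointDataReference`, `ProviderDataPlane`) hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of a pull flow: instead of pushing the data, the
// provider mints a reference (endpoint + one-time token) and the consumer
// pulls the content on its own schedule.
record EndpointDataReference(String endpoint, String token) {}

class ProviderDataPlane {
    private final Map<String, String> tokenToAsset = new HashMap<>();
    private final Map<String, String> assetContent = new HashMap<>();

    void registerAsset(String assetId, String content) {
        assetContent.put(assetId, content);
    }

    // Called after a successful negotiation: mint a one-time token for the asset.
    EndpointDataReference issueReference(String assetId) {
        String token = UUID.randomUUID().toString();
        tokenToAsset.put(token, assetId);
        return new EndpointDataReference("/data/" + assetId, token);
    }

    // Simulates the provider's data endpoint: validate the token, serve once.
    String serve(String token) {
        String assetId = tokenToAsset.remove(token); // one-time use
        if (assetId == null) throw new IllegalStateException("invalid or expired token");
        return assetContent.get(assetId);
    }
}
```

The one-time token here only illustrates the access-control handshake a pull flow needs; a real implementation would use signed, expiring credentials.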
Streaming
While the priority is on HTTP transfer between EDC instances, streaming protocols should be supported as well. To showcase this capability of the DataPlane, a DataPlane extension for Apache Kafka should be implemented. To set the scope and expected result for this milestone, further investigation and assessment is needed.
#463
Contract negotiation
The EDC currently implements the contract and policy negotiation process only for a simple happy flow. To enhance this process, we will add support for real negotiation scenarios that allow offers and counter-offers by supporting the corresponding workflows.
#352 #71 #344
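Supporting counter-offers essentially means widening the negotiation state machine beyond the happy path. The states and transitions below are a hypothetical sketch for illustration, not the EDC's actual state model:

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a negotiation state machine that permits
// counter-offers; the real EDC states and transitions differ.
enum NegotiationState { REQUESTED, OFFERED, COUNTER_OFFERED, AGREED, DECLINED }

class ContractNegotiation {
    private static final Map<NegotiationState, Set<NegotiationState>> ALLOWED = Map.of(
            NegotiationState.REQUESTED, Set.of(NegotiationState.OFFERED, NegotiationState.DECLINED),
            NegotiationState.OFFERED, Set.of(NegotiationState.COUNTER_OFFERED,
                    NegotiationState.AGREED, NegotiationState.DECLINED),
            NegotiationState.COUNTER_OFFERED, Set.of(NegotiationState.OFFERED,
                    NegotiationState.AGREED, NegotiationState.DECLINED),
            NegotiationState.AGREED, Set.of(),
            NegotiationState.DECLINED, Set.of());

    private NegotiationState state = NegotiationState.REQUESTED;

    NegotiationState state() { return state; }

    // Advance only along allowed transitions; reject anything else.
    void transition(NegotiationState next) {
        if (!ALLOWED.get(state).contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " not allowed");
        }
        state = next;
    }
}
```

The OFFERED/COUNTER_OFFERED cycle is what distinguishes a real negotiation from the current happy flow: either party can respond with a modified offer until one side agrees or declines.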
TransferProcessManager
The TransferProcessManager (TPM) uses several methods to check and advance the state of transfer processes. These "check-methods" are not transactional, so another process or thread could intercede and perform a modification that would then be overwritten by the TPM.
#393 #386 #330 #470 #434 #416
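One common way to prevent such lost updates is optimistic locking: the state transition only succeeds if the process is still in the state the check-method observed. The following is a minimal sketch of that idea using compare-and-swap, not the EDC's actual store implementation:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: guard a state transition with compare-and-swap so a
// concurrent modification made between "check" and "advance" is detected
// instead of being silently overwritten.
class TransferProcess {
    final String id;
    final String state;
    TransferProcess(String id, String state) { this.id = id; this.state = state; }
}

class TransferProcessStore {
    private final AtomicReference<TransferProcess> current;

    TransferProcessStore(TransferProcess initial) { current = new AtomicReference<>(initial); }

    TransferProcess find() { return current.get(); }

    // Succeeds only if no other thread advanced the process since
    // `expected` was read; a stale snapshot makes the update fail.
    boolean advance(TransferProcess expected, String newState) {
        return current.compareAndSet(expected, new TransferProcess(expected.id, newState));
    }
}
```

In a database-backed store, the same effect is typically achieved with a version column checked in the UPDATE's WHERE clause, or with leases.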
Management – Observability
We will expose the application's health and enable monitoring at scale by adding observability interfaces ("health checks") in the form of readiness and liveness probes, showing that a) the application is up, running, and working as expected, and b) the application is ready to receive new requests.
#472 #420 #419
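The distinction between the two probes can be sketched as an aggregation over registered check providers: liveness answers "is the process healthy?", readiness additionally answers "can it accept traffic right now?". The class and method names below are hypothetical:

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of aggregating health-check providers into a liveness
// result ("the application is up and working") and a readiness result
// ("the application can receive new requests").
class HealthCheckService {
    private final List<Supplier<Boolean>> livenessProbes;
    private final List<Supplier<Boolean>> readinessProbes;

    HealthCheckService(List<Supplier<Boolean>> liveness, List<Supplier<Boolean>> readiness) {
        this.livenessProbes = liveness;
        this.readinessProbes = readiness;
    }

    // Liveness: the process is running and its core loops are not stuck.
    boolean isAlive() { return livenessProbes.stream().allMatch(Supplier::get); }

    // Readiness: liveness plus all dependencies (stores, message bus, ...)
    // being able to serve traffic.
    boolean isReady() { return isAlive() && readinessProbes.stream().allMatch(Supplier::get); }
}
```

Exposed over HTTP, these two results map directly onto Kubernetes liveness and readiness probes, which also ties into the DataPlane scalability topic above.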
IDS Support
The EDC currently provides basic support for the IDS data protocol (information model, messages), which has to be extended and raised to the next level.
#367 #403 #400 #390 #385 #380 #341 #338 #408
SQL Database Support
The EDC currently supports only some basic persistence options (Cosmos DB and a basic file system store). Since SQL/relational databases are widely used in practice, the EDC shall support a corresponding implementation.
#460 #461 #453 #452 #444 #438 #437 #436
Modularity
The EDC is evolving: in addition to the Connector capabilities, it implements drafts for other components, e.g. the Federated Catalog and IdM with DID-Web. To foster these implementations, we will break the codebase down into its subcomponents and do housekeeping on structure and encapsulation, with the goal of being prepared to move components into separate repositories if complexity requires this step.
#92 #469
API Definition
One of the important aspects of working with the EDC are the APIs to control DataAsset handling (registration, contract management, transfer), configuration, and observability. Therefore, documentation of the available APIs will be created. Automatic generation from code, in combination with the Swagger (OpenAPI) description language, is assessed as the favored option.
#476 #46
Documentation
As a continuous goal, we want to work on the documentation: a) to describe all features of the EDC, b) to explain how to use the EDC, and c) to improve code documentation.