Layers & Modules
This chapter gives an overview of how the Agent standard and KIT should be implemented in terms of layers and modules.
For more information, see
- Our Adoption guideline
- The High-Level Architecture
- The Ontology modelling guide
- The ARC42 documentation
- An API specification
- Our Reference Implementation
- The Deployment guide
In this context, generic building blocks have been defined (see next figure) which can be implemented with different open source or COTS solutions. Within the scope of the Catena-X project, these building blocks are instantiated with a reference implementation based on open source components (the Knowledge Agents KIT). The detailed architecture following this reference implementation can be found here.
In the following paragraphs, all building blocks relevant for this standard are introduced:
Semantic Models
Ontologies, as defined by the W3C Web Ontology Language OWL 2 standard, provide the core of the KA catalogue. By offering rich semantic modelling possibilities, they contribute to a common understanding of existing concepts and their relations across the data space participants. To increase practical applicability, this standard contains an overview of the most important concepts and best practices for ontology modelling relevant to the KA approach (see chapter 5). OWL comes with several interpretation profiles for different types of applications. For model checking and data validation (not part of this standard), this KIT proposes the Rule Logic (RL) profile. For query answering/data processing (part of this standard), this KIT applies the Existential Logic (EL) profile (on the Dataspace Layer) and the Query Logic (QL) profile (on the Binding Layer).
Ontology Editing & Visualization
To create and visualize ontology models, dedicated tooling is advised. For this purpose, various open source tools (e.g. Protégé) or commercial products (e.g. metaphacts) are available. This KIT hence standardizes on the ubiquitous Terse RDF Triple Language (TTL) format, which furthermore allows large ontologies to be divided into, and merged from, modular domain ontology files.
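For illustration only, a small modular domain ontology file in TTL could look like the following sketch; all names and IRIs below are invented for the example and are not part of the released Semantic Model:

```ttl
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:   <https://w3id.org/catenax/ontology/example#> .

# Illustrative domain ontology module (not a released Catena-X ontology).
<https://w3id.org/catenax/ontology/example> a owl:Ontology ;
    rdfs:label "Example domain ontology"@en .

# A class and a datatype property as they could appear in a domain module.
ex:Vehicle a owl:Class ;
    rdfs:label "Vehicle"@en .

ex:vehicleIdentificationNumber a owl:DatatypeProperty ;
    rdfs:domain ex:Vehicle ;
    rdfs:range  xsd:string ;
    rdfs:label  "vehicle identification number"@en .
```

Because TTL is plain text, such modules can be diffed, merged and released with the same tooling as source code, which is what the ontology management described below builds on.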
Ontology Management
To achieve model governance, a dedicated solution for ontology management is necessary. Its key function is to give an overview of the available models and their respective metadata and life cycle (e.g. in work, released, deprecated). Because of the strong parallels to source code governance, it is today best practice to perform ontology management through modern and collaborative source code versioning systems. The de-facto standard in this regard is Git (in particular: its http/s protocol variant, including anonymous read-only raw file access to release branches). In the following, we call the merged domain ontology files in a release branch “the” (shared) Semantic Model (of that release). For practicability purposes, the Data Consumption and the Binding Layer could be equipped with only use-case- and role-specific excerpts of that Semantic Model. While this may affect the results of model checking and validity profiles, it will not affect the query/data processing results.
Data Consumption Layer/Query Definition
This layer comprises all applications which utilize provided data and functions of business partners to achieve a direct business impact, as well as frameworks which simplify the development of these applications. Thus, this layer focuses on using a released Semantic Model (or a use-case/role-specific excerpt thereof) as a vocabulary to build flexible queries (Skills) and on integrating these Skills into data consuming apps.
This KIT relies on the SPARQL 1.1 specification as a language and protocol to search for and process data across different business partners. As part of this specification, this KIT supports the QUERY RESULTS JSON and the QUERY RESULTS XML formats to represent both the answer sets generated by SPARQL skills and the sets of input parameters that a SPARQL skill should be applied to. For answer sets, additional formats such as the QUERY RESULTS CSV and TSV formats may be supported. Required is the ability to store and invoke SPARQL queries as parameterized procedures in the dataspace; this is a KA-specific extension to the SPARQL endpoint and is captured in a concise OpenAPI specification. Also part of that specification is an extended response behaviour which introduces the warning status code “203” and a response header “cx_warning” bound to a JSON structure that lists abnormal events or trace information that appeared during processing.
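As a rough, non-normative illustration, a stored Skill is essentially an ordinary SPARQL 1.1 query whose unbound variables act as input parameters; the vocabulary below reuses the invented example ontology from the TTL sketch above and is not part of this standard:

```sparql
# Illustrative skill (not normative): look up the remaining useful life
# for a given vehicle. ?vin acts as an input parameter whose bindings
# are supplied by the caller as a SPARQL result set, as described above.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <https://w3id.org/catenax/ontology/example#>

SELECT ?vin ?remainingUsefulLife WHERE {
  ?vehicle rdf:type ex:Vehicle ;
           ex:vehicleIdentificationNumber ?vin ;
           ex:remainingUsefulLife ?remainingUsefulLife .
}
```

Such a skill answers in the standard W3C SPARQL Query Results JSON format, for example (all values are made up):

```json
{
  "head": { "vars": [ "vin", "remainingUsefulLife" ] },
  "results": {
    "bindings": [
      {
        "vin": { "type": "literal", "value": "EXAMPLEVIN0000001" },
        "remainingUsefulLife": {
          "type": "literal",
          "datatype": "http://www.w3.org/2001/XMLSchema#integer",
          "value": "150000"
        }
      }
    ]
  }
}
```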
Skill Framework
Consumer/client-side component which is connected to the consumer dataspace components (the Matchmaking Agent via SPARQL, optionally: the EDC via the Data Management API). It is at least multi-user capable (can switch/look up identities of the logged-in user), if not multi-tenant capable (can switch Matchmaking Agents and hence EDC connectors). It looks up references to Skills in the Dataspace and delegates their execution to the Matchmaking Agent. The Skill Framework may maintain a “conversational state” per user (a contextual memory in the form of a graph/data set) which drives the workflow. It may also help to define, validate and maintain Skills in the underlying Dataspace Layer.
Query/Skill Editor
To systematically build and maintain Skills, a query editor for easy construction and debugging of queries is advisable. The skill editor should support syntax highlighting for the query language itself and may support auto-completion based on the Semantic Model. A skill editor could also offer a graphical mode in which a procedure can be composed out of pre-defined blocks. Finally, a skill editor should be able to test-drive the execution of a skill (possibly without storing/publishing the skill or making any “serious” contract agreements in the dataspace, and based on sample data).
Data Consuming App
Application that utilizes data of data providers to deliver added value to the user (e.g. a CO2 footprint calculation tool). Skills can be easily integrated into these apps as stored procedures. Hence, skill and app development can be decoupled to increase the efficiency of the app development process. For more flexible needs, Skills could be generated ad hoc from templates based on the business logic and app data. The Data Consuming App could integrate a Skill Framework to encapsulate the interaction with the Dataspace Layer. The Consuming App could also integrate a Query/Skill Editor for expert users.
Dataspace Layer
The base Dataspace-building technology is the Eclipse Dataspace Connector (EDC), which should be extended to operate as an HTTP/S contracting & transfer facility for the SPARQL-speaking Matchmaking Agent. To resolve dataspace offers and addresses using the ontological vocabulary, the Matchmaking Agent keeps a default meta-graph, the Federated Catalogue, which is used to host the Semantic Model and which is regularly synchronized with the relevant dataspace information, including the offers of surrounding business partners/EDCs.
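For illustration, a consumer could use that same vocabulary to look up offers in the Federated Catalogue. The following non-normative sketch assumes that the public asset properties described later in this chapter (rdf:type, rdfs:isDefinedBy) are queryable in the default graph; the cx-common namespace IRI is an assumption:

```sparql
# Non-normative sketch: find Graph Asset offers that are defined by the
# Catena-X core ontology. The cx-common namespace IRI is assumed here.
PREFIX rdf:       <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:      <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cx-common: <https://w3id.org/catenax/ontology/common#>

SELECT DISTINCT ?asset WHERE {
  ?asset rdf:type cx-common:GraphAsset ;
         rdfs:isDefinedBy <https://w3id.org/catenax/ontology/core> .
}
```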
EDC
The Eclipse Dataspace Connector (see Catena-X Standard CX-00001) actually consists of two components, one of which needs to be extended. See the Tractus-X Knowledge Agents EDC Extensions (KA-EDC) and their KA-EDC Deployment.
Control Plane
The Control Plane hosts the actual management/negotiation engine and is usually a singleton that exposes
- an internal (api-key secured) API through which administrative accounts/apps and the Matchmaking Agent manage the control plane, in particular to
  - manage Assets (= internal addresses pointing into the Binding/Virtualization/Backend Layers, including security and other contextual information, together with external metadata/properties of the Assets for discovery and self-description)
  - manage Policies (= conditions regarding the validity of Asset negotiations and interactions)
  - manage Contract Definitions (= Offers, i.e. combinations of Assets and Policies which are used to build up a Catalogue; a sketch follows after this list)
- a public (SSI-secured) Protocol API for coordination with the control planes of other business partners to set up transfer routings between the data planes.
- state machines for monitoring (data) transfer processes which are actually executed by the (multiple, scalable) data plane(s). KA uses the standard “HttpProxy” transfer.
- a validation engine which currently operates on static tokens/claims extracted from the transfer flow, but which may be extended to check additional runtime information in the form of properties
- callback triggers for announcing data plane transfer endpoints to external applications, such as the Matchmaking Agent (or other direct EDC clients, frameworks and applications). This KIT supports multiple Matchmaking Agent instances per EDC for load-balancing purposes and also allows for a bridged operation with other non-KA use cases, so it should be possible to configure several endpoint callback listeners per control plane.
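To make the Asset/Policy/Contract Definition split more tangible, the following is a minimal, non-normative sketch of a contract definition as it might be posted to the internal management API; the exact envelope and field names depend on the EDC version in use, and all identifiers and policy names are placeholders:

```json
{
  "@context": {
    "edc": "https://w3id.org/edc/v0.0.1/ns/"
  },
  "@id": "graph-behaviourtwin-contract-definition",
  "accessPolicyId": "bpn-restricted-access-policy",
  "contractPolicyId": "usage-purpose-contract-policy",
  "assetsSelector": [
    {
      "operandLeft": "https://w3id.org/edc/v0.0.1/ns/id",
      "operator": "=",
      "operandRight": "GraphAsset?oem=BehaviourTwin"
    }
  ]
}
```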
Data Plane
The Data Plane (multiple instances) performs the actual data transfer tasks as instrumented by the control plane. The data plane exposes transfer-specific capabilities (Sinks and Sources) to adapt the actual endpoint/asset protocols (in the EDC standard: the asset type).
Graph Assets use the asset type/data source “urn:cx:Protocol:w3c:Http#SPARQL”. In their address part, the following properties are supported (see the sketch after the table):
Data Address Property | Description |
---|---|
@type | must be set to "DataAddress" |
type | must be set to "cx-common:Protocol?w3c:http:SPARQL" |
id | The name under which the Graph will be offered. Should be a proper IRI/URN, such as GraphAsset?oem=BehaviourTwin |
baseUrl | The endpoint URL of the binding agent (see below). Should be a proper http/s SPARQL endpoint. |
proxyPath | must be set to “false” |
proxyMethod | must be set to “true” |
proxyQueryParams | must be set to “true” |
proxyBody | must be set to “true” |
authKey | optional authentication header, e.g. “X-Api-Key” |
authCode | optional authentication value, such as an API key |
header:Accepts | optional; fixes the Accepts header forwarded to the endpoint, e.g., “application/sparql-results+json” |
cx-common:allowServicePattern | an optional regular expression that overrides the default service URL allowance (white list) for this asset |
cx-common:denyServicePattern | an optional regular expression that overrides the default service URL denial (black list) for this asset |
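Put together, the address part of such a Graph Asset could look roughly as follows (a non-normative sketch; the endpoint URL, API key and service pattern are placeholders):

```json
{
  "@type": "DataAddress",
  "type": "cx-common:Protocol?w3c:http:SPARQL",
  "id": "GraphAsset?oem=BehaviourTwin",
  "baseUrl": "https://binding-agent.intranet.example/sparql",
  "proxyPath": "false",
  "proxyMethod": "true",
  "proxyQueryParams": "true",
  "proxyBody": "true",
  "authKey": "X-Api-Key",
  "authCode": "<api-key-of-the-binding-agent>",
  "header:Accepts": "application/sparql-results+json",
  "cx-common:allowServicePattern": "https://.*\\.intranet\\.example/sparql"
}
```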
Skill Assets use the asset type/data source “urn:cx:Protocol:w3c:Http#SKILL”. In their address part, the following properties are supported:
DataAddress Property | Description |
---|---|
@type | must be set to "DataAddress" |
type | must be set to "cx-common:Protocol?w3c:http:SKILL" |
id | The name under which the Skill will be offered. Should be a proper IRI/URN, such as SkillAsset?supplier=RemainingUsefulLife |
baseUrl | The endpoint URL of the binding agent (see below). Should be a proper http/s SPARQL endpoint. |
proxyPath | must be set to “false” |
proxyMethod | must be set to “true” |
proxyQueryParams | must be set to “true” |
proxyBody | must be set to “true” |
Both Skill and Graph Assets share the following public properties (a complete example follows after the table):
Public Property | Description |
---|---|
@type | must be set to "Asset" |
@id | The name under which the Skill/Graph will be offered. Should coincide with the "id" in the DataAddress |
properties.name | Title of the asset in the default language (English). |
properties.name@de | Title of the asset in German (or other languages accordingly). |
properties.description | Description of the asset in the default language (English) |
properties.description@de | Description of the asset in German (or other languages accordingly). |
properties.version | A version IRI |
properties.contenttype | "application/json, application/xml" for Graph Assets, "application/sparql-query, application/json, application/xml" for Skill Assets |
properties.rdf:type | "cx-common:GraphAsset" for Graph Assets, "cx-common:SkillAsset" for Skill Assets |
properties.rdfs:isDefinedBy | An RDF description listing the Use case ontologies that this asset belongs to, e.g., “<https://w3id.org/catenax/ontology/core>" |
properties.cx-common:implementsProtocol | should be set to “<urn:cx:Protocol:w3c:Http#SPARQL>” |
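Combining the public properties with the address part, a complete Skill Asset description could look roughly like the following non-normative sketch; the @context, the titles/descriptions, the endpoint URL and the exact management API envelope (e.g. the name of the data address wrapper) are illustrative and depend on the EDC version in use:

```json
{
  "@context": {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "cx-common": "https://w3id.org/catenax/ontology/common#"
  },
  "@type": "Asset",
  "@id": "SkillAsset?supplier=RemainingUsefulLife",
  "properties": {
    "name": "Remaining Useful Life Skill",
    "name@de": "Restlebensdauer-Skill",
    "description": "Computes the remaining useful life for a given vehicle.",
    "description@de": "Berechnet die Restlebensdauer eines Fahrzeugs.",
    "version": "1.0.0",
    "contenttype": "application/sparql-query, application/json, application/xml",
    "rdf:type": "cx-common:SkillAsset",
    "rdfs:isDefinedBy": "<https://w3id.org/catenax/ontology/core>",
    "cx-common:implementsProtocol": "<urn:cx:Protocol:w3c:Http#SPARQL>"
  },
  "dataAddress": {
    "@type": "DataAddress",
    "type": "cx-common:Protocol?w3c:http:SKILL",
    "id": "SkillAsset?supplier=RemainingUsefulLife",
    "baseUrl": "https://binding-agent.intranet.example/sparql",
    "proxyPath": "false",
    "proxyMethod": "true",
    "proxyQueryParams": "true",
    "proxyBody": "true"
  }
}
```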