Aggregator Specification

Living Document,

Previous Versions:
Issue Tracking:
Inline In Spec
Editor:
(Ghent University - imec)

Abstract

The Aggregator Protocol defines an HTTP interface for discovering aggregator deployments, registering aggregator instances, and managing services that execute data transformations. The protocol leverages OIDC and User-Managed Access (UMA) for authentication and authorization, enabling secure access to protected derived resources.

1. Introduction

This specification defines the Aggregator Protocol, an HTTP-based interface that lets a client create and manage Aggregator Instances and configure Aggregator Services that execute data transformations. Transformations are described and discovered using the Function Ontology (FnO) [FNO]; a service is configured by referencing a transformation (a fno:Execution) and providing its parameters, after which the client can retrieve the derived result from the output defined by that function.

Clients start from the Aggregator Server Description at the server base URL to discover the registration endpoint, supported registration flows, and the server’s transformation catalog (§ 4 Aggregator Server Metadata). Using the registration endpoint, a client creates (or manages) an Aggregator Instance (§ 5.1 Aggregator Registration Endpoint) and then follows the instance’s Aggregator Description to find the instance’s service collection and transformations endpoint (§ 7.1 Aggregator Description and § 8 Aggregator Service Management).

All management operations are authenticated and authorized. This protocol integrates with OpenID Connect [OIDC-Core] for identity and uses UMA-style authorization [UMA] for protected resources, scopes, and tickets as described in § 6 Aggregator Security Model (Authentication & Authorization).

Servers MUST support JSON (application/json) where specified and MAY additionally provide semantically annotated RDF representations (e.g., JSON-LD or Turtle) via HTTP content negotiation. The Aggregator vocabulary (§ 9 Vocabulary) provides stable IRIs for classes and predicates used throughout the specification.

2. Definitions

This section defines terminology used throughout this specification. Where applicable, terms are aligned with external specifications such as WebID Profiles [WEBID-PROFILE], OpenID Connect (OIDC) [OIDC-Core], OAuth 2.0 [RFC6749], and User-Managed Access (UMA) [UMA].

2.1. Core Roles and Components

2.2. Identity, Authorization, and Tokens

2.3. Namespaces

The following namespace prefixes are used throughout this specification and SHOULD appear in RDF serializations (e.g., Turtle, JSON-LD contexts):

3. Architecture & Resource Model

This section describes the high-level architecture of an Aggregator deployment and the resource model exposed by this specification. It connects the terminology in § 2 Definitions to the endpoint definitions in later sections.

3.1. Actors and Responsibilities

The protocol distinguishes the following parties:

3.2. Resource Types

This specification defines HTTP resources at three levels:

3.3. Base URLs and Addressing

An implementation MUST define a stable base URL for:

An implementation MAY use either of the following instance addressing patterns:

Clients SHOULD treat server-provided URLs as authoritative and avoid constructing URLs by string concatenation. Unless a path is explicitly fixed, the concrete URLs shown in this document serve as illustrative defaults; deployments MAY use different locations as long as the relevant metadata resources advertise the authoritative links.

3.4. Discovery and Entry Points

3.5. Security Boundaries

All security requirements are defined in § 6 Aggregator Security Model (Authentication & Authorization). At a high level:

The aggregator and its clients SHOULD use secure transport (HTTPS) for all communication, and SHOULD use DPoP tokens wherever possible to bind tokens to the client. Aggregator endpoints intended for browser-based clients MUST support CORS. Unless the client and Aggregator are tightly coupled and deployed under the same origin, the server MUST answer OPTIONS preflight requests and include appropriate Access-Control-Allow-* headers for the methods and headers used by this specification (including Authorization, Content-Type, and Accept). Implementations MAY restrict allowed origins to trusted client origins.

3.6. Example URL Layout (Non-normative)

Given an Aggregator Server at https://aggregator.example/ and a path-based Aggregator instance at https://aggregator.example/abc123/:

4. Aggregator Server Metadata

This section describes the public endpoints exposed by the Aggregator Server for discovery and metadata retrieval. Except for the Aggregator Server Description at the server base URL, implementations MAY expose the remaining endpoints at deployment-specific URLs; the server description document MUST include absolute URLs for each resource so that clients can discover them.

4.1. Aggregator Server Description

The Aggregator Server Description is a public document that allows clients to discover the Aggregator Server’s endpoints and capabilities. This document MUST be available at the server base URL (i.e., {aggregator-server-url}/), MUST be accessible without authentication, and MUST provide at least the following information. Each JSON member is paired with an RDF predicate from § 9 Vocabulary so the document can also be served as JSON-LD or other RDF formats using content negotiation based on [RFC9110]:

In semantically annotated representations, the Aggregator Server Description MUST state that the described resource has RDF type aggr:AggregatorServer (§ 9.1.3 aggr:AggregatorServer) (e.g., via @type in JSON-LD or a aggr:AggregatorServer in Turtle). Clients MAY rely on this type statement when consuming semantic representations.

registration_endpoint (REQUIRED):

The value is a string containing the absolute URL of the Aggregator Registration Endpoint (§ 5.1 Aggregator Registration Endpoint); in the RDF representations, this member maps to the predicate aggr:registrationEndpoint (§ 9.2.10 aggr:registrationEndpoint).

supported_registration_types (REQUIRED):

The value is a JSON array of strings identifying the supported registration flow tokens at registration_endpoint; in the RDF representations, each entry maps to an aggr:supportedRegistrationType triple (§ 9.2.11 aggr:supportedRegistrationType) whose object is the corresponding flow class IRI.

Each member MUST be one of the registration flow tokens defined in § 5.1 Aggregator Registration Endpoint:

registration_request_formats_supported (REQUIRED):

The value is a JSON array of strings identifying the supported request formats for the registration_endpoint. Each entry MUST be a media type; in RDF representations, each entry maps to an aggr:registrationRequestFormatSupported triple (§ 9.2.12 aggr:registrationRequestFormatSupported). Servers MUST include either application/json or application/x-www-form-urlencoded or both.

version (REQUIRED):

The value is a string containing the semantic version ([SEMVER]) of the Aggregator specification that the server adheres to; in the RDF representations, this member maps to the predicate aggr:specVersion (§ 9.2.13 aggr:specVersion).

client_identifier (REQUIRED):

The value is a string containing the absolute URL of the Client ID Document (§ 4.2 Client ID Document); in the RDF representations, this member maps to the predicate aggr:clientIdentifier (§ 9.2.14 aggr:clientIdentifier).

transformation_catalog (REQUIRED):

The value is a string containing the absolute URL of the Public Transformation Catalog (§ 4.3 Server-level Transformation Catalog); in the RDF representations, this member maps to the predicate aggr:transformationCatalog (§ 9.2.15 aggr:transformationCatalog).

{
  "@context": {
    "aggr": "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#",
    "provision": "aggr:ProvisionFlow",
    "authorization_code": "aggr:AuthorizationCodeFlow",
    "registration_request_formats_supported": "aggr:registrationRequestFormatSupported",
    "registration_endpoint": {
      "@id": "aggr:registrationEndpoint",
      "@type": "@id"
    },
    "supported_registration_types": {
      "@id": "aggr:supportedRegistrationType",
      "@type": "@vocab"
    },
    "version": {
      "@id": "aggr:specVersion"
    },
    "client_identifier": {
      "@id": "aggr:clientIdentifier",
      "@type": "@id"
    },
    "transformation_catalog": {
      "@id": "aggr:transformationCatalog",
      "@type": "@id"
    }
  },
  "@id": "https://aggregator.example/",
  "@type": "aggr:AggregatorServer",
  "registration_endpoint": "https://aggregator.example/registration",
  "supported_registration_types": [
    "provision",
    "authorization_code"
  ],
  "registration_request_formats_supported": [
    "application/json",
    "application/x-www-form-urlencoded"
  ],
  "version": "1.0.0",
  "client_identifier": "https://aggregator.example/client.json",
  "transformation_catalog": "https://aggregator.example/transformations"
}
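A client may want to sanity-check a retrieved server description before using it. The following non-normative Python sketch checks the REQUIRED members listed above. The function names are hypothetical, the set of flow tokens is taken from the flows described in § 5.1, and restricting URLs to the http and https schemes is an assumption of this sketch rather than a requirement of this specification.

```python
from urllib.parse import urlparse

REQUIRED_MEMBERS = (
    "registration_endpoint",
    "supported_registration_types",
    "registration_request_formats_supported",
    "version",
    "client_identifier",
    "transformation_catalog",
)
URL_MEMBERS = ("registration_endpoint", "client_identifier", "transformation_catalog")
# Flow tokens used by the registration flows in § 5.1 of this document.
KNOWN_FLOW_TOKENS = {"none", "provision", "authorization_code", "device_code"}
REQUEST_FORMATS = {"application/json", "application/x-www-form-urlencoded"}


def is_absolute_http_url(value):
    """Assumption of this sketch: only http(s) URLs with a host are accepted."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def check_server_description(doc):
    """Return a list of problems; an empty list means all checks passed."""
    problems = [f"missing member: {name}" for name in REQUIRED_MEMBERS if name not in doc]
    for name in URL_MEMBERS:
        if name in doc and not is_absolute_http_url(doc[name]):
            problems.append(f"{name} is not an absolute URL")
    for token in doc.get("supported_registration_types", []):
        if token not in KNOWN_FLOW_TOKENS:
            problems.append(f"unknown registration type: {token}")
    formats = doc.get("registration_request_formats_supported")
    if formats is not None and not REQUEST_FORMATS.intersection(formats):
        problems.append("no supported registration request format")
    return problems
```

Returning a list of problems instead of raising on the first one lets a client report every defect of a misconfigured deployment at once.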

4.2. Client ID Document

This endpoint exposes the Client ID Document that the aggregator uses for authorization. Servers MAY host this document at any URL; the client_identifier property in the Aggregator Server Description MUST contain the authoritative absolute URL. The Client ID Document MUST conform to the OAuth Client ID Metadata Document specification [Client-ID]. For the aggregator, the redirect_uris property is OPTIONAL rather than REQUIRED, because multiple clients MAY create an Aggregator on the same Aggregator Server (depending on the implementation). Including this property allows an Aggregator Server implementation to restrict which clients may create aggregators on the server.

4.3. Server-level Transformation Catalog

This endpoint returns an RDF document describing the transformations that the aggregator server supports at the server level. Servers MAY publish this catalog at any deployment-specific URL; the Aggregator Server Description MUST advertise it via the transformation_catalog field. This RDF document SHOULD be served using HTTP content negotiation. The transformations MUST be described using FnO [FNO]. The document MAY be protected using an OIDC ID Token.

The types (fno:type) of the parameters and outputs of the functions SHOULD be a class (owl:Class) that is a subclass (rdfs:subClassOf) of a Data Catalog entity such as dcat:DataService. This entity SHOULD then be further described using Data Catalog properties to indicate how to access the outputs of the transformations. Predefined pipelines (using fno:Compositions) MAY also be described in this document. Finally, the catalog MAY use rdfs:seeAlso to link to external functions, pipelines, or Data Catalog entities.

The specificity of the transformation is up to the server owner. The functions MAY have a link to an implementation with fno:Implementation if needed.

@prefix aggr: <https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#> .
@prefix trans: <http://aggregator.example.org/transformations#> .
@prefix fno: <https://w3id.org/function/ontology#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<> a aggr:TransformationCollection ;
    dcterms:title "Aggregator transformations" ;
    aggr:hasTransformation trans:AggregateSources ;
    aggr:hasTransformation trans:AggregateSources2 .

trans:AggregateSources
    a                   fno:Function ;
    fno:name            "An Aggregator Service that combines a list of sources"^^xsd:string ;
    fno:expects         ( trans:Sources ) ;
    fno:returns         ( trans:Result ) .

trans:Sources
    a             fno:Parameter ;
    fno:predicate trans:sources ;
    fno:type      rdf:List ;
    fno:required  "true"^^xsd:boolean .

trans:Result
    a             fno:Output ;
    fno:predicate trans:result ;
    fno:type      trans:SPARQLProtocol .

trans:SPARQLProtocol
    a owl:Class ;
    rdfs:subClassOf dcat:DataService ;
    dcterms:conformsTo <https://www.w3.org/TR/sparql12-protocol/> .

5. Aggregator Management

This section describes how Aggregator Instances are managed at the Aggregator Server level. Deployments define their own authorization policy for these endpoints, but they MUST require authenticated requests so the Aggregator provider can authorize a user to manage their aggregators. The tokens used in these requests (IDP_client_token) prove the identity of the user and the client (Client ID Document) used to access the Aggregator Server. Deployments MAY expose the same functionality at different paths, provided that the Aggregator Server Description advertises the authoritative URLs.

5.1. Aggregator Registration Endpoint

This section specifies the behavior of the registration_endpoint advertised by the Aggregator Server Description (§ 4.1 Aggregator Server Description). The endpoint supports creating Aggregator Instances, replacing the stored token set for an existing instance, and deleting an instance. If the registration_type is not none, the endpoint SHOULD reject unauthenticated requests with 401 Unauthorized.

ISSUE: Should we allow a name field for the aggregator during creation?

POST

Creates an Aggregator Instance or replaces the stored token set for an existing instance. The request body MUST use a content type listed in registration_request_formats_supported. When using application/json, the body MUST be a JSON object. When using application/x-www-form-urlencoded, the body parameters MUST be encoded as form fields with the same member names. The server MUST support application/json and MAY support application/x-www-form-urlencoded. The request body MUST include:

  • registration_type (REQUIRED): The value is a string token; it MUST be one of the supported_registration_types advertised in the Aggregator Server Description (§ 4.1 Aggregator Server Description). The following string tokens are defined, each corresponding to an RDF class in the Aggregator vocabulary (§ 9 Vocabulary) for semantically annotated representations:

  • aggregator (string, OPTIONAL): When present, the request targets an existing Aggregator Instance and the server MUST replace its stored access token and refresh token with a new set obtained from the Identity Provider (IdP). This is not a refresh-token grant; it is a full re-authentication to obtain a fresh access token and refresh token.

Depending on registration_type, additional members are defined:

  • registration_type: "none"

    • No additional members are required.

    In this flow the Aggregator Instance has no identity, and all requests to upstream resource servers will be unauthenticated. All hosted resources MUST be public or accessible without authentication. Depending on the deployment’s policy, the server MAY accept the registration POST request without authentication.

  • registration_type: "provision"

    • No additional members are required.

    In this flow the Aggregator server MUST provision an identity for the Aggregator Instance that is registered at the Identity Provider (IdP).

  • registration_type: "authorization_code"

    The authorization_code flow uses two POST messages to the registration_endpoint. It is based on the OAuth 2.0 Authorization Code grant [RFC6749] (https://datatracker.ietf.org/doc/html/rfc6749).

    The Start Request bootstraps PKCE [RFC7636]; the request body MUST include:

    • registration_type (string, REQUIRED): "authorization_code".

    • authorization_server (string, REQUIRED): The URL of the UMA Authorization Server that governs access to resources exposed by the Aggregator.

    The server MUST respond with 201 Created and a JSON object containing:

    • aggregator_client_id (string, REQUIRED): The Client ID Document of the aggregator.

    • code_challenge (string, REQUIRED)

    • code_challenge_method (string, REQUIRED)

    • state (string, REQUIRED)

    The server MAY also include IdP discovery hints (for example issuer or authorization_endpoint) if the client cannot determine them through other means.

    The Finish Request redeems the authorization code; the request body MUST include:

    • registration_type (string, REQUIRED): "authorization_code".

    • code (string, REQUIRED): The authorization code issued by the IdP.

    • redirect_uri (string, REQUIRED): The redirect URI used in the authorization request.

    • state (string, REQUIRED): The state returned by the start request.

    The finish request reuses the authorization_server stored from the start request, and the client SHOULD NOT include it again.

    This flow involves two Client ID Documents: the Aggregator Client ID Document (§ 4.2 Client ID Document) and the client application’s Client ID Document. The latter is identified by the client_id URI in the aud claim of the IDP_client_token sent in the Authorization header of the Start Request. If the aud claim is missing or does not contain a dereferenceable client identifier, the server MUST respond with 400 Bad Request.

    If the Aggregator Client ID Document does not have redirect_uris registered, the Aggregator Server MUST verify that the redirect_uri in the request matches one of the redirect URIs registered in the client application’s Client ID Document. If the Aggregator Client ID Document does have redirect_uris registered, the server MAY skip this check of the client application’s Client ID Document. In deployments where the client application does not have a Client ID Document, the Aggregator Server MUST require redirect_uris to be registered in the Aggregator Client ID Document, which will be validated by the IdP.

    In all cases, the server MUST verify that the state matches the stored state for the pending registration.

  • registration_type: "device_code"

    The device_code flow uses two POST messages to the registration_endpoint and follows the OAuth 2.0 Device Authorization Grant [RFC8628]. This flow is intended for headless components (for example CLI tools) that need to authenticate an Aggregator Instance where the authorization_code flow is not practical.

    The Start Request initiates the device authorization flow; the request body MUST include:

    • registration_type (string, REQUIRED): "device_code".

    • authorization_server (string, REQUIRED): The URL of the UMA Authorization Server that governs access to resources exposed by the Aggregator.

    The server MUST determine the IdP from the IDP_client_token in the authorization header (for example, via the iss claim and discovery), request device authorization at the IdP, securely store the returned device_code, and respond with 201 Created and a JSON object containing:

    • state (string, REQUIRED): Opaque value used by the client to poll completion of the device flow.

    • user_code (string, REQUIRED): The end-user verification code issued by the IdP.

    • verification_uri (string, REQUIRED): The end-user verification URI on the IdP.

    • verification_uri_complete (string, OPTIONAL): A verification URI that includes the "user_code" (or other information with the same function as the "user_code"), which is designed for non-textual transmission.

    • expires_in (number, REQUIRED): The lifetime in seconds of the device authorization session.

    • interval (number, OPTIONAL): Minimum polling interval in seconds. This value may differ from the interval the IdP recommends.

    The device_code is confidential and MUST NOT be returned to the client.

    The Poll Request checks whether the user has authorized the device code. The client uses the state value to poll for completion; the request body MUST include:

    • registration_type (string, REQUIRED): "device_code".

    • state (string, REQUIRED): The state value returned by the start request.

    Unlike a traditional OAuth device flow, the client does not poll the IdP directly. Upon receiving a poll request, the Aggregator MUST attempt to redeem the stored device_code at the IdP token endpoint (grant_type=urn:ietf:params:oauth:grant-type:device_code). If authorization is not yet complete, the server MUST respond with 202 Accepted. When authorization succeeds, the server creates (or updates) the Aggregator Instance and responds as for other successful create/update operations (see below). If the device authorization session has expired, the server MUST respond with 400 Bad Request. The device authorization session MUST be bound to the authenticated caller.
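The poll handling above can be summarized as a mapping from the IdP token endpoint response to the status code the Aggregator returns. The following non-normative Python sketch illustrates that mapping. The function name is hypothetical. Treating the RFC 8628 slow_down error like authorization_pending, and answering every other IdP error (including expired_token and access_denied) with 400 Bad Request, are assumptions of this sketch; this specification only fixes the pending, success, and expired cases.

```python
def poll_response_status(idp_status, idp_body, replaces_existing):
    """Map an IdP token endpoint response to the Aggregator's poll response status.

    idp_status: HTTP status code returned by the IdP token endpoint.
    idp_body: parsed JSON body of that response (a dict).
    replaces_existing: True when the pending registration named an existing aggregator.
    """
    if idp_status == 200 and "access_token" in idp_body:
        # Tokens were issued: 200 OK for a token replacement, 201 Created otherwise.
        return 200 if replaces_existing else 201
    error = idp_body.get("error")
    if error in ("authorization_pending", "slow_down"):
        # The user has not finished authorizing yet (slow_down handling is an assumption).
        return 202
    # Expired session, or any other error (assumption of this sketch).
    return 400
```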

Success responses

For successful POST requests that create/update an Aggregator Instance (i.e., provision, the authorization_code finish request, and successful device_code poll requests), the server MUST respond with:

  • 201 Created when it created a new Aggregator Instance (i.e., no aggregator was provided).

  • 200 OK when it replaced the token set for an existing Aggregator Instance (i.e., aggregator was provided).

For successful POST requests that create/update an Aggregator Instance (i.e., all types except the authorization_code and device_code start requests), the response MUST be a JSON object and MUST include:

  • aggregator (string, REQUIRED): Absolute URL of the Aggregator Instance base URL that dereferences to the Aggregator Description (§ 7.1 Aggregator Description).

  • subject (string, OPTIONAL): The WebID or Client ID for which OIDC tokens were created. This member MUST be present when the registration_type is provision.

  • idp (string, OPTIONAL): The Identity Provider (IdP) that issued the OIDC tokens for the Aggregator Instance. This member MUST be present when the registration_type is provision and the subject is not a WebID.
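The status code and member rules above can be sketched as a small response builder. This Python example is non-normative and the function names are hypothetical. In particular, treating any http or https URL as a WebID is a simplifying assumption of this sketch; this specification does not define how a server distinguishes a WebID from another kind of subject.

```python
from urllib.parse import urlparse


def looks_like_webid(subject):
    """Assumption of this sketch: a WebID is any http(s) URL."""
    parts = urlparse(subject)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_success_response(aggregator_url, registration_type, replaced_existing,
                           subject=None, idp=None):
    """Return (status, body) for a POST that created or updated an Aggregator Instance."""
    status = 200 if replaced_existing else 201
    body = {"aggregator": aggregator_url}
    if registration_type == "provision":
        if subject is None:
            raise ValueError("subject is required for the provision flow")
        body["subject"] = subject
        if not looks_like_webid(subject):
            if idp is None:
                raise ValueError("idp is required when the subject is not a WebID")
            body["idp"] = idp
    return status, body
```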

When a registration flow completes successfully (for example provision, authorization_code finish, or a successful device_code poll), the server obtains and stores an IdP access token for the Aggregator Instance (the IDP_aggregator_token) and, optionally, an accompanying refresh token. The Aggregator uses these tokens for upstream access. The server MUST NOT return any IdP access tokens, refresh tokens, or user credentials to the client.

Token replacement rules

To update the tokens for an existing Aggregator Instance, the client MUST include the aggregator member along with the other required members for the selected registration_type. This SHOULD only be done for the following registration_type values; for other values, the server SHOULD respond with 400 Bad Request:

  • authorization_code: To obtain a new access token and refresh token after the previous ones have expired.

  • device_code: To obtain a new access token and refresh token after the previous ones have expired.

GET

This method returns the list of Aggregator Instances owned by the authenticated user. The server MUST respond with 200 OK and a JSON array containing zero or more Aggregator Description URLs (§ 7.1 Aggregator Description).

DELETE

Deletes an existing Aggregator Instance. The request body MUST use a content type listed in registration_request_formats_supported. When using application/json, the body MUST be a JSON object. When using application/x-www-form-urlencoded, the body parameters MUST be encoded as form fields with the same member names. The request body MUST include:

  • aggregator (string, REQUIRED): Absolute URL of the Aggregator Instance base URL that dereferences to the Aggregator Description (§ 7.1 Aggregator Description).

If deletion succeeds the server MUST respond with 204 No Content.

For error conditions, the server MUST respond with:

5.2. Aggregator Management Flows (Non-normative)

This section gives non-normative examples of how a client can use the registration_endpoint to create, delete, and re-authenticate an Aggregator Instance.

5.2.1. Creation provision Flow

The provision flow allows clients to create an Aggregator with its own identity. This lets resource owners target access-control policies at the aggregator’s dedicated WebID instead of having the aggregator impersonate another user’s WebID.

1. Client starts flow with Aggregator Server

The client calls the registration endpoint authenticated with its IDP_client_token.

POST /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
  "registration_type": "provision"
}

2. Aggregator Server provisions an account at an IDP

The Aggregator Server provisions an account at an IDP. This might be linked to a WebID document that conforms to the WebID Profile specification [WEBID-PROFILE]. Using the credentials of this account the Aggregator Server can perform a client credentials grant to obtain the IDP_aggregator_token (and accompanying refresh token) to authorize the aggregator acting under its own WebID.

3. Aggregator Server creates an aggregator

Using the obtained tokens, the Aggregator Server creates an aggregator linked to the user, and returns the aggregator description (§ 7.1 Aggregator Description). The aggregator should not give these tokens or credentials to the client.

HTTP/1.1 201 Created
Content-Type: application/json

{
  "aggregator": "https://aggregator.example/aggregators/agg-7890/",
  "subject": "https://aggregator.example/webid#me"
}

Or with a non-WebID subject:

HTTP/1.1 201 Created
Content-Type: application/json

{
  "aggregator": "https://aggregator.example/aggregators/agg-7890/",
  "subject": "aggregator@example.com",
  "idp": "https://idp.example/"
}

5.2.2. Creation authorization_code Flow

The authorization_code flow allows clients to create an aggregator that acts on behalf of the end-user, but with a token that is scoped specifically for the aggregator. This flow follows the OAuth 2.0 Authorization Code grant [RFC6749] (https://datatracker.ietf.org/doc/html/rfc6749).

PlantUML Diagram

1. Client starts flow with Aggregator Server

The client begins by asking the Aggregator to bootstrap an authorization_code registration and indicate which authorization server should be used. The Aggregator identifies the client application from the IDP_client_token and responds with the public parameters required for the OIDC authorization request.

POST /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
  "registration_type": "authorization_code",
  "authorization_server": "https://as.example"
}

1.1 Aggregator responds with public parameters

The Aggregator generates the PKCE verifier/challenge pair plus a random state, persists them together with the pending registration, and returns only the public portions (aggregator_client_id, code_challenge, code_challenge_method, state) to the client application. The authorization_server value identifies the UMA Authorization Server (AS) that governs access to resources exposed by the Aggregator. The aggregator uses the IDP_client_token to identify the user’s IdP and Application Client ID Document for the subsequent OIDC exchange.

HTTP/1.1 201 Created
Content-Type: application/json

{
  "aggregator_client_id": "https://aggregator.example/client.jsonld",
  "code_challenge": "1uLSZp2...",
  "code_challenge_method": "S256",
  "state": "1eb7c8f5..."
}
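The verifier/challenge pair and state in this step could be generated as in the following non-normative Python sketch, which uses the S256 method from [RFC7636]. The function names and the shape of the pending record are hypothetical; how a deployment persists the pending registration is not covered here.

```python
import base64
import hashlib
import secrets


def s256_challenge(code_verifier):
    """Compute the PKCE S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def start_authorization_code_registration(aggregator_client_id):
    """Return (pending, public): the record kept server-side and the response body."""
    code_verifier = secrets.token_urlsafe(32)  # 43 URL-safe characters
    state = secrets.token_urlsafe(16)
    # Kept by the Aggregator only; the verifier never leaves the server.
    pending = {"code_verifier": code_verifier, "state": state}
    # Public portions returned to the client application.
    public = {
        "aggregator_client_id": aggregator_client_id,
        "code_challenge": s256_challenge(code_verifier),
        "code_challenge_method": "S256",
        "state": state,
    }
    return pending, public
```

Keeping the code_verifier out of the response is what allows the Aggregator, rather than the client application, to redeem the authorization code later.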

2. Client sends the end-user through the IDP authorization endpoint

Using the information supplied by the Aggregator, the client constructs an authorization request against the IdP.

GET https://idp.example/authorize?
    response_type=code&
    client_id=https%3A%2F%2Faggregator.example%2Fclient.jsonld&
    redirect_uri=https%3A%2F%2Fapp.example%2Fcallback&
    scope=openid%20webid%20offline_access&
    code_challenge=1uLSZp2...&
    code_challenge_method=S256&
    state=1eb7c8f5...
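A client could assemble this request URL from the start response as in the following non-normative Python sketch. The function name is hypothetical, and the authorization endpoint, redirect URI, and default scope are the illustrative values used in this example rather than fixed values.

```python
from urllib.parse import quote, urlencode


def build_authorization_url(authorization_endpoint, start_response, redirect_uri,
                            scope="openid webid offline_access"):
    """Build the IdP authorization request URL from the Aggregator's start response."""
    query = urlencode(
        {
            "response_type": "code",
            # The Aggregator's client identifier, not the client application's.
            "client_id": start_response["aggregator_client_id"],
            "redirect_uri": redirect_uri,
            "scope": scope,
            "code_challenge": start_response["code_challenge"],
            "code_challenge_method": start_response["code_challenge_method"],
            "state": start_response["state"],
        },
        quote_via=quote,  # encode spaces as %20 instead of "+"
    )
    return f"{authorization_endpoint}?{query}"
```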

2.1 IDP dereferences the Aggregator Client ID Document

If the IDP does not already have the aggregator_client_id registered, it dereferences the Aggregator’s Client ID Document to retrieve the client metadata (for example redirect URIs and other policy-required fields).

3. User authenticates and consents at the IDP

The IDP performs its usual login and consent screens, after which it issues an authorization_code tied to the Aggregator’s client.

4. IDP redirects the user agent back to the client’s redirect_uri

The IDP redirects the user agent back to the client application with the authorization code and the original state.

HTTP/1.1 302 Found
Location: https://app.example/callback?code=SplxlOBeZQQYbYS6WxSbIA&state=1eb7c8f5...

5. Client posts the authorization code back to the Aggregator

The client sends the code, redirect URI, and echoed state to the registration endpoint so the Aggregator can finish the flow.

POST /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
  "registration_type": "authorization_code",
  "code": "SplxlOBeZQQYbYS6WxSbIA",
  "redirect_uri": "https://app.example/callback",
  "state": "1eb7c8f5..."
}

5.1 Aggregator dereferences the client application’s Client ID Document

If the Aggregator Client ID Document does not list redirect URIs, the Aggregator dereferences the client application’s Client ID Document (in this example the JSON-LD document at https://app.example/client.jsonld) to obtain the registered redirect URIs. The Aggregator then verifies that the supplied redirect_uri belongs to that set and that the returned state matches the stored state.
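The two checks in this step can be sketched in Python as follows. This example is non-normative and the function name is hypothetical. Comparing redirect URIs by exact string match is an assumption of this sketch, and fetching and parsing the Client ID Document is left out.

```python
import hmac


def check_finish_request(request, stored_state, registered_redirect_uris):
    """Return a list of problems with a finish request; empty means it may proceed."""
    problems = []
    supplied_state = str(request.get("state", ""))
    # Constant-time comparison avoids leaking how much of the state matched.
    if not hmac.compare_digest(supplied_state.encode("utf-8"),
                               stored_state.encode("utf-8")):
        problems.append("state does not match the pending registration")
    # Assumption of this sketch: redirect URIs are compared as exact strings.
    if request.get("redirect_uri") not in registered_redirect_uris:
        problems.append("redirect_uri is not registered")
    return problems
```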

5.2 Aggregator redeems the authorization code at the IDP token endpoint

POST /token HTTP/1.1
Host: idp.example
Content-Type: application/x-www-form-urlencoded
Authorization: Basic <aggregator-client-auth>

grant_type=authorization_code&
code=SplxlOBeZQQYbYS6WxSbIA&
redirect_uri=https%3A%2F%2Fapp.example%2Fcallback&
client_id=https%3A%2F%2Faggregator.example%2Fclient.jsonld&
code_verifier=Hjs8...stored...

The IDP verifies the authorization_code, ensures the redirect_uri matches the original authorization request, and recomputes the PKCE challenge from the supplied code_verifier. If everything matches, it returns:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "access_token": "<IDP_aggregator_token>",
  "refresh_token": "<refresh_token>",
  "token_type": "Bearer",
  "expires_in": 3600
}

5.3 Aggregator finalizes the account and responds

Using the issued tokens, the Aggregator creates the aggregator account linked to the user and returns the aggregator description (§ 7.1 Aggregator Description).

HTTP/1.1 201 Created
Content-Type: application/json

{
  "aggregator": "https://aggregator.example/aggregators/agg-6780/"
}

5.2.3. Creation device_code Flow

The device_code flow allows headless components (for example CLI tools) to authenticate an Aggregator Instance by using the OAuth 2.0 Device Authorization Grant [RFC8628].

PlantUML Diagram

1. Client starts flow with Aggregator Server

The client calls the registration endpoint authenticated with its IDP_client_token, indicating the authorization server that governs access to resources exposed by the Aggregator.

POST /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
  "registration_type": "device_code",
  "authorization_server": "https://as.example"
}

1.1 Aggregator Server returns device authorization parameters

The Aggregator requests device authorization parameters from the IdP and returns the user code, verification URI(s), and polling state to the client.

HTTP/1.1 201 Created
Content-Type: application/json

{
  "state": "state-abc",
  "user_code": "WDJB-MJHT",
  "verification_uri": "https://idp.example/activate",
  "verification_uri_complete": "https://idp.example/activate?user_code=WDJB-MJHT",
  "expires_in": 600,
  "interval": 5
}

2. User authorizes at the IdP

The client prompts the user to visit the verification URI and enter the user code to authorize access.

3. Client polls Aggregator Server

The client polls the registration endpoint using the state value, waiting at least the returned interval (if provided) between polls. The Aggregator polls the IdP token endpoint using the stored device code until the user authorizes or the device code expires.

POST /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
  "registration_type": "device_code",
  "state": "state-abc"
}

If authorization is not yet complete, the Aggregator responds with 202 Accepted:

HTTP/1.1 202 Accepted

Once authorization succeeds, the Aggregator creates the Aggregator Instance and responds as for other successful create operations:

HTTP/1.1 201 Created
Content-Type: application/json

{
  "aggregator": "https://aggregator.example/aggregators/agg-5670/"
}
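The client-side polling in step 3 can be sketched in Python as follows. This is a minimal sketch rather than a reference client: the `send` callable (assumed to POST the JSON body to the registration endpoint and return the status code with the parsed body) and the function name `poll_registration` are hypothetical and not defined by this specification.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical transport: takes the JSON body to POST to the registration
# endpoint and returns (status_code, parsed_json_body_or_None).
Send = Callable[[dict], Tuple[int, Optional[dict]]]


def poll_registration(send: Send, state: str, interval: float = 5.0,
                      expires_in: float = 600.0,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the Aggregator Instance exists and return its URL.

    Raises TimeoutError when the device code lifetime has elapsed and
    RuntimeError on any status other than 201 or 202.
    """
    body = {"registration_type": "device_code", "state": state}
    waited = 0.0
    while waited < expires_in:
        status, payload = send(body)
        if status == 201 and payload is not None:
            return payload["aggregator"]
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        # 202 Accepted: authorization is still pending, so wait at least
        # the advertised interval before polling again.
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before the user authorized")
```

Injecting `send` and `sleep` keeps the loop independent of any HTTP library and makes it straightforward to exercise without a live server.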

Example server-side token polling (Non-normative)

While polling, the IdP can return an authorization-pending error:

POST /token HTTP/1.1
Host: idp.example
Content-Type: application/x-www-form-urlencoded

grant_type=urn:ietf:params:oauth:grant-type:device_code&
device_code=GmRhmhcxhwAzkoEqiMEg_DnyEysNkuNhszIySk9eS

HTTP/1.1 400 Bad Request
Content-Type: application/json

{
  "error": "authorization_pending"
}

Once the user completes authorization, the IdP returns the token set:

HTTP/1.1 200 OK
Content-Type: application/json

{
  "access_token": "<IDP_aggregator_token>",
  "refresh_token": "<refresh_token>",
  "token_type": "Bearer",
  "expires_in": 7200
}

5.2.4. Token Update Flow

This flow allows users to replace the stored access token and refresh token for an existing Aggregator Instance. This is not a refresh-token grant: even refresh tokens can expire, so the client repeats the original registration flow to obtain a new token set (access token and refresh token) for the Aggregator Instance.

The flow is the same as creating an Aggregator Instance, except that an aggregator member is provided in the start request. The exact steps depend on the registration_type used when creating the Aggregator Instance. For example, for the authorization_code flow:

POST /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
    "registration_type": "authorization_code",
    "aggregator": "https://aggregator.example/aggregators/agg-7890/"
}

5.2.5. Aggregator Listing Flow

This flow allows users to list Aggregator Instances by sending a GET request to the registration_endpoint.

GET /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>

HTTP/1.1 200 OK
Content-Type: application/json

[
  "https://aggregator.example/aggregators/agg-7890/",
  "https://aggregator.example/aggregators/agg-9012/"
]

5.2.6. Aggregator Deletion Flow

This flow allows users to delete an existing Aggregator Instance by sending a DELETE request to the registration_endpoint with the aggregator member.

DELETE /registration HTTP/1.1
Authorization: Bearer <IDP_client_token>
Content-Type: application/json

{
    "aggregator": "https://aggregator.example/aggregators/agg-7890/"
}

6. Aggregator Security Model (Authentication & Authorization)

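This section describes how the Aggregator handles authentication and authorization for upstream access, where the Aggregator requests data from upstream resources (§ 6.1 Upstream Access (Requesting)), and for client access, where the Aggregator serves derived resources (§ 6.2 Client Access (Serving)).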

The Aggregator relies on the Authorization for Data Spaces (A4DS) specification [A4DS] to authenticate Clients and authorize access to resources. For streaming or non-HTTP interfaces, the Aggregator MAY additionally use the Service Authorization for Data Spaces (SA4DS) specification [SA4DS]. All Aggregator endpoints are protected using User-Managed Access (UMA) [UMA].

In this section, the Upstream Resource Server (URS) is the Resource Server that hosts the original data the Aggregator consumes. The Upstream Authorization Server (UAS) is the Authorization Server that protects that URS and issues access tokens for those upstream resources. The Aggregator Authorization Server (AAS) is the Authorization Server that protects the Aggregator itself and issues tokens (e.g., UMA RPTs) for access to derived resources.

NOTE: The following behavior extends the A4DS specification and is intended to be incorporated in a future version of A4DS.

6.1. Upstream Access (Requesting)

Normative requirements

The Aggregator Service MUST obtain upstream access tokens according to A4DS and UMA and MUST present the ID token (IDP_aggregator_token) obtained during aggregator registration (§ 5.1 Aggregator Registration Endpoint). When the Aggregator Service intends to create derived resources, it MUST request scope urn:knows:uma:scopes:derivation-creation. The Aggregator Service MAY include a transformation description immediately. If not, the Upstream Authorization Server MAY require transformation description claims; if it does, it MUST respond with a UMA need_info error and required_claims as defined by A4DS. If the UAS requests it via need_info, the Aggregator Service MUST include a transformation description claim token using UMA claims pushing ([A4DS]). The claim_type for this description is https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#transformation-description. The claim_token_format values identify acceptable RDF serializations as URIs (e.g., http://www.w3.org/ns/formats/Turtle).

If the UAS allows the Aggregator Service to create this derived resource, the UAS MUST include a derivation_resource_id field in the response next to the access token. This derivation_resource_id is a unique identifier that links the requested data and the intended transformation to the derived resource that the Aggregator Service will create. If the Aggregator Service previously obtained a derivation_resource_id for the same upstream resource and transformation, it SHOULD include it in the token request using the derivation_resource_id field. If the hint is still valid and bound to the authenticated Aggregator and resource, the UAS SHOULD reuse it; otherwise it SHOULD ignore it and issue a new identifier. The UAS MUST include a derivation_resource_id in successful access-token responses for derivation creation. Access tokens MAY be reused until they expire or are revoked.

The Aggregator Service MUST modify the asset ([A4DS]) representing the derived resource on the AAS with a derived_from entry containing the issuer (the UAS) and derivation_resource_id, and the AAS MUST expire previous access tokens for that derived resource. The Aggregator Service SHOULD use the data from the upstream resource only after a successful resource registration update at the AAS. When a derived resource is no longer used, the Aggregator Service SHOULD remove the derived_from entry, MAY expire previous access tokens, and SHOULD delete the derivation_resource_id at the UAS.

How should the Aggregator Service delete the derivation_resource_id at the UAS?

Non-normative flow and examples

PlantUML Diagram

1. Requesting the upstream resource without token

The Aggregator Service requests the upstream resource without an access token. If the resource is protected with UMA, the Upstream Resource Server responds with a 401 Unauthorized status and a UMA ticket.

GET /resource/123 HTTP/1.1
Host: upstream.example.org

1.1 URS requests ticket from UAS

The Upstream Resource Server requests a UMA ticket from its Upstream Authorization Server, as described in A4DS ([A4DS]).

1.2 URS returns ticket

The Upstream Resource Server returns the ticket it got from the UAS to the Aggregator Service.

HTTP/1.1 401 Unauthorized
WWW-Authenticate: UMA realm="solid", as_uri="https://upstream.as.example.org/uma", ticket="tkt-URS"
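To continue the flow, the Aggregator Service needs the as_uri and ticket values from this challenge. The following Python sketch shows one way to extract them; the function name `parse_uma_challenge` is hypothetical, and the parser assumes quoted parameter values without escaped quotes, as in the example above.

```python
import re
from typing import Dict


def parse_uma_challenge(header_value: str) -> Dict[str, str]:
    """Extract the quoted parameters from a `WWW-Authenticate: UMA ...` value."""
    scheme, _, params = header_value.strip().partition(" ")
    if scheme.lower() != "uma":
        raise ValueError(f"not a UMA challenge: {scheme!r}")
    # Collect key="value" pairs; assumes values contain no escaped quotes.
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


challenge = parse_uma_challenge(
    'UMA realm="solid", as_uri="https://upstream.as.example.org/uma", ticket="tkt-URS"'
)
# challenge["as_uri"] identifies the UAS; challenge["ticket"] goes into the token request.
```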

2. Requesting an access token from the UAS

Using the IDP_aggregator_token obtained during registration, the Aggregator Service requests an access token from the Upstream Authorization Server (of the Upstream Resource Server). The request typically includes the derivation-creation scope when creating derived resources. If the Upstream Authorization Server needs details about the intended transformation, it can request them using UMA need_info with a transformation claim type:

{
  "error": "need_info",
  "ticket": "d9c3b6f2-0f7a-4c6e-9d2f-1a7c4e9b8a21",
  "required_claims": [
    {
      "claim_type": "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#transformation-description",
      "friendly_name": "Intended data transformation",
      "claim_token_format": [
        "http://www.w3.org/ns/formats/Turtle",
        "http://www.w3.org/ns/formats/JSON-LD"
      ],
      "claim_description": "Describe the transformation that will be executed on the requested data (e.g. query, mapping, model, or workflow reference)."
    }
  ]
}
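An Aggregator Service receiving such a response has to choose a serialization that both it and the UAS support. The following Python sketch illustrates one possible selection routine; the function name `pick_transformation_format` and the `supported` argument are assumptions made for illustration only.

```python
from typing import Iterable, Optional

TRANSFORMATION_CLAIM = (
    "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/"
    "#transformation-description"
)


def pick_transformation_format(error_body: dict,
                               supported: Iterable[str]) -> Optional[str]:
    """Return the first format the UAS accepts that the caller can produce.

    `supported` is the caller's own list of serialization URIs (hypothetical
    input, not defined by this specification). Returns None when the error
    does not ask for a transformation description or no format matches.
    """
    supported = set(supported)
    for claim in error_body.get("required_claims", []):
        if claim.get("claim_type") != TRANSFORMATION_CLAIM:
            continue
        formats = claim.get("claim_token_format", [])
        if isinstance(formats, str):  # tolerate a single string value
            formats = [formats]
        for fmt in formats:
            if fmt in supported:
                return fmt
    return None
```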

The Aggregator then resubmits the token request with an additional claim token describing the intended transformation. If the service description or transformation catalogs are dereferenceable by the UAS, the Aggregator can provide those URIs rather than embedding a full description.

If the Aggregator has previously obtained a derivation_resource_id for the same upstream resource and transformation, it should include it as a hint using the derivation_resource_id parameter.

Example token request after a need_info response (includes transformation claim):

POST /token HTTP/1.1
Host: upstream.as.example.org
Content-Type: application/json

{
    "grant_type": "urn:ietf:params:oauth:grant-type:uma-ticket",
    "ticket": "tkt-URS",
    "scope": "urn:knows:uma:scopes:derivation-creation",
    "derivation_resource_id": "handle-id-1",
    "claim_tokens": [
      {
        "claim_token": "<IDP_aggregator_token>",
        "claim_token_format": "http://openid.net/specs/openid-connect-core-1_0.html#IDToken"
      },
      {
        "claim_token": "<transformation_description>",
        "claim_token_format": "http://www.w3.org/ns/formats/Turtle"
      }
    ]
}
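The request body above can be assembled programmatically. The following Python sketch builds it from its parts, adding the transformation claim and the derivation_resource_id hint only when they are available; the function name `build_upstream_token_request` and its parameters are hypothetical.

```python
from typing import Optional

UMA_TICKET_GRANT = "urn:ietf:params:oauth:grant-type:uma-ticket"
ID_TOKEN_FORMAT = "http://openid.net/specs/openid-connect-core-1_0.html#IDToken"
DERIVATION_CREATION = "urn:knows:uma:scopes:derivation-creation"
TURTLE = "http://www.w3.org/ns/formats/Turtle"


def build_upstream_token_request(ticket: str, id_token: str,
                                 transformation: Optional[str] = None,
                                 transformation_format: str = TURTLE,
                                 derivation_resource_id: Optional[str] = None) -> dict:
    """Build the JSON body of a token request to the UAS."""
    # The IDP_aggregator_token is always pushed as the first claim token.
    claim_tokens = [{"claim_token": id_token, "claim_token_format": ID_TOKEN_FORMAT}]
    if transformation is not None:
        # Pushed up front or after a need_info response from the UAS.
        claim_tokens.append({"claim_token": transformation,
                             "claim_token_format": transformation_format})
    body = {
        "grant_type": UMA_TICKET_GRANT,
        "ticket": ticket,
        "scope": DERIVATION_CREATION,
        "claim_tokens": claim_tokens,
    }
    if derivation_resource_id is not None:
        # Hint from a previous request for the same resource and transformation.
        body["derivation_resource_id"] = derivation_resource_id
    return body
```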

2.1 UAS validates the IDP_aggregator_token

The Upstream Authorization Server validates the IDP_aggregator_token with the Identity Provider that issued it.

2.2 UAS returns access token and derivation resource identifier

Adding the derivation-creation scope signals to the Upstream Authorization Server that the Aggregator intends to create a derived resource based on the requested upstream resource. The Upstream Authorization Server includes a derivation_resource_id in the response. The derivation_resource_id is a unique identifier that the Aggregator will use to reference this upstream resource when creating derived resources. The Upstream Authorization Server can link this identifier to the Aggregator’s identity and intended use to manage and track derived resources.

HTTP/1.1 200 OK
Content-Type: application/json

{
    "access_token": "<upstream_access_token>",
    "derivation_resource_id": "handle-id-1"
}

3. Accessing the upstream resource with token

The Aggregator Service then requests the resource from the Upstream Resource Server using the access token obtained from the Upstream Authorization Server. This access token can be used multiple times until it expires or is revoked.

4. Resource registration of the Aggregator Service

Finally, the Aggregator Service updates the resource registration at its own Authorization Server to signal that it used this derivation_resource_id to create derived resources.

PUT /resource-registration/agg-service-123 HTTP/1.1
Host: as.example.org
Content-Type: application/json

{
  ...
  "derived_from": [
    {
      "issuer": "https://as.example.org",
      "derivation_resource_id": "handle-id-1"
    }
  ]
}

If a resource is no longer used, the Aggregator Service can update the resource registration to remove the derivation_resource_id from the derived_from relations and delete the identifier at the Upstream Authorization Server, following the Resource ID deletion procedure.
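Maintaining the derived_from relations amounts to adding and removing (issuer, derivation_resource_id) pairs in the registration document before sending the update. A minimal Python sketch, assuming the registration is handled as a plain dictionary and using the hypothetical helper names `add_derivation` and `remove_derivation`:

```python
def add_derivation(registration: dict, issuer: str,
                   derivation_resource_id: str) -> dict:
    """Return a copy of the registration with the derived_from entry present."""
    updated = dict(registration)
    entries = list(updated.get("derived_from", []))
    entry = {"issuer": issuer, "derivation_resource_id": derivation_resource_id}
    if entry not in entries:  # avoid duplicate entries on repeated updates
        entries.append(entry)
    updated["derived_from"] = entries
    return updated


def remove_derivation(registration: dict, issuer: str,
                      derivation_resource_id: str) -> dict:
    """Return a copy of the registration without the given derived_from entry."""
    updated = dict(registration)
    updated["derived_from"] = [
        e for e in updated.get("derived_from", [])
        if not (e.get("issuer") == issuer
                and e.get("derivation_resource_id") == derivation_resource_id)
    ]
    return updated
```

Both helpers return a new dictionary and leave the input untouched, so the previous registration stays available if the update at the Authorization Server fails.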

6.2. Client Access (Serving)

Normative requirements

All endpoints on the Aggregator SHOULD be protected using an AS defined during registration. The Aggregator acts as a Resource Server in UMA terminology. When the Aggregator receives a request from a client without a valid Requesting Party Token (RPT), it MUST request a UMA ticket from its Authorization Server and return 401 Unauthorized with that ticket. RPT validity is discussed in more detail in [A4DS]. If this is a request for a derived resource, the Aggregator Service SHOULD make sure that the derivation_resource_id used to create the derived resource is still valid by inspecting the asset on the UAS.

How does the Aggregator Service inspect the asset on the UAS? Probably via the same endpoint as an RS would do it in A4DS. Can the aggregator also do a GET to this endpoint to verify the validity of the derivation_resource_id?

The Client SHOULD present the UMA ticket from the Aggregator response to the AAS to obtain an RPT as defined in [A4DS]. In addition to A4DS, if this is a request to access a derived resource, the requested resource may have a derived_from entry in its registration that includes one or more pairs of issuer (the UAS) and derivation_resource_id. If the Client provides no claims for these derived resources, the AAS MUST respond with a need_info error and include a required_claims list where each object MUST contain the issuer, derivation_resource_id, and resource_scopes members.

Maybe the resource_scopes should be sent during the access token request from the Aggregator to the UAS? Maybe we could send derived_resource_scopes to the UAS and during resource registration send this to the AAS as well? This scope should be the same as the one required to access the derived resource.

If the Client does not present a valid RPT for the requested derived resource, the Aggregator Service MUST request a UMA ticket from its Authorization Server and return 401 Unauthorized with that ticket. During ticket creation, the Aggregator Service SHOULD validate that the derivation_resource_id used to create the derived resource is still valid. When it does, it MUST inspect the asset on the UAS as defined in [A4DS], and it should present the IDP_aggregator_token in the Authorization header. If the derivation_resource_id is no longer valid, the Aggregator Service MUST treat the derived resource as invalid and MUST recreate the Aggregator Service (including its resource registrations).

If upstream access is required, the Aggregator Authorization Server MAY respond with a need_info error. In that case, it MUST include the issuer, derivation_resource_id, and resource_scopes entries in required_claims. The Client SHOULD request upstream access tokens with the required scopes (including a derivation scope) and present them as claim tokens. The Aggregator Authorization Server MUST validate provided upstream access tokens with the corresponding upstream Authorization Servers; if checks succeed, it issues an RPT for the derived resource. The Aggregator Service MUST validate or introspect the RPT before serving the derived resource. The Aggregator MAY implement SA4DS for streaming or non-HTTP interfaces.

Non-normative flow and examples

PlantUML Diagram

1. Client requests derived resource from Aggregator Service without token

The Client sends a request to the Aggregator Service for a derived resource without including a valid Requesting Party Token (RPT), or with an RPT that does not grant sufficient permission.

GET /aggregator/derived-resource-123 HTTP/1.1
Host: agg.example.org

When a Client requests access to a derived resource from the Aggregator Service, a normal UMA flow follows.

1.1 Aggregator Service requests UMA ticket from its Authorization Server

The Aggregator Service, acting as a UMA Resource Server, requests a UMA ticket for the requested derived resource from its Authorization Server.

1.2 Aggregator Service returns 401 Unauthorized with ticket

If the Client did not present a valid RPT, or if the RPT does not cover the requested permission, the Aggregator Service returns a 401 Unauthorized response containing the UMA ticket issued by its Authorization Server.

HTTP/1.1 401 Unauthorized
WWW-Authenticate: UMA realm="aggregator", as_uri="https://agg-as.example.org/", ticket="tkt-1"

2. Client presents ticket to Aggregator Authorization Server

The Client discovers the Aggregator Authorization Server (for example, via the as_uri parameter in the WWW-Authenticate header) and sends a UMA grant request to exchange the ticket for an RPT. The Client includes any claim tokens it already has (for example, its IDP_client_token) in the request.

{
  "grant_type": "urn:ietf:params:oauth:grant-type:uma-ticket",
  "ticket": "tkt-1",
  "claim_tokens": [
    {
      "claim_token": "<IDP_client_token>",
      "claim_token_format": "http://openid.net/specs/openid-connect-core-1_0.html#IDToken"
    }
  ]
}

2.1 Aggregator Authorization Server introspects Client access tokens

If access to the derived resource depends on access to upstream resources, and the Client has not yet presented suitable upstream access tokens, the Aggregator Authorization Server responds with a need_info error requesting additional claim tokens for the upstream resources. In that case, the Authorization Server adds the issuer, derivation_resource_id, and resource_scopes entries to the required_claims array to indicate which upstream Authorization Server and which resource the Client must obtain access tokens from.

{
  "error": "need_info",
  "ticket": "tkt-2",
  "required_claims": [
    {
      "claim_type": "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#derivation-access",
      "claim_token_format": "urn:ietf:params:oauth:token-type:access_token",
      "issuer": "https://as.example.org",
      "derivation_resource_id": "handle-id-1",
      "resource_scopes": [ "urn:knows:uma:scopes:derivation-read" ]
    }
  ]
}

3. Client requests upstream access tokens

The Client then requests access tokens from the UAS with the required permissions (resource_id & resource_scopes).

{
  "grant_type": "urn:ietf:params:oauth:grant-type:uma-ticket",
  "permissions": [
    {
      "resource_id": "handle-id-1",
      "resource_scopes": [ "urn:knows:uma:scopes:derivation-read" ]
    }
  ],
  "claim_tokens": [
    {
      "claim_token": "<IDP_aggregator_token>",
      "claim_token_format": "http://openid.net/specs/openid-connect-core-1_0.html#IDToken"
    }
  ]
}
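The mapping from the need_info response in step 2.1 to upstream token requests like the one above can be sketched in Python as follows. This is an illustrative sketch, not a normative algorithm: the function name `upstream_token_requests` is hypothetical, the identity token is passed in by the caller, and the sketch uses the claim_tokens member name from the token request examples in § 6.1.

```python
from typing import Dict

DERIVATION_ACCESS = (
    "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/"
    "#derivation-access"
)
UMA_TICKET_GRANT = "urn:ietf:params:oauth:grant-type:uma-ticket"
ID_TOKEN_FORMAT = "http://openid.net/specs/openid-connect-core-1_0.html#IDToken"


def upstream_token_requests(need_info: dict, id_token: str) -> Dict[str, dict]:
    """Group derivation-access claims by issuer into one request body per UAS."""
    requests: Dict[str, dict] = {}
    for claim in need_info.get("required_claims", []):
        if claim.get("claim_type") != DERIVATION_ACCESS:
            continue  # other claim types are handled elsewhere
        body = requests.setdefault(claim["issuer"], {
            "grant_type": UMA_TICKET_GRANT,
            "permissions": [],
            "claim_tokens": [{"claim_token": id_token,
                              "claim_token_format": ID_TOKEN_FORMAT}],
        })
        # The derivation_resource_id is used as the resource_id at the UAS.
        body["permissions"].append({
            "resource_id": claim["derivation_resource_id"],
            "resource_scopes": list(claim["resource_scopes"]),
        })
    return requests
```

Grouping by issuer lets a Client send a single request per upstream Authorization Server when a derived resource depends on several upstream resources protected by the same UAS.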

3.1 UAS returns upstream access tokens

The UAS validates the claims and policies and issues an access token for the derivation resource.

{
    "access_token": "<upstream_access_token>"
}

4. Client presents upstream tokens to Aggregator Authorization Server

Once the Client has obtained the required upstream access tokens, it sends another request to the Aggregator Authorization Server, presenting the upstream access tokens (here only one) as claim tokens together with the new ticket.

{
  "grant_type": "urn:ietf:params:oauth:grant-type:uma-ticket",
  "ticket": "tkt-2",
  "claim_tokens": [
    {
      "claim_token_format": "urn:ietf:params:oauth:token-type:access_token",
      "claim_token": "<upstream_access_token>"
    }
  ]
}

4.1 AAS verifies upstream tokens and returns RPT

The Aggregator Authorization Server verifies the provided access tokens with the upstream Authorization Servers to confirm that the Client is authorized to access the upstream resources. If all policy checks succeed, it issues an RPT for the requested derived resource.

{
    "access_token": "RPT-agg-1",
    "token_type": "Bearer",
    "expires_in": 3600
}

5. Client requests derived resource from Aggregator Service with RPT

Finally, the Client retries the original request to the Aggregator Service, this time including the RPT it obtained from the Aggregator Authorization Server.

GET /aggregator/derived-resource-123 HTTP/1.1
Host: agg.example.org
Authorization: Bearer RPT-agg-1

5.1 Aggregator Service introspects RPT

The Aggregator Service introspects the RPT with the Aggregator Authorization Server to validate it and determine the permissions granted to the Client.

POST /introspect HTTP/1.1
Host: agg-as.example.org
Accept: application/json
Content-Type: application/x-www-form-urlencoded

token=RPT-agg-1

5.2 AAS returns authorization result

The Aggregator Authorization Server responds to the introspection request indicating whether the RPT is active and what permissions it grants.

{
    "active": true
}
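Before serving the resource, the Aggregator Service has to decide whether the introspection result covers the request. The following Python sketch shows one possible check. It assumes the response carries a permissions array with resource_id and resource_scopes members, as in UMA Federated Authorization token introspection; the example above shows only the active member, so that layout and the function name `rpt_allows` are assumptions.

```python
def rpt_allows(introspection: dict, resource_id: str, scope: str) -> bool:
    """Return True if the RPT is active and grants `scope` on `resource_id`."""
    if not introspection.get("active", False):
        return False
    # Assumed layout: a list of {"resource_id": ..., "resource_scopes": [...]}.
    for permission in introspection.get("permissions", []):
        if (permission.get("resource_id") == resource_id
                and scope in permission.get("resource_scopes", [])):
            return True
    return False
```

An inactive token or a response without a matching permission yields False, in which case the Aggregator Service would fall back to the 401 Unauthorized response with a new ticket described in step 1.2.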

5.3 Aggregator Service returns derived resource

The Aggregator Service returns the requested derived resource to the Client.

7. Aggregator Metadata

This endpoint provides metadata about the Aggregator Instance. Deployments MAY choose arbitrary paths for instance-level endpoints. The Aggregator Metadata representation MUST include absolute URLs for those resources (e.g., the transformation_catalog and service_collection_endpoint fields) so clients can discover the deployment-specific layout.

7.1. Aggregator Description

The Aggregator Metadata resource (aggregator-url) allows clients to retrieve the current status of their aggregator. This endpoint MUST be guarded by the authentication and authorization mechanisms described in § 6 Aggregator Security Model (Authentication & Authorization). It MUST be accessible as JSON using application/json and MAY additionally expose semantically annotated RDF representations (for example JSON-LD or Turtle) using HTTP content negotiation based on [RFC9110].

The endpoint MUST return at least the following information about the aggregator, but additional fields MAY be included as needed. Each field SHOULD be expressed using the RDF properties defined in § 9 Vocabulary so the document MAY be served as JSON-LD or other RDF formats. In semantically annotated representations, the Aggregator Description MUST state that the described resource has RDF type aggr:Aggregator (§ 9.1.1 aggr:Aggregator) (e.g., via @type in JSON-LD or a aggr:Aggregator in Turtle). Clients MAY rely on this type statement when consuming semantic representations.

id (OPTIONAL):

The value is a string containing the absolute URL that identifies the Aggregator Instance (typically the instance base URL itself); in the RDF representations, this is the RDF subject (i.e., @id) of the aggr:Aggregator resource (§ 9.1.1 aggr:Aggregator).

created_at (REQUIRED):

The value is a string timestamp (recommended: xsd:dateTime lexical form, e.g., RFC 3339 [RFC3339]); in the RDF representations, this member maps to the predicate aggr:createdAt (§ 9.2.1 aggr:createdAt).

login_status (REQUIRED):

The value is a boolean that indicates whether the stored token set for the aggregator is currently valid; in the RDF representations, this member maps to the predicate aggr:loginStatus (§ 9.2.2 aggr:loginStatus).

token_expiry (OPTIONAL):

The value is a string timestamp indicating when the aggregator’s access token will expire (recommended: xsd:dateTime lexical form, e.g., RFC 3339 [RFC3339]); in the RDF representations, this member maps to the predicate aggr:tokenExpiry (§ 9.2.3 aggr:tokenExpiry).

transformation_catalog (REQUIRED):

The value is a string containing the absolute URL of the instance’s Transformations Endpoint (§ 7.2 Instance-level Transformations Catalog); in the RDF representations, this member maps to the predicate aggr:transformationsEndpoint (§ 9.2.4 aggr:transformationsEndpoint).

service_collection_endpoint (REQUIRED):

The value is a string containing the absolute URL of the instance’s Service Collection to create and fetch the Aggregator Services (§ 8.1 Service Collection Endpoint); in the RDF representations, this member maps to the predicate aggr:serviceCollectionEndpoint (§ 9.2.5 aggr:serviceCollectionEndpoint).

This document MAY be the WebID of the Aggregator Instance when the provision flow (§ 5.2.1 Creation provision Flow) was used. In that case, this document MUST be an RDF document that conforms to the WebID Profile specification [WEBID-PROFILE].

{
  "@context": {
    "id": "@id",
    "aggr": "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#",
    "oidcIssuer": "http://www.w3.org/ns/solid/terms#oidcIssuer",
    "xsd": "http://www.w3.org/2001/XMLSchema#",

    "created_at": {
      "@id": "aggr:createdAt",
      "@type": "xsd:dateTime"
    },
    "login_status": {
      "@id": "aggr:loginStatus",
      "@type": "xsd:boolean"
    },
    "token_expiry": {
      "@id": "aggr:tokenExpiry",
      "@type": "xsd:dateTime"
    },
    "transformation_catalog": {
      "@id": "aggr:transformationsEndpoint",
      "@type": "@id"
    },
    "service_collection_endpoint": {
      "@id": "aggr:serviceCollectionEndpoint",
      "@type": "@id"
    }
  },
  "id": "https://aggregator.example/aggregators/agg-7890/",
  "@type": "aggr:Aggregator",
  "created_at": "2025-12-17T17:20:00Z",
  "login_status": true,
  "token_expiry": "2025-12-17T18:20:00Z",
  "transformation_catalog": "https://aggregator.example/aggregators/agg-7890/transformations",
  "service_collection_endpoint": "https://aggregator.example/aggregators/agg-7890/config/services",
  "oidcIssuer": "https://issuer.example/"
}
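A client consuming the JSON representation can check the REQUIRED members and decide whether the stored token set needs replacing. The following Python sketch illustrates this; the function names `check_description` and `needs_token_update` are hypothetical, and triggering the token update flow when the token has expired is one possible client policy rather than a requirement of this specification.

```python
from datetime import datetime, timezone
from typing import List

REQUIRED = ("created_at", "login_status",
            "transformation_catalog", "service_collection_endpoint")


def check_description(doc: dict) -> List[str]:
    """Return a list of problems found in an Aggregator Description."""
    problems = [f"missing required member: {name}"
                for name in REQUIRED if name not in doc]
    if "login_status" in doc and not isinstance(doc["login_status"], bool):
        problems.append("login_status must be a boolean")
    for name in ("transformation_catalog", "service_collection_endpoint"):
        value = doc.get(name)
        if value is not None and not str(value).startswith(("http://", "https://")):
            problems.append(f"{name} must be an absolute URL")
    return problems


def needs_token_update(doc: dict, now: datetime) -> bool:
    """Client-side policy (assumption): update when logged out or expired."""
    if not doc.get("login_status", False):
        return True
    expiry = doc.get("token_expiry")
    if expiry is None:
        return False
    # Accept the trailing "Z" used in the example timestamps.
    return datetime.fromisoformat(expiry.replace("Z", "+00:00")) <= now
```

`now` is expected to be a timezone-aware datetime, for example `datetime.now(timezone.utc)`.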

7.2. Instance-level Transformations Catalog

This endpoint is the instance-level extension of the public Transformation Catalog defined in § 4.3 Server-level Transformation Catalog. Implementations MAY expose it at any URL, but the Aggregator Description MUST advertise the correct location via its transformation_catalog field (aggr:transformationsEndpoint). It allows aggregator implementations to expose user-specific transformations, or protected/private transformations. In contrast to the public Transformation Catalog, this endpoint MAY be user-specific and MUST require authentication. The endpoint MAY return 404 Not Found if no user-specific transformations are available. A client SHOULD combine the information from this endpoint with the public Transformation Catalog to get a complete view of the available transformations. The endpoint follows the same content negotiation rules and other requirements as the public Transformation Catalog.

8. Aggregator Service Management

The Aggregator Service Management endpoint gives an authenticated client a complete view of the Aggregator Services. It lets an authenticated client see all the running Aggregator Services and create, inspect, and remove individual Aggregator Services. Implementations MAY use different URLs as long as the Aggregator Description document (described in § 7.1 Aggregator Description) links to the concrete entry points. All endpoints described in this section MUST be protected by the authentication and authorization mechanisms defined in § 6 Aggregator Security Model (Authentication & Authorization).

8.1. Service Collection Endpoint

Do we define the exact scopes here?

This endpoint allows clients to create Aggregator Services and to fetch the collection of configured Aggregator Services. The location of this resource is advertised in the Aggregator Description (§ 7.1 Aggregator Description) via the service_collection_endpoint field. Clients MUST treat that advertised URL as authoritative and MUST NOT assume a fixed path (the examples in this section use /services purely for illustration). The Aggregator MUST register this UMA resource with the Authorization Server and advertise the read and create scopes so that clients can both inspect and add members.

HEAD

If the Service Collection exists, the server MUST respond with 200 OK, Content-Type: application/json, and an ETag header whose value changes whenever a service is added or removed. The ETag allows clients to detect collection changes without re-downloading it. This request MUST be authorized with a read scope on the Service Collection resource.
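A client can use this ETag to decide whether a full GET is needed. A minimal Python sketch, assuming the client keeps the ETag of the last collection it downloaded and receives the HEAD response headers as a dictionary (the function name `collection_changed` is hypothetical):

```python
from typing import Mapping, Optional


def collection_changed(cached_etag: Optional[str],
                       head_headers: Mapping[str, str]) -> bool:
    """Return True when the collection should be downloaded again."""
    # Header names are case-insensitive, so look the ETag up by lowercase name.
    current = next((value for name, value in head_headers.items()
                    if name.lower() == "etag"), None)
    # Without a cached or current ETag there is nothing to compare against.
    if cached_etag is None or current is None:
        return True
    return current != cached_etag
```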

GET

Returns the list of Aggregator Service resources (§ 8.2 Service Resource). The server MUST set the same ETag value as the HEAD response. This request MUST be authorized with a read scope on the Service Collection resource. The server MUST support JSON and MAY additionally expose semantically annotated RDF representations (for example JSON-LD or Turtle) using HTTP content negotiation based on [RFC9110]. The payload MUST include at least the following fields:

services (REQUIRED):

The value is a JSON array of strings where each member MUST be an absolute URL of a Service Resource that can be dereferenced by the client. In semantically annotated representations, this member maps to the predicate aggr:service (§ 9.2.6 aggr:service).

id (OPTIONAL):

The value is a string containing the absolute URL that identifies this Service Collection. This MUST be the same URL as the request target. In semantically annotated representations, this is the RDF subject (i.e., @id) of the aggr:ServiceCollection resource (§ 9.1.4 aggr:ServiceCollection).

{
  "@context": {
    "id": "@id",
    "services": {
      "@id": "aggr:service",
      "@container": "@set",
      "@type": "@id"
    },
    "aggr": "https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#"
  },
  "id": "https://aggregator.example.org/services",
  "services": [
    "https://aggregator.example.org/services/410b093c-04b3-4fac-87be-4d393f40b2e5",
    "https://aggregator.example.org/services/42"
  ]
}

POST

This request allows a client to create a new Aggregator Service. The request MUST be authorized with a create scope on the Service Collection resource. The request body MUST contain an fno:Execution that references a transformation from the public or instance-level transformation catalog, unless it references a client-provided fno:Composition handled as described below (implementations MAY decide on the exact semantic media type using HTTP content negotiation based on [RFC9110]). This execution description MUST conform to the FnO specification [FNO] and MUST include exactly one fno:executes with the IRI of the function to execute. The client SHOULD use either a blank node or an IRI as the subject of the fno:Execution; using an IRI allows the client to suggest a specific identifier for the created service. Servers MAY honor this suggested identifier; if they do, the created service URL MUST equal the suggested IRI. Servers that do not honor client-suggested identifiers MUST ignore the suggestion and generate their own identifier. If fno:executes references a fno:Composition provided by the client, the server MUST either reject the request with 400 Bad Request or accept it and publish the composition in the instance-level transformation catalog. On success, the server MUST:

  1. Persist the new service and change the collection ETag.

  2. Register a new UMA resource for the created Service Resource URL (e.g., /services/{service_id}) with the read and delete scopes so the creator—or any other party with an RPT containing those scopes—can manage the service.

  3. Return 201 Created, with the Aggregator Service resource representation (as defined in § 8.2 Service Resource) in the response body.

If the request body is invalid the server MUST respond with 400 Bad Request. Failures while instantiating the service MUST result in 500 Internal Server Error. If the server honors a suggested identifier and the suggested identifier is not available, the server MUST respond with 409 Conflict.
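The request body and the status handling described above can be sketched in Python as follows. This is an illustration under assumptions: the body is expressed as JSON-LD (the specification leaves the semantic media type to the implementation), the parameter predicate IRIs are placeholders that depend on the chosen function's FnO description, and the function names `build_execution` and `interpret_create_status` are hypothetical.

```python
from typing import Mapping, Optional

FNO = "https://w3id.org/function/ontology#"


def build_execution(function_iri: str,
                    parameters: Optional[Mapping[str, object]] = None,
                    suggested_id: Optional[str] = None) -> dict:
    """Build an fno:Execution as a JSON-LD document.

    `parameters` maps predicate IRIs from the function's FnO description
    to values (placeholders here). Without `suggested_id` the subject is
    a blank node and the server picks the service identifier.
    """
    doc = {
        "@context": {"fno": FNO},
        "@type": "fno:Execution",
        "fno:executes": {"@id": function_iri},  # exactly one fno:executes
    }
    if suggested_id is not None:
        doc["@id"] = suggested_id  # servers may ignore this suggestion
    doc.update(parameters or {})
    return doc


def interpret_create_status(status: int) -> str:
    """Map the status codes named in this section to a short outcome."""
    return {
        201: "created",
        400: "invalid request body or rejected composition",
        409: "suggested identifier not available",
        500: "service instantiation failed",
    }.get(status, "unexpected status")
```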

8.2. Service Resource

Operations on the service resource MUST require the read scope for HEAD and GET requests and the delete scope for DELETE requests. The service resource URL MUST be one of the URLs returned by the collection resource; a request for a non-existent service MUST return 404 Not Found, while malformed service URLs MUST yield 400 Bad Request. In semantically annotated representations, the service resource MUST also be typed as fno:Execution.

HEAD

If the Service Resource exists, the server MUST respond with 200 OK, a Content-Type header, and an ETag header whose value MUST change whenever the service state changes.

GET

If no Accept header was set by the user, a JSON representation of the service MUST be returned with 200 OK and with a Content-Type: application/json. The user MAY request semantic representations using HTTP content negotiation based on [RFC9110]. The representation MUST include at least the following fields:

id (REQUIRED):

The value is a string containing the absolute URL of this Service Resource. In semantically annotated representations, this is the RDF subject (i.e., @id) of the aggr:Service class (§ 9.1.2 aggr:Service).

type (REQUIRED):

The value is an array of strings indicating the RDF types of this service. This array MUST include aggr:Service and fno:Execution. In semantically annotated representations, this represents the type (@type in JSON-LD) of the service.

status (REQUIRED):

The value is a string indicating the current status of the service (e.g., "starting", "running", "stopped", or "errored"). In semantically annotated representations, this member maps to the predicate aggr:status (§ 9.2.8 aggr:status).

status_detail (OPTIONAL):

The value is a string providing a human-readable detail about the current status (for example, a stop reason or error message). When status is "errored", the server SHOULD include this field. In semantically annotated representations, this member maps to the predicate aggr:statusDetail (§ 9.2.9 aggr:statusDetail).

created_at (REQUIRED):

The value is a string containing the timestamp at which the service was created. The xsd:dateTime lexical form is RECOMMENDED (e.g., an RFC 3339 timestamp [RFC3339]). In semantically annotated representations, this member maps to the predicate aggr:createdAt (§ 9.2.1 aggr:createdAt).

executes (REQUIRED):

The value is a string containing the IRI of the FnO function being executed. In semantically annotated representations, this member maps to the predicate fno:executes.

The representation MUST also include any required input and output parameters for the executed FnO function, using the FnO parameters and outputs predicates defined in the function’s FnO description. The representation MAY include additional fields (e.g., fno:name, fno:solves, etc.) as needed. The specification does not mandate a specific serialization for these additional fields; clients MUST be prepared to handle arbitrary RDF properties in semantically annotated representations.
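
To make the field list concrete, the following Python sketch builds a JSON-style representation and checks it for the required members. It is an illustration under assumptions: the helper name is hypothetical, the types are written as the compact names used in this section, and the function-specific FnO parameters (such as the sources and result of the example function) are omitted because their JSON serialization is not fixed by this specification.

```python
from datetime import datetime, timezone

REQUIRED_FIELDS = ("id", "type", "status", "created_at", "executes")
REQUIRED_TYPES = {"aggr:Service", "fno:Execution"}


def representation_problems(rep: dict) -> list:
    """List the required fields and types that are missing from a representation."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in rep]
    missing_types = REQUIRED_TYPES - set(rep.get("type", []))
    problems += [f"missing type: {t}" for t in sorted(missing_types)]
    return problems


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
example = {
    "id": "https://aggregator.example.org/services/410b093c-04b3-4fac-87be-4d393f40b2e5",
    "type": ["aggr:Service", "fno:Execution"],
    "status": "running",
    # RFC 3339 / xsd:dateTime lexical form with a "Z" suffix for UTC.
    "created_at": created.isoformat().replace("+00:00", "Z"),
    "executes": "http://aggregator.example.org/transformations#AggregateSources",
}

print(representation_problems(example))
```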

DELETE

Stops and removes the service. The Aggregator MUST stop the running transformation, delete the persisted service entry, change the collection ETag, unregister the service’s UMA resource, and respond with 204 No Content. Clients that held the service identifier MUST treat it as invalid after receiving the success response.
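
The sequence of side effects required by DELETE can be sketched with a toy in-memory store. Everything in the following Python example is hypothetical (the class, its attributes, and the use of random values as collection ETags); a real Aggregator would stop an actual transformation and call the Authorization Server to unregister the UMA resource.

```python
import uuid


class InMemoryAggregator:
    """Toy stand-in for an Aggregator; all names here are hypothetical."""

    def __init__(self):
        self.services = {}        # service_id -> service state
        self.uma_resources = {}   # service_id -> UMA resource id
        self.collection_etag = uuid.uuid4().hex

    def add(self, service_id: str, uma_resource_id: str) -> None:
        self.services[service_id] = {"running": True}
        self.uma_resources[service_id] = uma_resource_id
        self.collection_etag = uuid.uuid4().hex

    def delete(self, service_id: str) -> int:
        service = self.services.get(service_id)
        if service is None:
            return 404  # non-existent service
        service["running"] = False                 # stop the running transformation
        del self.services[service_id]              # delete the persisted service entry
        self.collection_etag = uuid.uuid4().hex    # change the collection ETag
        self.uma_resources.pop(service_id, None)   # unregister the UMA resource (stubbed)
        return 204


aggregator = InMemoryAggregator()
aggregator.add("410b093c", "uma-resource-1")
print(aggregator.delete("410b093c"))
```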

8.3. Service Management Flows (Non-normative)

This section is non-normative. It gives examples of how a client can create, find, and use services on the Aggregator, and is only meant to illustrate the usage of the endpoints defined in this specification. It assumes the client has already created an Aggregator using the Aggregator Registration API (§ 5.1 Aggregator Registration Endpoint) and is able to authenticate using the mechanisms defined in § 6 Aggregator Security Model (Authentication & Authorization).

8.3.1. Creating a Service

To create a new Aggregator Service, a client starts by sending a POST request to the Service Collection endpoint (i.e., the URL advertised via service_collection_endpoint in the Aggregator Description; this section uses /services as an example). The body of the request describes an execution of an FnO function [FNO].

POST /services HTTP/1.1
Host: aggregator.example.org
Content-Type: text/turtle

@prefix trans: <http://aggregator.example.org/transformations#> .
@prefix fno: <https://w3id.org/function/ontology#> .

_:execution a fno:Execution ;
    fno:executes trans:AggregateSources ;
    trans:sources ( <http://example.org/source/1> <http://example.org/source/2> ) .

Because this first request does not carry an RPT, the Aggregator fetches a ticket from the Authorization Server, using the resource_id 1a2b-creation-endpoint it obtained during asset creation (see [A4DS]).

POST /ticket HTTP/1.1
Host: as.example.org
Content-Type: application/json

{
    "resource_id": "1a2b-creation-endpoint",
    "resource_scopes": ["https://example.org/modes/create"]
}

This returns a ticket that represents the permissions needed for this request to the RS (the Aggregator in this case).

HTTP/1.1 200 OK
Content-Type: application/json

{
    "ticket": "service-creation-ticket-xyz"
}

This ticket is then returned to the client in a 401 Unauthorized response.

HTTP/1.1 401 Unauthorized
WWW-Authenticate: UMA as_uri="https://as.example.org", ticket="service-creation-ticket-xyz"
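
A client could extract the Authorization Server URL and the ticket from a challenge like the one above, and prepare the form body of the subsequent token request, roughly as follows. This Python sketch rests on assumptions: the helper names are hypothetical, the parser only handles simple quoted parameters without escaped quotes, and the grant type shown is the one defined by UMA 2.0, which § 6 may profile further.

```python
import re
from urllib.parse import urlencode


def parse_uma_challenge(header_value: str) -> dict:
    """Extract the quoted parameters of a WWW-Authenticate: UMA challenge."""
    scheme, _, params = header_value.partition(" ")
    if scheme.lower() != "uma":
        raise ValueError("not a UMA challenge")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def rpt_request_form(ticket: str) -> str:
    """Build the form-encoded body of a UMA 2.0 token request for this ticket."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:uma-ticket",
        "ticket": ticket,
    })


challenge = 'UMA as_uri="https://as.example.org", ticket="service-creation-ticket-xyz"'
params = parse_uma_challenge(challenge)
form = rpt_request_form(params["ticket"])
print(params["as_uri"], form)
```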

The client then requests an RPT from the AS using the ticket, as defined in § 6 Aggregator Security Model (Authentication & Authorization). The original request can then be retried, this time including the RPT in the Authorization header.

POST /services HTTP/1.1
Host: aggregator.example.org
Authorization: Bearer ey...
Content-Type: text/turtle

@prefix trans: <http://aggregator.example.org/transformations#> .
@prefix fno: <https://w3id.org/function/ontology#> .

_:execution a fno:Execution ;
    fno:executes trans:AggregateSources ;
    trans:sources ( <http://example.org/source/1> <http://example.org/source/2> ) .

If the request is valid, the Aggregator will create a new service, register the appropriate UMA resource, and return a 201 Created response with the service representation in the body.

HTTP/1.1 201 Created
Content-Type: text/turtle

@prefix aggr: <https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#> .
@prefix fno: <https://w3id.org/function/ontology#> .
@prefix trans: <http://aggregator.example.org/transformations#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://aggregator.example.org/services/410b093c-04b3-4fac-87be-4d393f40b2e5>
    a aggr:Service ;
    a fno:Execution ;
    aggr:status "running" ;
    aggr:statusDetail "" ;
    aggr:createdAt "2024-01-01T12:00:00Z"^^xsd:dateTime ;
    fno:executes trans:AggregateSources ;
    trans:sources ( <http://example.org/source/1> <http://example.org/source/2> ) ;
    trans:result ( <https://aggregator.example.org/410b093c-04b3-4fac-87be-4d393f40b2e5/> ) .

8.3.2. Discovering Services

To discover the services currently registered on the Aggregator, a client can do a GET request to the Service Collection endpoint (this section uses /services as an example). After authenticating using the mechanisms defined in § 6 Aggregator Security Model (Authentication & Authorization), the Aggregator will return a list with the registered service from § 8.3.1 Creating a Service.

HTTP/1.1 200 OK
Content-Type: text/turtle

@prefix aggr: <https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<> a aggr:ServiceCollection ;
    aggr:service <https://aggregator.example.org/services/410b093c-04b3-4fac-87be-4d393f40b2e5> .

Dereferencing this URL will return the full service representation.

HTTP/1.1 200 OK
Content-Type: text/turtle

@prefix aggr: <https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#> .
@prefix fno: <https://w3id.org/function/ontology#> .
@prefix trans: <http://aggregator.example.org/transformations#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<https://aggregator.example.org/services/410b093c-04b3-4fac-87be-4d393f40b2e5>
    a aggr:Service ;
    a fno:Execution ;
    aggr:status "running" ;
    aggr:statusDetail "" ;
    aggr:createdAt "2024-01-01T12:00:00Z"^^xsd:dateTime ;
    fno:executes trans:AggregateSources ;
    trans:sources ( <http://example.org/source/1> <http://example.org/source/2> ) ;
    trans:result ( <https://aggregator.example.org/410b093c-04b3-4fac-87be-4d393f40b2e5/> ) .

8.3.3. Using a Simple Service

After discovering the service, the client can consult the transformation catalog to learn what trans:AggregateSources does and how to use its output. In this example, the function declares an output parameter trans:result whose value is the URL (https://aggregator.example.org/410b093c-04b3-4fac-87be-4d393f40b2e5/) at which the aggregated data can be accessed. The client sends a GET request to that URL and authenticates using the mechanisms defined in § 6 Aggregator Security Model (Authentication & Authorization). This time, the client might need to gather multiple access claims from the Authorization Servers of http://example.org/source/1 and http://example.org/source/2. After receiving an access token from the Aggregator Authorization Server, the client repeats the GET request to https://aggregator.example.org/410b093c-04b3-4fac-87be-4d393f40b2e5/ to retrieve the result of the aggregation.

9. Vocabulary

The Aggregator vocabulary is defined in the aggr: namespace (https://spec.knows.idlab.ugent.be/aggregator-protocol/latest/#). The following classes and properties are used throughout this specification.

9.1. Classes

9.1.1. aggr:Aggregator

Describes an Aggregator Instance (its base URL is the Aggregator Description resource).

type: rdfs:Class
subClassOf: schema:Service
subClassOf: foaf:Agent

9.1.2. aggr:Service

Represents a configured Aggregator pipeline that can be created, inspected, and removed via the Service Management API (e.g., /config/services/{service_id}).

type: rdfs:Class
subClassOf: prov:Activity, fno:Execution

9.1.3. aggr:AggregatorServer

Describes the Aggregator Server Description document that advertises discovery metadata.

type: rdfs:Class
subClassOf: schema:Service

9.1.4. aggr:ServiceCollection

Describes the service collection resource (e.g., /config/services).

type: rdfs:Class
subClassOf: schema:Collection, hydra:Collection

9.1.5. aggr:TransformationCollection

Describes a transformation catalog resource that lists the transformations supported by an Aggregator Server (and optionally instance-specific transformations).

type: rdfs:Class
subClassOf: schema:Collection, hydra:Collection

9.1.6. aggr:RegistrationFlow

Describes a registration flow supported by an Aggregator Server. This specification models each flow as an RDF class so that aggr:supportedRegistrationType can be semantically annotated by referencing the relevant flow class (e.g., aggr:AuthorizationCodeFlow).

type: rdfs:Class
subClassOf: rdfs:Class

9.1.7. aggr:NoAuthFlow

Registration flow where the Aggregator Instance does not authenticate and only accesses public resources.

type: rdfs:Class
subClassOf: aggr:RegistrationFlow

9.1.8. aggr:ProvisionFlow

Registration flow where the Aggregator Server provisions an Aggregator Instance with its own identity.

type: rdfs:Class
subClassOf: aggr:RegistrationFlow

9.1.9. aggr:AuthorizationCodeFlow

Registration flow based on OAuth 2.0 Authorization Code [RFC6749] (via OpenID Connect), where the Aggregator acts on behalf of an end-user with a token scoped to the Aggregator.

type: rdfs:Class
subClassOf: aggr:RegistrationFlow

9.1.10. aggr:DeviceCodeFlow

Registration flow based on OAuth 2.0 Device Authorization Grant [RFC8628].

type: rdfs:Class
subClassOf: aggr:RegistrationFlow

9.2. Properties

9.2.1. aggr:createdAt

Timestamp when an aggr:Aggregator or aggr:Service was created.

type: rdf:Property
domain: aggr:Aggregator, aggr:Service
range: xsd:dateTime

9.2.2. aggr:loginStatus

Indicates whether the stored token set for an aggr:Aggregator is currently valid.

type: rdf:Property
domain: aggr:Aggregator
range: xsd:boolean

9.2.3. aggr:tokenExpiry

Timestamp when the current access token for an aggr:Aggregator will expire.

type: rdf:Property
domain: aggr:Aggregator
range: xsd:dateTime

9.2.4. aggr:transformationsEndpoint

Links an aggr:Aggregator to its (possibly private) transformations endpoint.

type: rdf:Property
domain: aggr:Aggregator
range: xsd:anyURI

9.2.5. aggr:serviceCollectionEndpoint

Links an aggr:Aggregator to its service collection endpoint.

type: rdf:Property
domain: aggr:Aggregator
range: xsd:anyURI

9.2.6. aggr:service

Links an aggr:ServiceCollection to the aggr:Service instances it advertises.

type: rdf:Property
domain: aggr:ServiceCollection
range: aggr:Service

9.2.7. aggr:hasTransformation

Links an aggr:TransformationCollection to the transformations it advertises.

type: rdf:Property
domain: aggr:TransformationCollection
range: fno:Function

9.2.8. aggr:status

Provides the lifecycle phase of an aggr:Service (values such as starting, running, stopped, or errored).

type: rdf:Property
domain: aggr:Service
range: xsd:string

9.2.9. aggr:statusDetail

Provides a human-readable explanation of the current aggr:Service status (for example, a stop reason or error message).

type: rdf:Property
domain: aggr:Service
range: xsd:string

9.2.10. aggr:registrationEndpoint

Links an aggr:AggregatorServer to its registration endpoint.

type: rdf:Property
domain: aggr:AggregatorServer
range: xsd:anyURI

9.2.11. aggr:supportedRegistrationType

Lists the registration flows advertised by an aggr:AggregatorServer.

type: rdf:Property
domain: aggr:AggregatorServer
range: rdfs:Class (expected to be an aggr:RegistrationFlow class)

9.2.12. aggr:registrationRequestFormatSupported

Lists the supported request formats for an aggr:AggregatorServer registration endpoint.

type: rdf:Property
domain: aggr:AggregatorServer
range: xsd:string

9.2.13. aggr:specVersion

States which version of this specification an aggr:AggregatorServer implements.

type: rdf:Property
domain: aggr:AggregatorServer
range: xsd:string

9.2.14. aggr:clientIdentifier

Links an aggr:AggregatorServer to its Client ID Document.

type: rdf:Property
domain: aggr:AggregatorServer
range: xsd:anyURI

9.2.15. aggr:transformationCatalog

References the Aggregator Server’s public transformation catalog.

type: rdf:Property
domain: aggr:AggregatorServer
range: xsd:anyURI

9.3. Claim Types

9.3.1. aggr:transformation-description

Identifier for the UMA claim_type used to request or provide a transformation description.

type: rdfs:Resource

Claim tokens of this type MUST be RDF descriptions of the intended transformation (for example an fno:Execution or a reference to a transformation catalog entry). Acceptable claim_token_format values are URIs identifying RDF serializations (such as http://www.w3.org/ns/formats/Turtle and http://www.w3.org/ns/formats/JSON-LD).

9.3.2. aggr:derivation-access

Identifier for the UMA claim_type used to request or provide upstream access tokens for derived resources.

type: rdfs:Resource

Claim tokens of this type MUST be access tokens issued by the upstream Authorization Server for the derivation_resource_id referenced in the claim request.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in [RFC2119].

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes.

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[A4DS]
Authorization for Data Spaces (A4DS). URL: https://spec.knows.idlab.ugent.be/A4DS/L1/latest/
[Client-ID]
Aaron Parecki; Emelia Smith. OAuth Client ID Metadata Document. 10 January 2025. Internet-Draft. URL: https://datatracker.ietf.org/doc/draft-parecki-oauth-client-id-metadata-document/
[FNO]
Function Ontology (FnO). URL: https://w3id.org/function/spec/
[OIDC-Core]
OpenID Connect Core 1.0. URL: https://openid.net/specs/openid-connect-core-1_0.html
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SA4DS]
Service Authorization for Data Spaces (SA4DS). URL: https://solid.github.io/service-authorization-for-data-spaces/
[SEMVER]
Semantic Versioning 2.0.0. URL: https://semver.org/
[UMA]
User-Managed Access (UMA) 2.0. URL: https://docs.kantarainitiative.org/uma/rec-uma-core.html
[WEBID-PROFILE]
WebID Profile. URL: https://solid.github.io/webid-profile/

Informative References

[RFC3339]
G. Klyne; C. Newman. Date and Time on the Internet: Timestamps. July 2002. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc3339
[RFC6749]
D. Hardt, Ed. The OAuth 2.0 Authorization Framework. October 2012. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc6749
[RFC6750]
M. Jones; D. Hardt. The OAuth 2.0 Authorization Framework: Bearer Token Usage. October 2012. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc6750
[RFC7636]
N. Sakimura, Ed.; J. Bradley; N. Agarwal. Proof Key for Code Exchange by OAuth Public Clients. September 2015. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc7636
[RFC8628]
W. Denniss; et al. OAuth 2.0 Device Authorization Grant. August 2019. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc8628
[RFC9110]
R. Fielding, Ed.; M. Nottingham, Ed.; J. Reschke, Ed. HTTP Semantics. June 2022. Internet Standard. URL: https://httpwg.org/specs/rfc9110.html

Issues Index

Should we allow a name field for the aggregator during creation?
How should the Aggregator Service delete the derivation_resource_id at the UAS?
How does the Aggregator Service inspect the asset on the UAS? Probably via the same endpoint an RS would use in A4DS. Can the aggregator also send a GET request to this endpoint to verify the validity of the derivation_resource_id?
Maybe the resource_scopes should be sent during the access token request from the Aggregator to the UAS? Maybe we could send derived_resource_scopes to the UAS and, during resource registration, send them to the AAS as well? This scope should be the same as the one required to access the derived resource.
Do we define the exact scopes here?