LDT Interoperability Blueprint

Living Document,

This version:
https://spec.knows.idlab.ugent.be/ldt-interoperability-blueprint/all/b538bb5acbaa362547db0de68f0701470e4f7e8c
Previous Versions:
Issue Tracking:
GitHub
Editors:
(Ghent University - imec)
Sille Sepp (TalTech)
Laura Riou (Cerema)
Lucas Vieira Magalhães (LIST)
Thimo Thoeye (OASC)
Not Ready For Implementation

This spec is not yet ready for implementation. It exists in this repository to record the ideas and promote discussion.

Before attempting to implement this spec, please contact the editors.


Abstract

To-do

1. Introduction

The Interoperability Blueprint provides business, organisational, and technical guidelines for building our own Local Digital Twin (LDT) and interconnecting with other LDTs. An LDT is a digital representation of physical assets, systems, or processes in a defined local context (for example, city, district, building, industry, port, and airport). LDTs are based on structured data models and contextualised data. They leverage either historical data, near real-time data, or real-time data, and they enable visualisation, analysis, simulation, and reasoning services that support decision-making. First, we will elaborate on why LDTs can be useful, followed by a high-level overview of an LDT. Next, we will explain the importance of being able to add new components to an LDT (Section 1.3) and discover existing components that can be added to an LDT (Section 1.4). Finally, we will discuss how this blueprint aligns with Minimum Interoperability Mechanisms (MIMs) Plus (Section 1.5) and how data spaces and LDTs are related (Section 1.6). The remainder of this deliverable is structured as follows:

Although this blueprint mainly guides us through the process of building a new LDT with the focus on interoperability, we can also use its recommendations for updating an existing LDT and interlinking LDTs. The scope of the blueprint isn’t to offer guidelines for every use case in which an LDT is being built or has been built, as every LDT operates in a very specific context.

1.1. Why use an LDT

An LDT is a virtual replica of a territory based on structured data models, real-time data feeds, and potentially 3D representations, which can integrate simulation models (flows, usage, impacts, and so on). It aims to dynamically represent the territory at various spatial and temporal scales to analyse, understand, anticipate, and simulate the effects of public policies, environmental hazards, climate change, development projects or disruptions. The digital twin supports strategic decision-making, consultation, foresight, and scenario design. It can even include automated decision-making and execution. It can incorporate historical datasets, time-delayed data, Building Information Modelling (BIM), Geographic Information Systems (GIS), and/or specific models (mobility, climate, energy, and so on). LDTs can offer the following capabilities to users:

Note that we do not derive this definition of an LDT from a formal international standard. Instead, it is a project-specific conceptualisation influenced by existing frameworks such as Digital Twins, the EU LDT Toolbox, MIMs Plus 8 on Local Digital Twins, Smart City models, and Data Spaces. While relevant standards (for example, ISO 23247, ISO/IEC 30182, and ISO 30173) provide partial guidance, there is currently no universally accepted standard definition for LDTs in the context of urban or territorial systems.

1.2. High-level overview of an LDT

An LDT has three levels, Base, Core, and Surface (see Figure 1), representing a logical data pipeline from raw data ingestion to end-user interaction, as proposed in Deliverable 3.8:

TO-DO
High-level overview of an LDT. An LDT contains 7 high-level components: Data sources, Data Acquisition, Data management, Analysis, Visualisations, Decision-making, and Connectivity.

We describe each component below:

A Service is not a standalone component itself, but consists of a logical composition of the analysis, visualisations, and decision-making tools. A single service may include multiple visualisations, while individual components (for example, visualisations or analysis modules) can be reused across multiple services. A Service represents a higher-level abstraction that delivers value to end-users by orchestrating multiple underlying components.
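The relationship between services and components described above can be sketched as a small data structure. This is only an illustration: the class names, the `kind` labels, and the example components are assumptions made for this sketch, not part of the blueprint.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SurfaceComponent:
    """A reusable Surface-level element (hypothetical model for illustration)."""
    name: str
    kind: str  # assumed labels: "analysis", "visualisation", "decision-making"


@dataclass
class Service:
    """A logical composition of components; not a standalone component itself."""
    name: str
    components: list[SurfaceComponent] = field(default_factory=list)

    def by_kind(self, kind: str) -> list[SurfaceComponent]:
        """Return the components of one kind that this service orchestrates."""
        return [c for c in self.components if c.kind == kind]


def services_using(component: SurfaceComponent, services: list[Service]) -> list[str]:
    """List the names of all services that reuse the given component."""
    return [s.name for s in services if component in s.components]


# Hypothetical example components and services.
heat_map = SurfaceComponent("heat-map", "visualisation")
trend_chart = SurfaceComponent("trend-chart", "visualisation")
forecast = SurfaceComponent("air-quality-forecast", "analysis")

air_quality = Service("air-quality-dashboard", [heat_map, trend_chart, forecast])
heat_planning = Service("urban-heat-planning", [heat_map])
```

Here one service holds two visualisations, and the `heat_map` component is reused by both services, which mirrors the two properties stated in the paragraph.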

1.3. Adding a new Surface-level component to an existing LDT, reusing Core and Base-level components

Once we have built an LDT, it’s important that we can still add new Surface-level components (SLCs) to ensure that new needs, coming from new and updated use cases, can be fulfilled (see Figure 2). While adding new SLCs, we want to

TO-DO
We design an LDT in such a way that we can easily add new visualisations, new analysis, and new decision-making tools.

Each SLC might have specific requirements regarding different aspects, including, but not limited to

Our LDT might not meet these requirements out of the box, but the LDT’s design should limit the necessary changes to business, organisational, and technical aspects to fulfil these requirements. Some examples of new SLCs are

1.4. Discovering which Surface-level components work on which existing LDTs and exchanging components through interconnected LDTs

While § 1.3 Adding a new Surface-level component to an existing LDT, reusing Core and Base-level components focuses on integrating a selected Surface-level component (SLC) into an existing LDT, interoperability should also support an earlier step: identifying which SLCs are suitable candidates in the first place. As increasing numbers of SLCs are developed by cities, service providers, and research initiatives, ideally, LDTs should be able to discover which of these components are compatible with their existing Base and Core-level components, and what adaptations would be required where full compatibility is not available (see Figure 3).

TO-DO
We want to be able to automatically discover which new SLCs can work on top of an existing LDT.

This is important because testing and deploying a new SLC can be costly in terms of time, effort, and governance. Before starting integration work, an LDT should therefore be able to assess whether a candidate SLC is likely to work with its available data, interfaces, policies, and organisational conditions. Such an assessment may consider, for example, the data models used, supported data exchange types, provenance and observability requirements, access and usage policies, and deployment responsibilities. Rather than repeating the full integration process for every candidate, the LDT should be able to compare its existing capabilities with the requirements published by the SLC and determine whether the component is

This discoverability is especially valuable in interconnected LDT ecosystems, where components developed in one context may be reused in another. For example, air pollution or urban heat island visualisations developed for one city could be reused by other cities if those LDTs can determine that the required data and interfaces are available. Similarly, AI models for traffic congestion prediction and impact analysis may be useful beyond the city in which they were originally developed, provided that other LDTs can assess whether their data, policies, and deployment conditions are sufficient to support them.

At the same time, discoverability is not only beneficial for city-to-city reuse. It can also support the emergence of an ecosystem in which external service providers offer new capabilities to existing LDTs. If an LDT describes the requirements, interfaces, and policies of SLCs in a consistent and machine-readable way, providers can develop reusable components that are easier to adopt across multiple LDTs. Additionally, LDT operators can more easily identify which offerings match their needs and constraints. In this way, interoperability helps create the conditions for a broader market of reusable LDT services and components, reducing duplication of effort and accelerating innovation.

By enabling this kind of discovery and compatibility assessment, the LDT reduces the cost and uncertainty of experimentation, supports both public-sector reuse and market-based innovation, and strengthens interoperability across interconnected LDTs.
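To make the idea of a consistent, machine-readable SLC description more concrete, the following minimal sketch serialises such a description to JSON and reads it back. The blueprint does not prescribe a format here: the class `SLCDescriptor`, its field names (loosely derived from the assessment aspects listed in this section), and all example values are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SLCDescriptor:
    """Hypothetical description an SLC provider could publish for discovery."""
    name: str
    data_models: list[str] = field(default_factory=list)
    exchange_types: list[str] = field(default_factory=list)
    provenance_requirements: list[str] = field(default_factory=list)
    usage_policies: list[str] = field(default_factory=list)
    deployment_responsibilities: dict[str, str] = field(default_factory=dict)


def to_json(descriptor: SLCDescriptor) -> str:
    """Serialise the descriptor so other LDTs can read it."""
    return json.dumps(asdict(descriptor), indent=2, sort_keys=True)


def from_json(text: str) -> SLCDescriptor:
    """Rebuild a descriptor from its published JSON form."""
    return SLCDescriptor(**json.loads(text))


# Hypothetical example values.
heat_island_map = SLCDescriptor(
    name="urban-heat-island-map",
    data_models=["air-temperature-observation"],
    exchange_types=["pull"],
    provenance_requirements=["source-sensor-id"],
    usage_policies=["non-commercial-use"],
    deployment_responsibilities={"hosting": "LDT operator"},
)
published = to_json(heat_island_map)
restored = from_json(published)
```

A real deployment would more likely rely on an agreed vocabulary and catalogue format than on an ad hoc JSON layout; the sketch only shows that the same description can be published by a provider and consumed by any LDT.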

1.5. Alignment with MIMs Plus

MIMs Plus emerged as part of the Living-in.EU movement to enable a minimal but sufficient level of interoperability of data, systems, and services, particularly in the context of smart city solutions. By facilitating this minimum level, MIMs Plus contributes to the development of a coherent global market and collaboration focused on solutions, services, and data. The MIMs are not closed standards but evolving recommendations, co-developed with local and regional authorities and interested stakeholders, and aligned with European frameworks.

Each of the 9 MIMs identifies an area in which interoperable mechanisms need to be put in place. At the time of publication of this deliverable, the MIMs Plus framework is at version 8 and distinguishes between two main categories: foundational MIMs, which provide essential functionality for data interoperability within a city’s data ecosystem; and application-specific MIMs, which enhance the functionality of the data ecosystem by introducing interoperability in specific application areas. For this deliverable, it is important to highlight the application-specific MIM8 (Local Digital Twin). MIM8 describes the following 6 layers:

Layer 1 (Data acquisition) aligns with the Data Acquisition component of our high-level overview of an LDT; layer 2 (Connectivity) with the Connectivity and Data Management components; layer 3 (Data pre-processing) with the Data Acquisition and Data Management components; layer 4 (Analysis and simulation) with the Analysis component, which includes simulation; layer 5 (Communication of results) with the Visualisations component; and layer 6 (Decision-making) with the Visualisations (support), Analysis (automation), and Decision-Making (prescription, support) components (see Figure 4).

TO-DO
The high-level components align with the MIMs.

In addition to defining the typical layers of an LDT, MIM8 provides more details on an LDT’s capabilities: LDTs are able to share data with other data ecosystems, LDTs are able to share services, and LDTs are able to share SLCs. Moreover, other MIMs, notably the foundational MIMs, provide interoperability mechanisms dealing with Accessing Data (MIM0), which aligns with the Data Acquisition component; Interlinking Data (MIM1) and Representing Data (MIM2), which align with the Data Acquisition and Data Management components; and Sharing Data (MIM3) and Securing Data (MIM6), which align with the Data Management and Connectivity components.
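The alignment described in this section can also be written down as a lookup table, which makes it easy to answer the reverse question: which MIM8 layers or foundational MIMs touch a given high-level component? The sketch below only restates the alignment given in the text; the dictionary layout and the function name `aligned_with` are choices made for this illustration.

```python
# MIM8 layers and the high-level LDT components they align with (from the text).
MIM8_LAYERS = {
    "Layer 1 (Data acquisition)": ["Data Acquisition"],
    "Layer 2 (Connectivity)": ["Connectivity", "Data Management"],
    "Layer 3 (Data pre-processing)": ["Data Acquisition", "Data Management"],
    "Layer 4 (Analysis and simulation)": ["Analysis"],
    "Layer 5 (Communication of results)": ["Visualisations"],
    "Layer 6 (Decision-making)": ["Visualisations", "Analysis", "Decision-Making"],
}

# Foundational MIMs mentioned in the text and the components they align with.
FOUNDATIONAL_MIMS = {
    "MIM0 (Accessing Data)": ["Data Acquisition"],
    "MIM1 (Interlinking Data)": ["Data Acquisition", "Data Management"],
    "MIM2 (Representing Data)": ["Data Acquisition", "Data Management"],
    "MIM3 (Sharing Data)": ["Data Management", "Connectivity"],
    "MIM6 (Securing Data)": ["Data Management", "Connectivity"],
}


def aligned_with(component: str, alignment: dict[str, list[str]]) -> list[str]:
    """Return every entry of the alignment table that covers the component."""
    return [entry for entry, components in alignment.items() if component in components]
```

For example, `aligned_with("Analysis", MIM8_LAYERS)` lists layers 4 and 6, while the Data Sources component has no aligned MIM8 layer in this table.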

1.6. Relationship between data spaces and LDTs

An LDT can participate in a data space and may assume multiple roles, including both data consumer and data provider. As a data consumer, an LDT can access data sources through a trusted and governed ecosystem (see Figure 5). This allows the LDT to ingest primary data, such as IoT data streams, from multiple organisations and territories, thereby extending its reach beyond the systems directly operated within its own local context. In addition, an LDT can obtain more elaborate data products through a data space, including outputs generated by external systems and services. These may enrich the LDT’s own simulations, predictions, and analyses. In this sense, a data space is not only a mechanism for exchanging raw data, but also an enabler for building more complex systems by combining the outputs of multiple services in a governed way.

TO-DO
LDTs can use data from a data space.

As a data provider, an LDT can also expose part of its own outputs to other participants in the data space. Although the outputs of an LDT are intrinsically linked to the territory with which it is associated, this does not mean that they are useful only within that single LDT. Several LDTs may cover the same territory while focusing on different dimensions of it, and these perspectives are not mutually exclusive. Therefore, the outputs of one LDT may become valuable inputs for another. In particular, LDTs can share analysis results, predictions, simulations, and derived datasets through a data space so that other systems can reuse them under agreed governance and usage conditions. For example, one LDT may focus on traffic and provide traffic predictions, simulations, and analysis tools for a given territory. Another LDT covering the same territory may focus on environmental quality and use those traffic results as input to estimate emissions or air-quality impacts. This LDT could then combine these results with additional variables, such as industrial emissions, weather forecasts, wind behaviour, street topology, and the presence of green areas, to obtain a more comprehensive understanding of pollutant propagation and absorption. In this way, a data space can support the composition of complementary LDT capabilities across systems. From a practical perspective, current data space technologies can provide several capabilities that are relevant for LDTs:

Figure 6 illustrates how LDTs can reuse the Surface-level components of another LDT. In doing so, these components may use the same data or different data obtained through the data space, depending on the needs and context of each LDT.

TO-DO
Surface-level components of one LDT ideally work with other LDTs as well.

Overall, the relationship between LDTs and data spaces is bidirectional. Data spaces can strengthen LDTs by providing governed access to external data and services, while LDTs can contribute back valuable derived data products and capabilities. This makes data spaces a key interoperability mechanism for connecting LDTs with broader digital ecosystems.

2. LDT building blocks based on DSSC & DS4SSCC-DEP building blocks

Building on the fact that LDTs can act as data consumers and data providers in data space ecosystems (§ 1.6 Relationship between data spaces and LDTs), this blueprint proposes to reuse the DSSC & DS4SSCC-DEP Blueprints and their building blocks to a certain extent (see Figure 7). The DSSC Blueprint defines a set of business, organisational, and technical building blocks, which need to be implemented in the context of data spaces. The application of each of these building blocks in the context of smart cities and communities was refined in the DS4SSCC-DEP project. Here, relevant implementations for each building block were identified. In LDT4SSC, these refined and detailed building blocks are then applied to the context of interconnected local digital twins. Again, practical implementations are identified, and new, LDT-specific building blocks are added.

TO-DO
The DSSC Blueprint was used by the DS4SSCC-DEP Blueprint. The LDT4SSC Blueprint uses the DS4SSCC-DEP Blueprint and, consequently, the DSSC Blueprint.

In § 2.1 Business and organisational building blocks and § 2.2 Technical building blocks, we list the business and organisational building blocks and the technical building blocks, respectively. The blocks' summaries are adapted to fit LDTs instead of data spaces. For their full descriptions, we refer to the DSSC Blueprint v3 (released in March 2026) and the DS4SSCC-DEP Blueprint (published in September 2025). In the case of a new business and organisational building block, we add a more detailed description referring to available academic and grey literature. In the case of a new technical building block, we refer to § 4.2 Technical architecture and § 6.2 Define and implement. If the LDT is part of a data space, we need to take into account the building blocks at the level of both the LDT and the data space.

2.1. Business and organisational building blocks

The business and organisational aspects are an essential part of LDT interoperability: they define the purpose, organisational context (including involved stakeholders and their roles), incentives, rules, and processes for designing, deploying, and operating LDTs. This section explains which business and organisational building blocks are important for either building LDTs or interconnecting LDTs. Figure 8 provides a visual overview of these building blocks in four pillars: business, governance, legal, and design.

TO-DO
Business and organisational building blocks, based on the DSSC/DS4SSCC-DEP building blocks and evolved as part of the LDT4SSC project.

The Business pillar of LDTs frames the context for value creation, value capture, and feasibility for sustaining the LDT beyond initial testing and piloting. This pillar of building blocks explores how to identify stakeholder incentives and structure data and service offerings. At its core, the business model should address the motives and strategic objectives of relevant stakeholders, ensure their alignment with offered data and services, foster multi-stakeholder data cooperation, and be sustainable in the long run. This pillar includes the following blocks:

In addition, the Governance pillar ensures that the operating model of a single LDT or its interconnection with other LDTs is aligned with its core purpose. It sets clear roles and responsibilities, rules, and processes, among others, to ensure trust among engaged stakeholders and beneficiaries, foster clear and transparent decision-making, and support operational compliance with relevant regulations and requirements. This pillar includes the following building blocks:

In addition to having clear and strong governance, the Legal pillar emphasises the necessity for ensuring regulatory compliance and a strong legal foundation for the LDT’s operations. It defines the following blocks:

Finally, the LDT4SSC Blueprint defines a Design pillar with two essential building blocks that influence how LDTs are built, sustained, and adopted. These blocks affect all other business and organisational building blocks as well as the technical building blocks. They enable the alignment of technological innovation with societal and environmental goals, and foster trust and long-term value for all users and citizens.

The aforementioned building blocks are implemented in various ways, using specific assets, inter alia contracts and rulebooks, or specific methodologies and templates, inter alia the LDT4SSC Methodology, Data Cooperation Canvas, and/or various ecosystem design materials.

The business and organisational building blocks are further described in § 4.1 Business and organisational architecture, which emphasises their connections and relevance for LDT interoperability.

2.2. Technical building blocks

This section explains what technical pillars and building blocks are required to build an LDT. Figure 9 contains a graphical overview of these pillars and blocks, which we will discuss in detail next.

TO-DO
Technical LDT building blocks aligned with the LDT levels, based on the DSSC/DS4SSCC-DEP building blocks.

The Data Interoperability pillar contains the capabilities needed for data acquisition, data exchange, semantic models, data formats, and application interfaces. This also includes functionalities for provenance and traceability. This pillar has the following building blocks, which apply to interactions within a single LDT and between LDTs:

The Data Sovereignty and Trust pillar contains the capabilities needed for identifying components and assets that interact within an LDT and between LDTs. This allows establishing trust and the possibility of defining and enforcing policies for access and usage control. This pillar has the following building blocks:

The Data Value Creation Enablers pillar contains the capabilities for enabling value creation on top of LDTs: providing metadata on data products and publishing them in catalogues. This pillar has the following building blocks:

The Service Creation Enablers pillar contains the capabilities for enabling service creation on top of LDTs: ensuring accessibility and inclusive user experience (UX)/user interface (UI), developing and deploying AI models, and running simulations. This is a new LDT-specific pillar, because creating services is as important as the data itself. It’s based on the Value Creation Services building block of data spaces, which we don’t include in the Data Value Creation Enablers pillar for an LDT. The new Service Creation Enablers pillar has the following building blocks:

3. Workflows

Once we have built an LDT, the LDT should support the following workflows: adding a new SLC to an existing LDT (§ 3.1 Adding a new Surface-level component to an existing LDT) and discovering which SLCs work on an existing LDT (§ 3.2 Discovering which Surface-level components work on an existing LDT). We include both technical steps, which are labelled starting with "T", and business and organisational steps, which are labelled starting with "BO". We present the workflows in a mostly sequential way, but it might be necessary that we revisit previous steps when we acquire new information, including requirements and restrictions.

3.1. Adding a new Surface-level component to an existing LDT

In this section, we describe the workflow for adding a new SLC to an existing LDT, as mentioned in § 1.3 Adding a new Surface-level component to an existing LDT, reusing Core and Base-level components. We visualise this in Figure 10, where technical steps are blue with rounded corners and business and organisational steps are green with square corners.

TO-DO
This workflow guides us through the process of adding a new SLC to an existing LDT. It includes technical steps (blue, rounded corners) and business and organisational steps (green, square corners).

First, together with our key stakeholders, we create a shared understanding of the use case, pain points, and envisioned solutions (BO1). Next, in parallel, we determine the functional requirements of the new SLC (T1), and we align the stakeholders and ensure a shared motivation to develop the use case where the new SLC is needed (BO2). T1 and BO2 might influence each other, because the stakeholders need to support the selected SLC. After T1, we determine the SLC’s technical requirements (T2). After BO2, we determine, in parallel, how value is created and captured (BO3) and explore regulatory boundaries (BO4), to assess the feasibility of adding a new SLC and justify the required investments. BO3 can influence the technical requirements of the SLC. Next, we determine the technical gaps between the SLC’s requirements and what the LDT currently offers (T3). We follow this up, in parallel, by mapping out the costs and benefits and assessing the return on investment (BO5), and by determining how we will deal with the technical gaps (T4). BO5 leads us to explore whether we need additional data providers and/or intermediaries and operators (BO6), and whether we need to change the existing governance framework or, if none exists yet, create one (BO7). BO7 might require us to update our contractual framework (BO8), which in turn might require parties to sign updated or newly established agreements and contracts (BO9).

BO6 and BO7 might influence each other, because, on the one hand, the governance framework might affect the data providers and intermediaries we are allowed to interact with, and, on the other hand, if we want to work with providers and intermediaries that don’t fit within the governance framework, we need to update the governance framework. BO6 and T4 might also influence each other, because, on the one hand, the technical gaps might influence which providers and intermediaries we need, and, on the other hand, the unavailability of certain providers and intermediaries might require additional actions to resolve a technical gap.

If there are new participants in the LDT, we need to onboard them (BO10). Next, we deploy the participatory and service design methods throughout the design and implementation phases (BO11), addressing the two design building blocks described in § 2.1 Business and organisational building blocks. We follow both T4 and BO11 by implementing the technical changes and deploying the components. Finally, we can start using the new SLC with our LDT and start creating the envisioned value for our beneficiaries.

Note that this process might not be straightforward, and we might need to go back to previous steps to adapt or validate specific details, either with respect to the technical aspects or the business and organisational aspects, or both.
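The ordering of the steps in this workflow can be sketched as a dependency graph, from which the groups of steps that may run in parallel follow mechanically. The sketch below uses Python's standard `graphlib` module. Several edges are assumptions where the text only says "next": that T3 waits for T2, BO3, and BO4; that BO10 follows BO6 and BO9; and the label `DEPLOY` for the final implementation and deployment step. Mutual influences (for example, between BO6 and BO7) and iterations back to earlier steps are deliberately left out, because they would introduce cycles.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must be finished before it starts.
PREDECESSORS = {
    "BO1": set(),
    "T1": {"BO1"},
    "BO2": {"BO1"},
    "T2": {"T1"},
    "BO3": {"BO2"},
    "BO4": {"BO2"},
    "T3": {"T2", "BO3", "BO4"},  # assumption: gaps are assessed once all inputs exist
    "BO5": {"T3"},
    "T4": {"T3"},
    "BO6": {"BO5"},
    "BO7": {"BO5"},
    "BO8": {"BO7"},
    "BO9": {"BO8"},
    "BO10": {"BO6", "BO9"},      # assumption: onboarding follows BO6 and BO9
    "BO11": {"BO10"},
    "DEPLOY": {"T4", "BO11"},    # hypothetical label for implementing and deploying
}


def parallel_stages(predecessors: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into stages; all steps within one stage can run in parallel."""
    sorter = TopologicalSorter(predecessors)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every step whose predecessors are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = parallel_stages(PREDECESSORS)
```

Under these assumptions, the first stage contains only BO1, the second contains BO2 and T1, and the last contains only the deployment step. Steps that the text describes as conditional (BO8, BO9, BO10) are modelled as always present for simplicity.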

3.2. Discovering which Surface-level components work on an existing LDT

In the previous workflow, we manually executed the steps T2 and T3, but it should be possible to automate these steps, as mentioned in § 1.4 Discovering which Surface-level components work on which existing LDTs and exchanging components through interconnected LDTs. This is the goal of this section’s workflow (see Figure 11).

TO-DO
This workflow contains the steps to automatically determine the technical requirements to deploy the SLC and automatically determine the gaps between these requirements and what the LDT offers.

For each available SLC, especially those built for other LDTs, we execute two steps. First, we automatically determine the technical requirements to deploy the SLC (T1). Next, we automatically determine the gaps between these requirements and what the LDT offers (T2). Finally, we use these SLCs in the first workflow at T1. This doesn’t mean that we can skip T2 and T3 of the first workflow, because we still have to take into account the business and organisational aspects, such as legal checks, the signing of necessary agreements, and budgetary assessments, but it allows us to speed up the process.
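The two automated steps can be sketched as follows, assuming that every SLC publishes its requirements as sets of identifiers per aspect and that the LDT describes what it offers in the same way. The dictionary layout, the aspect names, and the example identifiers are all hypothetical; the blueprint does not define them.

```python
def determine_requirements(descriptor: dict) -> dict[str, set[str]]:
    """T1: read the technical requirements that an SLC publishes."""
    return {aspect: set(values) for aspect, values in descriptor["requires"].items()}


def determine_gaps(requirements: dict[str, set[str]],
                   ldt_offers: dict[str, set[str]]) -> dict[str, list[str]]:
    """T2: per aspect, list what the SLC needs but the LDT does not offer."""
    gaps = {}
    for aspect, needed in requirements.items():
        missing = needed - ldt_offers.get(aspect, set())
        if missing:
            gaps[aspect] = sorted(missing)
    return gaps


def discover(candidates: list[dict],
             ldt_offers: dict[str, set[str]]) -> list[tuple[str, dict[str, list[str]]]]:
    """Run T1 and T2 for every candidate and order them by number of gaps."""
    report = {c["name"]: determine_gaps(determine_requirements(c), ldt_offers)
              for c in candidates}
    return sorted(report.items(), key=lambda item: sum(len(m) for m in item[1].values()))


# Hypothetical LDT capabilities and candidate SLCs.
ldt_offers = {
    "data_models": {"air-quality-observed", "traffic-flow-observed"},
    "exchange_types": {"rest-pull"},
}
candidates = [
    {"name": "congestion-predictor",
     "requires": {"data_models": ["traffic-flow-observed"],
                  "exchange_types": ["stream-push"]}},
    {"name": "pollution-map",
     "requires": {"data_models": ["air-quality-observed"],
                  "exchange_types": ["rest-pull"]}},
]
result = discover(candidates, ldt_offers)
```

In this example, `pollution-map` comes first because it has no gaps, while `congestion-predictor` reports a missing exchange type. The resulting shortlist is only an input to the first workflow: the business and organisational steps still have to be carried out.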

4. Reference architectures

In this section, we discuss the business and organisational architecture, including how it aligns with the business and organisational building blocks. Next, we discuss the technical architecture, including how it aligns with the technical building blocks, and the standards and protocols that we recommend.

4.1. Business and organisational architecture

The business and organisational architecture of LDTs builds on the building blocks of § 2.1 Business and organisational building blocks. Figure 12 provides a schematic overview of the interrelations between the business and organisational building blocks that are specific to developing and operating an LDT. Note that building blocks are not specific assets or components, as these are deployed based on particular implementation contexts. However, as the LDT4SSC project’s aim is to support the development of an LDT ecosystem, Figure 13 further introduces the business and organisational building blocks necessary for fostering the connections between various LDTs, and the sharing of data products and LDT services.

Figures 12 and 13 consist of five major clusters:

  1. An LDT (dark grey), including data products and LDT services.

  2. An LDT ecosystem (light grey, only in Figure 13), which refers to the overarching framework for connecting LDTs and sharing resources (data products and/or services) among them.

  3. Participants in their specific roles (blue), especially data providers, data consumers, intermediaries/operators, and/or a governance authority (there may be additional roles, defined by the LDT’s governance framework). Note that these roles are not mutually exclusive: a participant may carry one or multiple roles. All participants must adhere to the governance framework and ensure regulatory compliance, and are expected to participate in the respective use cases. Note that individual participants (legal bodies) have their own organisational form and business model, and their interests, rights, and responsibilities must be accounted for when designing the data cooperation model.

  4. Business and organisational building blocks (green) that are necessary either for the design, governance, or operations of the LDT and/or LDT ecosystem.

  5. The urban and community processes layer (yellow), which emphasises the importance of integrating LDT services into specific urban and community processes, such as smart governance, administration, or specific urban or community services. This fosters practical value creation and increases the relevance of LDTs in the respective local community, showing how LDTs help address local needs and achieve specific objectives.

TO-DO
High-level architecture of the business and organisational building blocks for developing and operating an LDT.

Following Figure 12, the development and use of an LDT depend on one or more specific use cases, which identify the need for specific LDT services and data products. Each use case must address local needs and integrate the LDT into urban processes. In doing so, the LDT’s user experience should be carefully considered and monitored to foster adoption. To sustain the use of LDTs and the respective use cases, one needs to ensure a feasible business model (short- and long-term), which is aligned with the strategic business incentives of all engaged participants.

Various participants engage/participate in the use case in different ways:

While collecting, sharing, and processing data (as well as during any other business activity), all participants must ensure regulatory compliance in their actions. Relatedly, the contractual framework gives legal grounding to the established governance framework and ensures legal certainty and alignment with various regulations.

Depending on whether the participants have an organisational form or not, data sharing and engagement in the LDT may vary (for example, whether individuals could engage with the LDT by providing data directly). Also, if the LDT is co-developed by multiple actors, it may be co-governed through a specific organisational form (either incorporated, such as an association, or unincorporated, such as a consortium or joint venture).

In addition to the perspective of necessary building blocks for the single LDT, Figure 13 provides an overview of the building blocks (and relationships between them) that shape how several LDTs connect with one another and exchange assets among themselves. LDTs may leverage the interoperability framework of data spaces and the guidelines from DSSC and DS4SSCC-DEP, as discussed in § 2 LDT building blocks based on DSSC & DS4SSCC-DEP building blocks.

TO-DO
High-level architecture of the business and organisational building blocks for developing and operating an LDT-ecosystem, with the aim of sharing data products and/or LDT services among different LDTs.

The central focus is again the use case, which identifies the need for sharing specific data products or LDT services. Furthermore, the use case identifies mandatory functionalities (capabilities) for the functioning of the LDT ecosystem (for example, exchange of assets, brokering, and so on) that may be provided by a data space. A data space is defined by the rulebook (essentially a documented governance framework), which includes a structured set of principles, processes, standards, protocols, rules, and practices that guide and regulate the governance, management, and operations within a data space to ensure effective and responsible leadership, control, and oversight. It also defines the functionalities the data space provides and the associated data space roles, including the data space governance authority and participants. The rulebook is maintained by the data space’s governance authority, which also decides on participant management, that is, the onboarding, continuous facilitation, and offboarding of participants. Relatedly, an LDT ecosystem is not just an intellectual construct: it has tangible interfaces through which users experience the ecosystem and the interactions within it.

It’s worth noting that an operational data space (LDT ecosystem) also has a business model to sustain its activities and fulfil its purpose. It also takes an organisational form (incorporated or unincorporated), which influences several business and organisational considerations (such as what agreements/contracts should be established, what its decision-making processes will look like, and so on).

4.2. Technical architecture

The technical architecture expands the high-level overview of an LDT. It includes artefacts and components. Artefacts are structured pieces of information that are created, processed, exchanged, or consumed within an LDT. They are passive elements. Components are functional building blocks that perform operations on artefacts. They are active elements responsible for producing, transforming, managing, exchanging, or analysing these artefacts. First, in Table 1, we explain the artefacts of the architecture, including their names, descriptions, the high-level components they belong to, and the building blocks they are related to. Next, in Table 2, we do the same for the components of the architecture. Descriptions are intentionally technology-agnostic. We provide recommendations for standards and protocols in § 4.2.5 Standards and protocols. Finally, we describe how the data flows through the architecture, visualised in Figure 14. Note that this architecture focuses on the different artefacts, components, and their interactions. It doesn’t say anything about which parties deploy which components where.

TO-DO
The technical architecture expands the high-level overview of an LDT. It defines the artefacts and components for Data Sources, Data Acquisition, Data Management, Analysis, Visualisations, Decision-Making, and Connectivity, and how they interact.

| Artefact | High-level component | Description | Building Block |
|---|---|---|---|
| Rich Data Model | Data Acquisition | This is a data model that represents all the data from a data source. | Data models |
| Common Data Model | Data Management | This is a data model used by the LDT Core and the SLCs. The word “common” refers to the fact that this model will only contain what is common between what the LDT Core offers and what an SLC needs. | Data models |
| Data Model Mappings | Data Management | These are rules to convert data using one data model to data using another data model. | Data models |
| Data, Service, and Offerings Descriptions | Data Management, Visualisations, Decision-Making | These are the details of the data, services, and offerings. | Data, service, and offerings descriptions |
| Access and Usage Policies | Data Management | These are the details about permitted and prohibited access and usage actions over certain data, services, and offerings, as well as the obligations required to be met by the acting parties. | Access and usage policy enforcement |
| Rulebook | Data Management | This defines governance requirements and supports automated conformity assessment. | Trust framework |
| Accredited Trust Anchors and Trust Service Providers | Data Management | This is a listing of trust anchors and trust service providers (including revoked ones). | Trust framework |
| AI Model Descriptions | Analysis | These are the details of the datasets utilised, training versions, dependencies, and performance metrics such as accuracy and loss of AI models. | Data, service, and offerings descriptions |

The artefacts of the technical architecture, including their names, descriptions, the high-level components they belong to, and the building blocks they are related to.

| Component | High-level component | Description | Building Block |
|---|---|---|---|
| Raw Data Sources | Data Sources | This is any data source, excluding data spaces. | Data Acquisition |
| Data Spaces | Data Sources | This is a federated environment where data is shared among trusted participants. | Data Acquisition |
| Data Space Connector | Connectivity | This enables an LDT to provide and consume data within a data space under governed, secure, and interoperable conditions. | Data Acquisition, Data Management |
| Extract, Transform, Load (ETL) | Data Acquisition | This is a data process that combines, cleans and organises data from multiple sources. | Data Acquisition |
| Data Storage | Data Acquisition | This is where the ETL stores existing data and where new data is stored that is generated by the SLCs. | Data Management |
| Data Model Publication | Data Management | This makes the data models available for querying. This includes (new) vocabularies. | Data Models, Publication and Discovery |
| Catalogue | Data Management | This makes the data, service, and offerings descriptions available for querying. | Publication and Discovery |
| Data Model Mediator | Data Management | This maps data from one data model to another data model. | Data models |
| Data Exchange Component | Data Management | This enables the actual sharing of data between a data provider and a data user. | Data exchange, Access and Usage Policy Enforcement |
| Observability Service | Data Management | This collects and stores provenance and traceability data. | Provenance, Traceability, and Observability |
| Discovery Service | Connectivity | This allows discovering which SLCs work with which LDTs. | Publication and Discovery |
| Rulebook Publication | Data Management | This makes the rulebook available for querying. | Trust Framework, Publication and Discovery |
| Trust Services | Connectivity | They validate attestations and declarations against the conformity assessment criteria defined in the Rulebook. They can also perform conformity assessments by validating declarations and certifications against the conformity schema. These services combine organisational checks and automated verification processes, issuing results under the authority of the Governance Authority. | Trust Framework, Identity and Attestation Management |
| Trust Anchors and Trust Service Providers Publication | Data Management | This makes the list of trust anchors and trust service providers available for querying. | Trust Framework, Publication and Discovery |
| Visualisation | Visualisations | This provides dashboards, spatial and temporal views, and interactive interfaces that enable end-users to explore, interpret, and interact with the outputs of services and analyses within an LDT. | Visualisation |
| Simulation | Analysis | This enables the exploration of dynamic behaviours and scenarios within LDTs. This supports testing of policies, systems, or interventions under different assumptions. | Analysis |
| AI Model Catalogue | Analysis | This tracks trained AI models and their associated metadata, including the datasets utilised, training versions, dependencies, and performance metrics such as accuracy and loss. | Publication and Discovery |
| AI Model Manager | Analysis | This manages each phase of an AI model's lifecycle, including training, deployment, and updates. | Analysis |
| AI Model Provider | Analysis | This deploys AI models as inference interfaces and manages the models' state, deletion, and publication. | Analysis |
| AI Model Runner | Analysis | This runs the AI models, as instructed by the AI Model Provider. | Analysis |
| Decision Support Channels | Decision-Making | This enables the engagement of end-users (citizens or decision-makers) and supports participatory planning processes. | Decision-Making |
| Notifications | Decision-Making | This pushes notifications to end-users. | Decision-Making |
| Application Composition | Decision-Making | This allows end-users to wire together components from other high-level components (for example, dashboards, scenario builders, and analysis flows) to contribute to use cases. | Decision-Making |

The components of the technical architecture, including their names, descriptions, the high-level components they belong to, and the building blocks they are related to.

When an SLC reads data, the data flows through the architecture from the data sources to the SLC, as depicted by the arrows in Figure 14. Firstly, note that we didn’t include every possible interaction (arrow), to keep the figure as readable as possible. Secondly, if an SLC writes data back to the data sources or to any of the components in between, then the arrows are reversed.

Raw data goes from its data source, called Raw Data Source, to the ETL (arrow 1a). This does not include data from a data space. Data coming from a Data Space goes through a Data Space Connector (arrow 1b) before arriving at the ETL (arrow 1c). The ETL extracts, transforms, and loads the data based on a Rich Data Model that captures as much knowledge from the data sources as possible. Next, data flows from the ETL to the Data Storage (arrows 2a and 2b). The Data Model Mediator uses Data Model Mappings to map data (arrow 3) in a Rich Data Model to a Common Data Model. The mapped data flows to the Data Exchange Component (arrow 4). The Data Exchange Component uses the Trust Services (arrow 6) and the Access and Usage Policies to determine what data is shared with what parties and for what purposes.

Next, the data flows from the Data Exchange Component to the Visualisations, Analysis, and Decision-Making (arrows 8a, 8b, 8c, and 8d). This can happen via different interfaces, for example, API 1 (arrows 8a and 8d) and API X (arrows 8c and 8d). The shared data is in the Common Data Model that is required by the requesting SLC. The SLCs can access the data models through the Data Model Publication’s API. The Discovery Service uses the details about the data, data models, AI models, services, and offerings (arrows 10a, 10b, 11a, 11b, and 11c).

The data can also flow from the Data Model Mediator to the Data Space Connector (arrow 14). From there, it can flow to the data space participants. Note that in Figure 14, we use two separate instances of the Data Space Connector for clarity, but during deployment, this can be a single instance.

The AI Model Manager stores the details of the AI model in the AI Model Catalogue (arrow 12a). The AI Model Provider uses these details (arrow 12b) to run the models on the AI Model Runner (arrow 12c). The latter uses the data made available through the Data Exchange Component (arrow 8b). Afterwards, a Decision-Making tool uses the runner’s output (arrow 15). When accessing data, SLCs use the Trust Services (arrows 13a and 13b) to authenticate the users. To support provenance and traceability, the necessary information flows from different components to the Observability Service (arrows 5a and 5b). Note that in Figure 14, we only added arrows between two components and the Observability Service for clarity. It’s possible to let the Observability Service collect data from other components as well.

Similar to the AI Model Runner, the Simulation can use the data (arrow 8c). Afterwards, the results of the Simulation can flow, for example, to a Visualisation (arrow 9b) or a Decision-Making tool (arrow 9a).

The Data Exchange Component and SLCs use a Contract Negotiation API that allows a data provider to offer data, along with policies, to a data consumer that requests data (arrow 7). This happens before data flows from the Data Exchange Component to the SLCs.

The architecture applies the shift-left data paradigm to prevent the proliferation of bad data and to reduce storage and compute needs in downstream systems. It achieves this by processing the data as close to the data sources as possible instead of requiring the SLCs to do that. Specifically, the Data Acquisition and Data Management components make sure

In § 4.2.1 Support different data models, § 4.2.2 Support different data exchange types, § 4.2.3 Support for the usage of data from different LDTs, and § 4.2.4 Support for working with different AI models, we elaborate on how the architecture behaves during specific scenarios. In § 4.2.5 Standards and protocols, we explain the recommended standards and protocols for the artefacts and the interaction between components.

4.2.1. Support different data models

Support for different SLCs using different data models is achieved in the Data Management component by using the Data Model Mediator. After the Data Acquisition component materialises the data from the different data sources using their corresponding Rich Data Models, the Data Model Mediator maps the data to one of the Common Data Models. The mediator does this by using the corresponding Data Model Mappings. The LDT can support new Rich and Common Data Models by adding mappings for the relevant models. Consequently, the LDT supports semantic interoperability.

In Figure 15, we have an example with two Rich Data Models, two Common Data Models, and two Visualisations. The Data Acquisition component materialises the data from a Raw Data Source using Rich Data Model 1 (arrow 1a) and data from a Data Space using Rich Data Model 2 (arrows 1b and 1c). Next, the data goes to the Data Storage (arrows 2a and 2b) and then to the Data Model Mediator (arrow 3). The mediator maps the resulting data to either the Common Data Model 1 (arrow 4a), which is used by Visualisation 1 (arrow 5a), or the Common Data Model 2 (arrow 4b), which is used by Visualisation 2 (arrow 5b). Note that, compared to Figure 14, we kept only the components needed for our explanation.

Also, Figure 15 highlights how interoperability between different data sources and SLCs is achieved through the use of multiple data models and their mappings, but it doesn’t provide an exhaustive representation of all possible data modelling and transformation patterns and deployments.

TO-DO
The Data Acquisition component ingests and structures the data from a raw data source using Rich Data Model 1 and data from a data space using Rich Data Model 2. Next, the data goes to the Data Storage and then to the Data Model Mediator. The mediator maps the resulting data to either the Common Data Model 1, which is used by Visualisation 1, or the Common Data Model 2, which is used by Visualisation 2.
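The mediation step described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the model names (`rich-1`, `common-1`, and so on), the record fields, and the `mediate` helper are all hypothetical, and a mapping is reduced here to a field-renaming table, whereas real Data Model Mappings would be written in a mapping language.

```python
from typing import Any, Dict, Tuple

Record = Dict[str, Any]

# Data Model Mappings: (rich model, common model) -> {target field: source field}.
# All model and field names are invented for illustration.
MAPPINGS: Dict[Tuple[str, str], Dict[str, str]] = {
    ("rich-1", "common-1"): {"sensor": "sensorId", "value": "pm10_ugm3"},
    ("rich-1", "common-2"): {"id": "sensorId", "pm10": "pm10_ugm3", "time": "observedAt"},
    ("rich-2", "common-1"): {"sensor": "station", "value": "particulates"},
}


def mediate(record: Record, rich_model: str, common_model: str) -> Record:
    """Map a record from a rich data model to a common data model."""
    try:
        mapping = MAPPINGS[(rich_model, common_model)]
    except KeyError:
        raise LookupError(f"no mapping from {rich_model} to {common_model}") from None
    # Only fields named in the mapping survive: the common model carries
    # just what the requesting SLC needs.
    return {target: record[source] for target, source in mapping.items()}


stored = {
    "sensorId": "s-42",
    "pm10_ugm3": 17.5,
    "observedAt": "2025-01-01T10:00:00Z",
    "firmware": "1.3",
}
for_visualisation_1 = mediate(stored, "rich-1", "common-1")
for_visualisation_2 = mediate(stored, "rich-1", "common-2")
```

In this sketch, supporting an additional rich or common data model amounts to adding an entry to `MAPPINGS`; the mediator code itself does not change.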

4.2.2. Support different data exchange types

Support for different SLCs using different data exchange types is achieved in the Data Management component by using the Data Exchange Component. After the data enters the Data Exchange Component from the Data Model Mediator, the Data Exchange Component offers it to the different SLCs through the necessary interfaces. Note that supporting different data exchange types contributes to an LDT’s interoperability, but it doesn’t contribute to the semantic interoperability. The LDT does this through the Data Model Mediator (see § 4.2.1 Support different data models).

In Figure 16, the Data Exchange Component has two interfaces: API 1 and API 2. There are two Visualisations: one that gets the data via API 1 (arrow 1) and another that gets the data via API 2 (arrow 2).

TO-DO
The Data Exchange Component has two interfaces: API 1 and API 2. There are two Visualisations: one that gets the data via API 1 and another that gets the data via API 2.
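As a rough illustration of one component exposing the same data through two interfaces, the sketch below serialises identical rows as JSON and as CSV. The blueprint does not say what API 1 and API 2 are, so treating them as a JSON interface and a CSV interface is purely an assumption, as are the function names and the sample rows.

```python
import csv
import io
import json
from typing import Dict, List

Rows = List[Dict[str, object]]


def api_1_json(rows: Rows) -> str:
    """Hypothetical 'API 1': return the shared data as a JSON array."""
    return json.dumps(rows, sort_keys=True)


def api_2_csv(rows: Rows) -> str:
    """Hypothetical 'API 2': return the same data as CSV with a header row."""
    buffer = io.StringIO()
    # Column names are taken from the first row; this sketch assumes
    # a non-empty list of rows that all share the same keys.
    writer = csv.DictWriter(buffer, fieldnames=sorted(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


shared = [{"sensor": "s-42", "value": 17.5}, {"sensor": "s-43", "value": 12.0}]
json_payload = api_1_json(shared)
csv_payload = api_2_csv(shared)
```

Both payloads carry the same field names and values. Only the exchange format differs, which is why this mechanism does not by itself add semantic interoperability.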

4.2.3. Support for the usage of data from different LDTs

The Data Space Connector in one LDT’s Data Acquisition component and the Data Space Connector in the other LDT’s Data Management component enable the sharing of data between those LDTs. After the data leaves the Data Model Mediator, an LDT’s Data Space Connector shares the data with the other LDT’s Data Space Connector. Afterwards, the receiving LDT processes the data in the same way as data from a Raw Data Source. In Figure 17, the data from LDT 2 flows to its Data Space Connector (arrow 1). Next, it flows to the Data Space Connector of LDT 1 (arrow 2). Finally, the data is processed by LDT 1’s ETL (arrow 3).

TO-DO
The Data Space Connectors of both LDTs allow for the sharing of data between the LDTs. Data from one LDT flows, from the Data Management component, to the other LDT, into the Data Acquisition component.

4.2.4. Support for working with different AI models

The AI Model Manager, AI Model Catalogue, AI Model Provider, and AI Model Runner, which are part of the Analysis component, support working with different AI models. The manager updates a model’s information in the catalogue. This allows the provider to query the necessary information about a model to deploy it on the runner. This setup allows running different models at the same time, as well as different versions of the same model. In Figure 18, the manager manages the lifecycle of AI model 1 (top-left diagram). It stores the information about the model, including its versions, in the catalogue (top-right diagram, arrow 1). The provider queries the information about the second version of AI model 1 from the catalogue (bottom-left diagram, arrow 2) and deploys it on the runner (bottom-left diagram, arrow 3). Next to this model, the provider also deploys the third version of AI model 2 (bottom-right diagram, arrow 3).

TO-DO
The AI Model Manager, AI Model Catalogue, AI Model Provider, and AI Model Runner work together to support working with different AI models.
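The division of roles between manager, catalogue, provider, and runner can be sketched as below. The class names, the `deploy` helper, the model names, and the `artefact` paths are hypothetical and chosen only to mirror the scenario in this section; an actual deployment would involve real model storage and inference endpoints.

```python
from typing import Dict, Tuple

Description = Dict[str, object]


class Catalogue:
    """Holds one description per (model name, version)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, int], Description] = {}

    def register(self, name: str, version: int, **details: object) -> None:
        self._entries[(name, version)] = {"name": name, "version": version, **details}

    def lookup(self, name: str, version: int) -> Description:
        return self._entries[(name, version)]


class Runner:
    """Keeps deployed models keyed by name and version."""

    def __init__(self) -> None:
        self.deployed: Dict[str, Description] = {}

    def load(self, description: Description) -> None:
        # Name and version together form the key, so several versions
        # of the same model can run side by side.
        key = f"{description['name']}:v{description['version']}"
        self.deployed[key] = description


def deploy(catalogue: Catalogue, runner: Runner, name: str, version: int) -> None:
    """Role of the provider: query the catalogue, then deploy on the runner."""
    runner.load(catalogue.lookup(name, version))


catalogue, runner = Catalogue(), Runner()
# Role of the manager: record each trained version in the catalogue.
for version in (1, 2):
    catalogue.register("model-1", version, artefact=f"models/model-1/{version}")
catalogue.register("model-2", 3, artefact="models/model-2/3")

deploy(catalogue, runner, "model-1", 2)
deploy(catalogue, runner, "model-2", 3)
```

After these calls, the runner holds version 2 of the first model and version 3 of the second, matching the situation described for Figure 18.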

4.2.5. Standards and protocols

In this section, we list the recommended standards and protocols for the artefacts and the interactions between the components. We build on the previous deliverables D3.7 Technical Resources for pilots (Work Strand 1) and D3.8 Technical Resources for pilots (Work Strands 1, 2 & 3), the DSSC Blueprint v3, and the DS4SSCC-DEP Blueprint. We focus only on RDF-based and NGSI-LD systems to facilitate the interoperability within an LDT and between LDTs, as described in Section 4.2.5 of D3.8.

4.2.5.1. Artefacts

In this section, we describe the standards and protocols for data; vocabularies; data models; data model mappings; data, services, and offerings descriptions; access and usage policies; provenance and traceability; and AI model descriptions.

4.2.5.1.1. RDF-based systems

Data: Resource Description Framework (RDF). RDF is a framework for representing information on the Web. Its core structure is a set of triples, each consisting of a subject, a predicate and an object. A set of such triples is called an RDF graph.

Vocabularies: RDF Schema (RDFS) and Web Ontology Language (OWL). RDF Schema provides a data-modelling vocabulary for RDF data. OWL is a language designed to represent rich and complex knowledge about things, groups of things, and relations between things.

Data Models: Shapes Constraint Language (SHACL). This is a language for validating RDF against a set of conditions. We provide these conditions as shapes and other constructs expressed via RDF. As we use SHACL shapes to validate that data satisfy a set of conditions, these shapes can also be viewed as a description of the data that satisfy these conditions. Such descriptions may be used for a variety of purposes besides validation, including user interface building, code generation and data integration through shared data models.
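To make the triple structure concrete, the sketch below represents a small RDF graph as a set of Python tuples and serialises it as N-Triples. It deliberately uses only the standard library and handles IRIs only (no literals or blank nodes); the `https://example.org/` namespace and the sensor resources are hypothetical, while the `rdf:type` and SOSA IRIs are existing vocabulary terms.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), all IRIs here

EX = "https://example.org/"  # hypothetical namespace for illustration
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SOSA = "http://www.w3.org/ns/sosa/"

# An RDF graph is a set of triples.
graph: Set[Triple] = {
    (EX + "sensor/42", RDF_TYPE, SOSA + "Sensor"),
    (EX + "sensor/42", SOSA + "observes", EX + "property/pm10"),
}


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialise IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


print(to_ntriples(graph))
```

In practice, an RDF library would be used instead of hand-written serialisation; the point here is only that every statement has exactly three parts.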

4.2.5.1.2. NGSI-LD-based systems

Data: NGSI-LD Information Model. It builds upon JSON-LD and RDF principles to represent entities, properties, and relationships as Linked Data, allowing information to be read by machines and interoperable across domains.

Vocabularies and Data Models: Smart Data Models (SDM) and JSON Schema. SDM is an initiative that develops open, reusable data schemas aligned with NGSI-LD. It provides harmonised ontologies according to a specific domain, such as mobility, energy, or environment. JSON Schema is a declarative language for defining structure and constraints for JSON data. SDM uses it to describe its data models.
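As an illustration of how entities, properties, and relationships look in this model, the sketch below builds an NGSI-LD entity as a Python dictionary. The helper name is hypothetical, the identifiers are invented, and the attribute names (`pm10`, `refDevice`) are chosen for illustration and should be checked against the published data model before use. The `@context` URL is assumed to be the ETSI core context.

```python
import json
from typing import Dict, List


def ngsi_ld_entity(
    entity_id: str,
    entity_type: str,
    properties: Dict[str, object],
    relationships: Dict[str, str],
    context: List[str],
) -> Dict[str, object]:
    """Build an NGSI-LD entity: Properties carry values, Relationships point to other entities."""
    entity: Dict[str, object] = {"id": entity_id, "type": entity_type}
    for name, value in properties.items():
        entity[name] = {"type": "Property", "value": value}
    for name, target in relationships.items():
        entity[name] = {"type": "Relationship", "object": target}
    entity["@context"] = context
    return entity


entity = ngsi_ld_entity(
    "urn:ngsi-ld:AirQualityObserved:station-42",  # invented identifier
    "AirQualityObserved",
    properties={"pm10": 17.5},
    relationships={"refDevice": "urn:ngsi-ld:Device:sensor-42"},
    # Assumed core context URL; a data model's own context would normally be added.
    context=["https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"],
)
print(json.dumps(entity, indent=2))
```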

4.2.5.1.3. System-independent

Data Model Mappings: Notation3 (N3), [SPARQL Query Language](https://www.w3.org/TR/sparql11-query/) (SPARQL), and RDF Mapping Language (RML). N3 is a rule language for natively building and reasoning over semantic knowledge graphs. Since N3 is a superset of RDF, there is no impedance mismatch between N3 and knowledge graphs. We can use N3 to map RDF data using one data model to data using another data model by defining rules. We can find an example on the Eyeling N3 Playground. We can use SPARQL to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. SPARQL can be used to map RDF data by using the CONSTRUCT query form. We can try an example via Comunica. RML is a mapping language defined to express mapping rules from heterogeneous data structures and serialisations to the RDF data model. Similar to N3, we can use RML to map RDF-based data (available as a result of, for example, a SPARQL query) by defining rules. We can try an example on the RML Playground.
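The idea behind a CONSTRUCT-style mapping can be sketched without an RDF engine: for every source triple that matches a pattern, emit a triple in the target data model. The sketch below is an illustrative stand-in, not a SPARQL, N3, or RML implementation; the `rich#` and `common#` namespaces and the `construct` helper are hypothetical, and triples are plain tuples of strings.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical predicate mapping from a rich data model to a common data model.
PREDICATE_MAP: Dict[str, str] = {
    "https://example.org/rich#pm10Concentration": "https://example.org/common#pm10",
    "https://example.org/rich#measuredAt": "https://example.org/common#time",
}


def construct(triples: Iterable[Triple]) -> List[Triple]:
    """Emit one target triple per source triple whose predicate has a mapping.

    For a single predicate pair, this is roughly what the following query expresses:
      CONSTRUCT { ?s <https://example.org/common#pm10> ?o }
      WHERE     { ?s <https://example.org/rich#pm10Concentration> ?o }
    """
    return [(s, PREDICATE_MAP[p], o) for s, p, o in triples if p in PREDICATE_MAP]


source = [
    ("https://example.org/obs/1", "https://example.org/rich#pm10Concentration", "17.5"),
    ("https://example.org/obs/1", "https://example.org/rich#firmware", "1.3"),
]
mapped = construct(source)
```

Triples without a mapping, such as the `firmware` statement above, are simply not carried over, which reflects that a Common Data Model contains only what the LDT Core offers and an SLC needs.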

Data, Services, and Offerings Descriptions: Data Catalog Vocabulary (DCAT) and DCAT-AP. DCAT provides the baseline vocabulary for describing datasets and related resources in a structured, machine-readable manner. DCAT-AP is an application profile of DCAT with mandatory fields and controlled vocabularies, ensuring that metadata can be aggregated across Europe, for example, by the official portal for European data. It’s extensible to reflect specific domains. The descriptions also cover the endpoints where others can find the rulebook, trust anchors, trust service providers, data models, and AI model descriptions.
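A dataset description using DCAT terms could look like the sketch below, written as a JSON-LD-style Python dictionary. The helper name, the IRIs under `https://example.org/`, and the chosen set of properties are assumptions for illustration; the sketch does not check conformance with DCAT-AP.

```python
import json
from typing import Dict


def dataset_description(dataset_iri: str, title: str, description: str,
                        access_url: str, media_type_iri: str) -> Dict[str, object]:
    """Describe a dataset and one distribution with DCAT and Dublin Core terms."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [{
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": access_url},
            "dcat:mediaType": {"@id": media_type_iri},
        }],
    }


description = dataset_description(
    "https://example.org/dataset/air-quality",  # hypothetical IRIs throughout
    "Air quality observations",
    "PM10 observations from city sensors.",
    "https://example.org/api/air-quality",
    "https://www.iana.org/assignments/media-types/application/ld+json",
)
print(json.dumps(description, indent=2))
```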

Access and Usage Policies: Open Digital Rights Language (ODRL). This is a policy expression language that provides a flexible and interoperable information model, vocabulary, and encoding mechanisms for representing statements about the usage of content and services. Policies are used to represent permitted and prohibited actions over a certain asset, as well as the obligations required to be met by stakeholders. In addition, policies may be limited by constraints (for example, temporal or spatial constraints), and duties (for example, payments) may be imposed on permissions.
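The sketch below shows a small policy with one permission, one constraint, and one prohibition, followed by a toy evaluation function. The policy uses ODRL property names, but the evaluator is a simplified assumption of ours: it supports only the `dateTime` left operand with the `lt` operator, lets prohibitions override permissions, and ignores duties. The IRIs and the `is_allowed` helper are hypothetical.

```python
from datetime import date
from typing import Dict

DATASET = "https://example.org/dataset/air-quality"  # hypothetical asset

policy = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Set",
    "uid": "https://example.org/policy/1",
    "permission": [{
        "target": DATASET,
        "action": "use",
        "constraint": [{
            "leftOperand": "dateTime",
            "operator": "lt",
            "rightOperand": {"@value": "2026-01-01", "@type": "xsd:date"},
        }],
    }],
    "prohibition": [{"target": DATASET, "action": "distribute"}],
}


def _holds(constraint: Dict[str, object], on: date) -> bool:
    """Evaluate a single constraint; anything unsupported counts as not satisfied."""
    if constraint["leftOperand"] != "dateTime" or constraint["operator"] != "lt":
        return False
    return on < date.fromisoformat(constraint["rightOperand"]["@value"])


def is_allowed(policy: Dict[str, object], action: str, target: str, on: date) -> bool:
    """Toy check: a matching prohibition denies; otherwise a permission must match."""
    for rule in policy.get("prohibition", []):
        if rule["action"] == action and rule["target"] == target:
            return False
    for rule in policy.get("permission", []):
        if rule["action"] == action and rule["target"] == target:
            if all(_holds(c, on) for c in rule.get("constraint", [])):
                return True
    return False
```

For example, `is_allowed(policy, "use", DATASET, date(2025, 6, 1))` is permitted because the temporal constraint holds, whereas the same request dated after the constraint's limit, or any `distribute` request, is denied.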

Provenance and Traceability: The PROV Ontology (PROV-O). It provides a set of classes, properties, and restrictions that can be used to represent and interchange provenance information generated in different systems and under different contexts. It can also be specialised to create new classes and properties to model provenance information for different applications and domains.
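A provenance record for a single mapping run could be assembled as in the sketch below, using PROV-O classes and properties in a JSON-LD-style dictionary. The helper name and all IRIs are hypothetical, and timestamps are kept as plain ISO 8601 strings rather than typed literals to keep the example short.

```python
from datetime import datetime, timezone
from typing import Dict


def provenance_record(output_iri: str, input_iri: str, activity_iri: str,
                      agent_iri: str, started: datetime, ended: datetime) -> Dict[str, object]:
    """Link an output entity to its input, the activity that produced it, and the responsible agent."""
    return {
        "@context": {"prov": "http://www.w3.org/ns/prov#"},
        "@graph": [
            {
                "@id": output_iri,
                "@type": "prov:Entity",
                "prov:wasDerivedFrom": {"@id": input_iri},
                "prov:wasGeneratedBy": {"@id": activity_iri},
            },
            {
                "@id": activity_iri,
                "@type": "prov:Activity",
                "prov:used": {"@id": input_iri},
                "prov:wasAssociatedWith": {"@id": agent_iri},
                "prov:startedAtTime": started.isoformat(),
                "prov:endedAtTime": ended.isoformat(),
            },
            {"@id": agent_iri, "@type": "prov:SoftwareAgent"},
        ],
    }


record = provenance_record(
    "https://example.org/data/common/obs-1",      # mapped data (hypothetical)
    "https://example.org/data/rich/obs-1",        # source data (hypothetical)
    "https://example.org/activity/mapping-run-7",
    "https://example.org/agent/data-model-mediator",
    datetime(2025, 1, 1, 10, 0, tzinfo=timezone.utc),
    datetime(2025, 1, 1, 10, 1, tzinfo=timezone.utc),
)
```

A record like this is the kind of information a component could send to the Observability Service each time it transforms data.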

Note that, to the best of our knowledge, there is no standard yet for describing a rulebook or AI models in a machine-readable way.

4.2.5.2. Interaction between components

In this section, we explain how the different components interact. If a specific interaction between two components is not discussed, then it is very use-case dependent, which is out of scope of the blueprint. For example, how data sources interact with the ETL is very use-case dependent because it depends on the volume of the data, how often it changes, and whether it’s streaming data or not. Nonetheless, we can find suggestions in the deliverables D3.7 Technical Resources for pilots (Work Strand 1) and D3.8 Technical Resources for pilots (Work Strands 1, 2 and 3).

The interactions between components can be understood as a sequence of steps that enable data discovery, data access, secure exchange, and support for SLCs.

Discovery interactions enable consumers to discover available datasets and SLCs. The Catalogue (arrows 10b, 11a, 11b, and 11c) is a crucial component in these interactions. For this component, we suggest the Dataspace Protocol - Catalog Protocol. It defines how a consumer requests a catalogue from a catalogue service using an abstract message exchange format. The Catalog Protocol reuses properties from the DCAT and ODRL vocabularies with restrictions defined in the protocol.

Negotiation interactions define how access to data is agreed between participants. The Data Exchange Component and SLCs (arrow 7) are crucial components in these interactions. For these components, we suggest the Dataspace Protocol - Contract Negotiation Protocol. It defines how a data provider offers data, along with policies, to a data consumer that requests data.
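A contract negotiation can be viewed as a state machine that both parties advance by exchanging messages. The sketch below encodes the states and allowed transitions as we read them from the Dataspace Protocol; this table is a simplified transcription and should be verified against the version of the specification in use. The `advance` and `run` helpers are hypothetical, and the sketch does not model which party (provider or consumer) may trigger each transition.

```python
from typing import Dict, List, Set

# Allowed transitions between negotiation states (simplified transcription).
TRANSITIONS: Dict[str, Set[str]] = {
    "REQUESTED": {"OFFERED", "AGREED", "TERMINATED"},
    "OFFERED": {"REQUESTED", "ACCEPTED", "TERMINATED"},
    "ACCEPTED": {"AGREED", "TERMINATED"},
    "AGREED": {"VERIFIED", "TERMINATED"},
    "VERIFIED": {"FINALIZED", "TERMINATED"},
    "FINALIZED": set(),   # terminal: data exchange may start
    "TERMINATED": set(),  # terminal: negotiation ended without agreement
}


def advance(state: str, next_state: str) -> str:
    """Move to next_state if the transition is allowed, otherwise raise ValueError."""
    if next_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {next_state}")
    return next_state


def run(path: List[str]) -> str:
    """Replay a sequence of states, starting from the first one, and return the final state."""
    state = path[0]
    for next_state in path[1:]:
        state = advance(state, next_state)
    return state


final_state = run(["REQUESTED", "OFFERED", "ACCEPTED", "AGREED", "VERIFIED", "FINALIZED"])
```

Only once the `FINALIZED` state is reached does data flow from the Data Exchange Component to the SLC, which matches the ordering described for arrow 7.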

Data exchange interactions enable the actual sharing of data between participants once agreements are in place. The Data Space Connector is a crucial component when sharing data through a data space. For this component, we suggest the Dataspace Protocol. It is a specification designed to facilitate interoperable data sharing between entities governed by usage control and based on Web technologies. This specification defines the schemas and protocols required for entities to publish data, negotiate agreements, and access data as part of a data space.

Publication and management interactions support the exposure and maintenance of shared artefacts. Data Model Publication, Rulebook Publication, and Observability Service are the related components for these interactions. For these components, we suggest an HTTP API. The HTTP methods are enough to make data models, rulebooks, and provenance and traceability data queryable (GET) and updatable (POST, PUT, and DELETE).

Trust and identity interactions establish trust between participants. Trust Services is a crucial component for these interactions. For this component, we suggest W3C Verifiable Credentials (VC), W3C Decentralized Identifiers (DIDs), OpenID for Verifiable Credentials (OIDC4VC), and Eclipse Dataspace Decentralized Claims Protocol (DCP). VC provides an interoperable model for expressing digital credentials. We can use it for attestations in a machine-readable format. DIDs are globally unique and verifiable identifiers. They allow organisations, natural persons, and machines to be securely identified across domains. OIDC4VC and DCP are two complementary protocols for credential exchange. OIDC4VC extends OpenID Connect and OAuth2 flows, enabling secure and interoperable credential issuance and presentation in line with existing identity and access management practices. DCP, developed within the Eclipse Dataspace Working Group, provides a governance-aware protocol overlay for exchanging and verifying claims without relying on centralised intermediaries.
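One check a Trust Service might perform is whether a credential's issuer appears in the list of accredited trust anchors and has not been revoked. The sketch below shows only that check on a credential shaped after the W3C Verifiable Credentials data model. It performs no cryptographic proof verification, and the credential type `ParticipantCredential`, the `role` claim, the DIDs, and the trust anchor list format are all hypothetical.

```python
from typing import Dict

credential = {
    "@context": ["https://www.w3.org/ns/credentials/v2"],
    "type": ["VerifiableCredential", "ParticipantCredential"],  # second type is invented
    "issuer": "did:web:issuer.example",
    "validFrom": "2025-01-01T00:00:00Z",
    "credentialSubject": {"id": "did:web:ldt.example", "role": "data-provider"},
}

# Hypothetical listing of trust anchors, including revoked ones.
TRUST_ANCHORS: Dict[str, Dict[str, bool]] = {
    "did:web:issuer.example": {"revoked": False},
    "did:web:old-issuer.example": {"revoked": True},
}


def issuer_is_trusted(credential: Dict[str, object], anchors: Dict[str, Dict[str, bool]]) -> bool:
    """Return True if the issuer is a listed, non-revoked trust anchor."""
    issuer = credential["issuer"]
    # The issuer may be given as a plain identifier or as an object with an "id".
    issuer_id = issuer["id"] if isinstance(issuer, dict) else issuer
    entry = anchors.get(issuer_id)
    return entry is not None and not entry["revoked"]


trusted = issuer_is_trusted(credential, TRUST_ANCHORS)
```

A complete verification would additionally validate the credential's proof, its validity period, and its status, none of which this sketch attempts.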

Notification interactions enable the propagation of events and updates between components. Notifications is a crucial component for these interactions. For this component, we suggest Linked Data Notifications (LDN) and Common Alerting Protocol (CAP). LDN is a protocol that describes how servers (receivers) can have messages pushed to them by applications (senders), as well as how other applications (consumers) may retrieve those messages. Messages are expressed in RDF and can contain any data. CAP is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. CAP increases warning effectiveness and simplifies the task of activating a warning for responsible officials. It allows LDTs to deliver notifications to end-users in a standardised manner.
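As an illustration of the CAP format, the sketch below builds a minimal alert with Python's standard XML library. The element names and enumerated values follow our reading of CAP 1.2, while the identifier, sender, event text, and the `cap_alert` helper are invented for illustration. The status is set to `Exercise` to make clear that this is not a real alert.

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"


def cap_alert(identifier: str, sender: str, sent: str, event: str, headline: str) -> str:
    """Build a minimal CAP alert with one info block and return it as an XML string."""
    ET.register_namespace("", CAP_NS)  # serialise CAP as the default namespace
    alert = ET.Element(f"{{{CAP_NS}}}alert")

    def add(parent: ET.Element, tag: str, text: str) -> ET.Element:
        child = ET.SubElement(parent, f"{{{CAP_NS}}}{tag}")
        child.text = text
        return child

    add(alert, "identifier", identifier)
    add(alert, "sender", sender)
    add(alert, "sent", sent)
    add(alert, "status", "Exercise")
    add(alert, "msgType", "Alert")
    add(alert, "scope", "Public")

    info = ET.SubElement(alert, f"{{{CAP_NS}}}info")
    add(info, "category", "Env")
    add(info, "event", event)
    add(info, "urgency", "Expected")
    add(info, "severity", "Moderate")
    add(info, "certainty", "Likely")
    add(info, "headline", headline)
    return ET.tostring(alert, encoding="unicode")


xml_text = cap_alert(
    "ldt-example-2025-0001",          # invented identifier
    "alerts@ldt.example",             # invented sender
    "2025-01-01T10:00:00+00:00",
    "Air quality",
    "Elevated PM10 levels expected in the city centre",
)
```

An LDN-based notification would instead carry an RDF payload that a sender posts to a receiver's inbox; that variant is not shown here.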

Note that some interactions are not standardised, such as the publication of Trust Anchors and Trust Service Providers and AI components. However, there are efforts that we can build on, such as the Gaia-X Registry, the EBSI Trusted Issuers Registry, and the Open Inference Protocol.

5. Workflows using reference architectures

5.1. Adding a new Surface-level component to an existing LDT

5.2. Discovering which Surface-level components work on an existing LDT

6. How to build an LDT

6.1. Explore and validate

6.2. Define and implement

6.2.1. How to build a Data Acquisition component

6.2.2. How to build a Data Management component

6.2.3. How to build a Connectivity component

6.2.4. How to build Decision-Making components

6.2.5. How to build Analysis components

6.2.6. How to build Visualisations