MIM2 - Data Models
OASC MIM2: Data models
Description
Definitions
Entities: humans, places and things in the real world.
Objectives
To support cities and communities in using consistent, machine-understandable definitions of all the entities about which data is captured in a data ecosystem, along with a consistent set of identifiers for individual instances of each entity, so that data about any entity, or any instance of it, can be combined with other data in the confidence that both refer to the same thing.
Capabilities and Requirements
C1: All entities included in data sources are described using consistent data models to enable interoperability for applications and systems.
R1. Data models used for all entities in any data source shall be made explicit.
R2a. Data models used shall be based (wherever possible) on commonly recognised sets of machine-understandable standardised data models as listed in the section below.
R2b. Where it is not possible to use existing standardised data models, efforts shall be made to extend existing standardised data models that are most closely aligned or to define new ones, following best practice and conventions of the community or organisation defining the data models.
R3. Data models used shall support the exchange of data via the context management API (MIM1).
C2: All data sources in a data ecosystem use consistent identifiers for individual instances of each entity.
R4. The type of identifiers used for entities shall be made explicit.
R5. Unique and persistent identifiers shall be used to identify particular instances of any entity used in data sets.
Notes
Note for R2a: This is to ensure that translation engines can help align data models coming from different sources within a city/community data ecosystem. See also SEMIC Style Guide.
Note for R2b: See also SEMIC Style Guide.
Note for R5: It is recommended that identifiers follow the W3C Data on the Web Best Practices or the requirements of the INSPIRE Directive data specifications, chapter 14, Identifier management.
Mechanisms
Requirements mapping to mechanisms
R1. Data models used for all entities in any data source shall be made explicit.
R2a. Data models used shall be based (wherever possible) on commonly recognised sets of machine-understandable standardised data models as listed in the section below.
A set of suitable mechanisms is provided in the standardised sets of data models below.
R2b. Where it is not possible to use existing standardised data models, efforts shall be made to extend existing standardised data models that are most closely aligned or to define new ones, following best practice and conventions of the community or organisation defining the data models.
TBD
R3. Data models used shall support the exchange of data via the context management API (MIM1).
TBD
R4. The type of identifiers used for entities shall be made explicit.
TBD
R5. Unique and persistent identifiers shall be used to identify particular instances of any entity used in data sets.
TBD
Standardised sets of data models
The following list provides the recommended standardised sets of data models for MIM2. The list will be extended as new and suitable sets are identified.
ISO/IEC JTC1 is developing the ISO/IEC 5087 series of standards on City Data Model, of which Part 1: Foundation level concepts (ISO/IEC 5087-1:2023) has already been published and Part 2: City level concepts (ISO/IEC DIS 5087-2) is at the final draft stage.
NGSI-LD compliant data models for aspects of the smart city have been defined by organisations and projects including OASC, FIWARE, GSMA and the SynchroniCity project. There is also an ongoing joint activity of TM Forum and FIWARE to specify more under the Smart Data Models initiative.
The oneM2M base ontology, which is compatible with SAREF. Additionally, oneM2M provides the means to instantiate ontologies in order to provide semantic descriptions of the data exchanged (through the use of metadata).
SAREF: The Smart Applications REFerence (SAREF) ontology, specified by the ETSI oneM2M committee, together with its SAREF4Cities extension, provides an ontology focused on smart cities.
The Core Vocabularies of the former ISA2 programme (now Interoperable Europe), such as Core Person, Core Organization and the Core Public Service Vocabulary Application Profile. The latter is used as the basis for the Single Digital Gateway Regulation, which affects local governments.
DTDL: The Digital Twins Definition Language (DTDL), developed by Microsoft. The language is based on JSON-LD, and the existing FIWARE data models have been converted to this format.
For spatial (and spatio-temporal) observation data, the provisions of MIM7 (Places) about data encoding have to be taken into consideration.
Notes
Note to 2: The initiative mentioned above provides a standardised way of developing new data models where no suitable model exists, and thus provides an appropriate mechanism for requirement R2b above.
Note to 2: Existing data models and ontologies, e.g. the SAREF (Smart Applications REFerence ontology) standard by ETSI/oneM2M, can be mapped for use with NGSI-LD by identifying which of their elements are entities, properties and relationships. These can then be managed and requested through the NGSI-LD API.
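The mapping described in the note above can be illustrated with a minimal sketch. It assumes a hypothetical temperature sensor record; the function name `to_ngsi_ld`, the type names, the attribute names, the identifiers and the `@context` URL are all invented for illustration and are not taken from SAREF or from any published data model. Only the general entity shape follows NGSI-LD: an `id`, a `type`, and attributes typed either `Property` (with a `value`) or `Relationship` (with an `object`).

```python
import json

# Hypothetical source record, e.g. derived from an ontology-based description.
record = {
    "sensor_id": "sensor-001",
    "temperature_celsius": 21.5,
    "installed_in": "building-42",
}


def to_ngsi_ld(rec):
    """Map a flat record to an NGSI-LD style entity (illustrative only)."""
    return {
        # The thing being described becomes the entity.
        "id": f"urn:ngsi-ld:TemperatureSensor:{rec['sensor_id']}",
        "type": "TemperatureSensor",
        # A measured characteristic becomes a Property.
        "temperature": {
            "type": "Property",
            "value": rec["temperature_celsius"],
            "unitCode": "CEL",  # UN/CEFACT code for degrees Celsius
        },
        # A link to another entity becomes a Relationship.
        "locatedIn": {
            "type": "Relationship",
            "object": f"urn:ngsi-ld:Building:{rec['installed_in']}",
        },
        # Placeholder context URL, assumed for this example.
        "@context": ["https://example.org/context.jsonld"],
    }


entity = to_ngsi_ld(record)
print(json.dumps(entity, indent=2))
```

In a real deployment the type and attribute names would come from the chosen standardised data model rather than being defined ad hoc as they are here.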
Interoperability guidance
This section describes how interoperability between different implementations can be realised, for instance by making use of Pivotal Points of Interoperability (PPIs) such as GeoJSON.
Interoperability Guidance Option 1
One issue with interoperability between semantic and non-semantic data models is that semantic models require all instances to have a unique and persistent identifier. Identifiers in a non-semantic setting can use different identification schemes.
One way of turning non-semantic identifiers, such as DOIs, into semantic ones is to prefix them with a base URI. This approach requires setting up a “resolver” service, which generates a URI for each entity and resolves it to a page (ideally a semantic document) that provides more information about the entity and allows it to be linked to others.
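The prefixing approach can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the base URI `https://id.example.org/entity/`, the function names and the in-memory `REGISTRY` are all hypothetical, and a real resolver would be a web service backed by a persistent store.

```python
from urllib.parse import quote, unquote

# Hypothetical base URI under which the resolver service would be hosted.
BASE_URI = "https://id.example.org/entity/"

# Hypothetical in-memory stand-in for the resolver's data store.
REGISTRY = {}


def mint_uri(local_id):
    """Turn a non-semantic identifier into a URI by prefixing it."""
    # Percent-encode every reserved character, including "/", so the
    # identifier stays a single path segment.
    return BASE_URI + quote(local_id, safe="")


def extract_local_id(uri):
    """Recover the original identifier from a minted URI."""
    if not uri.startswith(BASE_URI):
        raise ValueError(f"not a URI minted by this resolver: {uri}")
    return unquote(uri[len(BASE_URI):])


def register(local_id, description):
    """Mint a URI for an entity and store its description."""
    uri = mint_uri(local_id)
    REGISTRY[uri] = description
    return uri


def resolve(uri):
    """Return the stored description for a URI, or None if unknown."""
    return REGISTRY.get(uri)


# Example with a made-up DOI-like identifier.
uri = register("10.1234/abcd.5678", {"type": "Dataset", "title": "Example"})
print(uri)
print(resolve(uri))
```

Because the mapping is a pure prefix plus encoding, the same identifier always yields the same URI, which is what keeps the resulting identifiers persistent as long as the base URI itself is kept stable.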
Interoperability Guidance Option 2
Another major challenge in using data models is the abundance of existing models, which may describe the same or similar types of information but are not aligned with one another.
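One simple form of the translation mentioned in the note for R2a is an explicit attribute mapping between two models. The sketch below assumes two hypothetical models: a source model with attributes `temp_f` and `name`, and a target model with `temperature` (in degrees Celsius) and `label`. The names, the `MAPPING` table and the `translate` function are illustrative assumptions only; real alignment between published models usually needs more than renaming and unit conversion.

```python
def fahrenheit_to_celsius(value):
    """Convert degrees Fahrenheit to degrees Celsius, rounded to 2 decimals."""
    return round((value - 32) * 5 / 9, 2)


# Hypothetical mapping: source attribute -> (target attribute, converter or None).
MAPPING = {
    "temp_f": ("temperature", fahrenheit_to_celsius),
    "name": ("label", None),
}


def translate(record, mapping):
    """Translate a record from the source model to the target model.

    Returns the translated record and the list of source attributes
    for which no mapping exists, so that gaps are visible rather than
    silently dropped.
    """
    translated = {}
    unmapped = []
    for key, value in record.items():
        if key not in mapping:
            unmapped.append(key)
            continue
        target, convert = mapping[key]
        translated[target] = convert(value) if convert else value
    return translated, unmapped


source = {"temp_f": 68.0, "name": "Station A", "owner": "Parks department"}
translated, unmapped = translate(source, MAPPING)
print(translated)
print(unmapped)
```

Reporting the unmapped attributes separately makes it easier to see where two models genuinely diverge and where an extension of a standardised model may be needed.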
Conformance and compliance testing
To perform conformance testing, we highly recommend the use of the Interoperability Test Bed, which can be tailored to the user’s needs.
...