§ Data Disclosure Agreement (DDA) Specification
Specification Status: version 1.0.0
This is reviewed and implementation has started. This spec is live and is being iterated as part of the PS-SDA project in NGI-ONTOCHAIN. The project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 957338.
Latest Draft: Available here
Editors:
- Mr. George Padayatti (iGrant.io, Sweden)
- Mr. Jan Linquist (Linaltec, Sweden)
- Mr. Lal Chandran (iGrant.io, Sweden)
Contributors and Reviewers:
Participate:
§ Abstract
A Data Disclosure Agreement (DDA) enables automated agreement handling for data exchange between a Data Source (DS) and Data Using Service (DUS). It helps organisations to continue leveraging their data assets while being transparent and legitimate in their data usage. Automated agreement handling is a requisite for a scalable and regulatory-compliant data marketplace… It also provides individuals control over how their data is used and exchanged.
§ Introduction
Data is the critical currency of an advanced digital economy, and trust is fundamental for continuous access to personal data. An adequate governance framework is essential to build the requisite trust and must involve, at the very least, the following actors:
- Individuals, who can manage their preferences and follow their data, should know who is consuming what, when and why.
- Organisations, which are either a DS or a DUS, should be able to leverage personal data and gain access to the right quality data, provided that:
  - they offer adequate transparency for individuals to trust them.
  - their data usage is compliant with the relevant data protection and privacy regulations, which they can prove if requested.
- Auditors, who can independently prove fair data usage via independent audit mechanisms, can be used by organisations and individuals to verify the legitimacy of their claims in any legal dispute concerning the use or misuse of data.
Any organisation undergoing digital transformation needs to ensure that it can cater to the above, so individuals continue to say “yes” to sharing their data. Initiatives like MyData Operator and the proposed EU Data Governance Act point to the need for data intermediaries to enable the above for individuals and businesses.
Data exchange agreements aim to enhance data governance to increase transparency and authenticity as critical elements for digital trust. This specification describes how the key actors above can capture, view or disclose the provenance trail of a personal data exchange transaction, starting with creating data disclosure agreements during any such data exchanges. It also specifies how a DS can define rules for data processing (to demonstrate regulatory compliance etc.) in a data exchange transaction.
§ Data intermediaries enabling data exchange
iGrant.io is a data exchange platform that helps organisations access personal data in a sustainable and human-centric manner. Using iGrant.io’s data exchange services, organisations gain access to verifiable, auditable and data regulatory compliant personal data. Every data exchange transaction has an associated DA that records conditions for an organisation to process personal data in accordance with data regulations, such as the GDPR and the Data Governance Act, as illustrated in Figure 1 below.
Figure 1. A data exchange ecosystem using a data intermediary
§ Data exchange agreement landscape
In a data exchange ecosystem, there are a number of agreements that are required to legally validate data exchanges. This chapter introduces various Data Exchange Agreements (DEXA) and the relationships that exist between organisations and individuals, depending on their roles in different personal data usage scenarios. The various agreements involved can be classified into four broad categories as shown in Figure 2 below. These are agreements between:
- An individual and an organisation,
- Two organisations (DS and DUS),
- An organisation and its supplier and
- Two individuals
Figure 2. Data exchange agreement landscape
§ Data Agreement (DA) or Personal Data Agreement
This is an agreement between an organisation and an individual when it comes to the use and processing of personal data. A data agreement can have any legal basis as outlined by the relevant data protection regulation. The agreement can be with a DS (issuer) or a DUS (verifier) and can also be used for personal data exchange with third parties.
Today, the DA is implemented via a W3C specified Decentralised Identifier (DID) DID:mydata. It records the conditions for an organisation to process personal data in accordance with data protection regulations. Regulations could be data laws or they could be norms such as the MyData principles.
The key characteristics of a DA are as follows:
- It is associated with any personal data usage including data exchange
- It has the ability to rely on an individual’s consent or other lawful basis such as contract, legal obligation, vital interests, public task and legitimate interests by outlining the purpose for which personal data is to be processed
- It is tied to a data protection impact assessment (DPIA) that further strengthens legal compliance for the organisation. iGrant.io automates the conversion of the results of a DPIA to a machine-readable DA
- It is standardised via ISO/IEC JTC 1/SC 27 Information security, cybersecurity and privacy protection WG5: 27560 (working draft)
The key commercial values enabled by a DA are described below:
Data regulatory compliance: A DA based on a DPIA provides reassurance that the organisation has the intent to exchange data in compliance with a jurisdiction appropriate data protection regulation.
Transparency: A DA provides the requisite transparency to a data subject on how personal data is to be used by an organisation, especially if exchanged with third parties.
Auditability: With a DA, a DS can prove its legitimate right to collect and share data with a DUS via a digital token-based verification system. Similarly, an individual can dispute data usage for which no legitimacy can be proven using the signed DA.
§ Data Disclosure Agreement (DDA)
A Data Disclosure Agreement (DDA) exists between two organisations where one organisation acts as a DS and the other as a DUS. The DDA captures how data is shared between the two organisations and what role and obligation each party has, as either a data processor and/or a data controller. For any organisation involved in the data exchange, there is an associated DA that explains the purpose of processing personal data, what personal data is collected, what the data subject rights are, etc. Where both organisations are data controllers, the individual (data subject) has a signed DA with both.
§ Data Processing Agreement (DPA)
The third form of an agreement exists between an organisation and its suppliers, as illustrated in Figure 2. Here, there is a vertical relationship between Organisation A as a data controller and its supplier as a data processor or sub-processor. For a higher level of accountability between these organisations, a DPA is set up, which lays out what routines are required to be in place: for example, a data processor’s obligations in case of a data breach or how the rights of the individual, such as access rights, are supported, among other policies and routines. An auditor should also be able to inspect the organisation and use the DPA as reference material during the inspection. As depicted in Figure 2, the DPA is connected to the individual at the top of the hierarchy via the data controller organisation.
The concept of a DDA can be extended to include the DPA as well. This, however, is not within the scope of this project. For details on the DPA and its content please refer to Appendix C.
§ Delegation Agreement
The delegation agreement is included to complete the data exchange ecosystem. A delegate may act on behalf of an individual in signing off any data exchange. Delegation is necessary in several scenarios: for example, under guardianship, when an individual is not capable of signing off, or when a person is given temporary rights to sign off on behalf of the individual, such as when purchasing medicine at a pharmacy.
§ Data provenance
W3C defines provenance as the information about entities, activities, and agents involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness.
Figure 3. Data provenance terminology
There are different models that can be used to achieve data provenance. This includes the Open Provenance Model (OPM) and W3C PROV. Both these models have basic entity, activity and agent (people) components. The difference comes with W3C PROV providing additional terms to help with explaining the details of activity through the usage of “plans”. These plans set the details of the execution of an activity.
In the context of a personal data exchange transaction, the agents are the actors involved in the transaction, such as the individuals (or data subjects) and organisations (DS and DUS), and the entities are the data being exchanged. The activities include create/read/update/delete (CRUD) operations on the agreements (DA and DDA) and will be further elaborated in this document. The model in Figure 3 is taken from the W3C PROV Primer to illustrate a high-level overview.
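The mapping of agents, entities and activities onto a data exchange can be sketched as a small data model. This is a minimal illustration only: the class names, the role labels and the `did:example:` identifiers are assumptions made for this sketch and are not defined by W3C PROV or by this specification.

```python
from dataclasses import dataclass, field
from typing import List

CRUD_OPERATIONS = {"create", "read", "update", "delete"}


@dataclass(frozen=True)
class Agent:
    """An actor in the exchange, e.g. an individual, a DS or a DUS."""
    identifier: str
    role: str  # "individual", "DS" or "DUS" (labels assumed for this sketch)


@dataclass(frozen=True)
class Entity:
    """The thing acted upon, e.g. a DA, a DDA or the exchanged data."""
    identifier: str
    kind: str


@dataclass(frozen=True)
class Activity:
    """A CRUD operation performed by an agent on an entity."""
    operation: str
    agent: Agent
    entity: Entity


@dataclass
class ProvenanceTrail:
    activities: List[Activity] = field(default_factory=list)

    def record(self, operation: str, agent: Agent, entity: Entity) -> Activity:
        """Append one activity to the trail after checking the operation name."""
        if operation not in CRUD_OPERATIONS:
            raise ValueError(f"unsupported operation: {operation}")
        activity = Activity(operation, agent, entity)
        self.activities.append(activity)
        return activity

    def by_agent(self, identifier: str) -> List[Activity]:
        """Return all activities performed by the given agent."""
        return [a for a in self.activities if a.agent.identifier == identifier]
```

For example, a DS creating a DDA and a DUS later updating it would be recorded as two activities on the same entity, and `by_agent` would return only the activities of the requested actor.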
In a personal data transaction, provenance provides a critical foundation for assessing authenticity, enabling trust, and allowing reproducibility and the re-use of personal data. Once this is achieved, assertions could be made with regard to the use of contextual metadata (e.g. events in data agreements and data disclosure agreements) and can themselves become important records with their own provenance. Once provenance metadata is collected it is possible to check claims that are being made in the records.
§ Data Exchange Agreement overview
Figure 4 below illustrates a typical scenario where an organisation, Org A, uses a DA to share data externally with Org B and Org C. Individual instances of the DA are signed by individuals X and Y. A DDA is used to govern the data exchange between Org A and Orgs B and C.
Figure 4. Data exchange and provenance scenarios
§ DEXA High-level workflow
As described in the background section, iGrant.io provides a decentralised data exchange service based on self-sovereign identity (SSI), OpenID Connect and OAuth protocols. The service enables organisations to exchange data in a transparent, secure and privacy-centric manner using verifiable data exchange agreements. The received data can be verified using SSI and X.509-based signatures. A DEXA workflow involves the following stages as illustrated in Figure 5:
Figure 5. DA and DDA workflow
These different phases are explained below:
Definition: An existing agreement template is adopted as is or new ones are formulated in this phase. The template could be based on a particular industry and/or sector-specific practice. This can then be used by any organisation (DS or DUS) for a particular data usage purpose, in our case for enabling third party data exchange.
Preparation: An organisation creates an agreement based on a DPIA or similar and shares it with the relevant parties. For a DA, the relevant counterparty is the individual while for the DDA, it is a DUS.
Capture: The counterparty signs the agreement in this phase. For a DDA, it is countersigned by the DUS while for a DA it’s the individual.
Proof: Any organisation or individual is able to demonstrate that an agreement exists between the parties. Independent auditors can also check that records are in place proving that the individual’s personal data may be processed (e.g. as per GDPR Article 30).
§ DEXA workflow enabling data provenance
The four phases described above are further elaborated in Figure 6, which explains how the DAs and DDAs are interlinked in the DEXA workflow. To ensure their compliance with data regulations, each organisation is encouraged to perform a privacy risk assessment or DPIA and ensure that risk mitigation measures are in place before collecting and processing personal data.
Figure 6: DA and DDA interlink within DEXA workflow
When a DDA is signed and personal data has been exchanged, the DS is not liable for the DUS’s use of personal data. However, the obligation to monitor the individual’s consent to share data with a third party, in this case, the DUS, remains with the DS. The DUS is obliged to adhere to the terms laid out in the DDA. If the DUS does not adhere to the DDA, the DS can, upon finding out about the infringement, withdraw the DDA and consequently revoke all the DAs associated with the DDA.
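The withdrawal rule above (withdrawing a DDA revokes every DA associated with it) can be sketched as follows. This is a minimal in-memory illustration; the `AgreementRegistry` class, its method names and the identifiers are hypothetical and do not describe the actual DA or DDA registry.

```python
from typing import Dict, List, Set


class AgreementRegistry:
    """In-memory registry linking DAs to the DDA they depend on (illustrative only)."""

    def __init__(self) -> None:
        self._das_by_dda: Dict[str, Set[str]] = {}
        self._revoked_das: Set[str] = set()
        self._withdrawn_ddas: Set[str] = set()

    def link(self, dda_id: str, da_id: str) -> None:
        """Record that a DA was issued on the basis of a DDA."""
        self._das_by_dda.setdefault(dda_id, set()).add(da_id)

    def withdraw_dda(self, dda_id: str) -> List[str]:
        """Withdraw a DDA and revoke all DAs associated with it.

        Returns the identifiers of the DAs that were revoked.
        """
        self._withdrawn_ddas.add(dda_id)
        affected = sorted(self._das_by_dda.get(dda_id, set()))
        self._revoked_das.update(affected)
        return affected

    def is_dda_active(self, dda_id: str) -> bool:
        return dda_id not in self._withdrawn_ddas

    def is_da_active(self, da_id: str) -> bool:
        return da_id not in self._revoked_das
```

DAs linked to other DDAs are unaffected by a withdrawal, which keeps the effect of an infringement by one DUS limited to the agreements that involve it.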
Throughout the DA lifecycle, cryptographic proofs are generated detailing who created the DA during the preparation phase and who signed it during the capture phase [2]. This is based on a W3C (DID:mydata) specification [4] and realised via a Data Agreement DIDComm protocol [5] implementation.
The key activities and actors involved in each phase are summarised in the table below:
# | Activity | Phase | Actors |
Assessment | | | |
1.0 | Perform DPIA or similar privacy assessment (external or self-assessment). Output: DA template consisting of usage purpose, data attributes etc. as per the DA schema. | Definition and preparation | Assessor, DS, DUS |
For DS: Lawfully expose data and demonstrate provenance | | | |
2.1 | If the organisation wishes to expose personal data to a third party, it creates the DDA offer and, if required, publishes it to a data marketplace. Output: The DDA is available and ready for any DUS to sign. Provenance: action to create an entity (DDA) by an actor (DS). | Definition and preparation | DS |
2.2 | Add the DUS when it signs up. Output: The DDA is updated with the DUS signatory info (during the DDA capture phase). Provenance: action to update and sign off an entity (DDA) by an actor (DUS). | Capture | DUS |
2.3 | Any interested party can check the signed DDA proof of compliant data usage. Provenance: query to view provenance metadata. | Proof | Auditor, Individual, DS or DUS or any 3PP |
2.4 | Automated enforcement of the DDA. Provenance: query provenance metadata to ensure the signing of the DDA is permitted. | Proof, Capture | DS, DUS |
For DUS: Lawfully consume data and demonstrate provenance | | | |
3.1 | If the organisation (DUS) wishes to consume data, it checks the available DDA offers in the marketplace. | Definition and preparation | DUS |
3.2 | The organisation (DUS) signs up (same as step 2.2). | Capture | DUS |
3.3 | The DUS updates and registers the DA based on the DDA. | Definition and preparation | DUS |
3.4 | Any interested party can check the signed DA proof of compliant data usage. | Proof | Auditor, Individual, DS or DUS or any 3PP |
§ Publish and Sign DDA sequence diagram
Figure 7: DDA - Publish and sign sequence
§ Data Provenance
Appendix D in this specification provides a detailed ontology analysis of what needs to be captured in a provenance trail. A DDA schema (Ref. chapter Ontologies) is proposed and the existing DA schema has been updated to include the key elements required to establish data provenance when making the data available for third parties. By combining information from the DA and the DDA, a provenance trail can be generated to provide reassurance about the legitimacy of owning, trading or controlling copies of the data. With the usage of digital signatures, provenance trails are tamper-evident logs, which means you can cryptographically prove that data hasn’t been unexpectedly changed. A verifiable provenance trail helps in realising the accurate, immutable and verifiable history of activity and entity.
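To illustrate how a combined DA and DDA trail can be made tamper-evident, the sketch below chains events with SHA-256 hashes. Note that this swaps in a plain hash chain for the digital signatures described above, purely to keep the example self-contained; the `source` key and the function names are assumptions of this sketch, while `time-stamp` and `state` follow the event attributes in the Ontologies chapter.

```python
import hashlib
import json
from typing import Any, Dict, List


def _digest(event: Dict[str, Any], previous_hash: str) -> str:
    """Hash an event together with the hash of the preceding trail entry."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + payload).encode("utf-8")).hexdigest()


def build_trail(events: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Order DA and DDA events chronologically and chain them by hash.

    Assumes all timestamps are ISO 8601 UTC strings in the same format,
    so that lexicographic order equals chronological order.
    """
    trail: List[Dict[str, Any]] = []
    previous = ""
    for event in sorted(events, key=lambda e: e["time-stamp"]):
        entry_hash = _digest(event, previous)
        trail.append({"event": event, "hash": entry_hash})
        previous = entry_hash
    return trail


def verify_trail(trail: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any altered event breaks the chain."""
    previous = ""
    for entry in trail:
        if _digest(entry["event"], previous) != entry["hash"]:
            return False
        previous = entry["hash"]
    return True
```

Because each entry's hash depends on all earlier entries, changing any recorded event makes `verify_trail` fail from that point onward. A production trail would additionally carry the signatures of the acting parties.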
Scenario 01: Here DS and DUS do not need individual identity proof before the data exchange e.g. in the case of a DUS using anonymous or pseudonymous data sharing to offer personalised services.
Figure 8 elaborates on how the provenance is captured during a data exchange transaction in Scenario 01 where there is no need for identity proof (E.g. Anonymised data exchange).
Figure 8: Scenario 01: DA, DDA workflows without ID proof (E.g. anonymised data exchange)
Scenario 02: DS and DUS require individual identity proof before the data exchange. This is the case, for example, during a registration process, check-ins, covid-credential exchange etc. This is illustrated in Figure 9.
Figure 9. Scenario 02: DA, DDA workflows with ID proof
§ On-chain and off-chain agreements for provenance trail
A DDA can be realised as a smart contract adding the key advantages of using a blockchain. The key considerations for the design of the PS-SDA blockchain-based solution are:
- Authenticity - Enabling verified agreements source and ownership
- Scalability - Can store large amounts of data at low-cost
- Proof of membership - Makes it easy to prove if a data disclosure agreement transaction occurred or not, e.g. signing, counter signing, data exchange without revealing PII data.
- Tamper resistance - Allows checking that data hasn’t been altered
- Availability of transaction data - Allows for immediate access to any transaction. The confirmation times [12] of mainstream blockchains are slow: e.g. with the proof-of-work consensus mechanism, confirmation takes 10 minutes in Bitcoin and 2 minutes in Ethereum.
During a DDA capture process, the signing transaction is registered on the chain to enable provenance. The solution will leverage features offered by ONTOCHAIN to develop and deploy a smart contract to facilitate seamless data exchanges between two organisations.
§ Publish to the marketplace via a DLT
During the creation of the data disclosure agreement, the DS creates a token of ownership for the DDA offer and registers it to the blockchain. The DDA offer is stored in the InterPlanetary File System (IPFS). The signing process is facilitated by a data intermediary service, such as iGrant.io, which acts as a notary and registers the signed DDA evidence (hash) to the blockchain.
Figure 10. Publishing of DDAs by DS in the blockchain ecosystem
The DS and the DUS do not need to understand the underlying witness system and cost mechanisms of using a DLT, e.g. Ethereum, Bitcoin etc. The use of a data intermediary service also reduces the costs associated with the footprint on the blockchain and solves the key issue of trust, e.g. whose signed DDA do you trust?
The transaction data is on-chain as per chapter “Ontologies/ On-Chain Data”, with the ownership of the DDA and event signature needed for data provenance. The DDA offer is stored in the IPFS.
§ Countersigning of DDA offers by the DUS via a DLT
Any DUS can view what data DSs are offering to share, negotiate new terms and eventually sign a DDA and use the data based on the terms set in the agreement as shown below.
Figure 11. Countersigning of DDA by DUS in the blockchain ecosystem
§ Off-chain and On-chain handling of data disclosure agreements
As illustrated in Figure 11 above, once the DDA is prepared, the DDA offer (by the DS) is published in the marketplace by the data intermediary. The data intermediary awaits the countersigned DDA from the DUS(s), so that the DDA is signed by both DS and DUS, before anchoring it to the DLT. The detailed mechanism involved in anchoring (via a smart contract) to the underlying witness system (e.g. DLTs like blockchain, hash graphs), taking into account the key considerations (in chapter 3.2), is pictured in Figure 12 below.
Figure 12. Anchoring DDA transactions to a witness system, e.g. DLT
The DDAs are always stored in local storage, a copy of which is available with the signing parties and the data intermediary. The hash of the DDA offer (Hash (1)) forms the bottom leaf of the Merkle tree. Hash (2) represents the hash of the signed DDA when it is published to the data marketplace while Hash (3) represents the countersigned DDA. At a given point in time, the root hash from the Merkle tree is added to the DLT.
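The anchoring step can be sketched by computing a Merkle root over the three hashes described above. This is an illustration under stated assumptions: SHA-256 as the hash function, hex-string concatenation of child hashes, and duplication of the last node on odd-sized levels are choices made for this sketch, not requirements of the specification.

```python
import hashlib
from typing import List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(leaves: List[str]) -> str:
    """Compute a Merkle root from a list of hex-encoded leaf hashes.

    When a level has an odd number of nodes, the last node is paired
    with itself (an assumed convention for this sketch).
    """
    if not leaves:
        raise ValueError("at least one leaf hash is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("ascii"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Placeholder documents standing in for the three stages of a DDA.
hash_1 = sha256_hex(b"DDA offer")          # Hash (1): DDA offer
hash_2 = sha256_hex(b"signed DDA")         # Hash (2): DDA signed by the DS
hash_3 = sha256_hex(b"countersigned DDA")  # Hash (3): DDA countersigned by the DUS

root = merkle_root([hash_1, hash_2, hash_3])
```

Only `root` would need to be written to the DLT. Anyone holding a copy of the DDA documents can recompute the leaves and the root, and a change to any one document yields a different root.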
§ Handling release of personal data
When the actual exchange of data happens from the DS to the DUS, an event and a corresponding proof are appended to the Data Disclosure Agreement. This provides data exchange provenance to any interested parties.
§ User stories and use case analysis
§ Individual
Use case | Description |
View personal data flows | View the data provenance trail on what happens to personal data. This can be fetched from the DDA events with proof coming from the blockchain. |
View shared organisations | View the DUS’ with which a DS has shared data. This can be fetched from the DA and DDA with proof acquired via the blockchain. |
View DDAs | View the DDA offer from the DS with the usage of personal data. |
View DAs | View the DAs in all organisations with whom the personal data has been shared. |
§ Organisation
Use case | Description |
CRUD DDAs | The organisation administrator performs CRUD for DDAs. |
View personal data flows | The organisation administrator can view the data provenance trail with proof of each data exchange transaction. |
Share proofs of DAs | The organisation administrator can share the DA and DA record with an external party, e.g. a third-party auditor. |
Share proofs of DDAs | The organisation administrator can share signed DDA and DDA records (on and off chain) with an external party, e.g. a third-party auditor. |
Automated enforcement of DDAs | The organisation's (DUS) IT system can read the DDA and automatically enforce it before being able to process and use personal data. |
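The automated enforcement use case can be sketched as a check that a DUS system runs before processing personal data. The attribute names (`event`, `state`, `time-stamp`, `data_using_service`, `did`, `usage_purposes`) follow the DDA schema in the Ontologies chapter. The function name, the assumption that `usage_purposes` is a list, and the specific rules applied are illustrative assumptions, not the normative enforcement logic.

```python
from typing import Any, Dict


def may_process(dda: Dict[str, Any], dus_did: str, purpose: str) -> bool:
    """Return True only if the DDA currently permits this DUS and purpose.

    Rules assumed for this sketch:
      1. the most recent lifecycle event (ignoring fetch-data) is "accept";
      2. the DDA names this DUS as the data using service;
      3. the requested purpose is among the agreed usage purposes.
    """
    lifecycle = [e for e in dda.get("event", []) if e["state"] != "fetch-data"]
    if not lifecycle:
        return False
    # ISO 8601 UTC timestamps in a uniform format sort chronologically as strings.
    latest = max(lifecycle, key=lambda e: e["time-stamp"])
    if latest["state"] != "accept":
        return False
    dus = dda.get("data_using_service", {})
    if dus.get("did") != dus_did:
        return False
    return purpose in dus.get("usage_purposes", [])
```

A DUS would call `may_process` with its own DID and the intended purpose, and refuse to process the data when it returns `False`, for instance after the DS has terminated the agreement.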
§ Auditors
Use case | Description |
View personal data flows | Receive the complete data trail on what happens to personal data, with verifiable proof. |
View shared organisations | View the organisations with which the primary organisations have shared their data with proof |
View DDAs | View the DDAs between organisations with verifiable proof |
§ Detailed design specification
§ Architecture diagram
The DEXA functions are available as microservices that can be plugged into existing systems, such as iGrant.io SSI data exchange workflows. The core components are exposed as RESTful APIs. The components can exist independently in any service provider agreement handling system. Here, the DA registry holds the signed DA records. In PS-SDA, we will add the DDA registry, as shown in Figure 13:
Figure 13. DEXA microservice components
In Figure 13, the agreement records can be both DA and DDA records. The JSON-LD processing components, together with the DA and DDA components, all exposed as RESTful APIs, form the core of the DEXA architecture. The pluggable components could be replaced with any chosen implementation mechanism.
§ High-level software quality analysis
The project uses a DevOps process with established CI/CD practices. Quality assurance is via manual and automated tests by the development team. Validations are carried out by the product owners and signed off during each sprint.
§ APIs for SDKs
No SDKs are required for the DDA implementation. Existing DA SDKs will be reused and revised to support new flows.
§ REST APIs for Services
The key interfaces are APIs classified under the key actors listed below. An early version of the DEXA APIs is published at the iGrant.io swagger API hub.
§ Individuals
Action | Description |
GET /individuals/data-agreements | View signed DAs |
GET /individuals/data-using-service | View the shared data using services for a given organisation |
GET /individuals/data-disclosure-agreements | View the DDA(s) in an organisation |
GET /individuals/data-agreements/{data_agreement_id}/provenance_trail | Fetch provenance trail for a DA |
§ Organisations (DS and DUS)
Action | Description |
POST /organisation/data-disclosure-agreement | Create DDA offer |
GET /organisation/data-disclosure-agreement | List all published DDAs |
PUT /organisation/data-disclosure-agreement/{data_disclosure_agreement_id} | Update signed DDA by ID |
DELETE /organisation/data-disclosure-agreement/{data_disclosure_agreement_id} | Delete signed DDA by ID |
GET /organisation/data-agreements | View signed DAs (Org. copy) |
GET /organisation/data-agreements/{data_agreement_id}/provenance_trail | Fetch provenance trail for a DA |
POST /organisation/data-disclosure-agreements/{data_disclosure_agreement_id}/organisation/{organisation_id}/offer | Offer a DDA to an organisation |
POST /organisation/data-disclosure-agreements/{data_disclosure_agreement_instance_id}/accept | Accept a DDA sent by an organisation |
POST /organisation/data-disclosure-agreements/{data_disclosure_agreement_instance_id}/reject | Reject a DDA sent by an organisation |
POST /organisation/data-disclosure-agreements/{data_disclosure_agreement_instance_id}/terminate | Terminate a DDA sent by an organisation |
POST /organisation/data-agreements/{data_agreement_id}/auditor/{auditor_id}/request-verify | Request verification of a DA instance by a third party auditor |
POST /organisation/audit-requests | Query audit requests sent |
§ Auditors
Action | Description |
POST /organisation/audit-requests | Query audit requests received |
POST /auditor/audit-requests/{audit_request_id}/verify | Verify the digital signatures in DA |
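As an illustration of how a client might address these endpoints, the sketch below builds (but does not send) two requests using Python's standard library. The paths are taken from the tables above. The host `https://dexa.example.com`, the function names and the `Accept` header are assumptions of this sketch, and authentication is omitted because it is not described in this section.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical base URL; the real host is not given in this specification.
BASE_URL = "https://dexa.example.com"


def accept_dda_request(instance_id: str) -> urllib.request.Request:
    """Build a POST request that accepts a DDA sent by an organisation."""
    path = (
        "/organisation/data-disclosure-agreements/"
        f"{quote(instance_id, safe='')}/accept"
    )
    return urllib.request.Request(
        BASE_URL + path,
        data=b"",
        method="POST",
        headers={"Accept": "application/json"},
    )


def provenance_trail_request(data_agreement_id: str) -> urllib.request.Request:
    """Build a GET request that fetches the provenance trail for a DA."""
    path = (
        "/organisation/data-agreements/"
        f"{quote(data_agreement_id, safe='')}/provenance_trail"
    )
    return urllib.request.Request(
        BASE_URL + path,
        method="GET",
        headers={"Accept": "application/json"},
    )
```

The identifiers are percent-encoded before being placed in the path so that unexpected characters cannot alter the route. Passing a built request to `urllib.request.urlopen` would perform the call against a real deployment.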
§ Ontologies
§ Data Disclosure Agreement
Attribute Name | Mandatory | Description |
@context | TRUE | Defines the context of this document. E.g. the link to the JSON-LD |
id | TRUE | Identifier to the data disclosure agreement instance addressed to a specific DUS |
version | TRUE | Version number of the data disclosure agreement |
template_id | TRUE | Identifier to the DDA offer |
template_version | TRUE | Version number of the DDA offer |
language | TRUE | Language used. If not present, the default language is English |
data_controller | TRUE | Encapsulates the data controller data |
- did | TRUE | This is the DID of the data source preparing the agreement |
- name | TRUE | The name of the data source exposing the data |
- legal_id | TRUE | This is the legal ID of the data source. E.g. Swedish Organisation Number |
- url | TRUE | This is the data source organisation URL |
- industry_sector | TRUE | Industry sector that the DS belongs to |
agreement_period | TRUE | Duration of the agreement after which the data disclosure agreement expires |
data_sharing_restrictions | TRUE | Used by the DS to configure any data sharing restrictions towards the DUS. This could reuse the data agreement policy parameters as is. |
- policy_URL | TRUE | URL to the privacy policy document of the DS |
- jurisdiction | TRUE | The jurisdiction, associated with the data source exposing the personal data, whose privacy regulation is followed. This can be a country, economic union, law, location or region. [value based on W3C Location and Jurisdiction] |
- industry_sector | FALSE | The sector to which the data source restricts the use of data by any data using services. If no restriction, leave blank |
- data_retention_period | TRUE | The amount of time that the data source holds onto any personal data, in days. |
- geographic_restriction | FALSE | The country or economic union restricted from processing personal data for the data source. [value based on W3C Location and Jurisdiction] |
- storage_location | FALSE | The geographic location where the personal data is stored by the data source |
purpose | TRUE | Describes the purpose for which the data source shares personal data as described in the data agreement [values based on W3C DPV Purposes] |
purpose_description | TRUE | Additional description of the purpose for which the data source shares personal data |
lawful_basis | TRUE | Indicates the lawful basis for sharing personal data. This can be consent, legal obligation, contract, vital interest, public task or legitimate_interest. [values based on W3C DPV legal basis] |
personal_data [] | TRUE | Encapsulates the attributes shared by the data source |
- attribute_id | TRUE | Identity of the attribute that is being shared |
- attribute_name | TRUE | Name of the attribute that is being shared |
- attribute_sensitive | FALSE | Defines the sensitivity of the data as per PII |
- attribute_category | FALSE | An explicit list of personal data categories to be shared. The categories shall be defined using language meaningful to the users and consistent with the purposes of the processing. [values based on W3C DPV-DP] |
code_of_conduct | FALSE | The code of conduct followed by the data source. It supports the proper application of privacy regulation, taking into account specific features within a sector. The entry shall reference the name of the code of conduct together with a publicly accessible reference. |
data_using_service | TRUE | The data using services that have signed up for consuming data. This gets populated after the data disclosure agreement is proposed by the data using service |
- did | TRUE | This is the DID of the data using service signing the agreement |
- name | TRUE | Name of the DUS signing the agreement |
- legal_id | TRUE | The legal ID of the data using service |
- url | TRUE | This is the data using service organisation URL |
- industry_sector | TRUE | Industry sector that the DUS belongs to |
- usage_purposes | TRUE | The purpose for which the data is being used by the DUS |
- jurisdiction | TRUE | The jurisdiction, associated with the data using service consuming personal data, whose privacy regulation is followed. This can be a country, economic union, law, location or region. [value based on W3C Location and Jurisdiction] |
- withdrawal | FALSE | Reference to how the data subject may withdraw. |
- privacy rights | FALSE | Reference to information on how to exercise privacy rights (e.g. erasure, objection, withdrawal, copy) |
- signature_contact | FALSE | The responsible entity or person in the organisation signing the data disclosure agreement |
event [] | TRUE | Encapsulates the data disclosure agreement lifecycle event data. E.g. data disclosure agreement Offer, Accept, Reject, Terminate etc. |
- id | TRUE | Event identifier |
- time-stamp | TRUE | Event timestamp (ISO 8601 UTC) |
- did | TRUE | Should match the data_using_service did |
- state | TRUE | The various available states are: offer/accept/reject/terminate/fetch-data |
proof [] | TRUE | |
- id | TRUE | Proof identifier |
- type | TRUE | Signature schema type (e.g. ed25519, es256 etc.) |
- created | TRUE | Proof creation time (ISO 8601 UTC) |
- verificationMethod | TRUE | Should match the data_using_service did |
- proofPurpose | TRUE | Contract agreement (Type inferred from JSON-LD spec) |
- proofValue | TRUE | Proof value |
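The mandatory flags in the table above lend themselves to a simple structural check. The sketch below validates only the top-level attributes marked TRUE and the permitted event states. The function name and the error message format are assumptions of this sketch, and a complete validator would also need to check the nested attributes and the JSON-LD context.

```python
from typing import Any, Dict, List

# Top-level attributes marked as mandatory (TRUE) in the DDA schema table.
MANDATORY_TOP_LEVEL = (
    "@context", "id", "version", "template_id", "template_version", "language",
    "data_controller", "agreement_period", "data_sharing_restrictions",
    "purpose", "purpose_description", "lawful_basis", "personal_data",
    "data_using_service", "event", "proof",
)

# Event states listed for the event.state attribute.
EVENT_STATES = {"offer", "accept", "reject", "terminate", "fetch-data"}


def validate_dda(dda: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = [
        f"missing attribute: {name}"
        for name in MANDATORY_TOP_LEVEL
        if name not in dda
    ]
    for index, event in enumerate(dda.get("event", [])):
        state = event.get("state")
        if state not in EVENT_STATES:
            problems.append(f"event[{index}] has unknown state: {state!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a DS or DUS report every issue in a received DDA in a single pass.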
§ Data Agreement
The existing DA schema (v1) is updated to include the following:
- The events with additional states to take care of third party data exchange handling
- DA revocation list
- DA expiry date to refetch the data based on the rules set by the DS
Attribute Name | Description |
@context
|
Defines the context of any this document. E.g. the link the JSON-LD |
id
|
Identifier to the data agreement instance addressed to a specific individual (Data Subject) |
version
|
Version number of the data agreement |
template_id
|
Identifier to the template of the data agreement |
template_version
|
Version number of the data agreement template |
language
|
Is the language used. If not present default language is English |
data_controller_name
|
The name of the data controller processing the data |
data_controller_url
|
This is the controller URL |
data_controller_legal_id
|
This is the legal ID to the data controller. E.g. Swedish org. number |
data_policy
|
Encapsulates the data policies used in the use of personal data |
- policy_URL
|
URL to the privacy policy document of the data controller organisation |
- jurisdiction
|
The jurisdiction associated with the organisation processing personal data that the privacy regulation is followed. This can be a country, economic union, law, location or region. [value based on W3C Location and Jurisdiction] |
- industry_sector
|
The sector to which the data controller belongs |
- data_retention_period
|
The amount of time that an organization holds onto any personal data, in days (per purpose) |
- geographic_restriction
|
The country or economic union to which the processing of personal data is restricted [value based on W3C Location and Jurisdiction] |
- storage_location
|
The geographic location where the personal data is stored |
- third_party_disclosure
|
This is a boolean value to indicate that the DA is used for third party data disclosures. This indicates that some data disclosures will happen and is used to release personal data to DUS based on an agreement |
purpose
|
Describes the purpose for which a data controller (DS or DUS) uses personal data. This is the purpose for which the data agreement is being formulated. [values based on W3C DPV Purposes] |
purpose_description
|
Provides a description of the purpose for which the personal data is used |
lawful_basis
|
Indicates the lawful basis for processing personal information. This can be based on consent, legal obligation, contract, vital interest, public task or legitimate_interest. [values based on W3C DPV legal basis] |
method_of_use
|
Type of processing of personal data [value based on W3C DPV Processing] |
personal_data []
|
Encapsulates the attributes used for the usage purpose defined |
- attribute_id
|
Identity of the attribute that is being processed |
- attribute_name
|
Name of the attribute that is being processed |
- attribute_sensitive
|
[OPTIONAL] Defines the sensitivity of the data as per PII (Personally Identifiable Information) |
- attribute_category
|
An explicit list of personal data categories to be processed for the specified purpose. The categories shall be defined using language meaningful to the users and consistent with the purposes of processing. [values based on W3C DPV-DP] |
- restrictions []
|
[OPTIONAL] If provided, this can be used to restrict where the data is being consumed |
-- schema_ID
|
[OPTIONAL] Restrict data from this personal data schema issued by a legal entity |
-- credential_def_ID
|
[OPTIONAL] Restrict data from this credential schema from an organisation |
dpia
|
Encapsulate the organisation performing the Data Protection Impact Assessment (DPIA) |
- dpia_date
|
The date when the latest DPIA was carried out |
- dpia_summary_url
|
The URL to the DPIA summary information |
code_of_conduct
|
The data controller may follow a code of conduct which sets out the proper application of privacy regulation, taking into account specific features within a sector. This attribute shall state the name of the code of conduct and provide a publicly accessible reference to it. |
event []
|
Encapsulates the data agreement lifecycle event data, e.g. Data Agreement Offer, Accept, Reject, Terminate etc. |
- id
|
Event identifier |
- time-stamp
|
Event timestamp (ISO 8601 UTC) |
- did
|
The DID associated with the entity executing the event. E.g. An organisation (Data Controller) or an Individual (Data Subject) |
- state
|
The current state of the event during a data agreement lifecycle. E.g. Offer, Accept, Reject and Terminate |
proof []
|
Encapsulates the event signatures that allow anyone (e.g. an auditor) to verify the authenticity and source of the data agreement. It uses linked data proofs as per W3C and contains a set of attributes that represent a Linked Data digital proof and the parameters required to verify it |
- id
|
Proof identifier |
- type
|
Signature schema type (e.g. ed25519, es256 etc.) |
- created
|
Creation time of the proof (ISO 8601 UTC) |
- verificationMethod
|
Should match the data_using_service did |
- proofPurpose
|
Purpose of the proof |
- proofValue
|
Proof value |
data_subject_did
|
The DID of the data subject signing the agreement |
revocation_list
|
Link to the storage location of the revocation list for the agreement |
data_expiry
|
Expiry for the agreement (in epoch time) |
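As an illustration of how `data_expiry` could drive the refetch rule mentioned at the start of this section, the sketch below checks whether an agreement has passed its expiry. It assumes the epoch value is expressed in seconds and that an agreement without `data_expiry` does not expire; neither assumption is stated in this specification, and the function name `is_expired` is hypothetical.

```python
import time


def is_expired(data_agreement: dict, now=None) -> bool:
    """Return True if the agreement's `data_expiry` lies at or before `now`.

    Assumptions (not fixed by the specification): `data_expiry` is epoch
    time in seconds, and a missing value means the agreement never expires.
    """
    expiry = data_agreement.get("data_expiry")
    if expiry is None:
        return False
    current = time.time() if now is None else now
    return current >= float(expiry)


# Hypothetical agreement that expired at epoch second 1_650_000_000.
agreement = {"id": "did:example:da-1", "data_expiry": 1_650_000_000}
print(is_expired(agreement, now=1_700_000_000))  # prints True
```

Passing `now` explicitly keeps the check deterministic, which is convenient when replaying lifecycle events during an audit.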
§ On-Chain Data
Only the DS/DUS DIDs and the root hash value based on the Merkle tree of the DDA/DA are stored on chain.
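The following sketch shows one way a Merkle root could be derived from a DDA or DA document. The specification does not fix the hash function, the leaf construction or the handling of an odd number of nodes, so the choices here (SHA-256, one leaf per top-level attribute serialised as canonical JSON, duplication of the last node on odd levels) are assumptions made purely for illustration, and `merkle_root` is a hypothetical name.

```python
import hashlib
import json


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(agreement: dict) -> str:
    """Return a hex Merkle root over the top-level attributes of `agreement`."""
    # Assumption: one leaf per top-level attribute, ordered by attribute name
    # and serialised as compact JSON with sorted keys, so that the result does
    # not depend on dictionary ordering.
    leaves = [
        _sha256(
            json.dumps({key: agreement[key]}, sort_keys=True, separators=(",", ":")).encode("utf-8")
        )
        for key in sorted(agreement)
    ]
    if not leaves:
        return _sha256(b"").hex()
    level = leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: duplicate the last node when a level has an odd count.
            level = level + [level[-1]]
        level = [_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# Hypothetical, heavily shortened agreement used only to exercise the function.
dda = {"id": "did:example:dda-1", "version": "1.0.0", "purpose": "Marketing"}
print(len(merkle_root(dda)))  # prints 64
```

Because only the root is anchored, a party holding the full document can later recompute the root and compare it with the on-chain value without any agreement content being published.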
§ Interoperability
Work in progress
§ Appendix A: References
[1] Automated Data Agreements (Linaltec, iGrant.io and PrivacyAnt with NGI eSSIF-Lab)
[2] Lundin, L, Chandran, L and Lindquist, J: Data Agreement Specification
[3] Lindquist, J: (ISO Standards editor) Information technologies — Consent record information structure, ISO/IEC 27560 (working draft)
[4] Data Agreement DID Method Specification, Available at: https://github.com/decentralised-dataexchange/automated-data-agreements/blob/main/docs/did-spec.md [Accessed: 13-Jan-2022]
[5] Data Agreement - DIDComm Protocol Specification, Available at: https://github.com/decentralised-dataexchange/automated-data-agreements/blob/main/docs/didcomm-protocol-spec.md [Accessed: 13-Jan-2022]
[6] iGrant.io Data Wallet: https://igrant.io/datawallet.html [Accessed: 13-Jan-2022]
[7] The Open Provenance Model core specification (v1.1), Available: https://www.sciencedirect.com/science/article/abs/pii/S0167739X10001275 [Accessed: 13-Jan-2022]
[8] W3C PROV-Data Model: https://www.w3.org/TR/2013/REC-prov-dm-20130430/ [Accessed: 23-Feb-2022]
[9] W3C PROV-O: The PROV Ontology, Available at: https://www.w3.org/TR/prov-o/ [Accessed: 23-Feb-2022]
[10] W3C PROV Model Primer, Available at: https://www.w3.org/TR/prov-primer/ [Accessed: 23-Feb-2022]
[11] The OPM Provenance Model (OPM), Available at: https://openprovenance.org/opm/ [Accessed: 23-Feb-2022]
[12] Ethereum whitepaper: https://ethereum.org/en/whitepaper/ [Accessed: 01-Apr-2022]
§ Appendix B: Glossary
§ B.1 Abbreviations
Abbr. | Description |
ADA | Automated Data Agreements |
CRUD | Create / Read / Update / Delete |
DA | Data Agreement |
DAO | Decentralized Autonomous Organization |
DDA | Data Disclosure Agreement (Introduced first in this specification) |
DEXA | Data Exchange Agreements |
DID | Decentralised Identifier (according to W3C) |
DPA | Data Processing Agreement |
DPIA | Data Protection Impact Assessment |
DS | Data Source |
DUS | Data Using Service |
EEA | European Economic Area |
EU | European Union |
GDPR | General Data Protection Regulation |
IPFS | InterPlanetary File System |
ISO | International Organization for Standardization |
JSON | JavaScript Object Notation |
SDK | Software Development Kit |
SSI | Self Sovereign Identity |
ToIP | Trust over Internet Protocol |
VC | Verifiable Credential |
W3C | World Wide Web Consortium |
§ B.2 Terminology
Term | Description |
Data Agreement (DA) | A data agreement exists between organisations and individuals in the use of personal data. This agreement can have any legal basis outlined according to any data protection regulation, such as the GDPR. |
Data Disclosure Agreement (DDA) | Data disclosure agreements are formal contracts that detail what data is being shared and the appropriate use for the data between a DS and a DUS. It records conditions on which a DUS will consume data from a DS. A DDA could contain both personal and non-personal data. |
Decentralised IDentifier (DID) | A DID is a new type of identifier that is globally unique, resolvable with high availability, and cryptographically verifiable. DIDs are typically associated with cryptographic material, such as public keys and service endpoints, for establishing secure communication channels. |
Data Processing Agreement | A Data Processing Agreement is a legally binding contract, either in written or electronic form, entered between a data processor and a data controller that states the rights and obligations of each party concerning the protection of personal data. The agreement will be legally binding in any data protection regulation, such as the GDPR. |
Data Source (DS) | The role responsible for collecting, storing, and controlling personal data that persons, operators, and data using services may wish to access and use; defined as per MyData. |
Data Using Service (DUS) | The role responsible for processing personal data from one or more data sources to deliver a service; defined as per MyData. |
Data Protection Impact Assessment (DPIA) | A Data Protection Impact Assessment is a process designed to help systematically analyse, identify and minimise the data protection risks of a project or plan. |
Individual | A natural, living human being, in the GDPR also referred to as a data subject |
Self Sovereign Identity | A model for managing digital identities where individual identity holders can create and control their verifiable credentials without being forced to request permission from an intermediary or centralised authority, and which gives them control over how their data is shared and used |
§ Appendix C: Data Processing Agreement (DPA)
Processing personal data on behalf of another organisation requires a Data Processing Agreement (DPA). The party processing personal data is the data processor, and the party using the services of the data processor is the data controller. The DPA is required to ensure that, when something goes wrong (for example, a security incident), there are agreed steps to address it. Guidance for creating a “Data Processing Agreement” is based on GDPR Article 28 and can be summarised in the following list:
- Processing only on the controller’s documented instructions (Article 28(3)(a))
- Duty of confidence (Article 28(3)(b))
- Appropriate security measures (Article 28(3)(c))
- Using sub-processors (Article 28(3)(d))
- Data subjects’ rights (Article 28(3)(e))
- Assisting the controller (Article 28(3)(f))
- End-of-contract provisions (Article 28(3)(g))
- Audits and inspections (Article 28(3)(h))
Below is a DPA template from the Danish Data Protection Agency (Datatilsynet) which is endorsed by the EDPB. These sections are the relevant ones to this analysis.
- Table of Contents
- Preamble
- The rights and obligations of the data controller
- The data processor acts according to instructions
- Confidentiality
- Security of processing
- Use of sub-processors
- Transfer of data to third countries or international organisations
- Assistance to the data controller
- Notification of personal data breach
- Erasure and return of data
- Audit and inspection
- The parties’ agreement on other terms
- Commencement and termination
- Data controller and data processor contacts/contact points
Included in the template are three appendices that need to be filled out:
- Appendix A, Information about the processing
- Appendix B, Authorised sub-processors
- Appendix C, Instruction pertaining to the use of personal data
§ Appendix D: Data provenance - Ontology analysis
**Provenance models and provenance metadata:** Different provenance metadata models were evaluated to determine the model that best matched the requirements to support DAs. The models that were analysed were the Open Provenance Model (OPM) and W3C PROV. Both have the basic components of entity, activity and agent (people, organisations and devices). W3C PROV provides additional terms to help explain the execution details of a particular activity by using “plans”.
To capture the provenance metadata in a data exchange, it is necessary to understand the complete lifecycle for populating, capturing and updating the metadata. The list of terms below describes the provenance metadata required for a fully functioning data exchange ecosystem (W3C PROV primer) and analyses how each term is addressed today within the DA and DDA.
Term | Description | Terms in the DA/DDA |
Entities | Physical, digital or concept (for example, document, database, etc.). | The personal data attributes in the DA and the DA record |
Activities | Actions changing attributes to become new entities. | The usage purpose in the DA for which the data was collected. |
Usage and Generation | Further description of an activity as either “usage” or “generation” (for example, developing a new version of a document). | Processing category, that is, how the data is processed. For example, anonymisation, transformation etc. |
Agents and Responsibilities | Person, organisation or piece of software assigned responsibility for action (for example, a person creates a chart, the software presents a chart). | The data wallet and the backend run the DS and DUS software. |
Roles | Relationship between an entity and activity (e.g, the editor of a document). | DUS, DS |
Revisions | Derivative from a previous revision where attributes may change. | This could be the revision of the DA or the personal data. To be further studied. |
Plans | Describe what plans are followed by agents to execute the activity. | The plans will be correlated with the activity and the introduction of the processing categories. |
Time | Fundamental when the event took place (for example, when the activity took place to create a new revision of document). | Event timestamp (ISO 8601 with date and time in UTC) |
Alternative Entities and Specialization | Shared attributes between entities is a specialisation when wanting to refer to the same thing (for example, a reference to an article or a reference to a quote in the article, two authors may think of the same article). | Some attributes in the data agreement may be referred to at a high or granular level. For example, personal data classification, like an email address, is general but it is possible to specify the specific version of an email associated with a schema table. |
The table below provides an extension to the provenance metadata, outlined in the table above, and is introduced in W3C P-Plan Ontology to add further granularity.
Term | Description | Terms in the DA |
Entity | subclass of PROV entity | same as PROV |
Activity | subclass of PROV activity | same as PROV |
Bundle | A bundle of entities | If personal data is combined from different sources the associated entities become bundled into a new entity. |
MultiStep | Represents a plan that is a step of another plan | Not supported, will be considered for the future. |
Plan | Composed of smaller steps that use or produce variables. | List of processing classification in the DA. |
Step | Represents the planned execution activity. | Processing classification |
Variable | Represents a description of the input or output for the planned activity. | Not supported; will be considered for the future. |
§ Appendix E: Ontology analysis
§ E.1 DA and DDA Ontology
§ E.1.1 Legal prose builder
To capture the contractual obligations between organisations in a data exchange ecosystem, a governance framework is suggested regarding legal documents for the DS or DUS. The legal documents required typically do not change over time and are frequently copied; therefore, templates can be created with fixed input to populate information like organisation address, duration of the agreement, etc. A legal prose builder is introduced based on work from CommonAccord.
CommonAccord is an initiative to create global codes of legal transactions by codifying and automating legal documents, including contracts, permits, organisational records, and consents in any language. It has two repositories: one for the reference legal documents templates and a repository with the input to the templates similar to a DA record. The input will be called a DDA/DPA record. An example of an “Intra Group Data Transfer Agreement” can be found in CommonAccord. Section 3 in the model has text for data transfer between data controllers in a DDA, and section 4 has text for a DPA. These examples will be modified to European requirements.
§ E.1.2 DDA/DPA record
The DDA/DPA record will capture the requisite input fields to the agreement template. Below is an example of the content of the record.
- Organisation name and address
- Contact information
- Duration of the agreement
- List of purposes categories for processing personal data
- Location personal data is stored
- Retention period
- List of personal data categories to be processed
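To illustrate how such a record could populate a fixed agreement template, the sketch below fills a short clause using Python's standard `string.Template`. The clause wording, the field names and the helper `render_clause` are all hypothetical; they are not taken from CommonAccord or from this specification and only show the template-plus-record idea.

```python
from string import Template

# Hypothetical clause text; a real template would come from the legal prose
# repository rather than being embedded in code.
CLAUSE = Template(
    "$organisation_name, located at $organisation_address, may process "
    "$data_categories for the purposes of $purposes. The data is stored in "
    "$storage_location and retained for $retention_period_days days."
)


def render_clause(record: dict) -> str:
    """Fill the clause from a DDA/DPA record; raises KeyError if a field is missing."""
    values = dict(record)
    # List-valued fields are joined into readable prose before substitution.
    values["data_categories"] = ", ".join(record["data_categories"])
    values["purposes"] = ", ".join(record["purposes"])
    return CLAUSE.substitute(values)


# Hypothetical record with the kinds of fields listed above.
record = {
    "organisation_name": "Example AB",
    "organisation_address": "Example Street 1, Stockholm",
    "purposes": ["Marketing", "Service improvement"],
    "storage_location": "Sweden",
    "retention_period_days": 365,
    "data_categories": ["Email address", "Age"],
}
print(render_clause(record))
```

Using `substitute` rather than `safe_substitute` means an incomplete record fails loudly instead of producing a clause with unfilled placeholders.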
Many fields in the record are similar to the DA. It is important to use the same ontology between the two in order to be able to automate any controls required. For example, if a DUS uses personal data that is not in the DDA or for a different purpose than specified, the system can block the DA from being registered by the DUS. In theory, if any DUS infringement occurs, there is a means to automatically revoke the DA or, at the very least, block new ones from being created during the arbitration. A classic example that would benefit from this solution is Cambridge Analytica’s relationship with Facebook. Cambridge Analytica would be required to set up a new DA with the data subject. If Cambridge Analytica tried to use personal data outside the stated purposes, the DDA would kick in and block that attempt.
§ E.1.3 DA and DDA attribute mapping
A key aspect that needs to be understood when populating the DA or DDA is what fields must be filled and how they may be used for automated enforcement policies. The automated enforcement policies are numbered in the policies column. This table was used to arrive at the ontology described in 5.6 Ontologies.
Data Agreements attributes | Personal | Disclosure | Policy |
[NEW] Relation | B2C | B2B | |
[NEW] Role | Data Subject, Controller | Data Controller, Processor | |
Purpose | x | x | P1: Purpose match |
[NEW] Service name | x | x | |
Lawful basis | x | x | P2: If consent based require DA-Privacy |
Form of consent (explicit, implicit) | x | ||
Duration (of service) | x | x | |
Collection method | x | ||
[NEW] Processing classification | x | x | P4: Processing match |
Personal data categories (including in identifier and sensitivity classification) | x | x | P5: Categories match |
Storage (or compute) location | x | P6: Transborder flag | |
Retention (how long data is stored) | x | x | |
[NEW] Privacy rights (example withdrawal) | x | x | |
Privacy assessment (DPIA) (2) | x | x | P7: Flag if required |
[NEW] Privacy (+ security) program (1) | x | P7: Flag if required | |
[NEW] Code of conduct (association) (1) | x | (x) | P7: Flag if required |
[NEW] Contact information to Data Controller or Processor | x | x | P8: Contact information match |
Jurisdiction | x | x | P9: Cross-border data transfer compliance |
[NEW] Data sharing (y/n) | x | x | P10: Data sharing match for same personal data categories |
Reference to Third parties (if shared) | x | ||
Privacy policy URL | x | ||
[NEW] If applicable provenance key to source | (x) | (x) | |
Event: notice, consent, reject, terminate + timestamp | x | x |
*() indicates optionality or condition (1) - Can support certification
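Two of the policies in the table above, P1 (purpose match) and P5 (categories match), lend themselves to a simple automated check when a DUS tries to register a DA under an existing DDA. The sketch below is an illustration under stated assumptions: both documents are reduced to a `purpose` string and a flat `personal_data_categories` list, which is a simplification of the schemas in this specification, and the function names are hypothetical.

```python
def check_da_against_dda(da: dict, dda: dict) -> list:
    """Return the policy violations found when comparing a DA with its DDA."""
    violations = []
    # P1: the DA must state the same purpose as the DDA.
    if da["purpose"] != dda["purpose"]:
        violations.append("P1: purpose does not match the DDA")
    # P5: every personal data category in the DA must be covered by the DDA.
    allowed = set(dda["personal_data_categories"])
    extra = sorted(set(da["personal_data_categories"]) - allowed)
    if extra:
        violations.append("P5: categories not covered by the DDA: " + ", ".join(extra))
    return violations


def can_register(da: dict, dda: dict) -> bool:
    """A DA may be registered only if no policy violation is found."""
    return not check_da_against_dda(da, dda)


# Hypothetical simplified documents.
dda = {"purpose": "Marketing", "personal_data_categories": ["Email address", "Age"]}
da_ok = {"purpose": "Marketing", "personal_data_categories": ["Email address"]}
da_bad = {"purpose": "Profiling", "personal_data_categories": ["Email address", "Location"]}

print(can_register(da_ok, dda))   # prints True
print(check_da_against_dda(da_bad, dda))
```

The remaining policies (for example P4 or P9) could be added as further checks in the same function; they are omitted here because their matching rules are not detailed in this section.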
§ E.1.4 Provenance metadata relation with the ledger
Every DS discloses what data it is ready to expose in the data exchange ecosystem by publishing its DDA offer(s). Each DUS can decide whether its signed DDA, i.e. what personal data it is consuming, should be published or not.
No personal data shall be stored on the ledger. Even a hash value that may correlate with a data subject like a decentralised identifier is considered identifiable and is marked as sensitive. Personal data will be off-ledger. In order to make the DA verifiable or auditable in iExec confidential computing, a smart data agreement (SDA) is required.
§ E.2 Privacy ontology vocabulary
The provenance metadata requires a privacy ontology vocabulary in order to have the same interpretation of the terms used across implementations. The privacy ontology will be based on W3C Data Privacy Vocabulary [DPV] which enables expressing machine-readable metadata about the use and processing of personal data based on legislative requirements such as the General Data Protection Regulation [GDPR].
The core concepts in DPV represent the most relevant concepts for representing information regarding the what, how, where, who, why of personal data and its processing. Each of these concepts is further elaborated as a taxonomy of concepts in a hierarchical fashion. The DPV provides the following as ‘top-level’ concepts and relations to associate them with other concepts (only the ones relevant to this specification are listed, for other classes refer to DPV):
Class | Property | Description |
PersonalData | hasPersonalData | Personal data categories |
Purpose | hasPurpose | Purpose of Processing |
Processing | hasProcessing | Category or type of processing of personal data |
LegalBasis | hasLegalBasis | Legal bases or justifications for processing |
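The DA attributes whose values are described as based on W3C DPV (`purpose`, `method_of_use`, `lawful_basis` and the `attribute_category` of each `personal_data` entry) line up with the four DPV properties listed above. The sketch below expresses that correspondence as a lookup table. The mapping itself is an interpretation of this specification, and the `dpv:` compact prefix and the function name `to_dpv_view` are assumptions made for illustration.

```python
# Assumed correspondence between DA attributes and the DPV properties in the
# table above; the "dpv:" prefix is used here only as a compact label.
DA_TO_DPV_PROPERTY = {
    "purpose": "dpv:hasPurpose",
    "method_of_use": "dpv:hasProcessing",
    "lawful_basis": "dpv:hasLegalBasis",
    "personal_data": "dpv:hasPersonalData",
}


def to_dpv_view(data_agreement: dict) -> dict:
    """Project the DPV-related attributes of a DA onto DPV property names."""
    view = {}
    for attribute, dpv_property in DA_TO_DPV_PROPERTY.items():
        if attribute not in data_agreement:
            continue
        value = data_agreement[attribute]
        if attribute == "personal_data":
            # Only the category of each attribute is a DPV-based value.
            value = [item["attribute_category"] for item in value]
        view[dpv_property] = value
    return view


# Hypothetical, shortened data agreement.
da = {
    "purpose": "Marketing",
    "lawful_basis": "consent",
    "personal_data": [{"attribute_name": "email", "attribute_category": "Contact"}],
}
print(to_dpv_view(da))
```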
Foot Notes Work in progress to convert footnotes to references
[^1]: EU GDPR: https://docs.igrant.io/regulations/reg-eu-gdpr
[^2]: W3C DID Registry: https://www.w3.org/TR/did-spec-registries
[^3]: DID:MyData Specification
[^4]: Automated Data Agreements: https://github.com/decentralised-dataexchange/automated-data-agreements
[^5]: ISO/IEC AWI TS 27560 Privacy technologies — Consent record information structure