Privacy-Preserving Data Sharing
Description
Use smart-contract technology to express "algorithms" (queries) for data sharing, in which execution of the smart contract is conditioned on the fulfillment of a number of requirements and input parameters expressed in the smart contract.
Instead of a centralized data-processing architecture, P2P nodes (e.g. in a blockchain) offer the opportunity for data (user data and organizational data) to be stored on these nodes and processed in a privacy-preserving manner. The data is accessible via well-known APIs and authorization tokens, and smart contracts let the “query meet the data”.
In this new paradigm of privacy-preserving data sharing, we “move the algorithm to the data”: queries and subqueries are computed by the data repositories. A repository must never release raw data; it must perform the algorithm/query computation locally and produce aggregate answers only. Moving the algorithm to the data gives data owners and other joint rights-holders the opportunity to exercise control over data release, and thus offers a way forward to a high degree of privacy preservation while still allowing data to be shared effectively. Furthermore, only "safe answers" may be released, meaning that query results must be filtered (e.g. through a machine learning engine) to ensure they satisfy the desired policies (e.g. answers do not violate privacy).
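The idea of computing locally and releasing only aggregate, policy-checked answers can be illustrated with a minimal Python sketch. It is not a definitive design: the names `DataRepository`, `AggregateAnswer`, `run_query` and `MIN_GROUP_SIZE` are hypothetical, and a simple minimum-group-size rule stands in for the policy filter (the text above suggests, for example, a machine learning engine).

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Optional, Sequence

# Hypothetical policy: refuse answers computed over fewer than this many records.
MIN_GROUP_SIZE = 5


@dataclass(frozen=True)
class AggregateAnswer:
    """The only thing a repository ever returns: an aggregate, never raw rows."""
    value: float
    count: int


class DataRepository:
    """Illustrative repository that runs queries locally on its own records."""

    def __init__(self, records: Sequence[dict]):
        self._records = list(records)  # raw data stays inside the repository

    def run_query(
        self,
        predicate: Callable[[dict], bool],
        field: str,
        aggregate: Callable[[Sequence[float]], float] = mean,
    ) -> Optional[AggregateAnswer]:
        # The "algorithm" is executed here, next to the data.
        matched = [r[field] for r in self._records if predicate(r)]
        # Stand-in "safe answer" filter: small groups could identify individuals.
        if len(matched) < MIN_GROUP_SIZE:
            return None
        return AggregateAnswer(value=float(aggregate(matched)), count=len(matched))


# Example data, invented purely for illustration.
repo = DataRepository(
    [{"city": "A", "age": a} for a in (30, 40, 50, 60, 70)]
    + [{"city": "B", "age": 20}]
)
safe = repo.run_query(lambda r: r["city"] == "A", "age")
refused = repo.run_query(lambda r: r["city"] == "B", "age")
```

In this sketch the query over city "A" yields an aggregate over five records, while the query over city "B" is refused because a single matching record would effectively expose raw data.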
The "safe answer" queries and subqueries can be expressed in the form of a Query Smart Contract (QSC) that legally binds the querier (person or organization), the data repository and other related entities.
A query smart contract that has been vetted as safe can be stored on nodes of the P2P network (e.g. blockchain). This allows Queriers not only to search for useful data (as advertised by the metadata in the repositories) but also to search the P2P network for prefabricated safe QSCs that match the intended application. Such a query smart contract requires that identity and authorization requirements be encoded within the contract. Each QSC must be digitally signed by its author to protect against unauthorized modification.
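The following Python sketch shows one way a QSC could carry its authorization requirements and be checked for tampering before execution. All names (`QuerySmartContract`, `sign_qsc`, `verify_qsc`, `may_execute`, the scope strings) are hypothetical. To stay within the standard library, an HMAC over a shared key stands in for the author's digital signature; a real deployment would presumably use a public-key signature scheme instead.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class QuerySmartContract:
    """Illustrative QSC: the query plus the authorizations needed to run it."""
    author: str
    query: str
    required_scopes: Tuple[str, ...]

    def canonical_bytes(self) -> bytes:
        # Deterministic serialization so signer and verifier hash the same bytes.
        body = {
            "author": self.author,
            "query": self.query,
            "required_scopes": sorted(self.required_scopes),
        }
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_qsc(qsc: QuerySmartContract, key: bytes) -> str:
    # Assumption: HMAC-SHA256 is a stand-in for the author's digital signature.
    return hmac.new(key, qsc.canonical_bytes(), hashlib.sha256).hexdigest()


def verify_qsc(qsc: QuerySmartContract, signature: str, key: bytes) -> bool:
    # Constant-time comparison to avoid leaking signature prefixes.
    return hmac.compare_digest(sign_qsc(qsc, key), signature)


def may_execute(
    qsc: QuerySmartContract, signature: str, key: bytes, token_scopes: Iterable[str]
) -> bool:
    """Run only if the QSC is unmodified and the querier's token covers its scopes."""
    return verify_qsc(qsc, signature, key) and set(qsc.required_scopes) <= set(token_scopes)


# Example values, invented purely for illustration.
key = b"author-signing-key"
qsc = QuerySmartContract("alice", "AVG(age) WHERE city = 'A'", ("read:aggregate",))
sig = sign_qsc(qsc, key)
tampered = QuerySmartContract("alice", "SELECT * FROM people", ("read:aggregate",))
```

Here `may_execute(qsc, sig, key, {"read:aggregate"})` succeeds, whereas the tampered contract, or a token lacking the required scope, is rejected.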
Champion / Stakeholder
Thomas Hardjono
Actors
<<Provide a list of the actors, including what type of entity they are and what their role in the use case.>>
<<Should we consider including bad actors who do not do what they are "supposed to"?>>
Querier | Person or organization issuing the query. |
Data Repo | A data repository connected to the P2P network. |
Smart Contracts Node (QSC-Node) | A node on the P2P network that makes available a query smart-contract. |
Metadata Node | A node on the P2P network that "advertises" the existence of certain types of data, and the route/location to the corresponding data repository. |
Prerequisites / Assumptions
<<Provide a description of all assumptions or prerequisites that need to be in place for the use case to be applicable or possible.>>