Protocol for Indirect Controlled Access to Repository Data
Labelling data for training Artificial Intelligence applications is a process that still depends largely on humans. It is a time-consuming manual exercise that requires access to a large pool of workers. While the behemoths of the industry can afford to hire a large workforce, smaller players have found the cost formidably high, forcing them to crowdsource labelling instead.
Experts opine that 80% of the factors that make an AI application successful depend on the quality of the dataset. To put it simply: access to a high-quality dataset is what sets the winners apart, not the initial algorithms and neural networks. Therefore, if the dataset is leaked to another company, that company may be able to replicate the functionality within a matter of days - if not hours - depending on the data-crunching capacity of its infrastructure. For example, AlphaZero reportedly needed only about 4 hours of training to outplay the strongest chess engine of its day.
One of the goals of Dbrain is to minimize the risks data owners face when sharing their datasets with third parties. To this end, it has developed and implemented the Protocol for Indirect Controlled Access to Repository Data - PICARD for short.
PICARD ensures that all the datasets associated with the AI applications hosted on the Dbrain platform are protected from prying eyes. With this protocol implemented, data scientists do not have to download the datasets to train the AI models they are working on. The platform allows data scientists not just to work on a contract basis but also to contribute to community-owned datasets and public kernels. To encourage participation in community projects, Dbrain hosts Kaggle-like competitions on openly listed challenges.
PICARD enforces strict access control during both the model training phase and the deployment phase. This ensures the safety of datasets and protects the intellectual property rights of all parties involved. During training, all computations happen within the platform and no external access to the calculations is allowed. Further, developers who do not own the original datasets are not allowed to download trained models unless they have express permission from the owner. Additional validations are put in place to ensure that no unauthorized party can download this information. Model and dataset owners are required to approve indirect usage of their intellectual property in derived work. In return, they may either ask for a share of future revenues or receive a one-time or recurring direct payment.
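The access rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` type, the two grant sets, and the function names are hypothetical and are not taken from the Dbrain platform or the PICARD specification.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A dataset or trained model registered on the platform (hypothetical type)."""
    asset_id: str
    owner: str
    # Users the owner has expressly allowed to download the asset.
    download_grants: set = field(default_factory=set)
    # Users the owner has approved for indirect, in-platform use.
    indirect_grants: set = field(default_factory=set)


def can_download(asset: Asset, user: str) -> bool:
    """Only the owner, or a user with express permission, may download."""
    return user == asset.owner or user in asset.download_grants


def can_use_indirectly(asset: Asset, user: str) -> bool:
    """In-platform use (training, derived work) needs the owner's approval."""
    return user == asset.owner or user in asset.indirect_grants


dataset = Asset("ds-1", owner="alice")
dataset.indirect_grants.add("bob")  # alice approves in-platform training only

print(can_use_indirectly(dataset, "bob"))  # True
print(can_download(dataset, "bob"))        # False
```

Keeping download permission separate from indirect-use permission mirrors the idea in the text: a developer can train on data inside the platform without ever being able to take the data or the resulting model out of it.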
All access to datasets and model APIs is logged to the blockchain as a transaction. Model and dataset owners can review the entire access history whenever they need to. While owners can completely restrict access to their IP without having to go through any broker (say, a database administrator), the much more profitable approach is to provide indirect controlled access to datasets and models. Owners can decide the price at which they are willing to provide their data and models to others and list them publicly on the Dbrain platform. Developers can use these to build a better model - or improve an existing one with creative tweaks - and offer it as a new API through the Dbrain platform. The owners of datasets can also decide to collaborate with developers to build new AI models and share the revenues generated from the resulting products. The granular access control guarantees fair and precise distribution of revenues from AI apps by ensuring that all history is recorded in immutable ledgers audited by third parties.
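To make the logging and revenue-sharing ideas more concrete, here is a minimal sketch of a tamper-evident access log plus a simple revenue split. It is not a blockchain and not Dbrain's implementation: it is an in-memory hash chain that only illustrates why altering a past record is detectable. The names `AccessLedger` and `split_revenue`, the record fields, and the 60/40 weights are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry
FIELDS = ("user", "asset_id", "action", "prev_hash")


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AccessLedger:
    """Append-only log; each entry commits to the previous one by hash."""

    def __init__(self):
        self.entries = []

    def record(self, user, asset_id, action):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        body = {"user": user, "asset_id": asset_id,
                "action": action, "prev_hash": prev_hash}
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def history(self, asset_id):
        """Everything an owner would see for one dataset or model."""
        return [e for e in self.entries if e["asset_id"] == asset_id]

    def verify(self):
        """Recompute every hash; any edited entry breaks the chain."""
        prev_hash = GENESIS
        for e in self.entries:
            body = {k: e[k] for k in FIELDS}
            if e["prev_hash"] != prev_hash or e["hash"] != _digest(body):
                return False
            prev_hash = e["hash"]
        return True


def split_revenue(amount_cents, weights):
    """Split an integer amount by integer weights without losing a cent."""
    total = sum(weights.values())
    payouts = {p: amount_cents * w // total for p, w in weights.items()}
    remainder = amount_cents - sum(payouts.values())
    # Hand leftover cents out one at a time, largest weight first.
    for party in sorted(weights, key=weights.get, reverse=True)[:remainder]:
        payouts[party] += 1
    return payouts


ledger = AccessLedger()
ledger.record("bob", "ds-1", "train")
ledger.record("carol", "model-7", "api_call")
ledger.record("bob", "ds-1", "train")

print(len(ledger.history("ds-1")))  # 2
print(ledger.verify())              # True
print(split_revenue(1000, {"dataset_owner": 60, "developer": 40}))
```

Working in integer cents avoids floating-point rounding, so the payouts always add up to the amount received. A real deployment would rely on the ledger's own consensus and auditing rather than a local list.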
Through Dbrain and PICARD, developing AI applications has been democratized. Now even small upstarts will be able to challenge the Goliaths of the industry on an equal footing. With access to abundant crowd workers around the globe, even a small business will be able to get high-quality labelled data without worrying about data leaks, thanks to the robust PICARD implementation on the Dbrain platform.