Secure trustless text processing of sensitive documents

Flávio C Coelho; Bruno Cuconato

doi:10.7287/peerj.preprints.2994v1

School of Applied Mathematics, Fundação Getulio Vargas, Rio de Janeiro, Rio de Janeiro, Brazil

Published: 2017-05-26
Accepted: 2017-05-26

Subject Areas: Cryptography, Data Science, Natural Language and Speech
Keywords: Sensitive documents, machine learning, Hash functions, Document classification

Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Cite this article: Coelho FC, Cuconato B. 2017. Secure trustless text processing of sensitive documents. PeerJ Preprints 5:e2994v1 https://doi.org/10.7287/peerj.preprints.2994v1

Abstract

Scaling up the analysis of sensitive or confidential documents is frequently limited by the small number of individuals with the clearance needed to access them. Cryptographic protocols compatible with text processing methods can greatly improve this situation by allowing "untrusted" third parties to process large corpora of confidential documents automatically. In this paper we propose a protocol that allows text analytics tasks to be outsourced securely without compromising the confidentiality of the documents. The method scales to large corpora, with time complexity linear in the size of the corpus.

Author Comment

Preprint manuscript submitted to a peer-reviewed journal.
