Harvesting the Power of Big Data while Preserving Privacy
Speaker: Anderson C A Nascimento

With success stories ranging from online matchmaking to self-driving cars, machine learning (ML) has been one of the most impactful areas of computer science. ML's versatility stems from the wealth of techniques it offers, which makes it seem to have a tool for any task that involves building a model from data. And yet, ML makes an implicit overarching assumption that severely limits its applicability to a broad class of critical domains: that the data owner is willing to disclose the data to the model builder. The significance of this assumption is immediately apparent in ML applications in security-sensitive industries such as the financial sector and electronic surveillance. In the former, a bank may want to hire an analytics company to mine its customers' data but, being bound by the customer agreement, cannot simply hand the data over to that company. In the latter, Internet service providers that want a consulting firm to do traffic analysis on their logs may be unwilling to disclose details about their customer base in the process. In this presentation we will show how recent developments in cryptography, and more particularly in secure multiparty computation, can be used to reconcile the benefits of machine learning with privacy for the people who provide the data behind it.