Machine Learning Privacy

Figure 1: What can be inferred about the trained model and data subjects in the training data when the model is exposed as a service to third parties?

Data analysis methods using machine learning (ML) can unlock valuable insights for improving revenue or quality of service from potentially proprietary, private datasets. Large, high-quality datasets improve trained ML models in terms of prediction accuracy on new, previously unseen data. These quality improvements can motivate multiple data owners to share and merge their datasets to create larger training datasets. For instance, financial institutions may wish to merge their transaction or lending datasets to improve the quality of trained ML models for fraud detection or for computing interest rates. However, government regulations (e.g., the roll-out of the General Data Protection Regulation in the EU, the California Consumer Privacy Act, or the development of the Data Sharing and Release Bill in Australia) increasingly prohibit sharing customers' data without consent. This motivates the need to reconcile the tension between improving the quality of trained ML models and the privacy concerns around data sharing. Therefore, there is a need for privacy-preserving machine learning.
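As a concrete illustration of one common building block in privacy-preserving data analysis, the sketch below applies the Laplace mechanism from differential privacy to a simple aggregate query (a mean over bounded values). This is a minimal, generic example, not the specific method used in this project; the function name, bounds, and epsilon value are illustrative assumptions.

```python
import math
import random

def dp_mean(values, lower, upper, epsilon):
    """Differentially private mean via the Laplace mechanism (illustrative sketch).

    Each value is clipped to [lower, upper], so changing one record shifts
    the mean by at most (upper - lower) / n -- the query's sensitivity.
    """
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Inverse-CDF sampling of Laplace(0, scale) using only the stdlib.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_mean + noise

# Hypothetical usage: a private mean of bounded transaction amounts.
random.seed(0)
amounts = [120.0, 45.5, 300.0, 87.25]
print(dp_mean(amounts, lower=0.0, upper=500.0, epsilon=1.0))
```

Smaller values of epsilon add more noise and give stronger privacy; the clipping bounds cap any one customer's influence on the released statistic.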

Collaborators: Dali Kaafar, David Smith

Funding: Next Generation Technologies Fund from the Defence Science and Technology Group (DSTG)
