PRIVEE: A Visual Analytic Workflow for Proactive Privacy Risk Inspection of Open Data #3415

Kaustav Bhattacharjee, Akm Islam, Jaideep Vaidya, Aritra Dasgupta

Presentation: 2022-10-19T16:30:00Z
Exemplar figure, described by caption below
PRIVEE is an end-to-end risk inspection workflow for open datasets that informs the defender in the analytical loop about potential disclosure risks in the presence of joinable datasets. Interactive visualization plays a crucial role in bootstrapping the risk inspection process via risk profiling, triaging and explaining risk signatures, and ultimately detecting instances of true disclosure at a record level. Colored borders track datasets across the goals.

The live footage of the talk, including the Q&A, can be viewed on the session page, VizSec: Best Paper Announcement and Papers.

Keywords

Human-centered computing, Visualization, Visualization application domains, Visual analytics

Abstract

Open datasets that contain personal information are susceptible to adversarial attacks even when anonymized. By performing low-cost joins on multiple datasets with shared attributes, malicious users of open data portals might get access to information that violates individuals' privacy. However, open datasets are primarily published using a release-and-forget model, whereby data owners and custodians have little to no cognizance of these privacy risks. We address this critical gap by developing a visual analytic solution that enables data defenders to gain awareness about the disclosure risks in local, joinable data neighborhoods. The solution is derived through a design study with data privacy researchers, where we initially play the role of a red team and engage in an ethical data hacking exercise based on privacy attack scenarios. We use this problem and domain characterization to develop a set of visual analytic interventions as a defense mechanism and realize them in PRIVEE, a visual risk inspection workflow that acts as a proactive monitor for data defenders. PRIVEE uses a combination of risk scores and associated interactive visualizations to let data defenders explore vulnerable joins and interpret risks at multiple levels of data granularity. We demonstrate how PRIVEE can help emulate the attack strategies and diagnose disclosure risks through two case studies with data privacy experts.
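
To make the idea of a "low-cost join on shared attributes" concrete, the following is a minimal Python sketch. It is not PRIVEE's implementation and does not reproduce its risk scores. The two toy datasets, their column names, and the helper functions (`shared_attributes`, `join_on`, `unique_matches`) are all hypothetical and chosen only for illustration. The sketch assumes that a joined record is worth inspecting when its combination of shared attribute values occurs exactly once in each dataset, which is one simple proxy for record-level disclosure.

```python
from collections import Counter, defaultdict

# Hypothetical toy datasets; none of these values come from the paper.
clinic_visits = [
    {"age": 34, "zip": "07102", "gender": "F", "diagnosis": "asthma"},
    {"age": 34, "zip": "07102", "gender": "M", "diagnosis": "flu"},
    {"age": 51, "zip": "07030", "gender": "F", "diagnosis": "diabetes"},
]
license_holders = [
    {"age": 34, "zip": "07102", "gender": "F", "license": "vendor"},
    {"age": 51, "zip": "07030", "gender": "F", "license": "taxi"},
    {"age": 51, "zip": "07030", "gender": "F", "license": "vendor"},
]


def shared_attributes(a, b):
    """Return the sorted column names that appear in both datasets."""
    if not a or not b:
        return []
    return sorted(set(a[0]) & set(b[0]))


def key_of(row, keys):
    """Build the tuple of values a row has for the join columns."""
    return tuple(row[k] for k in keys)


def join_on(a, b, keys):
    """Inner join of two lists of dicts on the given key columns."""
    index = defaultdict(list)
    for row in b:
        index[key_of(row, keys)].append(row)
    joined = []
    for row in a:
        for match in index.get(key_of(row, keys), []):
            joined.append({**row, **match})
    return joined


def unique_matches(a, b, keys):
    """Joined rows whose key combination occurs exactly once in each dataset."""
    count_a = Counter(key_of(row, keys) for row in a)
    count_b = Counter(key_of(row, keys) for row in b)
    return [
        row
        for row in join_on(a, b, keys)
        if count_a[key_of(row, keys)] == 1 and count_b[key_of(row, keys)] == 1
    ]


keys = shared_attributes(clinic_visits, license_holders)
joined = join_on(clinic_visits, license_holders, keys)
flagged = unique_matches(clinic_visits, license_holders, keys)
```

In this toy case the join runs on the three shared columns, and only the record for the 34-year-old woman in ZIP 07102 is flagged, because her attribute combination is unique in both datasets and the join therefore links her diagnosis to her license. The 51-year-old's combination matches two license records, so the link is ambiguous and is not flagged under this assumption.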