Carnegie Mellon University

Photo of Bhatia and Breaux with award

September 10, 2018

Bhatia and Breaux Take Home Best Paper at IEEE RE ’18

By Josh Quicksall

Crafting a solid privacy and data usage policy is no easy task. The policy must encompass the breadth of services and products a company currently offers, its digital and physical storefronts, and its current business practices. What's more, these policies are often drafted with flexibility in mind, allowing the business to shift its practices and offerings without revisiting its data policy.

Oftentimes, this flexibility is achieved through ambiguity in the policy statements. That ambiguity is the source of a great deal of apprehension among users, notes a recent study by Jaspreet Bhatia and Travis Breaux of the Institute for Software Research.

Their paper, “Semantic Incompleteness in Privacy Policy Goals,” addresses the semantic source of this ambiguity and was recently awarded Best Paper at the 26th IEEE International Requirements Engineering Conference (RE ’18) in Banff, Canada.

By representing incompleteness in policy statements as semantic frames, Bhatia and Breaux were able to uncover which data actions and semantic roles are necessary to construct complete data practice descriptions.

When these descriptions of how, when, where, and by whom user data can be used are incomplete, users may overestimate the risk of a privacy breach, the researchers note. “The overestimation of privacy risk is not a favorable situation for a company, because it can lead to either the user not using a service due to fear of data misuse, or it can lead to the regulator concluding that the data practice is not in compliance with a regulation.”

To that end, Bhatia and Breaux manually annotated over 200 statements from five privacy policies, yielding 878 instances of 17 types of semantic roles. Coupling these annotations with users’ perceived privacy risk, measured using factorial vignettes, the team found that perceived risk decreased dramatically when two roles were present in a statement: the condition under which a data action is performed and the purpose for which the user’s information is used.
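To make the idea concrete, here is a minimal sketch of how a policy statement might be represented as a semantic frame and checked for the two roles highlighted above. It is an illustration only, not the researchers' annotation scheme or tooling: the `SemanticFrame` class, the role names, and the example statements are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical role labels for illustration. The paper identifies 17 role
# types; this article names only the two that lowered perceived risk.
RISK_REDUCING_ROLES: Tuple[str, ...] = ("condition", "purpose")


@dataclass
class SemanticFrame:
    """One data practice statement: a data action plus its labeled roles."""
    action: str
    roles: Dict[str, str] = field(default_factory=dict)

    def missing_roles(self, required: Tuple[str, ...] = RISK_REDUCING_ROLES) -> List[str]:
        """Return the required roles that the statement leaves unstated."""
        return [role for role in required if not self.roles.get(role)]


# Invented example statements, not drawn from the annotated policies.
vague = SemanticFrame(
    action="share",
    roles={"information": "your email address"},
)
specific = SemanticFrame(
    action="share",
    roles={
        "information": "your email address",
        "condition": "when you place an order",
        "purpose": "to send a shipping confirmation",
    },
)

print(vague.missing_roles())     # ['condition', 'purpose']
print(specific.missing_roles())  # []
```

In this sketch, a statement that omits the condition or purpose is flagged as incomplete, which mirrors the kind of gap the study associates with higher perceived risk.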

This work, the group notes, represents the first step in improving the ability of organizations to craft privacy policies that alleviate user concerns while remaining flexible. “We can envision using the annotation technique and the findings from this paper to build a corpus of semantic frames for data practices, and then studying ways to develop an automatic role labelling system for privacy policies.”