
NSF Org: CCF Division of Computing and Communication Foundations
Initial Amendment Date: June 29, 2020
Latest Amendment Date: April 18, 2024
Award Number: 2007298
Award Instrument: Standard Grant
Program Manager: Sol Greenspan, sgreensp@nsf.gov, (703) 292-7841, CCF Division of Computing and Communication Foundations, CSE Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2020
End Date: September 30, 2024 (Estimated)
Total Intended Award Amount: $498,221.00
Total Awarded Amount to Date: $524,221.00
Funds Obligated to Date: FY 2022 = $16,000.00; FY 2024 = $10,000.00
Recipient Sponsored Research Office: 5000 Forbes Ave, Pittsburgh, PA, US 15213-3815, (412) 268-8746
Primary Place of Performance: 5000 Forbes Ave, Pittsburgh, PA, US 15213-3815
NSF Program(s): Secure & Trustworthy Cyberspace; Software & Hardware Foundation
Primary Program Source: 01002425DB NSF RESEARCH & RELATED ACTIVIT; 01002021DB NSF RESEARCH & RELATED ACTIVIT
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070
ABSTRACT
Verifying that web and mobile applications will protect user privacy requires knowledge about which kinds of data and data practices are sensitive to users. Privacy impact assessments are standardized procedures that companies and government agencies use to identify what personal information is collected and used, for what purpose, with whom it is shared, and what steps are taken to protect that information. Conducting privacy impact assessments on applications is time-consuming, because evaluators often have limited knowledge of the software's behavior, and the assessments are often done after the software has been constructed, which is costly. Because developers are under pressure to continuously release new application versions, they have little time for extensive documentation about their data practices. Today, the status quo in documenting privacy is the privacy policy, which regulators increasingly check for data practice misrepresentations during the application's lifetime. This project seeks to develop methods and tools to automatically and quickly conduct privacy impact assessments from software artifacts, called user stories, that are easier for developers to produce. Based on a risk assessment informed by which data practices are most sensitive to users, developers can prioritize where best to introduce the privacy controls that users want. Furthermore, by conducting risk assessments from user stories, regulators and developers would have greater assurance that assessments accurately reflect current app behavior. Finally, these assessments save developer time, because a change to a user story could trigger an automatic re-assessment that alerts the developer to changes in privacy risk. This research is transformative because it allows software developers to respond to changes in privacy risk at design time, when important safeguards can be introduced, as opposed to waiting for lengthier impact assessments that are harder to integrate after the software has been constructed.
The project investigates the symbolic and statistical relationships between agile requirements, privacy risk, and privacy policies. The research explores strategies for scoring user stories for privacy risk and for prioritizing which stories matter most to user privacy comprehension. The components of the solution will be investigated as follows: (1) corpora of user stories and privacy policies expressed in natural language will be acquired and annotated using coding theory; (2) semantic frames and an ontology expressed in Description Logic will be extracted from the corpora using entity and relation extraction; and (3) risk scores will be collected using privacy risk surveys that measure how users perceive privacy risk under different scenarios derived from user stories and mitigations. A key obstacle to effectively scoring risk is the inherent ambiguity and vagueness of natural language. The semantic frames and ontology will be used to encode and resolve ambiguity and vagueness in the scenarios. Furthermore, the survey results will be used to model changes in risk due to selected mitigations, so that developers can explore the local design space around a specific user story and the available mitigation choices.
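To make the scoring idea concrete, the sketch below shows one way a risk score could be computed for a user story, assuming survey-derived risk scores per information type and multiplicative discounts per mitigation. All names and values here (INFO_TYPE_RISK, MITIGATION_FACTOR, score_user_story, and the scores themselves) are hypothetical illustrations, not the project's actual data or tooling.

```python
# Illustrative sketch only: scores a user story for privacy risk using
# hypothetical survey-derived scores; not the project's actual tooling.

# Invented mean perceived risk per information type (0.0-1.0).
INFO_TYPE_RISK = {
    "location": 0.82,
    "contact list": 0.74,
    "email address": 0.55,
    "usage statistics": 0.30,
}

# Hypothetical multiplicative risk reductions for selected mitigations.
MITIGATION_FACTOR = {
    "anonymization": 0.5,
    "user consent": 0.8,
    "local-only storage": 0.4,
}

def score_user_story(text: str, mitigations: list[str]) -> float:
    """Return the worst-case residual risk over information types in the story."""
    text = text.lower()
    base = max(
        (risk for info, risk in INFO_TYPE_RISK.items() if info in text),
        default=0.0,
    )
    for m in mitigations:
        base *= MITIGATION_FACTOR.get(m, 1.0)
    return base

story = ("As a user, I want the app to share my location with friends "
         "so that we can meet up.")
print(score_user_story(story, ["user consent"]))  # 0.82 * 0.8 = 0.656
```

Taking the maximum over detected information types reflects a worst-case reading of the story; a real tool in the spirit of this project would match against the extracted semantic frames and ontology rather than raw substrings.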
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
PROJECT OUTCOMES REPORT
Disclaimer
This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.
Mobile and web applications provide users with services to solve everyday problems. Increasingly, these applications collect personal information to personalize those services to individual user needs. Because privacy is personal, and not every user perceives the same level of privacy risk when sharing their personal information, developers need ways to elicit privacy requirements from users before or during design time. This project investigated new ways to collect privacy requirements directly from users by inviting users to describe their experiences using mobile and web apps. By collecting user perceptions of privacy risk, we could train a machine learning model to predict which information types were high and low risk. This information could then be shared with developers to help them spot privacy hotspots in their application designs. In addition, we conducted research to identify ways that software could increase the level of personalization to offer a better fit for user needs. This study revealed gaps in modern software applications where user needs are unaddressed, and where addressing those needs requires collecting deeply personal information. The study raises awareness of new ways to develop better software while also identifying a greater need to adopt some of the methods and tools produced by this research. The research resulted in tools to conduct this elicitation exercise and collect the data that developers need.
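As an illustration of the kind of model described above, here is a minimal, hypothetical sketch of training a text classifier to label user stories as high or low perceived privacy risk; it assumes survey-labeled examples and uses scikit-learn. The toy data and names are invented for illustration and are not the project's published model or dataset.

```python
# Illustrative sketch, not the project's published model: train a simple
# text classifier to predict high vs. low perceived privacy risk from
# user stories, assuming survey-labeled examples are available.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data standing in for survey-derived risk labels.
stories = [
    "As a user, I want to share my real-time location with friends.",
    "As a user, I want the app to remember my color theme.",
    "As a user, I want to import my phone contacts to find friends.",
    "As a user, I want to sort my to-do list by due date.",
]
labels = ["high", "low", "high", "low"]  # perceived risk from surveys

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(stories, labels)

new_story = "As a user, I want the app to upload my address book."
print(model.predict([new_story])[0])  # hypothetical prediction, e.g. "high"
```

In practice, such a prediction could feed the automatic re-assessment described in the abstract, flagging a privacy hotspot whenever a user story changes.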
Last Modified: 12/20/2024
Modified by: Travis Breaux