Award Abstract # 2029680
RAPID: World Wide Access to COVID-19 Information

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: ROCHESTER INSTITUTE OF TECHNOLOGY
Initial Amendment Date: May 22, 2020
Latest Amendment Date: May 22, 2020
Award Number: 2029680
Award Instrument: Standard Grant
Program Manager: Daniela Oliveira
doliveir@nsf.gov
 (703)292-0000
CNS
 Division Of Computer and Network Systems
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: May 15, 2020
End Date: April 30, 2022 (Estimated)
Total Intended Award Amount: $147,665.00
Total Awarded Amount to Date: $147,665.00
Funds Obligated to Date: FY 2020 = $147,665.00
History of Investigator:
  • Hrishikesh Bhattacharya (Principal Investigator)
    hbhatta@okstate.edu
Recipient Sponsored Research Office: Rochester Institute of Tech
1 LOMB MEMORIAL DR
ROCHESTER
NY  US  14623-5603
(585)475-7987
Sponsor Congressional District: 25
Primary Place of Performance: Rochester Institute of Tech
141 Lomb Memorial Drive
Rochester
NY  US  14623-5603
Primary Place of Performance Congressional District: 25
Unique Entity Identifier (UEI): J6TWTRKC1X14
Parent UEI:
NSF Program(s): Secure & Trustworthy Cyberspace
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 025Z, 096Z, 7914
Program Element Code(s): 806000
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Response to a worldwide crisis, such as the current COVID-19 pandemic, requires extensive communication among different groups (governments, scientists, healthcare providers, and the general public). The Internet provides an effective means of communication, but rumors, unvalidated theories, and outright propaganda also reach wide audiences over the Internet and have the potential to cause significant harm. It is tempting to conclude that strong Internet controls are necessary in times of crisis to limit the spread of harmful content. On the other hand, a freer flow of information is also beneficial: it can enable rapid yet nuanced deployment of resources to critical needs during such a crisis. It is therefore an open question whether open or strict information control policies work better in a pandemic. In this project, researchers study Internet censorship policies in several countries, ranging from those that employ very strict controls (e.g., China, Russia) through moderate ones (e.g., South Korea, India) to very open ones (e.g., USA, Germany, Japan), and characterize how they control access to information, specifically information related to COVID-19, during the course of the pandemic. This study would help determine best practices in Internet communications and thereby help protect national health.

This study collects data on read and write access to COVID-19-related information on the Internet, from vantage points in a range of countries. The chosen countries are well connected to the Internet, geographically diverse, and different in their approaches to Internet censorship: some are strict censors, some are semi-free, and some are known for free speech (as reported in the Internet Freedom Study by Freedom House). Further, because access controls may vary within a country, multiple vantage points are used: ideally, one in the "customer cone" of each major autonomous system (AS) in the country. From these vantage points, four measures are checked twice a month: (1) whether users can access sources considered authoritative, such as the Johns Hopkins University Center for Systems Science and Engineering Coronavirus Dashboard, the Centers for Disease Control and Prevention, the World Health Organization, and peer-reviewed journals such as Science, Nature, and NEJM; (2) whether users can access non-authoritative sources, such as websites spreading misinformation (identified by researchers following social networks); (3) which social media sites users can access, and whether these are known to be censored (e.g., can users in country A reach Twitter, Facebook, etc., or only local sites like qq, and are these sites strictly filtered?); and (4) whether an HTTP query containing a manually constructed trigger phrase (say, "coronavirus bioweapon") is blocked. The data collected in this study are correlated against trusted measures of pandemic impact (time-to-peak and mortality), and can help identify the Internet control strategies associated with success in containing and recovering from the COVID-19 pandemic.
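As a rough illustration of measures (1), (2), and (4), the sketch below shows how a single read-access probe from a vantage point might be labeled, and how a trigger phrase could be placed in an HTTP query string. It is a minimal sketch under stated assumptions, not the project's tooling: the function names, the query parameter `q`, and the outcome labels are all hypothetical, and a failed fetch alone does not prove censorship (outages and timeouts look the same).

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen


def build_trigger_url(base_url, phrase):
    """Append the trigger phrase as a query parameter.

    The parameter name 'q' is an assumption made for illustration.
    """
    return f"{base_url}?{urlencode({'q': phrase})}"


def classify_probe(url, fetch=urlopen, timeout=10):
    """Fetch `url` once and return a coarse outcome label.

    `fetch` is injectable so the logic can be exercised without a network.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            if response.status == 200:
                return "reachable"
            return f"http_{response.status}"
    except HTTPError as err:
        # The server answered, but with an error status (e.g. 403 or 451).
        return f"http_{err.code}"
    except OSError:
        # Covers URLError, timeouts, and connection resets. These are
        # candidates for interference, not proof of it.
        return "unreachable"
```

Each vantage point would run such a probe over the corpus on the measurement schedule and store the label alongside the URL, country, and date for later comparison.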

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

This project began with the goal of investigating how Internet censorship affects the pandemic outcomes of a country. Do censors in some countries (say, China or Russia) actually help, by stopping misinformation in its tracks? Or do they cause more damage, by preventing the public from learning true information?

In order to study this in detail, we built an infrastructure of Virtual Private Server (VPS) nodes, across a range of different countries. We originally planned to try two kinds of tests:

1. To find a corpus of sites with good information (e.g. Google Scholar, Johns Hopkins Covid dashboard, CDC, etc.) and of sites with misinformation, and compare how effectively censors in different countries blocked them.

2. To find what triggers the censor: for instance, send an HTTP POST with "trigger words" such as "Li Wenliang" and see if it gets blocked; write known misinformation (e.g., that boiling alcohol fumes will kill COVID) on social-media sites and see if it is taken down; and so on.

In the first thrust of the project, work proceeded smoothly. We were able to build a corpus of known-good and known-bad sites (from sources such as NewsGuard) and check whether they were being blocked. Unfortunately, the result was that a very small proportion of sites were blocked at all: some 1 or 2 out of some 330 bad sites, and no good sites. We concluded that there was no meaningful level of censorship that could be affecting COVID outcomes.

The second set of tests, however, would be considered unethical research, as we learned from discussions with the community, even if we put up misinformation for a very short period (<1 day). We therefore decided to stop this line of work and focus on "read access" rather than "write access" from various countries.

In March 2021, we learned of results from the Censored Planet group at the University of Michigan, which specializes in indirect measurement of network interference, censorship, etc. They reported that a very small fraction of sites were blocked (46 websites out of 1291), but this was still an order of magnitude greater than ours. Accordingly, we started a comparative study to see whether our measurement strategy, i.e., directly accessing websites from vantage points (VPS) in a country, produced different results than existing platforms.
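The arithmetic behind the "order of magnitude" comparison can be checked directly from the counts reported above. This is only a back-of-the-envelope sketch: the two studies used different site corpora, so the ratio is indicative rather than a like-for-like comparison, and the variable names are illustrative.

```python
def block_rate(blocked, total):
    """Fraction of tested sites that were blocked."""
    return blocked / total


censored_planet = block_rate(46, 1291)  # reported by the other group
ours_low = block_rate(1, 330)           # "some 1 or 2 out of some 330"
ours_high = block_rate(2, 330)

# How many times larger the other group's rate is, at both ends of our range.
ratio_low = censored_planet / ours_high
ratio_high = censored_planet / ours_low

print(f"Censored Planet: {censored_planet:.2%}")           # 3.56%
print(f"This project:    {ours_low:.2%} to {ours_high:.2%}")  # 0.30% to 0.61%
print(f"Ratio:           {ratio_low:.1f}x to {ratio_high:.1f}x")  # 5.9x to 11.8x
```

The gap is therefore roughly six- to twelve-fold, depending on whether one or two of the 330 sites are counted as blocked.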

We found a very interesting result: our VPS-based measurements did indeed successfully load sites that were considered blocked by general censorship measurement projects such as ICLab. For example, out of 100 representative sites known to be blocked in China, we were able to load 61. We thus discussed the feasibility of pivoting the project to a comparative study of censorship measurement (of all sites, not only COVID-19-related sites) when measured from VPNs, from VPSs, and using indirect tools such as Quack.
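A comparison of this kind reduces to checking, for each site that a reference platform reports as blocked, whether a direct load from the VPS succeeded. The sketch below is one hypothetical way to tabulate that; the function name, the input shapes, and the placeholder site names are assumptions, not the project's actual tools or data.

```python
def compare_methods(reference_blocked, direct_results):
    """Compare a reference blocklist against direct VPS load results.

    reference_blocked: set of sites a measurement platform reports as blocked.
    direct_results: dict mapping site -> True if it loaded from the VPS,
                    False if the load failed. Sites absent from the dict
                    were not tested directly.
    """
    loaded = sorted(s for s in reference_blocked if direct_results.get(s) is True)
    failed = sorted(s for s in reference_blocked if direct_results.get(s) is False)
    untested = sorted(s for s in reference_blocked if s not in direct_results)
    tested = len(loaded) + len(failed)
    return {
        "loaded": loaded,
        "failed": failed,
        "untested": untested,
        # Share of tested "blocked" sites that nevertheless loaded.
        "disagreement_rate": len(loaded) / tested if tested else None,
    }


# Synthetic data shaped like the reported result: 100 sites, 61 of which load.
sites = [f"site{i:03d}.example" for i in range(100)]
results = {site: (i < 61) for i, site in enumerate(sites)}
summary = compare_methods(set(sites), results)
print(summary["disagreement_rate"])  # 0.61
```

Keeping the untested sites separate avoids counting a missing measurement as agreement with either method.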

Because this was a one-year project, we ran out of funds before this could be completed. We were nevertheless able not only to collect preliminary data but also to (1) build tools to deploy, manage, and run a large VPS-based infrastructure with over fifty nodes (including some countries that are not well studied for censorship, e.g., Mongolia), and (2) develop a method to attribute many instances of "collateral damage," where a country does not censor a particular website but is still denied access to it because it is downstream from another country's censors.
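The report does not describe the attribution method itself. Purely as an illustration of the "collateral damage" idea, the sketch below assumes two hypothetical inputs: the ordered list of countries a failed request transits (for example, from a traceroute with geolocated hops) and the set of countries independently known to censor the site. Everything here, including the names, labels, and placeholder country codes, is an assumption rather than the project's method.

```python
def attribute_block(path_countries, censoring_countries):
    """Label a failed fetch as local censorship or collateral damage.

    path_countries: ordered country codes from the vantage point onward;
                    the first entry is the vantage point's own country.
    censoring_countries: countries known (from other measurements) to
                         block the site in question.
    Returns a (label, country) tuple.
    """
    if not path_countries:
        return ("unattributed", None)
    origin = path_countries[0]
    if origin in censoring_countries:
        return ("local", origin)
    # The origin does not censor the site; look for a transit country that does.
    for country in path_countries[1:]:
        if country in censoring_countries:
            return ("collateral", country)
    return ("unattributed", None)


# Placeholder codes: "AA" is the vantage point, "BB" a transit country.
print(attribute_block(["AA", "BB", "CC"], {"BB"}))  # ('collateral', 'BB')
```

A real attribution would also need to rule out ordinary failures, since a transit path through a censoring country is suggestive but not conclusive.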

We are now seeking a new grant to collect more data and explore this research direction.


Last Modified: 05/01/2022
Modified by: Hrishikesh Bhattacharya

Please report errors in award information by writing to: awardsearch@nsf.gov.
