Award Abstract # 2033946
RAPID: Neighborhood-level U.S. Internet Accessibility Assessment through Dataset Aggregation and Statistical and Predictive Modeling

NSF Org: CNS
Division Of Computer and Network Systems
Recipient: UNIVERSITY OF CALIFORNIA, SANTA BARBARA
Initial Amendment Date: June 16, 2020
Latest Amendment Date: June 16, 2020
Award Number: 2033946
Award Instrument: Standard Grant
Program Manager: Deepankar Medhi
dmedhi@nsf.gov
(703)292-2935
CNS Division Of Computer and Network Systems
CSE Directorate for Computer and Information Science and Engineering
Start Date: July 1, 2020
End Date: June 30, 2022 (Estimated)
Total Intended Award Amount: $149,439.00
Total Awarded Amount to Date: $149,439.00
Funds Obligated to Date: FY 2020 = $149,439.00
History of Investigator:
  • Elizabeth Belding (Principal Investigator)
    ebelding@cs.ucsb.edu
Recipient Sponsored Research Office: University of California-Santa Barbara
3227 CHEADLE HALL
SANTA BARBARA
CA  US  93106-0001
(805)893-4188
Sponsor Congressional District: 24
Primary Place of Performance: University of California-Santa Barbara
3227 Cheadle Hall, 3rd Floor
Santa Barbara
CA  US  93106-2050
Primary Place of Performance Congressional District: 24
Unique Entity Identifier (UEI): G9QBQDH39DF4
Parent UEI:
NSF Program(s): Networking Technology and Syst
Primary Program Source: 01002021DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 096Z, 7914, 7923, 9102
Program Element Code(s): 736300
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

The U.S. has long suffered from digital inequities in multiple dimensions: rural and tribal regions are far less likely than urban areas to have high-speed Internet access. Internet availability and quality within communities can often be predicted based on demographic and socioeconomic factors. The COVID-19 pandemic has brought these inequalities to the forefront; due to shelter-in-place orders, the lack of high-quality Internet access has had dramatic impacts, including on the ability to participate in remote learning, remote work, and telehealth. While new government programs have been created to try to broaden access, a fundamental problem persists: no one accurately knows who does and does not have high-quality access. There are many datasets of Internet measurements, but each on its own represents too incomplete a picture to provide the fine-grained information needed to discern which communities or, ideally, neighborhoods lack quality Internet access. However, these datasets, when combined, are expected to provide a rich and geographically broad data source through which it may be possible to accurately assess Internet connectivity and performance. Furthermore, trends learned from these datasets can be used to predict Internet accessibility in regions for which no measurement data is currently available.

The goal of this project is threefold: (i) to aggregate data from public and private sources to produce the most fine-grained analysis and detailed maps, to date, within states, at the community and, ideally, neighborhood level, of where fixed and mobile Internet access exists, where it does not, and where it is of too poor quality to be usable; (ii) to build statistical models that use demographic and other social variables to understand variation in Internet availability and quality; and (iii) to use what is learned to build predictive models of Internet service in areas for which there exist insufficient measurement data from available sources.

This work will have broad impacts, including the informing of local, state and federal governments about where investments must be made to ensure all Americans have access to high quality mobile and/or fixed Internet. The project website, digitalaccess.cs.ucsb.edu, will contain information about research methodology and outcomes, including a report on what is learned about the state of California, the first state of focus for this award. Prediction models will also be made available.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Note:  When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Paul, U. and Liu, J. and Adarsh, V. and Gu, M. and Gupta, A. and Belding, E. "Characterizing Performance Inequity Across U.S. Ookla Speedtest Users" arXiv.org, 2021
Paul, Udit and Liu, Jiamo and Farias-Llerenas, David and Adarsh, Vivek and Gupta, Arpit and Belding, Elizabeth "Characterizing Internet Access and Quality Inequities in California M-Lab Measurements" ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies (COMPASS), 2022 https://doi.org/10.1145/3530190.3534813

PROJECT OUTCOMES REPORT

Disclaimer

This Project Outcomes Report for the General Public is displayed verbatim as submitted by the Principal Investigator (PI) for this award. Any opinions, findings, and conclusions or recommendations expressed in this Report are those of the PI and do not necessarily reflect the views of the National Science Foundation; NSF has not approved or endorsed its content.

In the effort to create more accurate maps of fixed broadband deployment, crowdsourced measurement data, i.e., "speed tests", are an invaluable piece of the puzzle; they offer a snapshot of the instantaneous network performance, or the network quality, as experienced by the end user on their device. Aggregated from a single user over time, speed tests provide a longitudinal view on the reliability, stability, and performance consistency of the network connection. Alternatively, aggregated from multiple users over a small geographic area, they can point to important performance gaps within a region.

However, there are several issues with taking speed test measurements at face value; we must be very careful to understand what the speed test is actually measuring and how it compares to expected speed values. For example, many fixed broadband plans offer download speeds as high as 1 Gbps. If a speed test measures performance significantly below the plan speed, is it because the access network is under-performing, the user has purchased a lower-tier plan, or the user's home WiFi network is misconfigured or experiencing interference? It is critical to determine the source of the under-performance, to know whether or not it is the fault of the ISP.

Because these measurements provide critically important data about link performance, this project studied the impact of a variety of factors on speed test performance.  In one of the primary studies from this project, we analyzed 745k individual Ookla speed test measurements and 717k individual Measurement-Lab (M-Lab) speed tests, two of the most used speed test platforms.  The data represents all measurements taken in 2021 from the same four major U.S. cities. 

As a first step, we developed a method to reverse engineer the ISP speed plan from which each measurement originates. This first step is critical to contextualize the measurement to understand what range of speeds should have been achieved.  We found, unsurprisingly, that lower speed plans are much more likely to achieve the purchased speed than are higher speed plans.  We also found that the majority of speed tests stem from lower tier plans, thereby skewing aggregated results towards lower throughputs.
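The report does not describe how the subscription plan was inferred, so the following is only an illustrative sketch and not the project's method. It assumes a hypothetical list of advertised download tiers and a simple rule: attribute a user to the smallest tier that could explain the best speed that user ever measured, then express each test as a fraction of that tier. The tier values, the 10% tolerance, and the function names are all assumptions made for the example.

```python
# Hypothetical advertised download tiers in Mbps (illustrative values only).
TIERS_MBPS = [25, 100, 200, 400, 1000]


def infer_tier(peak_mbps, tiers=TIERS_MBPS, tolerance=1.1):
    """Return the smallest tier that can explain a user's peak measured speed.

    `tolerance` allows for modest over-provisioning above the advertised
    speed. This is an assumed heuristic, not the method used in the project.
    """
    for tier in sorted(tiers):
        if peak_mbps <= tier * tolerance:
            return tier
    # Faster than every known tier: fall back to the highest one.
    return max(tiers)


def attainment(measured_mbps, tier_mbps):
    """Fraction of the (inferred) plan speed that a single test achieved."""
    return measured_mbps / tier_mbps


# Example: a user whose best test reached 95 Mbps is placed in the 100 Mbps
# tier, so a later 50 Mbps test achieved half of the inferred plan speed.
tier = infer_tier(95)
ratio = attainment(50, tier)
```

A rule like this cannot separate a slow connection on a fast plan from a healthy connection on a slow plan, which is one reason plan inference needs a more careful method than a single threshold.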

As a second step, we investigated the impact of a variety of link and device characteristics on the measured performance. Our findings demonstrate that the access link type (WiFi versus Ethernet), the WiFi link characteristics (RSSI, spectrum band), and the device type all influence the measured results. For instance, across ISP subscription tiers, tests conducted over Ethernet are much more likely to achieve speeds closer to the purchased plan speed than tests conducted over WiFi. The same is true for tests in the WiFi 5 GHz band compared to the 2.4 GHz band. Similarly, WiFi signal strength (RSSI) and the available memory in the device also have an impact on measured performance. We do not claim that these are the only factors that can impact performance. It is likely that the operating system and other factors will also play a role.
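One way to make a comparison of this kind concrete is to compute, for each access link type, the share of tests that come close to the plan speed. The sketch below is a minimal illustration under assumptions: the record fields (`link`, `plan_mbps`, `download_mbps`), the 80% threshold, and the sample values are all made up for the example and are not taken from the study.

```python
from collections import defaultdict


def share_near_plan(tests, threshold=0.8):
    """Per link type, the fraction of tests reaching `threshold` of plan speed.

    `tests` is an iterable of dicts with the hypothetical keys
    'link', 'plan_mbps', and 'download_mbps'.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t in tests:
        totals[t["link"]] += 1
        if t["download_mbps"] >= threshold * t["plan_mbps"]:
            hits[t["link"]] += 1
    return {link: hits[link] / totals[link] for link in totals}


# Synthetic records, for illustration only.
sample = [
    {"link": "ethernet", "plan_mbps": 100, "download_mbps": 95},
    {"link": "ethernet", "plan_mbps": 100, "download_mbps": 85},
    {"link": "wifi", "plan_mbps": 100, "download_mbps": 90},
    {"link": "wifi", "plan_mbps": 100, "download_mbps": 40},
]
shares = share_near_plan(sample)
```

The same grouping could be keyed on WiFi band or device type instead of link type, provided the measurement platform records that metadata.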

In addition to these factors, the measurement methodology itself is critical. For example, M-Lab and Ookla measure broadband speed in different ways. When we categorized results by ISP broadband subscription tier, our analysis found that the median download speed of M-Lab measurements can be as low as half of that measured by Ookla for users of the same ISP plan.
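A per-tier comparison of medians like the one described can be sketched as follows. This is a minimal illustration, assuming each test has already been labeled with a platform name and a subscription tier; the tuple layout, the platform labels, and the sample numbers are assumptions for the example, not data from the study.

```python
from collections import defaultdict
from statistics import median


def median_by_tier(tests):
    """Median download speed for each (platform, tier) group.

    `tests` is an iterable of (platform, tier_mbps, download_mbps) tuples.
    """
    groups = defaultdict(list)
    for platform, tier, speed in tests:
        groups[(platform, tier)].append(speed)
    return {key: median(speeds) for key, speeds in groups.items()}


def platform_ratio(medians, tier, numerator="ookla", denominator="mlab"):
    """Ratio of the two platforms' median speeds within one tier."""
    return medians[(numerator, tier)] / medians[(denominator, tier)]


# Synthetic measurements, for illustration only.
sample = [
    ("ookla", 100, 90), ("ookla", 100, 80), ("ookla", 100, 100),
    ("mlab", 100, 40), ("mlab", 100, 50), ("mlab", 100, 45),
]
medians = median_by_tier(sample)
ratio = platform_ratio(medians, tier=100)
```

Comparing within a tier matters because pooling all tests would mix plan differences with methodology differences.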

Based on our study, we urge speed test providers to include as much contextual information as possible to better understand the speed test measurements. This includes but is not limited to metadata about the access link type, device type, and user subscription plan. We also encourage providing "reasonable" location accuracy for measurement data so that it can be useful for understanding connectivity patterns. IP-geolocation techniques are known to have inaccuracies of up to 20 km. This margin of error is much too large for planning broadband buildouts. Tagging measurements with latitude/longitude data of two to three decimal places offers a good starting point while still maintaining user privacy.
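To give a sense of scale for the decimal-place suggestion: one degree of latitude spans roughly 111 km, so two decimal places correspond to cells of about 1.1 km north to south and three decimal places to about 110 m, with east-west extent shrinking by the cosine of the latitude. The sketch below rounds a coordinate and estimates the resulting cell size; the function names and the sample coordinate are illustrative assumptions.

```python
import math

# Approximate length of one degree of latitude in kilometers.
KM_PER_DEG_LAT = 111.32


def coarsen(lat, lon, decimals=2):
    """Round a coordinate to a fixed number of decimal places."""
    return round(lat, decimals), round(lon, decimals)


def cell_size_km(lat, decimals=2):
    """Approximate (north-south, east-west) size of one rounding cell in km."""
    step_deg = 10 ** -decimals
    north_south = step_deg * KM_PER_DEG_LAT
    east_west = north_south * math.cos(math.radians(lat))
    return north_south, east_west


# Illustrative coordinate; at two decimals the cell is roughly 1 km across.
lat, lon = coarsen(34.41394, -119.84890, decimals=2)
ns_km, ew_km = cell_size_km(lat, decimals=2)
```

Either precision is far finer than a 20 km IP-geolocation error while still not pinpointing an individual household in most settings.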

Finally, we urge the FCC not to overlook this valuable data source, but instead to work with measurement providers and researchers to extract meaningful, contextualized Internet performance data to more fully understand Internet performance and quality in current buildouts.

 


Last Modified: 12/05/2022
Modified by: Elizabeth M Belding

Please report errors in award information by writing to: awardsearch@nsf.gov.
