Broadband Coverage Mapping of New York State

A Tool to Analyze Broadband Connectivity in New York State

Problem Statement

In light of increased remote activity during COVID-19, federal aid has been made available to help close the gap in broadband access, with President Biden pledging billions to improve broadband coverage options and affordability for all. Allocating these resources is difficult, however, because there is no single source of truth for measuring broadband coverage, or for comparing coverage strength and options in one area with those available elsewhere.

 
 

Project Description

Our capstone project aggregates datasets published at different spatial levels into a master dataset at the census tract level. The dataset contains information on broadband coverage and is meant to better inform future deployment efforts. We used a variety of supervised learning models to understand the relationship between demographics and broadband connectivity. We also built a proprietary scoring system to classify census tracts as well-served, underserved or unserved.

 

The end product is a broadband coverage map that will be of use to: 

  • Policymakers in charge of broadband deployment efforts and the pursuit of grant funding available from the Federal Communications Commission (FCC), US Department of Agriculture and future state programs for universal broadband access

  • Digital inclusion nonprofits that try to understand how broadband coverage disparities overlap with various demographics 

  • Everyday citizens who can use our open broadband map to determine their region's coverage

This map will provide a single source of truth on broadband coverage in New York State and help reach the eventual goal of closing the digital divide.


Methodology

  1. Data Engineering
    We initially had 6 datasets at various spatial levels, which we brought to the census tract level through spatial interpolation and the creation of specialized crosswalk files. These datasets contained various measures of broadband speed, broadband coverage, demographic data and provider availability. All of these variables were needed to provide a holistic view of broadband coverage throughout the state.
     

  2. Modeling 
    Using a Random Forest regressor with an R² score of 0.76, we found the variables most strongly associated with broadband usage to be:
    - Number of M-Lab speed tests conducted per census tract
    - Minimum round trip time 
    - Population
    - Average loss rate
    - Fastest average broadband speed measured 
    - Percentage of census tract population that uses the Internet at broadband speeds (25 Mbps download speed / 3 Mbps upload speed)
    - M-Lab broadband speed
    A more detailed description of these variables is available in the GitHub repository linked below.
     

  3. Broadband Score 
    Using the above variables, we created a broadband score from 1 to 5 for each census tract. The score is intended to give policymakers actionable insight for pursuing grants and allocating investment in broadband infrastructure.
     

  4. Data Visualization
    We created an interface that would convey our insights to users and allow them to intuitively explore our curated datasets on their own. The visualization is accessible through the image below.  
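To illustrate the crosswalk idea from step 1, here is a minimal sketch of how count data published for one geography might be apportioned to census tracts. The unit names, the weights and the household counts are hypothetical, and the project's actual crosswalk files and interpolation method may differ.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (source_unit, target_tract, weight), where
# weight is the assumed share of the source unit that falls inside the tract.
crosswalk = [
    ("bg_A", "tract_1", 1.0),
    ("bg_B", "tract_1", 0.25),
    ("bg_B", "tract_2", 0.75),
]

# Hypothetical source data: households with a broadband subscription (a count).
households = {"bg_A": 400, "bg_B": 800}


def to_tracts(values, crosswalk):
    """Apportion count data from source units to tracts using crosswalk weights."""
    totals = defaultdict(float)
    for source, tract, weight in crosswalk:
        totals[tract] += values[source] * weight
    return dict(totals)


tract_households = to_tracts(households, crosswalk)
print(tract_households)
```

Summing weighted values like this only suits counts. A rate or an average, such as a measured speed, would need a weighted mean instead so that the tract value stays on the original scale.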
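For readers unfamiliar with the R² metric quoted in step 2, the sketch below computes the coefficient of determination by hand. The observed and predicted values are made up for illustration and are not taken from the project's model.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical share of each tract's population using broadband.
observed = [0.2, 0.4, 0.6, 0.8]
# Hypothetical model predictions for the same tracts.
predicted = [0.3, 0.4, 0.5, 0.8]

print(round(r2_score(observed, predicted), 3))
```

A score of 1 means the predictions match the observations exactly, while a score of 0 means the model does no better than always predicting the mean. Read this way, 0.76 indicates that the model accounts for roughly three quarters of the variance in the target.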
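The page does not spell out how the 1 to 5 broadband score in step 3 is computed. One plausible approach, shown here purely as an assumption, is to rank tracts on a single composite value and assign scores by quintile. The composite values and tract names below are hypothetical.

```python
def quintile_scores(values):
    """Map each tract's composite value to a 1-5 score by rank quintile.

    This is an illustrative scheme, not the project's documented method.
    """
    ranked = sorted(values, key=values.get)  # lowest composite value first
    n = len(ranked)
    return {tract: 1 + (rank * 5) // n for rank, tract in enumerate(ranked)}


# Hypothetical composite connectivity values (higher means better coverage).
composite = {
    "tract_1": 0.12,
    "tract_2": 0.55,
    "tract_3": 0.31,
    "tract_4": 0.90,
    "tract_5": 0.74,
}

scores = quintile_scores(composite)
print(scores)
```

Rank-based bins guarantee that every score level is used, which is convenient for a map legend, but they describe relative standing only. Fixed thresholds would be the better choice if a score needs to mean the same thing from one year to the next.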

 

Broadband Mapping Tool

Our calculated broadband score showed that the tracts with the highest coverage tended to be concentrated in New York City.

Click on the image below to open the tool in another tab and begin exploring broadband coverage across the state.

broadbandpic6.jpg
 

Sponsors

Thank you to our sponsors who provided guidance and support throughout this project.


Schmidt Futures


US Ignite

 

The Team


Aleka Raju


Kelsey Nanan


AJ Kuhn
