From 03d473f0912e5148d063b109ac3b7e613f756609 Mon Sep 17 00:00:00 2001 From: Manushi Majumdar Date: Fri, 6 Mar 2026 00:34:34 -0500 Subject: [PATCH 1/2] sample that shows SEDF and Geometry tools --- ...s_and_states_have_access_to_airports.ipynb | 1285 +++++++++++++++++ 1 file changed, 1285 insertions(+) create mode 100644 samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb diff --git a/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb b/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb new file mode 100644 index 0000000000..7102a8f99d --- /dev/null +++ b/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb @@ -0,0 +1,1285 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Which US Counties and States lack access to public airports?\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Access to airports can boost connectivity and as a result, increase tourism and enhance regional visibility. This helps with the economic development and business activity of the surrounding regions. It is also critical to have ease of access to airports to support emergency services and disaster response in times of need, especially for regions with significant population density. \n", + "\n", + "In this example, we will use the Spatially Enabled DataFrame (SEDF) from the ArcGIS API for Python along with some spatial analysis concepts and techniques to answer some real-world questions. We will work through an example that seeks to answer the following questions:\n", + "\n", + "> 1. Which counties contain airports?\n", + "> 2. Which counties *do not* contain airports?\n", + "> 3. Which of those counties do not have access to airports within 30 miles?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Import necessary packages" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "import warnings\n", + "warnings.filterwarnings(\"ignore\")\n", + "\n", + "import pandas as pd\n", + "\n", + "from arcgis.features import FeatureLayer, GeoAccessor\n", + "from arcgis.gis import GIS\n", + "from arcgis.geometry import Geometry, distance\n", + "from arcgis.geometry.functions import LengthUnits" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "gis = GIS(profile=\"your_online_profile\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Having imported the necessary packages and connected to our GIS, we now read in the required data." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Extract data for Counties of the US as a SEDF\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We will query the US counties layer and read it in as a Spatially Enabled DataFrame, whose `spatial` namespace exposes a [__`GeoAccessor`__](https://developers.arcgis.com/python/latest/api-reference/arcgis.features.toc.html#geoaccessor) object." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(3144, 13)" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "counties_url = 'https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0'\n", + "counties_layer = FeatureLayer(counties_url)\n", + "counties_sdf = counties_layer.query(as_df=True, out_sr=4326)\n", + "counties_sdf.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This dataset has information for 3144 counties across the US. " + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n",
+ " <tr style=\"text-align: right;\"><th></th><th>OBJECTID</th><th>NAME</th><th>STATE_NAME</th><th>STATE_FIPS</th><th>FIPS</th><th>SQMI</th><th>POPULATION</th><th>POP_SQMI</th><th>STATE_ABBR</th><th>COUNTY_FIPS</th><th>Shape__Area</th><th>Shape__Length</th><th>SHAPE</th></tr>\n", + " </thead>\n", + " <tbody>\n",
+ " <tr><th>0</th><td>2</td><td>Grant County</td><td>North Dakota</td><td>38</td><td>38037</td><td>1643.571429</td><td>2301</td><td>1.4</td><td>ND</td><td>037</td><td>0.504068</td><td>3.413506</td><td>{\"rings\": [[[-102.003385812939, 46.05284759773...</td></tr>\n",
+ " <tr><th>1</th><td>3</td><td>Griggs County</td><td>North Dakota</td><td>38</td><td>38039</td><td>720.625</td><td>2306</td><td>3.2</td><td>ND</td><td>039</td><td>0.223034</td><td>1.949037</td><td>{\"rings\": [[[-97.9616729885937, 47.24493801330...</td></tr>\n",
+ " <tr><th>2</th><td>4</td><td>Hettinger County</td><td>North Dakota</td><td>38</td><td>38041</td><td>1131.363636</td><td>2489</td><td>2.2</td><td>ND</td><td>041</td><td>0.342747</td><td>2.691898</td><td>{\"rings\": [[[-102.003371825849, 46.20580311678...</td></tr>\n",
+ " <tr><th>3</th><td>5</td><td>Kidder County</td><td>North Dakota</td><td>38</td><td>38043</td><td>1408.235294</td><td>2394</td><td>1.7</td><td>ND</td><td>043</td><td>0.437806</td><td>2.719487</td><td>{\"rings\": [[[-100.088490419176, 46.63570178753...</td></tr>\n",
+ " <tr><th>4</th><td>7</td><td>Logan County</td><td>North Dakota</td><td>38</td><td>38047</td><td>987.368421</td><td>1876</td><td>1.9</td><td>ND</td><td>047</td><td>0.309011</td><td>2.454735</td><td>{\"rings\": [[[-99.0440801101036, 46.28338272232...</td></tr>\n",
+ " </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " OBJECTID NAME STATE_NAME STATE_FIPS FIPS SQMI \\\n", + "0 2 Grant County North Dakota 38 38037 1643.571429 \n", + "1 3 Griggs County North Dakota 38 38039 720.625 \n", + "2 4 Hettinger County North Dakota 38 38041 1131.363636 \n", + "3 5 Kidder County North Dakota 38 38043 1408.235294 \n", + "4 7 Logan County North Dakota 38 38047 987.368421 \n", + "\n", + " POPULATION POP_SQMI STATE_ABBR COUNTY_FIPS Shape__Area Shape__Length \\\n", + "0 2301 1.4 ND 037 0.504068 3.413506 \n", + "1 2306 3.2 ND 039 0.223034 1.949037 \n", + "2 2489 2.2 ND 041 0.342747 2.691898 \n", + "3 2394 1.7 ND 043 0.437806 2.719487 \n", + "4 1876 1.9 ND 047 0.309011 2.454735 \n", + "\n", + " SHAPE \n", + "0 {\"rings\": [[[-102.003385812939, 46.05284759773... \n", + "1 {\"rings\": [[[-97.9616729885937, 47.24493801330... \n", + "2 {\"rings\": [[[-102.003371825849, 46.20580311678... \n", + "3 {\"rings\": [[[-100.088490419176, 46.63570178753... \n", + "4 {\"rings\": [[[-99.0440801101036, 46.28338272232... " + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "counties_sdf.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Extract Spatially Enabled DataFrames for Airports across the US" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The data for airports in the US is available across 3 different layers based on the scale of the airport.\n", + "\n", + "1. The first layer, extracted as SEDF `airports1`, comprises of airports with capacity of 1,000,000 or more.\n", + "2. The second layer, extracted as SEDF `airports2`, comprises of airports with capacity of 100,000 - 999,999.\n", + "3. The third layer, extracted as SEDF `airports3`, comprises of airports with capacity of less than 100,000.\n", + "\n", + "Once we extract data for all 3 SEDFs, we then combine them to create `airports` that we will use going forward. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n",
+ " <tr style=\"text-align: right;\"><th></th><th>OBJECTID</th><th>FAA_ID</th><th>NAME</th><th>FACILITY</th><th>CITY</th><th>COUNTY</th><th>STATE</th><th>OWNER</th><th>ELEV_FEET</th><th>INTL</th><th>TOWER</th><th>ARRIVALS</th><th>DEPARTURES</th><th>ENPLANEMEN</th><th>PASSENGERS</th><th>SHAPE</th></tr>\n", + " </thead>\n", + " <tbody>\n",
+ " <tr><th>0</th><td>29</td><td>ANC</td><td>Ted Stevens Anchorage Intl</td><td>Airport</td><td>Anchorage</td><td>Anchorage</td><td>AK</td><td>Public</td><td>151</td><td>N</td><td>Y</td><td>18519.0</td><td>18524.0</td><td>2291115.0</td><td>1970743.0</td><td>{\"x\": -149.9981944442182, \"y\": 61.174083333778...</td></tr>\n",
+ " <tr><th>1</th><td>761</td><td>BHM</td><td>Birmingham-Shuttlesworth Intl</td><td>Airport</td><td>Birmingham</td><td>Jefferson</td><td>AL</td><td>Public</td><td>650</td><td>N</td><td>Y</td><td>16741.0</td><td>16738.0</td><td>1332942.0</td><td>1205588.0</td><td>{\"x\": -86.75230555515719, \"y\": 33.563888889160...</td></tr>\n",
+ " <tr><th>2</th><td>1179</td><td>LIT</td><td>Bill And Hillary Clinton National/Adams Field</td><td>Airport</td><td>Little Rock</td><td>Pulaski</td><td>AR</td><td>Public</td><td>266</td><td>N</td><td>Y</td><td>15475.0</td><td>15472.0</td><td>1054815.0</td><td>955228.0</td><td>{\"x\": -92.22477777747241, \"y\": 34.729444444105...</td></tr>\n",
+ " <tr><th>3</th><td>1503</td><td>PHX</td><td>Phoenix Sky Harbor Intl</td><td>Airport</td><td>Phoenix</td><td>Maricopa</td><td>AZ</td><td>Public</td><td>1135</td><td>N</td><td>Y</td><td>177027.0</td><td>177030.0</td><td>19156669.0</td><td>15944437.0</td><td>{\"x\": -112.01158333360394, \"y\": 33.43427777749...</td></tr>\n",
+ " <tr><th>4</th><td>1593</td><td>TUS</td><td>Tucson Intl</td><td>Airport</td><td>Tucson</td><td>Pima</td><td>AZ</td><td>Public</td><td>2643</td><td>Y</td><td>Y</td><td>19494.0</td><td>19494.0</td><td>1569388.0</td><td>1315515.0</td><td>{\"x\": -110.94102777810946, \"y\": 32.11608333317...</td></tr>\n",
+ " </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " OBJECTID FAA_ID NAME FACILITY \\\n", + "0 29 ANC Ted Stevens Anchorage Intl Airport \n", + "1 761 BHM Birmingham-Shuttlesworth Intl Airport \n", + "2 1179 LIT Bill And Hillary Clinton National/Adams Field Airport \n", + "3 1503 PHX Phoenix Sky Harbor Intl Airport \n", + "4 1593 TUS Tucson Intl Airport \n", + "\n", + " CITY COUNTY STATE OWNER ELEV_FEET INTL TOWER ARRIVALS \\\n", + "0 Anchorage Anchorage AK Public 151 N Y 18519.0 \n", + "1 Birmingham Jefferson AL Public 650 N Y 16741.0 \n", + "2 Little Rock Pulaski AR Public 266 N Y 15475.0 \n", + "3 Phoenix Maricopa AZ Public 1135 N Y 177027.0 \n", + "4 Tucson Pima AZ Public 2643 Y Y 19494.0 \n", + "\n", + " DEPARTURES ENPLANEMEN PASSENGERS \\\n", + "0 18524.0 2291115.0 1970743.0 \n", + "1 16738.0 1332942.0 1205588.0 \n", + "2 15472.0 1054815.0 955228.0 \n", + "3 177030.0 19156669.0 15944437.0 \n", + "4 19494.0 1569388.0 1315515.0 \n", + "\n", + " SHAPE \n", + "0 {\"x\": -149.9981944442182, \"y\": 61.174083333778... \n", + "1 {\"x\": -86.75230555515719, \"y\": 33.563888889160... \n", + "2 {\"x\": -92.22477777747241, \"y\": 34.729444444105... \n", + "3 {\"x\": -112.01158333360394, \"y\": 33.43427777749... \n", + "4 {\"x\": -110.94102777810946, \"y\": 32.11608333317... 
" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "airports1_url = 'https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Airports_by_scale/FeatureServer/1'\n", + "airports1 = FeatureLayer(airports1_url).query(as_df=True, out_sr=4326)\n", + "\n", + "airports2_url = 'https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Airports_by_scale/FeatureServer/2'\n", + "airports2 = FeatureLayer(airports2_url).query(as_df=True, out_sr=4326)\n", + "\n", + "airports3_url = 'https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Airports_by_scale/FeatureServer/3'\n", + "airports3 = FeatureLayer(airports3_url).query(as_df=True, out_sr=4326)\n", + "\n", + "airports = pd.concat([airports1, airports2, airports3], ignore_index=True, sort=False)\n", + "airports.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Restrict the airports to 'Public' airports\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If we observe values of the `OWNER` attribute of the `airports` data below, we notice that some of the airports are privately owned and some are dedicated for military use. We want to limit our `airports` data to include only `public_airports` that can be accessed by the general public." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "OWNER\n", + "Public 772\n", + "Air Force 50\n", + "Private 24\n", + "Navy 14\n", + "Army 13\n", + "Name: count, dtype: Int64" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "airports['OWNER'].value_counts()" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [], + "source": [ + "public_airports = airports[airports['OWNER']=='Public']" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(772, 16)" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "public_airports.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As we see above, the US has 772 airports accessible to the public." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### __1. Which counties have public airports within?__\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This leads us to the first of our 3 questions: Which counties have public airports within? \n", + "\n", + "We spatially join the public airports with the counties to see which airports intersect counties and visualize the results on a map. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "spatially_joined = public_airports.spatial.join(counties_sdf, how='left', op='intersects')" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(2781, 13)" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "counties_with_airports = counties_sdf[counties_sdf['COUNTY_FIPS'].isin(list((spatially_joined['COUNTY_FIPS']).unique()))]\n", + "counties_with_airports.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This gives us 2781 counties (of the 3144 total counties) which have airports within. This shows us that most counties have public airports within, however there are a few missing gaps in the south, central part of the country, Midwest and south east that indicate counties without airports." + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "map1 = gis.map('USA')\n", + "counties_with_airports.spatial.plot(map1)\n", + "map1.zoom_to_layer(counties_with_airports)\n", + "map1" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also print out the names of states that are served by airports in their counties." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "[ 'North Dakota', 'Oklahoma', 'Oregon',\n", + " 'South Dakota', 'Hawaii', 'Idaho',\n", + " 'Kansas', 'Alaska', 'California',\n", + " 'Colorado', 'Georgia', 'Missouri',\n", + " 'Montana', 'Nebraska', 'Nevada',\n", + " 'New Mexico', 'Kentucky', 'Michigan',\n", + " 'Mississippi', 'Wyoming', 'Connecticut',\n", + " 'Texas', 'Utah', 'Virginia',\n", + " 'Washington', 'Ohio', 'Pennsylvania',\n", + " 'South Carolina', 'Tennessee', 'Illinois',\n", + " 'Indiana', 'Iowa', 'Alabama',\n", + " 'Arkansas', 'Florida', 'North Carolina',\n", + " 'Louisiana', 'Maine', 'Massachusetts',\n", + " 'Minnesota', 'West Virginia', 'Wisconsin',\n", + " 'New York', 'Maryland', 'Vermont',\n", + " 'New Hampshire', 'Arizona', 'Rhode Island',\n", + " 'New Jersey', 'Delaware', 'District of Columbia']\n", + "Length: 51, dtype: string\n" + ] + } + ], + "source": [ + "print(counties_with_airports['STATE_NAME'].unique())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## __2. Which counties are underserved by access to airports, despite average or greater population density?__\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This leads us to find which of these missing counties do not have access to airports despite average or greater population density? \n", + "\n", + "For this we use the `enrich_layer` analysis method to enrich the counties with data for population for each of these counties from the most recent year (2025) using the `TOTPOP_CY` variable." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Enrich counties with population data from 2025" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "from arcgis.features.analysis import enrich_layer" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "
\n", + " \n", + " \n", + " \n", + "
\n", + "\n", + "
\n", + " enriched_counties_2025\n", + " \n", + "

Feature Layer Collection by MMajumdar_geosaurus\n", + "
Last Modified: March 05, 2026\n", + "
0 comments, 4 views\n", + "
\n", + "
\n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "enriched_counties = enrich_layer(counties_layer, analysis_variables=['TOTPOP_CY'], output_name='enriched_counties_2025')\n", + "enriched_counties" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "enriched_df = FeatureLayer(enriched_counties.url+'/0').query(as_df=True, out_sr=4326)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Calculating population density for each county\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We now calculate the population density for each of these counties using the population data and the area (`SQMI` field), and the median population density for the entire country." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 51.558841\n", + "1 1.324555\n", + "2 3.073721\n", + "3 2.113379\n", + "4 1.638931\n", + "Name: pop_density, dtype: Float64" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "enriched_df['pop_density'] = enriched_df['TOTPOP_CY'] / enriched_df['SQMI']\n", + "enriched_df['pop_density'].head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Calculating median population density" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "44.903376023126796" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "median_pop_density = enriched_df['pop_density'].median()\n", + "float(median_pop_density)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Finding counties that DO NOT have airports and have average or greater 
population density\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We use this to write a query below that finds which of the counties lacking airports also have a population density greater than or equal to the median. This leaves us with 163 counties, consistent with the gaps we saw in the previous map." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(163, 20)" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "counties_without_airports = enriched_df[~enriched_df['COUNTY_FIPS'].isin(list((spatially_joined['COUNTY_FIPS']).unique())) & (enriched_df['pop_density']>=median_pop_density)]\n", + "counties_without_airports.shape" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "map2 = gis.map('USA')\n", + "counties_without_airports.spatial.plot(map2)\n", + "map2.zoom_to_layer(counties_without_airports)\n", + "map2" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We also print out the names of states that have counties lacking airports. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "[ 'Ohio', 'Oklahoma', 'Alabama', 'Colorado',\n", + " 'Georgia', 'Missouri', 'Nevada', 'New York',\n", + " 'North Carolina', 'Kentucky', 'Maryland', 'Michigan',\n", + " 'Minnesota', 'Tennessee', 'Connecticut', 'Texas',\n", + " 'Virginia', 'Iowa', 'Kansas', 'Illinois',\n", + " 'Indiana']\n", + "Length: 21, dtype: string\n" + ] + } + ], + "source": [ + "print(counties_without_airports['STATE_NAME'].unique())" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### __3. Which counties are underserved by access to airports within 30 miles?__\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "However, if these counties do not have airports, are they at least within 30 miles of one? To determine this, we first calculate centroids for our counties and then compute a spatial index for each public airport.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [], + "source": [ + "#Generate centroids for counties\n", + "def get_x(value):\n", + " '''\n", + " Extract x co-ordinate from the centroid column\n", + " '''\n", + " return value[0]\n", + "\n", + "def get_y(value):\n", + " '''\n", + " Extract y co-ordinate from the centroid column\n", + " '''\n", + " return value[1]\n", + "\n", + "counties_without_airports['centroid'] = counties_without_airports['SHAPE'].geom.centroid\n", + "counties_without_airports['centroid_x'] = counties_without_airports['centroid'].apply(get_x)\n", + "counties_without_airports['centroid_y'] = counties_without_airports['centroid'].apply(get_y)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A [spatial index](https://developers.arcgis.com/python/latest/api-reference/arcgis.features.toc.html#arcgis.features.GeoAccessor.sindex) can be thought of as an index on the 
`SHAPE` column. This can be used to quickly locate and search for features and also perform many selection or identification tasks. \n" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [], + "source": [ + "#Generate spatial index for airports\n", + "sindex_airports = public_airports.spatial.sindex(stype='rtree')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We now write this method below to loop through each county and find its nearest airport using the spatial index for a quick search. We then use the __[`distance()`](https://developers.arcgis.com/python/latest/api-reference/arcgis.geometry.functions.html#distance)__ geometry method to find the distance between the centroid of the county and the closest airport in miles. We extract this data in the form of a distance matrix that gives us the distance between a county and its nearest airport, along with the population density. " + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "counties_without_airport_access = []\n", + "for row in counties_without_airports[['centroid_x', 'centroid_y', 'SHAPE', 'pop_density']].to_records():\n", + " row = list(row)\n", + " g = row[3]\n", + " source_idx = row[0]\n", + " latlong = (row[1], row[2])\n", + " #Finding nearest airport to each county\n", + " r = [i for i in sindex_airports._index.nearest(latlong, num_results=1)]\n", + " r = list(set(r))\n", + " centroid = Geometry({\"x\": row[1], \"y\":row[2], \"spatialReference\":{'wkid':4326}})\n", + " #Find distance between county and closest airport\n", + " dists = [distance(spatial_ref='4326', geometry1 = centroid, geometry2=public_airports['SHAPE'][i], distance_unit=LengthUnits.SURVEYMILE, geodesic=True) for i in r]\n", + " row = row + r + dists\n", + " counties_without_airport_access.append(row)" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + 
"data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n",
+ " <tr style=\"text-align: right;\"><th></th><th>county_idx</th><th>centroid_x</th><th>centroid_y</th><th>SHAPE</th><th>pop_density</th><th>airport_idx</th><th>dist</th></tr>\n", + " </thead>\n", + " <tbody>\n",
+ " <tr><th>0</th><td>97</td><td>-83.165717</td><td>41.532653</td><td>{'rings': [[[-83.408963087, 41.497789379], [-8...</td><td>121.039060</td><td>739</td><td>{'distance': 33.49273640493922}</td></tr>\n",
+ " <tr><th>1</th><td>115</td><td>-83.367804</td><td>40.297096</td><td>{'rings': [[[-83.499565192, 40.1136267840001],...</td><td>164.269443</td><td>61</td><td>{'distance': 32.60488307378162}</td></tr>\n",
+ " <tr><th>2</th><td>121</td><td>-84.576047</td><td>41.561072</td><td>{'rings': [[[-84.791897201, 41.4278994090001],...</td><td>86.147014</td><td>739</td><td>{'distance': 39.85057292486108}</td></tr>\n",
+ " <tr><th>3</th><td>185</td><td>-96.678372</td><td>34.735751</td><td>{'rings': [[[-96.824187087, 34.515547381], [-9...</td><td>53.253054</td><td>63</td><td>{'distance': 69.18399832721137}</td></tr>\n",
+ " <tr><th>4</th><td>268</td><td>-85.796065</td><td>32.869278</td><td>{'rings': [[[-85.589628589, 32.7313466890001],...</td><td>52.713994</td><td>414</td><td>{'distance': 39.07221261130234}</td></tr>\n",
+ " </tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " county_idx centroid_x centroid_y \\\n", + "0 97 -83.165717 41.532653 \n", + "1 115 -83.367804 40.297096 \n", + "2 121 -84.576047 41.561072 \n", + "3 185 -96.678372 34.735751 \n", + "4 268 -85.796065 32.869278 \n", + "\n", + " SHAPE pop_density \\\n", + "0 {'rings': [[[-83.408963087, 41.497789379], [-8... 121.039060 \n", + "1 {'rings': [[[-83.499565192, 40.1136267840001],... 164.269443 \n", + "2 {'rings': [[[-84.791897201, 41.4278994090001],... 86.147014 \n", + "3 {'rings': [[[-96.824187087, 34.515547381], [-9... 53.253054 \n", + "4 {'rings': [[[-85.589628589, 32.7313466890001],... 52.713994 \n", + "\n", + " airport_idx dist \n", + "0 739 {'distance': 33.49273640493922} \n", + "1 61 {'distance': 32.60488307378162} \n", + "2 739 {'distance': 39.85057292486108} \n", + "3 63 {'distance': 69.18399832721137} \n", + "4 414 {'distance': 39.07221261130234} " + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df_dist_matrix = pd.DataFrame(data=counties_without_airport_access, columns=['county_idx', 'centroid_x', 'centroid_y', \n", + " 'SHAPE', 'pop_density', 'airport_idx','dist'])\n", + "df_dist_matrix.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "def get_distance(x):\n", + " '''\n", + " Fetch distance value from the distance disctionary\n", + " '''\n", + " return x['distance']\n", + "\n", + "df_dist_matrix['distance'] = df_dist_matrix['dist'].apply(get_distance)" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(77, 8)" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "df_dist_matrix_filter = df_dist_matrix[(df_dist_matrix['pop_density'] >= median_pop_density) & (df_dist_matrix['distance'] > 30)]\n", + "df_dist_matrix_filter.shape" + ] + }, + { + "cell_type": "markdown", 
+ "metadata": {}, + "source": [ + "We now find which of the previous 163 counties also have their closest airport more than 30 miles away. This leaves us with 77 counties.\n", + "\n", + "We then retrieve the `SHAPE` for these counties from our previous SEDF to visualize the results on a map." + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "(77, 23)" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "counties_away_from_airports = counties_without_airports[counties_without_airports['SHAPE'].isin(list((df_dist_matrix_filter['SHAPE']).unique()))]\n", + "counties_away_from_airports.shape" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The map below shows us that a few counties we saw initially in the Midwest and the central part of the country are no longer on this map, indicating that their centroids are within 30 miles of the closest airport." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "map3 = gis.map('USA')\n", + "counties_away_from_airports.spatial.plot(map3)\n", + "map3.zoom_to_layer(counties_away_from_airports)\n", + "map3" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can also see state names for these counties. 
__The following list of states can help plan advocacy for policy changes and future infrastructure development that bring airport access to underserved communities.__" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array(['Alabama', 'Colorado', 'Georgia', 'Indiana', 'Iowa', 'Kentucky',\n", + " 'Michigan', 'Mississippi', 'Missouri', 'New York',\n", + " 'North Carolina', 'Ohio', 'Oklahoma', 'Tennessee', 'Texas',\n", + " 'Virginia'], dtype=object)" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "counties_away_from_airports['STATE_NAME'].unique()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This sample notebook shows how simple tools from the ArcGIS API for Python can be used to explore and address some critical questions." + ] + } + ], + "metadata": { + "esriNotebookRuntime": { + "notebookRuntimeName": "ArcGIS Notebook Python 3 Standard", + "notebookRuntimeVersion": "9.0" + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.7" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} From 8e2ad2c492d32edc896ad9b8bcdf788aa464d81d Mon Sep 17 00:00:00 2001 From: Manushi Majumdar Date: Fri, 6 Mar 2026 16:27:30 -0500 Subject: [PATCH 2/2] requested changes --- ...s_and_states_have_access_to_airports.ipynb | 27 ++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb 
b/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb index 7102a8f99d..965ec112d0 100644 --- a/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb +++ b/samples/04_gis_analysts_data_scientists/which_counties_and_states_have_access_to_airports.ipynb @@ -676,7 +676,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## __2. Which counties are underserved by access to airports, despite average or greater population density?__\n" + "### __2. Which counties are underserved by access to airports, despite average or greater population density?__\n" ] }, { @@ -685,7 +685,7 @@ "source": [ "This leads us to find which of these missing counties do not have access to airports despite average or greater population density? \n", "\n", - "For this we use the `enrich_layer` analysis method to enrich the counties with data for population for each of these counties from the most recent year (2025) using the `TOTPOP_CY` variable." + "For this we use the `enrich_layer` analysis method to enrich the counties with population data from the most recent year (2025) using the `TOTPOP_CY` variable.\n" ] }, { @@ -704,6 +704,27 @@ "from arcgis.features.analysis import enrich_layer" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Before we enrich our existing counties layer, let's delete any existing layer having the same title and then generate a layer with updated data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": {}, + "outputs": [], + "source": [ + "title = 'enriched_counties_2025'\n", + "\n", + "existing_enriched_items = gis.content.search(query='title:'+title)\n", + "# delete the first matching item, if any (an empty search result has no [0])\n", + "if existing_enriched_items:\n", + " existing_enriched_items[0].delete()" + ] + }, { "cell_type": "code", "execution_count": 15, @@ -739,7 +760,7 @@ } ], "source": [ - "enriched_counties = enrich_layer(counties_layer, analysis_variables=['TOTPOP_CY'], output_name='enriched_counties_2025')\n", + "enriched_counties = enrich_layer(counties_layer, analysis_variables=['TOTPOP_CY'], output_name=title)\n", "enriched_counties" ] },
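
The delete-before-enrich step in the second patch can be sketched a little more defensively. This is a minimal sketch, assuming the search results are objects with a `title` attribute (as `arcgis.gis.Item` has); `pick_exact_title` is a hypothetical helper, not part of the ArcGIS API, and the stand-in items below are made up for illustration. It keeps only exact title matches, so an empty result is never indexed and a near-match title is never selected for deletion by accident.

```python
from types import SimpleNamespace


def pick_exact_title(items, title):
    """Return only the items whose title equals `title` exactly.

    Hypothetical helper: `items` is any iterable of objects with a
    `.title` attribute, such as the list returned by a content search.
    """
    return [item for item in items if getattr(item, "title", None) == title]


# Stand-in search results (made up for this sketch); in the notebook the
# list would come from gis.content.search(query='title:' + title).
results = [
    SimpleNamespace(title="enriched_counties_2025"),
    SimpleNamespace(title="enriched_counties_2025_backup"),
]

matches = pick_exact_title(results, "enriched_counties_2025")
no_matches = pick_exact_title([], "enriched_counties_2025")
```

In the notebook, each item in `matches` could then be removed with `item.delete()`; an empty list simply means there is nothing to delete before calling `enrich_layer`.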