BY ARI LAMSTEIN
Ari Lamstein, a technology consultant and author of the free email course Learn to Map Census Data in R, provides an introduction to mapping US demographic data using the open source software R.
Today I will demonstrate how to map US county demographic data in R. Esri recently announced that it is adding support for R. This, in turn, has led to increased interest in R from the GIS community. While R is not a full-fledged GIS program, its ability to import, manipulate and visualize data is phenomenal. Additionally, its packaging system makes it easy for users to create, package and share additional functionality.
We will use the choroplethr package to map our data. The name “choroplethr” is a play on the words “choropleth” and “R”. In addition to facilitating the creation of choropleth maps, choroplethr ships with demographic statistics from the US Census Bureau.
If you are new to R, you might want to take a quick primer (such as here or here) before continuing.
Step 1: Install and Load the Packages
As I mentioned above, we will be using the choroplethr package to generate our maps. We will also need the “choroplethrMaps” package. From the R command line, type the following commands. This will install and load the packages:
install.packages(c("choroplethr", "choroplethrMaps"))
library(choroplethr)
library(choroplethrMaps)
Step 2: Create a Simple Map
The choroplethr package comes with a data frame containing 2012 US County Population Estimates. The data frame is called df_pop_county. We can load it and see the first few elements like this:
data(df_pop_county)
head(df_pop_county)
##   region  value
## 1   1001  54590
## 2   1003 183226
## 3   1005  27469
## 4   1007  22769
## 5   1009  57466
## 6   1011  10779
An important point is that one column is named region and the other is named value. The regions are county FIPS codes.
The function we will use to create county choropleth maps is called county_choropleth. It requires you to pass it a data frame with one column named region and one column named value.
county_choropleth(df_pop_county)
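If your own data identifies counties differently, it has to be reshaped into this layout first. Here is a minimal sketch using only base R. The input data frame my_data and its column names fips and population are hypothetical; the three rows reuse values from the head() output above. It also assumes that county_choropleth expects numeric FIPS codes, as the printed regions suggest.

```r
# Hypothetical input: FIPS codes stored as zero-padded strings
my_data <- data.frame(
  fips = c("01001", "01003", "01005"),
  population = c(54590, 183226, 27469),
  stringsAsFactors = FALSE
)

# Build the two columns the mapping function looks for
df_my_counties <- data.frame(
  region = as.numeric(my_data$fips),  # "01001" becomes 1001
  value  = my_data$population
)

# Check the layout before trying to map it
stopifnot(all(c("region", "value") %in% colnames(df_my_counties)))

# With choroplethr loaded, the result could then be passed along:
# county_choropleth(df_my_counties)
```

Keeping the original columns in my_data and building a separate two-column data frame makes it easy to switch to a different statistic later.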
Adding a title and legend is as simple as adding parameters to county_choropleth:
county_choropleth(df_pop_county,
                  title = "2012 County Population Estimates",
                  legend = "Population")
By default, county_choropleth uses seven quantiles to color the map. That is, seven colors are used, and each color is assigned to roughly the same number of counties. The number of quantiles can be changed with the num_colors parameter. For example, num_colors = 2 will show which counties are above and below the median:
county_choropleth(df_pop_county,
                  title = "2012 County Population Estimates",
                  legend = "Population",
                  num_colors = 2)
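To get a feel for what quantile coloring means, the idea can be reproduced with base R. This is only a sketch of quantile binning in general, not a description of how choroplethr computes its bins internally. The values are randomly generated stand-ins, not real population figures.

```r
# Made-up, skewed values standing in for county populations
set.seed(1)
values <- rlnorm(700, meanlog = 10, sdlog = 1)

# Break points at the 0/7, 1/7, ..., 7/7 quantiles
breaks <- quantile(values, probs = seq(0, 1, length.out = 8))

# Assign every value to one of the seven bins
bins <- cut(values, breaks = breaks, include.lowest = TRUE)

# Each bin holds about the same number of values
table(bins)

# Two bins is the same as splitting at the median
above_median <- values > median(values)
table(above_median)
```

Because each bin holds a similar number of values, the bins covering the largest values span a much wider range than the others when the data is skewed.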
Using one color will use a continuous scale. This is useful for seeing outliers in the data:
county_choropleth(df_pop_county,
                  title = "2012 County Population Estimates",
                  legend = "Population",
                  num_colors = 1)
Los Angeles County (FIPS code 6037) has a population of almost 10 million, which is far larger than any other county in the US.
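Outliers like this can also be found directly in the data by sorting on the value column. The sketch below uses a small helper, largest_regions, which is a hypothetical name and not part of choroplethr. To keep the example self-contained it runs on the six rows printed by head(df_pop_county) earlier.

```r
# Hypothetical helper: the n rows with the largest values
largest_regions <- function(df, n = 5) {
  ordered <- df[order(df$value, decreasing = TRUE), ]
  head(ordered, n)
}

# The six rows shown in the head() output earlier
sample_counties <- data.frame(
  region = c(1001, 1003, 1005, 1007, 1009, 1011),
  value  = c(54590, 183226, 27469, 22769, 57466, 10779)
)

# The two most populous counties in this small sample
top <- largest_regions(sample_counties, n = 2)
top
```

Calling largest_regions(df_pop_county) on the full data frame should list the counties that dominate the continuous scale.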
Step 3: More Demographics
Eight demographic statistics from 2013 are available in the data frame df_county_demographics:
data("df_county_demographics")
colnames(df_county_demographics)
## [1] "region"            "total_population"  "percent_white"
## [4] "percent_black"     "percent_asian"     "percent_hispanic"
## [7] "per_capita_income" "median_rent"       "median_age"
We can map any of them by creating a new column in the data frame called “value”, and setting it equal to the value we want to map:
df_county_demographics$value = df_county_demographics$percent_white
county_choropleth(df_county_demographics,
                  title = "2013 County Demographics\nPercent White",
                  legend = "Percent White")
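If you plan to map several of these columns, the copy-into-value step can be wrapped in a small function. The helper below, set_value_column, is a hypothetical name and not part of choroplethr. The demo data frame contains made-up numbers purely to show the mechanics.

```r
# Hypothetical helper: copy the chosen column into `value`
set_value_column <- function(df, column) {
  stopifnot(column %in% colnames(df))
  df$value <- df[[column]]
  df
}

# Made-up rows that mimic the layout of df_county_demographics
demo <- data.frame(
  region        = c(1001, 1003),
  percent_white = c(70, 80),
  median_age    = c(35, 40)
)

mapped <- set_value_column(demo, "median_age")
mapped$value

# With choroplethr loaded, the same pattern could be used like this:
# county_choropleth(set_value_column(df_county_demographics, "median_age"),
#                   title = "2013 County Demographics\nMedian Age",
#                   legend = "Median Age")
```

Returning a modified copy rather than changing the original data frame means each map starts from the same unmodified data.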
Summary
I hope that you have enjoyed this introduction to mapping county demographics in R. Similar functionality exists for mapping state demographics; run ?state_choropleth for details.