Anyone who has survived a natural disaster knows that the end of the event is just the beginning of the process of healing. The same is true for man-made disasters, like the elevated levels of gun violence we saw year after year, starting during the COVID-19 pandemic in 2020.
At The Trace, we wanted to quantify this disaster as a way to acknowledge its scale and significance, even when gun violence has gone back down to its normal, still-incredibly lethal levels.
Shooting deaths are obviously an imperfect proxy for quantifying the harm. In general, at least two people are injured by shootings for every person who dies. Also, as Michelle Kerr-Spry, a gun violence survivor and a community activist with Mothers in Charge, pointed out during our reporting, when a homicide happens, the shooter’s family may lose someone forever to prison.
Why define the pandemic-era gun violence spike as ending in 2024, if the pandemic ended years ago?
A close examination of shooting deaths in 2020 shows that gun violence rates began to rise around the start of the pandemic, but then spiked again later in the year, as protests of the murder of George Floyd in Minneapolis began. Then they rose again in 2021 and stayed high in 2022.
While researchers will be untangling the many causes of the surge for years, what is clear is that it was not due to a single cause, like COVID-19, but rather a complex combination of factors.
If we define the beginning of the spike as the period when gun violence rates surged, then a neutral, data-based way to define its end is the period when those rates returned to their pre-surge levels.
There are also, of course, much more complex ways to approach this question, and The Trace will keep an eye on gun violence research as it comes out.
How did you define neighborhoods?
As stated in the story, gun violence is often concentrated in very small geographic areas, like an intersection or a few blocks. However, data is limited by the geographic units that the U.S. Census Bureau provides. The smallest unit for which demographic data is available is the census block group. These are areas of a few square miles with populations generally between 600 and 3,000. In rural areas, however, a census block group can be much larger in order to include more people.
The Gun Violence Archive data we used for our analysis spans the 2020 census, when new census block group boundaries were drawn. To keep the geography consistent across our date range, we used 2019 census block group shapes.
How did you define whether census block groups are urban, suburban, or rural?
We used 2010 Rural-Urban Commuting Area codes from the U.S. Department of Agriculture. These are census tract-level, which suited the granular nature of our analysis. We also liked that the designations included commuting data, which allows you to distinguish between towns that are adjacent to large cities — like Jersey City, New Jersey — and towns that are truly freestanding, like Wichita, Kansas.
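Because the codes are published per census tract and the analysis is per block group, each block group has to inherit the code of the tract that contains it. The sketch below shows one way to do that lookup in Python (the analysis itself used R and PostgreSQL). It relies on the fact that a 12-digit block group GEOID is the 11-digit tract GEOID plus one digit. The lookup table, its codes, and the function names are hypothetical and are not taken from the USDA file.

```python
# Hypothetical lookup: 11-digit census tract GEOID -> primary RUCA code.
# These values are made up for illustration; they are not real USDA data.
ruca_by_tract = {
    "36061000100": 1,
    "20173000200": 1,
    "30001000300": 10,
}


def tract_geoid(block_group_geoid: str) -> str:
    """Return the 11-digit tract GEOID that contains a 12-digit block group."""
    if len(block_group_geoid) != 12 or not block_group_geoid.isdigit():
        raise ValueError("expected a 12-digit block group GEOID")
    return block_group_geoid[:11]


def ruca_for_block_group(block_group_geoid: str, lookup: dict):
    """Look up the tract-level RUCA code for a block group, or None if missing."""
    return lookup.get(tract_geoid(block_group_geoid))


code = ruca_for_block_group("360610001001", ruca_by_tract)
```

How the numeric codes are then grouped into urban, suburban, and rural labels is a separate choice that this sketch does not attempt to reproduce.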
How did you place neighborhoods in racial categories?
We placed census block groups into racial categories using a simple majority. A block group where 50.1 percent of the population was African American would be called African American in our analysis. This is, of course, imperfect. A majority-white neighborhood where 40 percent of the population is Latinx, for example, can look and feel very different from one that is 99.9 percent white.
There are also block groups where no single race is the majority. In New York City, for example, there are many neighborhoods where Latinx, African American, and Asian residents combined are the vast majority. For these areas we created the ‘Majority POC’ label, which stands for majority persons of color. This is also clearly imperfect. As these neighborhoods become more common, how to label them is an evolving conversation among researchers.
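The labeling rule described above can be sketched in a few lines of Python. This is an illustration of the simple-majority idea, not the code used in the analysis, which was written in R. The function name and the input format are assumptions, and the source does not say how an exact 50/50 split was handled, so the sketch treats it as having no single majority.

```python
def classify_block_group(pop_by_race: dict):
    """Label a block group by simple majority of its population.

    pop_by_race maps a category name to a resident count (hypothetical format).
    Returns the majority category, 'Majority POC' when no single category
    exceeds half the population, or None for an unpopulated block group.
    """
    total = sum(pop_by_race.values())
    if total == 0:
        return None
    for race, count in pop_by_race.items():
        # Integer comparison: strictly more than half of all residents.
        if count * 2 > total:
            return race
    # Assumption: no single majority means the combined label applies.
    return "Majority POC"


label = classify_block_group(
    {"Latinx": 40, "African American": 35, "Asian": 15, "White": 10}
)
```

A 50.1 percent African American block group and a 99.9 percent African American block group receive the same label here, which is exactly the imperfection noted above.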
How did you conduct your analysis?
This analysis was conducted in R version 4.4.1. We used the httr package to access the API of the Gun Violence Archive, which collects data on shootings using media reports, supplemented by other sources, like social media and city open data portals.
We accessed census block group shapes, populations, and demographic data from the U.S. Census API using the tidycensus package.
We then used the sf package to spatially join latitudes and longitudes of shootings with the census block group polygons from the census.
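For readers unfamiliar with spatial joins, the core question is "which polygon contains this point?" The Python sketch below answers it with a basic ray-casting test. It is only meant to illustrate the idea: the sf package handles map projections, holes, multi-part shapes, and spatial indexing, none of which appear here. The toy square polygons and all function names are hypothetical.

```python
def point_in_polygon(lon: float, lat: float, ring: list) -> bool:
    """Ray-casting test: count how many polygon edges a ray cast from the
    point toward increasing longitude crosses. An odd count means inside."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_block_group(lon: float, lat: float, polygons: dict):
    """Return the ID of the first polygon containing the point, else None."""
    for geoid, ring in polygons.items():
        if point_in_polygon(lon, lat, ring):
            return geoid
    return None


# Two adjacent unit squares standing in for block group shapes.
toy_polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
match = assign_block_group(1.5, 0.5, toy_polygons)
```

Once each shooting carries a block group ID, any attribute of that block group can be attached to it with an ordinary table join.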
Then, we used PostgreSQL to join on Rural-Urban Commuting Area codes and racial data.
Finally, back in R, we grouped census block groups by year, race, and urbanization to calculate total deaths by year by category. We could then subtract the 2019 value from each of the pandemic years.
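The grouping step amounts to summing deaths within each combination of year, race, and urbanization. Here is a minimal Python sketch of that aggregation, offered as an illustration rather than the R code used in the analysis. The record fields (`year`, `race`, `urbanization`, `killed`) and the sample records are hypothetical.

```python
from collections import defaultdict


def total_deaths_by_category(incidents: list) -> dict:
    """Sum deaths for each (year, race, urbanization) combination."""
    totals = defaultdict(int)
    for incident in incidents:
        key = (incident["year"], incident["race"], incident["urbanization"])
        totals[key] += incident["killed"]
    return dict(totals)


# Made-up records, one per shooting, already tagged with block group labels.
sample_incidents = [
    {"year": 2019, "race": "White", "urbanization": "urban", "killed": 1},
    {"year": 2019, "race": "White", "urbanization": "urban", "killed": 2},
    {"year": 2020, "race": "White", "urbanization": "urban", "killed": 4},
    {"year": 2020, "race": "Latinx", "urbanization": "rural", "killed": 1},
]
totals = total_deaths_by_category(sample_incidents)
```

Each resulting total can then be compared with the 2019 total for the same race and urbanization category.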
How did you define excess deaths?
This is complex because it requires the creation of a counterfactual: How many people do we think would have died from shootings if the complex combination of the pandemic and other events in 2020 had not occurred?
It is possible to make this calculation very complex. Sophisticated models often include variables like the daily temperature and proxies for the state of the economy. We experimented with these, but in the end, they generally produced predictions with a wide margin of error that were not an improvement over our naïve approach. They also came with the downside of being overly complex to explain, and as journalists we want our methods to be parseable by nontechnical readers.
Therefore, it made sense to estimate counterfactual shooting deaths from 2020 onward naïvely, that is, by just projecting the last year before 2020 forward. If we are defining the start and end of the surge as the departure from and return to 2019 levels, then it makes sense to compare each year to 2019.
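In other words, the naïve counterfactual for every later year is simply the 2019 total, and excess deaths are whatever lies above it. A short Python sketch of that arithmetic follows. It illustrates the approach and is not the R code used in the analysis. The function name is an assumption and the yearly totals are invented numbers, not findings.

```python
def excess_deaths(deaths_by_year: dict, baseline_year: int = 2019) -> dict:
    """Excess deaths per year: observed deaths minus the baseline year's total,
    for every year after the baseline."""
    baseline = deaths_by_year[baseline_year]
    return {
        year: deaths - baseline
        for year, deaths in sorted(deaths_by_year.items())
        if year > baseline_year
    }


# Invented totals for a single category, for illustration only.
made_up_totals = {2019: 100, 2020: 130, 2021: 140, 2022: 135, 2023: 115}
excess = excess_deaths(made_up_totals)
cumulative_excess = sum(excess.values())
```

A negative value for a given year would mean that deaths in that category had fallen below the 2019 level.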
Your visualizations focus on shooting deaths in urban areas. What about all the other categories?
We focused on cities because that’s where the most deaths occurred.
However, it is possible to examine all the race/Rural-Urban Commuting Area combinations at once. It looks like this:
If we confine all categories to the same y-axis scale, it obscures some interesting change over time in categories that represent fewer census block groups. If we allow the y-axis scale to float, it looks like this:
For some race/Rural-Urban Commuting Area combinations, where there are a handful of shootings each year, this produces very erratic charts. But in more populated categories, we can see interesting patterns. We can see that the surge affected even the remotest areas, for example. We can also see that outside of cities, shooting deaths in African American neighborhoods remain elevated, having not yet returned to 2019 levels.