Late last semester I reported a piece on homicides in Brownsville, Brooklyn (part of a larger series on crime in the borough) with my classmates Renny Grinshpan and Nicola Pring. We told the story of 25-year-old Mervyn Spann and looked at how his story fit into broader trends in a city where crime is down but clusters of violence remain.
There aren’t numbers or data in the story, "The Man Who Couldn’t Escape Brownsville," but plenty of data work happened behind the scenes. Here’s a breakdown of that work.
First, I manually inputted about four months of NYPD news releases into a dataset. (This step was not fun.)
DCPI, the division that issues these news releases, doesn’t seem to follow any standard format in its emails, so writing a program to scrape them didn’t seem like it would do me much good.
But after a few hours, I managed to create a CSV that listed the name of each homicide victim, information like age, gender, and race, and both the victim’s address and the location where he or she was killed. The data ultimately looked like this:
Next, I built a map. I used R to geocode the addresses and plot the location of each homicide between August and December onto a map. It was really basic — each dot showed where someone was killed — but it let us identify the neighborhoods with clusters of homicides. We had breakdowns by precinct, yes, but this gave us a better geographic understanding of where things were happening.
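For readers who want to see how spotting clusters might work in code, here is a minimal sketch in Python (the actual map was built in R, and this is not that code). It assumes the addresses have already been geocoded to latitude and longitude, and it swaps in a simple technique: snapping each point to a grid cell and counting points per cell. The function names, the cell size, and the coordinates are all made up for illustration.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_size=0.01):
    """Snap a coordinate to the grid cell (in units of cell_size degrees) that contains it."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))

def find_clusters(points, cell_size=0.01, min_count=2):
    """Return the grid cells that contain at least min_count points."""
    counts = Counter(grid_cell(lat, lon, cell_size) for lat, lon in points)
    return {cell: n for cell, n in counts.items() if n >= min_count}

# Made-up placeholder coordinates, NOT real homicide locations.
points = [
    (10.001, 20.002),
    (10.004, 20.007),
    (10.009, 20.001),
    (10.051, 20.093),
]

clusters = find_clusters(points)
for cell, n in clusters.items():
    print(f"grid cell {cell}: {n} incidents")
```

A grid is cruder than eyeballing dots on a map, but it makes the same point: a handful of cells hold most of the incidents, and those are the places worth visiting.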
I also ran some other analysis, looking at factors like age, race, and gender, but it seemed to us that location told the most important story here.
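As a rough illustration of that kind of breakdown, here is a short Python sketch (again, not the R code behind the story). It assumes a CSV with one row per victim. The column names and every row below are hypothetical placeholders, not the real dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical column names and placeholder rows, NOT the real data.
SAMPLE_CSV = """name,age,gender,race,victim_address,homicide_location
Victim A,25,M,Unknown,Address 1,Location 1
Victim B,31,F,Unknown,Address 2,Location 2
Victim C,19,M,Unknown,Address 3,Location 1
"""

def tally(rows, column):
    """Count how many rows share each value in the given column."""
    return Counter(row[column] for row in rows)

def age_bracket(age, width=10):
    """Map an age like '25' to a bracket label like '20-29'."""
    low = (int(age) // width) * width
    return f"{low}-{low + width - 1}"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

by_gender = tally(rows, "gender")
by_location = tally(rows, "homicide_location")
by_age = Counter(age_bracket(row["age"]) for row in rows)

print(by_gender.most_common())
print(by_location.most_common())
print(by_age.most_common())
```

Swapping `io.StringIO(SAMPLE_CSV)` for an opened file would run the same tallies on a real spreadsheet export.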
Finally, we headed to Brownsville. (Other classmates headed to other neighborhoods, like the nearby Ocean Hill.) The data showed us that there had been several recent homicides in this poor, out-of-the-way Brooklyn neighborhood. So we made a list of the victims’ addresses and the locations of the homicides, and pounded the pavement. Yes, we got lucky finding people who knew one of the victims, but we also knew where to look.
In the end, the story doesn’t read like a data story (and I didn’t even think of it that way myself, at first). But that’s exactly what it is: I used data to figure out where news was happening, and that led us to report a powerful human story. And I’m certain we were able to report a better piece this way than if we had just skimmed the DCPI releases, made an educated guess about the best place to report, and hit the pavement.
And if you’re interested in more data about homicides in Brooklyn, check out this companion piece by Nicola Pring and Chen Wu, complete with awesome maps and infographics.