Data Visualization: Crime in San Francisco

This project was a class assignment for a Data Visualization course. I combined San Francisco police incident data with geographic data outlining the city's neighborhoods. The class provided the question and the dataset; my work below was to combine them to the given specifications.
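
Before running the full cell, the aggregation step can be tried on a toy table. The sketch below is only an illustration: the rows are made up, and only the column names `Category` and `PdDistrict` mirror the real CSV. It uses `size()`, which counts rows per group, whereas the cell below uses `count()` on `Category`, which counts non-null values; the two agree when `Category` has no missing entries.

```python
import pandas as pd

# Made-up rows standing in for the incidents file (assumption: only the
# column names match the real dataset).
toy = pd.DataFrame({
    'Category': ['ASSAULT', 'THEFT', 'THEFT', 'ROBBERY', 'THEFT'],
    'PdDistrict': ['MISSION', 'MISSION', 'SOUTHERN', 'SOUTHERN', 'SOUTHERN'],
})

# Count rows per district, then shape the result into the two columns
# the choropleth expects: a key column and a value column.
counts = (
    toy.groupby('PdDistrict')
       .size()
       .reset_index(name='Count')
       .rename(columns={'PdDistrict': 'Neighborhood'})
       .sort_values(by='Count', ascending=False)
       .reset_index(drop=True)
)
print(counts)
```

The result has one row per district, sorted with the largest count first, which is the same shape as the `sf` table built in the cell.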

In [4]:
## Goal: Structure the geo data by neighborhood, then create a Choropleth map to visualize crime in San Francisco.

import pandas as pd
import folium

police_file = '../datasets/Police_Department_Incidents_-_Previous_Year__2016_.csv'
#police_file = 'https://ibm.box.com/shared/static/nmcltjmocdi8sd5tk93uembzdec8zyaq.csv' ## downloadable alternative
sf_input = pd.read_csv(police_file, index_col=0)

# count incidents per police district (treated here as the neighborhood)
sf = sf_input.groupby('PdDistrict').count()
sf = pd.DataFrame(sf,columns=['Category'])  # keep only one count column
sf.reset_index(inplace=True)   # default index, otherwise groupby column becomes index
sf.rename(columns={'PdDistrict':'Neighborhood','Category':'Count'}, inplace=True)
sf.sort_values(by='Count', inplace=True, ascending=False)
#print(sf)

# San Francisco latitude and longitude values
latitude = 37.77
longitude = -122.42
sf_neighborhood_geo = '../datasets/san-francisco.geojson'

# Create map
sf_map = folium.Map(
       location=[latitude,longitude],
       zoom_start=12)

# Shade each district in the GeoJSON file by its incident count
folium.Choropleth(
       geo_data=sf_neighborhood_geo,
       data=sf,
       columns=['Neighborhood','Count'],
       key_on='feature.properties.DISTRICT',
       fill_color='YlOrRd',
       fill_opacity=0.7,
       line_opacity=0.2,
       legend_name='Crime Rate in San Francisco, by Neighborhood').add_to(sf_map)

# display the map
sf_map
Out[4]:
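
The map joins each row of the table to a shape through `key_on`, so the values in the `Neighborhood` column have to match the `DISTRICT` property in the GeoJSON file exactly. A minimal sketch of a check for that, assuming a made-up miniature GeoJSON and a hypothetical helper named `unmatched_keys` (neither comes from the assignment):

```python
# Made-up miniature GeoJSON (assumption); the real file has one feature
# per district with actual polygon geometry.
toy_geo = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {'DISTRICT': 'MISSION'}, 'geometry': None},
        {'type': 'Feature', 'properties': {'DISTRICT': 'SOUTHERN'}, 'geometry': None},
        {'type': 'Feature', 'properties': {'DISTRICT': 'PARK'}, 'geometry': None},
    ],
}

# Keys as they might appear in the table; 'Tenderloin' is deliberately
# written in a different case and is absent from the toy GeoJSON.
data_keys = ['MISSION', 'SOUTHERN', 'Tenderloin']


def unmatched_keys(geo, keys, prop='DISTRICT'):
    """Return (keys missing from the GeoJSON, GeoJSON keys missing from the data)."""
    geo_keys = {feature['properties'][prop] for feature in geo['features']}
    data_set = set(keys)
    return sorted(data_set - geo_keys), sorted(geo_keys - data_set)


missing_in_geo, missing_in_data = unmatched_keys(toy_geo, data_keys)
print(missing_in_geo)   # table keys with no matching shape
print(missing_in_data)  # shapes with no matching table row
```

With the real files, the same idea would apply after loading the GeoJSON with `json.load` and passing `sf['Neighborhood']` as the keys; two empty lists would mean every district lines up.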