Exploratory analysis of London street level crime 2015-2017

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline

Let's load the dataset.

Data source: https://data.police.uk/data/archive/

data = pd.read_csv('crime.csv')

Check the number of rows and columns in the dataset.

data.shape
(18000, 13)

The dataset has 18,000 rows and 13 columns.

Let's look at the first five rows.

data.iloc[:5,:]
Unnamed: 0 Crime ID Month Reported by Falls within Longitude Latitude Location LSOA code LSOA name Crime type Last outcome category Context
0 0 80e07583f4bd74b85e457d92eef5d014e4e8d7b0eab0dc... 2015-01 City of London Police City of London Police -0.106453 51.518207 On or near Charterhouse Street E01000916 Camden 027B Bicycle theft Unable to prosecute suspect NaN
1 1 6589894ebc515f501527628eb650d52a6f031116eb0ada... 2015-01 City of London Police City of London Police -0.111497 51.518226 On or near Pedestrian Subway E01000914 Camden 028B Burglary Investigation complete; no suspect identified NaN
2 2 e6dc6a4a33ed886c7c72beaff0c5de92cc35cd2f76c6e5... 2015-01 City of London Police City of London Police -0.111497 51.518226 On or near Pedestrian Subway E01000914 Camden 028B Burglary Unable to prosecute suspect NaN
3 3 b6e6462d45d0d7f4258d57628cab4c8988dc41ac675b63... 2015-01 City of London Police City of London Police -0.111497 51.518226 On or near Pedestrian Subway E01000914 Camden 028B Other theft Investigation complete; no suspect identified NaN
4 4 769e1aa86e62b5f3c4c08c8c140147a275ca721d0801ba... 2015-01 City of London Police City of London Police -0.113767 51.517372 On or near Stone Buildings E01000914 Camden 028B Theft from the person Investigation complete; no suspect identified NaN

Let's remove the 'Crime ID' and 'Unnamed: 0' columns.

del data['Crime ID']
del data['Unnamed: 0']
data.head(3)
Month Reported by Falls within Longitude Latitude Location LSOA code LSOA name Crime type Last outcome category Context
0 2015-01 City of London Police City of London Police -0.106453 51.518207 On or near Charterhouse Street E01000916 Camden 027B Bicycle theft Unable to prosecute suspect NaN
1 2015-01 City of London Police City of London Police -0.111497 51.518226 On or near Pedestrian Subway E01000914 Camden 028B Burglary Investigation complete; no suspect identified NaN
2 2015-01 City of London Police City of London Police -0.111497 51.518226 On or near Pedestrian Subway E01000914 Camden 028B Burglary Unable to prosecute suspect NaN
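The same clean-up can also be done without mutating the frame in place. The sketch below is only an illustration on a made-up three-row CSV (the values are hypothetical, not from the police data): it shows `drop(columns=...)`, and it shows that reading with `index_col=0` avoids creating the 'Unnamed: 0' column in the first place when that column is just a saved index.

```python
import io
import pandas as pd

# Hypothetical three-row CSV with the same kind of leftover index column.
csv_text = """,Crime ID,Month,Crime type
0,abc123,2015-01,Burglary
1,def456,2015-01,Bicycle theft
2,ghi789,2015-02,Burglary
"""

raw = pd.read_csv(io.StringIO(csv_text))

# drop() returns a new frame and leaves `raw` untouched.
trimmed = raw.drop(columns=["Unnamed: 0", "Crime ID"])

# Alternative: treat the first CSV column as the index when reading.
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(trimmed.columns))
print(list(indexed.columns))
```

Keeping the original frame around can be handy if the identifier column is needed again later, for example to check for duplicate records.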

Let's check the datatype of the Month column.

data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18000 entries, 0 to 17999
Data columns (total 11 columns):
Month                    18000 non-null object
Reported by              18000 non-null object
Falls within             18000 non-null object
Longitude                17129 non-null float64
Latitude                 17129 non-null float64
Location                 18000 non-null object
LSOA code                17129 non-null object
LSOA name                17129 non-null object
Crime type               18000 non-null object
Last outcome category    14801 non-null object
Context                  0 non-null float64
dtypes: float64(3), object(8)
memory usage: 984.4+ KB
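The non-null counts above also show that some columns are incomplete (Longitude, Latitude, the LSOA columns and Last outcome category) and that Context is entirely empty. A minimal sketch of how missing values could be counted per column, using a small made-up frame rather than the real data:

```python
import numpy as np
import pandas as pd

# Made-up frame that mimics the pattern of gaps seen in info().
frame = pd.DataFrame({
    "Longitude": [-0.106453, np.nan, -0.111497, -0.113767],
    "Last outcome category": ["Unable to prosecute suspect", None, None,
                              "Under investigation"],
    "Context": [np.nan, np.nan, np.nan, np.nan],
})

missing = frame.isna().sum()          # number of missing values per column
missing_share = frame.isna().mean()   # fraction of missing values per column

# Columns with no usable values at all.
empty_columns = missing[missing == len(frame)].index.tolist()

print(missing)
print(empty_columns)
```

On the real dataset the same calls would be applied to `data` instead of `frame`.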

The Month column has dtype object (it holds strings such as '2015-01'), so we need to convert it to datetime to be able to manipulate it as dates.

import datetime
data['Month'] = pd.to_datetime(data['Month'],yearfirst=True)
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18000 entries, 0 to 17999
Data columns (total 11 columns):
Month                    18000 non-null datetime64[ns]
Reported by              18000 non-null object
Falls within             18000 non-null object
Longitude                17129 non-null float64
Latitude                 17129 non-null float64
Location                 18000 non-null object
LSOA code                17129 non-null object
LSOA name                17129 non-null object
Crime type               18000 non-null object
Last outcome category    14801 non-null object
Context                  0 non-null float64
dtypes: datetime64[ns](1), float64(3), object(7)
memory usage: 1.0+ MB
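As a variation on the conversion above, the format can be stated explicitly so that pandas does not have to guess it. This is a small sketch on made-up strings; the `format="%Y-%m"` pattern is an assumption based on the 'YYYY-MM' values visible in the Month column.

```python
import pandas as pd

months = pd.Series(["2015-01", "2015-02", "2017-03"])

# An explicit format parses each string as the first day of that month.
parsed = pd.to_datetime(months, format="%Y-%m")

# errors="coerce" turns unparseable entries into NaT instead of raising.
messy = pd.Series(["2015-01", "not a month"])
coerced = pd.to_datetime(messy, format="%Y-%m", errors="coerce")

print(parsed)
print(coerced.isna().tolist())
```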

Let's check the locations by number of recorded crimes.

data.Location.value_counts()
No Location                                871
On or near Police Station                  607
On or near Pedestrian Subway               607
On or near Supermarket                     532
On or near Parking Area                    487
On or near Great St Helen'S                354
On or near Conference/Exhibition Centre    318
On or near Nightclub                       316
On or near Blomfield Street                304
On or near Queen Victoria Street           297
On or near St Martin'S Le Grand            296
On or near Bride Lane                      296
On or near New Change                      284
On or near Shopping Area                   274
On or near Fish Street Hill                269
On or near Clement'S Lane                  235
On or near Gracechurch Street              204
On or near Fleet Street                    194
On or near Artillery Lane                  172
On or near Leadenhall Street               166
On or near Fetter Lane                     158
On or near Finch Lane                      149
On or near Philpot Lane                    149
On or near Bow Lane                        141
On or near Cheapside                       139
On or near Bell Inn Yard                   138
On or near Bishopsgate                     133
On or near Bear Alley                      132
On or near Primrose Street                 130
On or near Eastcheap                       129
                                          ... 
On or near St Dunstan'S Lane                 3
On or near Mount Pleasant                    3
On or near Old Buildings                     2
On or near Dysart Street                     2
On or near Amen Court                        2
On or near Sandy Lane                        2
On or near Tooley Street                     2
On or near Shadwell Gardens                  2
On or near Goswell Place                     2
On or near Mark Street                       1
On or near Milk Street                       1
On or near Little Essex Street               1
On or near Farringdon Street                 1
On or near Cripplegate Street                1
On or near Timber Street                     1
On or near Goldsmith Street                  1
On or near Stoney Lane                       1
On or near Fournier Street                   1
On or near Nelson Terrace                    1
On or near Whiskin Street                    1
On or near Queen Street Place                1
On or near Tower Royal                       1
On or near Mucking Wharf Road                1
On or near Folgate Street                    1
On or near Holywell Row                      1
On or near Banner Street                     1
On or near Upper Ground                      1
On or near Haven Quays                       1
On or near Weaver'S Lane                     1
On or near Bartlett Court                    1
Name: Location, Length: 335, dtype: int64
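Raw counts are easier to compare as shares, and the 'No Location' placeholder can be left out before ranking. A minimal sketch with a handful of made-up records (the location labels are borrowed from the output above, the counts are not):

```python
import pandas as pd

locations = pd.Series([
    "No Location", "On or near Police Station", "On or near Police Station",
    "On or near Supermarket", "No Location", "On or near Police Station",
])

# Keep only records with a known location.
known = locations[locations != "No Location"]

# normalize=True returns fractions that sum to 1 instead of counts.
shares = known.value_counts(normalize=True)

print(shares)
```

On the real data this would be `data.Location` in place of `locations`.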

Let's visualize the top ten crime locations.

def adjust_plot(ax):
    # rc settings take effect for plots created after this call
    plt.rc('xtick',labelsize=22)
    plt.rc('ytick',labelsize=22)

    # collect the bar heights from the plotted patches
    totals = []
    for i in ax.patches:
        totals.append(i.get_height())

    # total across the plotted bars only (not the whole dataset)
    total = sum(totals)

    # label each bar with its percentage of that total
    for i in ax.patches:
        # get_x pulls left or right; get_height pushes up or down
        ax.text(i.get_x()-.03, i.get_height()+.5,
                str(round((i.get_height()/total)*100, 2))+'%', fontsize=15,
                color='black')

ax = data.Location.value_counts().head(10).plot(kind='bar',figsize=(15,6),title='Crime Location in London city')
plt.title("Crime Location in London 2015-2017", fontname='Ubuntu', fontsize=18,
            fontstyle='italic', fontweight='bold',color='green')
adjust_plot(ax)
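Note that the percentages written by adjust_plot are relative to the bars that are plotted, not to all records. If shares of the whole dataset are wanted, the total can be passed in explicitly. The sketch below is one possible variant; `annotate_share` and its `grand_total` parameter are hypothetical names, and the counts are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def annotate_share(ax, grand_total):
    """Label each bar with its height as a percentage of grand_total."""
    labels = []
    for bar in ax.patches:
        label = f"{bar.get_height() / grand_total * 100:.1f}%"
        # Centre the label horizontally on the bar, just above its top edge.
        ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height(), label,
                ha="center", va="bottom")
        labels.append(label)
    return labels


counts = {"A Street": 30, "B Street": 20}  # made-up counts for two locations
fig, ax = plt.subplots()
ax.bar(list(counts.keys()), list(counts.values()))

# 100 is a made-up total over all locations, including those not plotted.
labels = annotate_share(ax, grand_total=100)
print(labels)
```

With the real data, `grand_total` could be `len(data)` so that each label reads as a share of all 18,000 records.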

The location was unknown for 871 records. Aside from 'No Location', the most frequent locations were 'On or near Police Station' and 'On or near Pedestrian Subway', with 607 records each.

Let's visualize the five locations with the fewest recorded crimes.

data.Location.value_counts().tail(5).plot(kind='bar',figsize=(13,5))

<matplotlib.axes._subplots.AxesSubplot at 0xacba93ec>

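The five locations at the tail of the list are Banner Street, Upper Ground, Haven Quays, Weaver's Lane and Bartlett Court, each with a single recorded crime. Many other locations also have exactly one record, so which five appear here depends on how ties are ordered.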
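Because so many locations are tied at one record, it may be more informative to list every location that shares the minimum count rather than an arbitrary five. A minimal sketch on made-up counts (the street names are placeholders):

```python
import pandas as pd

# Made-up counts per location, similar in shape to value_counts() output.
counts = pd.Series(
    {"Street A": 5, "Street B": 2, "Street C": 1, "Street D": 1, "Street E": 1}
)

minimum = counts.min()

# All locations tied at the minimum, not just the last few rows.
least = counts[counts == minimum]

print(minimum, len(least))
print(least.index.tolist())
```

On the real data, `counts` would be `data.Location.value_counts()`.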

Let's check the crime types.

data['Crime type'].value_counts()
Other theft                     4081
Anti-social behaviour           3199
Violence and sexual offences    2536
Shoplifting                     1911
Theft from the person           1270
Drugs                            996
Bicycle theft                    934
Burglary                         665
Public order                     661
Criminal damage and arson        659
Other crime                      470
Vehicle crime                    402
Possession of weapons            111
Robbery                          105
Name: Crime type, dtype: int64

Other theft, anti-social behaviour, and violence and sexual offences are the most prevalent crime types.

ax = data['Crime type'].value_counts().plot(kind='bar',figsize=(15,6),title='Crime type in London 2015-2017')
plt.title("Crime Type in London 2015-2017", fontname='Ubuntu', fontsize=18,
            fontstyle='italic', fontweight='bold',color='green')
adjust_plot(ax)

Locations where violence and sexual offences occur most often

data[data['Crime type'] == 'Violence and sexual offences']['Location'].value_counts()
No Location                                178
On or near Great St Helen'S                104
On or near Police Station                  101
On or near Supermarket                      78
On or near Pedestrian Subway                75
On or near Nightclub                        65
On or near Parking Area                     64
On or near Conference/Exhibition Centre     55
On or near Blomfield Street                 53
On or near Queen Victoria Street            51
On or near Leadenhall Street                50
On or near Fish Street Hill                 43
On or near Shopping Area                    34
On or near Philpot Lane                     33
On or near Moorgate                         32
On or near Wormwood Street                  30
On or near Gracechurch Street               30
On or near St Martin'S Le Grand             29
On or near Finch Lane                       29
On or near Bride Lane                       29
On or near Watling Court                    28
On or near St Swithin'S Lane                27
On or near New Broad Street                 25
On or near New Change                       23
On or near Wood Street                      23
On or near Mark Lane                        21
On or near Moor Lane                        20
On or near Camomile Street                  20
On or near Distaff Lane                     20
On or near Creed Lane                       19
                                          ... 
On or near Arthur Street                     1
On or near Fournier Street                   1
On or near Cloth Street                      1
On or near Old Seacoal Lane                  1
On or near Lloyd'S Avenue                    1
On or near Bell Yard                         1
On or near Sandy'S Row                       1
On or near Cursitor Street                   1
On or near Amen Corner                       1
On or near Norton Folgate                    1
On or near St Dunstan'S Hill                 1
On or near St Mary Axe                       1
On or near Copthall Avenue                   1
On or near Leadenhall Place                  1
On or near Monkwell Square                   1
On or near Tudor Street                      1
On or near Lombard Lane                      1
On or near Portsoken Street                  1
On or near Pilgrim Street                    1
On or near Nun Court                         1
On or near Temple Lane                       1
On or near South Place Mews                  1
On or near John Carpenter Street             1
On or near Basinghall Street                 1
On or near King'S Arms Yard                  1
On or near Charterhouse Street               1
On or near Moorgate Place                    1
On or near Old Square                        1
On or near Billiter Street                   1
On or near Charterhouse Mews                 1
Name: Location, Length: 256, dtype: int64
violent_crime = data[data['Crime type'] == 'Violence and sexual offences']['Location'].value_counts()
ax = violent_crime.head().plot(kind='bar',figsize=(15,6),title='Location of Violence and Sexual Offences')
plt.title("Location of Violence and Sexual offences 2015-2017", fontname='Ubuntu', fontsize=18,
            fontstyle='italic', fontweight='bold',color='green')
adjust_plot(ax)
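The filter-then-count step above handles one crime type at a time. To compare all crime types at once, a cross-tabulation of crime type against location is one option. The sketch below uses a few made-up records with placeholder location names:

```python
import pandas as pd

records = pd.DataFrame({
    "Crime type": ["Burglary", "Burglary", "Burglary", "Drugs", "Drugs"],
    "Location": ["X Lane", "X Lane", "Y Road", "Y Road", "Y Road"],
})

# Rows are crime types, columns are locations, cells are record counts.
table = pd.crosstab(records["Crime type"], records["Location"])

# For each crime type, the location with the highest count.
top_location = table.idxmax(axis=1)

print(table)
print(top_location)
```

On the real data the 'No Location' rows would probably need to be filtered out first, since they lead several of the counts shown above.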

Analysis of crime by last outcome category

data['Last outcome category'].value_counts()
Investigation complete; no suspect identified          7568
Unable to prosecute suspect                            1644
Status update unavailable                              1613
Court result unavailable                                783
Offender given a caution                                591
Under investigation                                     531
Offender sent to prison                                 443
Offender given a drugs possession warning               249
Local resolution                                        245
Offender fined                                          198
Offender given suspended prison sentence                143
Offender given community sentence                       138
Formal action is not in the public interest             129
Defendant found not guilty                              116
Awaiting court outcome                                  104
Offender given penalty notice                            85
Offender given conditional discharge                     81
Court case unable to proceed                             42
Offender otherwise dealt with                            27
Suspect charged as part of another case                  22
Further investigation is not in the public interest      21
Offender deprived of property                            12
Action to be taken by another organisation                7
Offender ordered to pay compensation                      5
Defendant sent to Crown Court                             3
Offender given absolute discharge                         1
Name: Last outcome category, dtype: int64
ax = data['Last outcome category'].value_counts().head(10).plot(kind='bar',figsize=(16,7),title='Crime Outcome')
adjust_plot(ax)

The most common outcome, with 7,568 records, is that the investigation was completed without a suspect being identified.
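One caveat when turning these counts into percentages: value_counts() skips missing values by default, and the info() output earlier shows only 14,801 non-null outcomes out of 18,000 rows. A small sketch of the difference, using made-up outcome labels:

```python
import pandas as pd

# Made-up outcomes; None stands for a record with no outcome recorded.
outcomes = pd.Series(["No suspect", "No suspect", "Caution", None])

# Shares among records that have an outcome.
share_of_known = outcomes.value_counts(normalize=True)

# Shares among all records, with missing outcomes counted as their own group.
share_of_all = outcomes.value_counts(normalize=True, dropna=False)

print(share_of_known)
print(share_of_all)
```

Which denominator is appropriate depends on the question being asked, so it is worth stating it next to any percentage.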

Time series analysis of crime

crime_time = data['Month']
crime_time.head()
0   2015-01-01
1   2015-01-01
2   2015-01-01
3   2015-01-01
4   2015-01-01
Name: Month, dtype: datetime64[ns]
crime_time = pd.DataFrame(crime_time)
#pd.to_datetime(crime_time['Month'],yearfirst=True)
#crime_time['Month'] = pd.to_datetime(crime_time['Month'],yearfirst=True)
crime_time.head()
Month
0 2015-01-01
1 2015-01-01
2 2015-01-01
3 2015-01-01
4 2015-01-01
crime_time['value'] = 0
crime_time['Month'] = pd.to_datetime(crime_time['Month'],yearfirst=True)
crime_time.set_index(crime_time.Month,inplace=True)
del crime_time['Month']
crime_time.head()
value
Month
2015-01-01 0
2015-01-01 0
2015-01-01 0
2015-01-01 0
2015-01-01 0
crime_t = crime_time.resample('M')
crime_t
DatetimeIndexResampler [freq=<MonthEnd>, axis=0, closed=right, label=right, convention=start, base=0]
#
crime_per_month = crime_t.count()
ax = crime_per_month.plot(kind='line',figsize=(16,7),title='Monthly Time Series (2015-2017)')
adjust_plot(ax)

The plot above shows that recorded crime peaked around March 2017.
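The peak month can also be read off programmatically instead of from the plot. The sketch below is an alternative to the resample approach, shown on a few made-up dates: it counts records per calendar month with `dt.to_period` and picks the busiest one with `idxmax`.

```python
import pandas as pd

# Made-up month values in the same 'YYYY-MM' form as the Month column.
months = pd.to_datetime(
    pd.Series(["2015-01", "2015-01", "2015-02", "2017-03", "2017-03", "2017-03"]),
    format="%Y-%m",
)

# Count records per calendar month, in chronological order.
per_month = months.dt.to_period("M").value_counts().sort_index()

# The month with the most records.
peak_month = per_month.idxmax()

print(per_month)
print(peak_month)
```

Unlike resample, this only lists months that have at least one record, so gaps in the data would not show up as zeros.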

crime_w = crime_time.resample('Y')
crime_w = crime_w.count()
crime_w.index.name = 'Year'
crime_w.head()
value
Year
2015-12-31 6185
2016-12-31 6586
2017-12-31 5229
ax = crime_w.plot(figsize=(17,6),title='Crime per year (2015-2017)')
adjust_plot(ax)

More crimes were recorded in 2016 (6,586) than in 2015 (6,185) or 2017 (5,229), although the single busiest month, around March, fell in 2017.

The monthly plot also suggests that recorded crime trended upward through 2015.
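To put the yearly comparison into numbers, the year-over-year change can be computed from the three totals reported above. This is a short sketch using `pct_change`; only the three counts come from the analysis, the variable names are illustrative.

```python
import pandas as pd

# Yearly totals taken from the resampled output above.
per_year = pd.Series({2015: 6185, 2016: 6586, 2017: 5229})

# Percentage change relative to the previous year (first entry is NaN).
change_pct = per_year.pct_change() * 100

print(change_pct.round(2))
```

The sign of each value shows the direction of the change: positive for an increase over the previous year, negative for a decrease.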



Mustapha Omotosho

constant learner,machine learning enthusiast,huge Barcelona fan
