UNDERSTANDING DATA
The data set lists crimes in Montgomery County, Maryland, USA. Each crime is recorded with an Incident Number, and the applicable offense code is also populated. The police response time is given for every crime, alongside the address and the longitude and latitude of that address. The original data set has more than 100,000 records. The random sample chosen from this electronically collected data covers three types of crime (Bad Checks, Wire Fraud and Motor Vehicle Theft) across 14 cities, over various months from 2016 to 2018.
The data dictionary provided gives a detailed insight into the metadata and a brief background on how the data is captured in real time. The initial data set captures crimes that occurred at different times in various cities of the State of Maryland since 2013. As a result, the data is geographically specific to a few cities in one state of the United States of America and is not spread globally. The chosen sample, however, tries to capture the most recent data, from 2016 onwards. The purpose of the original data set was to give the public an insight into the crime scenario of Montgomery County. Our purpose as a team, however, is to delve deeper and provide more specific insight, focusing on three types of crime across 14 cities.
We built the context around an experience that one of our teammates, Deepti, had. The house of one of her retired colleagues from the Toastmasters Club in Madison, who is over 65 years old, was broken into. Their car was stolen, along with their wallets, cards and other valuables. They did not use mobile banking, so they visited the bank very often. Keeping them in mind, we wanted our visualization to cater to retired people looking for a house in an area with fewer Motor Vehicle Thefts, Wire Frauds and Bad Checks.
Before describing how we visualized this context, we would like to mention the custom fields created for the visualization.
1. Place of Crime is a custom Data Category field that groups all kinds of Residence into one bucket, all kinds of Parking Lots into another bucket, and so on.
2. Crime Location is a calculated field we created that counts the number of distinct Place of Crime values.
Crime Location = COUNTDISTINCT(Place of Crime)
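The two custom fields above can be sketched outside the visualization tool as follows. This is only a minimal illustration in Python with pandas, not the method we used: the raw place labels, the column names `Place` and `City`, and the sample rows are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw place labels mapped to buckets; the real data set's
# labels may differ.
PLACE_BUCKETS = {
    "Residence - Single Family": "Residence",
    "Residence - Apartment/Condo": "Residence",
    "Parking Lot - Commercial": "Parking Lot",
    "Parking Lot - Residential": "Parking Lot",
    "Street - Residential": "Street",
}


def add_place_of_crime(df, place_col="Place"):
    """Add a 'Place of Crime' column that buckets the raw place labels."""
    out = df.copy()
    # Labels missing from the mapping fall into an 'Other' bucket.
    out["Place of Crime"] = out[place_col].map(PLACE_BUCKETS).fillna("Other")
    return out


def crime_location_by_city(df):
    """Count the distinct 'Place of Crime' buckets seen in each city."""
    counts = df.groupby("City")["Place of Crime"].nunique()
    return counts.rename("Crime Location")


# Made-up rows, for illustration only.
sample = pd.DataFrame(
    {
        "City": ["Rockville", "Rockville", "Rockville", "Germantown", "Germantown"],
        "Place": [
            "Residence - Single Family",
            "Residence - Apartment/Condo",
            "Parking Lot - Commercial",
            "Street - Residential",
            "Street - Residential",
        ],
    }
)

bucketed = add_place_of_crime(sample)
crime_location = crime_location_by_city(bucketed)
print(crime_location)
```

Here the two Residence labels collapse into a single bucket, so the distinct count reflects kinds of places rather than individual raw labels.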
Once we had narrowed down our scenario, we compared the Name of Crime (Bad Checks, Motor Vehicle Theft, Wire Fraud) and the Place of Crime (residence, commercial area, streets, parking lots etc.) with the cities and the Crime Location. The initial data set required a cleaning process. This included selecting only the columns necessary for our visualization and correcting erroneous data by making an educated guess based on other fields such as City, Country and block address, using Location. For example, in some records the State value was populated as 16 when it should have been MD; we identified this from the Zip code and location columns. The final visualization uses a Comparison graph (consisting of a bar and a line) to plot two parameters along the X-axis and compare their values.
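The State correction described above could be expressed roughly as follows. This is a sketch under stated assumptions rather than the exact procedure we followed: the column names are hypothetical, and the ZIP code prefixes treated as Maryland are an assumption chosen for the example.

```python
import pandas as pd

# Hypothetical column names; the real data set may name them differently.
KEEP_COLUMNS = ["Incident ID", "Crime Name", "City", "State", "Zip Code"]

# Assumption: ZIP codes with these three-digit prefixes belong to Maryland.
MD_ZIP_PREFIXES = ["208", "209"]


def clean(df):
    """Keep only the needed columns and repair State values using the ZIP code."""
    out = df[KEEP_COLUMNS].copy()
    zip_prefix = out["Zip Code"].astype(str).str.strip().str[:3]
    is_md_zip = zip_prefix.isin(MD_ZIP_PREFIXES)
    wrong_state = out["State"].astype(str).str.strip() != "MD"
    # Overwrite the State only where the ZIP code indicates Maryland.
    out.loc[is_md_zip & wrong_state, "State"] = "MD"
    return out


# Made-up rows, for illustration only.
raw = pd.DataFrame(
    {
        "Incident ID": [1, 2, 3],
        "Crime Name": ["Bad Checks", "Wire Fraud", "Motor Vehicle Theft"],
        "City": ["ROCKVILLE", "MONTGOMERY VILLAGE", "UNKNOWN"],
        "State": ["MD", "16", "16"],
        "Zip Code": ["20850", "20886", "10001"],
        "Unused Column": ["a", "b", "c"],
    }
)

cleaned = clean(raw)
print(cleaned)
```

Rows whose ZIP code does not match are left unchanged, so an erroneous State value is only replaced when another field supports the correction.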
For the visualization that displays a single number as a key metric along with some relevant data, we created a pivot table on our refined data set, taking the crimes specific to our group for Montgomery Village, MD and the victims across each of them. This gave us the number 3304, which is the number of people who became victims of crime in Montgomery Village, MD. This number could be a useful insight for the administrative department of Montgomery Village, MD when planning future action to prevent such incidents.
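A pivot of this kind, and the single number derived from it, might look like the sketch below. The column names `City`, `Crime Name` and `Victims` are assumptions, and the rows are made up, so the output does not reproduce the figure reported above.

```python
import pandas as pd


def victims_pivot(df):
    """Pivot the number of victims by city (rows) and crime name (columns)."""
    return pd.pivot_table(
        df,
        index="City",
        columns="Crime Name",
        values="Victims",
        aggfunc="sum",
        fill_value=0,
    )


def total_victims(df, city):
    """Return the total number of victims recorded for one city."""
    return int(df.loc[df["City"] == city, "Victims"].sum())


# Made-up rows, for illustration only.
records = pd.DataFrame(
    {
        "City": [
            "MONTGOMERY VILLAGE",
            "MONTGOMERY VILLAGE",
            "MONTGOMERY VILLAGE",
            "ROCKVILLE",
        ],
        "Crime Name": [
            "Bad Checks",
            "Wire Fraud",
            "Motor Vehicle Theft",
            "Wire Fraud",
        ],
        "Victims": [1, 2, 1, 3],
    }
)

pivot = victims_pivot(records)
key_metric = total_victims(records, "MONTGOMERY VILLAGE")
print(pivot)
print(key_metric)
```

Summing one row of the pivot gives the same total as `total_victims`, which is the kind of single key metric the visualization displays.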