Analyzing the impact of weather on Divvy bikes

Divvy is a bike-sharing company based in the City of Chicago. It was launched in 2013 and has grown by leaps and bounds since then. It currently has an annual ridership of around 3.2 million people, with more than six million trips totaling more than 13 million miles ridden. I chose Divvy data for my Data Science course project since I love cycling and have been an enthusiastic rider since the age of 6. I therefore decided to look closely at Divvy's operations and find interesting insights. In particular, I wanted to find out whether the number of Divvy rides per day is related to temperature. I also wondered whether the number of rides on weekends is the same as on weekdays.

Collecting data for Divvy was relatively easy since everything is published on the Divvy website. I collected the data for the first quarter of 2017, i.e. January to March, primarily because the season transitions from winter to spring during this period. In this data, I observed temperatures ranging from 11 degrees Fahrenheit to 78 degrees Fahrenheit. As shown in Figure 1, the temperature varies widely over these three months.

Finding temperature data was a tedious task, since most weather pages provide a graphical representation while I wanted a tabular format. After a few hours of research, I found historical weather data in tabular form on AccuWeather.

The hard part of the entire process was filtering the data. I had to remove the columns from the Divvy data that I didn’t require, such as ‘Station Names’ and ‘Bike ID’. Next I removed ‘Ride Start Date’ and ‘Ride End Date’ while aggregating the data to the daily level to count the number of bike rides per day. I wrote a counter function in Excel to count the number of rides per day. For this I had to add the date in numeric form, since R only runs the tests if the data for the two groups are in the same format. Next I looked at the mean number of rides per day, which was 4796.5, and arranged the counts according to weekends and weekdays. For temperature, on the other hand, I collected the data for each individual month and merged all the files into one Excel worksheet. Extra columns like ‘Average High’ and ‘Average Low’ were removed.

At this point I made sure that the Divvy and temperature data had no sampling bias, since I had taken a census of the whole quarter instead of just collecting a sample. The next step was to merge the weather data with the Divvy data and prepare a master dataset that could then be imported into R. Thus, every piece of information was used with nothing left out, which confirmed that all outliers were accounted for in the data.
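For readers who prefer a script to a spreadsheet, the preparation steps above (counting rides per day, labelling weekdays and weekends, and merging with daily temperature) can be sketched in Python. This is not the Excel workflow used for the project, only a minimal illustration: the function names, the timestamp format, and the sample values are all assumptions made up for the example.

```python
from collections import Counter
from datetime import date, datetime
from statistics import mean


def rides_per_day(start_times, fmt="%m/%d/%Y %H:%M:%S"):
    """Count rides per calendar day from trip start timestamps.

    The timestamp format is an assumption; adjust `fmt` to match the file.
    """
    return Counter(datetime.strptime(ts, fmt).date() for ts in start_times)


def day_type(d):
    """Label a date as 'weekend' (Saturday/Sunday) or 'weekday'."""
    return "weekend" if d.weekday() >= 5 else "weekday"


def build_master(daily_counts, daily_temps):
    """Join daily ride counts with daily temperatures on the date.

    Only dates present in both inputs are kept (an inner join).
    """
    rows = []
    for d in sorted(daily_counts):
        if d in daily_temps:
            rows.append({
                "date": d.isoformat(),
                "day_type": day_type(d),
                "rides": daily_counts[d],
                "temp_f": daily_temps[d],
            })
    return rows


def mean_rides_by_day_type(rows):
    """Average the daily ride count separately for weekdays and weekends."""
    groups = {}
    for row in rows:
        groups.setdefault(row["day_type"], []).append(row["rides"])
    return {label: mean(values) for label, values in groups.items()}


# Made-up toy data, not taken from the Divvy or AccuWeather files.
toy_starts = [
    "01/06/2017 08:15:00",
    "01/06/2017 17:40:00",
    "01/07/2017 10:05:00",
]
toy_temps = {date(2017, 1, 6): 20.0, date(2017, 1, 7): 25.0}

master = build_master(rides_per_day(toy_starts), toy_temps)
for row in master:
    print(row)
print(mean_rides_by_day_type(master))
```

The rows in `master` could be written out with Python's `csv` module and then read into R, which mirrors the "master dataset" idea described in the post.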