Reading Data for Travel Analysis into R

The last post covered the basics of getting set up for an analysis in R. Next, we need some data to analyze. Thanks to the internet we can find data on just about anything, including historical weather trends for almost any location. In this case we’re going to use data from Weather Underground. They provide a number of variables, but the raw format would be difficult to use:

MST,Max TemperatureF,Mean TemperatureF,Min TemperatureF,Max Dew PointF,MeanDew PointF,Min DewpointF,Max Humidity, Mean Humidity, Min Humidity, Max Sea Level PressureIn, Mean Sea Level PressureIn, Min Sea Level PressureIn, Max VisibilityMiles, Mean VisibilityMiles, Min VisibilityMiles, Max Wind SpeedMPH, Mean Wind SpeedMPH, Max Gust SpeedMPH,PrecipitationIn, CloudCover, Events, WindDirDegrees
2013-1-1,30,15,-1,17,4,-6,95,73,50,30.38,30.22,30.08,10,8,0,18,6,23,0.00,2,Fog,61
2013-1-2,31,22,12,7,5,1,76,54,31,30.27,30.22,30.15,10,10,10,17,10,26,0.00,0,,80
2013-1-3,25,20,15,6,3,-1,59,49,42,30.35,30.25,30.13,10,10,10,29,18,38,0.00,0,,61
2013-1-4,37,18,-1,7,2,-5,80,53,26,30.45,30.34,30.20,10,10,10,13,3,17,0.00,0,,48
2013-1-5,49,25,0,15,2,-7,83,47,11,30.51,30.39,30.30,10,10,10,14,1,16,0.00,0,,50
2013-1-6,41,24,7,11,7,3,80,54,27,30.35,30.16,29.94,10,10,10,22,4,26,0.00,0,,205
2013-1-7,36,24,11,15,10,5,76,58,39,30.10,30.00,29.90,10,10,10,21,7,25,0.00,0,,68
2013-1-8,45,25,4,19,10,1,88,56,24,30.26,30.14,30.07,10,10,10,24,9,32,0.00,0,,67
2013-1-9,47,34,20,20,18,16,84,58,31,30.33,30.20,30.09,10,10,10,20,7,29,0.00,0,,61
2013-1-10,38,29,19,27,19,13,92,68,43,30.05,29.75,29.58,10,7,0,36,15,51,0.05,5,Fog-Snow,220
2013-1-11,24,12,-1,18,5,-9,88,56,24,30.03,29.81,29.66,10,7,1,22,10,33,T,4,Snow,251

Fortunately, the “xlsx” package we loaded in the last post has some tools to help us read this information into R. First, copy and paste the data above into an Excel file named Sedona_temps.xlsx, on a sheet labeled “Sedona_temps”. If everything lands in a single column, use Excel’s Text to Columns tool to split it on commas. Then we will write a command in R that reads the table into the program so you can analyze it further. The code below defines an object named “temps” that holds the data from the “Sedona_temps” sheet of the Sedona_temps.xlsx file. It also tells R that the data has headers. If header were set to FALSE, the variable names would be read in as the first row of data instead of becoming the column headers.


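library(xlsx)  # loaded in the last post; repeated so this snippet runs on its own
temps <- read.xlsx("Sedona_temps.xlsx", sheetName = "Sedona_temps", header = TRUE)
head(temps)  # print the first six rows to confirm the import worked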
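
To see what the header setting actually changes, here is a minimal sketch that reads the same text twice. It uses base R’s read.csv on an inline string rather than read.xlsx, so it runs without an Excel file. The two columns and two rows are trimmed from the sample above, and the object names with_header and without_header are made up for illustration.

```r
# Two columns and two rows taken from the sample above (trimmed for brevity)
raw <- "MST,Max TemperatureF
2013-1-1,30
2013-1-2,31"

with_header    <- read.csv(text = raw, header = TRUE)
without_header <- read.csv(text = raw, header = FALSE)

nrow(with_header)      # 2 data rows; column names come from the header line
nrow(without_header)   # 3 rows; the header text has become the first row
names(without_header)  # R falls back to the default names V1 and V2
```

With header = FALSE, every column also ends up stored as text, because the variable names are mixed in with the numbers.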

Now, in the Environment pane in the upper right-hand corner of RStudio, you should see an object named “temps” alongside a description of the data table that tells you the number of observations (rows) and variables (columns).
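
The same row and column counts can be checked from the console. The sketch below builds a small stand-in data frame (temps_demo, a hypothetical name) from three columns of the first three sample rows, so it runs without the Excel file. On your own machine you would call the same functions on temps.

```r
# Stand-in for `temps`: three columns from the first three rows of the sample
temps_demo <- data.frame(
  MST = c("2013-1-1", "2013-1-2", "2013-1-3"),
  Max.TemperatureF = c(30, 31, 25),
  Events = c("Fog", NA, NA)   # blank Events cells are shown here as NA
)

dim(temps_demo)   # number of rows, then number of columns
str(temps_demo)   # type and first few values of each column
```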

This is just one of many ways to read data into R. Another option would be to save the data as a .txt file and use this command to read the file:


temps <- read.table(file = "Sedona_Temp_Data.txt", header = TRUE, sep = ",", quote = "")
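
Two things tend to surprise people after a text import like this: column names with spaces are rewritten with dots, and the “T” in the precipitation column forces that whole column to be read as text. The sketch below shows both on two rows trimmed from the sample above, passed inline through the text argument so no file is needed. Treating “T” as zero measurable precipitation is an assumption made for illustration, and demo is a hypothetical object name.

```r
# Four columns from the last two sample rows (trimmed for brevity)
raw <- "MST,Max TemperatureF,PrecipitationIn,Events
2013-1-10,38,0.05,Fog-Snow
2013-1-11,24,T,Snow"

demo <- read.table(text = raw, header = TRUE, sep = ",", quote = "",
                   stringsAsFactors = FALSE)

names(demo)                  # "Max TemperatureF" is read as "Max.TemperatureF"
class(demo$PrecipitationIn)  # "character", because of the "T" entry

# Assumption: treat "T" as zero measurable precipitation, then convert
demo$PrecipitationIn[demo$PrecipitationIn == "T"] <- "0"
demo$PrecipitationIn <- as.numeric(demo$PrecipitationIn)

# Convert the date column from text to a proper Date
demo$MST <- as.Date(demo$MST, format = "%Y-%m-%d")
```

If you would rather keep those entries distinguishable from true zeros, replacing “T” with NA instead is an equally reasonable choice.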

If you want to see what the data looks like, simply type this command to open a new tab that shows the raw data:


View(temps)

Next up we’re going to work on making sense of this data with some of R’s analytic tools.
