GETTING CLEAN DATA: Reading local flat files

Reading local CSV files

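# Create a data directory if one does not already exist
if (!file.exists("data")) {
    dir.create("data")
}

# fileUrl must already hold the URL of the CSV file
download.file(fileUrl, destfile = "./data/cameras.csv", method = "curl")

# Record when the file was downloaded
dateDownloaded <- date()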

The data have now been downloaded from the website and are stored locally on my computer.

The most common way to load flat files into R is the read.table function:

Loading flat files – read.table():

Example:

cameraData <- read.table("./data/cameras.csv", sep = ",", header = TRUE)
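To make the call above easier to try without the downloaded file, here is a minimal sketch that writes a tiny CSV to a temporary file and reads it back. The column names and rows are hypothetical and are not the real camera data. It also shows read.csv, which is read.table with sep = "," and header = TRUE already set as defaults.

```r
# Write a small, made-up CSV to a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,street,direction",
             "1,First St,N",
             "2,Second St,S"), tmp)

# read.table needs the separator and the header flag spelled out.
viaTable <- read.table(tmp, sep = ",", header = TRUE)

# read.csv sets sep = "," and header = TRUE by default.
viaCsv <- read.csv(tmp)

# For simple data like this, both calls should give the same data frame.
print(all.equal(viaTable, viaCsv))
```

For a plain comma-separated file, read.csv is usually the shorter way to write the same thing.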

Some important parameters:
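The notes do not list the parameters here, so the following is only a sketch of a few read.table arguments that often matter in practice: skip, nrows, quote, and na.strings. The file contents and the "missing" marker are made up for illustration.

```r
# A made-up file with a junk first line, an apostrophe in a value,
# and a custom marker for missing values.
tmp <- tempfile(fileext = ".csv")
writeLines(c("exported by a hypothetical tool",
             "name,count",
             "O'Brien St,3",
             "Main St,missing",
             "Park Ave,7"), tmp)

partial <- read.table(tmp,
                      sep = ",",              # fields are comma separated
                      header = TRUE,          # first line read holds column names
                      skip = 1,               # ignore the junk line at the top
                      nrows = 2,              # read only the first two data rows
                      quote = "",             # treat ' and " as ordinary characters
                      na.strings = "missing") # this marker becomes NA

print(partial)
```

Setting quote = "" is a common fix when apostrophes or stray quotation marks in the data cause rows to be merged or dropped.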

Reading Excel files

Download the Excel file to load:

if (!file.exists("data")) { dir.create("data") }

# fileUrl must already hold the URL of the Excel file
download.file(fileUrl, destfile = "./data/cameras.xlsx", method = "curl")

dateDownloaded <- date()

The R library that is useful for this is the xlsx package.

library(xlsx)

cameraData <- read.xlsx("./data/cameras.xlsx", sheetIndex = 1, header = TRUE)

You can read specific rows and specific columns.

colIndex <- 2:3

rowIndex <- 1:4

cameraDataSubset <- read.xlsx("./data/cameras.xlsx", sheetIndex = 1, colIndex = colIndex, rowIndex = rowIndex)

**Notes**

The write.xlsx function will write out an Excel file with similar arguments.
read.xlsx2 is much faster than read.xlsx, but it may be slightly unstable when reading subsets of rows.
The XLConnect package has more options for writing and manipulating Excel files.
The XLConnect vignette is a good place to start for that package.
In general, it is advised to store your data in a database, in comma-separated files (.csv), or in tab-separated files (.tab/.txt), so they are easier to distribute.
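As a sketch of the last note, a data frame can be saved as a comma-separated or tab-separated file with base R alone. The data frame below is hypothetical and simply stands in for data that was read from Excel.

```r
# A small, made-up data frame standing in for data read from an Excel file.
cameras <- data.frame(street = c("First St", "Second St"),
                      count = c(2L, 5L))

csvPath <- tempfile(fileext = ".csv")
tabPath <- tempfile(fileext = ".txt")

# Comma separated; row.names = FALSE avoids writing an extra index column.
write.csv(cameras, csvPath, row.names = FALSE)

# Tab separated, without quotes around the text values.
write.table(cameras, tabPath, sep = "\t", row.names = FALSE, quote = FALSE)

# Both files can be read back with the functions covered earlier.
fromCsv <- read.csv(csvPath)
fromTab <- read.table(tabPath, sep = "\t", header = TRUE)
```

Either file can be opened in a text editor or a spreadsheet program, which is what makes these formats easy to distribute.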

Amateur Scientist