Clean data

GETTING CLEAN DATA: Downloading files

Knowing your working directory:

getwd(): returns the working directory, i.e. the directory you’re currently in.

setwd(): sets a different working directory that you want to move to.
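As a minimal sketch of how the two functions fit together, the snippet below switches to another directory and then returns. The use of tempdir() as the target and the variable names original_dir and moved_dir are assumptions made purely for illustration.

```r
# Remember where we started so we can come back later
original_dir <- getwd()

# Move to a scratch directory (tempdir() is used here only as an example target)
setwd(tempdir())
moved_dir <- getwd()

# Return to the directory we started in
setwd(original_dir)
```

setwd() also invisibly returns the directory you were in before the change, so old <- setwd(somewhere) followed later by setwd(old) is a common alternative.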

Checking for and creating directories:

file.exists("directoryName"): checks whether the directory exists

dir.create("directoryName"): creates the directory if it doesn’t already exist

example (checking for a “data” directory and creating it if it doesn’t exist):

if (!file.exists("data")) {
    dir.create("data")
}

Getting data from the internet – download.file():

Downloads a file from the internet

parameters:

url: the address you’re getting the data from.

destfile: the destination file where the data will be saved.

method: the download method; it may need to be specified, particularly when dealing with https.

Useful for downloading tab-delimited files, CSV files, and Excel files.

Download a file from the web:

fileUrl <- "https://address"

download.file(fileUrl, destfile = "./data/cameras.csv", method = "curl")



  • If the url starts with http you can use download.file()
  • If the url starts with https on Mac you may need to set method = "curl"
  • If the file is big, this might take a while
  • Be sure to record when you downloaded the file
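
One way to follow the last point is to capture the date right after the download finishes. The sketch below is only an illustration: download_with_date is a hypothetical helper (not part of base R), and the URL in the usage comment is the same placeholder used above.

```r
# Hypothetical helper (not part of base R): download a file and
# return a string recording when the download happened.
download_with_date <- function(url, destfile, ...) {
    # Create the destination directory if it does not exist yet
    dest_dir <- dirname(destfile)
    if (!file.exists(dest_dir)) {
        dir.create(dest_dir, recursive = TRUE)
    }

    # Extra arguments such as method = "curl" are passed straight through
    download.file(url, destfile = destfile, ...)

    # date() returns the current date and time as a character string
    date()
}

# Usage (commented out because the URL is only a placeholder):
# dateDownloaded <- download_with_date("https://address",
#                                      "./data/cameras.csv",
#                                      method = "curl")
```

Keeping the returned string next to the data (for example in a variable such as dateDownloaded) makes it easier to tell later which version of the file you were working with.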