4 Data input and output (IO)
4.1 Some considerations:
Keep the names of local files downloaded from the internet or copied onto your computer unchanged. This will help you trace the provenance of the data in the future.
R’s native file format, `.RData`, is read with `load` and written with `save`.
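As a minimal sketch of this round trip, the example below saves two objects to a temporary `.RData` file and restores them with `load`. The object names `counts` and `sites` and the use of `tempfile()` are illustrative assumptions, not part of the original text.

```r
# Illustrative objects (hypothetical names, chosen for this sketch only)
counts <- c(3, 7, 11, 19)
sites <- c("Site.01", "Site.02")

# Write both objects to a temporary .RData file
rdata_path <- tempfile(fileext = ".RData")
save(counts, sites, file = rdata_path)

# Remove the objects from the workspace, then restore them from the file
rm(counts, sites)
restored <- load(rdata_path)  # load() returns the names of the restored objects

restored
counts
```

Note that `load` restores objects under their original names, so it can overwrite objects of the same name that already exist in the workspace.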
4.2 Reading and Writing Files
R offers many ways to read and write files. A good command of them is essential because scientific work begins with data, and most data are found inside files and databases.
Reading input is usually the first step of any significant project.
4.3 Small data
- For very small datasets, it may be preferable to enter the data by hand.
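`c` is a common function used to combine values into a vector:
x <- c(3, 7, 11, 19)
y <- c(1, 1, 1, 1)
c(x, y)  # combine two vectors into one
y <- c(x, y, 5)  # vectors and single values can be mixed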
There is a suite of functions for entering data in the console. The sequence function `seq`:
y <- seq(from = 1, to = 10)
y <- seq(from = 1, to = 10, by = 2.5)
y <- seq(from = 1, to = 10, length.out = 22)
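The `by` and `length.out` arguments control the sequence in different ways, which the short sketch below illustrates. The variable names `by_step` and `by_count` are hypothetical and used only for this example.

```r
# Fixed step size: the sequence stops at the last value that does not exceed `to`
by_step <- seq(from = 1, to = 10, by = 2.5)
by_step    # 1.0 3.5 6.0 8.5

# Fixed number of values: the step is computed so the sequence ends exactly at `to`
by_count <- seq(from = 1, to = 10, length.out = 4)
by_count   # 1 4 7 10

length(by_step)
length(by_count)
```

With `by`, the end point is only reached when the range is an exact multiple of the step. With `length.out`, the end point is always included.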
Use the function `rep` (repeat):
y <- rep(x = 5, times = 4)
x.value <- c(2, 3)
rep(x.value, times = 3)
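Besides `times`, `rep` also accepts an `each` argument. The sketch below contrasts the two; the names `by_times` and `by_each` are hypothetical and used only for illustration.

```r
x.value <- c(2, 3)

# `times` repeats the whole vector
by_times <- rep(x.value, times = 3)
by_times   # 2 3 2 3 2 3

# `each` repeats every element before moving on to the next one
by_each <- rep(x.value, each = 3)
by_each    # 2 2 2 3 3 3
```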
4.4 Practice
Number | Exercise |
---|---|
1. | How many different ways can you create a vector labeled q containing two 3’s and four 5’s? Try some! |
2. | Assign a vector of four elements (3, 7, 9, and 2) to w. |
3. | Assign the third element of w to s, where s is equal to 6. |
4. | What is the length of a sequence that starts at 1.1, ends at 9.2, and has increments of 0.894? |
5. | What is the 3rd value of the sequence you created? |
# Create a data frame using the "data.frame" function
site.name <- c(rep("Site.01", 3), rep("Site.02", 3))
density <- rep(x = 2.3, times = length(site.name))
abundance <- seq(from = 14.5, to = 19.8, length.out = length(site.name))
sampled. <- c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)
y.data.frame <- data.frame(site.name, density, abundance, sampled.)
y.data.frame
class(y.data.frame)
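Once a data frame exists, a few base R functions help to check that it was built as intended. The sketch below rebuilds the data frame from the example above so that it runs on its own; the name `sampled_rows` is a hypothetical addition.

```r
# Rebuild the example data frame so this block is self-contained
site.name <- c(rep("Site.01", 3), rep("Site.02", 3))
density <- rep(x = 2.3, times = length(site.name))
abundance <- seq(from = 14.5, to = 19.8, length.out = length(site.name))
sampled. <- c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)
y.data.frame <- data.frame(site.name, density, abundance, sampled.)

str(y.data.frame)        # column names and types
dim(y.data.frame)        # number of rows and columns: 6 4
y.data.frame$abundance   # extract one column as a vector

# Keep only the rows where `sampled.` is TRUE
sampled_rows <- y.data.frame[y.data.frame$sampled., ]
sampled_rows
```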
4.5 Large(r) data
4.5.1 Read csv
Most scientific work will involve data larger than can be entered by hand.
In this case we will use a suite of commands and different packages to get the data into our environment.
read.csv("./Data/co2.csv")
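To try `read.csv` without the `co2.csv` file at hand, one option is to write a small data frame to a temporary file and read it back. This is only a sketch: the data values and the names `small`, `csv_path`, and `from_csv` are made up for illustration.

```r
# Small illustrative data frame (hypothetical values)
small <- data.frame(site = c("A", "B", "C"), count = c(4, 9, 2))

# Write it to a temporary .csv file without row names
csv_path <- tempfile(fileext = ".csv")
write.csv(small, csv_path, row.names = FALSE)

# Read the file back and assign the result so it can be reused
from_csv <- read.csv(csv_path)
head(from_csv)
str(from_csv)
```

Assigning the result of `read.csv` to a name keeps the data in the environment; calling it without an assignment only prints the contents.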
4.5.2 Read xlsx
- MS Excel files are widely used
install.packages('readxl')  # run once to install the package
library('readxl')  # load the package in each new session
read_xlsx("./Data/Codes.xlsx", sheet = 1)  # read the first sheet
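When the layout of a workbook is unknown, it can help to list its sheets before reading one. The sketch below assumes the `readxl` package is installed and uses the example workbook that ships with it (located via `readxl_example()`) rather than `Codes.xlsx`; the names `xlsx_path`, `sheet_names`, and `first_sheet` are hypothetical.

```r
if (requireNamespace("readxl", quietly = TRUE)) {
  # Path to an example workbook bundled with readxl
  xlsx_path <- readxl::readxl_example("datasets.xlsx")

  # List the sheet names before choosing one
  sheet_names <- readxl::excel_sheets(xlsx_path)
  print(sheet_names)

  # Read a sheet by name rather than by position
  first_sheet <- readxl::read_xlsx(xlsx_path, sheet = sheet_names[1])
  print(head(first_sheet))
} else {
  message("Package 'readxl' is not installed; run install.packages('readxl') first.")
}
```

Referring to a sheet by name is usually more robust than using its position, because sheets can be reordered in Excel.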
4.5.3 Download data from public repository
The website can be found here: "https://www.stats.govt.nz/large-datasets/csv-files-for-download/"
The data are accessed by this URL:
url <- "https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2020-financial-year-provisional/Download-data/annual-enterprise-survey-2020-financial-year-provisional-csv.csv"
You can see that the `url` links to a .csv file. This is a file in an online directory. Access the directory in a web browser: https://www.stats.govt.nz/
destfile <- "./Data/output.csv"
download.file(url, destfile)
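`download.file` fails if the destination folder does not exist, and it downloads the file again on every run. A minimal sketch of one way to guard against both follows. The helper `fetch_if_missing` is hypothetical, and the demonstration uses a local `file://` URL on a Unix-like path so that it runs offline; in practice the web address above would take its place.

```r
# Hypothetical helper: download a file only when it is not already on disk
fetch_if_missing <- function(url, destfile) {
  # Create the destination folder if it does not exist yet
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
  if (!file.exists(destfile)) {
    # Binary mode keeps the file byte-for-byte identical on Windows
    download.file(url, destfile, mode = "wb")
  }
  invisible(destfile)
}

# Offline demonstration: a local file stands in for the remote .csv
src <- tempfile(fileext = ".csv")
write.csv(data.frame(year = c(2019, 2020), value = c(10, 12)), src, row.names = FALSE)
demo_url <- paste0("file://", normalizePath(src, winslash = "/"))

dest <- file.path(tempdir(), "Data", "output.csv")
fetch_if_missing(demo_url, dest)

downloaded <- read.csv(dest)
downloaded
```

Skipping the download when the file already exists keeps the local copy stable, which fits the earlier advice about preserving the provenance of downloaded files.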