5 Data Classes

There are various kinds of R-objects or data structures:

  1. Vectors
  2. Lists
  3. Matrices
  4. Arrays
  5. Factors
  6. Data Frames

Let’s first understand some of the basic datatypes on which the R-objects are built like Numeric, Integer, Character, Factor, and Logical.

5.1 Numeric

num <- 1.2
print(num)
class(num)

5.2 Integer

Integer: Numbers that do not contain decimal values have a data type as an integer. However, to create an integer data type, you explicitly use as.integer() and pass the variable as an argument.

int <- as.integer(2.2)
print(int)
class(int)

5.3 Character

As the name suggests, it can be a letter or a combination of letters enclosed by quotes is considered as a character data type by R.

char <- "datacamp"
print(char)
class(char)
char <- "12345"
class(char)

5.4 Logical

Logical: A variable that can have a value of True and False like a boolean is called a logical variable.

log_true <- TRUE
print(log_true)
class(log_true)
log_false <- FALSE
class(log_false)

5.5 Factor

Factor: They are a data type that is used to refer to a qualitative relationship like colors, good & bad, course or movie ratings, etc. They are useful in statistical modeling.

fac <- factor(c("good", "bad", "ugly","good", "bad", "ugly"))
print(fac)
class(fac)
levels(fac)
nlevels(fac)
class(levels(fac))

5.6 List

Unlike vectors, a list can contain elements of various data types and is often known as an ordered collection of values. It can contain vectors, functions, matrices, and even another list inside it (nested-list).

lis1 <- seq(1,5)     # Integer Vector
lis2 <- factor(1:5)  # Factor Vector
lis3 <- letters[1:5] 
combined_list <- list(lis1, lis2, lis3)

combined_list[[1]]
combined_list[[2]]
combined_list[[3]]
combined_list[[3]][5]

flat_list <- unlist(combined_list)
class(flat_list)

5.7 Vectors

Vectors are an object which is used to store multiple information or values of the same data type. A vector can not have a combination of both integer and character. For example, if you want to store 100 students’ total marks, instead of creating 100 different variables for each student, you would create a vector of length 100, which will store all the student marks in it.

marks <- c(88,65,90,40,65)
marks[4]
marks[1]
marks[6]

5.7.1 Slicing vectors

marks[seq(1,4)]
marks[c(1,2,4)]
char_vector <- c("a", "b", "c")
class(char_vector)
length(char_vector)
char_vector[1:3]
char_vector[-3]
char_num_vec <- c(1,2, "a")
char_vector <- c("a", "b", "c")
class(char_vector)
length(char_vector)
char_vector[1:3]
char_vector[-3]
vec <- seq(1,1024)
vec <- c(1:1024)

5.8 Matrix

Similar to a vector, a matrix is used to store information about the same data type. However, unlike vectors, matrices are capable of holding two-dimensional information inside it.

M <- matrix(vector, 
            nrow=r, 
            ncol=c, 
            byrow=FALSE, 
            dimnames=list(char_vector_rownames, char_vector_colnames))

# byrow=TRUE signifies that the matrix should be filled by rows. 
# byrow=FALSE indicates that the matrix should be filled by columns (the default).
M <- matrix(seq(1,6), 
            nrow = 2, ncol = 3, byrow = TRUE)

M <- matrix(seq(1,6), 
            nrow = 2, ncol = 3, byrow = F)

5.9 DataFrame

Unlike a matrix, Data frames are a more generalized form of a matrix. It contains data in a tabular fashion. The data in the data frame can be spread across various columns, having different data types.

dataset <- data.frame(
   Person = c("Aditya", "Ayush","Akshay"),
   Age = c(26, 26, 27),
   Weight = c(81,85, 90),
   Height = c(6,5.8,6.2),
   Salary = c(50000, 80000, 100000))

class(dataset)
nrow(dataset) 
ncol(dataset) 
df1 <- rbind(dataset, dataset) 
df2 <- cbind(dataset, dataset)
head(df1,3) 
str(dataset) 
summary(dataset)