2.7 Other Useful Things
Alright, we now have a basic understanding of the most fundamental operations in R programming, most of which are things we will frequently use in this course. Next, we’ll introduce a few more useful concepts and some commonly used functions.
2.7.1 Workspace
In R, the workspace refers to the environment where all objects (such as variables, functions, and data) are stored during an R session. It acts as a storage area that retains the data and objects you create, allowing you to work with them without needing to re-import or redefine them every time you start R. The most common scenario is when you’ve worked hard all day and want to take a break, but if you close R, all the objects in your working environment (memory) will disappear. In this case, you can save your current working environment as a workspace file, which has a .RData extension.
There are two ways to save your working environment as a workspace file. With the mouse, you can click Session -> Save Workspace As... in RStudio. Alternatively, you can use the command
save.image("FileName.RData")
The next day, after enjoying the morning sunshine (if conditions permit) and your coffee, you can load this file with load("FileName.RData") and continue your hard work!
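The save-and-load round trip can be sketched as follows. This is only an illustration: the temporary file path `ws_file` and the object `x` are made up for the example, and the code is assumed to be run at the top level of an R session.

```r
# Assumption: a temporary file stands in for "FileName.RData"
ws_file = file.path(tempdir(), "demo_workspace.RData")

x = 1:5               # an object we want to keep
save.image(ws_file)   # write all objects in the workspace to the file

rm(x)                 # simulate closing R: x is gone
load(ws_file)         # restore the saved objects from the file
x                     # x is available again
```

Note that save.image saves the global environment, so everything currently in your workspace ends up in the file, not only `x`.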
2.7.2 Packages
If R could only be used for scientific computing, it would undoubtedly be overshadowed by numerous other scientific computing programs. The true strength of R lies in its extensibility, which is achieved through R packages. Initially, R packages were written primarily by statisticians to implement new methods, such as lme4 for fitting generalized linear mixed-effects models, survival for survival analysis, psych for psychological research, and so on. However, writing packages is not exclusive to statisticians, and a growing number of packages for non-statistical applications have been developed. Through these extensions, R has become incredibly versatile; for example, this website is written with Quarto. Below, we briefly illustrate how to install and load packages.
install.packages("kernlab")
# to install a new package. Note: the quotation marks are essential.
library(kernlab)
# you can import a package with the function `library`
2.7.3 Useful Functions
Next, some useful functions are introduced. These functions were extremely useful back when I was a student. However, in the era of RStudio, their usefulness has been greatly reduced. Nonetheless, they are still quite necessary for those who prefer keyboard operations or need to work on a server. In addition, these functions can, to some extent, enhance R users’ understanding of R programming.
ls function: it can list all the objects in the workspace or current environment.
rm function: it can help us to remove objects from the workspace or current environment.
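A minimal sketch of the two working together, where ls is used to check what rm has removed. The objects `x` and `y` and the three result variables are throwaway names introduced only for this illustration.

```r
# x and y are throwaway objects created only for this illustration
x = 1
y = "hello"

had_x_before = "x" %in% ls()   # TRUE: x is listed among the current objects
rm(x)
has_x_after = "x" %in% ls()    # FALSE: x has been removed
has_y_after = "y" %in% ls()    # TRUE: y was not touched
```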
# Example 1
x = 1
rm(x)
# Example 2
rm(list = ls()) # Danger Warning: This command will remove all objects listed by `ls`
str function: it displays the structure of an object.
# Example 1
x = list()
x[[1]] = 1:10
x[[2]] = letters[4:10]
str(x)
# Example 2
res = t.test(rnorm(30)) # do one sample t-test and save results in `res`
str(res)
# You can see that the test results are saved in a list of 10.
# If you want to extract elements from it, the information conveyed by `str` is ideal.
summary function: it helps us summarize useful information from an R object. The information extracted depends on the type of the object. For example:
# Example 1
dat = iris[,-5] # we use the first 4 variable from iris data
summary(dat) # `dat` is a data frame, so the summary gives descriptive statistics for each column
# Example 2
res = t.test(rnorm(30))
summary(res)
# `res` holds the results of a t-test. The designer of this function decided
# to show the names of all the elements in `res`, similar to the output of `str`
unique and table functions: they are useful when you want to check all possible values of a variable and how often each value occurs.
# First, we create a small demo dataset
treatment = c(1,1,0,0,1,1,0,0)
block = c(1,1,1,1,2,2,2,2)
sex = c("F","M","M","M","F","F","M","F")
age = c(19,20,28,22,21,19,23,20)
outcome = c(20,19,33,12,54,87,98,84)
Dat = data.frame(treatment, block, sex, age, outcome)
head(Dat, 8)
# Example 1:
unique(Dat$sex)
table(Dat$sex)
unique(Dat$age)
table(Dat$age)
# Example 2:
table(Dat$sex, Dat$treatment) # do you know the name of this kind of output?
which function: it finds the indices of the elements that satisfy some condition in a vector, matrix, or data frame.
# Example: Use the same demo data above
which(Dat$sex == "M")
which(Dat$age < 21)
apply function: it is used to perform operations on rows or columns of matrices, data frames, or higher-dimensional arrays. It allows you to apply a function across the rows or columns without needing to use loops, making code more concise and often more efficient.
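The loop-free claim can be checked directly. The sketch below compares apply with an explicit for loop; the matrix `m` and the result names are made up for this illustration.

```r
# m is a small made-up matrix: rows are (1, 3, 5) and (2, 4, 6)
m = matrix(1:6, nrow = 2)

# Row sums with an explicit loop
row_sums_loop = numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_sums_loop[i] = sum(m[i, ])
}

# The same row sums with apply: margin 1 means "over rows"
row_sums_apply = apply(m, 1, sum)

# Margin 2 means "over columns": here, the mean of each column
col_means = apply(m, 2, mean)
```

Both approaches give the row sums 9 and 12; the apply version simply states the intent in one line.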
# Syntax:
# apply(X, MARGIN, FUN)
# `MARGIN` is an integer specifying whether to apply `FUN` across rows (1) or columns (2)
# Examples: Use the demo data but ignore the variable `sex`
Dat = Dat[, -3]
apply(Dat, 2, mean)
Next, we list some useful functions for graphics. The ggplot2 package is definitely the top choice for plotting, but sometimes the following functions are more practical and convenient for data visualization. I will only list them below, and you are already strong enough to investigate them by yourself :)
hist function: it can help us check the distribution of a variable.
plot function: it is usually used to show the scatter plot of two variables.
pairs function: it shows the pairwise scatter plots of many variables.
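As a starting point for that investigation, here is a minimal sketch that calls each of the three functions once. It assumes the built-in iris data used earlier in this section; the choice of columns is arbitrary.

```r
# Assumption: the built-in iris data, without the species column
dat = iris[, -5]

hist(dat$Sepal.Length)                    # distribution of one variable
plot(dat$Sepal.Length, dat$Petal.Length)  # scatter plot of two variables
pairs(dat)                                # pairwise scatter plots of all four variables
```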
