The most basic action that you will perform in R is “assignment”. The
assignment operator is <- or =. Assignment takes what is on the right
side of the assignment operator and “stores” it in the “variable” that
is on the left side of the operator. x <- 1 means that x will function
as 1 until you change it.
x <- 1
x #You can print the information stored in any variable by simply entering that variable into the command prompt
## [1] 1
x + 1
## [1] 2
x + 2
## [1] 3
x - 1
## [1] 0
x <- 3
x + 1
## [1] 4
x * 2
## [1] 6
x / 2
## [1] 1.5
y <- 2
y + x
## [1] 5
z <- y + x
z + 2
## [1] 7
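The text above mentions that = also works as an assignment operator. As a small sketch (the names apples and oranges are made up for this illustration), both operators store a value in the same way when typed at the command prompt:

```r
apples <- 5            # assignment with the arrow operator
oranges = 5            # assignment with the equals sign
apples == oranges      # compares the two stored values: TRUE
apples <- apples + 1   # the right side is evaluated first, then stored in apples
apples                 # now 6
```

Many R style guides recommend <- for assignment and keep = for naming arguments inside function calls, so the rest of these notes stick with <-.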
You can also store text (a character string) in a variable:
x <- "Hello world"
x
## [1] "Hello world"
It is also possible to store multiple numbers into a variable.
x <- 1:3
x
## [1] 1 2 3
x + 2
## [1] 3 4 5
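When a variable holds several numbers, arithmetic is applied to every element. The sketch below assumes a second variable y built with c(), a base R function that combines values and that has not been introduced yet at this point:

```r
x <- 1:3
y <- c(10, 20, 30)   # c() combines several values into one variable
x * 2                # each element is doubled: 2 4 6
x + y                # elements are added position by position: 11 22 33
length(x)            # the number of elements stored in x: 3
```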
We’ll spend more time on what types of elements can be stored in
variables, but for now, let’s discuss the variable name itself. So far,
we’ve been using x and y a lot, but those are definitely not the only
variable names you can use. A variable name can include any letter in
any arrangement.
variable <- 1
elbairav <- 2
v <- 3
variablevariable <- 4
But be careful: R is case sensitive.
variable <- 1
VARIABLE <- 2
variable
## [1] 1
casematters <- "hello"
CaseMatters <- "world"
CaseMatters == casematters
## [1] FALSE
You can also use numbers and the special characters . and _.
variable1 <- 1
variable2 <- 2
variable.name <- "Howdy Earth"
variable_name <- "Hallo Welt"
Variable names can start with uppercase or lowercase letters but
cannot start with numbers or the underscore _.
1x <- 1
## Error: <text>:1:2: unexpected symbol
## 1: 1x
## ^
_variable <- 1
## Error: <text>:1:2: unexpected symbol
## 1: _variable
## ^
Note: R does allow variable names to start with . as long as it’s not
followed by a number.
.x <- 1
._x <- 2
.2x <- 1
## Error: <text>:1:3: unexpected symbol
## 1: .2x
## ^
However, variables starting with . are hidden.
## this function prints all of the variables you have in your environment.
ls()
## [1] "casematters" "CaseMatters" "elbairav" "v"
## [5] "variable" "VARIABLE" "variable_name" "variable.name"
## [9] "variable1" "variable2" "variablevariable" "x"
## [13] "y" "z"
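Hidden variables still exist; ls() just leaves them out by default. A minimal sketch, using two made-up variable names, of how the all.names argument of ls() reveals them:

```r
.hidden_value <- 1
visible_value <- 2
ls()                   # lists visible_value but not .hidden_value
ls(all.names = TRUE)   # lists both, including names that start with a dot
```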
At this stage, just stick to letters for variable names.
So, how do you choose what to name a variable? This is actually a bit trickier than you would think. R doesn’t care. R is happy with whatever you use (as long as it follows the previously mentioned rules). But remember that R is picky and doesn’t understand things the way humans do. For example, R is perfectly happy to let you do something insane like:
why_is_hello_world_the_phrase_that_is_always_in_intro_to_programming_lessons <- "Hello world"
why_is_hello_world_the_phrase_that_is_always_in_intro_to_programming_lessons
## [1] "Hello world"
Or:
numeric_variable <- "character_string"
is.numeric(numeric_variable)
## [1] FALSE
Or:
character_variable <- 2
wordVariable <- 3
integer.variable <- character_variable/wordVariable
integer.variable
## [1] 0.6666667
is.integer(integer.variable)
## [1] FALSE
R just doesn’t care. But humans do. And humans will read your code (at least one human, you). So variable names like those above work perfectly fine in R, but are hell if you are trying to figure out what is going on. Your code will break, it will produce unexpected results, and you will forget what certain things are doing.
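As a rough illustration of the difference a name makes (all names and numbers here are invented for the example), compare two ways of writing the same calculation:

```r
# Hard to follow: nothing tells you what a, b, and d stand for
a <- 75
b <- 86
d <- b - a

# Easier to follow: each name describes what it stores
control_mean <- 75
treatment_mean <- 86
mean_difference <- treatment_mean - control_mean
mean_difference
```

R computes the same value either way; only the second version tells a human reader what the value means.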
R has a list of names that have special purposes; these are off-limits
as variable names. You can access the full list with ?reserved.
?reserved
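To see what “off-limits” looks like in practice, here is a small sketch that tries to assign to the reserved word TRUE. It wraps the attempt in try(), a base R function that catches the error so the remaining lines still run; the name true_value is just an example of an allowed alternative:

```r
# TRUE is a reserved word, so R refuses to use it as a variable name.
attempt <- try(TRUE <- 1, silent = TRUE)
inherits(attempt, "try-error")   # TRUE means the assignment failed

# A name that is not reserved works as usual.
true_value <- 1
true_value
```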
What if we need to perform some function like finding the mean or standard deviation or even converting Fahrenheit to Celsius?
### 3 temperature converting functions!
# Note: R doesn't care that Celsius is misspelled. As long as your variable names are consistent, R is happy. This is good news for people who are bad spellers. Well, as long as you are consistent in how you misspell words.
# F to C
fahrenheit_to_celcius <- function(fahrenheit){
  (fahrenheit - 32) * 5/9 # R follows the conventional order of operations: Parentheses, Exponents, Multiplication/Division, Addition/Subtraction
}
# C to F
celcius_to_fahrenheit <- function(celcius){
  celcius * (9/5) + 32
}
# Both: output_scale is the scale you want to convert TO
temp_converter <- function(input_temperature = 32, output_scale = "Fahrenheit"){
  if (output_scale == "Fahrenheit") {
    input_temperature * (9/5) + 32 # the input is treated as Celcius
  } else if (output_scale == "Celcius") {
    (input_temperature - 32) * 5/9 # the input is treated as Fahrenheit
  } else {
    # stop() raises the error; errorCondition() would only create an error object without signalling it
    stop("Did you misspell Celcius or Fahrenheit? Please use 'Celcius' or 'Fahrenheit' with first letter capitalized")
  }
}
A “function” is code that is written to perform some function. Isn’t it great when terminology is straightforward?!
In this class, we won’t spend a lot of time creating our own functions (as with the temperature converter), but we will be using a lot of functions, so it’s important to know the basics.
Every function has the following elements:
function_name <- function(argument_1, argument_2, ...) {
  # function body
}
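As a concrete instance of that template, here is a minimal sketch. The function rectangle_area and its arguments are invented for this illustration:

```r
# rectangle_area is the function name; width and height are its arguments
rectangle_area <- function(width, height) {
  width * height   # function body: the last value computed is what the function returns
}

rectangle_area(3, 4)   # 12
```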
The function name exists so that you can easily call the function whenever you need it.
Functions take “arguments” as input. These are the elements that you put inside the parentheses.
x <- 1:5
mean(x) #the variable x is the argument
## [1] 3
Functions can take multiple arguments; in fact, some require multiple arguments. The order of the arguments matters.
seq(5, 85, 20) # seq() creates a sequence of numbers. In this case, from 5 to 85 by 20
## [1] 5 25 45 65 85
seq(20, 85, 5) # From 20 to 85 by 5
## [1] 20 25 30 35 40 45 50 55 60 65 70 75 80 85
You can also specify each argument by name. For most functions, if you name each argument explicitly, the order no longer matters.
seq(from = 5, to = 85, by = 20)
## [1] 5 25 45 65 85
seq(by = 20, to = 85, from = 5)
## [1] 5 25 45 65 85
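Positional and named arguments can also be mixed. The sketch below uses only documented arguments of seq(); length.out is an alternative to by that asks for a number of values instead of a step size:

```r
seq(5, 85, by = 20)          # first two arguments by position, the third by name
seq(5, 85, length.out = 5)   # 5 evenly spaced numbers from 5 to 85
```

Both calls should give the same sequence as the examples above: 5 25 45 65 85.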
Some arguments require an input value and some are set by default.
It’s always best to use ? (for example, ?seq) and read the
documentation before using a function.
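A quick way to see a default in action is round(), whose digits argument defaults to 0. This is only a small sketch; the number 3.14159 is an arbitrary example value:

```r
round(3.14159)               # digits is left at its default of 0, so the result is 3
round(3.14159, digits = 2)   # the default is overridden, so the result is 3.14
args(round)                  # prints the arguments of round(), including any defaults
# ?round                     # opens the documentation (run this at the command prompt)
```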
The action that the function performs is found inside the curly
brackets {}. Unless you want to write your own function or you want to
look inside of a pre-existing function, you don’t need to worry about
this right now.
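If you do get curious, you can look inside a function by typing its name without parentheses. A minimal sketch, using a made-up function called double_it:

```r
double_it <- function(value) {
  value * 2
}

double_it                   # prints the whole function, arguments and body
body(double_it)             # just the code inside the curly brackets
names(formals(double_it))   # just the argument names
double_it(21)               # calling the function as usual gives 42
```

body() and formals() are base R functions; the same approach works on many existing functions, although some built-in ones are written in C and show very little R code.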
The purpose of a function is to produce some result. In the functions we’ve seen so far, the result is a value. These values can be stored in a variable.
x <- seq(from = 1, to = 10, by = 0.5)
mean_of_x <- mean(x)
mean_of_x
## [1] 5.5
# A more realistic (but complicated) example
control_group_scores <- rnorm(n = 100, mean = 75, sd = 10)
treatment_group_scores <- rnorm(n = 100, mean = 86, sd = 7)
group <- gl(n = 2, k = 100, length = 200, labels = c("control", "treatment"))
scores <- c(control_group_scores, treatment_group_scores)
linear_model <- lm(scores ~ group)
anova(linear_model)
## Analysis of Variance Table
##
## Response: scores
##            Df  Sum Sq Mean Sq F value    Pr(>F)
## group       1  5983.8  5983.8  85.133 < 2.2e-16 ***
## Residuals 198 13917.1    70.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(linear_model)
##
## Call:
## lm(formula = scores ~ group)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -23.0627  -4.9536   0.3812   4.6084  21.9716
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)     74.8711     0.8384  89.304   <2e-16 ***
## grouptreatment  10.9397     1.1856   9.227   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.384 on 198 degrees of freedom
## Multiple R-squared: 0.3007, Adjusted R-squared: 0.2971
## F-statistic: 85.13 on 1 and 198 DF, p-value: < 2.2e-16
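Because rnorm() draws random numbers, running the example yourself will most likely give slightly different values from the output shown. If you want the same random numbers every time, one common approach is to call set.seed() first. In this sketch the seed 2021 is an arbitrary choice:

```r
set.seed(2021)                                    # fix the starting point of the random number generator
first_draw <- rnorm(n = 5, mean = 75, sd = 10)

set.seed(2021)                                    # reset to the same starting point
second_draw <- rnorm(n = 5, mean = 75, sd = 10)

identical(first_draw, second_draw)                # TRUE: both draws are exactly the same
```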
Other functions perform actions that are essential to working in R,
but won’t (necessarily) produce an object that you can use. For example,
you can get your working directory by using the function getwd(). You
can change your working directory by using setwd(). list.files() will
produce a list of all of the files in your working directory.
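These three functions can be tried out safely in a scratch folder. The sketch below assumes you are happy to use tempdir(), the temporary folder R creates for each session, and a made-up file name:

```r
original_directory <- getwd()      # remember where we started
setwd(tempdir())                   # move into R's temporary folder for this session
file.create("example_file.txt")    # make an empty file so there is something to list
list.files()                       # the list now includes "example_file.txt"
setwd(original_directory)          # go back to where we started
```

Storing the original directory first makes it easy to return to it afterwards.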
Two of the most useful functions that you will use are
install.packages() and library(). Packages are like “add-ons” to R. A
package contains R functions, example data, and helpful documentation.
If you find yourself thinking “I wonder if there is a way to do this”
the answer is most likely “yes and there is a package that does it”.
You can find the packages hosted on CRAN (the Comprehensive R Archive
Network) at https://cran.r-project.org. Packages that are on this
website are very easy to download and start using. Simply use the
install.packages() function and put the name of the package, in quotes,
as an argument: install.packages("packageName").
The packages you download are stored in your R library. The function
library(), called with no arguments, produces a list of all the
packages currently installed. To use a package you have to load it
first. To load a package, put the package name as an argument into the
library() function: library("packageName"). If you don’t know the path
to your library, you can use .libPaths().
In this class, we will learn to use two packages: “dplyr” and “ggplot2”.
install.packages("dplyr") # must have quotes around the package name
library(dplyr) # quotes are optional
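Installing a package only needs to happen once, while loading happens in every new R session. One possible pattern is to install only when the package is missing. In this sketch, is_installed is a hypothetical helper name (not a built-in function), and "stats", a package that ships with R, stands in for "dplyr" so that nothing is downloaded:

```r
# Returns TRUE when the package can be found in your library, FALSE otherwise.
is_installed <- function(package_name) {
  requireNamespace(package_name, quietly = TRUE)
}

is_installed("stats")          # "stats" comes with R, so this should be TRUE

# Install only when the package is missing, then load it.
if (!is_installed("stats")) {
  install.packages("stats")
}
library(stats)

.libPaths()                    # the folder(s) where installed packages are stored
```

To use the same pattern for this class, replace "stats" with "dplyr" or "ggplot2".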
================================================================================
Last update on 2021-10-11
================================================================================
Copyright © 2022 Dan C. Mann. All rights reserved.