path = "C:/Users/User/Documents/R_training"
setwd(path) # tell R to access the file from 'R_training' folder
getwd() # check the file folder
Overview of RStudio
Operators in R
Coding and arithmetic operations
Variable assignment, data types and structures
Installing packages
Getting help in R
Import and explore data
Overview
The basic layout of RStudio has three panels:
Console (entire left) - the interactive panel where you type and run R commands and see their output
Environment/History/Connections (tabbed in upper right) - shows loaded variables and their values
Files/Plots/Packages/Help/Viewer (tabbed in lower right) - displays project files and directories, shows plots, lists installed packages, and provides access to R documentation
Project management in RStudio
In RStudio, a project is a self-contained environment that manages all the files associated with a particular set of analyses or tasks. It’s a powerful tool for organizing your work, maintaining reproducibility, and simplifying collaboration.
Steps to set up
- Create a New Project:
Click on File > New Project. Choose New Directory for a new project or Existing Directory if you want to associate the project with an already existing folder. Follow the prompts to name your project and choose its location on your computer.
- Open Existing Project:
- Navigate to the project directory and open the .Rproj file.
- Using Version Control:
- During the project creation, you can also initialize a Git repository if you want to use version control, which is highly recommended for tracking changes and collaborating with others.
Once a project is set up, RStudio will automatically set your working directory to the project’s root folder each time you open it, which is incredibly convenient for file management and relative paths in your code.
Set working directory
Setting the working directory tells R where to look for files and where to save outputs.
To set the working directory, which is the folder your R session is focused on, you can use either of the following methods:
- Using the setwd() function:
Type setwd("path/to/your/directory") and replace "path/to/your/directory" with the actual path to your folder. Make sure to use forward slashes (/) or double backslashes (\\) in your path, because a single backslash is an escape character in R strings.
- You can also use the graphical interface:
Go to the Session menu at the top of RStudio.
Choose Set Working Directory.
Select Choose Directory… and navigate to the folder you want to set as your working directory.
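As a minimal sketch of the setwd() approach described above, the snippet below switches into a temporary folder and then restores the previous working directory. The folder name R_training_demo is hypothetical and exists only for this illustration; the Windows paths in the comments repeat the example path from this tutorial.

```r
# Remember the current working directory so it can be restored afterwards
old_wd <- getwd()

# Hypothetical folder, created inside R's temporary directory for this demo
demo_dir <- file.path(tempdir(), "R_training_demo")
dir.create(demo_dir, showWarnings = FALSE)

# Point the R session at the demo folder and confirm where we are
setwd(demo_dir)
wd_name <- basename(getwd())
print(wd_name)

# On Windows, both spellings below refer to the same folder:
# setwd("C:/Users/User/Documents/R_training")
# setwd("C:\\Users\\User\\Documents\\R_training")

# Restore the original working directory
setwd(old_wd)
```

Saving and restoring the old directory keeps the rest of a script unaffected by the temporary change.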
Operators in R
Basic arithmetic and logical operators to perform mathematical operations and expressions in R are:
Operator | Description and example |
---|---|
+ addition | Adds two numbers x + y |
- subtraction | Subtracts one number from another x - y |
* multiplication | Multiplies two numbers x * y |
/ division | Divides one number by another x / y |
%% remainder/modulus | Remainder of the division of one number by another x %% y |
^ exponent | Raises a number to a power x^2 |
< Less than | x < y a logical comparison that checks whether each element of the vector x is less than the corresponding element of the vector y. The results are TRUE or FALSE. |
<= Less than or equal to | x <= y a logical comparison that checks whether each element of the vector x is less than or equal to the corresponding element of the vector y. The results are TRUE or FALSE. |
> Greater than | x > y a logical comparison that checks whether each element of the vector x is greater than the corresponding element of the vector y. The results are TRUE or FALSE. |
>= Greater than or equal to | x >= y a logical comparison that checks whether each element of the vector x is greater than or equal to the corresponding element of the vector y. The results are TRUE or FALSE. |
== Equal to | x == y a logical comparison that checks whether each element of the vector x is equal to the corresponding element of the vector y. The results are TRUE or FALSE. |
= / <- Assign variable | <- is the most common assignment operator in R. It is used to assign values to variables. x <- 10 Z <- c(1, 2, 3, 4, 5) = can also be used for assignment and works similarly to <- x = 10 Z = "Hello, World" |
!= Not equal to | != is used for inequality comparison and returns a logical vector of TRUE or FALSE x != y |
& AND | & is used for element-wise logical “AND”. It returns TRUE only if both corresponding elements are TRUE. x <- c(TRUE, FALSE, TRUE, FALSE) y <- c(TRUE, TRUE, FALSE, FALSE) result <- x & y result [1] TRUE FALSE FALSE FALSE |
| OR | | is used for element-wise logical “OR”. It returns TRUE if either of the corresponding elements is TRUE. |
! Not | ! is used for logical negation. It inverts the value of a logical expression: TRUE becomes FALSE and FALSE becomes TRUE. x <- TRUE result <- !x result [1] FALSE |
%>% and |> Pipes | %>% (from the magrittr package, made available by dplyr) and |> (the native pipe, built into R since version 4.1) pass the output of one function directly into the next, which helps keep code clear and concise. library(dplyr) countries %>% filter(Capital_city %in% "Addis Ababa") %>% mutate(Region = "East Africa") Without the pipe: mutate(filter(countries, Capital_city %in% "Addis Ababa"), Region = "East Africa") |
%in% contained | %in% is used to determine whether elements of one vector are contained in another vector. Example: x <- c(1, 2, 3, 4, 5) y <- c(3, 4, 5, 6, 7) x %in% y [1] FALSE FALSE TRUE TRUE TRUE |
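The short sketch below runs several of the operators from the table on two small vectors. The variable names (less_than, either, piped, and so on) are made up for this illustration, and the last line assumes R 4.1 or later because it uses the native pipe.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(3, 4, 5, 6, 7)

less_than <- x < y               # element-wise comparison
remainder <- x %% 2              # remainder after dividing each element by 2
either    <- (x > 4) | (y > 6)   # element-wise OR
in_y      <- x %in% y            # is each element of x found in y?

print(less_than)
print(remainder)
print(either)
print(in_y)

# Native pipe: take the square roots, then add them up (1 + 2 + 3)
piped <- c(1, 4, 9) |> sqrt() |> sum()
print(piped)
```

Each comparison returns one logical value per element, which is why the results are vectors of the same length as x.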
Coding Basic Expression
R Console: This is where you can directly enter and run R commands.
R Script: A text file (with the .R extension) where you can save your R code. To create an R script, go to the File menu and select New File > R Script. This will open a fourth pane in RStudio for your script. Using R scripts allows you to save your code and return to it later. Let’s get started! 📊
1 + 1 # sum
[1] 2
3^2 # exponent (3 squared)
[1] 9
3**2
[1] 9
13%%2 # remainder/modulus
[1] 1
8/4 # division
[1] 2
2*4 # multiplication
[1] 8
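Understanding the order of arithmetic operations in R is crucial. The order from highest to lowest precedence:
Parentheses: ()
Exponent: ^
Multiplication and division: * and / (equal precedence, evaluated left to right)
Addition and subtraction: + and - (equal precedence, evaluated left to right)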
Let’s practice using these operations:
(1 + (2^2 * 4*8)) - (4/3) # what is the answer?
[1] 127.6667
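To make the order of operations concrete, here is a small sketch with a few expressions whose results depend on precedence. The variable names are arbitrary and chosen only for this example.

```r
a <- 2 + 3 * 4     # multiplication first: 2 + 12
b <- (2 + 3) * 4   # parentheses first: 5 * 4
c_val <- 2 * 3^2   # exponent first: 2 * 9
d <- 8 / 4 * 2     # equal precedence, left to right: 2 * 2
e <- -2^2          # exponent binds tighter than the minus sign: -(2^2)

print(c(a, b, c_val, d, e))
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.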
Mathematical functions:
# natural logarithm
log(1)
[1] 0
# base-10 logarithm
log10(10)
[1] 1
# e^(1/2)
exp(0.5)
[1] 1.648721
Character operation
R uses the print() function to display variables.
The functions paste() and paste0() are used to concatenate text and variables. As a challenge: what difference do you notice between paste() and paste0()?
print("Hello World")
[1] "Hello World"
# assign variable
greeting <- "Hello"
name <- "Yalem"
message <- paste(greeting, name)
message
[1] "Hello Yalem"
# paste0()
message2 <- paste0(greeting, name)
print(message2)
[1] "HelloYalem"
# rep()
rep("hello",10)
[1] "hello" "hello" "hello" "hello" "hello" "hello" "hello" "hello" "hello"
[10] "hello"
Logical operation
A logical value is often created via a comparison between variables.
test = 89
conf = 76
conf < test
[1] TRUE
x = 42
y = 144
z = 12
is.even <- (x %% 2 == 0)
z
[1] 12
x < y & is.even
[1] TRUE
x > y | x > z
[1] TRUE
Let’s create a scenario related to COVID-19 data analysis where you compare the number of cases and deaths in different countries.
# Assume we have COVID-19 data for three countries: USA, India, and Brazil
# We'll compare the total number of confirmed cases and deaths in these countries
# Define COVID-19 data for each country
USA_cases <- 1000000
USA_deaths <- 50000
India_cases <- 500000
India_deaths <- 20000
Brazil_cases <- 800000
Brazil_deaths <- 40000
# Compare the total number of cases and deaths
USA_more_cases <- USA_cases > India_cases & USA_cases > Brazil_cases
India_more_cases <- India_cases > USA_cases & India_cases > Brazil_cases
Brazil_more_cases <- Brazil_cases > USA_cases & Brazil_cases > India_cases
USA_more_deaths <- USA_deaths > India_deaths & USA_deaths > Brazil_deaths
India_more_deaths <- India_deaths > USA_deaths & India_deaths > Brazil_deaths
Brazil_more_deaths <- Brazil_deaths > USA_deaths & Brazil_deaths > India_deaths
# Display the comparison results
if (USA_more_cases) {
  print("USA has the highest number of confirmed cases.")
} else if (India_more_cases) {
  print("India has the highest number of confirmed cases.")
} else {
  print("Brazil has the highest number of confirmed cases.")
}
[1] "USA has the highest number of confirmed cases."
if (USA_more_deaths) {
  print("USA has the highest number of deaths.")
} else if (India_more_deaths) {
  print("India has the highest number of deaths.")
} else {
  print("Brazil has the highest number of deaths.")
}
[1] "USA has the highest number of deaths."
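The pairwise comparisons above grow quickly as more countries are added. One possible alternative, sketched here with the same illustrative numbers, stores the counts in named vectors and lets which.max() find the largest entry. The vector names cases and deaths are introduced only for this example.

```r
# Same illustrative counts as in the scenario above, stored as named vectors
cases  <- c(USA = 1000000, India = 500000, Brazil = 800000)
deaths <- c(USA = 50000, India = 20000, Brazil = 40000)

# which.max() returns the position of the largest value; names() gives its label
most_cases  <- names(which.max(cases))
most_deaths <- names(which.max(deaths))

print(paste(most_cases, "has the highest number of confirmed cases."))
print(paste(most_deaths, "has the highest number of deaths."))
```

Note that which.max() returns only the first maximum, so ties would need separate handling.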
Variable assignment
In this scenario, we’ll use malaria surveillance data to demonstrate variable declaration in R.
Variables in R:
A variable name must start with a letter (or with a period that is not followed by a number).
It can contain numbers, letters, underscores (_), and periods (.).
Variable names cannot start with a number or an underscore, and cannot contain spaces.
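One way to check these naming rules is the base R function make.names(), which converts any string into a syntactically valid name. A candidate that comes back unchanged was already valid. The candidate names below are made up for this illustration.

```r
# Hypothetical candidate names: the first two follow the rules, the rest do not
candidates <- c("tested", "test.positive", "2cases", "case count", "_result")

# make.names() repairs invalid names, so unchanged entries were already valid
repaired <- make.names(candidates)
valid <- candidates == repaired

print(data.frame(candidates, repaired, valid))
```

This is only a quick check of syntax; it does not warn about names that shadow existing functions.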
name <- "Yalem"
assessment <- "malaria"
diagn <- "microscopy"
result <- "Pf"
# Create the message
message <- paste(name, "'s result is ", result, " positive.", sep = "")
print(message)
[1] "Yalem's result is Pf positive."
In R, the traditional method of assigning a value to a variable is using the left arrow <-. For example:
x <- 2
Notice that assigning a value does not print it. Instead, the value is stored in a variable. To see the value, you need to call the variable.
Let’s see this in action:
# Assign a value to the variable 'tested'
tested <- 420
# Display the value of 'tested'
tested
[1] 420
Example: Calculating and Rounding a Ratio
# number
positive = 284
# ratio
Test_positive <- positive/tested # float
The round() function rounds its first argument to the specified number of decimal places (default is 0). Use ?round to see the documentation for the round() function.
# round the result
round(Test_positive,3)*100
#?round
You can also assign a date as a variable
# assign today
today = Sys.Date()
# print the date with the text
paste("Today is", today)
[1] "Today is 2024-07-11"
Data types in R
R has several basic data types:
Data types | Description |
---|---|
Numeric | Numbers, stored as double-precision floating-point values by default. |
Character | Text strings. |
Logical | Boolean values (TRUE or FALSE). |
Integer | Whole numbers, written with the L suffix (for example 5L). |
Complex | Complex numbers with real and imaginary parts. |
Let’s explore these in practice
Numeric
# Numeric
num <- 42
print(num)
[1] 42
Character
char <- "malaria"
print(char)
[1] "malaria"
Logical/Boolean
log <- TRUE
print(log)
[1] TRUE
Integer
int <- 5L
print(int)
[1] 5
Complex
comp <- 1+2i
print(comp)
[1] 1+2i
How can you check the data type?
#?typeof()
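As a small sketch of how the check works, typeof() can be applied to one value of each basic type listed above. The list name values is introduced only for this example.

```r
# One example value per basic data type
values <- list(
  numeric   = 42,
  character = "malaria",
  logical   = TRUE,
  integer   = 5L,
  complex   = 1+2i
)

# Apply typeof() to every element and collect the results
types <- sapply(values, typeof)
print(types)

# class() is a related function; for a plain number it reports "numeric"
num_class <- class(42)
print(num_class)
```

Note that typeof() reports "double" for ordinary numbers, which is how R stores numeric values internally.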
Data Structures in R
R has several data structures to organize data:
- Vector: A sequence of data elements of the same basic type.
vector <- c(1, 2, 3, 4, 5)
print(vector)
[1] 1 2 3 4 5
**Indexing**
letter <- c('a', 'b','c', 'd', 'e')
letter[3] # retrieve the third value
[1] "c"
- List: An ordered sequence of items that can have different data types.
list <- list(name="Yalem", assessment="malaria", result="Pf", age = 35, height = 1.68)
print(list)
$name
[1] "Yalem"
$assessment
[1] "malaria"
$result
[1] "Pf"
$age
[1] 35
$height
[1] 1.68
# retrieve the third element of the list
list[3]
$result
[1] "Pf"
::: callout-tip
***Challenge: update the list.***
Add the `weight` variable at the end with the value of 65 and update the `result` with 'neg' instead of 'Pf'
:::
list$weight <- 64
print(list)
$name
[1] "Yalem"
$assessment
[1] "malaria"
$result
[1] "Pf"
$age
[1] 35
$height
[1] 1.68
$weight
[1] 64
Matrix: A 2-dimensional array where each element has the same type.
matrix <- matrix(1:6, nrow=2, ncol=3)
print(matrix)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Data Frame: A table where each column can contain different types of data.
malaria_data <- data.frame(name=c("Yalem", "Anna"), result=c("Pf", "Neg"))
print(malaria_data)
m_data <- cbind(name, result)
#? cbind()
Caution: Can you identify the difference between malaria_data and m_data?
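One way to start exploring that question is to ask R what kind of object each one is. The sketch below recreates both objects, assuming name and result still hold the single values "Yalem" and "Pf" assigned earlier in this tutorial.

```r
# Values assumed to carry over from the variable assignment section
name <- "Yalem"
result <- "Pf"

malaria_data <- data.frame(name = c("Yalem", "Anna"), result = c("Pf", "Neg"))
m_data <- cbind(name, result)

# Inspect what kind of object each one is and how large it is
print(class(malaria_data))
print(class(m_data))
print(dim(malaria_data))
print(dim(m_data))
```

Comparing the class and dimensions of the two objects is usually enough to spot how data.frame() and cbind() differ here.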
Factor: Used to handle categorical data.
factor <- factor(c("Pf", "Neg", "Pf"))
print(factor)
[1] Pf Neg Pf
Levels: Neg Pf
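To see what a factor stores beyond the text labels, the sketch below inspects its levels, counts each category, and shows how the level order can be set explicitly. The names test_results and reordered are hypothetical and used only here.

```r
test_results <- factor(c("Pf", "Neg", "Pf"))

# Levels are the distinct categories, sorted alphabetically by default
lv <- levels(test_results)
print(lv)

# table() counts how many times each level occurs
counts <- table(test_results)
print(counts)

# Internally, each value is stored as the integer position of its level
codes <- as.integer(test_results)
print(codes)

# The level order can be chosen explicitly with the levels argument
reordered <- factor(c("Pf", "Neg", "Pf"), levels = c("Pf", "Neg"))
print(levels(reordered))
```

Setting the level order explicitly matters later for plots and models, which usually follow the order of the levels.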
Installing packages
A package is a collection of functions developed by R experts for various purposes. R comes with built-in base functions like sum(), mean(), summary(), and others. Additionally, you can install extra packages from CRAN (Comprehensive R Archive Network) or from the GitHub repositories of the package developers. You can install packages from CRAN in two ways: using the RStudio menu or via the command line.
Installing Packages Using the RStudio Menu:
Open RStudio: Start RStudio on your computer.
Navigate to Tools: Click on the Tools menu at the top of the screen.
Install Packages: Select Install Packages... from the drop-down menu.
Choose Package: In the dialogue box, type the name of the package you want to install (e.g., tidyverse).
Install: Click Install to download and install the package from CRAN.
Installing a package via the command line, for example:
# Install for the first time
install.packages("tidyverse")
# Check if the package 'tidyverse' is already installed; if not, install it
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}
There are two ways of loading installed packages in R: library() and require(). They differ in behaviour and usage.
The library() function is commonly used to load packages. If the specified package is not installed, library() will output an error and stop the execution of the code.
library(sf) #Error in library(sf) : there is no package called ‘sf’
Warning: package 'sf' was built under R version 4.3.3
Linking to GEOS 3.11.2, GDAL 3.8.2, PROJ 9.3.1; sf_use_s2() is TRUE
The require() function is useful for conditional loading within functions or scripts that should keep running even if the package is missing. If the specified package is not installed, require() will output a warning, return FALSE, and continue executing the code.
require(sf) #Warning: there is no package called ‘sf’
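The sketch below shows one way to write code that adapts to whether a package is available. It uses requireNamespace(), which reports TRUE or FALSE without attaching the package. The name notARealPackage123 is made up purely to demonstrate the missing-package branch.

```r
# stats ships with R, so this check is expected to succeed
has_stats <- requireNamespace("stats", quietly = TRUE)

# "notARealPackage123" is a made-up name used only to show the failure case
has_fake <- requireNamespace("notARealPackage123", quietly = TRUE)

if (has_fake) {
  status <- "Optional package found"
} else {
  status <- "Optional package missing; continuing without it"
}
print(status)
```

Because the result is a plain logical value, the script can choose a fallback instead of stopping with an error.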
Getting Help in R
R has extensive documentation and help features to assist users. There are different options to search for help:
- Help function: use ? followed by the function name to get help on that function, for example ?round
# ?round()
- Help search: Use help.search() to find help pages related to a topic.
# help.search("regression")
- Vignettes: Detailed guides and documentation provided by package authors.
#vignette("ggplot2-specs")
- R Help Website: Access the official R documentation online at CRAN R Documentation, and ask the R community for help on Stack Overflow.
Importing and exploring data
R provides various ways to import and explore data. You can use built-in datasets, import data from a website, or load data from a local file.
- Load Built-In Dataset
R comes with several built-in datasets that are included in the datasets package. These datasets are useful for practice and learning.
To load and explore a built-in dataset, you can load the datasets package explicitly, although it is already attached by default in a standard R session. Here’s how you can do it:
# load the dataset package
library(datasets)
# Load a built-in dataset, for example, the 'iris' dataset
data(iris)
# View the first few rows of the dataset
head(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3.0 1.4 0.2 setosa
3 4.7 3.2 1.3 0.2 setosa
4 4.6 3.1 1.5 0.2 setosa
5 5.0 3.6 1.4 0.2 setosa
6 5.4 3.9 1.7 0.4 setosa
# Summary statistics of the dataset
summary(iris)
Sepal.Length Sepal.Width Petal.Length Petal.Width
Min. :4.300 Min. :2.000 Min. :1.000 Min. :0.100
1st Qu.:5.100 1st Qu.:2.800 1st Qu.:1.600 1st Qu.:0.300
Median :5.800 Median :3.000 Median :4.350 Median :1.300
Mean :5.843 Mean :3.057 Mean :3.758 Mean :1.199
3rd Qu.:6.400 3rd Qu.:3.300 3rd Qu.:5.100 3rd Qu.:1.800
Max. :7.900 Max. :4.400 Max. :6.900 Max. :2.500
Species
setosa :50
versicolor:50
virginica :50
- Import Data from a Website
You can import data directly from a website. COVID-19 data is available online at the COVID-19 Data Hub. Because the file is stored as a ZIP file, reading a zipped CSV file from a URL in R takes three steps:
Download the ZIP file from the URL.
Unzip the file.
Read the CSV file into R.
The example below loads the COVID-19 data for one country (Australia):
# Install and load necessary packages
# (utils ships with R, so only downloader needs to be installed)
install.packages("downloader")
library(downloader)
library(utils)
# Define the URL and destination file
url <- "https://storage.covid19datahub.io/country/AUS.csv.zip"
destfile <- "AUS.csv.zip" # Australia
# Download the ZIP file
download(url, destfile, mode = "wb")
# Unzip the file
unzip(destfile, exdir = ".")
# Read the CSV file into R (the archive is expected to contain AUS.csv)
Australia_data <- read.csv("AUS.csv")
# Display the first few rows of the data
head(Australia_data)
- Import Data From local file
Data can be stored in different file types such as .csv, .xlsx, .dta and .sav. Remember, you need to know the file format of your data so that you can install any necessary packages (e.g., readxl, haven) using install.packages("package_name").
Let’s break down the information about reading different file types into R:
CSV files: The most common file format is CSV (Comma-Separated Values). To import CSV files into R, you can use the read.csv() function from base R.
Example using read.csv():
aus <- read.csv("C:/Users/User/Documents/R_training/Tutorial_R/R_Tutorial/AUS.csv")
Example using read_csv():
library(readr) # read_csv() comes from the readr package
aus2 <- read_csv("C:/Users/User/Documents/R_training/Tutorial_R/R_Tutorial/AUS.csv")
Although read.csv() is the most common way to read data into R, there is an alternative, the read_csv() function (note the underscore). It is part of the readr package in the tidyverse and is often a better option for reading CSV files: it is faster, reads the data in as a tibble instead of a plain data frame, and allows non-standard variable names, among other benefits.
There are various options you can use when importing data, such as whether the file has a header row, what character is used for decimal points, and which values to import as missing. To explore these options, look at the help pages, e.g. ?read_csv.
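As a rough illustration of these import options, the sketch below writes a tiny made-up CSV to a temporary file and reads it back with base R's read.csv(). The file contents, the column names, and the use of -99 as a missing-value code are all assumptions for this example only.

```r
# Write a small, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "district;tested;positivity",
  "A;420;67,6",
  "B;-99;12,5"
), csv_path)

demo <- read.csv(
  csv_path,
  header = TRUE,       # the first line holds the column names
  sep = ";",           # fields are separated by semicolons
  dec = ",",           # a comma is used as the decimal mark
  na.strings = "-99"   # treat -99 as a missing value
)

str(demo)
```

The same ideas apply to read_csv(), although its argument names differ, so check ?read_csv before translating the call.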
Excel Files .xlsx: To read Excel files into R, use the read_excel() function from the readxl package.
When writing file paths in R, you might encounter both / and \:
/: In R, / is used as the directory separator in file paths. For example, "C:/Users/User/Documents/R_training/Tutorial_R/R_Tutorial/Leprosy_am_14.xlsx" uses / to specify the directory structure.
\: A single backslash is an escape character in R strings, so a Windows path copied with backslashes must be written with double backslashes (\\) or converted to forward slashes.
Example:
# read one sheet, 2014_Q1
library(readxl)
leprosy_q1 <- read_excel("C:/Users/User/Documents/R_training/Tutorial_R/R_Tutorial/Leprosy_am_14.xlsx", sheet = "2014_Q1", col_names = TRUE, na = "NA")
Stata Files .dta: To read Stata files, use the read_dta() function from the haven package (installed with the tidyverse, but loaded separately).
Example:
library(haven)
my_stata_data <- read_dta("C:/Users/User/Documents/R_training/Tutorial_R/R_Tutorial/data_exr.dta")
SPSS, Matlab, and binary files can also be read using specific functions.
For SPSS files, use read_sav() from the haven package.
For Matlab files, use readMat() from the R.matlab package.
For binary files, explore relevant functions based on your specific needs.
About the data:
The data used for this demonstration is a sample of routine leprosy data from the leprosy surveillance system of Amhara Region, Ethiopia, in 2014. The data were collected and collated at the district level (third administrative level) and stored in an Excel file with four sheets. Let’s explore the data.
library(readxl)
Warning: package 'readxl' was built under R version 4.3.3
# Install and load the tidyverse package if required
#install.packages(c("tidyverse", "janitor"))
#library(tidyverse) # data management and visualization
#library(janitor) # clean column names
# Specify the path to your Excel file
excel_file <- "C:/Users/User/Documents/R_training/Tutorial_R/R_Tutorial/Data/Leprosy_am_14.xlsx"
excel_file <- read_excel(excel_file)
# Read all sheets and stack them into one table (run this while excel_file still holds the path)
#all_sheets <- excel_sheets(excel_file) %>%
# map(~ read_excel(excel_file, sheet = .x)) %>%
# bind_rows(.id = "sheet_name") %>%
#clean_names() # Make the column names readable
Inspect the data
To get an overview of the data, you have several options:
View the whole data: You can view the entire dataset by clicking on it in the Environment pane. Alternatively, you can use the command View(excel_file) to open a window displaying the data.
# View(excel_file)
View the Top Five Rows: To quickly inspect the dataset, you can use the head() function to see the first n rows and the tail() function to see the last n rows (n defaults to 6).
head(excel_file, 5) # top five
# A tibble: 5 × 10
Year Woreda quarter Leprosy (new cases) (MB…¹ Grade II disability …²
<dbl> <chr> <chr> <dbl> <dbl>
1 2014 Ankasha Guagusa Q1 0 0
2 2014 Banja Shekudad Q1 0 0
3 2014 Chagni Q1 1 0
4 2014 Dangila Q1 1 0
5 2014 Dangla Town Q1 0 0
# ℹ abbreviated names: ¹`Leprosy (new cases) (MB+PB)`,
# ²`Grade II disability (new cases) (MB+PB)`
# ℹ 5 more variables: `New leprosy cases under 15` <dbl>,
# `Treatment completed leprosy: MB` <dbl>,
# `Cases in MB completed treatment cohort` <dbl>,
# `Treatment completed leprosy: PB` <dbl>,
# `Cases in PB completed treatment cohort` <dbl>
tail(excel_file, 5) # bottom five
# A tibble: 5 × 10
Year Woreda quarter Leprosy (new cases) (MB+P…¹ Grade II disability …²
<dbl> <chr> <chr> <dbl> <dbl>
1 2014 Yilmana Densa Q1 1 0
2 2014 Bahirdar Town Q1 7 2
3 2014 Dessie Town Q1 0 0
4 2014 Gonder Town Q1 0 0
5 2014 Total <NA> 159 12
# ℹ abbreviated names: ¹`Leprosy (new cases) (MB+PB)`,
# ²`Grade II disability (new cases) (MB+PB)`
# ℹ 5 more variables: `New leprosy cases under 15` <dbl>,
# `Treatment completed leprosy: MB` <dbl>,
# `Cases in MB completed treatment cohort` <dbl>,
# `Treatment completed leprosy: PB` <dbl>,
# `Cases in PB completed treatment cohort` <dbl>
Understand the structure of the data: To understand the structure of the data, you can use either the str() or glimpse() function.
str() provides a concise summary of the structure of the dataset, including the data types and the first few values of each variable.
glimpse() comes from dplyr in the tidyverse and offers a similar summary designed for tibbles, showing the variable types and the first few values of each variable in a more compact format.
str(excel_file) # Use str() for a concise summary
tibble [154 × 10] (S3: tbl_df/tbl/data.frame)
$ Year : num [1:154] 2014 2014 2014 2014 2014 ...
$ Woreda : chr [1:154] "Ankasha Guagusa" "Banja Shekudad" "Chagni" "Dangila" ...
$ quarter : chr [1:154] "Q1" "Q1" "Q1" "Q1" ...
$ Leprosy (new cases) (MB+PB) : num [1:154] 0 0 1 1 0 3 4 0 1 0 ...
$ Grade II disability (new cases) (MB+PB): num [1:154] 0 0 0 0 0 0 2 0 0 0 ...
$ New leprosy cases under 15 : num [1:154] 0 0 0 0 0 0 0 0 0 0 ...
$ Treatment completed leprosy: MB : num [1:154] 0 0 0 0 1 1 3 0 0 0 ...
$ Cases in MB completed treatment cohort : num [1:154] 0 0 0 0 1 0 3 0 0 0 ...
$ Treatment completed leprosy: PB : num [1:154] 0 0 3 0 0 1 0 0 0 0 ...
$ Cases in PB completed treatment cohort : num [1:154] 0 0 1 0 0 0 0 0 0 0 ...
library(dplyr)
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
glimpse(excel_file) # Use glimpse() for a tidyverse-friendly summary
Rows: 154
Columns: 10
$ Year <dbl> 2014, 2014, 2014, 2014, 2014…
$ Woreda <chr> "Ankasha Guagusa", "Banja Sh…
$ quarter <chr> "Q1", "Q1", "Q1", "Q1", "Q1"…
$ `Leprosy (new cases) (MB+PB)` <dbl> 0, 0, 1, 1, 0, 3, 4, 0, 1, 0…
$ `Grade II disability (new cases) (MB+PB)` <dbl> 0, 0, 0, 0, 0, 0, 2, 0, 0, 0…
$ `New leprosy cases under 15` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ `Treatment completed leprosy: MB` <dbl> 0, 0, 0, 0, 1, 1, 3, 0, 0, 0…
$ `Cases in MB completed treatment cohort` <dbl> 0, 0, 0, 0, 1, 0, 3, 0, 0, 0…
$ `Treatment completed leprosy: PB` <dbl> 0, 0, 3, 0, 0, 1, 0, 0, 0, 0…
$ `Cases in PB completed treatment cohort` <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0…
Check the Class: You can check the class of the dataset. Here it is a tibble (tbl_df), which is also a data.frame.
class(excel_file)
[1] "tbl_df" "tbl" "data.frame"
Select Specific Rows or Columns: If you want to select specific rows or columns, you can use square brackets.
head(excel_file[,2], 5) # view the second column
excel_file[1,] # view the first row
# A tibble: 1 × 10
Year Woreda quarter Leprosy (new cases) (MB…¹ Grade II disability …²
<dbl> <chr> <chr> <dbl> <dbl>
1 2014 Ankasha Guagusa Q1 0 0
# ℹ abbreviated names: ¹`Leprosy (new cases) (MB+PB)`,
# ²`Grade II disability (new cases) (MB+PB)`
# ℹ 5 more variables: `New leprosy cases under 15` <dbl>,
# `Treatment completed leprosy: MB` <dbl>,
# `Cases in MB completed treatment cohort` <dbl>,
# `Treatment completed leprosy: PB` <dbl>,
# `Cases in PB completed treatment cohort` <dbl>
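The square-bracket syntax follows the pattern data[rows, columns]. The sketch below illustrates it on a small made-up data frame, so the values and column names are assumptions and not taken from the leprosy data.

```r
# Small made-up table, used only to illustrate the indexing syntax
df <- data.frame(
  woreda = c("A", "B", "C"),
  new_cases = c(0, 1, 7),
  quarter = c("Q1", "Q1", "Q1")
)

first_row  <- df[1, ]                         # row 1, all columns
second_col <- df[, 2]                         # column 2, all rows
by_name    <- df[, c("woreda", "new_cases")]  # columns selected by name
with_cases <- df[df$new_cases > 0, ]          # rows that satisfy a condition

print(first_row)
print(second_col)
print(by_name)
print(with_cases)
```

One difference to keep in mind: a base data frame returns a plain vector for df[, 2], while a tibble such as excel_file keeps a one-column tibble.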
Check Column Names: Checking column names is essential for identifying any missing or misspelled variables. Ensure that variable names are clean, short, and readable.
colnames(excel_file) # using colnames
[1] "Year"
[2] "Woreda"
[3] "quarter"
[4] "Leprosy (new cases) (MB+PB)"
[5] "Grade II disability (new cases) (MB+PB)"
[6] "New leprosy cases under 15"
[7] "Treatment completed leprosy: MB"
[8] "Cases in MB completed treatment cohort"
[9] "Treatment completed leprosy: PB"
[10] "Cases in PB completed treatment cohort"
names(excel_file) # using names
[1] "Year"
[2] "Woreda"
[3] "quarter"
[4] "Leprosy (new cases) (MB+PB)"
[5] "Grade II disability (new cases) (MB+PB)"
[6] "New leprosy cases under 15"
[7] "Treatment completed leprosy: MB"
[8] "Cases in MB completed treatment cohort"
[9] "Treatment completed leprosy: PB"
[10] "Cases in PB completed treatment cohort"
variable.names(excel_file) # variable.names() from the stats package also returns the column names
[1] "Year"
[2] "Woreda"
[3] "quarter"
[4] "Leprosy (new cases) (MB+PB)"
[5] "Grade II disability (new cases) (MB+PB)"
[6] "New leprosy cases under 15"
[7] "Treatment completed leprosy: MB"
[8] "Cases in MB completed treatment cohort"
[9] "Treatment completed leprosy: PB"
[10] "Cases in PB completed treatment cohort"
Common Commands in R
$: Used for accessing named elements within an object, such as a column of a data frame.
ls(): Lists the objects in the current environment.
rm(list = ls()): Removes all objects from the environment.
Ctrl + ENTER: Runs the current line or selection in RStudio.
Ctrl + C: Keyboard shortcut for copying.
Ctrl + L: Clears your console.
Alt + -: Keyboard shortcut that inserts the assignment operator <-.