This lesson is still being designed and assembled (Pre-Alpha version)

sample dataset and importing data in R studio

Overview

Time: min
Objectives
  • learning how to use the sample dataset

  • understanding how to import data in R studio

Sample Dataset


# INSTALL AND LOAD PACKAGES ################################

# Load base packages manually
library(datasets) # For example datasets
?datasets
library(help = "datasets")

# SOME SAMPLE DATASETS #####################################

iris
?iris

cars <-cars

head(cars)

iris <- iris
head(iris)

tail(iris,20)

iris[,c(1,2)]

iris[,c('Sepal.Length')]

str(iris)

rm(list = ls())

iris

# CLEAN UP #################################################

# Clear environment
rm(list = ls()) 

# Clear packages
detach("package:datasets", unload = TRUE)  # For base

# Clear plots
dev.off()  # But only if there IS a plot

# Clear console
cat("\014")  # ctrl+L

Data Visulaistion using the basic Plot function in R

The most used plotting function in R programming is the plot() function. It is a generic function, meaning, it has many methods which are called according to the type of object passed to plot(). In the simplest case, we can pass in a vector and we will get a scatter plot (default) of magnitude vs index. But generally, we pass in two vectors and a scatter plot of these points are plotted. For example, the command plot(c(1,2),c(3,5)) would plot the points (1,3) and (2,5) In the exercise below we will see how to create a generic plot, bar chart and a histogram in R using the Sample dataset available in the Program

PLOT FUNCTION IN R

LOAD DATASETS PACKAGES ###################################

library(datasets)  # Load/unload base packages manually

LOAD DATA ################################################

head(iris)

PLOT DATA WITH PLOT() ####################################

?plot  # Help for plot()

plot(iris$Species)  # Categorical variable
plot(iris$Petal.Length)  # Quantitative variable
plot(iris$Species, iris$Petal.Width)  # Cat x quant
plot(iris$Petal.Length, iris$Petal.Width)  # Quant pair
plot(iris)  # Entire data frame

Plot with options
plot(iris$Petal.Length, iris$Petal.Width,
  col = "#cc0000",  # Hex code for datalab.cc red
  pch = 19,         # Use solid circles for points
  main = "Iris: Petal Length vs. Petal Width",
  xlab = "Petal Length",
  ylab = "Petal Width")

PLOT FORMULAS WITH PLOT() ################################

plot(cos, 0, 2*pi)
plot(exp, 1, 5)
plot(dnorm, -3, +3)

Formula plot with options
plot(dnorm, -3, +3,
  col = "#cc0000",
  lwd = 5,
  main = "Standard Normal Distribution",
  xlab = "z-scores",
  ylab = "Density")

Plotting Bar Chart in R

 
library(datasets)

LOAD DATA ###############################################
?mtcars
head(mtcars)

sort.default(mtcars, decreasing = TRUE)

BAR CHARTS ###############################################

barplot(mtcars$cyl)             # Doesn't work

Need a table with frequencies for each category
cylinders <- table(mtcars$cyl)  # Create table
barplot(cylinders)              # Bar chart
plot(cylinders)                 # Default X-Y plot (lines)

CLEAN UP #################################################

Clear environment
rm(list = ls()) 

Clear packages
detach("package:datasets", unload = TRUE)  # For base

Clear plots
dev.off()  # But only if there IS a plot

Clear console
cat("\014")  # ctrl+L

Plotting Historgram in R


LOAD PACKAGES ############################################

library(datasets)

LOAD DATA ################################################

?iris
head(iris)

BASIC HISTOGRAMS #########################################

hist(iris$Sepal.Length)
hist(iris$Sepal.Width)
hist(iris$Petal.Length)
hist(iris$Petal.Width)

HISTOGRAM BY GROUP #######################################

Put graphs in 3 rows and 1 column
par(mfrow = c(3, 1))

Histograms for each species using options
hist(iris$Petal.Width [iris$Species == "setosa"],
  xlim = c(0, 3),
  breaks = 9,
  main = "Petal Width for Setosa",
  xlab = "",
  col = "red")

hist(iris$Petal.Width [iris$Species == "versicolor"],
  xlim = c(0, 3),
  breaks = 9,
  main = "Petal Width for Versicolor",
  xlab = "",
  col = "purple")

hist(iris$Petal.Width [iris$Species == "virginica"],
  xlim = c(0, 3),
  breaks = 9,
  main = "Petal Width for Virginica",
  xlab = "",
  col = "blue")

Restore graphic parameter
par(mfrow=c(1, 1))

CLEAN UP #################################################

Clear packages
detach("package:datasets", unload = TRUE)  # For base

Clear plots
dev.off()  # But only if there IS a plot

Clear console
cat("\014")  # ctrl+L

Clear mind :)

This brings us to an end of this workshop however the reference section provides links to all the materials used in this workshop and links which provide more detailed understanding of Each Package in R.

Key Points