Carlos Sotuyo
Instructor

 

Installing R and R Studio:


First, download R from https://cran.r-project.org
Then, download RStudio here

R Studio online at https://rstudio.cloud/

R Statistical Software:
Datasets Available in R - GitHub Pages

College of the Redwoods:
using R in Statistics.

Probability Distributions

Columbia University:
Statistics, R Notes.

Basic Statistical Analysis Using R:
by T. Heeren & J.Milton,
Boston University.

Quick R by statmethods.net

College of Staten Island: Using R for Introductory Statistics:
John Verzani:
link or download it here.

SimpleR is a previous version
of John Verzani's ebook

Intro to Statistical methods using RStudio


Page 1: Data handling and descriptive statistics,
Page 2: Probability,
Page 3: Intervals and sample size,
Page 4: Hypothesis Testing,
Page 5: Contingency tables,
Page 6: Linear Regression.




Page 2: Probability

1. Basic concepts:

# toss a coin, simulation:
mycoin <- c("H", "T")
mypick <- sample(mycoin, size=1000, replace=T)
tb1<-table(mypick);tb1 # counts
prop.table(tb1)# probabilities
# Repeat the experiment with increasing sample sizes (e.g., 50, 100, 1000).
# Notice that as the number of trials increases, the empirical (observed)
# probability approaches the theoretical probability (law of large numbers).

rs <-1:6 # or rs<-seq(1,6, by=1)
mypick2 <- sample(rs, size=120, replace=T)
tb2<-table(mypick2);tb2 #counts
prop.table(tb2) # probabilities

# simulate theoretical prob = 1/100.
#Choosing a given number, say 89, from 1 to 100:
x <- 1:100
mypick3 <- sample(x, size=200, replace=T)
sum(mypick3==89)#count
p.89<-(sum(mypick3==89))/200 #probability
th <- 1/100; th # theoretical prob: 1 number out of 100 (2 expected hits in 200 draws)
p.89 # actual, or empirical

#3200 students; 2100 in favor of A [ones]; the rest, 1100, [zeros] against it.
pop<-rep(x=0:1, c(1100, 2100))
prop_in_favor <- 2100/3200; prop_in_favor # theoretical
a_sample<-sample(pop, size=100, replace=T)
prob_in_sample=(sum(a_sample))/100
prob_in_sample
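
The law of large numbers mentioned in the coin-toss simulation can be made visible by tracking the proportion of heads after every toss. The following is a minimal sketch, not part of the original notes: the seed, the number of tosses, and the names `tosses` and `running_prop` are illustrative assumptions.

```r
# Sketch: running proportion of heads as the number of tosses grows.
set.seed(1)                              # arbitrary seed, only for reproducibility
n_tosses <- 10000                        # assumed number of tosses
tosses <- sample(c("H", "T"), size = n_tosses, replace = TRUE)

# cumulative count of heads divided by the number of tosses so far
running_prop <- cumsum(tosses == "H") / seq_len(n_tosses)

# proportion of heads after 50, 100, 1000 and 10000 tosses
running_prop[c(50, 100, 1000, 10000)]
```

The later values should sit closer to 0.5 than the early ones; plot(running_prop, type = "l") shows the same thing graphically.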

2. Probability distributions:

#table of a discrete probability distribution: example,
x <- c(0,1,2,3,4,5)
px <- c(0.11, 0.31,0.17,0.02,0.22,0.17)
prob.dist <- data.frame(x,px);prob.dist
sum(px) # it must be equal to 1

# mean of the probability distribution:
mu <- sum(x*px); mu
# variance & standard deviation:
vars <- sum((x-mu)^2 * px); vars
sigma <- sqrt(sum((x-mu)^2 * px)); sigma
#The minimum usual value is mu-2*sigma. In this example,
mu-2*sigma #minimum usual value
# maximum usual value is mu+2*sigma
mu+2*sigma # maximum usual value
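
As a cross-check on the mean and variance formulas above, the variance can also be computed with the shortcut E[X^2] - mu^2, and both quantities can be approximated by simulating draws from the table. This is a hedged sketch: the seed, the number of draws, and the names `var_def`, `var_shortcut` and `draws` are assumptions made for illustration.

```r
# Sketch: two ways to get the variance, plus a simulation check.
x  <- c(0, 1, 2, 3, 4, 5)
px <- c(0.11, 0.31, 0.17, 0.02, 0.22, 0.17)

mu           <- sum(x * px)              # expected value
var_def      <- sum((x - mu)^2 * px)     # variance from the definition
var_shortcut <- sum(x^2 * px) - mu^2     # shortcut: E[X^2] - mu^2
all.equal(var_def, var_shortcut)         # the two should agree

set.seed(2)                              # arbitrary seed
draws <- sample(x, size = 100000, replace = TRUE, prob = px)
mean(draws)                              # should be close to mu
var(draws)                               # should be close to var_def
```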

3. Combinations and Permutations:

#combinations
#example: from 3 items - A,B,C choose 2: AB, AC, BC
choose(3,2)# choose(n,k).
#permutations: taking the order into consideration:
#example: from 3 items - A,B,C permutation taken two at a time: AB,BA,AC,CA,BC,CB
# choose(n,k)*factorial(k)
choose(3,2)*factorial(2)
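
The counts above can be verified by listing the arrangements themselves. A minimal sketch using base R follows; `combn` enumerates combinations, while the ordered pairs are built here with `expand.grid`, which is just one possible approach. The names `items`, `combos`, `pairs` and `perms` are illustrative.

```r
# Sketch: enumerate the combinations and permutations of A, B, C taken 2 at a time.
items <- c("A", "B", "C")

combos <- combn(items, 2)                # one combination per column
apply(combos, 2, paste, collapse = "")   # "AB" "AC" "BC"

# ordered pairs: every (first, second) pair with two different items
pairs <- expand.grid(first = items, second = items, stringsAsFactors = FALSE)
perms <- pairs[pairs$first != pairs$second, ]
nrow(perms)                              # matches choose(3,2)*factorial(2)
```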

4. Binomial Distribution:

#Binomial:
#dbinom for exactly k successes
dbinom(x=1, size=10, prob=1/4) # prob of 1 success in 10 trials if p=1/4
#cumulative binomial: pbinom
pbinom(q=2, size=10, prob=1/4) # prob of up to two successes in 10 trials if p=1/4
#same as:
sum(dbinom(x=0:2, size=10, prob=1/4))
# prob of at least 3 successes in 15 trials if p=.33 (that is: 3 or more successes in 15 trials)
1-pbinom(q=2, size=15, prob=0.33) #or, using dbinom:
sum(dbinom(x=3:15, size=15, prob=0.33))
#random binomial:
heads <- rbinom(1, size = 100, prob = .5)
heads # what is it? In 100 trials, with probability of success 0.5 on every single trial,
# it gives a random number of successes (it is likely to be close to 50).
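
The binomial functions above connect to the formulas usually given alongside them: P(X = k) = choose(n,k) * p^k * (1-p)^(n-k), mean n*p and standard deviation sqrt(n*p*(1-p)). The sketch below checks these for n = 10 and p = 1/4; the variable names are assumptions for illustration.

```r
# Sketch: dbinom versus the binomial formula, and the mean/sd shortcuts.
n <- 10
p <- 1/4

by_formula <- choose(n, 1) * p^1 * (1 - p)^(n - 1)   # P(X = 1) by hand
by_dbinom  <- dbinom(x = 1, size = n, prob = p)
all.equal(by_formula, by_dbinom)

k  <- 0:n
pk <- dbinom(k, size = n, prob = p)                  # full probability table
mu    <- sum(k * pk)                                 # mean from the table
sigma <- sqrt(sum((k - mu)^2 * pk))                  # sd from the table

c(mu, n * p)                                         # both entries should match
c(sigma, sqrt(n * p * (1 - p)))                      # both entries should match
```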

5. Uniform Distribution:

#uniform: punif(q, min, max)
#example: A bus arrives at a bus stop every 8 minutes. If you arrive at the stop at a random
#time, what is the chance that the bus will arrive in 5 minutes or less?
punif(q=5, min=0, max=8)
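
For a uniform distribution, the cumulative probability is simply (q - min)/(max - min), so the bus example can be checked by hand and by simulation. This is a minimal sketch; the "between 2 and 5 minutes" question, the seed, and the names `p_formula`, `p_between` and `waits` are illustrative additions, not part of the original example.

```r
# Sketch: uniform waiting time on [0, 8] minutes.
wait_max <- 8

p_formula <- (5 - 0) / (wait_max - 0)                 # 5/8 by hand
all.equal(p_formula, punif(q = 5, min = 0, max = wait_max))

# probability of waiting between 2 and 5 minutes: difference of two cumulative values
p_between <- punif(5, min = 0, max = wait_max) - punif(2, min = 0, max = wait_max)
p_between

set.seed(3)                                           # arbitrary seed
waits <- runif(100000, min = 0, max = wait_max)       # simulated waiting times
mean(waits <= 5)                                      # should be close to p_formula
```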

6. Normal Distribution:

#*** Normal Distribution ***
# cumulative normal distribution function: pnorm(q, mean, sd, lower.tail = TRUE)
#Ex: Assume that weights of males are normally distributed with a mean of 188 lb and a
#standard deviation of 29 lb.
#a. Find the probability that 1 randomly selected adult male has a weight greater than 155 lb.
pnorm(q=155, mean=188, sd=29, lower.tail = F)
#b. Find the probability that 1 randomly selected adult male has a weight less than 145 lb.
pnorm(q=145, mean=188, sd=29, lower.tail = T) # lower.tail = T is the default
#c. Find the probability that a randomly selected male weighs between 143 and 160 lb.
pnorm(q=160, mean=188, sd=29)-pnorm(q=143, mean=188, sd=29)
#d. Find the probability that a sample of 20 randomly selected adult males has a mean weight
#greater than 155 lb. The standard deviation of the sample mean is sd/sqrt(n), with n = 20.
pnorm(q=155, mean=188, sd=29/sqrt(20), lower.tail = F)
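
The same probabilities can be obtained by standardizing first, z = (x - mean)/sd, and then using the standard normal distribution. The sketch below does this for a single weight and for the mean of a sample of n = 20, where the standard deviation of the sample mean is sd/sqrt(n). Variable names are assumptions made for illustration.

```r
# Sketch: z-scores for one observation and for a sample mean.
mu    <- 188
sigma <- 29

z        <- (155 - mu) / sigma                        # z-score of 155 lb
p_z      <- pnorm(z, lower.tail = FALSE)              # standard normal, upper tail
p_direct <- pnorm(155, mean = mu, sd = sigma, lower.tail = FALSE)
all.equal(p_z, p_direct)                              # same probability either way

n      <- 20
se     <- sigma / sqrt(n)                             # standard error of the mean
z_mean <- (155 - mu) / se
p_mean <- pnorm(z_mean, lower.tail = FALSE)           # P(sample mean > 155)
p_mean
```

Because the standard error is smaller than sigma, the sample mean is far less likely than a single weight to fall below 155 lb, so p_mean is larger than p_z.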

#inverse normal:
# using a known probability to find the corresponding critical z value in a normal distribution:
# qnorm(p, mean = 0, sd = 1, lower.tail = TRUE)
# to find the critical value z_alpha/2 with cumulative probability 0.975 (alpha = 0.05):
qnorm(0.975, mean = 0, sd = 1, lower.tail = TRUE)
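
Since qnorm is the inverse of pnorm, the two can be checked against each other, and qnorm also works directly in the original units when a mean and sd are supplied. A minimal sketch follows; the choice of alpha = 0.05 and of the 90th percentile, and the names `z_crit` and `w90`, are illustrative assumptions.

```r
# Sketch: critical values and percentiles with qnorm.
alpha  <- 0.05
z_crit <- qnorm(1 - alpha / 2)            # two-sided critical value, about 1.96
z_crit
pnorm(z_crit)                             # round trip: gives back 0.975

# 90th percentile of weights with mean 188 lb and sd 29 lb
w90 <- qnorm(0.90, mean = 188, sd = 29)
all.equal(w90, 188 + 29 * qnorm(0.90))    # same as mean + z * sd
```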