STA2023 using R

Intro to Statistical methods using RStudio

Page 1: Data handling and descriptive statistics,
Page 2: Probability,
Page 3: Intervals and sample size,
Page 4: Hypothesis Testing,
Page 5: Contingency tables,
Page 6: Linear Regression.


Page 3: Intervals and sample size

1. Confidence Intervals for Proportions:

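install.packages("epitools") # run once; not needed again after installation
library(epitools)

# One proportion: 52 successes in 100 trials
binom.approx(52, 100, conf.level = 0.95) # normal-approximation (Wald) interval; TO BE USED in STA2023
prop.test(52, 100, conf.level = 0.95, correct = FALSE) # Wilson score interval, so the limits differ slightly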
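To see where the one-proportion interval comes from, the normal-approximation formula can be written out with basic operations. This is a minimal sketch using the same counts (52 out of 100); the names phat, moe, and wald_ci are chosen here for illustration only.

```r
# Normal-approximation (Wald) interval for one proportion, computed by hand.
x <- 52; n <- 100; c.level <- 0.95      # same counts as the example on this page
phat <- x / n                           # sample proportion
z <- qnorm((1 + c.level) / 2)           # critical value, about 1.96 for 95%
moe <- z * sqrt(phat * (1 - phat) / n)  # margin of error
wald_ci <- c(lower = phat - moe, upper = phat + moe)
print(wald_ci)
```

The result should agree with the limits that binom.approx reports for the same inputs.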

# Two proportions (interval for p1 - p2)
# Template: prop.test(x = c(x1, x2), n = c(n1, n2), conf.level = 0.95, correct = FALSE)
prop.test(x = c(10, 12), n = c(50, 48), conf.level = 0.95, correct = FALSE)
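For two proportions, the interval that prop.test reports without continuity correction is for the difference p1 - p2. The sketch below rebuilds it by hand from the same counts and compares it with the built-in result; the variable names are illustrative, not part of the course material.

```r
# Interval for the difference of two proportions, by hand and with prop.test.
x <- c(10, 12); n <- c(50, 48); c.level <- 0.95
phat <- x / n                                # the two sample proportions
diff_hat <- phat[1] - phat[2]                # estimated difference p1 - p2
z <- qnorm((1 + c.level) / 2)
moe <- z * sqrt(sum(phat * (1 - phat) / n))  # margin of error for the difference
manual_ci <- c(diff_hat - moe, diff_hat + moe)
builtin_ci <- as.numeric(
  prop.test(x = x, n = n, conf.level = c.level, correct = FALSE)$conf.int
)
print(manual_ci)
print(builtin_ci)
```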

2. Confidence Intervals for means, assume sigma known:

In the practice of statistics, confidence intervals for means are computed from raw data, that is, the original collected observations, and R functions such as t.test expect raw data rather than summary statistics. When only summary statistics are given (sample mean, standard deviation, sample size), we have to implement the formulas using basic operations in R:

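# Confidence interval for a mean, sigma known, summary statistics given:
# xbar plus or minus the margin of error, z * sigma/sqrt(n),
# where sigma/sqrt(n) is the standard error.
sigma <- 1.1; n <- 50; c.level <- 0.99; xbar <- 10
moe <- qnorm((1 + c.level)/2) * sigma/sqrt(n) # margin of error, not the standard error
ci <- c(xbar - moe, xbar + moe)
print(ci)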

3. Confidence Intervals for means, assume sigma unknown:

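# Confidence interval for a mean, sigma unknown; that is, only the sample sd is known:
# xbar plus or minus the margin of error, t * s/sqrt(n),
# where s/sqrt(n) is the standard error and t has n - 1 degrees of freedom.
s <- 1.6; n <- 20; c.level <- 0.95; xbar <- 10 # s instead of sd, to avoid masking R's sd() function
moe <- qt((1 + c.level)/2, n - 1) * s/sqrt(n) # margin of error, not the standard error
ci <- c(xbar - moe, xbar + moe)
print(ci)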
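The only change between the sigma-known and sigma-unknown formulas is the critical value: qnorm is replaced by qt with n - 1 degrees of freedom. A short sketch comparing the two, with sample sizes picked here purely for illustration:

```r
# Compare z and t critical values at the same confidence level.
c.level <- 0.95; n <- 20
z_crit <- qnorm((1 + c.level) / 2)
t_crit <- qt((1 + c.level) / 2, df = n - 1)
print(c(z = z_crit, t = t_crit))

# The t critical value shrinks toward z as the sample size grows.
sizes <- c(5, 20, 100, 1000)           # illustrative sample sizes
t_by_n <- qt((1 + c.level) / 2, df = sizes - 1)
print(data.frame(n = sizes, t_crit = t_by_n))
```

Because the t critical value is larger, the t interval is wider than the z interval for the same standard error, which reflects the extra uncertainty from estimating sigma.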

4. Confidence Intervals for means, raw data available:

x<-c(9.2,9.9,11.1,10.3,13,12.3,9.2,11,10.8) # using sample data.
t.test(x, mu = 10, conf.level = 0.90) # also tests "is the population mean equal to 10 units?"
# By default, mu = 0 and conf.level = 0.95. We don't need to enter a mu value,
# since, for now, we are only concerned with the confidence interval.
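When only the interval is needed, it can be pulled out of the t.test result through its conf.int component. The sketch below does that for the same data and checks it against the sigma-unknown formula; the names res, ci_builtin, and ci_manual are illustrative.

```r
# Extract the confidence interval from t.test and compare with the formula.
x <- c(9.2, 9.9, 11.1, 10.3, 13, 12.3, 9.2, 11, 10.8)
c.level <- 0.90

res <- t.test(x, conf.level = c.level)   # mu left at its default
ci_builtin <- as.numeric(res$conf.int)   # drop the conf.level attribute

n <- length(x)
moe <- qt((1 + c.level) / 2, df = n - 1) * sd(x) / sqrt(n)
ci_manual <- c(mean(x) - moe, mean(x) + moe)

print(ci_builtin)
print(ci_manual)
```

The value of mu affects the test statistic and p-value only, so the interval is the same whether or not mu is supplied.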

# Two-sample t.test with data:
# t.test(x, y, conf.level = ?) where x and y are vectors of values, say,
# sample 1: x <- c(a, b, c, etc.)
# sample 2: y <- c(d, e, f, etc.)
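A runnable version of the two-sample template might look like the following. The data values are made up for illustration and do not come from the course. By default t.test does not assume equal variances (Welch's method); var.equal = TRUE requests the pooled-variance interval instead.

```r
# Two independent samples (hypothetical values, for illustration only).
x <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.5)
y <- c(10.8, 11.1, 10.2, 11.6, 10.9, 11.3, 10.5)

welch  <- t.test(x, y, conf.level = 0.95)                    # default: unequal variances
pooled <- t.test(x, y, conf.level = 0.95, var.equal = TRUE)  # pooled variance

print(welch$conf.int)    # interval for mean(x) - mean(y), Welch
print(pooled$conf.int)   # interval for mean(x) - mean(y), pooled
```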

# Two samples: paired data
before <-c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after <-c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)
results <- t.test(before, after, paired = TRUE) # interval for the mean of the differences (before - after)
results
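A paired t interval is a one-sample t interval applied to the differences. The sketch below checks this with the same before and after data; d and the two result names are introduced here for illustration.

```r
# Paired t interval versus one-sample t interval on the differences.
before <- c(200.1, 190.9, 192.7, 213, 241.4, 196.9, 172.2, 185.5, 205.2, 193.7)
after  <- c(392.9, 393.2, 345.1, 393, 434, 427.9, 422, 383.9, 392.3, 352.2)

d <- before - after                          # one difference per pair
ci_paired <- as.numeric(t.test(before, after, paired = TRUE)$conf.int)
ci_diff   <- as.numeric(t.test(d)$conf.int)  # default conf.level = 0.95 in both calls

print(ci_paired)
print(ci_diff)
```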

5. Determining sample size using formulas:

# In STA2023 we will implement the following formulas in R.
# For proportions, n = z^2 * phat * (1 - phat) / E^2. Example:
c.level <- 0.95; phat <- 0.43; E <- 0.035 # margin of error E = 3.5%
n <- qnorm((1 + c.level)/2)^2 * phat * (1 - phat)/E^2 # if phat is unknown, enter 0.5
n <- ceiling(n) # rounding up
print(n)

# For means, n = (z * sigma / E)^2:
c.level <- 0.95; sigma <- 593; E <- 120
n <- (qnorm((1 + c.level)/2) * sigma/E)^2
n <- ceiling(n) # rounding up
print(n)
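If these formulas are needed repeatedly, they could be wrapped in small helper functions. The function names n_for_proportion and n_for_mean below are hypothetical, introduced only for this sketch; the formulas are the ones used in this section.

```r
# Hypothetical helpers wrapping the two sample-size formulas.
n_for_proportion <- function(E, c.level = 0.95, phat = 0.5) {
  # phat defaults to 0.5, the conservative choice when no estimate is available
  z <- qnorm((1 + c.level) / 2)
  ceiling(z^2 * phat * (1 - phat) / E^2)
}

n_for_mean <- function(E, sigma, c.level = 0.95) {
  z <- qnorm((1 + c.level) / 2)
  ceiling((z * sigma / E)^2)
}

n_with_estimate <- n_for_proportion(E = 0.035, phat = 0.43)  # same inputs as the example
n_conservative  <- n_for_proportion(E = 0.035)               # phat unknown, 0.5 used
n_mean          <- n_for_mean(E = 120, sigma = 593)
print(c(n_with_estimate, n_conservative, n_mean))
```

Using phat = 0.5 maximizes phat * (1 - phat), so the conservative sample size is never smaller than the one based on an estimate.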

6. Determining sample size using the pwr package in R:

Power of a test
In statistics, the power of a hypothesis test is the probability that the test correctly rejects the null hypothesis when a specific alternative hypothesis is true. The target power is usually set to 80%.
Power = 1 - P(Type II error)
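To make the definition concrete, the sketch below uses power.t.test from base R's stats package (not the pwr package) to compute the power of a one-sample t test at several sample sizes. The effect size (delta = 0.5, sd = 1) and the sample sizes are arbitrary values chosen for illustration.

```r
# Power of a one-sample, two-sided t test at several sample sizes (base R).
sizes <- c(5, 10, 20, 40)   # illustrative sample sizes
pw <- sapply(sizes, function(n) {
  power.t.test(n = n, delta = 0.5, sd = 1, sig.level = 0.05,
               type = "one.sample", alternative = "two.sided")$power
})
beta <- 1 - pw              # Type II error probability at each n
print(data.frame(n = sizes, power = pw, type2_error = beta))

# Leaving n out and giving power asks for the required sample size instead.
needed <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80,
                       type = "one.sample")$n
print(ceiling(needed))      # round up to a whole number of observations
```

Power grows with the sample size, which is why a target power can be turned into a required n.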

install.packages("pwr")
library(pwr)
#pwr.t.test(n = , d = , sig.level = , power = , type = c("two.sample", "one.sample", "paired"))

pwr.t.test(d = 1, sig.level = 0.05, type = "one.sample", power = 0.90) # n is omitted, so the function solves for n

# Power calculations for proportion tests, one sample (h = effect size; n = sample size):
# pwr.p.test(h = NULL, n = NULL, sig.level = 0.05, power = NULL, alternative = c("two.sided","less","greater"))
# Leave exactly one of h, n, power, and sig.level as NULL; pwr.p.test solves for it.

# for more info on power analysis:

https://bookdown.org/pdr_higgins/rmrwr/sample-size-calculations-with-pwr.html

College of the Redwoods:
using R in Statistics.

Probability Distributions
