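Compute the following summary statistics for each of the four variables in the dataset: sample size, minimum, maximum, range, mean, median, mode, variance, skewness, kurtosis, coefficient of variation, midhinge, trimean, Yule's coefficient, interquartile range, and selected percentiles.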
The following built-in functions can be accessed from R's base packages to compute summary statistics: min, max, mean, median, var, IQR, and quantile.
Use the code below to compute the summary statistics.
Compute Summary Statistics
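# Summary Statistics
# Assumes eda_data holds the four numeric variables Var1 to Var4, that the
# moments package is installed, and that the helper functions getMode(),
# midhinge(), and yule() are already defined in the script.
library(dplyr)  # provides pull()

# Note: several result names below (min, max, range, mean, median, mode)
# mask base R functions of the same name in the workspace.
n <- nrow(eda_data)                                # number of observations
min <- apply(eda_data, 2, min)                     # column minima
max <- apply(eda_data, 2, max)                     # column maxima
range <- max - min                                 # range of each variable
mean <- apply(eda_data, 2, mean)
median <- apply(eda_data, 2, median)
mode <- apply(eda_data, 2, getMode)
variance <- apply(eda_data, 2, var)                # sample variance
skewness <- apply(eda_data, 2, moments::skewness)
kurtosis <- apply(eda_data, 2, moments::kurtosis)
cv <- sqrt(variance) / mean                        # coefficient of variation
mh <- apply(eda_data, 2, midhinge)                 # midhinge
trimean <- 0.5 * (mh + median)                     # average of midhinge and median
yule_coeff <- apply(eda_data, 2, yule)             # Yule's coefficient
iqr <- apply(eda_data, 2, IQR)                     # interquartile range

# Selected percentiles for each variable
pr <- c(0.010, 0.025, 0.050, 0.100, 0.200, 0.800, 0.950, 0.975, 0.990)
var1_pctile <- quantile(pull(eda_data, Var1), probs = pr)
var2_pctile <- quantile(pull(eda_data, Var2), probs = pr)
var3_pctile <- quantile(pull(eda_data, Var3), probs = pr)
var4_pctile <- quantile(pull(eda_data, Var4), probs = pr)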
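The script above calls getMode, midhinge, and yule, none of which ships with base R, so they must be defined before the script runs. If your script does not define them yet, the sketch below shows one plausible set of definitions. These are assumptions written for illustration, not necessarily the definitions this tutorial intends. In particular, the tie-breaking rule in getMode and the default quantile method (type 7) are choices made here.

```r
# Hypothetical helper definitions (assumptions for illustration only).

# Mode: the most frequent value; ties resolve to the value seen first.
getMode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Midhinge: the average of the first and third quartiles.
midhinge <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  mean(q)
}

# Yule's coefficient of skewness: (Q3 + Q1 - 2 * Q2) / (Q3 - Q1).
yule <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.50, 0.75), names = FALSE)
  (q[3] + q[1] - 2 * q[2]) / (q[3] - q[1])
}
```

With these definitions, 0.5 * (midhinge + median) equals (Q1 + 2 * Q2 + Q3) / 4, which matches the way the script computes trimean.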
Once you have run the script created up to this point, each summary statistic appears as an object in the Environment tab of the RStudio interface. You can inspect the values in that pane, or print the values for each variable in the Console by typing the name of the statistic (e.g. skewness) and pressing Enter.
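Printing each statistic one at a time can get tedious, so it may help to collect several of them into a single table. The sketch below is one possible approach, not part of the original script. It uses a toy stand-in for eda_data with made-up random values, and the names summary_tbl and pctile_tbl are hypothetical.

```r
# Toy stand-in for eda_data (hypothetical values, for illustration only).
set.seed(1)
eda_data <- data.frame(
  Var1 = rnorm(50),
  Var2 = runif(50),
  Var3 = rexp(50),
  Var4 = rnorm(50, mean = 10)
)

# Row-bind per-variable statistics into one matrix:
# rows are statistics, columns are variables.
summary_tbl <- rbind(
  min      = apply(eda_data, 2, min),
  max      = apply(eda_data, 2, max),
  mean     = apply(eda_data, 2, mean),
  median   = apply(eda_data, 2, median),
  variance = apply(eda_data, 2, var),
  iqr      = apply(eda_data, 2, IQR)
)
print(round(summary_tbl, 3))

# Percentiles for all variables in one call: sapply() returns a matrix
# with one row per probability and one column per variable.
pr <- c(0.010, 0.025, 0.050, 0.100, 0.200, 0.800, 0.950, 0.975, 0.990)
pctile_tbl <- sapply(eda_data, quantile, probs = pr)
print(round(pctile_tbl, 3))
```

The same pattern extends to the remaining statistics by adding more named rows to the rbind() call.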