1. Which type of statistical test is used to compare means of two independent groups?
A. Paired t-test
B. ANOVA
C. Independent samples t-test
D. Chi-square test
2. Which of the following is an example of ordinal data?
A. Temperature in Celsius
B. Colors of cars in a parking lot
C. Customer satisfaction ratings (e.g., Very Dissatisfied, Dissatisfied, Neutral, Satisfied, Very Satisfied)
D. Heights of students in a class
3. What is the 'null hypothesis' in hypothesis testing?
A. The hypothesis that the researcher is trying to prove.
B. The hypothesis that there is no effect or no difference.
C. The hypothesis that is always true.
D. The hypothesis based on the sample data.
4. In regression analysis, what does the R-squared value represent?
A. The slope of the regression line.
B. The correlation coefficient between the variables.
C. The proportion of variance in the dependent variable that is predictable from the independent variable(s).
D. The p-value of the regression model.
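Illustration for question 4: a minimal sketch of how R-squared can be computed by hand for a simple linear regression, as one minus the ratio of residual to total sum of squares. The data values and the helper name `r_squared` are made up for this example; they are not part of the quiz.

```python
import statistics

def r_squared(x, y):
    """R-squared of a least-squares line fitted to y as a function of x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    # Least-squares slope and intercept for a single predictor.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line vs. total variation in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical, nearly linear data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(x, y)
print(round(r2, 4))
```

A value close to 1 means the fitted line accounts for almost all of the variation in y; a value close to 0 means it accounts for very little.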
5. What is the difference between a Type I error and a Type II error in hypothesis testing?
A. Type I error is failing to reject a false null hypothesis, while Type II error is rejecting a true null hypothesis.
B. Type I error is rejecting a true null hypothesis, while Type II error is failing to reject a false null hypothesis.
C. Type I error occurs with small sample sizes, while Type II error occurs with large sample sizes.
D. Type I error is related to the p-value, while Type II error is related to the confidence interval.
6. What is the primary purpose of hypothesis testing in statistics?
A. To prove the null hypothesis is true.
B. To calculate the sample mean and standard deviation.
C. To determine if there is enough evidence to reject the null hypothesis in favor of the alternative hypothesis.
D. To describe the characteristics of a sample.
7. What does 'statistical significance' mean in hypothesis testing?
A. The result is important in a practical sense.
B. The result is very likely to be due to chance alone.
C. The observed effect is unlikely to have occurred by random chance alone, assuming the null hypothesis is true.
D. The sample size is very large.
8. Why is random sampling important in statistical studies?
A. It guarantees that the sample mean will be exactly equal to the population mean.
B. It eliminates the need for statistical inference.
C. It helps minimize selection bias and makes the sample more likely to be representative of the population.
D. It makes the data collection process faster and cheaper.
9. Which of the following is the best description of 'statistical inference'?
A. The process of summarizing data using graphs and tables.
B. The method of collecting data from every member of a population.
C. The practice of using sample data to make generalizations or conclusions about a larger population.
D. The calculation of basic statistics like mean, median, and mode.
10. What is the Central Limit Theorem?
A. A theorem stating that the sample mean is always equal to the population mean.
B. A theorem that applies only to normally distributed populations.
C. A theorem stating that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the population's distribution.
D. A theorem used to calculate confidence intervals for small sample sizes.
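Illustration for question 10: a small simulation sketch, assuming a strongly right-skewed population (exponential with mean 1, chosen only for this example). It draws many samples of size n, records each sample mean, and reports the centre, spread, and skewness of those means. The helper names `sample_means` and `skewness` are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

def sample_means(n, trials=2000):
    """Means of `trials` samples, each of n draws from Exponential(rate=1)."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

results = {n: sample_means(n) for n in (2, 30, 200)}
for n, means in results.items():
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          round(skewness(means), 3))
```

As n grows, the sample means should stay centred near 1 while their spread and skewness shrink, which is the behaviour the theorem describes.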
11. Which of the following is NOT a measure of central tendency?
A. Mean
B. Median
C. Mode
D. Standard Deviation
12. In statistics, what is 'bias'?
A. Random error in measurements.
B. Systematic error that consistently distorts results in one direction.
C. The variability of a statistic across different samples.
D. The standard deviation of the sample.
13. In statistical terms, what does 'population' refer to?
A. A subset of individuals selected from a larger group.
B. The entire group of individuals, objects, or events of interest in a study.
C. The average of all values in a dataset.
D. The range of values in a dataset.
14. What is the purpose of descriptive statistics?
A. To make predictions about future events.
B. To draw conclusions about a population based on a sample.
C. To summarize and describe the main features of a dataset.
D. To establish cause-and-effect relationships between variables.
15. What does it mean for a statistical test to have high 'power'?
A. It is very likely to make a Type I error.
B. It is very likely to fail to reject a false null hypothesis.
C. It is very likely to correctly reject a false null hypothesis.
D. It is very likely to reject a true null hypothesis.
16. What is the purpose of confidence intervals in statistics?
A. To provide a single, exact estimate of a population parameter.
B. To provide a range of values that is likely to contain the true population parameter with a certain level of confidence.
C. To determine if the null hypothesis should be rejected.
D. To eliminate sampling error.
17. Which statistical concept is most directly related to the spread or variability of data?
A. Mean
B. Median
C. Standard Deviation
D. Mode
18. What is the purpose of 'stratified sampling'?
A. To select participants based on convenience.
B. To divide the population into subgroups (strata) and then randomly sample from each stratum.
C. To sample every nth individual in a population.
D. To select clusters of individuals that are geographically close.
19. What is a 'sampling distribution'?
A. The distribution of values in a single sample.
B. The distribution of a statistic (e.g., sample mean) calculated from all possible samples of a given size from a population.
C. The distribution of the population.
D. Any distribution used for sampling.
20. Why is it important to consider the 'assumptions' of a statistical test before applying it?
A. Assumptions are not important if the sample size is large enough.
B. Violating assumptions can lead to unreliable or invalid results from the test.
C. Assumptions only affect the p-value, not the conclusion of the test.
D. All statistical tests have the same assumptions.
21. Which of the following is a common method for visualizing the distribution of a single numerical variable?
A. Scatter plot
B. Bar chart
C. Histogram
D. Pie chart
22. Which type of data is categorical and unordered?
A. Ordinal data
B. Interval data
C. Nominal data
D. Ratio data
23. When is it appropriate to use the median instead of the mean as a measure of central tendency?
A. When the data is normally distributed.
B. When the data is categorical.
C. When the data is skewed or contains outliers.
D. When you want to calculate the average of all values.
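Illustration for question 23: a minimal sketch comparing how the mean and the median react when one extreme value is added. The numbers are hypothetical (for example, salaries in thousands).

```python
import statistics

salaries = [30, 32, 35, 38, 40]     # hypothetical values
with_outlier = salaries + [500]     # the same data plus one extreme value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

Here the mean moves from 35 to 112.5, while the median only moves from 35 to 36.5.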
24. What is 'variance' a measure of?
A. Central tendency.
B. The median value.
C. Dispersion or spread of data points around the mean.
D. The most frequent value.
25. What is 'multicollinearity' in regression analysis, and why is it a concern?
A. It is when the dependent variable is categorical; it's not a concern.
B. It is high correlation between independent variables; it can lead to unstable coefficient estimates and difficulty in determining the individual effect of each predictor.
C. It is low correlation between independent variables; it improves the model.
D. It refers to non-linearity in the relationship between variables; it's always desirable.
26. What does a p-value in hypothesis testing represent?
A. The probability that the null hypothesis is true.
B. The probability of making a Type I error.
C. The probability of observing data as extreme as, or more extreme than, the observed data if the null hypothesis were true.
D. The probability that the alternative hypothesis is true.
27. What is the difference between 'interval' and 'ratio' scales of measurement?
A. Interval scales have a true zero point, while ratio scales do not.
B. Ratio scales have a true zero point, while interval scales do not.
C. Interval scales are categorical, while ratio scales are numerical.
D. There is no difference; the terms are interchangeable.
28. What is 'correlation' in statistics?
A. The cause-and-effect relationship between two variables.
B. A measure of the linear association between two variables.
C. The difference between two means.
D. The variability of a single variable.
29. What is the relationship between sample size and the margin of error in a confidence interval?
A. As sample size increases, the margin of error increases.
B. As sample size increases, the margin of error decreases.
C. Sample size and margin of error are unrelated.
D. The margin of error is determined only by the confidence level, not sample size.
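Illustration for question 29: a minimal sketch of the margin of error for a mean, assuming a known population standard deviation and a normal (z-based) interval, so the margin is z * sigma / sqrt(n). The function name `margin_of_error` and the value sigma = 10 are assumptions made for this example.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """Half-width of a z-based confidence interval for a mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / math.sqrt(n)

for n in (25, 100, 400):
    print(n, round(margin_of_error(10, n), 2))
```

Because n appears under a square root, quadrupling the sample size halves the margin of error under these assumptions.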
30. In the context of statistical graphs, what is a 'boxplot' primarily used for?
A. Showing the relationship between two categorical variables.
B. Comparing the means of multiple groups.
C. Displaying the distribution of a numerical variable, including median, quartiles, and outliers.
D. Showing the frequency of categorical data.