High School Statistics

Examples • Visuals • Quizzes

Histograms & Frequency Polygons — concept & worked examples

A histogram groups continuous data into adjacent bins. The vertical axis shows frequency (counts). A frequency polygon connects bin midpoints with straight lines to emphasise shape.

Key ideas (brief)

  • Bin width controls smoothness: small width → noisy, large width → over-smooth.
  • Sturges' rule: k ≈ 1 + log2(n) — a quick starting point for number of bins.
  • Histograms show distribution shape: symmetric, skewed, multimodal, or uniform.

Worked example (step-by-step)

Data: 45,52,47,60,62,58,50,49,71,66,53,57

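  1. n = 12 → Sturges k ≈ 1 + log2(12) ≈ 4.58 → round up to 5 bins
  2. Min = 45, Max = 71, range = 26, bin width ≈ 26 / 5 = 5.2 → round up to 6
  3. Bins (left edge included, right edge excluded): [45, 51), [51, 57), [57, 63), [63, 69), [69, 75). For whole-number data these are 45–50, 51–56, 57–62, 63–68, 69–74.
  4. Count frequencies: e.g. 45–50 → {45, 47, 49, 50} → 4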
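
The four steps above can be sketched in a few lines of Python. This is only an illustration under the same choices as the worked example (round the number of bins and the bin width up, start the first bin at the minimum); the variable names are made up for this sketch.

```python
import math

data = [45, 52, 47, 60, 62, 58, 50, 49, 71, 66, 53, 57]

# Step 1: Sturges' rule, rounded up to a whole number of bins.
k = math.ceil(1 + math.log2(len(data)))

# Step 2: bin width = range / k, rounded up so the bins cover the maximum.
width = math.ceil((max(data) - min(data)) / k)

# Step 3: left edges of the half-open bins [edge, edge + width).
edges = [min(data) + i * width for i in range(k)]

# Step 4: count how many values fall in each bin.
counts = [sum(1 for x in data if lo <= x < lo + width) for lo in edges]

# Midpoints are the x-coordinates a frequency polygon would connect.
midpoints = [lo + width / 2 for lo in edges]

for lo, mid, c in zip(edges, midpoints, counts):
    print(f"[{lo}, {lo + width})  midpoint {mid}  frequency {c}")
```

Using half-open bins means every value lands in exactly one bin, so the frequencies always add up to n.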

              
Quick quiz
  1. Which bin will contain value 57 from the example? (A) 45–50 (B) 51–56 (C) 57–62
  2. True/False: A frequency polygon uses bin midpoints.

Regression analysis — scatter, least-squares, r & R²

Linear regression models the relationship between X and Y with a line y = mx + c. The best-fit (least-squares) line minimises the sum of squared vertical distances.

Important formulas (simple)

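Slope: m = Σ(xi−x̄)(yi−ȳ) / Σ(xi−x̄)². Intercept: c = ȳ − m·x̄. Correlation: r = cov(x, y) / (sx·sy), where sx and sy are the standard deviations of x and y. For a straight-line fit with one predictor, R² = r².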
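
As a rough check of these formulas, here is a minimal Python sketch that applies them directly. The data (hours studied and marks) are invented for illustration and do not come from the lesson.

```python
import math

# Hypothetical data: hours studied (x) and marks (y).
x = [1, 2, 3, 4, 5]
y = [52, 55, 61, 64, 68]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of products and squares of deviations from the means.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

m = sxy / sxx                   # slope
c = y_bar - m * x_bar           # intercept
r = sxy / math.sqrt(sxx * syy)  # correlation; the n-1 factors in cov, sx, sy cancel
r_squared = r ** 2

print(f"y = {m:.2f}x + {c:.2f}, r = {r:.3f}, R^2 = {r_squared:.3f}")
```

For this made-up data the slope is 4.1 and the intercept is 47.7, so the fitted line passes through the point (x̄, ȳ) = (3, 60), as a least-squares line always does.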

Quick quiz
  1. What does R² = 0.64 mean? (brief)
  2. True/False: r = 0 implies no linear relationship.

Time Series & Moving Averages

Moving averages smooth short-term fluctuations; a w-period simple moving average at time t is the mean of the w most recent observations.
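
The definition can be written as a short Python function. This is a sketch, not a standard library routine: the function name and the monthly values below are assumptions made for illustration.

```python
def simple_moving_average(values, w):
    """Return the w-period simple moving averages of values.

    One average is produced for each time point that has w observations
    available, so the result has len(values) - w + 1 entries.
    """
    if w < 1 or w > len(values):
        raise ValueError("w must be between 1 and len(values)")
    return [sum(values[t - w + 1 : t + 1]) / w for t in range(w - 1, len(values))]


sales = [10, 12, 14, 13, 15]  # hypothetical monthly values
print(simple_moving_average(sales, 3))  # [12.0, 13.0, 14.0]
```

The first average, (10 + 12 + 14) / 3 = 12, only becomes available at the third month, which is why the smoothed series is shorter than the original.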

Quick quiz
  1. For the example, the first 3-month MA is what? (A) 25 (B) 24.3 (C) 26
  2. True/False: Increasing window size reduces noise but lags more.

Mean, Median & Mode — how to choose and worked examples

Mean is the arithmetic average, median is the middle value, and mode is the most frequent value. Median is robust to outliers; mean uses all data and is useful for further calculations (variance).
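
To see the effect of an outlier, the sketch below uses Python's built-in `statistics` module on a small invented data set, with and without one extreme value.

```python
import statistics

values = [3, 5, 5, 7, 40]       # hypothetical data with one large outlier
without_outlier = [3, 5, 5, 7]  # the same data with the outlier removed

mean_with = statistics.mean(values)              # 60 / 5 = 12
mean_without = statistics.mean(without_outlier)  # 20 / 4 = 5
median_with = statistics.median(values)          # middle value: 5
median_without = statistics.median(without_outlier)  # average of 5 and 5: 5
mode_with = statistics.mode(values)              # most frequent value: 5

print(mean_with, mean_without)      # the mean moves a lot
print(median_with, median_without)  # the median does not move
```

One extreme value pulls the mean from 5 up to 12, while the median and mode stay at 5.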

Tip: For skewed distributions (income, house prices) use median. For symmetric distributions use mean.
Quick quiz
  1. Which is resistant to outliers? (A) Mean (B) Median (C) Mode
  2. True/False: Median is always equal to one of the data points.

Variance & Standard Deviation — intuition and worked example

Variance is the average squared deviation from the mean. Sample variance divides by n−1 instead of n (Bessel's correction). SD is the square root of the variance, so it is in the same units as the data.
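
The difference between dividing by n and by n−1 is easy to see numerically. The following sketch uses Python's `statistics` module on a small invented data set whose mean is 5 and whose squared deviations sum to 32.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

# Population formulas divide the sum of squared deviations (32) by n = 8.
pop_var = statistics.pvariance(data)  # 32 / 8 = 4
pop_sd = statistics.pstdev(data)      # sqrt(4) = 2

# Sample formulas divide by n - 1 = 7 (Bessel's correction).
sample_var = statistics.variance(data)  # 32 / 7, about 4.57
sample_sd = statistics.stdev(data)      # about 2.14

print(pop_var, pop_sd, sample_var, sample_sd)
```

The sample values are always a little larger than the population values for the same data, and the gap shrinks as n grows.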

Tip: Use population formulas only when you truly have every member; otherwise use sample formulas.
Quick quiz
  1. If all values are equal, variance = ?
  2. True/False: SD is measured in squared units.

Probability basics — counting, independence & simulation

Probability measures how likely an event is, on a scale from 0 (impossible) to 1 (certain). For equally likely outcomes: P = favourable / total. Events A and B are independent exactly when P(A∩B) = P(A)P(B).

Worked example: Probability of drawing an ace from a standard deck = 4/52 = 1/13 ≈ 0.0769.

Tip: Simulation helps build intuition — repeat experiments to see the law of large numbers in action.
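
A possible simulation of the ace example is sketched below, assuming each draw is from a full, well-shuffled deck (the card is put back each time). The function name and the card labelling are choices made for this sketch.

```python
import random
from fractions import Fraction


def estimate_ace_probability(trials, seed=0):
    """Estimate P(ace) by drawing one card from a full deck `trials` times."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    # Cards are labelled 0..51; labels 0..3 stand for the four aces.
    aces = sum(1 for _ in range(trials) if rng.randrange(52) < 4)
    return aces / trials


exact = Fraction(4, 52)  # simplifies to 1/13
print("exact:", exact, "≈", round(float(exact), 4))

for trials in (100, 10_000, 200_000):
    print(trials, "draws ->", estimate_ace_probability(trials))
```

With only 100 draws the estimate can be noticeably off; with many more draws it usually settles close to 1/13 ≈ 0.0769.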
Quick quiz
  1. Probability of drawing a heart from a full deck? (A) 1/4 (B) 1/13 (C) 1/52
  2. True/False: Probabilities can be negative.

Advanced examples, tips & extra practice

Worked example — Histogram interpretation

We often want to compare shapes (e.g. skewness) and spot outliers. Below is a worked guide to interpret the histogram from the earlier example.

  1. Check symmetry: are bars balanced left and right around the centre?
  2. Identify skew: a long tail on the right = right-skewed (positive), on the left = left-skewed (negative).
  3. Spot outliers: isolated bars far from others — consider investigating or removing for some analyses.

Practice

Using the example dataset (45,52,47,60,62,58,50,49,71,66,53,57), answer:

  • Is the distribution symmetric or skewed?
  • Which bin contains the highest frequency?

Regression — interpreting slope & intercept

Beyond fitting a line, interpretation matters: the slope is the expected change in Y per unit increase in X. The intercept is the predicted Y when X = 0, which may not be meaningful if X = 0 lies outside the observed range.

Worked tip

If slope = 2.5 and X measures hours studied, then on average an extra hour is associated with 2.5 more marks (assuming linear model is appropriate).

Mini practice

Suppose fitted line is y = 3.2x + 5. What is predicted y when x = 4?

Time series — choosing MA window

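Window size trades smoothing against responsiveness. Use small windows (2–3) to remove minor noise. For strongly seasonal data, use a larger window that matches the seasonal period (e.g. 7 for daily data with a weekly pattern, 12 for monthly data with a yearly pattern).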

  • Short window → more responsive to changes but noisier.
  • Long window → smoother trend but may hide sudden shifts.
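
The trade-off in the two bullets above can be seen on a series with a sudden jump. The sketch below uses an invented series and a hypothetical helper function; it compares how quickly a 2-point and a 4-point moving average catch up with the new level.

```python
def simple_moving_average(values, w):
    """w-period simple moving average; one value per time point with w observations."""
    return [sum(values[t - w + 1 : t + 1]) / w for t in range(w - 1, len(values))]


# Hypothetical series that jumps from 10 to 20 at time index 4.
series = [10, 10, 10, 10, 20, 20, 20, 20]

short_ma = simple_moving_average(series, 2)  # [10.0, 10.0, 10.0, 15.0, 20.0, 20.0, 20.0]
long_ma = simple_moving_average(series, 4)   # [10.0, 12.5, 15.0, 17.5, 20.0]

# Output position i corresponds to time index i + w - 1.
short_catch_up = short_ma.index(20.0) + 2 - 1  # time index 5
long_catch_up = long_ma.index(20.0) + 4 - 1    # time index 7

print("2-point MA reaches 20 at time index", short_catch_up)
print("4-point MA reaches 20 at time index", long_catch_up)
```

The short window reaches the new level one step after the jump, while the long window needs three steps and shows a gradual ramp in between.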

Extra practice problems (with answers hidden)

  1. Compute mean & median for: 12, 15, 11, 14, 100.
  2. Given X: 2, 4, 6 and Y: 3, 5, 7, compute the slope of the least-squares line.
  3. Calculate 3-point moving average for series: 8, 9, 10, 12, 11.
  4. Probability: From a standard deck what is P(heart or queen)?

More Probability Topics

Conditional probability, tree diagrams, permutations & combinations — interactive examples and exam-style practice.

Conditional Probability

P(A | B) = P(A and B) / P(B)

Tip: If P(A ∩ B) > P(B) there is an input error — joint cannot exceed marginal.
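
The formula and the input check in the tip can be combined in a small helper. This is a sketch; the function name and the card example are chosen here for illustration.

```python
from fractions import Fraction


def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B), rejecting impossible inputs."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    if p_a_and_b < 0 or p_a_and_b > p_b:
        raise ValueError("P(A and B) must lie between 0 and P(B)")
    return p_a_and_b / p_b


# A = "card is a king", B = "card is a face card (J, Q or K)".
p_b = Fraction(12, 52)
p_a_and_b = Fraction(4, 52)  # every king is also a face card

print(conditional_probability(p_a_and_b, p_b))  # 1/3
```

Using `Fraction` keeps the answer exact: knowing the card is a face card raises the chance of a king from 1/13 to 1/3.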

Interactive Tree Diagram (2-level)

Enter probabilities for level-1 branches, then for each child. The diagram shows joint probabilities.
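
The calculation behind a two-level tree is to multiply along each path. The sketch below is not the page's interactive diagram; it uses invented branch probabilities (weather, then whether a bus is late) to show how the joint probabilities come out.

```python
from fractions import Fraction

# Hypothetical level-1 branches.
level1 = {"rain": Fraction(3, 10), "dry": Fraction(7, 10)}

# Hypothetical level-2 branches: conditional probabilities given each level-1 branch.
level2 = {
    "rain": {"late": Fraction(2, 5), "on time": Fraction(3, 5)},
    "dry": {"late": Fraction(1, 10), "on time": Fraction(9, 10)},
}

# Joint probability of each path = P(first) * P(second | first).
joint = {
    (first, second): p_first * p_second
    for first, p_first in level1.items()
    for second, p_second in level2[first].items()
}

for path, p in joint.items():
    print(path, p)

# Adding the paths that end in "late" gives the overall P(late).
p_late = joint[("rain", "late")] + joint[("dry", "late")]
print("P(late) =", p_late)  # 3/25 + 7/100 = 19/100
```

A quick sanity check for any tree is that the joint probabilities of all paths add up to 1.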

Permutations & Combinations

nPr counts ordered selections of r items from n: nPr = n! / (n−r)!. nCr counts unordered selections: nCr = n! / (r!(n−r)!).
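
Python's `math` module (version 3.8 or later) has functions for both counts. The sketch below uses n = 5 and r = 2 as an arbitrary example.

```python
import math

n, r = 5, 2

ordered = math.perm(n, r)    # 5 * 4 = 20 ordered selections
unordered = math.comb(n, r)  # 20 / 2! = 10 unordered selections

print(ordered, unordered)

# Each unordered selection can be arranged in r! ways, so nPr = nCr * r!.
print(ordered == unordered * math.factorial(r))  # True
```

Choosing 2 of 5 students for "captain and deputy" gives 20 possibilities because order matters; choosing 2 of 5 for a pair gives 10.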