FAD1015: MATHEMATICS III — Interleaved Mastery Problem Set

4-Day Intensive Study Plan
Topics: Probability, Probability Distributions, Hypothesis Testing, Matrices, Statistical Inference


How to Use This Set

Each problem deliberately combines 2-4 statistical topics in real-world data analysis scenarios. This mirrors exam conditions where you must identify appropriate distributions, perform calculations, and interpret results.

Study Schedule:

  • Day 1: Problems 1-3 (Probability foundations & distributions)
  • Day 2: Problems 4-6 (Sampling & estimation)
  • Day 3: Problems 7-9 (Hypothesis testing applications)
  • Day 4: Problems 10-12 (Matrices & multivariate analysis)

Scenario-to-Method Quick Reference

graph LR
    A[Problem Scenario] --> B{Counting defectives<br/>in n trials?}
    B -->|Yes| C[Binomial]
    A --> D{Rare events in<br/>time or space?}
    D -->|Yes| E[Poisson]
    A --> F{Sample mean vs<br/>known value?}
    F -->|Yes| G[Z-test or t-test]
    A --> H{Comparing two<br/>groups?}
    H -->|Yes| I[Two-sample test]
    A --> J{Observed vs expected<br/>frequencies?}
    J -->|Yes| K[Chi-square]
    A --> L{Updating beliefs<br/>with evidence?}
    L -->|Yes| M[Bayes' Theorem]
    A --> N{State transitions<br/>over time?}
    N -->|Yes| O[Markov Chains]

The Mastery Problems


Problem 1: Quality Control in Manufacturing [Binomial + Poisson + Normal Approximation]

A factory produces electronic components with a 2% defect rate. Components are packed in boxes of 100.

(a) Calculate the probability that a box contains exactly 3 defective components using the binomial distribution. Then approximate using the Poisson distribution and compare results. [Binomial & Poisson]

(b) For large orders (n = 500), use normal approximation to the binomial to find P(more than 15 defectives). State and justify the continuity correction. [Normal approximation]

(c) If boxes with more than 5 defectives are rejected, calculate the rejection rate and the expected number of boxes rejected in a shipment of 1000 boxes. [Expected value]
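
As a way to check hand calculations for parts (b) and (c), the following R sketch applies the continuity correction and compares it with the exact binomial tail. It assumes defects are independent across components; the variable names are illustrative only.

```r
# Sketch for Problem 1(b): normal approximation to Binomial(n = 500, p = 0.02)
n <- 500
p <- 0.02
mu <- n * p                      # mean of the binomial
sigma <- sqrt(n * p * (1 - p))   # standard deviation of the binomial

# P(X > 15) = P(X >= 16); the continuity correction puts the boundary at 15.5
approx_prob <- pnorm(15.5, mean = mu, sd = sigma, lower.tail = FALSE)

# Exact binomial tail, for comparison with the approximation
exact_prob <- pbinom(15, size = n, prob = p, lower.tail = FALSE)
print(c(normal_approx = approx_prob, exact = exact_prob))

# Sketch for Problem 1(c): a box of 100 is rejected when X > 5
reject_rate <- pbinom(5, size = 100, prob = p, lower.tail = FALSE)
expected_rejected <- 1000 * reject_rate   # expected rejected boxes per 1000
print(c(reject_rate = reject_rate, expected_rejected = expected_rejected))
```

The gap between the two tail probabilities in (b) gives a feel for how rough the normal approximation is when np is only 10.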


Problem 2: Network Security Analysis [Conditional Probability + Bayes' Theorem + Counting]

A company uses two firewalls: Firewall A blocks 90% of attacks but has a 5% false positive rate; Firewall B blocks 85% of attacks with a 3% false positive rate. 1% of traffic is actually malicious.

(a) Calculate the probability that a connection flagged by Firewall A is actually malicious using Bayes' theorem. Repeat for Firewall B. [Bayes' theorem]

(b) If both firewalls must flag a connection before it's blocked, calculate the new detection rate and false positive rate. Assume the firewalls act independently given the traffic type. [Conditional probability]

(c) An attacker tries 10 different exploits. If the probability of each succeeding is independent and equals the probability of passing both firewalls, calculate the probability that at least one succeeds. [Counting rules]
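
One possible reading of this problem can be sketched in R as below. It rests on several assumptions that the problem does not state outright: part (a) is computed for each firewall separately, the firewalls act independently given the traffic type, and "passing both firewalls" means both fail to block an attack. The helper `posterior_malicious` is a hypothetical name.

```r
# Sketch for Problem 2 under the assumptions stated above
prior <- 0.01                 # P(malicious)
sens_A <- 0.90; fpr_A <- 0.05 # Firewall A: detection rate, false positive rate
sens_B <- 0.85; fpr_B <- 0.03 # Firewall B: detection rate, false positive rate

# Bayes' theorem: P(malicious | flagged) for a single firewall
posterior_malicious <- function(sens, fpr, prior) {
  sens * prior / (sens * prior + fpr * (1 - prior))
}
post_A <- posterior_malicious(sens_A, fpr_A, prior)
post_B <- posterior_malicious(sens_B, fpr_B, prior)

# (b) both must flag: multiply the rates (independence assumed)
detect_both <- sens_A * sens_B
fpr_both <- fpr_A * fpr_B

# (c) an exploit passes both firewalls when each one misses it
p_pass <- (1 - sens_A) * (1 - sens_B)
p_at_least_one <- 1 - (1 - p_pass)^10

print(c(post_A = post_A, post_B = post_B, detect_both = detect_both,
        fpr_both = fpr_both, p_at_least_one = p_at_least_one))
```

If "passing" is instead read as "not blocked under the both-must-flag rule", replace `p_pass` with `1 - detect_both`; the answer changes substantially, so state your interpretation.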


Problem 3: Hospital Emergency Room Modeling [Exponential + Poisson + Queuing Theory]

Patients arrive at an ER following a Poisson process with mean 4 patients per hour. Treatment time follows an exponential distribution with mean 30 minutes.

(a) Calculate the probability of no arrivals in a 15-minute period and the probability that the time until the next arrival exceeds 20 minutes. [Poisson & Exponential]

(b) What is the probability that more than 3 patients arrive while the doctor is treating one patient? [Poisson process]

(c) If the ER has 3 doctors, calculate the probability that all are busy when a patient arrives. [Basic queuing]
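
The three parts can be checked numerically with the sketch below. It assumes that the 20-minute question in (a) refers to the gap until the next arrival, that (b) uses the geometric result for Poisson arrivals during an exponential service time, and that (c) is an M/M/3 queue with unlimited waiting room (the Erlang C formula). None of these modelling choices is spelled out in the problem, so treat them as assumptions.

```r
# Sketch for Problem 3 under the assumptions stated above
lambda <- 4   # arrivals per hour
mu <- 2       # treatments per hour per doctor (mean treatment time 30 min)

# (a) no arrivals in 15 minutes; inter-arrival gap longer than 20 minutes
p_no_arrivals <- dpois(0, lambda * 15 / 60)
p_gap_over_20 <- pexp(20 / 60, rate = lambda, lower.tail = FALSE)

# (b) arrivals during one exponential service time are geometric:
#     P(N > 3) = (lambda / (lambda + mu))^4
p_more_than_3 <- (lambda / (lambda + mu))^4

# (c) Erlang C: probability that all c servers are busy in an M/M/c queue
c_servers <- 3
a <- lambda / mu        # offered load
rho <- a / c_servers    # utilisation, must be below 1
k <- 0:(c_servers - 1)
tail_term <- a^c_servers / (factorial(c_servers) * (1 - rho))
p_all_busy <- tail_term / (sum(a^k / factorial(k)) + tail_term)

print(c(p_no_arrivals = p_no_arrivals, p_gap_over_20 = p_gap_over_20,
        p_more_than_3 = p_more_than_3, p_all_busy = p_all_busy))
```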


Problem 4: Election Polling & Inference [Sampling + Confidence Intervals + Normal]

A poll of 1200 voters shows 52% support for Candidate A. Assume simple random sampling.

(a) Calculate the 95% confidence interval for the true proportion. Interpret in context. [Confidence intervals]

(b) How large a sample is needed to reduce the margin of error to ±2 percentage points at 95% confidence? [Sample size determination]

(c) If Candidate A actually has 50% support, what is the probability that a poll of 1200 voters shows 52% or more? Calculate the p-value for observing 52%. [Hypothesis testing concept]
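
A minimal R sketch of the three calculations follows, assuming the large-sample (Wald) interval, the conservative choice p = 0.5 for the sample size, and a one-sided p-value in (c). These choices are assumptions; a course marking scheme may prefer others.

```r
# Sketch for Problem 4 under the assumptions stated above
n <- 1200
p_hat <- 0.52
z <- qnorm(0.975)   # two-sided 95% critical value

# (a) Wald confidence interval for the proportion
se <- sqrt(p_hat * (1 - p_hat) / n)
ci <- p_hat + c(-1, 1) * z * se

# (b) sample size for margin of error E, using the conservative p = 0.5
E <- 0.02
n_needed <- ceiling(z^2 * 0.25 / E^2)

# (c) P(p_hat >= 0.52) when the true proportion is 0.50
se0 <- sqrt(0.5 * 0.5 / n)
p_value <- pnorm((p_hat - 0.5) / se0, lower.tail = FALSE)

print(list(ci = ci, n_needed = n_needed, p_value = p_value))
```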


Problem 5: Pharmaceutical Trial [Hypothesis Testing + Type I/II Errors + Power]

A new drug is claimed to reduce blood pressure by more than 15 mmHg on average. In a trial of n = 64 patients, the sample mean reduction is 17 mmHg with a sample standard deviation of 8 mmHg.

(a) Test the claim at α = 0.05 significance level. State null and alternative hypotheses, calculate the test statistic, and draw a conclusion. [Hypothesis testing - one sample]

(b) Calculate the probability of Type II error (β) if the true mean reduction is 18 mmHg. What is the power of the test? [Power analysis]

(c) How large a sample is needed to achieve 90% power for detecting a true mean of 18 mmHg at α = 0.05? [Sample size for power]
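
The sketch below walks through the test, the Type II error, and the sample size in R. It assumes a one-sided test (H0: μ = 15 against H1: μ > 15) and, for parts (b) and (c), a normal approximation that treats s = 8 as if it were the population standard deviation. Both are simplifying assumptions.

```r
# Sketch for Problem 5 under the assumptions stated above
n <- 64; xbar <- 17; s <- 8; mu0 <- 15; alpha <- 0.05

# (a) one-sample, one-sided t-test
se <- s / sqrt(n)
t_stat <- (xbar - mu0) / se
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)
reject_h0 <- p_value < alpha

# (b) Type II error when the true mean is 18 (normal approximation)
crit_mean <- mu0 + qnorm(1 - alpha) * se          # reject H0 if xbar exceeds this
beta <- pnorm(crit_mean, mean = 18, sd = se)      # P(fail to reject | mu = 18)
power <- 1 - beta

# (c) sample size for 90% power at mu = 18 (normal approximation)
n_power <- ceiling(((qnorm(1 - alpha) + qnorm(0.90)) * s / (18 - mu0))^2)

print(list(t_stat = t_stat, p_value = p_value, reject_h0 = reject_h0,
           beta = beta, power = power, n_power = n_power))
```

Because the t distribution has slightly heavier tails than the normal, an exact t-based answer to (c) can come out a little larger than the normal-approximation value.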


Problem 6: Portfolio Risk Analysis [Normal + Binomial + Expected Value]

An investment portfolio has daily returns normally distributed with mean 0.05% and standard deviation 1.2%.

(a) Calculate the Value at Risk (VaR) at 95% confidence for a $1,000,000 portfolio over 1 day. What does this mean? [Normal distribution - percentiles]

(b) Calculate the probability of losing money on any given day and the expected number of losing days in a 252-day trading year. [Binomial]

(c) If the correlation between this portfolio and the market (μ = 0.04%, σ = 1.0%) is 0.6, calculate the covariance and construct the variance-covariance matrix for a two-asset portfolio. [Matrices & statistics]
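
For checking answers, here is a short R sketch. It assumes VaR is reported as a positive loss at the 5th percentile of the daily return, that days are independent for the binomial count, and that the covariance matrix is expressed in squared percentage units. These conventions are assumptions.

```r
# Sketch for Problem 6 under the assumptions stated above
mu <- 0.0005      # mean daily return (0.05%)
sigma <- 0.012    # daily standard deviation (1.2%)
value <- 1e6      # portfolio value in dollars

# (a) 1-day 95% VaR: loss at the 5th percentile of the return distribution
q05 <- qnorm(0.05, mean = mu, sd = sigma)
var_95 <- -q05 * value

# (b) probability of a losing day and expected losing days per year
p_loss <- pnorm(0, mean = mu, sd = sigma)
expected_losing_days <- 252 * p_loss

# (c) covariance and variance-covariance matrix, in percent units
sd_p <- 1.2; sd_m <- 1.0; rho <- 0.6
cov_pm <- rho * sd_p * sd_m
Sigma <- matrix(c(sd_p^2, cov_pm,
                  cov_pm, sd_m^2), nrow = 2, byrow = TRUE,
                dimnames = list(c("portfolio", "market"),
                                c("portfolio", "market")))

print(list(var_95 = var_95, p_loss = p_loss,
           expected_losing_days = expected_losing_days, Sigma = Sigma))
```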


Problem 7: A/B Testing in E-commerce [Two-Sample Testing + Confidence Intervals + Proportions]

An e-commerce site tests two webpage designs. Design A: 450 conversions out of 5000 visitors. Design B: 520 conversions out of 5500 visitors.

(a) Test whether the conversion rates differ significantly at α = 0.05. Calculate the test statistic and p-value. [Two-proportion z-test]

(b) Construct a 95% confidence interval for the difference in conversion rates. Does it include zero? [CI for difference]

(c) If the true conversion rate for A is 9% and B is 9.5%, what is the probability of detecting this difference with the current sample sizes at α = 0.05? [Power calculation]
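
Parts (a) and (b) can be reproduced by hand in R as sketched below, assuming a two-sided test with the pooled standard error for the test statistic and the unpooled standard error for the confidence interval. That split is the usual textbook convention, taken here as an assumption.

```r
# Sketch for Problem 7(a)-(b) under the assumptions stated above
x <- c(A = 450, B = 520)     # conversions
n <- c(A = 5000, B = 5500)   # visitors
p_hat <- x / n
diff_hat <- unname(p_hat["A"] - p_hat["B"])

# (a) two-proportion z-test with pooled proportion
p_pool <- sum(x) / sum(n)
se_pool <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z <- diff_hat / se_pool
p_value <- 2 * pnorm(-abs(z))

# (b) 95% confidence interval for the difference (unpooled standard error)
se_diff <- sqrt(sum(p_hat * (1 - p_hat) / n))
ci <- diff_hat + c(-1, 1) * qnorm(0.975) * se_diff

print(list(z = z, p_value = p_value, ci = ci))
```

The built-in `prop.test(x, n, correct = FALSE)` should report the same p-value, since its chi-square statistic is the square of this z.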


Problem 8: Environmental Monitoring [Chi-Square + Poisson + Goodness of Fit]

Air quality measurements count pollutant particles per cubic meter on each of 100 days. Observed frequencies: 0 particles (15 days), 1 particle (28 days), 2 particles (32 days), 3 particles (18 days), 4+ particles (7 days).

(a) Test whether the data follows a Poisson distribution at α = 0.05. Estimate λ from the data and calculate expected frequencies. [Chi-square goodness of fit]

(b) Calculate the mean and variance of the observed data. For a true Poisson distribution, what should be the relationship? What does your result suggest? [Poisson properties]

(c) If standards require P(X > 3) < 0.10, does this location meet standards? Calculate the exact probability using your fitted Poisson model. [Probability calculation]
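
The estimates needed for parts (b) and (c) can be sketched in R as follows. Because the last category is open-ended, the sketch treats "4+" as exactly 4 when estimating the mean and variance; that is an assumption, and it slightly understates both.

```r
# Sketch for Problem 8(b)-(c); "4+" is treated as exactly 4 (assumption)
counts <- 0:4
observed <- c(15, 28, 32, 18, 7)
n_days <- sum(observed)

# (b) sample mean and sample variance of the grouped data
lambda_hat <- sum(counts * observed) / n_days
sample_var <- sum(observed * (counts - lambda_hat)^2) / (n_days - 1)

# (c) P(X > 3) under the fitted Poisson model
p_over_3 <- ppois(3, lambda_hat, lower.tail = FALSE)

print(c(lambda_hat = lambda_hat, sample_var = sample_var, p_over_3 = p_over_3))
```

Compare `sample_var` with `lambda_hat` for part (b), and `p_over_3` with the 0.10 limit for part (c); the fitted probability lands close to the limit, so the treatment of the open-ended category matters.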


Problem 9: Machine Learning Classification [Bayes' Theorem + Normal + Matrix Operations]

A classifier distinguishes between spam (S) and ham (H) emails. P(S) = 0.3. The word "offer" appears in 60% of spam and 10% of ham emails. Email length (words) for spam: N(150, 50²), for ham: N(100, 30²).

(a) An email contains "offer" and has 120 words. Use Naive Bayes to calculate P(Spam | "offer", length=120). Assume independence. [Bayes' theorem]

(b) Calculate the likelihood ratio and determine the classification threshold that minimizes total misclassification error when false negatives cost twice as much as false positives. [Decision theory]

(c) Represent the feature statistics as a 2×2 matrix (classes × features) and calculate the determinant. What does this tell you about feature independence? [Matrix operations]
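
A hedged R sketch of parts (a) and (b) is given below. It assumes the two features are conditionally independent given the class, uses the normal density at length 120 as the length likelihood, and treats spam as the positive class, so a false negative is spam classified as ham. The cost values 2 and 1 are just one way to encode "twice as costly".

```r
# Sketch for Problem 9(a)-(b) under the assumptions stated above
prior_spam <- 0.3
prior_ham <- 1 - prior_spam

# Class-conditional likelihoods: P("offer" | class) * density of length 120
lik_spam <- 0.60 * dnorm(120, mean = 150, sd = 50)
lik_ham <- 0.10 * dnorm(120, mean = 100, sd = 30)

# (a) Naive Bayes posterior probability of spam
post_spam <- prior_spam * lik_spam /
  (prior_spam * lik_spam + prior_ham * lik_ham)

# (b) likelihood ratio and the cost-weighted Bayes threshold:
#     classify as spam when LR > (P(ham) * cost_fp) / (P(spam) * cost_fn)
likelihood_ratio <- lik_spam / lik_ham
cost_fn <- 2; cost_fp <- 1
threshold_lr <- (prior_ham * cost_fp) / (prior_spam * cost_fn)
classify_as_spam <- likelihood_ratio > threshold_lr

print(list(post_spam = post_spam, likelihood_ratio = likelihood_ratio,
           threshold_lr = threshold_lr, classify_as_spam = classify_as_spam))
```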


Problem 10: Markov Chain Customer Analysis [Matrices + Probability + Stationary Distribution]

A store's customers follow a Markov chain: Weekly states are Shop (S), Online (O), or Inactive (I). Transition matrix P (rows from, columns to): $$P = \begin{bmatrix} 0.6 & 0.3 & 0.1 \\ 0.2 & 0.7 & 0.1 \\ 0.1 & 0.2 & 0.7 \end{bmatrix}$$

(a) If a customer shops this week, what is the probability they shop again in 2 weeks? Calculate P². [Matrix multiplication]

(b) Find the stationary distribution π satisfying πP = π. What percentage of customers shop in the long run? [Eigenvectors & stationary distribution]

(c) If there are currently 1000 shoppers, 2000 online customers, and 7000 inactive, predict the distribution after 4 weeks using matrix powers. [Matrix applications]
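
All three parts reduce to a few matrix operations, sketched here in R. The stationary distribution is found by solving the linear system (Pᵀ − I)π = 0 with one equation replaced by the constraint that π sums to 1; this is one of several equivalent approaches, chosen for transparency. The object names are illustrative.

```r
# Sketch for Problem 10: two-step transitions, stationary distribution, forecast
states <- c("S", "O", "I")
P <- matrix(c(0.6, 0.3, 0.1,
              0.2, 0.7, 0.1,
              0.1, 0.2, 0.7), nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

# (a) two-step transition matrix; entry [S, S] is P(shop in 2 weeks | shop now)
P2 <- P %*% P

# (b) stationary distribution: solve (t(P) - I) pi = 0 together with sum(pi) = 1
A <- t(P) - diag(3)
A[3, ] <- 1              # replace one redundant equation with the constraint
pi_stat <- solve(A, c(0, 0, 1))

# (c) customer counts after 4 weeks, starting from the current row vector
x0 <- c(S = 1000, O = 2000, I = 7000)
x4 <- x0 %*% P %*% P %*% P %*% P

print(list(P2 = P2, pi_stat = pi_stat, x4 = x4))
```

Since each row of P sums to 1, the total number of customers in `x4` stays at 10000, which is a quick sanity check on the multiplication order.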


Problem 11: Principal Component Analysis [Matrices + Eigenvalues + Statistics]

A dataset has covariance matrix: $$\Sigma = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$$

(a) Find the eigenvalues and eigenvectors of Σ. Interpret them in terms of variance explained. [Eigenvalue decomposition]

(b) Calculate the proportion of total variance explained by the first principal component. [PCA interpretation]

(c) If the correlation matrix (instead of covariance) is used, how would the eigenvalues change? Calculate the correlation matrix and find its trace. [Correlation vs covariance]
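
To verify a hand-computed eigen-decomposition, a brief R sketch follows. It uses only base R (`eigen` and `cov2cor`); the variable names are illustrative.

```r
# Sketch for Problem 11: eigenvalues, variance explained, correlation matrix
Sigma <- matrix(c(4, 2,
                  2, 3), nrow = 2, byrow = TRUE)

# (a) eigen() returns eigenvalues in decreasing order for symmetric matrices
eig <- eigen(Sigma)

# (b) share of total variance carried by the first principal component
prop_pc1 <- eig$values[1] / sum(eig$values)

# (c) correlation matrix and its trace (equal to the number of variables)
R <- cov2cor(Sigma)
trace_R <- sum(diag(R))
eig_R <- eigen(R)$values   # for a 2x2 correlation matrix these are 1 + r and 1 - r

print(list(values = eig$values, vectors = eig$vectors, prop_pc1 = prop_pc1,
           R = R, trace_R = trace_R, eig_R = eig_R))
```

The eigenvalues of Σ should match the roots of λ² − 7λ + 8 = 0 obtained from the characteristic equation.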


Problem 12: Clinical Decision Support System [Full Integration — All Topics]

A diagnostic test for a disease has: Sensitivity = 95% (P(T+|D+)), Specificity = 90% (P(T-|D-)), Disease prevalence = 2%.

(a) Calculate PPV (positive predictive value) and NPV (negative predictive value) using Bayes' theorem. Interpret for a patient. [Bayes' theorem]

(b) In a study of 500 patients, 15 had the disease. Test results showed 28 positive tests. Perform a hypothesis test at α = 0.05 to determine if the observed PPV differs significantly from the theoretical value. [Hypothesis testing for proportions]

(c) The test scores follow N(80, 10²) for diseased and N(70, 15²) for healthy patients. Calculate the optimal cutoff score that maximizes the sum of sensitivity and specificity. [Normal distributions - optimization]

(d) Write the confusion matrix (predicted vs actual) as a 2×2 matrix and calculate its determinant. What does det = 0 imply about the test's discriminatory power? [Matrix interpretation]
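
Parts (a) and (c) can be checked with the R sketch below. For (c) it assumes that a score above the cutoff counts as a positive test (diseased patients score higher on average) and searches for the cutoff numerically over the range 40 to 100, which is a hypothetical search interval rather than something given in the problem.

```r
# Sketch for Problem 12(a) and (c) under the assumptions stated above
sens <- 0.95; spec <- 0.90; prev <- 0.02

# (a) predictive values from Bayes' theorem
ppv <- sens * prev / (sens * prev + (1 - spec) * (1 - prev))
npv <- spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)

# (c) sensitivity + specificity as a function of the cutoff score
sens_plus_spec <- function(cutoff) {
  pnorm(cutoff, mean = 80, sd = 10, lower.tail = FALSE) +  # P(score > cutoff | diseased)
    pnorm(cutoff, mean = 70, sd = 15)                      # P(score <= cutoff | healthy)
}
best <- optimize(sens_plus_spec, interval = c(40, 100), maximum = TRUE)

print(list(ppv = ppv, npv = npv,
           cutoff = best$maximum, sens_plus_spec = best$objective))
```

Analytically, the optimum sits where the two normal densities are equal, so the numerical cutoff can be cross-checked by solving that quadratic by hand.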


Summary of Topics Combined

| Problem | Topics | Context |
|---|---|---|
| 1 | Binomial + Poisson + Normal | Quality control |
| 2 | Bayes' + Conditional + Counting | Network security |
| 3 | Poisson + Exponential | Hospital ER |
| 4 | Sampling + CI + Normal | Election polling |
| 5 | Hypothesis testing + Power | Clinical trial |
| 6 | Normal + Binomial + Matrices | Portfolio risk |
| 7 | Two-sample + CI | A/B testing |
| 8 | Chi-square + Poisson | Environmental |
| 9 | Bayes' + Normal + Matrices | ML classification |
| 10 | Markov chains + Matrices | Customer analysis |
| 11 | PCA + Eigenvalues | Dimensionality reduction |
| 12 | All topics integrated | Clinical decision |

Hypothesis Testing Flowchart

graph TD
    A[Hypothesis Test] --> B{Parameter?}
    B -->|Proportion| C{One or Two Proportions?}
    B -->|Mean| D{Sigma known?}
    D -->|Yes| E[Z-test]
    D -->|No| F[t-test]
    C -->|One| G[Z-test for proportion]
    C -->|Two| H[Two-proportion Z-test]
    E --> I[State H0 and H1<br/>Calculate test statistic<br/>Find p-value<br/>Draw conclusion]
    F --> I
    G --> I
    H --> I

Key Formulas Reference

Probability

  • $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$ (Bayes' theorem)
  • $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
  • Independence: $P(A \cap B) = P(A)P(B)$

Counting

  • Permutations: $P(n,r) = \frac{n!}{(n-r)!}$
  • Combinations: $C(n,r) = \binom{n}{r} = \frac{n!}{r!(n-r)!}$

Distributions

  • Binomial: $P(X=k) = \binom{n}{k}p^k(1-p)^{n-k}$, $\mu = np$, $\sigma^2 = np(1-p)$
  • Poisson: $P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$, $\mu = \sigma^2 = \lambda$
  • Normal: $Z = \frac{X-\mu}{\sigma}$, $P(a < X < b) = \Phi(\frac{b-\mu}{\sigma}) - \Phi(\frac{a-\mu}{\sigma})$
  • Exponential: $f(x) = \lambda e^{-\lambda x}$, $P(X > x) = e^{-\lambda x}$

Sampling Distributions

  • $\bar{X} \sim N(\mu, \frac{\sigma^2}{n})$ approximately, for large $n$ (Central Limit Theorem)
  • Confidence interval: $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
  • Sample size: $n = \left(\frac{z_{\alpha/2}\sigma}{E}\right)^2$

Hypothesis Testing

  • $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (one sample z-test)
  • $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (one sample t-test)
  • Two proportions: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2})}}$, with pooled $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$
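
The flowchart above lists a z-test for a single proportion, and Problems 4 and 12 work with single proportions, so the corresponding formulas may be useful alongside this list. They are the standard large-sample results, stated here under the usual normal approximation rather than taken from the formulas above:

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \qquad n = \frac{z_{\alpha/2}^2\,p(1-p)}{E^2}, \qquad z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$

Here $p_0$ is the hypothesized proportion and $E$ is the margin of error; taking $p = 0.5$ in the sample size formula gives the most conservative $n$.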

Matrices

  • Eigenvalues: $\det(A - \lambda I) = 0$
  • Stationary distribution: $\pi P = \pi$, $\sum \pi_i = 1$
  • Covariance: $\Sigma = \frac{1}{n-1}X^TX$ (centered data)

R Code Snippets (Optional Practice)

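# Problem 1: Binomial and Poisson
dbinom(3, 100, 0.02)  # exact binomial P(X = 3)
dpois(3, 2)           # Poisson approximation with lambda = np = 2

# Problem 4: Confidence interval (624 = 0.52 * 1200)
# prop.test gives a continuity-corrected Wilson interval, close to but not identical to the Wald interval
prop.test(624, 1200, conf.level=0.95)

# Problem 5: Sample size for 90% power (one-sample, one-sided; delta = 18 - 15)
power.t.test(delta=3, sd=8, power=0.90, sig.level=0.05,
             type="one.sample", alternative="one.sided")

# Problem 8: Chi-square test (lambda estimated treating "4+" as exactly 4)
observed <- c(15, 28, 32, 18, 7)
lambda_hat <- sum(0:4 * observed) / sum(observed)
expected_prob <- c(dpois(0:3, lambda_hat), ppois(3, lambda_hat, lower.tail=FALSE))
chisq.test(observed, p=expected_prob)  # reports df = 4; use df = 3 because lambda was estimated

# Problem 10: Matrix powers
P <- matrix(c(0.6, 0.3, 0.1,
              0.2, 0.7, 0.1,
              0.1, 0.2, 0.7), nrow=3, byrow=TRUE)
P %*% P      # P^2
eigen(t(P))  # left eigenvectors of P; rescale the eigenvalue-1 vector to sum to 1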


#mathematics #statistics #interleaved-practice #mastery #fad1015