FAD1015: MATHEMATICS III — Interleaved Mastery Problem Set
4-Day Intensive Study Plan
Topics: Probability, Probability Distributions, Hypothesis Testing, Matrices, Statistical Inference
How to Use This Set
Each problem deliberately combines 2-4 statistical topics in real-world data analysis scenarios. This mirrors exam conditions where you must identify appropriate distributions, perform calculations, and interpret results.
Study Schedule:
- Day 1: Problems 1-3 (Probability foundations & distributions)
- Day 2: Problems 4-6 (Sampling & estimation)
- Day 3: Problems 7-9 (Hypothesis testing applications)
- Day 4: Problems 10-12 (Matrices & multivariate analysis)
Scenario-to-Method Quick Reference
```mermaid
graph LR
A[Problem Scenario] --> B{Counting defectives<br/>in n trials?}
B -->|Yes| C[Binomial]
A --> D{Rare events in<br/>time or space?}
D -->|Yes| E[Poisson]
A --> F{Sample mean vs<br/>known value?}
F -->|Yes| G[Z-test or t-test]
A --> H{Comparing two<br/>groups?}
H -->|Yes| I[Two-sample test]
A --> J{Observed vs expected<br/>frequencies?}
J -->|Yes| K[Chi-square]
A --> L{Updating beliefs<br/>with evidence?}
L -->|Yes| M[Bayes' Theorem]
A --> N{State transitions<br/>over time?}
N -->|Yes| O[Markov Chains]
```
The Mastery Problems
Problem 1: Quality Control in Manufacturing [Binomial + Poisson + Normal Approximation]
A factory produces electronic components with a 2% defect rate. Components are packed in boxes of 100.
(a) Calculate the probability that a box contains exactly 3 defective components using the binomial distribution. Then approximate using the Poisson distribution and compare results. [Binomial & Poisson]
(b) For large orders (n = 500), use normal approximation to the binomial to find P(more than 15 defectives). State and justify the continuity correction. [Normal approximation]
(c) If boxes with more than 5 defectives are rejected, calculate the rejection rate and the expected number of boxes rejected in a shipment of 1000 boxes. [Expected value]
Problem 2: Network Security Analysis [Conditional Probability + Bayes' Theorem + Counting]
A company uses two firewalls: Firewall A blocks 90% of attacks but has a 5% false positive rate; Firewall B blocks 85% of attacks with a 3% false positive rate. 1% of traffic is actually malicious.
(a) Calculate the probability that a connection flagged by Firewall A is actually malicious using Bayes' theorem. Repeat for Firewall B. [Bayes' theorem]
(b) If both firewalls must flag a connection before it's blocked, calculate the new detection rate and false positive rate, assuming the two firewalls act independently. [Conditional probability]
(c) An attacker tries 10 different exploits. If each succeeds independently with probability equal to the probability that an attack is not blocked under the rule in (b), calculate the probability that at least one succeeds. [Counting rules]
Problem 3: Hospital Emergency Room Modeling [Exponential + Poisson + Queuing Theory]
Patients arrive at an ER following a Poisson process with mean 4 patients per hour. Treatment time follows an exponential distribution with mean 30 minutes.
(a) Calculate the probability of no arrivals in a 15-minute period and the probability that the time until the next arrival exceeds 20 minutes. [Poisson & Exponential]
(b) What is the probability that more than 3 patients arrive while the doctor is treating one patient? [Poisson process]
(c) If the ER has 3 doctors, calculate the probability that all are busy when a patient arrives. [Basic queuing]
Problem 4: Election Polling & Inference [Sampling + Confidence Intervals + Normal]
A poll of 1200 voters shows 52% support for Candidate A. Assume simple random sampling.
(a) Calculate the 95% confidence interval for the true proportion. Interpret in context. [Confidence intervals]
(b) How large a sample is needed to reduce the margin of error to ±2% at 95% confidence? [Sample size determination]
(c) If Candidate A actually has 50% support, what is the probability that a poll of 1200 voters shows 52% or more? Calculate the p-value for observing 52%. [Hypothesis testing concept]
Problem 5: Pharmaceutical Trial [Hypothesis Testing + Type I/II Errors + Power]
A new drug claims to reduce blood pressure by more than 15 mmHg on average. In a trial of n=64 patients, the sample mean reduction is 17 mmHg with standard deviation 8 mmHg.
(a) Test the claim at α = 0.05 significance level. State null and alternative hypotheses, calculate the test statistic, and draw a conclusion. [Hypothesis testing - one sample]
(b) Calculate the probability of Type II error (β) if the true mean reduction is 18 mmHg. What is the power of the test? [Power analysis]
(c) How large a sample is needed to achieve 90% power for detecting a true mean of 18 mmHg at α = 0.05? [Sample size for power]
Problem 6: Portfolio Risk Analysis [Normal + Binomial + Expected Value]
An investment portfolio has daily returns normally distributed with mean 0.05% and standard deviation 1.2%.
(a) Calculate the Value at Risk (VaR) at 95% confidence for a $1,000,000 portfolio over 1 day. What does this mean? [Normal distribution - percentiles]
(b) Calculate the probability of losing money on any given day and the expected number of losing days in a 252-day trading year. [Binomial]
(c) If the correlation between this portfolio and the market (μ = 0.04%, σ = 1.0%) is 0.6, calculate the covariance and construct the variance-covariance matrix for a two-asset portfolio. [Matrices & statistics]
Problem 7: A/B Testing in E-commerce [Two-Sample Testing + Confidence Intervals + Proportions]
An e-commerce site tests two webpage designs. Design A: 450 conversions out of 5000 visitors. Design B: 520 conversions out of 5500 visitors.
(a) Test whether the conversion rates differ significantly at α = 0.05. Calculate the test statistic and p-value. [Two-proportion z-test]
(b) Construct a 95% confidence interval for the difference in conversion rates. Does it include zero? [CI for difference]
(c) If the true conversion rate for A is 9% and B is 9.5%, what is the probability of detecting this difference with the current sample sizes at α = 0.05? [Power calculation]
Problem 8: Environmental Monitoring [Chi-Square + Poisson + Goodness of Fit]
Daily air quality measurements count pollutant particles per cubic meter. Observed frequencies over 100 days: 0 particles (15 days), 1 particle (28 days), 2 particles (32 days), 3 particles (18 days), 4+ particles (7 days).
(a) Test whether the data follows a Poisson distribution at α = 0.05. Estimate λ from the data and calculate expected frequencies. [Chi-square goodness of fit]
(b) Calculate the mean and variance of the observed data. For a true Poisson distribution, what should be the relationship? What does your result suggest? [Poisson properties]
(c) If standards require P(X > 3) < 0.10, does this location meet standards? Calculate the exact probability using your fitted Poisson model. [Probability calculation]
Problem 9: Machine Learning Classification [Bayes' Theorem + Normal + Matrix Operations]
A classifier distinguishes between spam (S) and ham (H) emails. P(S) = 0.3. The word "offer" appears in 60% of spam and 10% of ham emails. Email length (words) for spam: N(150, 50²), for ham: N(100, 30²).
(a) An email contains "offer" and has 120 words. Use Naive Bayes to calculate P(Spam | "offer", length=120). Assume independence. [Bayes' theorem]
(b) Calculate the likelihood ratio and determine the classification threshold that minimizes the expected misclassification cost when false negatives cost twice as much as false positives. [Decision theory]
(c) Represent the feature statistics as a 2×2 matrix (classes × features) and calculate the determinant. What does this tell you about feature independence? [Matrix operations]
Problem 10: Markov Chain Customer Analysis [Matrices + Probability + Stationary Distribution]
A store's customers follow a Markov chain: Weekly states are Shop (S), Online (O), or Inactive (I). Transition matrix P (rows from, columns to): $$P = \begin{bmatrix} 0.6 & 0.3 & 0.1 \\ 0.2 & 0.7 & 0.1 \\ 0.1 & 0.2 & 0.7 \end{bmatrix}$$
(a) If a customer shops this week, what is the probability they shop again in 2 weeks? Calculate P². [Matrix multiplication]
(b) Find the stationary distribution π satisfying πP = π. What percentage of customers shop in the long run? [Eigenvectors & stationary distribution]
(c) If there are currently 1000 shoppers, 2000 online customers, and 7000 inactive, predict the distribution after 4 weeks using matrix powers. [Matrix applications]
Problem 11: Principal Component Analysis [Matrices + Eigenvalues + Statistics]
A dataset has covariance matrix: $$\Sigma = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$$
(a) Find the eigenvalues and eigenvectors of Σ. Interpret them in terms of variance explained. [Eigenvalue decomposition]
(b) Calculate the proportion of total variance explained by the first principal component. [PCA interpretation]
(c) If the correlation matrix (instead of covariance) is used, how would the eigenvalues change? Calculate the correlation matrix and find its trace. [Correlation vs covariance]
Problem 12: Clinical Decision Support System [Full Integration — All Topics]
A diagnostic test for a disease has: Sensitivity = 95% (P(T+|D+)), Specificity = 90% (P(T-|D-)), Disease prevalence = 2%.
(a) Calculate PPV (positive predictive value) and NPV (negative predictive value) using Bayes' theorem. Interpret for a patient. [Bayes' theorem]
(b) In a study of 500 patients, 15 had the disease. Test results showed 28 positive tests. Perform a hypothesis test at α = 0.05 to determine if the observed PPV differs significantly from the theoretical value. [Hypothesis testing for proportions]
(c) The test scores follow N(80, 10²) for diseased and N(70, 15²) for healthy patients. Calculate the optimal cutoff score that maximizes the sum of sensitivity and specificity. [Normal distributions - optimization]
(d) Represent the confusion matrix (predicted vs actual) as a matrix and calculate its determinant. What does det = 0 imply about the test's discriminatory power? [Matrix interpretation]
Summary of Topics Combined
| Problem | Topics | Context |
|---|---|---|
| 1 | Binomial + Poisson + Normal | Quality control |
| 2 | Bayes' + Conditional + Counting | Network security |
| 3 | Poisson + Exponential + Queuing | Hospital ER |
| 4 | Sampling + CI + Normal | Election polling |
| 5 | Hypothesis testing + Power | Clinical trial |
| 6 | Normal + Binomial + Matrices | Portfolio risk |
| 7 | Two-sample + CI | A/B testing |
| 8 | Chi-square + Poisson | Environmental |
| 9 | Bayes' + Normal + Matrices | ML classification |
| 10 | Markov chains + Matrices | Customer analysis |
| 11 | PCA + Eigenvalues | Dimensionality reduction |
| 12 | All topics integrated | Clinical decision |
Hypothesis Testing Flowchart
```mermaid
graph TD
A[Hypothesis Test] --> B{Parameter?}
B -->|Proportion| C{One or Two Proportions?}
B -->|Mean| D{Sigma known?}
D -->|Yes| E[Z-test]
D -->|No| F[t-test]
C -->|One| G[Z-test for proportion]
C -->|Two| H[Two-proportion Z-test]
E --> I[State H0 and H1<br/>Calculate test statistic<br/>Find p-value<br/>Draw conclusion]
F --> I
G --> I
H --> I
```
Key Formulas Reference
Probability
- $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$ (Bayes' theorem)
- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- Independence: $P(A \cap B) = P(A)P(B)$
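As a worked instance of Bayes' theorem, the sketch below plugs in the Problem 12 figures (sensitivity 95%, specificity 90%, prevalence 2%). The function name `predictive_values` and its argument names are illustrative, not part of the course material.

```r
# Bayes' theorem for a binary diagnostic test.
# Returns P(D+ | T+) (PPV) and P(D- | T-) (NPV).
predictive_values <- function(sens, spec, prev) {
  # P(T+) by the law of total probability
  p_pos <- sens * prev + (1 - spec) * (1 - prev)
  ppv <- sens * prev / p_pos                 # P(D+ | T+)
  npv <- spec * (1 - prev) / (1 - p_pos)     # P(D- | T-)
  c(ppv = ppv, npv = npv)
}

# Problem 12 inputs
pv <- predictive_values(sens = 0.95, spec = 0.90, prev = 0.02)
print(round(pv, 4))
```

With a 2% prevalence the PPV comes out at about 16%, which shows how strongly the prior P(A) drives the posterior even for an accurate test.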
Counting
- Permutations: $P(n,r) = \frac{n!}{(n-r)!}$
- Combinations: $C(n,r) = \binom{n}{r} = \frac{n!}{r!(n-r)!}$
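The two counting formulas can be checked numerically. This is a minimal sketch with arbitrary example values n = 10 and r = 3 (not taken from any problem); it also confirms that every combination corresponds to r! orderings.

```r
# Permutations and combinations for example values n = 10, r = 3
n <- 10
r <- 3
perms <- factorial(n) / factorial(n - r)   # P(n, r): ordered selections
combs <- choose(n, r)                      # C(n, r): unordered selections

# Each unordered selection can be arranged in r! ways
stopifnot(isTRUE(all.equal(perms, combs * factorial(r))))

cat("P(10,3) =", perms, " C(10,3) =", combs, "\n")
```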
Distributions
- Binomial: $P(X=k) = \binom{n}{k}p^k(1-p)^{n-k}$, $\mu = np$, $\sigma^2 = np(1-p)$
- Poisson: $P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$, $\mu = \sigma^2 = \lambda$
- Normal: $Z = \frac{X-\mu}{\sigma}$, $P(a < X < b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$
- Exponential: $f(x) = \lambda e^{-\lambda x}$, $P(X > x) = e^{-\lambda x}$
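The distribution formulas map directly onto R's `d*`/`p*` functions. The sketch below applies them to the numbers in Problems 1 and 3. It assumes the second half of Problem 3(a) refers to the gap until the next arrival, and the variable names are illustrative.

```r
# Problem 1(a): exact binomial vs Poisson approximation with lambda = n * p
n <- 100
p <- 0.02
binom_exact <- dbinom(3, size = n, prob = p)
pois_approx <- dpois(3, lambda = n * p)

# Problem 1(b): normal approximation to P(X > 15) for n = 500.
# Continuity correction: P(X > 15) = P(X >= 16), approximated by P(Y > 15.5)
n_big <- 500
mu <- n_big * p
sigma <- sqrt(n_big * p * (1 - p))
normal_approx <- pnorm(15.5, mean = mu, sd = sigma, lower.tail = FALSE)
binom_tail <- pbinom(15, size = n_big, prob = p, lower.tail = FALSE)  # exact value for comparison

# Problem 3(a): Poisson arrivals at 4 per hour
p_no_arrivals <- dpois(0, lambda = 4 * 15 / 60)                # no arrivals in 15 minutes
p_gap_over_20 <- pexp(20 / 60, rate = 4, lower.tail = FALSE)   # next arrival more than 20 minutes away

print(round(c(binom_exact = binom_exact, pois_approx = pois_approx,
              normal_approx = normal_approx, binom_tail = binom_tail,
              p_no_arrivals = p_no_arrivals, p_gap_over_20 = p_gap_over_20), 4))
```

Comparing `normal_approx` with `binom_tail` gives a feel for how rough the normal approximation is when np is only 10.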
Sampling Distributions
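- $\bar{X} \sim N(\mu, \frac{\sigma^2}{n})$ approximately, for large $n$ (Central Limit Theorem)
- Confidence interval for a mean ($\sigma$ known): $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
- Confidence interval for a proportion: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- Sample size for margin of error $E$: $n = \left(\frac{z_{\alpha/2}\sigma}{E}\right)^2$ for a mean, $n = \frac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{E^2}$ for a proportion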
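For a proportion, the interval and sample-size formulas take the same form with $\sigma$ replaced by $\sqrt{\hat{p}(1-\hat{p})}$. The sketch below applies that substitution to the Problem 4 poll (52% of 1200 voters). The planning value p = 0.5 in the last line is a common conservative choice, included here as an assumption rather than something the problem specifies.

```r
# Problem 4(a): 95% Wald confidence interval for a proportion
p_hat <- 0.52
n <- 1200
z <- qnorm(0.975)                          # z_{alpha/2} for 95% confidence
se <- sqrt(p_hat * (1 - p_hat) / n)        # standard error of p_hat
ci <- p_hat + c(-1, 1) * z * se

# Problem 4(b): sample size for a margin of error E = 0.02
E <- 0.02
n_needed <- ceiling(z^2 * p_hat * (1 - p_hat) / E^2)   # using the observed 52%
n_conservative <- ceiling(z^2 * 0.25 / E^2)            # worst case p = 0.5 (assumption)

print(round(ci, 4))
cat("n (p = 0.52):", n_needed, " n (p = 0.5):", n_conservative, "\n")
```

The interval runs from roughly 49.2% to 54.8%, so it includes 50%, and the required sample size comes out near 2400 either way.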
Hypothesis Testing
- $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (one sample z-test)
- $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (one sample t-test)
- Two proportions: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(\frac{1}{n_1}+\frac{1}{n_2})}}$, where $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$ is the pooled proportion
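The test statistics above can be evaluated by hand in R. This sketch uses the Problem 5 trial data (one-sample, right-tailed) and the Problem 7 A/B counts (two-proportion, two-sided). Because Problem 5 gives a sample standard deviation, the p-value is shown against both the t and the normal reference distribution. Variable names are illustrative.

```r
# Problem 5(a): H0: mu = 15 vs H1: mu > 15
x_bar <- 17
mu0 <- 15
s <- 8
n <- 64
t_stat <- (x_bar - mu0) / (s / sqrt(n))
p_t <- pt(t_stat, df = n - 1, lower.tail = FALSE)   # t reference, sigma unknown
p_z <- pnorm(t_stat, lower.tail = FALSE)            # large-sample normal reference

# Problem 7(a): H0: p1 = p2 vs H1: p1 != p2
x1 <- 450; n1 <- 5000
x2 <- 520; n2 <- 5500
p1 <- x1 / n1
p2 <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)                     # pooled proportion under H0
z_stat <- (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_two_sided <- 2 * pnorm(abs(z_stat), lower.tail = FALSE)

print(round(c(t_stat = t_stat, p_t = p_t, p_z = p_z,
              z_stat = z_stat, p_two_sided = p_two_sided), 4))
```

The one-sample statistic equals 2, which is significant at α = 0.05 under either reference distribution, while the A/B statistic of about -0.80 is not.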
Matrices
- Eigenvalues: $\det(A - \lambda I) = 0$
- Stationary distribution: $\pi P = \pi$, $\sum \pi_i = 1$
- Covariance: $\Sigma = \frac{1}{n-1}X^TX$ (centered data)
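The matrix formulas can be tried on the Problem 10 transition matrix and the Problem 11 covariance matrix. In this sketch the stationary distribution is found by solving the linear system $\pi(P - I) = 0$ with one equation replaced by $\sum \pi_i = 1$. That is one of several possible approaches, and the variable names are illustrative.

```r
# Problem 10: transition matrix (rows = from, columns = to)
states <- c("S", "O", "I")
P <- matrix(c(0.6, 0.3, 0.1,
              0.2, 0.7, 0.1,
              0.1, 0.2, 0.7), nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

P2 <- P %*% P                       # two-step transition probabilities

# Stationary distribution: pi P = pi  <=>  (t(P) - I) pi = 0.
# Replace the last equation with sum(pi) = 1 to get a unique solution.
A <- t(P) - diag(3)
A[3, ] <- 1
pi_stat <- solve(A, c(0, 0, 1))
names(pi_stat) <- states

# Distribution after 4 weeks, starting from the counts in part (c)
x0 <- c(1000, 2000, 7000)
x4 <- x0 %*% P2 %*% P2              # x0 P^4

# Problem 11: eigenvalues of the covariance matrix
Sigma <- matrix(c(4, 2, 2, 3), nrow = 2)
eig <- eigen(Sigma)
var_explained <- eig$values[1] / sum(eig$values)   # share of variance on PC1

print(round(P2, 3))
print(round(pi_stat, 4))
print(round(x4))
print(round(c(eig$values, var_explained = var_explained), 4))
```

The eigenvalues of Σ are (7 ± √17)/2, so the first principal component carries roughly 79% of the total variance.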
R Code Snippets (Optional Practice)
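```r
# Problem 1: Binomial vs Poisson approximation, P(X = 3)
dbinom(3, size = 100, prob = 0.02)
dpois(3, lambda = 2)                      # lambda = np = 100 * 0.02
# Problem 4: Confidence interval (624 = 0.52 * 1200 supporters)
# prop.test() returns a Wilson score interval, so it differs slightly from the hand-computed one
prop.test(624, 1200, conf.level = 0.95)
# Problem 5(c): Sample size for 90% power, one-sample one-sided test
power.t.test(delta = 3, sd = 8, power = 0.90, sig.level = 0.05,
             type = "one.sample", alternative = "one.sided")
# Problem 8: Chi-square goodness of fit against a fitted Poisson
observed <- c(15, 28, 32, 18, 7)          # days with 0, 1, 2, 3, 4+ particles
lambda <- sum(0:4 * observed) / sum(observed)   # treats the 4+ class as exactly 4
expected_prob <- c(dpois(0:3, lambda), ppois(3, lambda, lower.tail = FALSE))
chisq.test(observed, p = expected_prob)   # reports df = 4; use df = 3 because lambda was estimated
# Problem 10: Matrix powers
P <- matrix(c(0.6, 0.3, 0.1,
              0.2, 0.7, 0.1,
              0.1, 0.2, 0.7), nrow = 3, byrow = TRUE)
P %*% P                                   # P^2, two-step transition probabilities
eigen(t(P))                               # eigenvector for eigenvalue 1 is proportional to the stationary distribution
```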
Related Resources
- Probability Distributions
- Binomial Distribution
- Poisson Distribution
- Normal Distribution
- Hypothesis Testing
- Matrices
- Bayes' Theorem
- Confidence Intervals
- Sampling Methods
#mathematics #statistics #interleaved-practice #mastery #fad1015