R Crash Course — C++ to R Translation

Target: PASUM FAD1015 (Mathematics III) final exam. You know C++ already. This maps every R concept to its C++ analogue, flags the traps, and gives exam-specific patterns. No hand-holding.


Part 1 — The Mental Model Shift

1.1 Everything Is a Vector

There are no scalars in R. A "scalar" is a vector of length 1.

| C++ | R |
|---|---|
| int x = 5; | x <- 5 (actually c(5), a length-1 vector) |
| std::vector<int> v = {1,2,3}; | v <- c(1, 2, 3) |

Trap: length(x) returns 1 for what looks like a scalar. There is no int, no double, no float at the variable level — only typeof() reveals the element type.
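A quick way to convince yourself, as a minimal sketch (the variable name x is illustrative):

```r
x <- 5
length(x)            # 1: a "scalar" is a length-1 vector
identical(x, c(5))   # TRUE: c(5) builds the same object
typeof(x)            # "double": plain numeric literals are doubles
typeof(5L)           # "integer": the L suffix requests an integer
typeof(x == 5)       # "logical"
```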

1.2 1-Based Indexing

v[0];           // C++: first element
v[v.size()-1];  // C++: last element
v[1]            # R: first element
v[length(v)]    # R: last element

Trap: v[0] is valid R: it returns an empty vector. No error, just silent emptiness. If you write C++-style loops in R, note that 0:n-1 parses as (0:n) - 1, so for (i in 0:n-1) runs over -1, 0, ..., n-1, and neither v[-1] nor v[0] raises an error. for (i in 1:n) is correct.
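The precedence issue is easy to check in the console. A small sketch, with n and v chosen only for illustration:

```r
n <- 3
0:n - 1        # -1 0 1 2   (parsed as (0:n) - 1)
0:(n - 1)      # 0 1 2      (what a C++ programmer meant)
1:n            # 1 2 3      (what R wants)

v <- c(10, 20, 30)
v[0]           # numeric(0): empty, no error
v[-1]          # 20 30: a negative index drops that position
```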

1.3 <- vs = vs ==

int x = 5;     // Assignment
if (x == 5)    // Comparison
x <- 5         # Assignment (canonical)
x = 5           # Assignment (also works at top level, but discouraged by most style guides)
x == 5          # Comparison — returns TRUE (a logical vector)

Trap: inside a function call, x = 5 names an argument and never creates a variable, whereas x <- 5 inside a call assigns x in the calling environment and passes the value positionally. Use <- for assignment and = only for named arguments: t.test(x, mu = 0).
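The difference between = and <- inside a call can be seen with a throwaway function. This is a sketch; double_it and val are hypothetical names, and the FALSE result assumes no variable called val exists yet:

```r
double_it <- function(val) val * 2    # hypothetical helper

r1 <- double_it(val = 5)              # = names the argument; nothing is assigned
created_by_eq <- exists("val", inherits = FALSE)      # FALSE

r2 <- double_it(val <- 5)             # <- assigns val in the caller, then passes 5
created_by_arrow <- exists("val", inherits = FALSE)   # TRUE
```

Both calls return 10; only the second leaves a variable behind.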

1.4 Recycling Rule

// C++: std::vector<int> a = {1,2,3,4};
//       std::vector<int> b = {1,2};
//       a + b — compilation error, mismatched sizes
c(1, 2, 3, 4) + c(1, 2)    # R: c(2, 4, 4, 6)

R recycles the shorter vector: (1, 2) is repeated to (1, 2, 1, 2). No warning is issued unless the longer length is not a multiple of the shorter.

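Trap (exam-relevant):

matrix(1:6, nrow = 3) + c(10, 20)
# No error and no warning: c(10, 20) is recycled down the columns (6 is a multiple of 2).
# Two matrices with different dimensions, e.g. matrix(1:6, nrow = 3) + matrix(1:4, nrow = 2),
# are NOT recycled; R stops with a "non-conformable arrays" error.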
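The recycling rule, including the warning case, can be checked directly. A minimal sketch (the variable names are illustrative):

```r
full <- c(1, 2, 3, 4) + c(1, 2)    # 2 4 4 6: silent, since 4 is a multiple of 2
scalar <- c(1, 2, 3) + 10          # 11 12 13: a length-1 vector recycles to any length

# 3 is not a multiple of 2: R still returns 2 4 4, but warns that the
# longer object length is not a multiple of the shorter. suppressWarnings()
# is used here only so the sketch runs quietly.
partial <- suppressWarnings(c(1, 2, 3) + c(1, 2))
```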

1.5 Vectorized Operations — No Loops

// C++: for (int i = 0; i < n; i++) result[i] = a[i] * 2;
result <- a * 2    # Entire vector multiplied at once

In R, explicit for loops are usually slower than vectorized code and are considered poor style for element-wise work. Most built-in operations are vectorized: mean(x) loops internally, so you don't write the loop.
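For comparison, here is the C++-style loop next to the vectorized form. This is only a sketch with made-up data; both produce the same vector:

```r
a <- c(3, 1, 4, 1, 5)

# Loop version, written the C++ way
result_loop <- numeric(length(a))     # preallocate, like std::vector<double>(n)
for (i in seq_along(a)) {
  result_loop[i] <- a[i] * 2
}

# Vectorized version
result_vec <- a * 2                   # 6 2 8 2 10

identical(result_loop, result_vec)    # TRUE
```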


Part 2 — Vectors & Sequences

2.1 Creating Vectors

| Operation | C++ | R |
|---|---|---|
| Concatenate | std::vector v = {1, 2, 3}; | v <- c(1, 2, 3) |
| Sequence 1..10 | std::iota(v.begin(), v.end(), 1); | 1:10 |
| Step sequence | for-loop | seq(1, 5, by = 0.5) |
| Length-specified | std::vector v(5); | seq(0, 10, length.out = 5) |
| Repeat 0 five times | std::vector v(5, 0); | rep(0, 5) |
| Repeat pattern | for-loop | rep(1:3, 4) gives 1 2 3 1 2 3 1 2 3 1 2 3 |

Exam pattern:

a <- seq(5, 160, by = 5)        # 5, 10, 15, ..., 160
b <- seq(87, 56, by = -1)       # 87, 86, 85, ..., 56
D <- a * b                      # Element-wise product (both have length 32, so no recycling)
D[19:21]                        # 19th, 20th, 21st elements
D[D < 2000]                     # Logical subsetting
length(D[D > 6000])             # Count elements > 6000

Trap: 1:10 produces an integer vector; seq(1, 5, by = 0.5) produces a double vector. For exam output prediction, know the type.
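The types and lengths in this section can be verified with typeof() and length(). A short sketch that reuses the exam pattern's a, b and D:

```r
typeof(1:10)                   # "integer"
typeof(seq(1, 5, by = 0.5))    # "double"
typeof(c(1, 2, 3))             # "double": c() of plain literals is not integer

a <- seq(5, 160, by = 5)
b <- seq(87, 56, by = -1)
c(length(a), length(b))        # 32 32
D <- a * b
D[19:21]                       # 6555 6800 7035
```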

2.2 Subsetting — The Power Tool

// C++: v.at(index);
//       std::find_if(v.begin(), v.end(), pred);
v[3]               # Third element
v[1:3]             # First three elements
v[-(1:3)]          # Everything EXCEPT first three
v[c(1, 3, 5)]      # Elements at positions 1, 3, 5
v[v > 5]           # All elements > 5 (logical subsetting)
v[which(v > 5)]    # Same thing, explicit

C++ equivalent of v[v > 5]:

std::vector<int> result;
std::copy_if(v.begin(), v.end(), std::back_inserter(result),
             [](int x) { return x > 5; });

R does this in one short expression.
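Tracing the intermediate logical vector makes subsetting questions mechanical. A sketch with an illustrative vector:

```r
v <- c(2, 9, 4, 7, 1, 8)

v > 5            # FALSE TRUE FALSE TRUE FALSE TRUE
which(v > 5)     # 2 4 6 (positions, not values)
v[v > 5]         # 9 7 8
sum(v > 5)       # 3: TRUE counts as 1, so this counts the matches
v[-(1:3)]        # 7 1 8
```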

2.3 Length

v.size();    // C++
length(v)    # R

Trap: length(matrix) returns total elements (nrow × ncol), not number of rows. Use nrow(), ncol(), dim().


Part 3 — Descriptive Statistics

3.1 Core Functions

| Statistic | C++ (manual) | R |
|---|---|---|
| Mean | std::accumulate / n | mean(x) |
| Variance | std::accumulate of (xᵢ - x̄)² / (n-1) | var(x) |
| SD | sqrt(variance) | sd(x) |
| Median | sort, then middle | median(x) |
| Sum | std::accumulate | sum(x) |
| Min | std::min_element | min(x) |
| Max | std::max_element | max(x) |
| Range | pair of min/max | range(x) |
| All at once | write a struct | summary(x) |

summary(x) output (memorise the field order):

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.01    4.00    6.50    6.53    9.00   12.00
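Note that var() and sd() use the n - 1 (sample) divisor, as the table states. A minimal check on made-up data:

```r
x <- c(4, 8, 6, 5, 7)
n <- length(x)

manual_var <- sum((x - mean(x))^2) / (n - 1)   # 2.5
all.equal(manual_var, var(x))                  # TRUE
all.equal(sqrt(manual_var), sd(x))             # TRUE
median(x)                                      # 6
range(x)                                       # 4 8
```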

3.2 Sorting

std::sort(v.begin(), v.end());
sort(x)          # Ascending (default)
sort(x, decreasing = TRUE)

Trap: sort() returns a new vector. x itself is unchanged. To keep the sorted order, reassign: x <- sort(x).


Part 4 — Matrices ⭐⭐⭐⭐ (Highest Weight)

4.1 Creation

// C++: int A[3][3] = {{1,4,7},{2,5,8},{3,6,9}};
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3)
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

THE exam trap: matrix() fills column-by-column by default. This is the opposite of C++ row-major layout.

# Row-wise fill:
B <- matrix(1:9, nrow = 3, byrow = TRUE)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

C++ mental model:

  • matrix(..., byrow = FALSE) — column-major (Fortran style). Like accessing A[j][i].
  • matrix(..., byrow = TRUE) — row-major (C++ style). Like accessing A[i][j].

4.2 rbind / cbind

# C++: vector of rows is natural. R works column-first.

# rbind — stack rows
rbind(c(1,2,3), c(4,5,6), c(7,8,9))
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

# cbind — stack columns
cbind(c(1,4,7), c(2,5,8), c(3,6,9))
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

Exam trap — mixing:

rbind(1:3, 4:6, 7:9)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

rbind(1:3, 4:6, 7:9) gives the same matrix as matrix(1:9, nrow = 3, byrow = TRUE). cbind(1:3, 4:6, 7:9) gives its transpose, which is what matrix(1:9, nrow = 3) produces. Train yourself to trace which dimension is being bound.
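The relationship between rbind, cbind and the two fill orders can be confirmed in a few lines (a sketch; r and k are illustrative names):

```r
r <- rbind(1:3, 4:6, 7:9)   # each vector becomes a ROW
k <- cbind(1:3, 4:6, 7:9)   # each vector becomes a COLUMN

all(r == matrix(1:9, nrow = 3, byrow = TRUE))   # TRUE
all(k == matrix(1:9, nrow = 3))                 # TRUE
all(k == t(r))                                  # TRUE: one is the transpose of the other
```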

4.3 The %*% Trap — CRITICAL

// C++: A * B — matrix multiplication (if you've overloaded operator*)
A * B       # Element-wise multiplication (Hadamard product) — like .* in MATLAB
A %*% B     # Actual matrix multiplication (dot product)

Worked:

A <- matrix(1:4, nrow = 2)   # [1 3; 2 4]
B <- matrix(4:1, nrow = 2)   # [4 2; 3 1]

A * B    # [1*4  3*2; 2*3  4*1] = [4  6; 6  4]
A %*% B  # [1*4+3*3  1*2+3*1; 2*4+4*3  2*2+4*1] = [13  5; 20  8]

Exam prediction question: "What is the output of A * B?" If you say matrix multiplication, you lose marks.
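The two operators also have different shape requirements: * needs identical dimensions, while %*% needs ncol(A) equal to nrow(B). A small sketch with non-square matrices (redefining A and B for illustration only):

```r
A <- matrix(1:6, nrow = 2)   # 2 x 3
B <- matrix(1:6, nrow = 3)   # 3 x 2

dim(A %*% B)                 # 2 2
dim(B %*% A)                 # 3 3
dim(A * t(B))                # 2 3: element-wise works once the shapes match
# A * B                      # error: non-conformable arrays
```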

4.4 Essential Operations

| Operation | C++ | R | Notes |
|---|---|---|---|
| Transpose | A.transpose() (Eigen) | t(A) | |
| Determinant | A.determinant() (Eigen) | det(A) | square only |
| Inverse | A.inverse() (Eigen) | solve(A) | errors if singular |
| Solve Ax = b | A.fullPivLu().solve(b) (Eigen) | solve(A, b) | |
| Identity | MatrixXd::Identity(n,n) (Eigen) | diag(n) | |
| Diagonal | loop | diag(A) | extracts diagonal |
| Eigenvalues | A.eigenvalues() (Eigen) | eigen(A)$values | |
| Row means | loop | rowMeans(A) | |
| Col means | loop | colMeans(A) | |
| Row sums | loop | rowSums(A) | |
| Col sums | loop | colSums(A) | |
| Dimensions | A.rows(), A.cols() | nrow(A), ncol(A) | |
| Total elements | A.size() | length(A) or prod(dim(A)) | |
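A few of these operations on a small, made-up system, as a sketch. The matrix and right-hand side are illustrative, and results are compared with all.equal() in practice because of floating-point rounding:

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)   # [2 1; 1 3]
b <- c(3, 5)

det(A)                   # 5
A_inv <- solve(A)        # inverse of A
round(A_inv %*% A, 10)   # 2 x 2 identity, same as diag(2)
x <- solve(A, b)         # 0.8 1.4
A %*% x                  # reproduces b as a 2 x 1 matrix
```

solve(A, b) is preferable to solve(A) %*% b when only the solution is needed, since it avoids forming the inverse explicitly.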

4.5 Subsetting

// C++: A[i][j] or A.at(i).at(j)
A[i, j]        # Element at row i, column j
A[i, ]         # Row i (returns a VECTOR)
A[, j]         # Column j (returns a VECTOR)
A[1:2, ]       # Rows 1-2 (returns a MATRIX)
A[, -2]        # All columns except column 2
A[c(1,3), c(2,3)]  # Rows 1&3, columns 2&3

Critical trap — drop = FALSE:

A[, 1]                   # Returns a VECTOR (drops dimension)
A[, 1, drop = FALSE]     # Returns a 1-column MATRIX

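A[, 1] is a plain vector with no dim attribute, so %*% has to decide for itself whether to treat it as a row or a column, and the shape of the result can differ from what you expect. drop = FALSE preserves the matrix type.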
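Checking dim() shows exactly what is dropped. A minimal sketch on an illustrative 3 x 2 matrix:

```r
A <- matrix(1:6, nrow = 3)    # 3 x 2

dim(A[, 1])                   # NULL: a plain vector
is.matrix(A[, 1])             # FALSE
dim(A[, 1, drop = FALSE])     # 3 1: still a matrix
dim(A[1, , drop = FALSE])     # 1 2
dim(A[1:2, ])                 # 2 2: more than one row, nothing to drop
```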

4.6 Cramer's Rule (Exam Standard)

Given the system $Ax = b$:

A <- matrix(c(2, -1, 1, 3), nrow = 2, byrow = TRUE)  # [2 -1; 1 3]
b <- c(7, 0)

det_A <- det(A)                  # D = 2*3 - (-1)*1 = 7
D1 <- det(cbind(b, A[, 2]))      # Replace col 1 with b
D2 <- det(cbind(A[, 1], b))      # Replace col 2 with b

x1 <- D1 / det_A
x2 <- D2 / det_A
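Cramer's rule can be cross-checked against solve(). The sketch below assumes the system 2x1 - x2 = 7, x1 + 3x2 = 0 and builds A with rbind so that each row reads like one equation:

```r
A <- rbind(c(2, -1),
           c(1,  3))
b <- c(7, 0)

x_cramer <- c(det(cbind(b, A[, 2])),      # D1: column 1 replaced by b
              det(cbind(A[, 1], b))) /    # D2: column 2 replaced by b
            det(A)
x_solve <- solve(A, b)

x_cramer                                   # 3 -1
all.equal(x_cramer, as.numeric(x_solve))   # TRUE
```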

4.7 Naming

rownames(A) <- c("Row1", "Row2", "Row3")
colnames(A) <- c("Col1", "Col2", "Col3")

R allows named indices: A["Row1", ]. This will likely appear in output-prediction questions.
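A short sketch of what named indexing returns, using an illustrative 3 x 3 matrix filled column by column:

```r
A <- matrix(1:9, nrow = 3)
rownames(A) <- c("Row1", "Row2", "Row3")
colnames(A) <- c("Col1", "Col2", "Col3")

A["Row1", ]          # named vector: Col1 = 1, Col2 = 4, Col3 = 7
A[, "Col2"]          # named vector: Row1 = 4, Row2 = 5, Row3 = 6
A["Row2", "Col3"]    # 8
```

The names carry over into the printed output, which is the detail an output-prediction question would check.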


Part 5 — Hypothesis Testing in R ⭐⭐⭐ (Critical)

5.1 One-Sample t-test

t.test(x, mu = hypothesized_mean)

Output to memorise:

	One Sample t-test

data:  x
t = -1.6848, df = 11, p-value = 0.1201
alternative hypothesis: true mean is not equal to 15
95 percent confidence interval:
 13.46244 15.20423
sample estimates:
mean of x
 14.33333

| Field | Meaning | Manual computation in R |
|---|---|---|
| t | Test statistic | (mean(x) - mu) / (sd(x) / sqrt(n)) |
| df | Degrees of freedom | n - 1 |
| p-value | Two-tailed p | 2 * (1 - pt(abs(t), df)) |
| 95 percent CI | Confidence interval | mean(x) ± qt(0.975, df) * sd(x) / sqrt(n) |
| mean of x | Sample mean | mean(x) (std::accumulate / n in C++) |
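Each field can be reproduced by hand and compared with what t.test() stores. A sketch on a hypothetical sample (the data and mu0 are made up, not the values behind the output above):

```r
x <- c(12, 15, 14, 16, 13, 15, 14, 17, 13, 14)   # hypothetical sample
mu0 <- 15
n <- length(x)

t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
df <- n - 1
p_val <- 2 * (1 - pt(abs(t_stat), df))
ci <- mean(x) + c(-1, 1) * qt(0.975, df) * sd(x) / sqrt(n)

res <- t.test(x, mu = mu0)
all.equal(unname(res$statistic), t_stat)   # TRUE
all.equal(unname(res$parameter), df)       # TRUE
all.equal(res$p.value, p_val)              # TRUE
all.equal(as.numeric(res$conf.int), ci)    # TRUE
```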

5.2 One-Tailed Tests

t.test(x, mu = 100, alternative = "less")      # H₁: μ < 100
t.test(x, mu = 100, alternative = "greater")   # H₁: μ > 100
t.test(x, mu = 100)                            # H₁: μ ≠ 100 (two-tailed, default)
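The three alternatives are linked: the two one-tailed p-values sum to 1, and the two-tailed p-value is twice the smaller one. A sketch on a hypothetical sample:

```r
x <- c(98, 102, 95, 97, 101, 96, 99, 94)   # hypothetical sample, mean 97.75

p_two     <- t.test(x, mu = 100)$p.value
p_less    <- t.test(x, mu = 100, alternative = "less")$p.value
p_greater <- t.test(x, mu = 100, alternative = "greater")$p.value

all.equal(p_less + p_greater, 1)               # TRUE
all.equal(2 * min(p_less, p_greater), p_two)   # TRUE
p_less < p_greater                             # TRUE here: the sample mean is below 100
```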

5.3 z-test (Requires BSDA Package)

library(BSDA)
z.test(x, mu = 100, sigma.x = 15)

Use when $\sigma$ is known or $n > 30$. Otherwise t-test.

Manual z-test from summary stats (exam pattern):

z_score <- (x_bar - mu) / (sigma / sqrt(n))
p_value <- 2 * (1 - pnorm(abs(z_score)))

5.4 Manual t-test from Summary Stats (Exam Pattern)

t_score <- (x_bar - mu) / (s / sqrt(n))
p_value <- 2 * (1 - pt(abs(t_score), df = n - 1))

Compare:

| Test | CDF function |
|---|---|
| z-test | pnorm(z): area left of z under N(0,1) |
| t-test | pt(t, df): area left of t under the t-distribution |
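Plugging hypothetical summary statistics into both formulas shows how the two tests differ. In this sketch every number is made up, and the same value is used for sigma and s so that only the reference distribution changes:

```r
x_bar <- 52; mu <- 50; n <- 36    # hypothetical summary statistics

sigma <- 6                                       # known population SD (z-test)
z_score <- (x_bar - mu) / (sigma / sqrt(n))      # 2
p_z <- 2 * (1 - pnorm(abs(z_score)))             # about 0.0455

s <- 6                                           # sample SD (t-test)
t_score <- (x_bar - mu) / (s / sqrt(n))          # 2
p_t <- 2 * (1 - pt(abs(t_score), df = n - 1))    # larger than p_z: heavier tails
```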

5.5 Decision Rule (Memorise Verbatim)

| Condition | Conclusion |
|---|---|
| p-value ≤ α | Reject H₀. "There is sufficient evidence at the α level to conclude that [alternative]." |
| p-value > α | Fail to reject H₀. "There is insufficient evidence at the α level to conclude that [alternative]." |

Trap: Never say "accept H₀". Always "fail to reject H₀".

5.6 Normality Check

shapiro.test(x)

  • p > 0.05 → data is normal (fail to reject normality)
  • p ≤ 0.05 → data is NOT normal

Also: qqnorm(x); qqline(x) for visual check.

5.7 Confidence Interval (Exam Connection)

t.test(x, mu = 0)$conf.int    # Extract CI from t.test output

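The CI and the two-tailed hypothesis test are dual: if the null hypothesis value ($\mu_0$) falls inside the 95% CI, you fail to reject at $\alpha = 0.05$. If it falls outside, you reject. This does not carry over to one-tailed tests, whose intervals are one-sided.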
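The duality can be checked on any sample. A sketch with hypothetical data, where inside_ci and fail_to_reject are illustrative names:

```r
x <- c(12, 15, 14, 16, 13, 15, 14, 17, 13, 14)   # hypothetical sample
mu0 <- 15

res <- t.test(x, mu = mu0)                   # two-tailed by default
ci <- res$conf.int

inside_ci <- ci[1] <= mu0 && mu0 <= ci[2]    # is mu0 inside the 95% CI?
fail_to_reject <- res$p.value > 0.05         # decision at alpha = 0.05

inside_ci == fail_to_reject                  # TRUE: the two criteria agree
```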


Part 6 — Probability Distributions (Only p Functions Needed)

Only pnorm() and pt() appear in exam-relevant manual p-value computation. That's it.

pnorm(z)         # P(Z ≤ z) for standard normal — used in manual z-test
pt(t, df)        # P(T ≤ t) for t-distribution — used in manual t-test

Exam pattern (already covered in Part 5):

p_value <- 2 * (1 - pnorm(abs(z_score)))    # Two-tailed z-test
p_value <- 2 * (1 - pt(abs(t_score), df))   # Two-tailed t-test
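For one-tailed tests the same CDF is used without doubling. A sketch with an illustrative z value:

```r
z <- -1.8                         # illustrative test statistic

p_left  <- pnorm(z)               # H1: mu < mu0, about 0.0359
p_right <- 1 - pnorm(z)           # H1: mu > mu0, about 0.9641
p_two   <- 2 * (1 - pnorm(abs(z)))   # H1: mu != mu0, about 0.0719

pnorm(1.96)                       # about 0.975, the familiar 5% two-tailed cutoff
```

The same pattern applies to pt(t, df) with the degrees of freedom supplied.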

Everything else — dbinom, qbinom, rbinom, dnorm, qnorm, rnorm, runif, set.seed — is not tested.


Part 7 — Data Input (Not Tested, One-Liner)

data <- read.table("filename.txt", header = TRUE)

If a dataset is given in the hypothesis testing question, it'll be loaded for you. Don't memorise this.


Part 8 — Exam Output Prediction (Key Skill)

The exam will show R code and ask "what is the output?" Train yourself to trace:

Template 1 — Matrix creation:

matrix(1:9, nrow = 3)
#      [,1] [,2] [,3]
# [1,]    1    4    7
# [2,]    2    5    8
# [3,]    3    6    9

(Column-major! Not row-major!)

Template 2 — %*% vs *:

A <- matrix(1:4, nrow=2)
B <- matrix(4:1, nrow=2)
A * B        # Element-wise
A %*% B      # Matrix multiplication

Template 3 — t.test output fields:

t.test(data, mu = 100)

Know what each field means: t, df, p-value, 95 percent confidence interval, mean of x.

Template 4 — shapiro.test:

shapiro.test(data)
# p > 0.05 → "data is normal"

Quick Reference — One-Liners

| You want... | In C++ you'd write... | In R you write... |
|---|---|---|
| A vector of 1..10 | std::iota(v.begin(), v.end(), 1) | 1:10 |
| Arithmetic mean | accumulate/n | mean(x) |
| Sample variance | accumulate((xᵢ-x̄)²)/(n-1) | var(x) |
| Matrix multiply | nested loops or Eigen | A %*% B |
| Transpose | A.transpose() (Eigen) | t(A) |
| Determinant | A.determinant() | det(A) |
| Inverse | A.inverse() | solve(A) |
| Solve Ax = b | A.fullPivLu().solve(b) | solve(A, b) |
| 1-sample t-test | manual computation | t.test(x, mu=0) |
| p-value from z | 2 * (1 - normalCDF(abs(z))) | 2 * (1 - pnorm(abs(z))) |
| p-value from t | 2 * (1 - tCDF(abs(t), df)) | 2 * (1 - pt(abs(t), df)) |
| Normal CDF | 0.5 * (1 + erf(z/√2)) | pnorm(z) |
| t-dist CDF | numerical integration | pt(t, df) |
| Summary stats | write a struct | summary(x) |
| Sort vector | std::sort | sort(x) |

Exam Day Checklist

  • [ ] matrix() fills column-major by default — or use byrow = TRUE
  • [ ] * is element-wise, %*% is matrix multiplication
  • [ ] A[, 1] returns a vector — use drop = FALSE to keep matrix
  • [ ] t.test() fields: t, df, p-value, 95 percent confidence interval, mean of x
  • [ ] p ≤ α → reject. p > α → fail to reject (never "accept")
  • [ ] shapiro.test() p > 0.05 → normal
  • [ ] solve(A) for inverse, solve(A, b) for systems
  • [ ] 1:n for loops, never 0:n-1
  • [ ] summary() field order: Min, 1st Qu., Median, Mean, 3rd Qu., Max
  • [ ] pnorm() and pt() only — the rest of d/p/q/r is not tested

Related Resources