Linear Algebra

Vector and matrix math — the foundation of ML, physics, graphics, and pretty much anything quantitative.

Vectors

A vector is just a 1D array.

import numpy as np

v = np.array([3, 4, 0])

print("length (L2 norm):", np.linalg.norm(v))      # √(9+16+0) = 5
print("L1 norm         :", np.linalg.norm(v, 1))    # 3+4+0 = 7
print("Sum of squares  :", (v ** 2).sum())          # 25

Dot product — np.dot() or @

The dot product of two vectors: a·b = Σ aᵢ bᵢ. Returns a single number.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.dot(a, b))      # 1*4 + 2*5 + 3*6 = 32
print(a @ b)              # 32 — `@` is shorthand

Cosine similarity — how aligned are two vectors?

cos(θ) = (a·b) / (|a| |b|). Result in [-1, 1]. 1 = same direction, 0 = perpendicular, -1 = opposite.

import numpy as np

def cosine_similarity(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1, 0, 0])
b = np.array([1, 0, 0])
c = np.array([0, 1, 0])
d = np.array([-1, 0, 0])

print("a vs b (same)        :", cosine_similarity(a, b))
print("a vs c (perpendicular):", cosine_similarity(a, c))
print("a vs d (opposite)    :", cosine_similarity(a, d))

Cosine similarity powers semantic search, recommendation systems, and comparisons between NLP embeddings.
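As a rough sketch of how a similarity search might use this, the snippet below ranks a few documents against a query by cosine similarity. The document names and the 3-dimensional vectors are made up for illustration; real embeddings typically come from a model and have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical toy "embeddings" (illustrative values, not from a real model)
docs = {
    "doc_a": np.array([0.9, 0.1, 0.0]),
    "doc_b": np.array([0.0, 1.0, 0.2]),
    "doc_c": np.array([0.7, 0.6, 0.1]),
}
query = np.array([1.0, 0.2, 0.0])

# Score every document against the query, then sort best-first
scores = {name: cosine_similarity(query, vec) for name, vec in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f}")
```

Because cosine similarity ignores vector length, a long document and a short one pointing in the same direction score the same.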

Matrix multiplication — @ (NOT *!)

* is element-wise. @ is real matrix multiplication.

import numpy as np

A = np.array([
    [1, 2],
    [3, 4],
])
B = np.array([
    [5, 6],
    [7, 8],
])

print("Element-wise (A * B):")
print(A * B)

print("\nMatrix product (A @ B):")
print(A @ B)

A @ B is equivalent to np.matmul(A, B).

Multiplying an (m, k) matrix by a (k, n) matrix gives an (m, n) result. The "inner" dimension k must match.

import numpy as np

A = np.zeros((3, 4))    # 3 rows, 4 cols
B = np.zeros((4, 5))    # 4 rows, 5 cols

C = A @ B
print("A @ B shape:", C.shape)   # (3, 5)
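To see the rule fail, here is a small sketch with mismatched inner dimensions. The shapes are arbitrary; the point is that NumPy raises ValueError, and that a transpose is one common way to make the shapes line up.

```python
import numpy as np

A = np.zeros((3, 4))
B = np.zeros((5, 4))    # inner dimensions would be 4 and 5: no match

try:
    A @ B
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("shape error:", err)

# B.T has shape (4, 5), so the inner dimensions now agree
C = A @ B.T
print("A @ B.T shape:", C.shape)   # (3, 5)
```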

Identity matrix and inverse

import numpy as np

A = np.array([
    [4, 7],
    [2, 6],
], dtype=float)

# Identity matrix
I = np.eye(2)
print("Identity:")
print(I)

# Inverse — A @ A_inv should equal I
A_inv = np.linalg.inv(A)
print("\nInverse:")
print(A_inv)

print("\nA @ A_inv:")
print(np.round(A @ A_inv, decimals=3))    # ≈ identity

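A matrix is singular (has no inverse) when its rows or columns are linearly dependent. np.linalg.inv raises LinAlgError in that case. For a matrix that is only nearly singular, inv may instead return an inverse with very large, inaccurate entries.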
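A minimal sketch of the singular case, using a made-up matrix whose second row is exactly twice the first:

```python
import numpy as np

# Rows are linearly dependent: row 2 = 2 * row 1
S = np.array([
    [1, 2],
    [2, 4],
], dtype=float)

try:
    np.linalg.inv(S)
    inverted = True
except np.linalg.LinAlgError as err:
    inverted = False
    print("cannot invert:", err)

# The rank counts the independent rows/columns; full rank here would be 2
rank = np.linalg.matrix_rank(S)
print("rank:", rank)   # 1
```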

Solving linear systems — np.linalg.solve()

Given Ax = b, find x. Don't use inv: solve is faster and more accurate.

import numpy as np

# 3x + 2y = 7
# 4x - y  = 1

A = np.array([
    [3, 2],
    [4, -1],
])
b = np.array([7, 1])

x = np.linalg.solve(A, b)
print("x, y =", x)        # ≈ [0.818, 2.273], i.e. x = 9/11, y = 25/11

# Verify
print("A @ x =", A @ x)    # should match b
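np.linalg.solve also accepts a 2D right-hand side, which solves several systems that share the same A in one call. The sketch below uses an arbitrary second right-hand side, chosen only for illustration.

```python
import numpy as np

A = np.array([
    [3, 2],
    [4, -1],
], dtype=float)

# Each COLUMN of B is a separate right-hand side
B = np.array([
    [7, 5],
    [1, 3],
], dtype=float)

X = np.linalg.solve(A, B)     # column j of X solves A @ x = B[:, j]
print(X)

# Verify both systems at once
print(np.allclose(A @ X, B))  # True
```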

Determinant

import numpy as np

A = np.array([
    [4, 7],
    [2, 6],
])
print("det:", np.linalg.det(A))    # 4*6 - 7*2 = 10

Zero determinant → singular matrix → no inverse.
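In floating point, a singular matrix often has a computed determinant that is tiny rather than exactly zero, so a tolerance check is safer than comparing with == 0. The sketch below uses a 3×3 matrix whose third row equals 2 × (second row) minus the first row. It also prints the condition number via np.linalg.cond, which is usually considered a more reliable indicator of near-singularity than the determinant.

```python
import numpy as np

# Row 3 = 2 * row 2 - row 1, so this matrix is singular
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
], dtype=float)

d = np.linalg.det(A)
print("det:", d)                                   # tiny or zero, depending on rounding
print("effectively zero:", np.isclose(d, 0.0))     # True

# A huge condition number signals a (nearly) singular matrix
print("condition number:", np.linalg.cond(A))
```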

Eigenvalues & eigenvectors

import numpy as np

A = np.array([
    [4, -2],
    [1,  1],
])

eigenvalues, eigenvectors = np.linalg.eig(A)
print("eigenvalues:", eigenvalues)
print("eigenvectors:")
print(eigenvectors)

Used in PCA, vibration analysis, Google's PageRank, and many other places.
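A quick way to check the result is the defining equation A v = λ v. One detail that is easy to miss: np.linalg.eig returns the eigenvectors as the columns of the second array, not the rows. A small sketch:

```python
import numpy as np

A = np.array([
    [4, -2],
    [1,  1],
], dtype=float)

eigenvalues, eigenvectors = np.linalg.eig(A)

for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]          # i-th eigenvector is the i-th COLUMN
    # A @ v should equal lam * v up to rounding
    print(lam, np.allclose(A @ v, lam * v))
```

For this matrix the characteristic polynomial is λ² − 5λ + 6, so the eigenvalues are 2 and 3 (the order in which NumPy returns them is not guaranteed).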

Singular Value Decomposition (SVD) — universal matrix factorization

import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
], dtype=float)

U, S, Vt = np.linalg.svd(A)
print("U shape:", U.shape)
print("S (singular values):", S)
print("Vt shape:", Vt.shape)

# Reconstruct
reconstructed = U @ np.diag(S) @ Vt
print("\nReconstructed (should match A):")
print(np.round(reconstructed, 3))

SVD is the workhorse behind PCA, recommendation systems, and image compression.
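Compression and PCA rely on keeping only the largest singular values. The sketch below builds a rank-k approximation; low_rank is a hypothetical helper name, not a NumPy function. For the spectral (2-)norm, the error of the best rank-k approximation equals the first discarded singular value.

```python
import numpy as np

def low_rank(A, k):
    """Rank-k approximation of A from its top k singular triplets (illustrative helper)."""
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
], dtype=float)

S = np.linalg.svd(A, compute_uv=False)   # singular values only
A_1 = low_rank(A, 1)
A_2 = low_rank(A, 2)

# Spectral-norm error of the rank-1 approximation vs. second singular value
print(np.linalg.norm(A - A_1, 2), S[1])

# This particular A has rank 2, so two components already reproduce it
print(np.allclose(A_2, A))   # True
```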

Transpose

import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
])
print("Original shape :", A.shape)
print("Transposed shape:", A.T.shape)
print(A.T)

A.T is also np.transpose(A).

Matrix power

import numpy as np

A = np.array([
    [1, 1],
    [1, 0],
])

print("A^5:")
print(np.linalg.matrix_power(A, 5))

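This is the Fibonacci matrix: A^n = [[F(n+1), F(n)], [F(n), F(n-1)]], so the n-th Fibonacci number is the off-diagonal entry of A^n. For example, A^5 = [[8, 5], [5, 3]] and F(5) = 5.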
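A short sketch of reading Fibonacci numbers out of powers of the matrix [[1, 1], [1, 0]]. The fib helper is a name introduced here for illustration; it takes the off-diagonal entry of the n-th power.

```python
import numpy as np

A = np.array([
    [1, 1],
    [1, 0],
])

def fib(n):
    """n-th Fibonacci number (F(0) = 0, F(1) = 1) via the matrix power."""
    return int(np.linalg.matrix_power(A, n)[0, 1])

print([fib(n) for n in range(1, 11)])   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

NumPy integers are fixed-width, so this overflows silently once the entries exceed the integer range (around n = 90 for 64-bit integers). For large n, plain Python integers are the safer choice.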

Useful applications

Linear regression with the normal equation

β = (XᵀX)⁻¹ Xᵀy — solves regression with one matrix expression.

import numpy as np

# Fake data: y ≈ 3x + 2
X = np.array([
    [1, 1],
    [1, 2],
    [1, 3],
    [1, 4],
    [1, 5],
])                                    # first column = bias (ones)
y = np.array([5, 8, 11, 14, 17])

# β = (XᵀX)⁻¹ Xᵀy
beta = np.linalg.inv(X.T @ X) @ X.T @ y
print("β (intercept, slope):", beta)   # ≈ [2, 3]

(In practice, use np.linalg.lstsq — it's more numerically stable.)
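For comparison, a sketch of the same fit with np.linalg.lstsq, which avoids forming and inverting XᵀX. It returns four values; only the first (the coefficients) is needed here.

```python
import numpy as np

X = np.array([
    [1, 1],
    [1, 2],
    [1, 3],
    [1, 4],
    [1, 5],
])                                    # first column = bias (ones)
y = np.array([5, 8, 11, 14, 17])

# rcond=None selects the current default cutoff for small singular values
beta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print("β (intercept, slope):", beta)   # ≈ [2, 3]
print("rank of X:", rank)              # 2
```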

Polynomial fitting

import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([1, 1.5, 3.5, 7.5, 13.5, 21.5])

# Fit a degree-2 polynomial: y = a x² + b x + c
coeffs = np.polyfit(x, y, deg=2)
print("coeffs (highest power first):", coeffs)

# Use it to predict
print("predicted at x=6:", np.polyval(coeffs, 6))
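Polynomial fitting is linear least squares in disguise: the unknowns a, b, c enter linearly even though x is squared. As a sketch of that connection, the snippet below builds the design matrix with np.vander and solves it with np.linalg.lstsq, then compares against np.polyfit.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([1, 1.5, 3.5, 7.5, 13.5, 21.5])

# Design matrix with columns x², x, 1 (highest power first, like polyfit)
V = np.vander(x, 3)

coeffs_lstsq, *_ = np.linalg.lstsq(V, y, rcond=None)
coeffs_polyfit = np.polyfit(x, y, deg=2)

print("lstsq  :", coeffs_lstsq)
print("polyfit:", coeffs_polyfit)
print("match  :", np.allclose(coeffs_lstsq, coeffs_polyfit))   # True
```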

Cheatsheet

Operation | NumPy
Dot product | a @ b or np.dot(a, b)
Matrix multiply | A @ B or np.matmul(A, B)
Element-wise multiply | A * B
Transpose | A.T
Inverse | np.linalg.inv(A)
Solve Ax = b | np.linalg.solve(A, b)
Determinant | np.linalg.det(A)
Eigenvalues / vectors | np.linalg.eig(A)
SVD | np.linalg.svd(A)
Norm (length) | np.linalg.norm(v)
Matrix power | np.linalg.matrix_power(A, n)
Identity | np.eye(n)
Polynomial fit | np.polyfit(x, y, deg)
Least squares | np.linalg.lstsq(X, y, rcond=None)

Common pitfalls

  • A * B is element-wise: for matrix multiply use @. This catches everyone the first time.
  • Shape mismatch in @: the inner dimensions must match, (m, k) @ (k, n). If the two k values differ, NumPy raises a ValueError.
  • inv for solving systems: slower and less stable than solve. Always prefer np.linalg.solve(A, b).
  • Singular matrix: inv raises LinAlgError if the matrix isn't invertible. Check the determinant first or use lstsq.
  • Float precision: A @ A_inv may not be exactly the identity. Use np.allclose(...) for comparisons.

Practice

What does this print?

Expected: 32

import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a @ b)

Solve the linear system 3x + 2y = 7, 4x − y = 1 (correct ≈ [0.818 2.273], i.e. x = 9/11, y = 25/11)

Expected: [0.81818182 2.27272727]

import numpy as np
A = np.array([[3, 2], [4, -1]])
b = np.array([7, 1])
print(np.linalg.inv(A) * b)        # bug: * is element-wise; use solve or @

Quiz — Quick check

What you remember

Q1. For matrix multiplication of A (shape (3, 4)) and B (shape (4, 5)), the result has shape…

  • (3, 5)
  • (4, 4)
  • (3, 4)
  • Error — incompatible shapes

Why: Matrix mult (m, k) @ (k, n) gives (m, n). The inner dimensions (k=4) must match; they get "consumed" by the multiplication.

Q2. Why prefer np.linalg.solve(A, b) over np.linalg.inv(A) @ b?

  • It's a NumPy 2.0 deprecation
  • solve is faster and more numerically stable
  • Same speed, just shorter to type
  • inv doesn't work on square matrices

Why: Computing the inverse explicitly is wasteful and amplifies floating-point errors. solve uses optimized algorithms (LU decomposition) and is the textbook recommendation for Ax = b.

Q3. A matrix has determinant 0. What does that mean?

  • It's diagonal
  • It's symmetric
  • It's singular — has no inverse, columns are linearly dependent
  • Its eigenvalues are all positive

Why: A determinant of 0 means the matrix collapses some dimension — its rows/columns aren't independent. np.linalg.inv raises LinAlgError on such matrices.

Common doubts

When should I use np.dot vs @ vs np.matmul?

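For 1D and 2D arrays, they're equivalent. @ is the most readable (Python 3.5+). For higher-dimensional arrays they differ: np.matmul and @ treat the inputs as stacks of matrices and broadcast the leading axes, while np.dot sums over the last axis of the first array and the second-to-last axis of the second, pairing every leading index of one with every leading index of the other. Default to @ unless you know you need dot's specific behavior.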
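The difference is easiest to see in the output shapes. A small sketch with two stacks of matrices (the shapes are arbitrary):

```python
import numpy as np

a = np.ones((2, 3, 4))   # a stack of two (3, 4) matrices
b = np.ones((2, 4, 5))   # a stack of two (4, 5) matrices

# matmul / @ multiplies matching pairs: a[0] @ b[0], a[1] @ b[1]
print((a @ b).shape)        # (2, 3, 5)

# dot combines every matrix in a with every matrix in b
print(np.dot(a, b).shape)   # (2, 3, 2, 5)
```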

What's the practical use of SVD / eigendecomposition?

  • SVD: PCA, image compression, recommender systems, low-rank approximations, pseudo-inverse for least squares
  • Eigen: PCA (related to SVD), PageRank, vibration analysis, stability of dynamical systems

They appear constantly in ML. Worth understanding the geometric intuition (axes of greatest variance, scaling factors).

How do I check if two matrices are 'equal' when comparing computed results?

Float math is imprecise: A @ A_inv == np.eye(n) compares element by element, and some entries will often come out False. Use np.allclose(A @ A_inv, np.eye(n)), which checks that all elements are equal within a tolerance (defaults: rtol=1e-5, atol=1e-8).
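A minimal sketch of the difference, using the classic 0.1 + 0.2 example alongside the inverse check:

```python
import numpy as np

x = np.array([0.1 + 0.2])
y = np.array([0.3])

print(x == y)               # [False]: the two floats differ in the last bits
print(np.allclose(x, y))    # True: equal within tolerance

A = np.array([
    [4, 7],
    [2, 6],
], dtype=float)
A_inv = np.linalg.inv(A)

# Exact comparison may or may not succeed; the tolerance-based one does
print(np.allclose(A @ A_inv, np.eye(2)))   # True
```

The defaults suit values of ordinary magnitude. For very small or very large numbers, passing rtol and atol explicitly is usually the better choice.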

What's next

Random Sampling