Broadcasting¶

Broadcasting is how NumPy does element-wise math between arrays of different shapes. Once you understand it, you can replace many explicit loops and manual reshapes with a single expression.

The simplest case — scalar + array¶

import numpy as np

a = np.array([1, 2, 3, 4])
print(a + 5)        # [6, 7, 8, 9]

The scalar 5 is broadcast to the same shape as a.

Adding a 1D row to every row of a 2D array¶

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
row = np.array([10, 20, 30])

print(matrix + row)

Result:

[[11 22 33]
 [14 25 36]
 [17 28 39]]

NumPy stretched row from shape (3,) to (3, 3) — adding it to every row.
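To see the stretching for yourself, you can ask NumPy to do it explicitly. The sketch below uses np.broadcast_to, which returns a read-only view with the requested shape; the variable names stretched and same are just for illustration.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
row = np.array([10, 20, 30])

# Explicitly stretch row to the matrix's shape (a read-only view)
stretched = np.broadcast_to(row, matrix.shape)
print(stretched)

# Adding the stretched view matches what NumPy does automatically
same = np.array_equal(matrix + row, matrix + stretched)
print("same result:", same)
```

You rarely need np.broadcast_to in real code; it is mostly handy for checking what a broadcast will look like.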

Adding a column¶

To add a different value to each row (a column vector), reshape the 1D array to shape (3, 1) first:

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
col = np.array([100, 200, 300])

# Without reshaping: col has shape (3,), so it lines up with the 3 columns
# and is added to every row. Not what we want here.
print(matrix + col)

# Reshape to a column
col2 = col.reshape(-1, 1)        # shape (3, 1)
print(matrix + col2)

Output of matrix + col2:

[[101 102 103]
 [204 205 206]
 [307 308 309]]
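reshape(-1, 1) is not the only way to turn a 1D array into a column. A small sketch of three spellings that should all give shape (3, 1); the names a, b, c, and all_equal are illustrative only:

```python
import numpy as np

col = np.array([100, 200, 300])

# Three ways to go from shape (3,) to shape (3, 1)
a = col.reshape(-1, 1)
b = col[:, None]
c = col[:, np.newaxis]   # np.newaxis is an alias for None

print(a.shape, b.shape, c.shape)

all_equal = np.array_equal(a, b) and np.array_equal(b, c)
print("all equal:", all_equal)
```

Pick whichever reads best to you; the rest of this page mostly uses [:, None].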

The rules of broadcasting¶

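NumPy compares shapes right-to-left, one axis at a time. If one shape has fewer axes, its missing leading axes are treated as size 1. Two dimensions are compatible if:

They are equal, OR
One of them is 1.

If any pair of dimensions is incompatible, NumPy raises a ValueError. Otherwise, every size-1 dimension is stretched to match the other array.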

shape A	shape B	Compatible?	Result shape
`(3, 4)`	`(4,)`	yes	`(3, 4)`
`(3, 4)`	`(3, 1)`	yes	`(3, 4)`
`(3, 4)`	`(1, 4)`	yes	`(3, 4)`
`(3, 4)`	`(3, 4)`	yes	`(3, 4)`
`(3, 4)`	`(3, 2)`	NO	error
`(3, 1, 5)`	`(4, 5)`	yes	`(3, 4, 5)`

Walk-through for (3, 4) + (4,):

A: (3, 4)
B:    (4,)        ← align right

last axis:  4 vs 4 → equal, ok
first axis: 3 vs (missing) → B is treated as 1 and stretches to 3
result:     (3, 4)
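The rule is small enough to write out by hand. Here is a minimal sketch: broadcast_shape is a hypothetical helper written for this page (not a NumPy function), and it is compared against np.broadcast_shapes, which is available in NumPy 1.20 and later.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rule by hand to two shapes."""
    result = []
    # Walk both shapes from the last axis to the first
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1   # missing axis counts as 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions: {dim_a} vs {dim_b}")
    return tuple(reversed(result))


# A few rows from the table above
pairs = [((3, 4), (4,)), ((3, 4), (3, 1)), ((3, 1, 5), (4, 5))]
for shape_a, shape_b in pairs:
    by_hand = broadcast_shape(shape_a, shape_b)
    by_numpy = np.broadcast_shapes(shape_a, shape_b)
    print(shape_a, shape_b, "->", by_hand, "| NumPy agrees:", by_hand == by_numpy)
```

In practice you would call np.broadcast_shapes directly; the hand-written version is only there to make the rule concrete.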

A practical example — feature scaling¶

For ML, you often need to subtract the mean of each column and divide by the std:

import numpy as np

# 5 samples, 3 features
data = np.array([
    [1.0, 2.0, 100.0],
    [2.0, 3.0, 200.0],
    [3.0, 4.0, 300.0],
    [4.0, 5.0, 400.0],
    [5.0, 6.0, 500.0],
])

# Per-column statistics — shape (3,)
mean = data.mean(axis=0)
std  = data.std(axis=0)

print("mean:", mean)
print("std :", std)

# Broadcasting: (5,3) - (3,) → (5,3)
scaled = (data - mean) / std
print("\nscaled:")
print(scaled)
print("scaled.mean(axis=0):", scaled.mean(axis=0).round(4))
print("scaled.std(axis=0) :", scaled.std(axis=0).round(4))

Each column now has mean ≈ 0 and std ≈ 1. That's a one-liner thanks to broadcasting.

Outer product — every combination¶

A column of shape (N, 1) times a row of shape (1, M) gives an (N, M) table of all products:

import numpy as np

x = np.arange(1, 6)          # shape (5,)
y = np.arange(1, 4)          # shape (3,)

# Reshape to column × row
table = x[:, None] * y[None, :]
print(table)
print("shape:", table.shape)   # (5, 3)

x[:, None] is shape (5, 1); y[None, :] is (1, 3). Broadcasting expands both → (5, 3).
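For plain multiplication, NumPy also ships np.outer, which should give the same table. The broadcasting form is more general because it works with any element-wise operator. A short sketch (by_broadcast, by_outer, and sums are illustrative names):

```python
import numpy as np

x = np.arange(1, 6)          # shape (5,)
y = np.arange(1, 4)          # shape (3,)

by_broadcast = x[:, None] * y[None, :]
by_outer = np.outer(x, y)

same = np.array_equal(by_broadcast, by_outer)
print("same:", same)

# The same pattern with + instead of *: every pairwise sum
sums = x[:, None] + y[None, :]
print(sums)
```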

Coordinate grids¶

import numpy as np

x = np.arange(-2, 3)
y = np.arange(-2, 3)

# Build all (x, y) combinations
xx, yy = np.meshgrid(x, y)
print("xx:")
print(xx)
print("yy:")
print(yy)

# Compute z = x² + y² at every grid point
z = xx**2 + yy**2
print("\nz:")
print(z)

Coordinate grids are useful for plotting surfaces, building image filters, and evaluating mathematical functions over a 2D domain.
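If you only need z and not the grids themselves, broadcasting can produce it directly. The sketch below assumes meshgrid's default indexing, where y runs along rows and x along columns; z_mesh and z_bcast are illustrative names.

```python
import numpy as np

x = np.arange(-2, 3)
y = np.arange(-2, 3)

# Version 1: build full grids, then compute
xx, yy = np.meshgrid(x, y)
z_mesh = xx**2 + yy**2

# Version 2: let broadcasting combine a row of x with a column of y
z_bcast = x[None, :]**2 + y[:, None]**2

print("same:", np.array_equal(z_mesh, z_bcast))
```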

When broadcasting fails — visualize the shapes¶

import numpy as np

a = np.zeros((3, 4))
b = np.zeros((3, 2))

try:
    a + b
except ValueError as e:
    print("Error:", e)

(3, 4) + (3, 2) — last dims 4 and 2 are not equal and neither is 1 → fails.

Fix it by aligning shapes explicitly (often with reshape or [:, None]).
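As a sketch of that fix, suppose you have one value per row of a (3, 4) array. Printing the shapes first usually makes the problem obvious. The array v and the name fixed are made up for this example.

```python
import numpy as np

a = np.zeros((3, 4))
v = np.array([1.0, 2.0, 3.0])    # one value per row of a

print("a:", a.shape, "v:", v.shape)

try:
    a + v                         # (3, 4) + (3,): last dims are 4 vs 3
except ValueError as e:
    print("Error:", e)

# Make v a column so its 3 lines up with a's first axis
fixed = a + v[:, None]           # (3, 4) + (3, 1) → (3, 4)
print(fixed)
```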

More examples¶

Distance from each point to a center:

import numpy as np

points = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
])
center = np.array([0, 0])

# (4, 2) - (2,) → (4, 2)
diffs = points - center
print("diffs:")
print(diffs)

# Euclidean distance per point
distances = np.sqrt((diffs ** 2).sum(axis=1))
print("distances:", distances)
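The same idea can be pushed one step further to get the distance between every pair of points, using a 3D intermediate. This is an extension of the example above rather than part of it, and the names diffs and pairwise are illustrative.

```python
import numpy as np

points = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
])

# (4, 1, 2) - (1, 4, 2) → (4, 4, 2): one difference vector per pair
diffs = points[:, None, :] - points[None, :, :]

# Sum over the last axis (the x/y coordinates) → (4, 4)
pairwise = np.sqrt((diffs ** 2).sum(axis=-1))
print(pairwise.round(3))
```

Entry [i, j] is the distance from point i to point j, so the diagonal is zero and the matrix is symmetric. Keep in mind the intermediate array grows with the square of the number of points.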

Multiplication table:

import numpy as np

n = 10
nums = np.arange(1, n + 1)
table = nums[:, None] * nums[None, :]
print(table)

Broadcasting and memory¶

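Broadcasting doesn't copy the smaller array. NumPy treats it as a view whose stride along the stretched axis is 0, so the same values are read again and again from the same memory. That makes the broadcast itself fast and memory-efficient. The result of the operation is still a new, full-size array.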
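You can inspect this with np.broadcast_to, which exposes the same kind of view NumPy uses internally. A small sketch (view and result are illustrative names):

```python
import numpy as np

row = np.array([10, 20, 30])
view = np.broadcast_to(row, (3, 3))

# The stride along the stretched axis is 0: every row reads the same bytes
print("strides:", view.strides)
print("shares memory with row:", np.shares_memory(row, view))
print("writeable:", view.flags.writeable)

# An arithmetic result is a regular array with its own memory
result = view + 1
print("result shares memory with row:", np.shares_memory(row, result))
```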

Cheatsheet — common patterns¶

Goal	Expression (with shapes)
Add scalar to array	`arr + 5`
Add row vector to every row	`arr (M,N) + row (N,)`
Add col vector to every col	`arr (M,N) + col[:, None] (M,1)`
Outer product	`a[:, None] * b[None, :]`
Normalize each column	`(arr - arr.mean(axis=0)) / arr.std(axis=0)`
Normalize each row	`(arr - arr.mean(axis=1, keepdims=True)) / arr.std(axis=1, keepdims=True)`

keepdims=True is the trick — keeps the reduced dim as size 1 so it broadcasts back.
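Here is the row-normalization pattern from the table as a runnable sketch. The sample array is made up for illustration.

```python
import numpy as np

arr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
])

# keepdims=True keeps the reduced axis as size 1
row_mean = arr.mean(axis=1, keepdims=True)
row_std = arr.std(axis=1, keepdims=True)
print(row_mean.shape)   # (2, 1)

# (2, 4) - (2, 1) → (2, 4)
normalized = (arr - row_mean) / row_std
print("row means:", normalized.mean(axis=1).round(4))
print("row stds :", normalized.std(axis=1).round(4))
```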

Common pitfalls¶

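❗ Forgetting keepdims=True: arr.sum(axis=1) reduces shape from (M, N) to (M,). Subtracting that from arr raises ValueError when M ≠ N, and when M = N it runs but subtracts along the wrong axis. Use arr.sum(axis=1, keepdims=True) (shape (M, 1)) so it broadcasts back to (M, N).
❗ Adding a row instead of a column: always check the shapes. (M, N) + (M,) fails when M ≠ N and silently applies the values per column when M = N. Reshape to (M, 1).
❗ Operator precedence in & / |: these bind tighter than comparisons, so wrap each comparison in parens: (a > 1) & (a < 5), not a > 1 & a < 5.
❗ Mixing dtypes: an int array plus a float gives a float result, which can be surprising. In-place operations keep the original dtype, so int_arr += 0.5 raises a casting error instead of upcasting.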
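The first two pitfalls are most dangerous with square arrays, because nothing fails. A minimal sketch of the silent version (the names wrong and right are illustrative):

```python
import numpy as np

arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

row_sums = arr.sum(axis=1)                      # shape (3,): [6, 15, 24]

# No error, but row_sums[j] is subtracted from column j
wrong = arr - row_sums

# With keepdims, row_sums[i] is subtracted from row i
right = arr - arr.sum(axis=1, keepdims=True)

print("wrong:")
print(wrong)
print("right:")
print(right)
```

Printing the shapes of both operands before combining them is a cheap way to catch this.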

Practice¶

What does this print?

Expected: [[11 22 33] [14 25 36]]

import numpy as np
m = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
print(m + row)

Add col to every column (not every row) of the matrix

Expected: [[101 102 103] [204 205 206] [307 308 309]]

import numpy as np
m = np.array([[1,2,3],[4,5,6],[7,8,9]])
col = np.array([100, 200, 300])
print(m + col)        # bug: this broadcasts col across ROWS — reshape to (3,1)

Quiz — Quick check¶

What you remember

Q1. Broadcasting (3, 4) + (4,) produces a result of shape…

(3, 4)
(3, 1)
(4, 4)
Error — shapes don't match

Why: NumPy aligns shapes right-to-left. (4,) becomes (1, 4), then stretches to (3, 4) — same as the first operand.

Q2. Why does (3, 4) + (3, 2) fail?

Last dims (4 and 2) are neither equal nor 1
First dims don't match
NumPy doesn't broadcast at all
You need np.broadcast

Why: Two dimensions can broadcast only if they're equal OR one is 1. Neither applies to 4 vs 2, so NumPy raises ValueError.

Q3. When normalizing columns of a (M, N) array with (arr - arr.mean(axis=0)) / arr.std(axis=0), which axis is correct?

axis=0 — collapses rows, gives per-column stats
axis=1
axis=-1
No axis needed

Why: axis=0 reduces along the first axis (rows), producing per-column statistics. Then broadcasting expands the result back to (M, N).

Common doubts¶

Does broadcasting actually copy memory?

No — it uses strides to pretend the smaller array is bigger. The data isn't duplicated. That's why broadcasting is both fast and memory-efficient.

Why do people use keepdims=True so often?

Because reductions collapse a dimension. After arr.mean(axis=1) for shape (M, N), you get (M,). Subtracting that from the original raises ValueError unless M equals N, in which case it runs but subtracts along the wrong axis. keepdims=True keeps the dimension as size 1 ((M, 1)), which broadcasts cleanly back to (M, N).

When should I reach for np.meshgrid vs broadcasting x[:, None] * y[None, :]?

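They cover the same combinations of x and y. np.meshgrid is more explicit and materializes full xx and yy arrays, which is convenient for plotting; with its default indexing the grids have shape (len(y), len(x)). The [:, None] trick is shorter, skips the intermediate grids, and produces just the result. Use whichever is clearer in context.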

What's next¶

→ Aggregations — sum, mean, std, axis