Creating Arrays¶

NumPy has many ways to create an array. You'll use the same handful 90% of the time.

From a Python list — `np.array()`¶

import numpy as np

a = np.array([1, 2, 3, 4, 5])
print(a)
print(type(a))

From a list of lists → 2D array:

import numpy as np

b = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
print(b)
print("shape:", b.shape)

`np.zeros()` — all zeros¶

import numpy as np

print(np.zeros(5))             # 1D, length 5
print(np.zeros((2, 3)))         # 2D, 2 rows × 3 cols
print(np.zeros((2, 2, 2)))      # 3D

`np.ones()` — all ones¶

import numpy as np

print(np.ones(4))
print(np.ones((3, 3)))

`np.full()` — filled with any value¶

import numpy as np

print(np.full(5, 7))                # five 7s
print(np.full((2, 3), 3.14))         # 2x3 of 3.14

`np.arange()` — like Python's `range()`¶

import numpy as np

print(np.arange(10))                 # 0 to 9
print(np.arange(1, 11))              # 1 to 10
print(np.arange(0, 20, 2))           # step of 2
print(np.arange(1, 0, -0.1))         # floats also work

`np.linspace()` — evenly spaced points¶

linspace(start, stop, n) — n points from start to stop, inclusive on both ends.

import numpy as np

print(np.linspace(0, 1, 5))       # 5 points from 0 to 1
print(np.linspace(0, 10, 11))     # 11 points from 0 to 10
print(np.linspace(-1, 1, 9))      # 9 points from -1 to 1

arange vs linspace:

- arange — give a step, get however many points fit.
- linspace — give the number of points, get the step automatically.
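To make the contrast concrete, here is a small sketch. It uses linspace's `retstep=True` option, which is not covered above, to show the step that linspace worked out on its own.

```python
import numpy as np

# arange: you choose the step, NumPy works out how many points fit.
a = np.arange(0, 10, 2.5)
print(a, "->", a.size, "points")

# linspace: you choose the number of points, NumPy works out the step.
# retstep=True returns the computed step alongside the array.
pts, step = np.linspace(0, 10, 5, retstep=True)
print(pts, "-> step", step)
```

With the same step of 2.5, arange produces 4 points and linspace produces 5, because only linspace includes the endpoint.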

`np.eye()` — identity matrix¶

import numpy as np

print(np.eye(4))            # 4x4 identity
print(np.eye(3, 5))         # 3x5 with diagonal of ones

`np.empty()` — uninitialized¶

Faster than zeros because it doesn't bother filling with zeros — but the contents are garbage (whatever was in memory). Only useful when you're about to overwrite the whole thing.

import numpy as np

# Don't rely on the values!
print(np.empty((2, 3)))
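A minimal sketch of the "overwrite the whole thing" pattern. The array name `squares` and the size `n` are illustrative choices, not part of the text above.

```python
import numpy as np

n = 5
squares = np.empty(n, dtype=np.int64)   # contents are undefined at this point

# Every slot is written before anything reads it,
# so the leftover memory contents never leak into the result.
for i in range(n):
    squares[i] = i * i

print(squares)
```

If any element might be left unwritten, `np.zeros` is the safer starting point.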

`np.random` — random arrays¶

import numpy as np

# Set the seed for reproducibility
rng = np.random.default_rng(seed=42)

print(rng.random(5))                       # 5 floats in [0, 1)
print(rng.integers(0, 100, size=10))       # 10 ints in [0, 100)
print(rng.normal(loc=0, scale=1, size=5))  # 5 samples from N(0, 1)
print(rng.choice(["red", "green", "blue"], size=5))

We'll deep-dive into random in chapter 11.
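As a quick illustration of what the seed buys you, the sketch below builds two generators with the same seed and checks that they produce the same numbers. The variable names are illustrative.

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

first = rng_a.integers(0, 100, size=10)
second = rng_b.integers(0, 100, size=10)

# Same seed, same sequence of draws.
print(np.array_equal(first, second))

# A generator keeps advancing, so the next call returns a fresh draw.
third = rng_a.integers(0, 100, size=10)
print(third)
```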

`dtype` — choose the number type¶

By default, integer lists → int64 (int32 on Windows with NumPy versions before 2.0), float lists → float64. You can pick the type yourself with dtype:

import numpy as np

a = np.array([1, 2, 3], dtype=np.float32)
print(a, a.dtype)

b = np.array([1.5, 2.5, 3.5], dtype=np.int32)
print(b, b.dtype)         # decimals dropped (truncated, not rounded)!

# Common dtypes
print(np.zeros(3, dtype=bool))            # all False
print(np.zeros(3, dtype=np.uint8))        # 0-255
print(np.array(["a", "bb", "ccc"]).dtype) # '<U3' (Unicode str, 3 chars)

Smaller dtypes = less memory. A 1-million-element float64 array is 8 MB; float32 is 4 MB; int8 is 1 MB. For ML on big data, choosing the right dtype matters.
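You can check these numbers yourself with the `itemsize` and `nbytes` attributes. A small sketch, counting 1 MB as 1,000,000 bytes:

```python
import numpy as np

n = 1_000_000
for dt in (np.float64, np.float32, np.int8):
    arr = np.zeros(n, dtype=dt)
    # itemsize is bytes per element; nbytes is the total for the array's data.
    print(arr.dtype, arr.itemsize, "bytes each,", arr.nbytes / 1_000_000, "MB")
```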

`like` versions — match the shape of another array¶

import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])

print(np.zeros_like(original))
print(np.ones_like(original))
print(np.full_like(original, 9))
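One detail worth knowing: the `_like` functions copy the dtype as well as the shape. The sketch below uses a fill value of 9.7, chosen only to make the cast visible.

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6]])   # integer array

filled = np.full_like(original, 9.7)
print(filled.dtype == original.dtype)   # the dtype is copied too
print(filled)                           # 9.7 is cast to the integer dtype

# Pass dtype explicitly to keep the shape but change the type.
as_float = np.full_like(original, 9.7, dtype=float)
print(as_float)
```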

Quick-reference¶

Need	Function
From a list	`np.array(lst)`
All zeros	`np.zeros(shape)`
All ones	`np.ones(shape)`
Any constant	`np.full(shape, value)`
Range with step	`np.arange(start, stop, step)`
N evenly-spaced points	`np.linspace(start, stop, n)`
Identity matrix	`np.eye(n)`
Random	`np.random.default_rng().random(shape)`
Match shape of another	`np.zeros_like(arr)`

Mini-exercise¶

Create a 5×5 array where the diagonal is 1, everything else is 0. Two ways:

import numpy as np

# Way 1
print(np.eye(5))

# Way 2 — manual
a = np.zeros((5, 5), dtype=int)
for i in range(5):
    a[i, i] = 1
print(a)

Common pitfalls¶

❗ np.array(1, 2, 3) doesn't work — must wrap in a list: np.array([1, 2, 3]).
❗ Mixed-type list — np.array([1, 2.0, "3"]) produces a string array.
❗ np.empty() for working data — it has garbage in it. Use zeros unless performance is critical.
❗ arange() with floats — float arithmetic isn't exact, so the element count can be off by one. np.arange(1, 1.3, 0.1) returns 4 elements, ending at about 1.3, even though the stop value is supposed to be excluded. Use linspace for floats.
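The float pitfall is easy to reproduce. A minimal sketch, using start 1, stop 1.3, and step 0.1 as illustrative values:

```python
import numpy as np

# You might expect 3 points: 1.0, 1.1, 1.2.
# Rounding in (stop - start) / step pushes the computed length up by one,
# so a value at about 1.3 sneaks in.
a = np.arange(1, 1.3, 0.1)
print(a.size, a)

# linspace takes the count directly, so the length is never in doubt.
b = np.linspace(1, 1.2, 3)
print(b.size, b)
```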

Practice¶

What does this print?

Expected: [0. 0.25 0.5 0.75 1. ]

import numpy as np
print(np.linspace(0, 1, 5))

Create exactly 11 evenly-spaced points from 0 to 10 (inclusive)

Expected: [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]

import numpy as np
print(np.arange(0, 10))     # bug: stops before 10, not inclusive; use linspace

Quiz — Quick check¶

What you remember

Q1. What's the shape of np.zeros((2, 3))?

(6,)
(2, 3) — 2 rows, 3 columns
(3, 2)
2

Why: Pass a tuple to specify a multi-dimensional shape. (2, 3) means 2 in the first axis (rows), 3 in the second (cols).

Q2. Do np.linspace(0, 10, 5) and np.arange(0, 10, 2.5) give identical output?

They're identical
No — linspace includes the endpoint (10); arange excludes it
arange is for floats only
linspace is deprecated

Why: linspace(0, 10, 5) → [0, 2.5, 5, 7.5, 10] (5 points, endpoint included). arange(0, 10, 2.5) → [0, 2.5, 5, 7.5] (stops before 10).

Q3. Why prefer np.linspace over np.arange when working with floats?

Float arithmetic with arange can produce unexpected element counts due to rounding
linspace is faster
arange only works on integers
linspace uses less memory

Why: Rounding in arange's length calculation can add an unexpected element. For example, np.arange(1, 1.3, 0.1) returns 4 elements, not 3. linspace lets you specify the count exactly.

Common doubts¶

Why do I need np.zeros((2, 3)) instead of np.zeros(2, 3)?

NumPy expects the shape as a single argument: an int for 1D, or a tuple for more dimensions. np.zeros((2, 3)) says "shape is (2, 3)". np.zeros(2, 3) passes two positional arguments, and zeros treats the second one as the dtype, so you get a confusing TypeError saying that 3 is not a data type.
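A short sketch of what that failure looks like, next to the working call. The exact error message may differ between NumPy versions.

```python
import numpy as np

try:
    np.zeros(2, 3)              # 3 lands in the dtype slot
except TypeError as err:
    print("TypeError:", err)

print(np.zeros((2, 3)).shape)   # shape passed as one tuple
```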

What's np.empty for if its contents are garbage?

Performance. np.empty doesn't waste time zeroing memory. Use it only when you'll immediately overwrite every element (e.g. filling in a loop). For most code, np.zeros is the safe default.

When should I use a smaller dtype like float32 instead of float64?

For ML on large datasets — float32 halves memory and on modern CPUs/GPUs is often faster than float64. For scientific computing where precision matters (e.g. accumulating sums of many small numbers), stay with float64.
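To see why precision can matter, here is a small sketch. It relies on the fact that float32 stores a 24-bit significand, so integers above 2**24 can no longer all be represented.

```python
import numpy as np

big = 16_777_216   # 2**24

# float32 cannot tell 2**24 and 2**24 + 1 apart; float64 can.
print(np.float32(big + 1) == np.float32(big))   # True
print(np.float64(big + 1) == np.float64(big))   # False

# The same effect in a running sum: adding 1.0 to 2**24 in float32 changes nothing.
total32 = np.float32(big) + np.float32(1.0)
print(total32 == np.float32(big))               # True
```

In a long accumulation, small increments can vanish like this once the running total gets large, which is the case where float64 is the safer choice.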

What's next¶

→ Array Attributes — shape, dtype, ndim
