Matrices

In Chapter 20, we explored vectors as mathematical objects representing magnitude and direction. We saw how vectors model positions, velocities, forces, and high-dimensional data points. While vectors represent individual points or directions in space, matrices provide the mathematical framework for organizing collections of data and performing operations on multiple vectors simultaneously.

What are matrices?

A matrix is a rectangular grid of numbers arranged in rows and columns. While a vector is a single list of numbers representing a point or direction in space, a matrix is a collection of multiple vectors organized together. We can think of a matrix as a table where each row or column is itself a vector.

The dimensions of a matrix tell us its shape. A 2×3 matrix means 2 rows and 3 columns:

[1  2  3]
[4  5  6]

Can a matrix have just a single vector? Yes. A matrix can be a single row or a single column. A 1×3 matrix (one row, three columns) looks like this:

[1  2  3]

This contains the same values as the vector [1, 2, 3]. A 3×1 matrix (three rows, one column) represents the same data oriented vertically:

[1]
[2]
[3]

Both are valid matrices because they have the rectangular row-and-column structure, even though one dimension equals 1. In formal linear algebra, vectors are often represented as single-row or single-column matrices. This representation makes some operations work consistently, since object dimensions must align properly. When we write [1, 2, 3] in code, we’re thinking “vector” conceptually, but mathematically it can be treated as either a 1×3 or 3×1 matrix depending on the context.

Vectors as matrices

In Chapter 20, we worked with vectors as one-dimensional arrays like [3.0, 4.0]. Mathematically, this can also be viewed as a single-column matrix (a 2×1 matrix):

import Quiver

// Three equivalent representations of the same data
let vector = [3.0, 4.0]                // As vector (Quiver's approach)
let columnVector = [[3.0], [4.0]]      // As 2×1 matrix (column)
let rowVector = [[3.0, 4.0]]           // As 1×2 matrix (row)

This dual perspective bridges the gap between vectors and matrices. A vector is simply a special case of a matrix where one dimension equals 1. Quiver’s reshaped method makes this relationship concrete — we can convert between shapes without losing any data:

import Quiver

// Start with a flat vector of 6 elements
let vector = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

// Reshape into a 2×3 matrix
let matrix = vector.reshaped(rows: 2, columns: 3)
// [[1.0, 2.0, 3.0],
//  [4.0, 5.0, 6.0]]

// Flatten back to the original vector
let restored = matrix.flattened()
// [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

The same six values flow between a 1×6 vector and a 2×3 matrix — nothing is added or removed. Reshaping simply reorganizes how the data is arranged in rows and columns, filling the new shape row by row.
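
To make the row-by-row fill concrete, here is a minimal sketch of the same idea in plain Swift. It does not use Quiver, and reshapeRowMajor is a hypothetical helper named for illustration, so it may differ from how reshaped is actually implemented. Element (r, c) of the result comes from position r × columns + c of the flat array:

```swift
// Hypothetical helper for illustration; not part of Quiver's API.
// Splits a flat array into `rows` rows of `columns` elements, row by row.
func reshapeRowMajor(_ values: [Double], rows: Int, columns: Int) -> [[Double]]? {
    // The total element count must match the target shape exactly.
    guard rows > 0, columns > 0, values.count == rows * columns else { return nil }
    return (0..<rows).map { r in
        // Row r covers flat positions r * columns up to (r + 1) * columns.
        Array(values[(r * columns)..<((r + 1) * columns)])
    }
}

let flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

let grid = reshapeRowMajor(flat, rows: 2, columns: 3)
// grid is [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

let roundTrip = grid?.flatMap { $0 }
// roundTrip holds the same six values as flat

let invalid = reshapeRowMajor(flat, rows: 4, columns: 2)
// invalid is nil because 4 × 2 slots cannot hold exactly 6 values
```

Returning nil for a mismatched shape is one possible design choice in this sketch; a library could just as reasonably trap or throw.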

What matrices represent

What do these numbers represent? That depends entirely on the application. In a dataset, each row might represent a different person, and each column a different measurement. The first row [1, 2, 3] could represent Person A’s age (1), height (2), and weight (3). The second row [4, 5, 6] represents Person B’s measurements. Now we have organized data for 2 people across 3 attributes.

Consider a real-world example: tracking three athletes across two fitness metrics (running speed and jump height). Each athlete is a row, each metric is a column:

          Speed  Jump
Athlete A   8.5   2.1
Athlete B   7.2   2.4
Athlete C   9.1   1.9

This is a 3×2 matrix. The numbers represent measured values. We could extract Athlete B’s performance as a vector [7.2, 2.4], or compare all athletes’ speeds by looking at the first column [8.5, 7.2, 9.1].

Transforming objects

In many cases a matrix describes a transformation: how to move, rotate, or scale objects in space. Each number specifies how much one dimension affects another during the transformation. The matrix [[0, -1], [1, 0]] doesn’t hold data about objects; it represents an operation that rotates any vector by 90 degrees counterclockwise. This will be explored in more detail in Chapter 22.
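
As a small preview, the sketch below applies that rotation matrix to two vectors using plain Swift. The apply function is a hypothetical helper written for this example, not Quiver’s API; it assumes each matrix row has as many entries as the vector, and each output component is the dot product of one row with the vector:

```swift
// Hypothetical helper: multiplies a matrix by a vector, one row at a time.
func apply(_ m: [[Double]], to v: [Double]) -> [Double] {
    return m.map { (row: [Double]) -> Double in
        var sum = 0.0
        // Dot product of this row with the vector.
        for (a, b) in zip(row, v) { sum += a * b }
        return sum
    }
}

let rotation = [[0.0, -1.0], [1.0, 0.0]]

let rotatedEast = apply(rotation, to: [1.0, 0.0])
// [0.0, 1.0]: a vector pointing along +x now points along +y

let rotatedPoint = apply(rotation, to: [3.0, 4.0])
// [-4.0, 3.0]: same length as before, turned 90 degrees counterclockwise
```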

Matrix representation in Quiver

Quiver extends Swift’s array syntax to work with matrices naturally. Since Quiver treats arrays as vectors and arrays of arrays as matrices, we can work with mathematical structures using familiar Swift syntax.

Creating matrices

In Quiver, matrices are simply nested arrays:

import Quiver

// Create a 2×3 matrix
let matrix = [
    [1.0, 2.0, 3.0],  // Row 1
    [4.0, 5.0, 6.0]   // Row 2
]

// Access individual elements
let value = matrix[0][1]  // 2.0 (row 0, column 1)

// Access entire rows
let firstRow = matrix[0]  // [1.0, 2.0, 3.0]

Inspecting matrix dimensions

Once we have a matrix, we often need to know its shape — how many rows and columns it contains. Quiver provides two computed properties for this. The .shape property returns a named tuple, while .size returns the total element count across all dimensions:

import Quiver

let matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0]
]

let (rows, columns) = matrix.shape
// rows == 2, columns == 3

matrix.size  // 6 (total elements)

These properties become essential when validating reshape operations, since the total element count must remain the same before and after reshaping.
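
The check described above can be sketched in plain Swift. Here shapeOf is a hypothetical helper, not Quiver’s shape property; it also confirms that the nested array is rectangular, which this sketch treats as a precondition for having a shape at all:

```swift
// Hypothetical helper, not Quiver's API: returns (rows, columns) for a
// rectangular nested array, or nil when the rows differ in length.
func shapeOf(_ m: [[Double]]) -> (rows: Int, columns: Int)? {
    let columns = m.first?.count ?? 0
    guard m.allSatisfy({ $0.count == columns }) else { return nil }
    return (rows: m.count, columns: columns)
}

let table = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
let ragged = [[1.0, 2.0], [3.0]]

if let shape = shapeOf(table) {
    let total = shape.rows * shape.columns   // 6 elements in a 2×3 matrix
    let fitsThreeByTwo = total == 3 * 2      // true: a 3×2 shape also holds 6
    let fitsFourByTwo = total == 4 * 2       // false: a 4×2 shape needs 8
    print(shape.rows, shape.columns, fitsThreeByTwo, fitsFourByTwo)
}

// Rows of different lengths do not form a matrix, so there is no shape.
print(shapeOf(ragged) == nil)
```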

Matrix initialization

Quiver provides convenient functions for creating matrices with common initial values:

import Quiver

// Create matrices filled with zeros
let zeros = [Double].zeros(3, 4)     // 3×4 matrix of zeros
// [[0.0, 0.0, 0.0, 0.0],
//  [0.0, 0.0, 0.0, 0.0],
//  [0.0, 0.0, 0.0, 0.0]]

// Create matrices filled with ones
let ones = [Double].ones(2, 3)       // 2×3 matrix of ones
// [[1.0, 1.0, 1.0],
//  [1.0, 1.0, 1.0]]

// Create matrices with custom initial values
let fives = [Double].full(2, 2, value: 5.0)
// [[5.0, 5.0],
//  [5.0, 5.0]]

// Identity matrix (1s on diagonal, 0s elsewhere)
let identity = [Double].identity(3)
// [[1.0, 0.0, 0.0],
//  [0.0, 1.0, 0.0],
//  [0.0, 0.0, 1.0]]

// Diagonal matrix from vector values
let diagonal = [Double].diag([3.0, 2.0, 1.0])
// [[3.0, 0.0, 0.0],
//  [0.0, 2.0, 0.0],
//  [0.0, 0.0, 1.0]]

These initialization functions are essential when setting up matrices for data processing or machine learning. The zeros and ones functions are ideal for allocating storage before filling matrices with computed values. Identity and diagonal matrices play special roles in linear transformations, which we’ll explore in Chapter 22.
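
The allocate-then-fill pattern can be illustrated without Quiver. In the sketch below, filled is a hypothetical plain-Swift stand-in for the initializers above; it only shows the general idea of creating storage first and computing values afterward:

```swift
// Hypothetical plain-Swift helper, not Quiver's API.
func filled(rows: Int, columns: Int, value: Double) -> [[Double]] {
    return Array(repeating: Array(repeating: value, count: columns), count: rows)
}

// Allocate a 3×3 grid of zeros, then fill it with computed values.
var table = filled(rows: 3, columns: 3, value: 0.0)
for r in 0..<3 {
    for c in 0..<3 {
        table[r][c] = Double((r + 1) * (c + 1))   // multiplication table
    }
}
// table is [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]

// An identity matrix is a zero matrix with ones written on the diagonal.
var eye = filled(rows: 3, columns: 3, value: 0.0)
for i in 0..<3 { eye[i][i] = 1.0 }
// eye is [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```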

Accessing columns

Each row of a nested array is available directly by index, but a column spans every row, so extracting one requires Quiver’s .column(at:) method:

import Quiver

let data = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0]
]

// Extract the second column (index 1)
let column = data.column(at: 1)  // [2.0, 5.0]

This becomes essential when working with data matrices where each column represents a different feature or variable.
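
For intuition about what column extraction involves, here is a plain-Swift sketch that visits every row and picks out one position. The name columnValues is hypothetical and the code is not Quiver’s implementation; it assumes every row is long enough to contain the requested index:

```swift
// Hypothetical helper, not Quiver's API: collects element `index` from every row.
func columnValues(_ m: [[Double]], at index: Int) -> [Double] {
    // Assumes every row has more than `index` elements.
    return m.map { $0[index] }
}

// Rows are athletes; columns are speed and jump height.
let athletes = [
    [8.5, 2.1],
    [7.2, 2.4],
    [9.1, 1.9]
]

let speeds = columnValues(athletes, at: 0)   // [8.5, 7.2, 9.1]
let jumps = columnValues(athletes, at: 1)    // [2.1, 2.4, 1.9]

// With a feature isolated as a vector, per-feature questions become simple.
let fastest = speeds.max()                   // Optional(9.1)
```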

Basic matrix operations

Matrices support several fundamental operations that enable data manipulation and mathematical computations. Matrix addition combines two matrices by adding corresponding elements. Both matrices must have the same dimensions:

[1  2]   [5  6]   [6   8]
[3  4] + [7  8] = [10  12]

The result is computed element-wise: result[i][j] = A[i][j] + B[i][j]. Matrix subtraction works identically, but with subtraction instead of addition.

This operation is useful when combining datasets or merging information. For example, if two sensors collect measurements in matrix form, adding their matrices combines their readings. In machine learning, gradient matrices are often added when updating model parameters.

import Quiver

let readings1 = [
    [1.0, 2.0],
    [3.0, 4.0]
]

let readings2 = [
    [0.5, 0.3],
    [0.7, 0.2]
]

// Combine sensor readings using element-wise addition
let combined = readings1.add(readings2)
// Result: [[1.5, 2.3], [3.7, 4.2]]

Quiver provides named methods for element-wise matrix arithmetic, making operations explicit and readable.
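
The rule result[i][j] = A[i][j] + B[i][j] can also be written out by hand. The following is a minimal sketch in plain Swift, with addElementwise as a hypothetical helper rather than Quiver’s add method; it returns nil when the two shapes differ, which is one possible way to enforce the same-dimensions requirement:

```swift
// Hypothetical helper, not Quiver's API: adds matching elements of two matrices.
func addElementwise(_ a: [[Double]], _ b: [[Double]]) -> [[Double]]? {
    // Both matrices need the same number of rows.
    guard a.count == b.count else { return nil }
    var result: [[Double]] = []
    for (rowA, rowB) in zip(a, b) {
        // Every pair of rows needs the same number of columns.
        guard rowA.count == rowB.count else { return nil }
        result.append(zip(rowA, rowB).map { $0 + $1 })
    }
    return result
}

let first = [[1.0, 2.0], [3.0, 4.0]]
let second = [[0.5, 0.3], [0.7, 0.2]]

let sum = addElementwise(first, second)
// approximately [[1.5, 2.3], [3.7, 4.2]]

let mismatched = addElementwise(first, [[1.0, 2.0, 3.0]])
// nil: a 2×2 matrix cannot be added to a 1×3 matrix
```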

Scalar broadcasting

Scalar broadcasting applies a single value to every element of a matrix. This operation appears constantly in data science and machine learning—standardizing data, applying scaling factors, or adding biases.

import Quiver

let matrix = [[100.0, 200.0], [300.0, 400.0]]

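// Center and scale: subtract the mean (250.0), then divide by a fixed scale (150.0).
// A true z-score would divide by the standard deviation instead
// (about 111.8 for these four values, using the population formula).
let standardized = (matrix - 250.0) / 150.0
// Result (rounded): [[-1.0, -0.33], [0.33, 1.0]]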

// Scale all values
let scaled = matrix * 0.5
// Result: [[50.0, 100.0], [150.0, 200.0]]

// Add offset
let offset = matrix + 10.0
// Result: [[110.0, 210.0], [310.0, 410.0]]

Quiver broadcasts scalars across matrices automatically, making data transformations concise and readable.
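
A z-score uses the mean and standard deviation computed from the data itself. As a rough sketch in plain Swift (not Quiver’s API; mapElements is a hypothetical helper), we can compute both statistics over all elements and then broadcast them across the matrix. The population formula, which divides by the element count, is an assumption made for this example:

```swift
// Hypothetical helper: applies one function to every element of a matrix.
func mapElements(_ m: [[Double]], _ f: (Double) -> Double) -> [[Double]] {
    return m.map { row in row.map(f) }
}

let values = [[100.0, 200.0], [300.0, 400.0]]
let all = values.flatMap { $0 }
let count = Double(all.count)

let mean = all.reduce(0.0, +) / count                 // 250.0
let squaredDeviations = all.map { ($0 - mean) * ($0 - mean) }
let variance = squaredDeviations.reduce(0.0, +) / count   // 12500.0
let sd = variance.squareRoot()                        // about 111.8

// Broadcast: the same two scalars are applied to every element.
let zScores = mapElements(values) { ($0 - mean) / sd }
// approximately [[-1.34, -0.45], [0.45, 1.34]]
```

After this transformation the elements have mean 0 and population standard deviation 1, which is the property that makes z-scores comparable across features.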

Transpose

Transposing a matrix flips it along its diagonal — rows become columns and columns become rows. A 2×3 matrix becomes a 3×2 matrix, with every element at position (i, j) moving to position (j, i):

Original (2×3)      Transposed (3×2)
[1  2  3]           [1  4]
[4  5  6]           [2  5]
                    [3  6]

This operation is fundamental when data is organized by rows but we need to work with columns, or when matrix dimensions need to align for multiplication. In machine learning, transposing allows us to switch between representing each sample as a row and representing each feature as a row, depending on what an algorithm requires:

import Quiver

// Athlete data: rows are athletes, columns are metrics
let athletes = [
    [8.5, 2.1],   // Athlete A: speed, jump height
    [7.2, 2.4],   // Athlete B
    [9.1, 1.9]    // Athlete C
]

// Transpose to get columns as rows
let byMetric = athletes.transposed()
// [[8.5, 7.2, 9.1],   -- all speeds
//  [2.1, 2.4, 1.9]]   -- all jump heights

// Now we can compute per-metric statistics directly
byMetric[0].mean()  // 8.27 average speed
byMetric[1].std()   // 0.21 jump height variability (the population standard deviation of these three values)
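
The (i, j) to (j, i) rule translates almost directly into code. Below is a minimal plain-Swift sketch; transposeMatrix is a hypothetical helper, not Quiver’s transposed method, and it assumes the input is rectangular:

```swift
// Hypothetical helper, not Quiver's API. Assumes all rows have equal length.
func transposeMatrix(_ m: [[Double]]) -> [[Double]] {
    guard let columnCount = m.first?.count else { return [] }
    // Row c of the result gathers element c from every original row,
    // so the element at (r, c) ends up at (c, r).
    return (0..<columnCount).map { c in
        m.map { row in row[c] }
    }
}

let original = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   // 2×3

let flipped = transposeMatrix(original)              // 3×2
// [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]

let flippedTwice = transposeMatrix(flipped)          // back to 2×3
// [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

Transposing twice returns the original matrix, which is a quick sanity check for any transpose implementation.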

Building algorithmic intuition

Matrices provide a powerful framework for organizing multi-dimensional data into structured rectangular arrays. The row-column format naturally represents relationships — whether sensor readings over time, feature values across samples, or word counts across documents. This organization enables systematic operations on entire datasets rather than processing individual values one at a time.

The operations we’ve covered here — creation, reshaping, element-wise arithmetic, broadcasting, and transposing — form the structural foundation for matrices. In Chapter 22, we’ll see how matrix multiplication transforms vectors through space, enabling scaling, rotation, and composition of transformations. Chapter 23 then applies these tools to measure how similar or different data points are, connecting matrix operations directly to recommendation systems and semantic search.