Multiplying matrices and vectors

Vectors and matrices in arbitrary dimensions

We’ve dealt with two- and three-dimensional vectors (i.e., vectors in $\mathbf{R}^2$ and $\mathbf{R}^3$). We can generalize this concept to an arbitrary number of dimensions, say n dimensions. We refer to an n-dimensional vector as a vector in $\mathbf{R}^n$ and write it as an n-tuple of numbers:

\[
x = (x_1, x_2, x_3, \ldots, x_n). \tag{1}
\]
For example, a = (1, 6, -23, 0.23, 0, 400) is a vector in $\mathbf{R}^6$. (Don’t think there’s any practical use to dealing with dimensions higher than three? See this page for examples of why we often need to study $\mathbf{R}^n$ for n larger than 3.)

You can view a matrix simply as a generalization of a vector, where we arrange numbers in both rows and columns. Let’s keep the number of rows and columns arbitrary, letting m be the number of rows and n the number of columns. We refer to such a matrix as an m × n matrix and write it as

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}.
\]
You have seen example matrices earlier, such as a 2 × 3 matrix.

Vectors as matrices

In many cases, we make our lives simpler by viewing vectors as a special class of matrices. From looking at the above vector and matrix, the only apparent difference between vectors and matrices is that vectors have only one row while matrices have multiple rows. However, there is one important twist (literally) that isn’t apparent above. When we view vectors as matrices, we actually view them as a rotated version of the standard form (equation (1)), writing an n-dimensional vector as an n × 1 column matrix

\[
x = \begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\vdots \\
x_n
\end{bmatrix}.
\]
We often call x an n × 1 column vector, so we use the terms “column vector” and “column matrix” synonymously. (Note that a column vector has many rows but only one column.)
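One way to make the row/column distinction concrete is to store a matrix as a list of rows. The following is only a sketch in Python; the helper names `to_column` and `shape` are illustrative assumptions, not something defined in the text.

```python
def to_column(components):
    """Turn an n-tuple (x1, ..., xn) into an n x 1 column matrix.

    The matrix is stored as a list of rows, so a column matrix is a
    list of n rows that each hold a single entry.
    """
    return [[c] for c in components]


def shape(matrix):
    """Return (number of rows, number of columns) of a list-of-rows matrix."""
    return (len(matrix), len(matrix[0]))


a = (1, 6, -23, 0.23, 0, 400)  # the six-dimensional example vector from above
col = to_column(a)
print(shape(col))  # (6, 1): many rows, one column
```

Storing the vector this way means it can be handled by the same code that handles any other matrix.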

Matrix-vector product

We define multiplication between a matrix A and a vector x (i.e., the matrix-vector product) only for the case when the number of columns in A equals the number of rows in x. So, if A is an m × n matrix (i.e., with n columns), then the product Ax is defined for n × 1 column vectors x. If we let Ax = b, then b is an m × 1 column vector. In other words, the number of rows in A (which can be anything) determines the number of rows in the product b.

The general formula for a matrix-vector product is

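\[
Ax =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix}
=
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{bmatrix}.
\]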
Although it may look confusing at first, the process of matrix-vector multiplication is actually quite simple. One takes the dot product of x with each of the rows of A. (This is why the number of columns in A has to equal the number of components in x.) The first component of the matrix-vector product is the dot product of x with the first row of A, etc. In fact, if A has only one row, the matrix-vector product is really a dot product in disguise, as described here.

For example, if

\[
A = \begin{bmatrix}
1 & -1 & 2 \\
0 & -3 & 1
\end{bmatrix}
\]
and x = (2, 1, 0), then
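\[
Ax =
\begin{bmatrix}
1 & -1 & 2 \\
0 & -3 & 1
\end{bmatrix}
\begin{bmatrix}
2 \\
1 \\
0
\end{bmatrix}
=
\begin{bmatrix}
2 \cdot 1 - 1 \cdot 1 + 0 \cdot 2 \\
2 \cdot 0 - 1 \cdot 3 + 0 \cdot 1
\end{bmatrix}
=
\begin{bmatrix}
1 \\
-3
\end{bmatrix}.
\]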
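The row-by-row dot product rule can be checked against the example above with a few lines of Python. This is a minimal sketch, assuming a matrix is stored as a list of rows and a vector as a flat list of components; the function names `dot` and `mat_vec` are made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(A, x):
    """Matrix-vector product Ax.

    A is a list of rows; x is a flat list of components. Each entry of
    the result is the dot product of x with one row of A.
    """
    for row in A:
        if len(row) != len(x):
            raise ValueError(
                "number of columns in A must equal number of components in x"
            )
    return [dot(row, x) for row in A]


A = [[1, -1, 2],
     [0, -3, 1]]
x = [2, 1, 0]
print(mat_vec(A, x))  # [1, -3]
```

The result has one entry per row of A, which matches the rule that the number of rows in A determines the number of rows in the product.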

Matrix-matrix product

Since we view vectors as column matrices, the matrix-vector product is simply a special case of the matrix-matrix product (i.e., a product between two matrices). Just like for the matrix-vector product, the product AB between matrices A and B is defined only if the number of columns in A equals the number of rows in B. In math terms, we say we can multiply an m × n matrix A by an n × p matrix B. (If p happened to be 1, then B would be an n × 1 column vector and we’d be back to the matrix-vector product.)
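The size rule can be written down as a small check. The sketch below is only illustrative: `product_shape` is a hypothetical helper that takes shapes as (rows, columns) pairs and returns the shape of the product when it is defined.

```python
def product_shape(shape_a, shape_b):
    """Shape of AB for an m x n matrix A and an n x p matrix B.

    Raises ValueError when the number of columns in A differs from the
    number of rows in B, since the product is then undefined.
    """
    m, n = shape_a
    rows_b, p = shape_b
    if n != rows_b:
        raise ValueError(
            f"columns of A ({n}) must equal rows of B ({rows_b})"
        )
    return (m, p)


print(product_shape((2, 3), (3, 2)))  # (2, 2)
print(product_shape((2, 3), (3, 1)))  # (2, 1): p = 1 is the matrix-vector case
```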

The product AB is an m × p matrix which we’ll call C, i.e., AB = C. To calculate the product AB, we view B as a bunch of n × 1 column vectors lined up next to each other:

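\[
\begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1p} \\
b_{21} & b_{22} & \cdots & b_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
b_{n1} & b_{n2} & \cdots & b_{np}
\end{bmatrix}
=
\left[
\begin{bmatrix}
b_{11} \\
b_{21} \\
\vdots \\
b_{n1}
\end{bmatrix}
\begin{bmatrix}
b_{12} \\
b_{22} \\
\vdots \\
b_{n2}
\end{bmatrix}
\cdots
\begin{bmatrix}
b_{1p} \\
b_{2p} \\
\vdots \\
b_{np}
\end{bmatrix}
\right]
\]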
Then each column of C is the matrix-vector product of A with the respective column of B. In other words, the component in the ith row and jth column of C is the dot product between the ith row of A and the jth column of B. In math, we write this component of C as $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}$.
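The column-by-column view can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: matrices are stored as lists of rows, the names `mat_mul_by_columns` and `entry` are hypothetical, indices are 0-based (so `entry(A, B, 0, 0)` is the first-row, first-column component), and the 2 × 2 matrices at the end are made-up values.

```python
def mat_mul_by_columns(A, B):
    """Product AB, built one column at a time as A times each column of B."""
    n = len(B)  # number of rows in B
    if any(len(row) != n for row in A):
        raise ValueError("number of columns in A must equal number of rows in B")
    m, p = len(A), len(B[0])

    columns_of_c = []
    for j in range(p):
        b_j = [B[k][j] for k in range(n)]  # jth column of B
        # Matrix-vector product of A with that column.
        columns_of_c.append([sum(a * b for a, b in zip(row, b_j)) for row in A])

    # columns_of_c[j][i] holds the (i, j) component; rearrange into rows.
    return [[columns_of_c[j][i] for j in range(p)] for i in range(m)]


def entry(A, B, i, j):
    """Single component of AB: dot product of row i of A with column j of B."""
    return sum(A[i][k] * B[k][j] for k in range(len(B)))


A = [[1, 2],
     [3, 4]]  # illustrative values, not from the text
B = [[5, 6],
     [7, 8]]
C = mat_mul_by_columns(A, B)
print(C)  # [[19, 22], [43, 50]]
print(all(C[i][j] == entry(A, B, i, j) for i in range(2) for j in range(2)))  # True
```

Both descriptions give the same matrix: building C from columns and computing each component from the formula are two ways of organizing the same sums.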

I think an example makes the process clear. Let A be the 2 × 3 matrix

\[
A = \begin{bmatrix}
0 & 4 & -2 \\
-4 & -3 & 0
\end{bmatrix}
\]
and B be the 3 × 2 matrix
\[
B = \begin{bmatrix}
0 & 1 \\
1 & -1 \\
2 & 3
\end{bmatrix}.
\]
Then,
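\[
\begin{aligned}
AB &=
\begin{bmatrix}
0 & 4 & -2 \\
-4 & -3 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 1 \\
1 & -1 \\
2 & 3
\end{bmatrix} \\
&=
\begin{bmatrix}
0 \cdot 0 + 4 \cdot 1 - 2 \cdot 2 & 0 \cdot 1 + 4 \cdot (-1) - 2 \cdot 3 \\
-4 \cdot 0 - 3 \cdot 1 + 0 \cdot 2 & -4 \cdot 1 - 3 \cdot (-1) + 0 \cdot 3
\end{bmatrix} \\
&=
\begin{bmatrix}
0 + 4 - 4 & 0 - 4 - 6 \\
0 - 3 + 0 & -4 + 3 + 0
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & -10 \\
-3 & -1
\end{bmatrix}.
\end{aligned}
\]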
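The same calculation can be reproduced in Python by computing each component as the dot product of a row of A with a column of B. This is a minimal sketch assuming matrices are stored as lists of rows; `mat_mul` is an illustrative name, not a library function.

```python
def mat_mul(A, B):
    """Product AB, where each component is row i of A dotted with column j of B."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns in A must equal number of rows in B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(m)
    ]


A = [[0, 4, -2],
     [-4, -3, 0]]
B = [[0, 1],
     [1, -1],
     [2, 3]]
print(mat_mul(A, B))  # [[0, -10], [-3, -1]]
```

The 2 × 3 matrix times the 3 × 2 matrix gives a 2 × 2 result, as the size rule predicts.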
