More Computation on Arrays: Broadcasting

So far we have seen how to use NumPy’s ufuncs to operate on arrays that are of the same dimension. Another means of harnessing the vectorization thet ufuncs offer us is through broadcasting.

Broadcasting is a set of rules that allow us to apply binary ufuncs, like addition, subtraction and multiplications, on arrays of different sizes. This is a powerful concept that can be incredibly useful if understood, or can lead to a total mess or copious error messages when applied without a concrete understanding of what is happening.

Let’s begin by importing NumPy:

import numpy as np
np.random.seed(1234567890)

The Main Idea

In a previous notebook we used ufuncs to perform operations on arrays of the same size. For example

x = np.array([10,9,8])
y = np.array([5, 5, 5])
x + y
array([15, 14, 13])

Broadcasting allows these operations to be done on different sized arrays. For example, we can think of the array y as a scalar valued 5 and add it to an array as follows:

x + 5
array([15, 14, 13])

Notice how we get the exact same result. Think od what we just did as an operation that stretches the scalar value 5 into an array [5,5,5] and then adds the result to x.

Note: This description of stretching the array is not actually what NumPy does, think of it as a mental model to think about the concept of broadcasting.

Broadcasting in this way also works on all higher dimensional arrays. For example we can add a one-dimensional array to a two-dimensional one:

z = np.ones((3,3))
z
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
z + x
array([[11., 10.,  9.],
       [11., 10.,  9.],
       [11., 10.,  9.]])

In this context, broadcasting has ‘stretched’ the array x across the second dimension (the rows) to match the shape of z.

We can also broadcast across more than one array at a time:

a = np.arange(3)
b = np.arange(3)[:, np.newaxis]

print(a)
print(b)
[0 1 2]
[[0]
 [1]
 [2]]
a + b
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

Here, we stretched both arrays to match a common shape, resulting in a two dimensional array.

Rules of Broadcasting

Now we have seen Broadcasting in action, let’s consider the rules that NumPy us using to determine how it operates on two arrays:

  • Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

  • Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

  • Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.

Let’s look at these rules in more detail by way of some examples:

Example 1

matrix = np.ones((2, 3))
array = np.arange(3)

print(matrix, array)
[[1. 1. 1.]
 [1. 1. 1.]] [0 1 2]

Consider an operation on matrix and array. The shape of each is:

  • matrix.shape = (2, 3)

  • array.shape = (3,)

Rule 1 tells us that array has fewer dimensions, so we pad it on the left with ones:

  • matrix.shape -> (2, 3)

  • array.shape -> (1, 3)

Rule 2, tells us the first dimension disagrees, so we stretch itnsion to match:

  • matrix.shape -> (2, 3)

  • array.shape -> (2, 3)

Now the shapes match, and we see the output of a ufunc operation will be (2, 3):

matrix + array
array([[1., 2., 3.],
       [1., 2., 3.]])

Example 2

Consider again the case where both arrays need to be broadcast:

array_1 = np.arange(3).reshape((3, 1))
array_2 = np.arange(3)

Let’s look at what broadcasting is doing:

The shape of our arrays is:

  • array_1.shape = (3, 1)

  • array_2.shape = (3,)

Rule 1 pads array_2 with ones:

  • array_1.shape -> (3, 1)

  • array_2.shape -> (1, 3)

Rule 2 upgrades dimension of size 1 to match the corresponding size of the other array:

  • array_1.shape -> (3, 3)

  • array_2.shape -> (3, 3)

Now the shapes match, and an operation can be performed. The result is:

array_1 + array_2
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

Example 3

Finally, here’s an array where the shapes are not compatible:

matrix = np.ones((3, 2))
array  = np.arange(3)

The shape of the arrays are

  • matrix.shape = (3, 2)

  • array.shape = (3,)

Rule 1 pads the shape of array with ones:

  • matrix.shape -> (3, 2)

  • array.shape -> (1, 3)

Rule 2 stretches the first dimension of array to match that of matrix:

  • matrix.shape -> (3, 2)

  • array.shape -> (3, 3)

From Rule 3, the shapes do not match, so a ufunc can not be performed between these two arrays:

matrix + array
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_2582/2107702776.py in <module>
----> 1 matrix + array

ValueError: operands could not be broadcast together with shapes (3,2) (3,) 

Challenge

  1. In Example 3 we saw that matrix and array where not compatible. What would happen if we padded array’s shape on the right rather than the left?

    • Work throught the broadcasting logic, and if the result would be compatible compute np.logaddexp(a, b). Note that this function computes log(exp(a) + exp(b))

  2. Take the array data = np.arange(12).reshape(2,6) and use the principles of broadcasting to demean the columns.

  3. Again take data, and this time demean along the column dimension

  4. Convert data to have zero mean and unit standard deviation along the column dimension

Solution

a[:, np.newaxis].shape
(3, 1)
M + a[:, np.newaxis]
array([[ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.]])
np.logaddexp(M, a[:, np.newaxis])
array([[ 1.31326169,  1.31326169],
       [ 1.69314718,  1.69314718],
       [ 2.31326169,  2.31326169]])
# broadcasting
xs = np.arange(12).reshape(2,6)
print(xs, '\n')
print(xs * 10, '\n')

# broadcasting just works when doing column-wise operations
col_means = xs.mean(axis=0)
print(col_means, '\n')
print(xs + col_means, '\n')

# but needs a little more work for row-wise operations
row_means = xs.mean(axis=1)[:, np.newaxis]
print(row_means)
print(xs + row_means)

# convert matrix to have zero mean and unit standard deviation using col summary statistics
print((xs - xs.mean(axis=0))/xs.std(axis=0))