7.0 NumPy

We have already encountered the sqrt function, which turned out not to be available in the core Python programming language. In order to make this function available to our programs we had to add the statement:

In [76]:
import math

at the top of our code, and then prefix the square-root function with the word 'math', like this:

In [77]:
import math

math.sqrt(2)
Out[77]:
1.4142135623730951

As mentioned in one of the earlier lectures, 'math' is a module (or library) that contains various mathematical functions (e.g., trig functions and their inverses). Python comes with a large number of such modules, each of which is designed to provide users with specialised tools for performing certain tasks. These have to be imported into the workspace before you can use them (that's precisely what we did above).

An important module for mathematical programming is called numpy, which is the main subject of this unit.

In [78]:
import numpy

numpy.sqrt(2)     # functions need to be prefixed with 
                  # the names of the modules in which they reside
Out[78]:
1.4142135623730951

Above, we have imported 'numpy', and then used the square-root function from this module. Is this the same as the previous square-root function from the 'math' module? The answer is YES and NO. They clearly both give the same values for $\sqrt{2}$, but the version from 'numpy' has additional features, as you can see from the two examples included below:

In [79]:
import numpy

numpy.sqrt([4, 9, 16])  # can take the square root of all of these values in one go....
Out[79]:
array([ 2.,  3.,  4.])
In [80]:
import math

math.sqrt([4, 9, 16])  # Python complains because it can only calculate
                       # one square root at a time.....
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-80-1b16da715550> in <module>()
      1 import math
      2 
----> 3 math.sqrt([4, 9, 16])  # Python complains because it can only calculate
      4                        # one square root at a time.....

TypeError: a float is required
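
If you do want to stay with 'math.sqrt', one possible workaround is to call it once per value yourself, for instance with a list comprehension. The short sketch below compares the two approaches; the names 'values', 'roots_math' and 'roots_numpy' are made up for this illustration.

```python
import math
import numpy as np

values = [4, 9, 16]

# math.sqrt handles one number at a time, so we loop over the list ourselves
roots_math = [math.sqrt(v) for v in values]

# numpy.sqrt accepts the whole list and returns a numerical array
roots_numpy = np.sqrt(values)

print(roots_math)
print(roots_numpy)
```

The first result is an ordinary Python list, while the second is a numpy array.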

Here's an important rule to remember:

After you have imported a module, you can call its functions by giving the module name, a period, and then the name of the desired function.

There is another way to import functions. Note that this is NOT the recommended way of doing things.... If you want to access all the functions from a given module (say, 'numpy') you can follow the pattern:

In [81]:
from numpy import *

This statement will make available to your workspace all of the functions in the module 'numpy'. This gives you the freedom of using syntax like:

In [82]:
from numpy import *

sqrt(2)      # this is the SQRT function from 'numpy'
Out[82]:
1.4142135623730951

i.e., without using prefixes. This can get very confusing when you have two square-root functions in your workspace (remember the one from the 'math' module?).
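
The confusion can be made concrete with a small experiment. The sketch below imports 'sqrt' by name from both modules (rather than with the star syntax) and records which function the bare name refers to after each import; 'first' and 'second' are names invented for this example.

```python
import math
import numpy

from math import sqrt      # 'sqrt' now refers to math.sqrt
first = sqrt

from numpy import sqrt     # the same name is silently re-bound to numpy.sqrt
second = sqrt

print(first is math.sqrt)      # True
print(second is numpy.sqrt)    # True
```

Python gives no warning when the second import replaces the first, so the bare name 'sqrt' simply means whichever function was imported last.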

There is a middle ground: you can give the module you are importing any nickname you want (for 'numpy' the nickname 'np' is the usual convention):

In [89]:
import numpy as np      # 'np' becomes a nickname for 'numpy'

np.sqrt(2)
Out[89]:
1.4142135623730951

There may be times when you only need some very specific functions from a module. In that case you can use the following syntax:

In [90]:
from numpy import sqrt, exp

sqrt(3), exp(3)        # display tuple on the screen
Out[90]:
(1.7320508075688772, 20.085536923187668)

Finally, there is another useful way to handle the import statements. The (harder) example included below shows that you can give a custom nickname to any function you import:

In [91]:
# this also illustrates the concept of a module within a module....

from numpy.random import random as rng

rng(3)     # produces 3 random numbers between 0 and 1
Out[91]:
array([ 0.30520236,  0.72215841,  0.91768826])

In the above code, 'rng' is the name we chose for the random number generator from 'numpy'. Here, random is a module within the module numpy, and it is referred to as numpy.random. Within this module there is a function that is also called 'random', so its full name is numpy.random.random (phew!). Rather than use this, we opted for the shorter name 'rng' by using the syntax outlined above.
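
As a side note, more recent versions of NumPy (1.17 or later is assumed here) also offer a generator object created with np.random.default_rng. The sketch below is only an alternative to the import shown above, not a replacement for it; the names 'gen' and 'samples' and the seed value are arbitrary choices for this example.

```python
import numpy as np

# create a generator object; the seed (42) is arbitrary and only makes
# the sequence of numbers repeatable from one run to the next
gen = np.random.default_rng(seed=42)

samples = gen.random(3)    # 3 random numbers in the interval [0, 1)
print(samples)
```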

Since random numbers came up in this example, you may want to know that there are other ways to generate random numbers in Python. In particular, if you want to generate random integers within a specified range, you can do that by importing the module random (this is not the 'random' module from 'numpy'!) and using one of its functions, randint:

In [86]:
import random

print(random.randint(4,12))    # returns a random integer between 4 and 12 (both included)
11
In [88]:
# general syntax for 'randint':
#
#     random.randint(start, stop)
#
# start = an integer specifying the smallest value that can be returned
# stop  = an integer specifying the largest value that can be returned
#
# (the integer returned can be 'start' or 'stop' as they are both included in the specified range)
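
Be careful not to mix this function up with the one of the same name in numpy.random: there the upper limit is NOT included. The sketch below draws many integers with each function so the difference can be checked; the seeds, the sample size of 1000 and the names 'std_draws' and 'np_draws' are arbitrary choices for this illustration.

```python
import random
import numpy as np

random.seed(0)        # arbitrary seeds, so that the run is repeatable
np.random.seed(0)

# standard library: both 4 and 12 can be returned
std_draws = [random.randint(4, 12) for _ in range(1000)]

# numpy: 4 can be returned, but 12 cannot (the largest value is 11)
np_draws = np.random.randint(4, 12, size=1000)

print(min(std_draws), max(std_draws))
print(np_draws.min(), np_draws.max())
```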

7.1 Numerical (numpy) arrays

A Python list is NOT the same as a numerical array (although we can still use Python lists to do quite a bit of numerical work).

Arrays of this type are provided by the numpy module.

Here are some examples of such arrays:

In [17]:
import numpy as np

a = np.zeros(4)      # array with 4 elements/ all zeros
print('a=', a)

b = np.ones(6)      # array with 6 elements/all ones
print('b=', b)
a= [ 0.  0.  0.  0.]
b= [ 1.  1.  1.  1.  1.  1.]

In the first example, the function np.zeros requires one argument that describes the shape of the array; the integer $4$ in this case specifies an array with four elements; the function sets all the entries to zero.

What about arrays that have arbitrary numbers as their elements? How do you "fill" them up with numbers? Fortunately, it's very easy to perform such tasks. You create a list of the numbers you want to appear in your array, and then use the function np.array() as seen in the example included below:

In [25]:
a = [3.4, 6, 2.1, 7.2, 2.9]       # (standard) list of numbers
b = np.array(a)                   # b is now a numerical array

print(type(a))
print(type(b),'\n')

# This is how we access the elements of 'b':

for i, item in enumerate(b):
    print('b[',i,'] =',item)
<class 'list'>
<class 'numpy.ndarray'> 

b[ 0 ] = 3.4
b[ 1 ] = 6.0
b[ 2 ] = 2.1
b[ 3 ] = 7.2
b[ 4 ] = 2.9
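
One way to see that a list and a numerical array really are different objects is to apply the same arithmetic operators to both. The sketch below uses a shortened version of the list from the example above.

```python
import numpy as np

a = [3.4, 6, 2.1]      # (standard) list
b = np.array(a)        # numerical array with the same entries

print(a * 2)     # list: the elements are repeated
print(b * 2)     # array: every element is multiplied by 2

print(a + a)     # list: the two lists are joined together
print(b + b)     # array: the elements are added pairwise
```

For lists, '*' and '+' mean repetition and concatenation; for numerical arrays they act on the individual elements, which is usually what we want in mathematical work.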

We will often want to create numerical arrays of evenly spaced values over some range. NumPy comes with two useful built-in functions: np.linspace and np.arange:

In [92]:
# examples illustrating the 'np.arange' function....

import numpy as np

a = np.arange(1,10)
b = np.arange(5)
c = np.arange(2.4, 5.1, 0.25)

print('a =', a)
print('b =', b)
print('c =', c)
a = [1 2 3 4 5 6 7 8 9]
b = [0 1 2 3 4]
c = [ 2.4   2.65  2.9   3.15  3.4   3.65  3.9   4.15  4.4   4.65  4.9 ]
In [29]:
print(type(a))     # the "lists" displayed above are in fact numerical arrays....
<class 'numpy.ndarray'>

The np.arange syntax is

$$ {\tt{np.arange}}(start, end, increment) $$

The increment and the starting value ('start') are optional (but if you specify an increment, you must also give the starting value). The default start value is $0$; the default increment is $1$. The 'end' value is NOT included in the array. To find out more, type:

In [32]:
help(np.arange)     # on the command line, then press 'ENTER'
Help on built-in function arange in module numpy.core.multiarray:

arange(...)
    arange([start,] stop[, step,], dtype=None)
    
    Return evenly spaced values within a given interval.
    
    Values are generated within the half-open interval ``[start, stop)``
    (in other words, the interval including `start` but excluding `stop`).
    For integer arguments the function is equivalent to the Python built-in
    `range <http://docs.python.org/lib/built-in-funcs.html>`_ function,
    but returns an ndarray rather than a list.
    
    When using a non-integer step, such as 0.1, the results will often not
    be consistent.  It is better to use ``linspace`` for these cases.
    
    Parameters
    ----------
    start : number, optional
        Start of interval.  The interval includes this value.  The default
        start value is 0.
    stop : number
        End of interval.  The interval does not include this value, except
        in some cases where `step` is not an integer and floating point
        round-off affects the length of `out`.
    step : number, optional
        Spacing between values.  For any output `out`, this is the distance
        between two adjacent values, ``out[i+1] - out[i]``.  The default
        step size is 1.  If `step` is specified, `start` must also be given.
    dtype : dtype
        The type of the output array.  If `dtype` is not given, infer the data
        type from the other input arguments.
    
    Returns
    -------
    arange : ndarray
        Array of evenly spaced values.
    
        For floating point arguments, the length of the result is
        ``ceil((stop - start)/step)``.  Because of floating point overflow,
        this rule may result in the last element of `out` being greater
        than `stop`.
    
    See Also
    --------
    linspace : Evenly spaced numbers with careful handling of endpoints.
    ogrid: Arrays of evenly spaced numbers in N-dimensions.
    mgrid: Grid-shaped arrays of evenly spaced numbers in N-dimensions.
    
    Examples
    --------
    >>> np.arange(3)
    array([0, 1, 2])
    >>> np.arange(3.0)
    array([ 0.,  1.,  2.])
    >>> np.arange(3,7)
    array([3, 4, 5, 6])
    >>> np.arange(3,7,2)
    array([3, 5])

The other function mentioned above is np.linspace. The syntax for this is:

$$ {\tt{np.linspace}}(A,\, B,\, N) $$

This creates a one-dimensional array with exactly $N$ evenly spaced entries. The last value (i.e., $B$) is included in the array. The disadvantage is that you don't get to specify the spacing between consecutive elements of such arrays (however, you can still work that out from the information supplied to the function: $A$, $B$ and $N$).
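
Since both end values are included, the spacing works out to $(B-A)/(N-1)$. The sketch below checks this for one illustrative choice of $A$, $B$ and $N$; it also uses the optional argument retstep=True, which asks np.linspace to return the spacing together with the array.

```python
import numpy as np

A, B, N = 0.0, 1.0, 5      # illustrative values

# with retstep=True the function returns the array AND the spacing
x, step = np.linspace(A, B, N, retstep=True)

print(x)                     # [0.   0.25 0.5  0.75 1.  ]
print(step)                  # 0.25
print((B - A) / (N - 1))     # 0.25, the same spacing worked out by hand
```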

Some advice:

When creating a sequence of $x$-values to compute the values of a function over a particular range, np.linspace is the appropriate choice, as it allows you to explicitly choose the start and end of the range, as well as the number of points in your $x$-sequence.

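However, when the EXACT spacing between the points is important, np.arange is the more direct choice, because you specify the spacing yourself. Let's say you need a sequence of $x$-values between $0$ and $1$, which are a certain distance apart from each other, (say) $dx$. Since the end value is excluded, you can extend the range by one step so that the intended endpoint is kept. Bear in mind the warning in the help text above: with a non-integer step, round-off can occasionally change the number of points, so it is worth checking the last value. You can write something like this: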

In [34]:
# this example shows how you can still specify the endpoint
# of your sequence of values by using 'np.arange'

import numpy as np

xstart, xend = 0, 1
dx = 0.1                                  # for illustrative purposes
xvals = np.arange(xstart, xend+dx, dx)

print(xvals)               # note that these values are equidistant
[ 0.   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1. ]
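
If you would rather not rely on how round-off treats the end value, one possible alternative is to work out the number of points from $dx$ and pass it to np.linspace. This is only a sketch, and it assumes that the length of the range is a whole number of steps; the name 'n_points' is made up for this example.

```python
import numpy as np

xstart, xend = 0, 1
dx = 0.1

# number of points = number of steps + 1
# (assumes that xend - xstart is a whole multiple of dx)
n_points = int(round((xend - xstart) / dx)) + 1

xvals = np.linspace(xstart, xend, n_points)

print(n_points)      # 11
print(xvals)
```

With this approach the first and last values are exactly 'xstart' and 'xend'.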

7.2 Two-dimensional (numpy) arrays

The arrays we have encountered so far were one-dimensional; they are sometimes referred to as linear arrays. In Mathematics we often need two-dimensional arrays, also known as rectangular arrays.

In [38]:
# Let's try something new....

u = np.zeros((3,2))           # the parameter passed to 
                              # the function is a tuple

print(u)
[[ 0.  0.]
 [ 0.  0.]
 [ 0.  0.]]

The result is an array, whose entries are themselves arrays. The "outer" array has 3 elements, while the "inner" arrays have only 2 elements. Clearly, these numbers are related to the tuple $(3,2)$ that was passed to the function 'np.zeros'. The elements of this array are arranged in rows (3) and columns (2).

The variable 'u' now refers to an ndarray (an 'n-dimensional NumPy array'). In Mathematics this would correspond to a $3\times 2$ matrix.
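
The number of rows and columns can be read back from the array itself through its 'shape' attribute. The sketch below rebuilds the array from the previous example; 'n_rows' and 'n_cols' are names chosen for this illustration.

```python
import numpy as np

u = np.zeros((3, 2))

print(u.shape)     # (3, 2), i.e. (rows, columns)
print(u.ndim)      # 2, the number of dimensions
print(u.size)      # 6, the total number of elements

# the tuple can be unpacked in the usual way:
n_rows, n_cols = u.shape
print(n_rows, n_cols)
```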

Next, let's see how we can create one of these 'ndarrays' with non-zero elements:

In [43]:
import numpy as np

v = np.array([[2, 3, 4],[5, 6, 7]])
print(v)
[[2 3 4]
 [5 6 7]]

If you want to access the elements of this ndarray, just use the fact that it can be indexed like a list of lists. To access the first and the second rows we write

In [93]:
import numpy as np

v = np.array([[2, 3, 4],[5, 6, 7]])

# new code starts here....:

print('First row:', v[0])
print('Second row:', v[1])

# ... while for accessing individual elements:
print(30*'-')
print('\n2nd element in the 1st row:', v[0][1])
print('\n3rd element in the 2nd row:', v[1][2],'\n')


# these commands are equivalent to the ones immediately above:
print(30*'-')
print('\n2nd element in the 1st row:', v[0, 1])
print('\n3rd element in the 2nd row:', v[1, 2],'\n')
First row: [2 3 4]
Second row: [5 6 7]
------------------------------

2nd element in the 1st row: 3

3rd element in the 2nd row: 7 

------------------------------

2nd element in the 1st row: 3

3rd element in the 2nd row: 7 

Both

$$ A[i][k]\qquad\mbox{and}\qquad A[i,\,k] $$

return the entry at the intersection of row $(i+1)$ and column $(k+1)$. This is best understood by thinking of $A[i, k]$ as the entry $i$ steps down and $k$ steps to the right from the upper-left corner of the array.

7.3 Slicing for ndarrays

Slicing works pretty much the same way as for lists. Here are some examples:

In [57]:
# create an array that has 3 columns and 3 rows:
a = np.array([[1,2,3],[4,5,6],[7,8,9]])

# display the array:
print(a,'\n')

# display the last column (index=2):
print(a[:,2],'\n')

# display the last row (index=2):
print(a[2,:])
[[1 2 3]
 [4 5 6]
 [7 8 9]] 

[3 6 9] 

[7 8 9]

A colon by itself represents every allowed value of an index (i.e., start=0, end=the number of rows or columns, stride=1).

a[0,:] = first row
a[1,:] = second row, etc

a[:,0] = first column
a[:,1] = second column, etc
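
To check what a bare colon stands for, the sketch below writes the slice out in full for the $3\times 3$ array used above and compares the results. The names 'n_rows', 'short_form' and 'long_form' are made up for this example.

```python
import numpy as np

a = np.array([[1,2,3],[4,5,6],[7,8,9]])
n_rows = a.shape[0]                  # number of rows (3 here)

short_form = a[:, 2]                 # last column, using a bare colon
long_form = a[0:n_rows:1, 2]         # the same slice written out in full

print(short_form, long_form)
print(np.array_equal(short_form, long_form))    # True

# an end value of -1 is NOT the same thing: the last row is left out
print(a[0:-1, 2])                    # [3 6]
```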

In [68]:
# More examples:

import numpy as np

# the array below has 4 rows and 6 columns:
m = np.array([[0, 1, 2, 3, 4, 5], 
              [6, 7, 8, 9, 10, 11],
              [12, 13, 14, 15, 16, 17],
              [18, 19, 20, 21, 22, 23]])

# display the full array:
print(30*'*', '\n',m,'\n')

# display a sub-array (starting from column 3, row 3)
print(30*'*', '\n', m[2:, 2:],'\n')

# display every other element starting from the second row/columns
print(30*'*','\n', m[1::2, 1::2],'\n')

#display every other element in the array:
print(30*'*', '\n', m[::2, ::2])
****************************** 
 [[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]] 

****************************** 
 [[14 15 16 17]
 [20 21 22 23]] 

****************************** 
 [[ 7  9 11]
 [19 21 23]] 

****************************** 
 [[ 0  2  4]
 [12 14 16]]
In [ ]:
# Explore further by using the following commands:

# m[::, ::]
# m[::, 1]
# m[1::2, 1:4]
# m[::, 3]
# m[0:2, 0:2]

A common task when working with two-dimensional arrays is traversing them. This means accessing each element stored in the array so that it can be processed in some way (e.g., maybe you want to check if the element is zero or has some other property). Traversing a two-dimensional array is most commonly done by using a double for-loop. Here is an example:

In [8]:
# in this example we initialise an array ('a') with zero elements
# and then assign each element the value 'i+j', where 'i' and 'j' are 
# the indexes used for locating a[i,j]

import numpy as np

a = np.zeros((3,3))        # initialise array
# traverse the array to change the values of its elements:
for i in range(3):
    for j in range(3):
        a[i, j] = i + j    # assign each element i+j

print(a)

# extract columns 1, 2, and 3:
col1, col2, col3 = a[:,0], a[:,1], a[:,2]       # slicing
print('\ncol 1:', col1)
print('\ncol 2:', col2)
print('\ncol 3:', col3)

# find the sum of the elements in the 1st column:
sum1 = 0.0
for i in range(len(col1)):
    sum1 +=  col1[i]
    
    
# find the sum of the elements in the 2nd column:
sum2= 0.0
for i in range(len(col2)):
    sum2+=  col2[i]
 
print('')
print('Sum of elements, column 1:', sum1)
print('Sum of elements, column 2:', sum2)
[[ 0.  1.  2.]
 [ 1.  2.  3.]
 [ 2.  3.  4.]]

col 1: [ 0.  1.  2.]

col 2: [ 1.  2.  3.]

col 3: [ 2.  3.  4.]

Sum of elements, column 1: 3.0
Sum of elements, column 2: 6.0
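
The loops above are useful for seeing how traversal works, but numerical arrays can also add up their own elements through the 'sum' method. The sketch below rebuilds the same array and computes the column sums in one step; 'column_sums' is a name chosen for this illustration.

```python
import numpy as np

a = np.zeros((3,3))
for i in range(3):
    for j in range(3):
        a[i, j] = i + j

# axis=0 adds up the entries down each column
column_sums = a.sum(axis=0)
print(column_sums)        # sums for columns 1, 2 and 3

# the sum of a single column, obtained by slicing first:
print(a[:, 0].sum())      # 3.0
```

The first two entries of 'column_sums' agree with the values 3.0 and 6.0 found with the for-loops.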

Finally, here are some other things you can't do with normal lists. We are not going to use advanced features of numerical arrays, but you might want to be aware of some of them. For example, let's say you have a numerical array and you want to extract the values that are greater than some particular number (say, 2).

In [94]:
import numpy as np

# create a numerical array:
w = np.array([23, -1.0, -5.0, 8.0, 16.0])

# extract the values greater than 2 and display them on the screen:
print(w[w>2])


# let's say you want to replace those values with 0:
w[w>2] = 0

# Check to convince yourself that this did happen:
print(w)
[ 23.   8.  16.]
[ 0. -1. -5.  0.  0.]
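
It may help to look at the expression w>2 on its own: it is an array of True/False values, one for each element of 'w'. The sketch below displays it, and also uses np.where to build a NEW array with the replaced values, leaving 'w' untouched (unlike the assignment above). The names 'mask' and 'w_new' are made up for this example.

```python
import numpy as np

w = np.array([23, -1.0, -5.0, 8.0, 16.0])

mask = w > 2        # True where the element is greater than 2
print(mask)         # [ True False False  True  True]

# take 0 where the mask is True, and the original value elsewhere
w_new = np.where(mask, 0, w)

print(w_new)        # the replaced values are in the new array
print(w)            # the original array has not changed
```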

This, and much more, can be found on the official webpage of the NumPy module:

https://numpy.org/

Lots of useful tips can be found in this tutorial:

https://numpy.org/doc/stable/user/absolute_beginners.html