Exploring NumPy
Bhaskar S | 12/23/2015
Introduction
NumPy is a general-purpose, high-performance numerical and scientific computing extension module for Python. At the core of NumPy is support for large n-dimensional arrays and matrices, which enable efficient mathematical operations for Linear Algebra and Statistics.
The main features of NumPy multi-dimensional arrays (matrices) can be summarized as follows:
Fixed size at creation time
Homogeneous data type for all elements
Support for various mathematical operations
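The three features above can be seen in a short sketch. The variable names mixed and doubled are illustrative only and do not appear elsewhere in this article.

```python
import numpy as np

# Homogeneous data type: mixing integers and a float yields one common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64

# Fixed size: the shape is determined when the array is created
print(mixed.shape)    # (3,)

# Mathematical operations apply to every element at once
doubled = mixed * 2
print(doubled)        # [2. 5. 6.]
```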
Installation and Setup
To make it easy and simple, we choose the open-source Anaconda Python distribution, which includes all the necessary Python packages for science, math, & engineering computations as well as statistical data analysis.
Download the Python 3 version of the Anaconda distribution.
Extract the downloaded archive to a directory, say, /home/abc/anaconda3.
Finally, update the PATH environment variable to include /home/abc/anaconda3/bin.
Hands-on with NumPy
Let us jump right in to get our hands dirty with NumPy. Open a terminal window and fire off the IPython Notebook.
To begin using NumPy, one must import the numpy module as shown below:
import numpy as np
To create a simple, fixed-size NumPy one-dimensional array called a, use the array method as shown below:
a = np.array([5, 10, 15, 20, 25, 30, 35, 40, 45, 50])
The above creates a one-dimensional NumPy array with 10 integer elements.
To access the 6th integer from the above one-dimensional NumPy array called a, use the syntax as shown below:
a[5]
Array indices in NumPy start at 0 (the first element).
To access all the elements 4th through 8th (indices 3 through 7) from the above one-dimensional NumPy array called a, use the syntax as shown below:
a[3:8]
To access all the elements 1st through 5th in steps of 2 from the above one-dimensional NumPy array called a, use the syntax as shown below:
a[:6:2]
The general syntax to access the elements from a one-dimensional NumPy array called X is as shown below:
X[start:end:step]
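where start is the starting index (inclusive), end is one past the last index we want (the element at end itself is excluded), and step is the increment added to the index to reach the next element. When omitted (for a positive step), start defaults to 0, end to the length of the array, and step to 1.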
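As a quick check of the indexing and slicing rules, the following sketch reuses the array a from this section. The names sixth, middle, and every_other are illustrative only.

```python
import numpy as np

a = np.array([5, 10, 15, 20, 25, 30, 35, 40, 45, 50])

sixth = a[5]            # single element at index 5
middle = a[3:8]         # indices 3, 4, 5, 6, 7 (index 8 is excluded)
every_other = a[:6:2]   # indices 0, 2, 4

print(sixth)            # 30
print(middle)           # [20 25 30 35 40]
print(every_other)      # [ 5 15 25]
```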
To get information about the data type of elements from the above one-dimensional NumPy array called a, access the attribute dtype as shown below:
a.dtype
On most 64-bit Linux and macOS systems it returns int64, indicating a 64-bit integer; the default integer type can differ by platform and NumPy version.
To create a one-dimensional NumPy array of nine 64-bit floating point numbers called b, use the array method as shown below:
b = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=np.float64)
To get information about the number of elements from the above one-dimensional NumPy array called b, access the attribute size as shown below:
b.size
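A minimal sketch of the dtype and size attributes, using the arrays a and b defined above. The integer type of a is left unasserted in the comments because it depends on the platform.

```python
import numpy as np

a = np.array([5, 10, 15, 20, 25, 30, 35, 40, 45, 50])
b = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=np.float64)

print(a.dtype)   # the platform's default integer type (commonly int64)
print(b.dtype)   # float64
print(b.size)    # 9
print(b[0])      # 10.0 (the integer 10 was converted to a float)
```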
To create a two-dimensional NumPy array of integers containing 2 rows and 5 columns called c, use the array method as shown below:
c = np.array([[5, 10, 15, 20, 25], [10, 20, 30, 40, 50]])
To get information about the number of dimensions from the above two-dimensional NumPy array called c, access the attribute ndim as shown below:
c.ndim
It would return a value of 2 for the two-dimensional array.
To get information on the shape (number of rows and columns) from the above two-dimensional NumPy array called c, access the attribute shape as shown below:
c.shape
It would return a tuple value of (2, 5) indicating 2 rows and 5 columns.
To access the integer at 2nd row and 3rd column from the above two-dimensional NumPy array called c, use the syntax as shown below:
c[1, 2]
Row and column indices in NumPy start at 0 (1st row or 1st column).
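The attributes and the two-index access described above can be tried together, as in this sketch built on the array c from this section.

```python
import numpy as np

c = np.array([[5, 10, 15, 20, 25], [10, 20, 30, 40, 50]])

print(c.ndim)    # 2
print(c.shape)   # (2, 5)
print(c[1, 2])   # 30 (2nd row, 3rd column)
```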
One cannot change the size of a NumPy array once created, but one is allowed to reshape (or re-organize) the dimensions of the NumPy array, provided the total number of elements stays the same. For example, one can change a two-dimensional array from 2 rows by 5 columns to a two-dimensional array of 5 rows by 2 columns.
To change the two-dimensional NumPy array called c from 2 rows by 5 columns to a two-dimensional NumPy array with 5 rows and 2 columns called d, use the reshape method as shown below:
d = c.reshape(5, 2)
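The sketch below shows the effect of reshape on c and what happens when the requested shape does not match the number of elements. The flag reshape_failed is a hypothetical name used only for this illustration.

```python
import numpy as np

c = np.array([[5, 10, 15, 20, 25], [10, 20, 30, 40, 50]])
d = c.reshape(5, 2)

print(d.shape)            # (5, 2)
print(c.size == d.size)   # True: reshaping never changes the element count
print(d[2])               # [25 10] (elements are refilled row by row)

# A shape with a different element count (3 * 4 = 12, not 10) is rejected
try:
    c.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)     # True
```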
To create a fixed-size, NumPy one-dimensional array called e that is evenly spaced within a given interval of integers, use the arange method as shown below:
e = np.arange(10)
The above creates a one-dimensional NumPy array with 10 integers in the interval range of 0 through 9.
To create a fixed-size, NumPy one-dimensional array called f that contains floating point numbers from 10.0 through 90.0 in increments of 10.0, use the arange method as shown below:
f = np.arange(10, 100, 10, dtype=np.float64)
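To see what the two arange calls above produce, consider this short sketch. Note that the stop value (10 in the first call, 100 in the second) is not included in the result.

```python
import numpy as np

e = np.arange(10)
f = np.arange(10, 100, 10, dtype=np.float64)

print(e)         # [0 1 2 3 4 5 6 7 8 9]
print(f)         # [10. 20. 30. 40. 50. 60. 70. 80. 90.]
print(f.dtype)   # float64
```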
To create a fixed-size, NumPy one-dimensional array called g that contains five zeros, use the zeros method as shown below:
g = np.zeros(5)
The default data type used by the zeros method is 64-bit floating point number.
To create a fixed-size, NumPy two-dimensional array called h that contains zeros as 64-bit integers arranged in 2 rows by 5 columns, use the zeros method as shown below:
h = np.zeros((2, 5), dtype=np.int64)
To create a fixed-size, NumPy two-dimensional array called i that contains ones arranged in 3 rows by 3 columns, use the ones method as shown below:
i = np.ones((3, 3))
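The arrays g, h, and i from above can be inspected as in the following sketch, which confirms the shapes and data types described in the text.

```python
import numpy as np

g = np.zeros(5)
h = np.zeros((2, 5), dtype=np.int64)
i = np.ones((3, 3))

print(g)                   # [0. 0. 0. 0. 0.]
print(g.dtype)             # float64 (the default)
print(h.dtype, h.shape)    # int64 (2, 5)
print(i.dtype, i.shape)    # float64 (3, 3)
```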
Let us create a two-dimensional NumPy array called j that contains numbers 1 through 16 arranged in 4 rows by 4 columns using the arange method as shown below:
j = np.arange(1, 17, 1).reshape(4, 4)
One can use indexing and slicing on a two-dimensional NumPy array to access one or more element(s).
To access the entire 3rd row from the above two-dimensional NumPy array called j, use the syntax as shown below:
j[2]
To access all the columns except the first in the 3rd row from the above two-dimensional NumPy array called j, use the syntax as shown below:
j[2,1:4]
To access the 2nd and 3rd columns in the first three rows from the above two-dimensional NumPy array called j, use the syntax as shown below:
j[:3, 1:3]
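The three access patterns above can be verified with this sketch. The names third_row, tail, and block are illustrative only.

```python
import numpy as np

j = np.arange(1, 17, 1).reshape(4, 4)

third_row = j[2]        # entire 3rd row
tail = j[2, 1:4]        # 3rd row, all columns except the first
block = j[:3, 1:3]      # first three rows, 2nd and 3rd columns

print(third_row)        # [ 9 10 11 12]
print(tail)             # [10 11 12]
print(block.tolist())   # [[2, 3], [6, 7], [10, 11]]
```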
To create a flat one-dimensional NumPy array from the above two-dimensional NumPy array called j, use the ravel method as shown below:
j.ravel()
To obtain a two-dimensional NumPy array with the rows and columns of the above two-dimensional NumPy array called j transposed, use the transpose method as shown below:
j.transpose()
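The following sketch shows the results of ravel and transpose on j. The names flat and t are illustrative only.

```python
import numpy as np

j = np.arange(1, 17, 1).reshape(4, 4)

flat = j.ravel()        # one-dimensional, 16 elements
t = j.transpose()       # rows and columns swapped

print(flat.shape)            # (16,)
print(flat[:5])              # [1 2 3 4 5]
print(t[0])                  # [ 1  5  9 13] (the first column of j)
print(t[1, 0] == j[0, 1])    # True
```

Both calls return views of the original data where possible rather than copies, so writing into flat or t can also change j.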
To create a fixed-size, NumPy two-dimensional array called k that contains random 64-bit floating point numbers arranged in 3 rows by 3 columns, use the random.randn method as shown below:
k = np.random.randn(3, 3)
The random 64-bit floating point numbers are sampled from a univariate normal (Gaussian) distribution of mean 0 and variance 1.
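A rough way to check the distribution is to draw a larger sample and look at its mean and standard deviation. In this sketch the fixed seed (42), the sample size, and the names sample, sample_mean, and sample_std are assumptions made for illustration; they are not part of the original example.

```python
import numpy as np

np.random.seed(42)             # assumption: fixed seed so runs are repeatable
k = np.random.randn(3, 3)
print(k.shape)                 # (3, 3)
print(k.dtype)                 # float64

# With many samples, the mean should be near 0 and the std near 1
sample = np.random.randn(100000)
sample_mean = sample.mean()
sample_std = sample.std()
print(abs(sample_mean) < 0.05)       # True
print(abs(sample_std - 1) < 0.05)    # True
```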
Let us create 2 two-dimensional NumPy arrays called l and m arranged in 4 rows by 4 columns using the arange method as shown below:
l = np.arange(100, 116).reshape(4, 4)
m = np.arange(150, 166).reshape(4, 4)
One can perform the basic arithmetic operations of addition, subtraction, division, and multiplication on the above 2 two-dimensional NumPy arrays called l and m. Each operation is applied element by element (so m * l is not a matrix product), as shown below:
m + l
m - l
m / l
m * l
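To make the element-wise behavior concrete, this sketch stores each result and prints its top-left element. The names total, diff, ratio, and prod are illustrative only.

```python
import numpy as np

l = np.arange(100, 116).reshape(4, 4)
m = np.arange(150, 166).reshape(4, 4)

total = m + l
diff = m - l
ratio = m / l
prod = m * l

print(total[0, 0])   # 250   (150 + 100)
print(diff[0, 0])    # 50    (150 - 100)
print(ratio[0, 0])   # 1.5   (150 / 100)
print(prod[0, 0])    # 15000 (150 * 100)
print(ratio.dtype)   # float64: dividing integer arrays yields floats
```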
One can also perform mathematical operations such as finding the sum, the minimum value, the maximum value, the mean, the variance, and the standard deviation on the above two-dimensional NumPy array m as shown below:
m.sum()
m.min()
m.max()
m.mean()
m.var()
m.std()
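For the array m (the integers 150 through 165), the results can be worked out by hand and compared, as in this sketch. By default, var and std compute the population variance and standard deviation (dividing by the number of elements).

```python
import numpy as np

m = np.arange(150, 166).reshape(4, 4)

print(m.sum())    # 2520
print(m.min())    # 150
print(m.max())    # 165
print(m.mean())   # 157.5
print(m.var())    # 21.25
print(m.std())    # about 4.61 (the square root of 21.25)
```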