NumPy

This page provides an introduction to NumPy. These are my notes from a YouTube video.

Overview

NumPy is a powerful Python library for numerical computing, widely used for handling large, multi-dimensional arrays and matrices. To start using NumPy, install it and import it as shown below:

# install numpy 
pip3 install numpy

# install matplotlib - required later
pip3 install matplotlib

# start the Python interpreter, then import numpy & matplotlib
python3
import numpy as np
import matplotlib.pyplot as plt

Array

# int array
a = np.array([1, 2, 3, 4])

type(a)
# o/p: <class 'numpy.ndarray'>

a.dtype
# o/p: dtype('int64')

# float array 
b =  np.array([1.2, 3.4, 5.6, 7.8])
b.dtype
# o/p: dtype('float64')

# access a value 
a[0]
# o/p: np.int64(1)

# override a value
a[0] = 10
# o/p: array([10,  2,  3,  4])

# assigning float value in integer array
a[0] = 11.5
# o/p: array([11,  2,  3,  4])

# dimension
a.ndim
# o/p: 1

# shape - shows number of elements along each dimension
a.shape
# o/p: (4,)

# size = number of elements in the array
a.size
# o/p: 4

info

All elements of a NumPy array have the same type (its dtype). If you overwrite a value in an integer array with a float, the decimal part is truncated.
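The truncation described above can be reproduced in a few lines, together with one possible way around it: converting the array to a float dtype with astype before assigning. This is a minimal sketch; the variable names ints and floats are illustrative and not from the video.

```python
import numpy as np

ints = np.array([1, 2, 3, 4])      # integer dtype is inferred from the values
ints[0] = 11.5                     # the fractional part is dropped on assignment
print(ints)                        # [11  2  3  4]

floats = ints.astype(np.float64)   # astype returns a new array with the requested dtype
floats[0] = 11.5                   # a float array stores the value unchanged
print(floats[0], floats.dtype)     # 11.5 float64
```

Note that astype makes a copy, so ints keeps its integer values.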

Vectorized Operation

# add 
a + b
# o/p: array([12.2,  5.4,  8.6, 11.8])


# div
a / b
# o/p: array([9.16666667, 0.58823529, 0.53571429, 0.51282051])


# multiply
a * b
# o/p: array([13.2,  6.8, 16.8, 31.2])


# pow
a ** b
# o/p: array([1.77693369e+01, 1.05560633e+01, 4.69763237e+02, 4.96670005e+04])


# adding constant 
a + 10
# o/p: array([21, 12, 13, 14])
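Each operation above is applied element by element without an explicit Python loop. The sketch below compares a vectorized addition with the equivalent loop; the names x, y, looped and vectorized are illustrative, and the values mirror the arrays used above.

```python
import numpy as np

x = np.array([11, 2, 3, 4])
y = np.array([1.2, 3.4, 5.6, 7.8])

# explicit element-by-element loop
looped = np.array([xi + yi for xi, yi in zip(x, y)])

# vectorized form: the loop happens inside NumPy
vectorized = x + y

print(np.allclose(looped, vectorized))  # True
```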

Universal Functions

np.sin(a)
# o/p: array([-0.99999021,  0.90929743,  0.14112001, -0.7568025 ])

2D Array

# defining a 2d array
a_2d_small = np.array([
  [1, 2, 3, 4],
  [5, 6, 7, 8]
])
# o/p:
# array([[1, 2, 3, 4],
#        [5, 6, 7, 8]])



# creates an array of the integers 0 to 24 and then reshapes it into a 5x5 array
a_2d = np.arange(25).reshape(5, 5)
# o/p:
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24]])



# shape 
a_2d.shape
# o/p: (5, 5)



# size 
a_2d.size
# o/p: 25



# ndim
a_2d.ndim
# o/p: 2



# set
a_2d[1, 3] = -1
# o/p:
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7, -1,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24]])



a_2d[1, 3] = 8
# o/p:
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24]])



# get 
print(a_2d[1, 3])
# o/p: 8

Slicing

Slicing extracts the portion of an array between a lower bound (inclusive) and an upper bound (exclusive), advancing by step elements each time.

a_2d[1]
# o/p: array([5, 6, 7, 8, 9])


# [lower: upper: step] 
a_2d[0:2:1]
# o/p:
# array([[0, 1, 2, 3, 4],
#        [5, 6, 7, 8, 9]])
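The lower, upper and step parts of a slice are easier to see on a 1D array. The following is a small sketch with an illustrative array v; any part that is left out falls back to its default (start of the array, end of the array, step 1).

```python
import numpy as np

v = np.arange(10)     # integers 0 through 9

print(v[2:8:2])       # [2 4 6]  -> from index 2 up to (not including) 8, every 2nd element
print(v[:3])          # [0 1 2]  -> lower bound omitted, starts at 0
print(v[7:])          # [7 8 9]  -> upper bound omitted, runs to the end
print(v[::-1])        # a negative step walks the array backwards
```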

Indexing and Slicing

Indexing and slicing can be combined: give an index for one dimension and a slice for another.

# select row 0, then take the slice 1:3 of its columns
a_2d[0, 1:3]
# o/p: array([1, 2])



# slice along each dimension:
# from the row dimension take everything from row 0 onward, and
# from the column dimension take everything from column 1 onward
a_2d[0:, 1:] 
# o/p: 
# array([[ 1,  2,  3,  4],
#        [ 6,  7,  8,  9],
#        [11, 12, 13, 14],
#        [16, 17, 18, 19],
#        [21, 22, 23, 24]])




# from the row dimension take everything from row 0 onward, and
# from the column dimension take only column 2
a_2d[0:, 2] 
# o/p: array([ 2,  7, 12, 17, 22])



# A negative upper bound counts from the end, so it throws away trailing elements.
# For example, a_2d[0:-1, ] selects all the rows except the last row.
a_2d[0:-1, ]
# o/p: 
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19]])

info

Changing a slice of an array by assigning new values updates the original array, as the modification affects the same memory location.
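The shared-memory behaviour can be checked with np.shares_memory. The sketch below uses illustrative names (grid, row_view, row_copy) and shows that writing through a slice changes the original, while an explicit copy does not.

```python
import numpy as np

grid = np.arange(25).reshape(5, 5)

row_view = grid[1]          # basic indexing/slicing returns a view, not a copy
row_view[:] = 0             # writes through to grid
print(grid[1])              # [0 0 0 0 0]
print(np.shares_memory(grid, row_view))   # True

row_copy = grid[2].copy()   # .copy() allocates independent memory
row_copy[:] = -1            # grid is unaffected
print(grid[2])              # [10 11 12 13 14]
```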

Blurring an Image

Let's say we have an image as below where each square represents a pixel in an image.

To blur an image, we essentially merge neighboring pixels to create a lower-resolution image. This process involves taking the average color of adjacent pixels to form a single pixel. For instance, all pixels marked with a green dot are combined by averaging their colors to create one pixel.

In the example below, we apply this technique to downscale a 4x4 image to a 2x2 image, reducing the pixel count while maintaining an averaged representation of the original image.

First, we group the top pixels of all four pixel groups (green, pink, blue and yellow).

Similarly, we group the left, right, bottom and center pixels of all four pixel groups and calculate the average. Below are the slicing formulas for each pixel group, shown with the values they select from the 5x5 array img defined in the code further down:

Slicing formula for top pixels: $\text{img}[:-2, 1:-1]$

Rows: start from row $0$ and throw away the last $2$ rows.
Columns: start from column $1$ and throw away the last column.

$$\begin{bmatrix} 1 & 2 & 3 \\ 6 & 7 & 8 \\ 11 & 12 & 13 \end{bmatrix}$$

Slicing formula for left pixels: $\text{img}[1:-1, :-2]$

Rows: start from row $1$ and throw away the last row.
Columns: start from column $0$ and throw away the last $2$ columns.

$$\begin{bmatrix} 5 & 6 & 7 \\ 10 & 11 & 12 \\ 15 & 16 & 17 \end{bmatrix}$$

Slicing formula for right pixels: $\text{img}[1:-1, 2:]$

Rows: start from row $1$ and throw away the last row.
Columns: start from column $2$ and select all remaining columns.

$$\begin{bmatrix} 7 & 8 & 9 \\ 12 & 13 & 14 \\ 17 & 18 & 19 \end{bmatrix}$$

Slicing formula for bottom pixels: $\text{img}[2:, 1:-1]$

Rows: start from row $2$ and select all remaining rows.
Columns: start from column $1$ and throw away the last column.

$$\begin{bmatrix} 11 & 12 & 13 \\ 16 & 17 & 18 \\ 21 & 22 & 23 \end{bmatrix}$$

Slicing formula for center pixels: $\text{img}[1:-1, 1:-1]$

Rows: start from row $1$ and throw away the last row.
Columns: start from column $1$ and throw away the last column.

$$\begin{bmatrix} 6 & 7 & 8 \\ 11 & 12 & 13 \\ 16 & 17 & 18 \end{bmatrix}$$

img = np.arange(25).reshape(5, 5)
# o/p: 
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24]])



blurred_img = (
  img[:-2, 1:-1] + # top
  img[1:-1, :-2] + # left
  img[1:-1, 2:] +  # right
  img[2:, 1:-1] +  # bottom
  img[1:-1, 1:-1]  # center
) / 5.0


blurred_img
# o/p: 
# array([[ 6.,  7.,  8.],
#        [11., 12., 13.],
#        [16., 17., 18.]])
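To see that the five slices really line up with each pixel's neighbours, the slice-based formula can be compared against an explicit double loop over the interior pixels. This is only a verification sketch; the names sliced and looped are illustrative.

```python
import numpy as np

img = np.arange(25).reshape(5, 5)

# slice-based version: five shifted 3x3 windows averaged together
sliced = (
    img[:-2, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:] + img[2:, 1:-1] + img[1:-1, 1:-1]
) / 5.0

# loop-based version: visit every interior pixel (i, j) and average it
# with its top, left, right and bottom neighbours
looped = np.zeros((3, 3))
for i in range(1, 4):
    for j in range(1, 4):
        looped[i - 1, j - 1] = (
            img[i - 1, j] + img[i, j - 1] + img[i, j + 1] + img[i + 1, j] + img[i, j]
        ) / 5.0

print(np.allclose(sliced, looped))  # True
```

The border pixels have no complete set of neighbours, which is why the result is 3x3 rather than 5x5.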

Let's blur an actual image. The original image looks like below:

# use matplotlib to load the nature.png image
nature_img = plt.imread("static/img/nature.png")


# blur logic
def blur_img(nature_img): 
  return (
    nature_img[:-2, 1:-1] + # top
    nature_img[1:-1, :-2] + # left 
    nature_img[1:-1, 2:] +  # right 
    nature_img[2:, 1:-1] +  # bottom
    nature_img[1:-1, 1:-1]  # center
  ) / 5.0


# blur once 
nature_img = blur_img(nature_img)

# save logic
plt.imsave("static/img/nature_blur_1.png", nature_img)

After blurring it once, the image looks like below:

Let's blur it $49$ more times.

# 49 further passes, for 50 in total
for _ in range(49):
  nature_img = blur_img(nature_img)

plt.imsave("static/img/nature_blur_50.png", nature_img)

After blurring the image $49$ more times it looks like below:
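One side effect of this blur is worth noting: every pass drops the outermost pixel border, so the image shrinks by 2 pixels in height and width each time. The sketch below demonstrates this on a random array standing in for an RGB image, so it runs without the nature.png file; the array fake_img and its 120x160 size are assumptions made purely for illustration.

```python
import numpy as np

def blur_img(image):
    # average each interior pixel with its top, left, right and bottom neighbours
    return (
        image[:-2, 1:-1] + image[1:-1, :-2] + image[1:-1, 2:]
        + image[2:, 1:-1] + image[1:-1, 1:-1]
    ) / 5.0

rng = np.random.default_rng(0)
fake_img = rng.random((120, 160, 3))   # hypothetical stand-in: height x width x RGB, values in [0, 1)

result = fake_img
for _ in range(50):                    # 50 blur passes in total
    result = blur_img(result)

print(fake_img.shape)   # (120, 160, 3)
print(result.shape)     # (20, 60, 3) -> 2 pixels lost per pass in each direction
```

The slices only touch the first two axes, so the colour channels in the last axis are blurred independently.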

Fancy Indexing

a_fancy = np.arange(0, 80, 10)

# indexing by position
indices = [1, 2, 5]
y_indices = a_fancy[indices]
print(y_indices)
# o/p: [10 20 50]


# indexing with booleans
mask = np.array([0, 1, 1, 0, 0, 1, 0, 0], dtype=bool)
y_bool = a_fancy[mask]
print(y_bool)
# o/p: [10 20 50]


# replacing all the negative numbers in an array with 0
a_fancy_neg = np.array([1, 31, -1, 341, -11, 90, -7])
mask_neg = a_fancy_neg < 0
# o/p: array([False, False,  True, False,  True, False,  True])
a_fancy_neg[mask_neg] = 0
a_fancy_neg
# o/p: array([  1,  31,   0, 341,   0,  90,   0])
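Boolean masks can also be combined, and np.where offers a variant that leaves the original array untouched. A short sketch follows, with illustrative names (values, in_range, clipped); note that & and | need parentheses around each comparison.

```python
import numpy as np

values = np.array([1, 31, -1, 341, -11, 90, -7])

# combine two conditions element-wise with & (and); | would mean "or"
in_range = (values >= 0) & (values <= 100)
print(values[in_range])        # [ 1 31 90]

# np.where(condition, x, y) builds a new array: x where the condition holds, y elsewhere
clipped = np.where(values < 0, 0, values)
print(clipped)                 # [  1  31   0 341   0  90   0]
print(values[2])               # -1 -> the original array is unchanged
```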

Fancy Indexing in 2D

a_fancy_2d = np.arange(25).reshape(5, 5)
# o/p:
# array([[ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [15, 16, 17, 18, 19],
#        [20, 21, 22, 23, 24]])


# retrieve diagonal elements using indexing by position 
y_fancy_indices_2d = a_fancy_2d[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
print(y_fancy_indices_2d)
# o/p: [ 0  6 12 18 24]


# selecting all rows and columns 0, 1 and 4.
# this can't be done with slicing,
# as the step between columns is first 1 and then 3, so we use fancy indexing.
a_fancy_2d[0:, [0, 1, 4]]
# o/p:
# array([[ 0,  1,  4],
#        [ 5,  6,  9],
#        [10, 11, 14],
#        [15, 16, 19],
#        [20, 21, 24]])
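When two index lists are passed together, NumPy pairs them up element by element, which is how the diagonal example works. To pick every combination of the listed rows and columns instead, np.ix_ can be used. The sketch below contrasts the two; the names grid, paired and block are illustrative.

```python
import numpy as np

grid = np.arange(25).reshape(5, 5)

# paired: picks the elements at (0, 1) and (2, 4)
paired = grid[[0, 2], [1, 4]]
print(paired)
# [ 1 14]

# np.ix_: picks rows 0 and 2 crossed with columns 1 and 4
block = grid[np.ix_([0, 2], [1, 4])]
print(block)
# [[ 1  4]
#  [11 14]]
```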

Array Broadcasting Rules

p = np.array([
  [0, 0, 0],
  [10, 10, 10],
  [20, 20, 20],
  [30, 30, 30]
])
# o/p: 
# array([[ 0,  0,  0],
#        [10, 10, 10],
#        [20, 20, 20],
#        [30, 30, 30]])


q = np.array([
  [0, 1, 2],
  [0, 1, 2],
  [0, 1, 2],
  [0, 1, 2]
])
# o/p: 
# array([[0, 1, 2],
#        [0, 1, 2],
#        [0, 1, 2],
#        [0, 1, 2]])


r = np.array([[0], [10], [20], [30]])
# o/p:
# array([[ 0],
#        [10],
#        [20],
#        [30]])


s = np.array([0, 1, 2])
# o/p: array([0, 1, 2])


# p + q == q + p == p + s == s + r
# here s is repeated across rows and r is repeated across columns to match the shape.
a_1 = p + q
# o/p: 
# array([[ 0,  1,  2],
#        [10, 11, 12],
#        [20, 21, 22],
#        [30, 31, 32]])


a_2 = q + p
# o/p: 
# array([[ 0,  1,  2],
#        [10, 11, 12],
#        [20, 21, 22],
#        [30, 31, 32]])


a_3 = p + s
# o/p: 
# array([[ 0,  1,  2],
#        [10, 11, 12],
#        [20, 21, 22],
#        [30, 31, 32]])


a_4 = s + r
# o/p: 
# array([[ 0,  1,  2],
#        [10, 11, 12],
#        [20, 21, 22],
#        [30, 31, 32]])
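The rule behind these results: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or when one of them is 1 (a missing leading dimension counts as 1). As a sketch, np.broadcast_shapes (available in NumPy 1.20 and later) reports the resulting shape without building any arrays; the flag name failed is illustrative.

```python
import numpy as np

# (4, 3) with (3,): trailing dims match, the missing leading dim is treated as 1
print(np.broadcast_shapes((4, 3), (3,)))    # (4, 3)

# (4, 1) with (3,): 1 stretches to 3, the missing leading dim stretches to 4
print(np.broadcast_shapes((4, 1), (3,)))    # (4, 3)

# (4, 3) with (4,): trailing dims are 3 and 4, neither is 1, so this fails
failed = False
try:
    np.broadcast_shapes((4, 3), (4,))
except ValueError as err:
    failed = True
    print("incompatible shapes:", err)
```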

Array Calculation Methods

a_method = np.arange(9).reshape(3, 3)
# o/p: 
# array([[0, 1, 2],
#        [3, 4, 5],
#        [6, 7, 8]])


# sum all elements in an array
a_method.sum()
# o/p: np.int64(36)


# axis=0 sums down the rows, giving one sum per column
a_method.sum(axis = 0)
# o/p: array([ 9, 12, 15])


# axis=1 sums across the columns, giving one sum per row
a_method.sum(axis = 1)
# o/p: array([ 3, 12, 21])




# min from an array
a_method.min()
# o/p: np.int64(0)
a_method.min(axis = 0)
# o/p: array([0, 1, 2])



# max from an array
a_method.max()
# o/p: np.int64(8)
a_method.max(axis = 0)
# o/p: array([6, 7, 8])



# index of min from an array
a_method.argmin()
# o/p: np.int64(0)
a_method.argmin(axis = 0)
# o/p: array([0, 0, 0])



# index of max from an array
a_method.argmax()
# o/p: np.int64(8)
a_method.argmax(axis = 0)
# o/p: array([2, 2, 2])



# convert a flat (1D) index back into a multi-dimensional index:
# instead of 8, which is the flattened position of the max element in the array,
# the function below gives (2, 2)
np.unravel_index(
  a_method.argmax(), a_method.shape
)
# o/p: (np.int64(2), np.int64(2))



# where - indices of all elements that satisfy a condition
np.where(a_method == a_method.max())
# o/p: (array([2]), array([2]))
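The difference between argmax and the comparison-based lookup shows up when the maximum occurs more than once: argmax reports only the first occurrence in flattened order, while comparing against max finds all of them. Below is a small sketch with an illustrative array m; np.argwhere returns the matching positions as one row per match.

```python
import numpy as np

m = np.array([[1, 9, 3],
              [9, 2, 9]])

first_flat = m.argmax()                        # first occurrence only, in flattened order
print(first_flat)                              # 1
print(np.unravel_index(first_flat, m.shape))   # row 0, column 1

all_positions = np.argwhere(m == m.max())      # every position holding the maximum
print(all_positions)
# [[0 1]
#  [1 0]
#  [1 2]]
```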
