Array, Slices and Fancy indexing

In this small post we'll (quickly) investigate the difference between slices (a built-in Python type) and NumPy arrays, and how each can be used for indexing. First let's create a matrix of integers in which, for convenience, the element at index i,j has value 10i+j.

In [1]:

import numpy as np
from copy import copy

Let's create a single row, that is to say a matrix of height 1 and width equal to the number of elements. We'll use -1 in reshape to mean "whatever is necessary". For 2D matrices it's not super useful, but for higher-dimensional objects it can be quite convenient.

In [2]:

X = np.arange(0, 10).reshape(1,-1)
X

Out[2]:

array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
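To see why -1 gets more convenient with more dimensions, here is a small sketch. The array `t` and its shapes are made up for illustration; the point is only that NumPy infers the one missing dimension from the total number of elements.

```python
import numpy as np

# 24 elements in total; the name and the shapes are illustrative only.
t = np.arange(24)

# -1 asks NumPy to infer the missing dimension: 24 / (2 * 3) = 4.
a = t.reshape(2, 3, -1)
print(a.shape)  # (2, 3, 4)

# Collapse every axis except the last one without computing 2 * 3 by hand.
b = a.reshape(-1, a.shape[-1])
print(b.shape)  # (6, 4)
```

Only one dimension can be left as -1 in a given call, since otherwise the shape would be ambiguous.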

Now a column, same trick.

In [3]:

Y = (10*np.arange(0, 8).reshape(-1, 1))
Y

Out[3]:

array([[ 0],
       [10],
       [20],
       [30],
       [40],
       [50],
       [60],
       [70]])

By summing the two, the rules of "broadcasting" give us a nice rectangular matrix.

In [4]:

R = np.arange(5*5*5*5*5).reshape(5,5,5,5,5)

In [5]:

M = X+Y
M

Out[5]:

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])
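As a quick check of what broadcasting did here, the sketch below rebuilds the same row and column and looks at the shapes involved. It repeats the definitions from the cells above so it can run on its own.

```python
import numpy as np

X = np.arange(0, 10).reshape(1, -1)       # shape (1, 10): one row
Y = 10 * np.arange(0, 8).reshape(-1, 1)   # shape (8, 1): one column

# Broadcasting stretches each size-1 axis to match the other operand.
M = X + Y
print(X.shape, Y.shape, M.shape)  # (1, 10) (8, 1) (8, 10)

# Every element follows the 10*i + j convention from the introduction.
i, j = 5, 7
print(M[i, j])  # 57
```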

Slicing

A quick intro to slicing. You have likely used it before if you've encountered the object[12:34] or object[42:96:3] notation. The X:Y:Z part is a slice. This way of writing a slice is allowed only between square brackets, for indexing.

X, Y and Z are optional and default to whatever is convenient, so ::3 (every three), :7 and :7: (until 7), : and :: (everything) are valid slices.
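One way to see what "whatever is convenient" means in practice is the `indices` method of slice objects, which resolves the omitted parts for a sequence of a given length. The length of 10 below is an arbitrary choice for the illustration.

```python
# slice.indices(length) returns the concrete (start, stop, step)
# the slice would use on a sequence of that length.
n = 10

every_three = slice(None, None, 3).indices(n)   # ::3
until_seven = slice(None, 7).indices(n)         # :7
everything = slice(None).indices(n)             # :
backwards = slice(None, None, -1).indices(n)    # ::-1

print(every_three)  # (0, 10, 3)
print(until_seven)  # (0, 7, 1)
print(everything)   # (0, 10, 1)
print(backwards)    # (9, -1, -1)
```

The omitted parts are stored as None on the slice itself; they only become numbers once a length is known.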

A slice is an efficient object that (usually) represents "from X to Y by every Z"; it is not limited to numbers.

In [6]:

class PhylosophicalArray:
    
    def __getitem__(self, sl):
        print(f"From `{sl.start}` to `{sl.stop}` every `{sl.step}`.")        
        
arr = PhylosophicalArray()
arr['cow':'phone':'traffic jam']

From `cow` to `phone` every `traffic jam`.

You can construct a slice using the slice builtin, which is (sometimes) convenient, and use it in place of x:y:z.

In [7]:

sl = slice('cow', 'phone', 'traffic jam')

In [8]:

arr[sl]

From `cow` to `phone` every `traffic jam`.
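A typical case where the builtin is convenient is giving a slice a name and reusing it. The sketch below uses a made-up fixed-width record; the field names and column positions are assumptions for the example only.

```python
# Hypothetical fixed-width record: columns 0-9 hold a name, 10-13 a year.
record = "Ada".ljust(10) + "1815"

# Named slices document the layout once and can be reused on every record.
NAME = slice(0, 10)
YEAR = slice(10, 14)

name = record[NAME].strip()
year = int(record[YEAR])
print(name, year)  # Ada 1815
```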

In multidimensional arrays, slices of width 0 or 1 can be used, instead of scalar indices, to avoid dropping dimensions.

In [9]:

M[:, 3] # column at index 3 (the fourth column), now a vector.

Out[9]:

array([ 3, 13, 23, 33, 43, 53, 63, 73])

In [10]:

M[:, 3:4]  # now an (N, 1) matrix.

Out[10]:

array([[ 3],
       [13],
       [23],
       [33],
       [43],
       [53],
       [63],
       [73]])

This is convenient when indices represent various quantities, for example an atmospheric ensemble where dimension 1 is latitude, 2 longitude, 3 height, 4 temperature and 5 pressure, and you want to focus on height==0 without having to shift the temperature index from 4 to 3, pressure from 5 to 4...
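A minimal sketch of that situation, with a made-up 4-axis array (the axis meanings and sizes are assumptions for the example): a scalar index removes the height axis and shifts everything after it, while a width-1 slice keeps every axis in place.

```python
import numpy as np

# Made-up ensemble with axes (latitude, longitude, height, temperature).
data = np.zeros((4, 5, 3, 2))

dropped = data[:, :, 0]     # scalar index: the height axis disappears
kept = data[:, :, 0:1]      # width-1 slice: the height axis stays, with size 1

print(dropped.shape)  # (4, 5, 2)    -> temperature moved from axis 3 to axis 2
print(kept.shape)     # (4, 5, 1, 2) -> temperature is still axis 3
```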

Zero-width slices are mostly used to simplify algorithms by avoiding explicit checks for edge cases.

In [11]:

a = 3
b = 3
M[a:b]

Out[11]:

array([], shape=(0, 10), dtype=int64)

In [12]:

M[a:b] =  a-b

In [13]:

M # M is not modified !

Out[13]:

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])
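Here is a small sketch of how an empty slice removes an edge case. The helper `negate_rows` is hypothetical, written only for this illustration: it needs no special branch for `a == b`, because assigning to an empty selection does nothing.

```python
import numpy as np

def negate_rows(matrix, a, b):
    """Return a copy of matrix with rows a..b-1 negated (hypothetical helper)."""
    out = matrix.copy()
    out[a:b] = -out[a:b]   # when a == b the slice is empty and nothing changes
    return out

small = np.arange(12).reshape(3, 4)

same = negate_rows(small, 2, 2)      # empty range: no-op
flipped = negate_rows(small, 1, 3)   # rows 1 and 2 negated

print(np.array_equal(same, small))  # True
print(flipped[1, 0])                # -4
print(small[2:2].sum())             # 0, reductions over an empty slice also work
```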

When indexing an array, you slice each dimension individually. Here we extract the center block of the matrix, not the 3 diagonal elements.

In [14]:

M[4:7, 4:7]

Out[14]:

array([[44, 45, 46],
       [54, 55, 56],
       [64, 65, 66]])

In [15]:

sl = slice(4,7)
sl

Out[15]:

slice(4, 7, None)

In [16]:

M[sl, sl]

Out[16]:

array([[44, 45, 46],
       [54, 55, 56],
       [64, 65, 66]])
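To see why `M[4:7, 4:7]` and `M[sl, sl]` are the same request, it can help to look at what `__getitem__` actually receives. The class `IndexEcho` below is a hypothetical helper in the spirit of the earlier example; it simply returns its key.

```python
class IndexEcho:
    """Hypothetical helper that returns whatever key it is indexed with."""

    def __getitem__(self, key):
        return key

echo = IndexEcho()

# Comma-separated indices arrive as a single tuple, one entry per dimension.
key = echo[4:7, 4:7]
print(key)  # (slice(4, 7, None), slice(4, 7, None))

sl = slice(4, 7)
print(echo[sl, sl] == key)  # True
```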

Let's change the sign of the biggest square block in the upper left of this matrix.

In [17]:

K = copy(M)
el = slice(0, min(K.shape))
el

Out[17]:

slice(0, 8, None)

In [18]:

K[el, el] = -K[el, el]
K

Out[18]:

array([[  0,  -1,  -2,  -3,  -4,  -5,  -6,  -7,   8,   9],
       [-10, -11, -12, -13, -14, -15, -16, -17,  18,  19],
       [-20, -21, -22, -23, -24, -25, -26, -27,  28,  29],
       [-30, -31, -32, -33, -34, -35, -36, -37,  38,  39],
       [-40, -41, -42, -43, -44, -45, -46, -47,  48,  49],
       [-50, -51, -52, -53, -54, -55, -56, -57,  58,  59],
       [-60, -61, -62, -63, -64, -65, -66, -67,  68,  69],
       [-70, -71, -72, -73, -74, -75, -76, -77,  78,  79]])
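The copy taken before modifying matters because slicing a NumPy array gives a view on the same memory rather than new data. The sketch below rebuilds the matrix so it can run on its own, and uses `np.shares_memory` to check this.

```python
import numpy as np

M = np.arange(0, 10).reshape(1, -1) + 10 * np.arange(0, 8).reshape(-1, 1)

block = M[4:7, 4:7]                 # basic slicing: a view on M's memory
print(np.shares_memory(M, block))   # True

K = M.copy()                        # independent data, so M stays untouched
K[4:7, 4:7] = 0
print(M[4, 4], K[4, 4])             # 44 0

block[0, 0] = -1                    # writing through the view changes M
print(M[4, 4])                      # -1
```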

That's about it for slices; it was already a lot.

In the next section we'll talk about arrays.

Fancy indexing

Arrays are more or less what you've seen in other languages: finite sequences of discrete values.

In [19]:

ar = np.arange(4,7)
ar

Out[19]:

array([4, 5, 6])

When you index with arrays, the elements of each array are taken together, position by position.

In [20]:

M[ar,ar]

Out[20]:

array([44, 55, 66])
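"Taken together" can be read as pairing the index arrays up like `zip` would. The following sketch, which rebuilds `M` and `ar` so it runs on its own, compares the fancy-indexed result with an explicit loop over the pairs.

```python
import numpy as np

M = np.arange(0, 10).reshape(1, -1) + 10 * np.arange(0, 8).reshape(-1, 1)
ar = np.arange(4, 7)

paired = M[ar, ar]
# Same selection written by hand: one (row, column) pair per position.
by_hand = np.array([M[i, j] for i, j in zip(ar, ar)])

print(paired)   # [44 55 66]
print(by_hand)  # [44 55 66]
```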

M[ar, ar] gives a partial diagonal of our matrix. It does not have to be a diagonal:

In [21]:

M[ar, ar+1]

Out[21]:

array([45, 56, 67])

The result of this operation is a 1-dimensional array. Unlike a slice, which gives a view on the initial matrix memory, indexing with arrays returns a copy; assignment through such an index still writes into the original matrix. In the same way as we flipped the sign of the largest block in the previous section, we'll try indexing with the same values:

In [22]:

S = copy(M)

In [23]:

el = np.arange(min(S.shape))
el

Out[23]:

array([0, 1, 2, 3, 4, 5, 6, 7])

In [24]:

S[el, el] = -S[el,el]
S

Out[24]:

array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10, -11,  12,  13,  14,  15,  16,  17,  18,  19],
       [ 20,  21, -22,  23,  24,  25,  26,  27,  28,  29],
       [ 30,  31,  32, -33,  34,  35,  36,  37,  38,  39],
       [ 40,  41,  42,  43, -44,  45,  46,  47,  48,  49],
       [ 50,  51,  52,  53,  54, -55,  56,  57,  58,  59],
       [ 60,  61,  62,  63,  64,  65, -66,  67,  68,  69],
       [ 70,  71,  72,  73,  74,  75,  76, -77,  78,  79]])

Here we flipped the sign of only the diagonal elements. Of course, it did not have to be the diagonal elements:

In [25]:

S[el, el+1]

Out[25]:

array([ 1, 12, 23, 34, 45, 56, 67, 78])

In [26]:

S[el, el+1] = 0
S

Out[26]:

array([[  0,   0,   2,   3,   4,   5,   6,   7,   8,   9],
       [ 10, -11,   0,  13,  14,  15,  16,  17,  18,  19],
       [ 20,  21, -22,   0,  24,  25,  26,  27,  28,  29],
       [ 30,  31,  32, -33,   0,  35,  36,  37,  38,  39],
       [ 40,  41,  42,  43, -44,   0,  46,  47,  48,  49],
       [ 50,  51,  52,  53,  54, -55,   0,  57,  58,  59],
       [ 60,  61,  62,  63,  64,  65, -66,   0,  68,  69],
       [ 70,  71,  72,  73,  74,  75,  76, -77,   0,  79]])

Nor are we required to use each element only once:

In [27]:

el-1

Out[27]:

array([-1,  0,  1,  2,  3,  4,  5,  6])

In [28]:

sy = np.array([0, 1, 2, 0, 1, 2])
sx = np.array([1, 2, 3, 1, 2, 3])
ld = S[sx, sy] # select 3 elements of lower diagonal twice
ld

Out[28]:

array([10, 21, 32, 10, 21, 32])
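One thing worth checking when reading values this way is whether the result is tied to the original matrix. As far as I know, indexing with arrays returns new data, while assigning through such an index writes into the original. A sketch of both, with the matrix rebuilt so it runs on its own:

```python
import numpy as np

S = np.arange(0, 10).reshape(1, -1) + 10 * np.arange(0, 8).reshape(-1, 1)
sx = np.array([1, 2, 3])
sy = np.array([0, 1, 2])

ld = S[sx, sy]                   # fancy indexing on the right-hand side: new data
print(np.shares_memory(S, ld))   # False

ld[0] = -1                       # modifying the result does not touch S
print(S[1, 0])                   # 10

S[sx, sy] = 0                    # assigning through the index does modify S
print(S[1, 0])                   # 0
```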

More in the SciPy Lecture Notes, the NumPy quickstart, and the Python Data Science Handbook.

Some experiments

In [29]:

S = copy(M)
S[0:10, 0:10] = 0
S

Out[29]:

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
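Note that the row slice `0:10` asks for more rows than the 8 that exist, and no error is raised. A short sketch of that clipping behaviour, using an 8 by 10 array built just for the example:

```python
import numpy as np

S = np.arange(80).reshape(8, 10)

# The slice is resolved against the axis length, so 0:10 becomes 0:8 on 8 rows.
resolved = slice(0, 10).indices(S.shape[0])
print(resolved)               # (0, 8, 1)

print(S[0:10].shape)          # (8, 10)
print(S[0:100, 0:100].shape)  # (8, 10)
```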

In [30]:

S = copy(M)
S[0:10:2, 0:10] = 0
S

Out[30]:

array([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

In [31]:

S = copy(M)
S[0:10, 0:10:2] = 0
S

Out[31]:

array([[ 0,  1,  0,  3,  0,  5,  0,  7,  0,  9],
       [ 0, 11,  0, 13,  0, 15,  0, 17,  0, 19],
       [ 0, 21,  0, 23,  0, 25,  0, 27,  0, 29],
       [ 0, 31,  0, 33,  0, 35,  0, 37,  0, 39],
       [ 0, 41,  0, 43,  0, 45,  0, 47,  0, 49],
       [ 0, 51,  0, 53,  0, 55,  0, 57,  0, 59],
       [ 0, 61,  0, 63,  0, 65,  0, 67,  0, 69],
       [ 0, 71,  0, 73,  0, 75,  0, 77,  0, 79]])

In [32]:

S = copy(M)
S[0:10:2, 0:10:2] = 0
S

Out[32]:

array([[ 0,  1,  0,  3,  0,  5,  0,  7,  0,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [ 0, 21,  0, 23,  0, 25,  0, 27,  0, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [ 0, 41,  0, 43,  0, 45,  0, 47,  0, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [ 0, 61,  0, 63,  0, 65,  0, 67,  0, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

In [33]:

S = copy(M)
S[0:10:2, 0:10] = 0
S[0:10, 0:10:2] = 0
S

Out[33]:

array([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 11,  0, 13,  0, 15,  0, 17,  0, 19],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 31,  0, 33,  0, 35,  0, 37,  0, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 51,  0, 53,  0, 55,  0, 57,  0, 59],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 71,  0, 73,  0, 75,  0, 77,  0, 79]])

In [34]:

S = copy(M)
S[0:8, 0:8] = 0
S

Out[34]:

array([[ 0,  0,  0,  0,  0,  0,  0,  0,  8,  9],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 18, 19],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 28, 29],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 38, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 48, 49],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 58, 59],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 68, 69],
       [ 0,  0,  0,  0,  0,  0,  0,  0, 78, 79]])

In [35]:

S = copy(M)
S[np.arange(0,8), np.arange(0,8)] = 0
S

Out[35]:

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10,  0, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21,  0, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32,  0, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43,  0, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54,  0, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65,  0, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76,  0, 78, 79]])

In [36]:

S = copy(M)
S[range(0,8), range(0,8)] = 0
S

Out[36]:

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10,  0, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21,  0, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32,  0, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43,  0, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54,  0, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65,  0, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76,  0, 78, 79]])
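Index arrays (or ranges) select pairs, so they give the diagonal rather than the block that slices gave. If a block is what is wanted from index arrays, one option is `np.ix_`, which combines them as rows times columns. A sketch, with `rows` and `cols` as illustrative names:

```python
import numpy as np

M = np.arange(0, 10).reshape(1, -1) + 10 * np.arange(0, 8).reshape(-1, 1)
rows = np.arange(4, 7)
cols = np.arange(4, 7)

block = M[np.ix_(rows, cols)]     # every row index combined with every column index
print(block.shape)                # (3, 3)
print(np.array_equal(block, M[4:7, 4:7]))  # True

# Same selection by broadcasting a (3, 1) column of rows against a (3,) row of columns.
same = M[rows[:, None], cols]
print(np.array_equal(same, block))  # True
```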

In [37]:

S = copy(M)
S[np.arange(0, 10), np.arange(0, 10)] = 0 ## will fail
S

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-37-caff8ce2b957> in <module>
      1 S = copy(M)
----> 2 S[np.arange(0, 10), np.arange(0, 10)] = 0 ## will fail
      3 S

IndexError: index 8 is out of bounds for axis 0 with size 8
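The failure comes from the matrix having only 8 rows: unlike slices, index arrays are bounds-checked and not clipped. A minimal sketch of one way around it, reusing the `min(S.shape)` idea from earlier so the index array never exceeds the shorter axis:

```python
import numpy as np

S = np.arange(0, 10).reshape(1, -1) + 10 * np.arange(0, 8).reshape(-1, 1)

n = min(S.shape)        # 8: the longest diagonal that fits in an 8 x 10 matrix
d = np.arange(n)
S[d, d] = 0

print(S[7, 7])  # 0
print(S[7, 8])  # 78

# Index arrays that run past an axis still raise, whether reading or writing.
try:
    S[np.arange(10), np.arange(10)]
    failed = False
except IndexError:
    failed = True
print(failed)   # True
```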
