# Array, slices and indexing

## Array, Slices and Fancy indexing

In this small post we'll (quickly) investigate the difference between slices (which are built into Python itself) and NumPy arrays, and how each can be used for indexing. First let's create a matrix of integers in which the element at index i,j has value 10i+j, for convenience.

In [1]:
import numpy as np
from copy import copy


Let's create a single row, that is to say a matrix of height 1 whose width is the number of elements. We'll use -1 in reshape to mean "whatever is necessary". For 2D matrices it's not super useful, but for higher-dimensional objects it can be quite convenient.

In [2]:
X = np.arange(0, 10).reshape(1,-1)
X

Out[2]:
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
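As a small illustration of the -1 trick on something bigger than a row, here is a sketch using a 24-element array (the variable names are only for this example). NumPy infers the missing dimension from the total number of elements.

```python
import numpy as np

# A 1-D array of 24 elements, to be reshaped in several ways.
flat = np.arange(24)

# -1 asks NumPy to infer that dimension from the total size.
row = flat.reshape(1, -1)        # shape (1, 24)
column = flat.reshape(-1, 1)     # shape (24, 1)
cube = flat.reshape(2, -1, 4)    # 24 / (2 * 4) = 3, so shape (2, 3, 4)

print(row.shape, column.shape, cube.shape)
```

Only one dimension can be left as -1, since NumPy needs all the others to work out the missing one.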

Now a column, same trick.

In [3]:
Y = (10*np.arange(0, 8).reshape(-1, 1))
Y

Out[3]:
array([[ 0],
[10],
[20],
[30],
[40],
[50],
[60],
[70]])

By summing, and the rules of "broadcasting", we get a nice rectangular matrix.

In [4]:
R = np.arange(5*5*5*5*5).reshape(5,5,5,5,5)

In [5]:
M = X+Y
M

Out[5]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])
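To make the broadcasting step a bit more explicit, the sketch below rebuilds the same row and column and checks the resulting shape and the 10i+j property on one element (the indices i and j are picked arbitrarily).

```python
import numpy as np

X = np.arange(10).reshape(1, -1)        # shape (1, 10)
Y = 10 * np.arange(8).reshape(-1, 1)    # shape (8, 1)

# Dimensions of size 1 are stretched to match the other operand.
print(np.broadcast(X, Y).shape)         # the shape X + Y will have

M = X + Y
# Each element encodes its own position: M[i, j] == 10*i + j.
i, j = 6, 3
print(M[i, j])
```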

### Slicing

A quick intro to slicing. You have likely used it before if you've encountered the object[12:34] or object[42:96:3] notation. The X:Y:Z part is a slice. This way of writing a slice is only allowed between square brackets, for indexing.

X, Y and Z are optional and default to whatever is convenient, so ::3 (every third element), :7 and :7: (until 7), : and :: (everything) are all valid slices.

A slice is an efficient object that (usually) represents "from X to Y by every Z"; it is not limited to numbers.

In [6]:
class PhylosophicalArray:

    def __getitem__(self, sl):
        print(f"From {sl.start} to {sl.stop} every {sl.step}.")

arr = PhylosophicalArray()
arr['cow':'phone':'traffic jam']

From cow to phone every traffic jam.


You can also construct a slice using the slice builtin and use it in place of x:y:z; this is (sometimes) convenient.

In [7]:
sl = slice('cow', 'phone', 'traffic jam')

In [8]:
arr[sl]

From cow to phone every traffic jam.
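For numeric slices, the defaults can be inspected directly. The following sketch uses only the standard library: omitted parts are stored as None, and slice.indices resolves them for a sequence of a given length. The list of letters is just sample data.

```python
# Omitted parts of a slice are stored as None.
every_third = slice(None, None, 3)      # same as ::3 inside brackets
print(every_third.start, every_third.stop, every_third.step)

# slice.indices(length) resolves the defaults for a sequence of that length.
start, stop, step = every_third.indices(10)
print(start, stop, step)

letters = list("abcdefghij")
# Indexing with the slice object and with the literal notation agree.
print(letters[every_third] == letters[::3])
```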


In multidimensional arrays, slices of width 0 or 1 can be used to keep a dimension that indexing with a scalar would drop.

In [9]:
M[:, 3] # column at index 3 (the fourth column), now a vector.

Out[9]:
array([ 3, 13, 23, 33, 43, 53, 63, 73])
In [10]:
M[:, 3:4]  # now a N,1 matrix.

Out[10]:
array([[ 3],
[13],
[23],
[33],
[43],
[53],
[63],
[73]])

This is convenient when indices represent various quantities, for example an atmospheric ensemble where dimension 1 is latitude, 2 is longitude, 3 is height, 4 is temperature and 5 is pressure, and you want to focus on height==0 without having to shift the temperature index from 4 to 3, the pressure index from 5 to 4, and so on.
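Here is a minimal sketch of that situation with a 5-dimensional array; the axes have no physical meaning here, the array is only filled with consecutive integers.

```python
import numpy as np

# A 5-dimensional array, five entries along each axis.
R = np.arange(5**5).reshape(5, 5, 5, 5, 5)

# A scalar index drops the third axis; the later axes shift down by one.
dropped = R[:, :, 0]
# A width-1 slice keeps the third axis, with length 1.
kept = R[:, :, 0:1]

print(dropped.shape)
print(kept.shape)
```

With the width-1 slice, code that refers to the fourth or fifth axis keeps working unchanged.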

Zero-width slices are mostly used to simplify algorithms, by avoiding the need to check for edge cases.

In [11]:
a = 3
b = 3
M[a:b]

Out[11]:
array([], shape=(0, 10), dtype=int64)
In [12]:
M[a:b] =  a-b

In [13]:
M # M is not modified !

Out[13]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
[70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

When indexing an array, you slice each dimension individually. Here we extract the center block of the matrix, not the 3 diagonal elements.

In [14]:
M[4:7, 4:7]

Out[14]:
array([[44, 45, 46],
[54, 55, 56],
[64, 65, 66]])
In [15]:
sl = slice(4,7)
sl

Out[15]:
slice(4, 7, None)
In [16]:
M[sl, sl]

Out[16]:
array([[44, 45, 46],
[54, 55, 56],
[64, 65, 66]])

Let's change the sign of the biggest square block in the upper left of this matrix.

In [17]:
K = copy(M)
el = slice(0, min(K.shape))
el

Out[17]:
slice(0, 8, None)
In [18]:
K[el, el] = -K[el, el]
K

Out[18]:
array([[  0,  -1,  -2,  -3,  -4,  -5,  -6,  -7,   8,   9],
[-10, -11, -12, -13, -14, -15, -16, -17,  18,  19],
[-20, -21, -22, -23, -24, -25, -26, -27,  28,  29],
[-30, -31, -32, -33, -34, -35, -36, -37,  38,  39],
[-40, -41, -42, -43, -44, -45, -46, -47,  48,  49],
[-50, -51, -52, -53, -54, -55, -56, -57,  58,  59],
[-60, -61, -62, -63, -64, -65, -66, -67,  68,  69],
[-70, -71, -72, -73, -74, -75, -76, -77,  78,  79]])

In the next section we'll talk about indexing with arrays.

### Fancy indexing

Arrays are more or less what you've seen in other languages: finite sequences of discrete values.

In [19]:
ar = np.arange(4,7)
ar

Out[19]:
array([4, 5, 6])

When you index with arrays, the elements of each array are taken together, position by position.

In [20]:
M[ar,ar]

Out[20]:
array([44, 55, 66])

We now get a partial diagonal of our matrix. It does not have to be a diagonal:

In [21]:
M[ar, ar+1]

Out[21]:
array([45, 56, 67])

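The result of this operation is a 1-dimensional array. Note that, unlike a slice, indexing with arrays returns a copy rather than a view on the initial matrix memory; assigning through such an index, however, does modify the original array. In the same way as we flipped the sign of the largest block in the previous section, we'll try indexing with the same values: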

In [22]:
S = copy(M)

In [23]:
el = np.arange(min(S.shape))
el

Out[23]:
array([0, 1, 2, 3, 4, 5, 6, 7])
In [24]:
S[el, el] = -S[el,el]
S

Out[24]:
array([[  0,   1,   2,   3,   4,   5,   6,   7,   8,   9],
[ 10, -11,  12,  13,  14,  15,  16,  17,  18,  19],
[ 20,  21, -22,  23,  24,  25,  26,  27,  28,  29],
[ 30,  31,  32, -33,  34,  35,  36,  37,  38,  39],
[ 40,  41,  42,  43, -44,  45,  46,  47,  48,  49],
[ 50,  51,  52,  53,  54, -55,  56,  57,  58,  59],
[ 60,  61,  62,  63,  64,  65, -66,  67,  68,  69],
[ 70,  71,  72,  73,  74,  75,  76, -77,  78,  79]])

Here we flipped the sign of only the diagonal elements. Of course, it did not have to be the diagonal elements:

In [25]:
S[el, el+1]

Out[25]:
array([ 1, 12, 23, 34, 45, 56, 67, 78])
In [26]:
S[el, el+1] = 0
S

Out[26]:
array([[  0,   0,   2,   3,   4,   5,   6,   7,   8,   9],
[ 10, -11,   0,  13,  14,  15,  16,  17,  18,  19],
[ 20,  21, -22,   0,  24,  25,  26,  27,  28,  29],
[ 30,  31,  32, -33,   0,  35,  36,  37,  38,  39],
[ 40,  41,  42,  43, -44,   0,  46,  47,  48,  49],
[ 50,  51,  52,  53,  54, -55,   0,  57,  58,  59],
[ 60,  61,  62,  63,  64,  65, -66,   0,  68,  69],
[ 70,  71,  72,  73,  74,  75,  76, -77,   0,  79]])

Nor are we required to use each element only once:

In [27]:
el-1

Out[27]:
array([-1,  0,  1,  2,  3,  4,  5,  6])
In [28]:
sy = np.array([0, 1, 2, 0, 1, 2])
sx = np.array([1, 2, 3, 1, 2, 3])
ld = S[sx, sy] # select 3 elements of lower diagonal twice
ld

Out[28]:
array([10, 21, 32, 10, 21, 32])
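Slices and index arrays also differ in how they treat memory. The sketch below rebuilds the same matrix and uses np.shares_memory to check which results are views on it; the names block and diag are only for this example.

```python
import numpy as np

M = np.arange(10).reshape(1, -1) + 10 * np.arange(8).reshape(-1, 1)
idx = np.arange(4, 7)

block = M[4:7, 4:7]      # basic slicing: a view on M's memory
diag = M[idx, idx]       # fancy indexing: a new array holding copies

print(np.shares_memory(M, block))
print(np.shares_memory(M, diag))

# Writing to the copy leaves M untouched...
diag[:] = -1
print(M[4, 4])
# ...but assigning through the index expression updates M in place.
M[idx, idx] = -1
print(M[4, 4])
```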

## Some experiments

In [29]:
S = copy(M)
S[0:10, 0:10] = 0
S

Out[29]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
In [30]:
S = copy(M)
S[0:10:2, 0:10] = 0
S

Out[30]:
array([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])
In [31]:
S = copy(M)
S[0:10, 0:10:2] = 0
S

Out[31]:
array([[ 0,  1,  0,  3,  0,  5,  0,  7,  0,  9],
[ 0, 11,  0, 13,  0, 15,  0, 17,  0, 19],
[ 0, 21,  0, 23,  0, 25,  0, 27,  0, 29],
[ 0, 31,  0, 33,  0, 35,  0, 37,  0, 39],
[ 0, 41,  0, 43,  0, 45,  0, 47,  0, 49],
[ 0, 51,  0, 53,  0, 55,  0, 57,  0, 59],
[ 0, 61,  0, 63,  0, 65,  0, 67,  0, 69],
[ 0, 71,  0, 73,  0, 75,  0, 77,  0, 79]])
In [32]:
S = copy(M)
S[0:10:2, 0:10:2] = 0
S

Out[32]:
array([[ 0,  1,  0,  3,  0,  5,  0,  7,  0,  9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
[ 0, 21,  0, 23,  0, 25,  0, 27,  0, 29],
[30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
[ 0, 41,  0, 43,  0, 45,  0, 47,  0, 49],
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
[ 0, 61,  0, 63,  0, 65,  0, 67,  0, 69],
[70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])
In [33]:
S = copy(M)
S[0:10:2, 0:10] = 0
S[0:10, 0:10:2] = 0
S

Out[33]:
array([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[ 0, 11,  0, 13,  0, 15,  0, 17,  0, 19],
[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[ 0, 31,  0, 33,  0, 35,  0, 37,  0, 39],
[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[ 0, 51,  0, 53,  0, 55,  0, 57,  0, 59],
[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
[ 0, 71,  0, 73,  0, 75,  0, 77,  0, 79]])
In [34]:
S = copy(M)
S[0:8, 0:8] = 0
S

Out[34]:
array([[ 0,  0,  0,  0,  0,  0,  0,  0,  8,  9],
[ 0,  0,  0,  0,  0,  0,  0,  0, 18, 19],
[ 0,  0,  0,  0,  0,  0,  0,  0, 28, 29],
[ 0,  0,  0,  0,  0,  0,  0,  0, 38, 39],
[ 0,  0,  0,  0,  0,  0,  0,  0, 48, 49],
[ 0,  0,  0,  0,  0,  0,  0,  0, 58, 59],
[ 0,  0,  0,  0,  0,  0,  0,  0, 68, 69],
[ 0,  0,  0,  0,  0,  0,  0,  0, 78, 79]])
In [35]:
S = copy(M)
S[np.arange(0,8), np.arange(0,8)] = 0
S

Out[35]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
[10,  0, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21,  0, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32,  0, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43,  0, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54,  0, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65,  0, 67, 68, 69],
[70, 71, 72, 73, 74, 75, 76,  0, 78, 79]])
In [36]:
S = copy(M)
S[range(0,8), range(0,8)] = 0
S

Out[36]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
[10,  0, 12, 13, 14, 15, 16, 17, 18, 19],
[20, 21,  0, 23, 24, 25, 26, 27, 28, 29],
[30, 31, 32,  0, 34, 35, 36, 37, 38, 39],
[40, 41, 42, 43,  0, 45, 46, 47, 48, 49],
[50, 51, 52, 53, 54,  0, 56, 57, 58, 59],
[60, 61, 62, 63, 64, 65,  0, 67, 68, 69],
[70, 71, 72, 73, 74, 75, 76,  0, 78, 79]])
In [37]:
S = copy(M)
S[np.arange(0, 10), np.arange(0, 10)] = 0 ## will fail
S

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-37-caff8ce2b957> in <module>
1 S = copy(M)
----> 2 S[np.arange(0, 10), np.arange(0, 10)] = 0 ## will fail
3 S

IndexError: index 8 is out of bounds for axis 0 with size 8
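The difference between the experiments that work and the one that fails comes down to clipping: a slice is clipped to the length of the axis, while an explicit index is not. A plain Python list shows the same behaviour; here a list of 8 items stands in for the axis of size 8.

```python
rows = list(range(8))          # stands in for an axis of length 8

# A slice is clipped to the length of the sequence.
print(slice(0, 10).indices(len(rows)))
print(len(rows[0:10]))

# An explicit index is not clipped: 8 does not exist on an axis of size 8.
try:
    rows[8]
    failed = False
except IndexError:
    failed = True
print(failed)
```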

# The Pleasure of deleting code

## Good Code is Deleted Code

The only code without bugs is no code. And the less code you have, the lower the mental load as well. This is why it is often a pleasure to delete a lot of code.

In IPython we recently bumped the version number to 7.0 and dropped support for Python 3.3. This was the occasion to clean up and remove a lot of code that ensured compatibility with multiple minor Python versions. While it may seem easy, it required a lot of thinking ahead of time to make the process simple.

### Finding what can (and should) be deleted

The hardest part is not deleting the code itself, but finding what can be deleted. In many compiled languages the compiler may help you, but with Python it can be much tougher, and some of Python's usual practices make it harder.

Here are a few tips on how to prepare your code (when you write it) for deletion.

#### EAFP vs LBYL

Python tends to favor Easier to Ask Forgiveness than Permission (EAFP) over Look Before You Leap (LBYL). It is thus common to see code like:

try:
    from importlib import reload
except ImportError:
    from imp import reload


In this particular case though, why do we use the try/except? Unless there is a comment attached, it is hard to guess that from imp import reload has been deprecated since Python 3.4, and such a comment can easily get out of sync with the actual code.

A better way would be to explicitly check sys.version_info

if sys.version_info < (3, 4):
    from imp import reload
else:
    from importlib import reload


(Note: tuples of unequal length can be compared in Python.)

It is now obvious which code should be removed and when. You can see this as an application of the "Explicit is better than implicit" rule.
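The tuple comparison that the version check relies on can be tried directly. This is a small sketch with made-up version tuples; the last line uses the real sys.version_info, so its result depends on the interpreter running it.

```python
import sys

# Tuples compare element by element; if all shared positions are equal,
# the shorter tuple is considered smaller.
print((3, 3, 2) < (3, 4))
print((3, 4) < (3, 4, 0))
print((3, 4, 1) < (3, 4))

# sys.version_info behaves like a tuple, so the same comparison applies.
needs_fallback = sys.version_info < (3, 4)
print(needs_fallback)
```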

### Deprecated code

Removing legacy deprecated code is also always a challenge, as you may worry that other libraries still rely on the deprecated functionality. To help with that, let's see how we can improve a typical deprecation. Here is a typical deprecated function from IPython:

def unicode_std_stream(stream='stdout'):
    """DEPRECATED"""
    warn("IPython.utils.io.unicode_std_stream is deprecated", DeprecationWarning)
    ...


How confident are you that you can remove this? A few questions should pop into your head, starting with: since when has this function been deprecated?

def unicode_std_stream(stream='stdout'):
    """DEPRECATED"""
    warn("IPython.utils.io.unicode_std_stream is deprecated since IPython 4.0", DeprecationWarning)
    ...


With this new snippet I'm confident it's been 3 versions, and I am more willing to delete. This also helps downstream libraries know whether they need conditional code or not. I'm still unsure that downstream maintainers have updated their code. Let's add a stacklevel (to help them find where the deprecated function is used), and add more information about how they can replace code that uses this function:

def unicode_std_stream(stream='stdout'):
    """DEPRECATED, moved to nbconvert.utils.io"""
    warn("IPython.utils.io.unicode_std_stream has moved to nbconvert.utils.io since IPython 4.0", DeprecationWarning, stacklevel=2)
    ...


With this information I'm even more confident that downstream maintainers have updated their code. They have an actionable item, replacing one import with another, and are more likely to do that than to dig through history for an hour to figure out what to do.
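To see what a downstream user actually receives, here is a self-contained sketch. The names old_function and new_function and the version "2.0" are hypothetical, chosen only for this example; the warning is captured with the standard warnings module so it can be inspected.

```python
import warnings


def new_function():
    return 42


def old_function():
    """DEPRECATED since 2.0, use new_function instead (hypothetical names)."""
    warnings.warn(
        "old_function is deprecated since 2.0, use new_function instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this line
    )
    return new_function()


# Record warnings instead of printing them, so we can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_function()

warning = caught[0]
print(warning.category.__name__)
print(warning.message)
# With stacklevel=2, warning.lineno is the line of the call to old_function().
print(warning.lineno)
```

The message alone answers the three questions a maintainer will have: what is deprecated, since when, and what to use instead.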

## TLDR

• Be explicit in conditional imports that depend on the version of the underlying Python or library.

• Take the time to write good deprecation warnings, with:

  • a stacklevel (=2 most of the time);
  • since when the function has been deprecated;
  • what consumers should use to replace the deprecated call.

The time you put into these will greatly help your downstream consumers, and will benefit you later by making it easy to get rid of lots of code.

# Sign commits on GitHub

## Signing Commits and Tags on GitHub

I've recently set up Keybase and integrated my public key with git to be able to sign commits.

I decided not to sign automatically, as auto-signing would allow any attacker who takes control of my machine to create signed commits. Git's Merkle tree still ensures repos are not tampered with, as long as you run $ git fsck --full on a repo, or set $ git config --global transfer.fsckobjects true once and forget about it.

Using $ git log --show-signature you can now check that commits (and tags) are correctly signed. Be careful though: a correct signature does not mean a trusted one, and even if you have a PGP key set up, GitHub will helpfully sign the commits you make on their platform with their own key.

* commit 5ced6c6936563fea7ba7efccecbc4248d84cfabb (tag: 5.2.1, origin/5.2.x, 5.2.x)
| gpg: Signature made Tue Jan 2 19:51:17 2018 CET
| gpg: using RSA key 99B17F64FD5C94692E9EF8064968B2CC0208DCC8
| gpg: Good signature from "Matthias Bussonnier <bussonniermatthias@gmail.com>" [ultimate]
| Author: Matthias Bussonnier <bussonniermatthias@gmail.com>
| Date: Tue Jan 2 19:49:34 2018 +0100
|
| Bump version number to 5.2.1 for release
|
* commit 5a28fb0a121c286e35db309fe11b53693969b2d6
|\ gpg: Signature made Tue Jan 2 13:58:08 2018 CET
| | gpg: using RSA key 4AEE18F83AFDEB23
| | gpg: Good signature from "GitHub (web-flow commit signing) <noreply@github.com>" [unknown]
| | gpg: WARNING: This key is not certified with a trusted signature!
| | gpg: There is no indication that the signature belongs to the owner.
| | Primary key fingerprint: 5DE3 E050 9C47 EA3C F04A 42D3 4AEE 18F8 3AFD EB23
| | Merge: 3fd21bc 065a16a
| | Author: Min RK <benjaminrk@gmail.com>
| | Date: Tue Jan 2 13:58:08 2018 +0100
| |
| | Merge pull request #326 from jupyter/auto-backport-of-pr-325
| |
| | Backport PR #325 on branch 5.2.x
| |
| * commit 065a16aad2e84d506b36bb2c874a7c287c53c61f (origin/pr/326)
|/ Author: Min RK <benjaminrk@gmail.com>
| Date: Tue Jan 2 10:57:13 2018 +0100
|
| Backport PR #325: Parenthesize conditional requirement in setup.py

So in the previous block, you can see that 5ced6c6... was made and signed by me, while 5a28fb0... was allegedly made by Min, but signed by GitHub. By default you do not have the GitHub key locally, so the GitHub-signed commits can appear as unverified. To fix this, fetch the GitHub key:

$ gpg --keyserver hkp://keys.gnupg.net --recv-keys 4AEE18F83AFDEB23

Where 4AEE18F83AFDEB23 is the key you do not have locally. And remember: a valid signature does not mean a trusted one.

### Verifying Tags

Tags can be signed, and need to be checked independently of commits :

# Open in Binder Chrome Extension

Two weeks ago I was pleased to announce the release of the Open-with-Binder for Firefox extension.

After asking on twitter if people were interested in the same for Chrome (29 Yes, 67 No, 3 Other) and pondering whether or not to pay the Chrome Developer Fee for the Chrome App store, I decided to take my chance and try to publish it last week.

I pretty much just had to use the Mozilla WebExt Shim for Chrome, downgrade some of the artwork from SVG to PNG (like really??) and upload everything by hand (like really, again?).

The Chrome Store has way more fields and is quite complicated, compared to the Mozilla Addons website at least. It is sometimes confusing whether fields are optional or not, or whether they are per addon or per developer.

It does, though, allow you to upload more art that will be shown in the store, which looks nicer.

Still, I had to go through a really ugly crappy website, and had to pay for it, to publish a free extension. So Mozilla, you win this one.

Please rate the extension, or it may not appear in search results for others AFAICT:

install Open with Binder for chrome

It works identically to the Firefox one, you get a button on the toolbar and click on it when visiting GitHub.

Enjoy.

# Open in Binder Browser Extension

Today I am pleased to announce the release of a first project I've been working on for about a week: a Firefox extension to open the GitHub repository you are visiting using MyBinder.org.

If you are in a hurry, just head there to Install version 0.1.0 for Firefox. If you'd like to know more, read on.

### Back to Firefox.

I've been using Chrome for a couple of years now, but heard a lot of good stuff about Rust and all the good it has done for Firefox. OK, that's a bit of marketing, but it got me to retry Firefox (Nightly please), and except for my password manager, which took some weeks to update to the new Firefox API, I rapidly ended up barely using Chrome.

### MyBinder.org

I'm also spending more and more time working with the JupyterHub team on Binder, and I see more and more developers adding Binder badges to their repositories. In the middle of last week I thought:

You know what's not optimal? It's painful to open on MyBinder.org repositories that don't have the Binder badge; also, sometimes you have to hunt for the badge, which is at the bottom of the readme.

You know what would be great to fix that ? A button in the toolbar doing the work for me.

### Writing the extension

As I know Mozilla (which has a not so great new design BTW, but that's a personal opinion) cares about standards and about making things simple for their users, I thought I would have a look at the new WebExtensions.

And 7 days later, after a couple of 30-minute breaks, I present to you a staggering 27-line extension (including 7 lines of business logic) that does just that:

(function() {
    function handleClick(){
        browser.tabs.query({active: true, currentWindow: true})
            .then((tabs) => {return tabs[0]})
            .then((tab) => {
                let url = new URL(tab.url);
                if (url.hostname != 'github.com'){
                    console.warn('Open in binder only works on GitHub repositories for now.');
                    return;
                }
                // pathname is "/user/repo/...", so parts[0] is the empty string.
                let parts = url.pathname.split('/');
                if (parts.length < 3){
                    console.warn('While you are on GitHub, You do not appear to be in a github repository. Aborting.');
                    return;
                }
                let my_binder_url = 'https://mybinder.org/v2/gh/'+parts[1] +'/'+parts[2] +'/master';
                console.info('Opening ' + url + ' using mybinder.org... enjoy !')
                browser.tabs.create({'url':my_binder_url});
            })

    }

    // Run handleClick when the toolbar button is clicked.
    browser.browserAction.onClicked.addListener(handleClick);

    console.info('❤️ If you are reading this then you know about binder and javascript. ❤️');
    console.info('❤️ So you\'re skilled enough to contribute ! We\'re waiting for you on https://github.com/jupyterhub/ ❤️');
})()


You can find the original source here

The hardest part was finding the API and learning how to package and set the icons correctly. There are still plenty of missing features and really low-hanging fruit, even if you have never written an extension before (hey, it's my first and I averaged 1 useful line per day writing it...).

### General Feeling

Remember that I'm new to that and started a week ago.

The Mozilla docs are good but vary a lot in quality; it feels like (and is) a wiki. More opinionated tutorials might have been less confusing. A lot of statements are correct, but not quite, and leaving the choice to users is just confusing. For example: you can use SVG or PNG icons, which I did, but then some places don't like SVG (addons.mozilla.org), and WebExtensions should work on Chrome, but Chrome requires PNG. Telling me that I could use SVG was not useful.

The review of addons is blazingly fast (7 min from first submission to human-approved). Apple could learn from that, if what I've heard here and there is correct.

The submission process has way too many manual steps. I'm OK with that for a first submission, but for updates, really? I want to be able to fill in all the information ahead of time (or generate it) and then have a CLI to submit things. I hate filling in forms online.

The first submission, even if marked beta, will not be considered beta. So basically I published a 0.1.0beta1, then a 0.1.0beta2 which did not trigger an automatic update because beta1 was not considered beta. Super confusing. I could "force" my way to the beta3 page, but with a warning that beta3 was an older version than beta1? What?

There is still this feeling that the last 1% of polishing the process has not been done (that's usually where Apple is known to shine). For example, your store icon will be resized to 64x64 (px) and displayed in a 64x64 (px) square, but I have a retina screen! So even though I submitted a 128x128 icon, it now looks blurry! WTF!

### You can contribute

As I said earlier, there is a lot of low-hanging fruit! I went through the process of figuring things out so that you can contribute easily:

• detect if not on /master/ and craft corresponding binder URL
• Switch Icons to PNGs
• test/package for Chrome
• Add options for other binders than MyBinder.org
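As a starting point for the first item, here is a rough sketch of the URL logic, written in Python rather than in the extension's JavaScript so it can be tried in a notebook. The function name binder_url and its default_ref parameter are hypothetical; the sketch assumes GitHub pages of the form /user/repo/tree/ref/... or /user/repo/blob/ref/..., and falls back to master otherwise.

```python
from urllib.parse import urlparse


def binder_url(page_url, default_ref="master"):
    """Return a mybinder.org URL for a GitHub page, or None if not a repo."""
    url = urlparse(page_url)
    if url.hostname != "github.com":
        return None
    # "/user/repo/tree/branch/..." -> ["user", "repo", "tree", "branch", ...]
    parts = [p for p in url.path.split("/") if p]
    if len(parts) < 2:
        return None
    user, repo = parts[0], parts[1]
    ref = default_ref
    # Pages such as /tree/<ref>/... and /blob/<ref>/... carry the ref in the URL.
    if len(parts) >= 4 and parts[2] in ("tree", "blob"):
        ref = parts[3]
    return f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}"


print(binder_url("https://github.com/ipython/ipython"))
print(binder_url("https://github.com/ipython/ipython/tree/7.x/docs"))
```

One known limitation of this approach: a branch name that itself contains a slash cannot be told apart from a path inside the repository using the URL alone.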

So see you there !

# JupyterCon - Display Protocol

This is an early preview of what I am going to talk about at JupyterCon.

## Leveraging the Jupyter and IPython display protocol

This is a small essay to show how one can make better use of the display protocol. All you will see in this blog post has been available for a couple of years, but no one has really built on top of it.

It is well known that the IPython rich display mechanism allows library authors to define rich representations for their objects. You may have seen it in SymPy, which makes extensive use of the LaTeX representation, and in Pandas, whose dataframes have a nice HTML view.

What I'm going to show below, is that one is not limited to these – you can alter the representation of any existing object without modifying its source – and that this can be used to alter the view of containers, with the example of lists, to make things easy to read.

### Modifying objects reprs

This section is just a reminder of how one can define representations for objects whose source code is under your control. When defining a class, the author needs to define a number of methods, each of which should return the (data, metadata) pair for a given mimetype. If no metadata is necessary, it can be omitted. For some common representations, short method names are available. These methods can be recognized as they all follow the pattern _repr_*_(self): an underscore, followed by repr, followed by an underscore, then the star *, and finally a single underscore. The star needs to be replaced by a lowercase identifier, often a short human-readable description of the format (e.g. png, html, pretty, ...). Note that unlike Python's __repr__ (pronounced "Dunder rep-er"), which starts and ends with two underscores, the "rich reprs" or "repr-stars" start and end with a single underscore.

Here is the class definition of a simple object that implements three of the rich representation methods:

• "text/html" via the _repr_html_ method
• "text/latex" via the _repr_latex_ method
• "text/markdown" via the _repr_markdown_ method

None of these methods return a tuple, thus IPython will infer that there is no metadata associated.

The "text/plain" mimetype representation is provided by the classical Python's __repr__(self).

In [1]:
class MultiMime:

    def __repr__(self):
        return "this is the repr"

    def _repr_html_(self):
        return "This <b>is</b> html"

    def _repr_markdown_(self):
        return "This **is** markdown"

    def _repr_latex_(self):
        # Raw string, so that the backslash reaches LaTeX untouched.
        return r"$Latex \otimes mimetype$"

In [2]:
MultiMime()

Out[2]:
This is html

All the mimetype representations will be sent to the frontend (in many cases the notebook web interface), and the richest one will be picked and displayed to the user. All representations are stored in the notebook document (on disk), so one can be chosen when the document is later reopened, even with no kernel attached, or converted to another format.
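To make the idea of "all representations at once" concrete, here is a simplified sketch in plain Python. It is not IPython's implementation, which does considerably more; the names REPR_METHODS, collect_reprs and Example are made up for this illustration, and only three of the short method names are handled.

```python
# Short method names and the mimetypes they stand for (a small subset).
REPR_METHODS = {
    "text/html": "_repr_html_",
    "text/markdown": "_repr_markdown_",
    "text/latex": "_repr_latex_",
}


def collect_reprs(obj):
    """Gather every representation an object offers, keyed by mimetype."""
    data = {"text/plain": repr(obj)}
    metadata = {}
    for mimetype, name in REPR_METHODS.items():
        method = getattr(obj, name, None)
        if method is None:
            continue
        result = method()
        # A method may return either data alone or a (data, metadata) pair.
        if isinstance(result, tuple):
            result, meta = result
            metadata[mimetype] = meta
        data[mimetype] = result
    return data, metadata


class Example:
    def __repr__(self):
        return "plain text"

    def _repr_html_(self):
        return "<b>html</b>"

    def _repr_markdown_(self):
        return "**markdown**", {"note": "hypothetical metadata"}


data, metadata = collect_reprs(Example())
print(sorted(data))
print(metadata)
```

A frontend receiving such a dictionary is then free to pick whichever mimetype it knows how to render best.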

### External formatters and containers

As stated in the introduction, you do not need to have control over an object's source code to change its representation. Still, it is often a more convenient process. As an example we will build a container for image thumbnails and see how we can use the code written for this custom container to apply it to generic Python containers like lists.

As a visual example we'll use O'Rly parody book covers, in particular a low-resolution version of some of them, so as to limit the amount of data we'll be working with.

In [3]:
cd thumb

/Users/bussonniermatthias/dev/posts/thumb


Let's see some of the images present in this folder:

In [4]:
names = !ls *.png
names[:20], f"{len(names) - 10} more"

Out[4]:
(['10x-big.png',
'arbitraryforecasts-big.png',
'avoiddarkpatterns-big.png',
'blamingthearchitecture-big.png',
'blamingtheuser-big.png',
'breakingthebackbutton-big.png',
'buzzwordfirst-big.png',
'buzzwordfirstdesign-big.png',
'casualsexism-big.png',
'catchingemall-big.png',
'changinstuff-big.png',
'choosingbasedongithubstars-big.png',
'codingontheweekend-big.png',
'coffeeintocode-big.png',
'copyingandpasting-big.png',
'crushingit-big.png',
'deletingcode-big.png',
'doingwhateverdanabramovsays-big.png'],
'63 more')

In the above I've used an IPython-specific syntax (!ls) to conveniently extract all the files with a png extension (*.png) in the current working directory, and assigned the result to the names variable.

That's cute but, for images, not really useful. We know we can display images in the Jupyter notebook when using the IPython kernel; for that we can use the Image class from the IPython.display submodule. We can construct such an object simply by passing the filename. Image already provides a rich representation:

In [5]:
from IPython.display import Image

In [6]:
im = Image(names[0])
im

Out[6]:

The raw data from the image file is available via the .data attribute:

In [7]:
im.data[:20]

Out[7]:
b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01\x90'

What if we map Images to each element of a list ?

In [8]:
from random import choices
mylist = list(map(Image, set(choices(names, k=10))))
mylist

Out[8]:
[<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>,
<IPython.core.display.Image object>]

Well, unfortunately a list object only knows how to represent itself using text and the text representation of its elements. We'll have to build a thumbnail gallery ourselves.

First let's (re)build an HTML representation to display a single image:

In [9]:
import base64
from IPython.display import HTML
def tag_from_data(data, size='100%'):
    return (
        '''<img
        style="display:inline;
        width:{1};
        max-width:400px;
        margin-top:14px"
        src="data:image/png;base64,{0}"
        />
        ''').format(''.join(base64.encodebytes(data).decode().split('\n')), size)


We encode the data from bytes to base64 (newline separated) and strip the newlines. We format that into an HTML template, with some inline style, and set the source (src) to be this base64-encoded string. We can check that this displays correctly by wrapping the whole thing in an HTML object, which provides a convenient _repr_html_.

In [10]:
HTML(tag_from_data(im.data))

Out[10]:
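The encoding step can be checked on its own, without IPython. The sketch below uses made-up bytes that merely start with the PNG signature (not a real image) and compares the "encode then strip newlines" approach with base64.b64encode, which produces the same text without newlines.

```python
import base64

# The 8-byte PNG signature followed by filler bytes, standing in for image data.
data = b"\x89PNG\r\n\x1a\n" + bytes(range(100))

# encodebytes inserts a newline every 76 characters, which must be removed.
wrapped = base64.encodebytes(data).decode()
stripped = "".join(wrapped.split("\n"))

# b64encode produces the same text without any newline.
direct = base64.b64encode(data).decode()
print(stripped == direct)

src = "data:image/png;base64," + direct
# Decoding the payload gives back the original bytes.
print(base64.b64decode(src.split(",", 1)[1]) == data)
```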

Now we can create our own class, which takes a list of images and constructs an HTML representation for each of them, then joins them together. We define a _repr_html_ that wraps the whole thing in a paragraph tag and adds a comma between each image:

In [11]:
class VignetteList:

    def __init__(self, *images, size=None):
        self.images = images
        self.size = size

    def _repr_html_(self):
        return '<p>'+','.join(tag_from_data(im.data, self.size)  for im in self.images)+'</p>'

    def _repr_latex_(self):
        # Raw string, so that the backslash reaches LaTeX untouched.
        return r'$O^{rly}_{books} (%s\ images)$ ' % (len(self.images))



We also define a LaTeX representation, which we will not use here, and look at our newly created object using the previously defined list:

In [12]:
VignetteList(*mylist, size='200px')

Out[12]:

[rendered output: the thumbnails displayed inline, separated by commas]