Scientific Python on an M1 MacBook Pro

For the past five years I've been working on a 2015 Intel MacBook Pro which is starting to show its age. I've been pondering getting a new machine, as it was becoming difficult to be on a video call and do anything else at the same time. I tried Macs with Touch Bars, but the lack of function keys was a deal-breaker for me. I was considering the Framework laptop, but ended up getting a new 2021 MacBook Pro (base model). Since it is Apple Silicon, I knew it was likely going to be problematic, so here is my experience getting most of my Python stack working on it.

Array, slices and indexing

Array, Slices and Fancy indexing

In this small post we'll investigate (quickly) the difference between slices (which are part of the Python standard library) and numpy arrays, and how these can be used for indexing. First, let's create a matrix containing integers, so that the element at index i,j has value 10i+j for convenience.

In [1]: import numpy as np
        from copy import copy

Let's create a single row, that is to say a matrix of height 1 and width the number of elements. We'll use -1 in reshape to mean "whatever is necessary". For 2d matrices and tensors it's not super useful, but for higher-dimensional objects it can be quite convenient.

In [2]: X = np.arange(0, 10).reshape(1, -1)
        X
Out[2]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

Now a column, same trick.

In [3]: Y = (10*np.arange(0, 8).reshape(-1, 1))
        Y
Out[3]: array([[ 0], [10], [20], [30], [40], [50], [60], [70]])

By summing, and the rules of "broadcasting", we get a nice rectangular matrix.

In [4]: R = np.arange(5*5*5*5*5).reshape(5,5,5,5,5)

In [5]: M = X+Y
        M
Out[5]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 44, 45, 46, 47, 48, 49], [50, 51, 52, 53, 54, 55, 56, 57, 58, 59], [60, 61, 62, 63, 64, 65, 66, 67, 68, 69], [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

Slicing

Quick intro about slicing. You have likely used it before if you've encountered the object[12:34] or object[42:96:3] notation. The X:Y:Z part is a slice. This way of writing a slice is allowed only between square brackets, for indexing. X, Y and Z are optional and default to whatever is convenient, so ::3 (every third), :7 and :7: (until 7), and : and :: (everything) are valid slices. A slice is an efficient object that (usually) represents "from X to Y, every Z"; it is not limited to numbers.
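As an aside (not in the original transcript), the standard library lets you check exactly how those omitted parts are resolved: every slice object has an indices method that fills in the defaults for a given sequence length.

```python
# A slice stores start/stop/step exactly as written, with None for omitted parts.
s = slice(None, None, 3)          # written ::3 between square brackets
print(s.start, s.stop, s.step)    # None None 3

# .indices(length) resolves the defaults against a sequence of that length,
# returning a concrete (start, stop, step) triple.
print(slice(None, None, 3).indices(10))  # (0, 10, 3)  -- every third
print(slice(None, 7).indices(10))        # (0, 7, 1)   -- until 7
print(slice(None).indices(10))           # (0, 10, 1)  -- everything

# Negative indices are resolved too, relative to the length.
print(slice(-3, None).indices(10))       # (7, 10, 1)
```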
In [6]: class PhylosophicalArray:
            def __getitem__(self, sl):
                print(f"From {sl.start} to {sl.stop} every {sl.step}.")

        arr = PhylosophicalArray()
        arr['cow':'phone':'traffic jam']
From cow to phone every traffic jam.

You can construct a slice using the slice builtin, which is (sometimes) convenient, and use it in place of x:y:z.

In [7]: sl = slice('cow', 'phone', 'traffic jam')

In [8]: arr[sl]
From cow to phone every traffic jam.

In multidimensional arrays, slices of width 0 or 1 can be used to keep dimensions that indexing with a scalar would drop.

In [9]: M[:, 3]  # column at index 3, now a vector.
Out[9]: array([ 3, 13, 23, 33, 43, 53, 63, 73])

In [10]: M[:, 3:4]  # now an (N, 1) matrix.
Out[10]: array([[ 3], [13], [23], [33], [43], [53], [63], [73]])

This is convenient when indices represent various quantities, for example an atmospheric ensemble where dimension 1 is latitude, 2: longitude, 3: height, 4: temperature, 5: pressure, and you want to focus on height==0 without having to shift the temperature index from 4 to 3, pressure from 5 to 4... Zero-width slices are mostly used to simplify algorithms, to avoid having to check for edge cases.

In [11]: a = 3
         b = 3
         M[a:b]
Out[11]: array([], shape=(0, 10), dtype=int64)

In [12]: M[a:b] = a-b

In [13]: M  # M is not modified!
Out[13]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 44, 45, 46, 47, 48, 49], [50, 51, 52, 53, 54, 55, 56, 57, 58, 59], [60, 61, 62, 63, 64, 65, 66, 67, 68, 69], [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

When indexing an array, you slice each dimension individually. Here we extract the center block of the matrix, not the 3 diagonal elements.
In [14]: M[4:7, 4:7]
Out[14]: array([[44, 45, 46], [54, 55, 56], [64, 65, 66]])

In [15]: sl = slice(4, 7)
         sl
Out[15]: slice(4, 7, None)

In [16]: M[sl, sl]
Out[16]: array([[44, 45, 46], [54, 55, 56], [64, 65, 66]])

Let's change the sign of the biggest square block in the upper left of this matrix.

In [17]: K = copy(M)
         el = slice(0, min(K.shape))
         el
Out[17]: slice(0, 8, None)

In [18]: K[el, el] = -K[el, el]
         K
Out[18]: array([[ 0, -1, -2, -3, -4, -5, -6, -7, 8, 9], [-10, -11, -12, -13, -14, -15, -16, -17, 18, 19], [-20, -21, -22, -23, -24, -25, -26, -27, 28, 29], [-30, -31, -32, -33, -34, -35, -36, -37, 38, 39], [-40, -41, -42, -43, -44, -45, -46, -47, 48, 49], [-50, -51, -52, -53, -54, -55, -56, -57, 58, 59], [-60, -61, -62, -63, -64, -65, -66, -67, 68, 69], [-70, -71, -72, -73, -74, -75, -76, -77, 78, 79]])

That's about it for slices, and it was already a lot. In the next section we'll talk about arrays.

Fancy indexing

Arrays are more or less what you've seen in other languages: finite sequences of discrete values.

In [19]: ar = np.arange(4, 7)
         ar
Out[19]: array([4, 5, 6])

When you index with arrays, the elements of each array are taken together.

In [20]: M[ar, ar]
Out[20]: array([44, 55, 66])

We now get a partial diagonal in our matrix.
It does not have to be a diagonal:

In [21]: M[ar, ar+1]
Out[21]: array([45, 56, 67])

The result of this operation is a one-dimensional array (which, unlike a slice, is a copy of the selected elements: fancy indexing cannot in general be a view on the initial matrix memory). In the same way as we flipped the sign of the largest block in the previous section, we'll try indexing with the same values:

In [22]: S = copy(M)

In [23]: el = np.arange(min(S.shape))
         el
Out[23]: array([0, 1, 2, 3, 4, 5, 6, 7])

In [24]: S[el, el] = -S[el, el]
         S
Out[24]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [ 10, -11, 12, 13, 14, 15, 16, 17, 18, 19], [ 20, 21, -22, 23, 24, 25, 26, 27, 28, 29], [ 30, 31, 32, -33, 34, 35, 36, 37, 38, 39], [ 40, 41, 42, 43, -44, 45, 46, 47, 48, 49], [ 50, 51, 52, 53, 54, -55, 56, 57, 58, 59], [ 60, 61, 62, 63, 64, 65, -66, 67, 68, 69], [ 70, 71, 72, 73, 74, 75, 76, -77, 78, 79]])

Here we flipped the value of only the diagonal elements. It of course did not have to be the diagonal elements:

In [25]: S[el, el+1]
Out[25]: array([ 1, 12, 23, 34, 45, 56, 67, 78])

In [26]: S[el, el+1] = 0
         S
Out[26]: array([[ 0, 0, 2, 3, 4, 5, 6, 7, 8, 9], [ 10, -11, 0, 13, 14, 15, 16, 17, 18, 19], [ 20, 21, -22, 0, 24, 25, 26, 27, 28, 29], [ 30, 31, 32, -33, 0, 35, 36, 37, 38, 39], [ 40, 41, 42, 43, -44, 0, 46, 47, 48, 49], [ 50, 51, 52, 53, 54, -55, 0, 57, 58, 59], [ 60, 61, 62, 63, 64, 65, -66, 0, 68, 69], [ 70, 71, 72, 73, 74, 75, 76, -77, 0, 79]])

Nor are we required to use the same elements only once:

In [27]: el-1
Out[27]: array([-1, 0, 1, 2, 3, 4, 5, 6])

In [28]: sy = np.array([0, 1, 2, 0, 1, 2])
         sx = np.array([1, 2, 3, 1, 2, 3])
         ld = S[sx, sy]  # select 3 elements of the lower diagonal twice
         ld
Out[28]: array([10, 21, 32, 10, 21, 32])

More in the SciPy Lecture Notes, the NumPy quickstart, and the Python Data Science Handbook.

Some experiments

In [29]: S = copy(M)
         S[0:10, 0:10] = 0
         S
Out[29]: array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [30]: S = copy(M)
         S[0:10:2, 0:10] = 0
         S
Out[30]: array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [50, 51, 52, 53, 54, 55, 56, 57, 58, 59], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

In [31]: S = copy(M)
         S[0:10, 0:10:2] = 0
         S
Out[31]: array([[ 0, 1, 0, 3, 0, 5, 0, 7, 0, 9], [ 0, 11, 0, 13, 0, 15, 0, 17, 0, 19], [ 0, 21, 0, 23, 0, 25, 0, 27, 0, 29], [ 0, 31, 0, 33, 0, 35, 0, 37, 0, 39], [ 0, 41, 0, 43, 0, 45, 0, 47, 0, 49], [ 0, 51, 0, 53, 0, 55, 0, 57, 0, 59], [ 0, 61, 0, 63, 0, 65, 0, 67, 0, 69], [ 0, 71, 0, 73, 0, 75, 0, 77, 0, 79]])

In [32]: S = copy(M)
         S[0:10:2, 0:10:2] = 0
         S
Out[32]: array([[ 0, 1, 0, 3, 0, 5, 0, 7, 0, 9], [10, 11, 12, 13, 14, 15, 16, 17, 18, 19], [ 0, 21, 0, 23, 0, 25, 0, 27, 0, 29], [30, 31, 32, 33, 34, 35, 36, 37, 38, 39], [ 0, 41, 0, 43, 0, 45, 0, 47, 0, 49], [50, 51, 52, 53, 54, 55, 56, 57, 58, 59], [ 0, 61, 0, 63, 0, 65, 0, 67, 0, 69], [70, 71, 72, 73, 74, 75, 76, 77, 78, 79]])

In [33]: S = copy(M)
         S[0:10:2, 0:10] = 0
         S[0:10, 0:10:2] = 0
         S
Out[33]: array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 0, 11, 0, 13, 0, 15, 0, 17, 0, 19], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 0, 31, 0, 33, 0, 35, 0, 37, 0, 39], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 0, 51, 0, 53, 0, 55, 0, 57, 0, 59], [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [ 0, 71, 0, 73, 0, 75, 0, 77, 0, 79]])

In [34]: S = copy(M)
         S[0:8, 0:8] = 0
         S
Out[34]: array([[ 0, 0, 0, 0, 0, 0, 0, 0, 8, 9], [ 0, 0, 0, 0, 0, 0, 0, 0, 18, 19], [ 0, 0, 0, 0, 0, 0, 0, 0, 28, 29], [ 0, 0, 0, 0, 0, 0, 0, 0, 38, 39], [ 0, 0, 0, 0, 0, 0, 0, 0, 48, 49], [ 0, 0, 0, 0, 0, 0, 0, 0, 58, 59], [ 0, 0, 0, 0, 0, 0, 0, 0, 68, 69], [ 0, 0, 0, 0, 0, 0, 0, 0, 78, 79]])

In [35]: S = copy(M)
         S[np.arange(0, 8), np.arange(0, 8)] = 0
         S
Out[35]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 0, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 0, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 0, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 0, 45, 46, 47, 48, 49], [50, 51, 52, 53, 54, 0, 56, 57, 58, 59], [60, 61, 62, 63, 64, 65, 0, 67, 68, 69], [70, 71, 72, 73, 74, 75, 76, 0, 78, 79]])

In [36]: S = copy(M)
         S[range(0, 8), range(0, 8)] = 0
         S
Out[36]: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [10, 0, 12, 13, 14, 15, 16, 17, 18, 19], [20, 21, 0, 23, 24, 25, 26, 27, 28, 29], [30, 31, 32, 0, 34, 35, 36, 37, 38, 39], [40, 41, 42, 43, 0, 45, 46, 47, 48, 49], [50, 51, 52, 53, 54, 0, 56, 57, 58, 59], [60, 61, 62, 63, 64, 65, 0, 67, 68, 69], [70, 71, 72, 73, 74, 75, 76, 0, 78, 79]])

In [37]: S = copy(M)
         S[np.arange(0, 10), np.arange(0, 10)] = 0  ## will fail
         S
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
      1 S = copy(M)
----> 2 S[np.arange(0, 10), np.arange(0, 10)] = 0 ## will fail
      3 S

IndexError: index 8 is out of bounds for axis 0 with size 8
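As an aside (not in the original transcript), the failing cell can be avoided entirely for this common case: numpy ships np.fill_diagonal, which writes the diagonal in place and, on a wide matrix like this one, simply stops at the last row instead of raising. A short sketch, rebuilding M so the block is self-contained:

```python
import numpy as np

# Rebuild the 8x10 matrix from the post: element (i, j) has value 10*i + j.
M = np.arange(0, 10).reshape(1, -1) + 10 * np.arange(0, 8).reshape(-1, 1)

S = M.copy()
# fill_diagonal modifies S in place; on a non-square (wide) matrix it
# fills S[i, i] only while both indices are in bounds -- no IndexError.
np.fill_diagonal(S, 0)
print(S[0, 0], S[7, 7])  # 0 0   (diagonal zeroed)
print(S[0, 1], S[7, 8])  # 1 78  (off-diagonal untouched)
```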

Sign commits on GitHub

Signing Commits and Tags on GitHub

I've recently set up Keybase and integrated my public key with git to be able to sign commits. I decided not to sign automatically, as auto-signing would allow any attacker who takes control of my machine to create signed commits. Git's Merkle tree still ensures repos are not tampered with, as long as you issue $ git fsck --full on a repo, or set $ git config --global transfer.fsckobjects true once and forget it. Using $ git log --show-signature you can now check that commits (and tags) are correctly signed. Be careful though: a correct signature does not mean trusted, and if you have a PGP key set, GitHub will helpfully sign the commits you make on their platform with their key.

* commit 5ced6c6936563fea7ba7efccecbc4248d84cfabb (tag: 5.2.1, origin/5.2.x, 5.2.x)
| gpg: Signature made Tue Jan 2 19:51:17 2018 CET
| gpg:                using RSA key 99B17F64FD5C94692E9EF8064968B2CC0208DCC8
| gpg: Good signature from "Matthias Bussonnier " [ultimate]
| Author: Matthias Bussonnier
| Date:   Tue Jan 2 19:49:34 2018 +0100
|
|     Bump version number to 5.2.1 for release
|
* commit 5a28fb0a121c286e35db309fe11b53693969b2d6
|\  gpg: Signature made Tue Jan 2 13:58:08 2018 CET
| | gpg:                using RSA key 4AEE18F83AFDEB23
| | gpg: Good signature from "GitHub (web-flow commit signing) " [unknown]
| | gpg: WARNING: This key is not certified with a trusted signature!
| | gpg:          There is no indication that the signature belongs to the owner.
| | Primary key fingerprint: 5DE3 E050 9C47 EA3C F04A 42D3 4AEE 18F8 3AFD EB23
| | Merge: 3fd21bc 065a16a
| | Author: Min RK
| | Date:   Tue Jan 2 13:58:08 2018 +0100
| |
| |     Merge pull request #326 from jupyter/auto-backport-of-pr-325
| |
| |     Backport PR #325 on branch 5.2.x
| |
| * commit 065a16aad2e84d506b36bb2c874a7c287c53c61f (origin/pr/326)
|/  Author: Min RK
|   Date: Tue Jan 2 10:57:13 2018 +0100
|
|       Backport PR #325: Parenthesize conditional requirement in setup.py

So in the previous block, you can see that 5ced6c6... has been done and signed by me, while 5a28fb0... has allegedly been done by Min, but signed by GitHub. By default you do not have the GitHub signature locally, so GitHub-signed commits can appear as unverified. To fix that, fetch the GitHub key:

$ gpg --keyserver hkp://keys.gnupg.net --recv-keys 4AEE18F83AFDEB23

Where 4AEE18F83AFDEB23 is the key you do not have locally. And remember: a valid signature does not mean trusted.

Verifying tags

Tags can be signed, and need to be checked independently of commits:

$ git tag --verify 5.2.1
object 5ced6c6936563fea7ba7efccecbc4248d84cfabb
type commit
tag 5.2.1
tagger Matthias Bussonnier 1514919438 +0100

release version 5.2.1
gpg: Signature made Tue Jan 2 19:57:18 2018 CET
gpg:                using RSA key 99B17F64FD5C94692E9EF8064968B2CC0208DCC8
gpg: Good signature from "Matthias Bussonnier " [ultimate]

So you can check that I tagged this commit.

Learn more

As usual, the git documentation has more to say about this. And signing is not really useful without checking the integrity of Git history, so please set $ git config --global transfer.fsckobjects true as well!

Open in Binder Chrome Extension

Two weeks ago I was pleased to announce the release of the Open-with-Binder for Firefox extension. After asking on Twitter whether people were interested in the same for Chrome (29 Yes, 67 No, 3 Other), and pondering whether or not to pay the developer fee for the Chrome Web Store, I decided to take my chances and try to publish it last week. I almost just had to use the Mozilla WebExt shim for Chrome, downgrade a few artworks from SVG to PNG (like, really??), and upload everything by hand – like, really, again? The Chrome store has way more fields and is quite complicated – compared to the Mozilla add-ons website at least. It is sometimes confusing whether fields are optional or not, or whether they are per-addon or per-developer. It does, though, allow you to upload more art that will be shown in a store page that looks nicer. Still, I had to go through a really ugly, crappy website, and pay, to publish a free extension. So Mozilla, you win this one. Please rate the extension, or it may not appear in search results for others, AFAICT: install Open with Binder for Chrome. It works identically to the Firefox one: you get a button on the toolbar, and click it when visiting GitHub. Enjoy.

JupyterCon - Display Protocol

This is an early preview of what I am going to talk about at JupyterCon.

Leveraging the Jupyter and IPython display protocol

This is a small essay to show how one can make better use of the display protocol. Everything you will see in this blog post has been available for a couple of years, but no one has really built on top of it. It is usually known that the IPython rich display mechanism allows library authors to define rich representations for their objects. You may have seen it in SymPy, which makes extensive use of the LaTeX representation, and pandas, whose dataframes have a nice HTML view. What I'm going to show below is that one is not limited to these – you can alter the representation of any existing object without modifying its source – and that this can be used to alter the view of containers, with the example of lists, to make things easier to read.

Modifying objects' reprs

This section is just a reminder of how one can define representations for objects whose source code is under your control. When defining a class, the code author needs to define a number of methods which should return the (data, metadata) pair for a given object mimetype. If no metadata is necessary, it can be omitted. For some common representations, short method names are available. These methods can be recognized as they all follow the pattern _repr_*_(self): an underscore, followed by repr, followed by an underscore. The star * needs to be replaced by a lowercase identifier, often referring to a short, human-readable description of the format (e.g.: png, html, pretty, ...), and finishes with a single underscore. Note that unlike Python's __repr__ (pronounced "dunder repr"), which starts and ends with two underscores, the "rich reprs" or "reprs-stars" start and end with a single underscore.
Here is the class definition of a simple object that implements three of the rich representation methods: "text/html" via the _repr_html_ method, "text/latex" via the _repr_latex_ method, and "text/markdown" via the _repr_markdown_ method. None of these methods returns a tuple, thus IPython will infer that there is no metadata associated. The "text/plain" mimetype representation is provided by the classical Python __repr__(self).

In [1]: class MultiMime:

            def __repr__(self):
                return "this is the repr"

            def _repr_html_(self):
                return "This is html"

            def _repr_markdown_(self):
                return "This **is** markdown"

            def _repr_latex_(self):
                return "$Latex \otimes mimetype$"

In [2]: MultiMime()
Out[2]: This is html

All the mimetype representations will be sent to the frontend (in many cases the notebook web interface), and the richest one will be picked and displayed to the user. All representations are stored in the notebook document (on disk), so the representation can be chosen when the document is later reopened – even with no kernel attached – or converted to another format.

External formatters and containers

As stated in the introduction, you do not need to have control over an object's source code to change its representation, though having it is often more convenient. As an example we will build a container for image thumbnails, and see how we can use the code written for this custom container to apply it to generic Python containers like lists. As a visual example we'll use Orly parody book covers, in particular small-resolution versions of some of them, to limit the amount of data we'll be working with.
In [3]: cd thumb
/Users/bussonniermatthias/dev/posts/thumb

Let's see some of the images present in this folder:

In [4]: names = !ls *.png
        names[:20], f"{len(names) - 10} more"
Out[4]: (['10x-big.png', 'adulting-big.png', 'arbitraryforecasts-big.png', 'avoiddarkpatterns-big.png', 'blamingthearchitecture-big.png', 'blamingtheuser-big.png', 'breakingthebackbutton-big.png', 'buzzwordfirst-big.png', 'buzzwordfirstdesign-big.png', 'casualsexism-big.png', 'catchingemall-big.png', 'changinstuff-big.png', 'chasingdesignfads-big.png', 'choosingbasedongithubstars-big.png', 'codingontheweekend-big.png', 'coffeeintocode-big.png', 'copyingandpasting-big.png', 'crushingit-big.png', 'deletingcode-big.png', 'doingwhateverdanabramovsays-big.png'], '63 more')

In the above I've used an IPython-specific syntax (!ls) to conveniently extract all the files with a png extension (*.png) in the current working directory, and assign the result to the names variable.

That's cute, but, for images, not really useful. We know we can display images in the Jupyter notebook when using the IPython kernel; for that we can use the Image class located in the IPython.display submodule. We can construct such an object simply by passing the filename. Image already provides a rich representation:

In [5]: from IPython.display import Image

In [6]: im = Image(names[0])
        im
Out[6]: (the image is displayed)

The raw data from the image file is available via the .data attribute:

In [7]: im.data[:20]
Out[7]: b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01\x90'

What if we map Image onto each element of a list?

In [8]: from random import choices
        mylist = list(map(Image, set(choices(names, k=10))))
        mylist
Out[8]: [<IPython.core.display.Image object>, <IPython.core.display.Image object>, ..., <IPython.core.display.Image object>]

Well, unfortunately, a list object only knows how to represent itself using text and the text representation of its elements. We'll have to build a thumbnail gallery ourselves.
First, let's (re)build an HTML representation for displaying a single image (the exact HTML template was lost from this page; the minimal <img> tag below is a reconstruction consistent with the description that follows):

In [9]: import base64
        from IPython.display import HTML

        def tag_from_data(data, size='100%'):
            return ('''<img style="display:inline"
                            src="data:image/png;base64,{}" width="{}"/>''').format(
                ''.join(base64.encodebytes(data).decode().split('\n')), size)

We encode the data from bytes to base64 (newline separated) and strip the newlines. We format that into an HTML template – with some inline style – and set the source (src) to be this base64-encoded string. We can check that this displays correctly by wrapping the whole thing in an HTML object, which provides a convenient _repr_html_:

In [10]: HTML(tag_from_data(im.data))
Out[10]: (the image is displayed)

Now we can create our own subclass, which takes a list of images and constructs an HTML representation for each of them, then joins them together. We define a _repr_html_ that wraps it all in a paragraph tag and adds a comma between each image:

In [11]: class VignetteList:

             def __init__(self, *images, size=None):
                 self.images = images
                 self.size = size

             def _repr_html_(self):
                 return '<p>' + ','.join(
                     tag_from_data(im.data, self.size) for im in self.images) + '</p>'

             def _repr_latex_(self):
                 return '$O^{rly}_{books} (%s\ images)$ ' % (len(self.images))

We also define a LaTeX representation – that we will not use here – and look at our newly created object using the previously defined list:

In [12]: VignetteList(*mylist, size='200px')
Out[12]: (the thumbnails are displayed, separated by commas)

That is nice, though it forces us to explicitly unpack all the lists we have into a VignetteList – which may be annoying. Let's clean up the above a bit, and register an external formatter for the "text/html" mimetype that should be used for any object which is a list. We'll also improve the formatter to recurse into objects. That is to say: if it's an Image, return the PNG data in an <img> tag; if it's an object that has a text/html representation, use that; otherwise, use the repr. With this we lose some of the nice formatting of text lists from the pretty module; we could easily fix that, but we leave it as an exercise for the reader. We're also going to recurse into objects that have an HTML representation – that is to say, make it work with lists of lists.

In [13]: def tag_from_data_II(data, size='100%'):
             return '''<img style="display:inline"
                            src="data:image/png;base64,{}" width="{}"/>'''.format(
                 ''.join(base64.encodebytes(data).decode().split('\n')), size)

         def html_list_formatter(ll):
             html = get_ipython().display_formatter.formatters['text/html']
             reps = []
             for o in ll:
                 if isinstance(o, Image):
                     reps.append(tag_from_data_II(o.data, '200px'))
                 else:
                     h = html(o)
                     if h:
                         reps.append(h)
                     else:
                         reps.append(repr(o))
             return '[' + ','.join(reps) + ']'

Same as before, with a square bracket before and after, and a bit of styling that changes the drop shadow on hover.
Now we register the above with IPython:

In [14]: ipython = get_ipython()
         html = ipython.display_formatter.formatters['text/html']
         html.for_type(list, html_list_formatter)

In [15]: mylist
Out[15]: (the list now renders as a gallery of thumbnails, between square brackets)

Disp

External integrations for some already-existing objects are available in disp; in particular you will find representations for SparkContext, requests' Response objects (collapsible JSON content and headers), as well as a couple of others.

Magic integration

The above demonstration shows that a kernel is more than a language: it is a controlling process that manages user requests (in our case code execution) and how the results are returned to the user. There is often the assumption that a kernel is a single language; this is incorrect, as a kernel process may manage several languages and can orchestrate data movement from one language to another. In the following we can see how a Python process makes use of what we have defined above to make SQL queries returning rich results. We also see that the execution of SQL queries has side effects in the Python namespace, showing how the kernel can orchestrate things.

In [16]: %load_ext fakesql

In [17]: try:
             rly
         except NameError:
             print('rly not defined')
rly not defined

In [18]: %%sql
         SELECT name,cover from orly WHERE color='red' LIMIT 10
Out[18]: [['buzzwordfirst-big.png', (cover)], ['buzzwordfirstdesign-big.png', (cover)], ['goodenoughtoship-big.png', (cover)], ['noddingalong-big.png', (cover)], ['resumedrivendevelopment-big.png', (cover)], ['takingonneedlessdependencies-big.png', (cover)]]

In [19]: rly[2]
Out[19]: ['goodenoughtoship-big.png', (cover)]

It would not be hard to have modifications of the Python namespace affect the SQL database – this is left as an exercise to the reader as well (hint: use properties) – or to have integration with other languages like R, Julia, ...
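The registration pattern itself does not depend on anything IPython-specific. Here is a toy, stdlib-only stand-in (MiniFormatter is a hypothetical name, not an IPython class) sketching the dispatch-by-type idea behind display_formatter:

```python
# A toy version of IPython's display formatter: a registry mapping types
# to formatter functions, consulted before falling back to repr().
class MiniFormatter:
    def __init__(self):
        self.formatters = {}  # type -> callable returning a string, or None

    def for_type(self, typ, func):
        """Register func as the formatter for instances of typ."""
        self.formatters[typ] = func

    def format(self, obj):
        func = self.formatters.get(type(obj))
        if func is not None:
            out = func(obj)
            if out is not None:
                return out
        return repr(obj)  # fallback, like the plain-text mimetype

html = MiniFormatter()
# Register a formatter for plain lists that recurses into elements --
# the same shape as html.for_type(list, html_list_formatter) above.
html.for_type(list, lambda ll: "[" + ",".join(html.format(o) for o in ll) + "]")

print(html.format([1, [2, 3]]))  # [1,[2,3]]
```

The real IPython formatter adds mimetype negotiation and metadata on top, but the core mechanism is this same per-type registry.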
Note: This notebook was initially written to display prototype features of IPython and the Jupyter notebook, in particular completion of cell magics (for the SQL cell) and a UI element allowing switching between the shown mimetypes. These are not reflected in static rendering and are not mentioned in the text, which may lead to a confusing read.

Writing an async REPL - Part 1

This is the first part in a series of blog posts explaining how I implemented the ability to await code at the top-level scope in the IPython REPL. Don't expect the second part soon, or bother me for it. I know I should write it, but time is a rare luxury. It is an interesting adventure into how Python code gets executed, and I must admit it changed quite a bit how I understand Python code nowadays, and made me even more excited about async/await in Python. It should also dive quite a bit into the internals of Python/CPython, if you are ever interested in what some of these things are.

In [1]: # we cheat and deactivate the new IPython feature to match the Python repl behavior
        %autoawait False

Async or not async, that is the question

You might not have noticed it, but since Python 3.5 the following is valid Python syntax:

In [2]: async def a_function():
            async with contextmanager() as f:
                result = await f.get('stuff')
            return result

So you've been curious and read a lot about asyncio, and may have come across a few new libraries like aiohttp and all the aio-libs, heard about sans-io, read complaints about the different approaches we can take and how we could maybe do better. You vaguely understand the concept of loops and futures, but the term coroutine is still unclear. So you decide to poke around yourself in the REPL.
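(A stdlib-only warm-up sketch, not part of the original experiments, showing the three moving pieces you'll see below: a coroutine, a loop, and run_until_complete:)

```python
import asyncio

async def add(a, b):
    # pretend to do some I/O before returning
    await asyncio.sleep(0)
    return a + b

# The loop drives coroutines to completion; run_until_complete blocks
# until the coroutine returns and hands back its result.
loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(add(1, 2))
finally:
    loop.close()
print(result)  # 3
```

Note that calling add(1, 2) does not run anything by itself: it only builds a coroutine object, which is inert until a loop drives it.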
In [3]: import aiohttp

In [4]: print(aiohttp.__version__)
        coro_req = aiohttp.get('https://api.github.com')
        coro_req
1.3.5
Out[4]:

In [5]: import asyncio
        res = asyncio.get_event_loop().run_until_complete(coro_req)

In [6]: res
Out[6]:

In [7]: res.json()
Out[7]:

In [8]: json = asyncio.get_event_loop().run_until_complete(res.json())
        json
Out[8]: {'authorizations_url': 'https://api.github.com/authorizations', 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}', 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}', 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}', 'current_user_repositories_url': 'https://api.github.com/user/repos{?type,page,per_page,sort}', 'current_user_url': 'https://api.github.com/user', 'emails_url': 'https://api.github.com/user/emails', 'emojis_url': 'https://api.github.com/emojis', 'events_url': 'https://api.github.com/events', 'feeds_url': 'https://api.github.com/feeds', 'followers_url': 'https://api.github.com/user/followers', 'following_url': 'https://api.github.com/user/following{/target}', 'gists_url': 'https://api.github.com/gists{/gist_id}', 'hub_url': 'https://api.github.com/hub', 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}', 'issues_url': 'https://api.github.com/issues', 'keys_url': 'https://api.github.com/user/keys', 'notifications_url': 'https://api.github.com/notifications', 'organization_repositories_url': 'https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}', 'organization_url': 'https://api.github.com/orgs/{org}', 'public_gists_url': 'https://api.github.com/gists/public', 'rate_limit_url': 'https://api.github.com/rate_limit', 'repository_search_url': 'https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}', 'repository_url': 'https://api.github.com/repos/{owner}/{repo}', 'starred_gists_url': 'https://api.github.com/gists/starred', 'starred_url': 'https://api.github.com/user/starred{/owner}{/repo}', 'team_url': 'https://api.github.com/teams', 'user_organizations_url': 'https://api.github.com/user/orgs', 'user_repositories_url': 'https://api.github.com/users/{user}/repos{?type,page,per_page,sort}', 'user_search_url': 'https://api.github.com/search/users?q={query}{&page,per_page,sort,order}', 'user_url': 'https://api.github.com/users/{user}'}

It's a bit painful to pass everything to run_until_complete. You know how to write an async-def function and pass it to an event loop:

In [9]: loop = asyncio.get_event_loop()
        run = loop.run_until_complete
        url = 'https://api.github.com/rate_limit'

        async def get_json(url):
            res = await aiohttp.get(url)
            return await res.json()

        run(get_json(url))
Out[9]: {'rate': {'limit': 60, 'remaining': 50, 'reset': 1491508909}, 'resources': {'core': {'limit': 60, 'remaining': 50, 'reset': 1491508909}, 'graphql': {'limit': 0, 'remaining': 0, 'reset': 1491511760}, 'search': {'limit': 10, 'remaining': 10, 'reset': 1491508220}}}

Good! And then you wonder: why do I have to wrap things in a function? If I have a default loop, isn't it obvious where I want to run my code? Can't I await things directly? So you try:

In [10]: await aiohttp.get(url)
  File "", line 1
    await aiohttp.get(url)
                ^
SyntaxError: invalid syntax

What? Oh, that's right, there is no way in Python to set a default loop... but a SyntaxError? Well, that's annoying.

Outsmart Python

Fortunately you (in this case me) are in control of the REPL. You can bend it to your will. Surely you can do something.
First you try to remember how a REPL works:

In [11]: mycode = """
         a = 1
         print('hey')
         """

         def fake_repl(code):
             import ast
             module_ast = ast.parse(mycode)
             bytecode = compile(module_ast, '', 'exec')
             global_ns = {}
             local_ns = {}
             exec(bytecode, global_ns, local_ns)
             return local_ns

         fake_repl(mycode)
hey
Out[11]: {'a': 1}

We don't show global_ns as it is huge; it will contain all that's available by default in Python. Let's see where it fails if you try a top-level await statement:

In [12]: import ast
         mycode = """
         import aiohttp
         await aiohttp.get('https://aip.github.com/')
         """
         module_ast = ast.parse(mycode)
  File "", line 3
    await aiohttp.get('https://aip.github.com/')
                ^
SyntaxError: invalid syntax

Ouch, so we can't even compile it. Let's be smart: can we get at the inner code if we wrap it in an async def?

In [13]: mycode = """
         async def fake():
             import aiohttp
             await aiohttp.get('https://aip.github.com/')
         """
         module_ast = ast.parse(mycode)
         ast.dump(module_ast)
Out[13]: "Module(body=[AsyncFunctionDef(name='fake', args=arguments(args=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Import(names=[alias(name='aiohttp', asname=None)]), Expr(value=Await(value=Call(func=Attribute(value=Name(id='aiohttp', ctx=Load()), attr='get', ctx=Load()), args=[Str(s='https://aip.github.com/')], keywords=[])))], decorator_list=[], returns=None)])"

In [14]: ast.dump(module_ast.body[0])
Out[14]: "AsyncFunctionDef(name='fake', args=arguments(args=[], vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None, defaults=[]), body=[Import(names=[alias(name='aiohttp', asname=None)]), Expr(value=Await(value=Call(func=Attribute(value=Name(id='aiohttp', ctx=Load()), attr='get', ctx=Load()), args=[Str(s='https://aip.github.com/')], keywords=[])))], decorator_list=[], returns=None)"

As a reminder, AST stands for Abstract Syntax Tree; you may construct an AST which is not a valid Python program, like an if-else-else. AST trees can be modified.
What we are interested in is the body of the function, which is itself the first object of a dummy module:

In [15]: body = module_ast.body[0].body
         body
Out[15]: [<_ast.Import at 0x...>, <_ast.Expr at 0x...>]

Let's pull out the body of the function and put it at the top level of a newly created module:

In [16]: async_mod = ast.Module(body)
         ast.dump(async_mod)
Out[16]: "Module(body=[Import(names=[alias(name='aiohttp', asname=None)]), Expr(value=Await(value=Call(func=Attribute(value=Name(id='aiohttp', ctx=Load()), attr='get', ctx=Load()), args=[Str(s='https://aip.github.com/')], keywords=[])))])"

Mouahahahahahahahahah, you managed to get a valid top-level async AST! Victory is yours!

In [17]: bytecode = compile(async_mod, '', 'exec')
  File "", line 4
SyntaxError: 'await' outside function

Grumlgrumlgruml. You haven't said your last word, though; you're going to take your revenge later. Let's see what we can do in Part II, not written yet.
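(A spoiler for readers on modern Python, beyond what this post covers: since Python 3.8 the compile built-in accepts the ast.PyCF_ALLOW_TOP_LEVEL_AWAIT flag, which is essentially where this adventure ends up. A sketch of how it sidesteps the SyntaxError above:)

```python
import ast
import asyncio
import inspect

src = "import asyncio\nawait asyncio.sleep(0)\nx = 40 + 2"

# With this flag the compiler accepts top-level await and, instead of
# raising SyntaxError, marks the resulting code object as a coroutine.
code = compile(src, "<repl>", "exec", flags=ast.PyCF_ALLOW_TOP_LEVEL_AWAIT)
assert code.co_flags & inspect.CO_COROUTINE

# eval() of a coroutine-flagged code object returns a coroutine; drive it
# on a loop, and the side effects land in the namespace we passed in.
ns = {}
coro = eval(code, ns)
loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(coro)
finally:
    loop.close()
print(ns["x"])  # 42
```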