No Pylab Thanks

2013-12-30 19:00

Please Stop using Pylab

[Edit] Thanks to Randy Olson for sending me a PR with grammar and spelling corrections.

TL;DR: Please stop advising people to use the pylab flag when using IPython. It is harmful. If you want to help IPython, try to avoid retweeting and promoting things that use the pylab flag, and make the authors aware of the issues. Use explicit imports and the %matplotlib inline magic. It is better (and supports switching between inline and not inline).

This was mainly prompted by a day where I came across consecutive issues due to the pylab flag, and people happy to discover it.

Pylab, my worst friend

When IPython 1.0 was released almost 6 months ago, we had quite a few decisions to make in less than a week, and loads of discussions at the last moment. One of those decisions comes and goes often: we want to get rid of this stupid pylab flag. If you look at our stable (1.0) dev doc and examples, there shouldn't be a mention of the pylab flag anywhere. If there is, we will quickly remove it. Why? Because it is harmful, first to us, then to new users, to the Python community, and finally to research in general.

Do you know about pylab? If you don't, please be extra careful and try to avoid it as much as possible. It is like smoking: it looks cool at first, then after a few years you realise you can live without it, but it makes you sick.

You think you know what pylab does? Really? Take a few seconds to think about what the pylab flag is doing, then read the rest.

What is it supposed to do?

The main purpose of the pylab package was to be a transition tool from other languages to Python. As it became more and more common, and painful, to do the same imports in IPython every time when only the IPython shell was around, the --pylab flag was added.
Basically, it did the following:

    import numpy
    import matplotlib
    from matplotlib import pylab, mlab, pyplot
    np = numpy
    plt = pyplot
    from IPython.core.pylabtools import figsize, getfigs
    from pylab import *
    from numpy import *

Did you get it right the first time without cheating? Are you able to say what has been imported by from pylab import *? By from numpy import *? Of course, that is not the only thing it does. But was it your intention to do that the last time you used pylab?

It is irreversible

Once you activate pylab mode, there is no going back. You cannot unimport things. Of course, with the %pylab magic you can always restart your kernel, but with the flag, all your kernels will start in pylab mode. Are you sure you will not need a non-pylab kernel?

Unclear

When using the pylab flag, your audience usually has no way of knowing that you used the flag. If I use plot(range(10)), what will happen? Will it pop up a figure? Inline it? Or throw an error?

In [ ]:
    plot(range(10))

Because I'm mean, I won't execute the cell, so that you don't know whether or not I'm running in pylab mode. You might think it's not a big deal, but in teaching and research it is important to communicate exactly what you are doing.

It pollutes the namespace

In [1]:
    len(get_ipython().user_ns.keys())
Out[1]: 20

In [2]:
    %pylab
Using matplotlib backend: Agg
Populating the interactive namespace from numpy and matplotlib

IPython 2.0-dev gives some warnings, and will tell you whether it clobbers non-built-in variables already in the user namespace.

In [3]:
    len(get_ipython().user_ns.keys())
Out[3]: 968

Can you really tell me the 948 additional things you now have in your namespace? Not a big deal? Really?

It replaces built-ins

In [6]:
    sum, all

Both names now point to NumPy functions. Both used to be Python built-ins before using pylab, and this changes the behavior of your programs! For example, it leads to non-picklable objects in some cases:

@treycausey on Twitter: @ogrisel @myusuf3 Ah, very curious.
It loads fine in IPython but not in an IPython notebook. I must have a namespace collision or something.

Me: Let me guess: $ ipython notebook --pylab? But $ ipython alone?

Other unrelated actions

When IPython developed the rich display protocol, we added from IPython.display import display to what import pylab does. Because matplotlib can work with different event loops, pylab gained options to activate the qt/gtk/osx/etc. event loops and, with the creation of the QtConsole and the Notebook, the ability to select the inline backend, which registers display hooks.

The main thing users remember, and often the main message that you can read here and there, is:

You need to use Pylab to get inline figures.

Which is false. Inline pylab mode is a convenient way to activate inline figures and set up a display hook with matplotlib. You do not need pylab to have inline images, nor do you need matplotlib. With 1.0 and above, the recommended way to set up inline figures is to use %matplotlib.

Side Effects

The worst part is that it has the side effect of making people use pylab without even knowing it. Later on, they start asking questions about how to reuse a graph in matplotlib, because no one ever encounters the following code any more:

    fig,ax = subplots(1,1)
    ax.plot(...)

or why their script suddenly does not work in plain Python but works in IPython.

It kills kittens, with a spoon...

Maybe not kittens, but it will force developers to explain the same things, again, and again, and again...

In [8]:
    from IPython.display import YouTubeVideo
    YouTubeVideo('9VDvgL58h_Y',start=4*60+27)

Make --pylab disappear.

Of course, we will not remove the --pylab flag, for compatibility reasons, but please, for 2014, make the following resolutions:

Use %matplotlib [inline|qt|osx|gtk] to select the right backend/event hook.
Use explicit imports.
Make people aware of the issues.
Help by writing no new articles/tutorials that mention --pylab.

If you really only have 10 seconds left, the %pylab magic is fine.

As usual, this has been written in an IPython Notebook. You can send me a pull request if I made mistakes, or if you have any additional remarks on why you shouldn't use --pylab.
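
The built-in shadowing described in this post can be reproduced without numpy or matplotlib installed. The following is a minimal sketch, not what IPython itself runs: fake_numpy is a hypothetical stand-in module that, like numpy, exports its own sum and all, and names_shadowing_builtins is a helper made up for this illustration.

```python
import builtins
import sys
import types

# Hypothetical stand-in for numpy: a module exporting its own `sum` and `all`.
fake_numpy = types.ModuleType("fake_numpy")
exec(
    "def sum(values): return 'fake_numpy.sum'\n"
    "def all(values): return 'fake_numpy.all'\n",
    fake_numpy.__dict__,
)
sys.modules["fake_numpy"] = fake_numpy


def names_shadowing_builtins(namespace):
    """Names in `namespace` that hide a Python built-in of the same name."""
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("__")
        and hasattr(builtins, name)
        and getattr(builtins, name) is not value
    )


# What a star import does to a fresh namespace...
star_ns = {}
exec("from fake_numpy import *", star_ns)

# ...versus an explicit, prefixed import.
explicit_ns = {}
exec("import fake_numpy as np", explicit_ns)

print(names_shadowing_builtins(star_ns))      # ['all', 'sum']
print(names_shadowing_builtins(explicit_ns))  # []
```

With the real libraries, the explicit equivalent of the names used in this post is import numpy as np and import matplotlib.pyplot as plt, plus %matplotlib inline when working in a notebook.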

Matplotlib and IPython love child

2013-12-01 14:00

This is an experiment in how one could try to improve matplotlib's configuration system using IPython's traitlets system. It probably has a huge number of disadvantages, ranging from slowing down the whole matplotlib stack to making every class name matter for backward compatibility. Far from being a perfect love child of two awesome projects, this is for now a horrible mutant that, in some places, inspects up its stack to find information about its caller. It would need a fair amount of work to be nicely integrated into matplotlib.

Warning

This post has been written with a patched version of Matplotlib, so you will not be able to reproduce this post by re-executing this notebook. I've also ripped out part of IPython's configurable system into a small self-contained package that you would need.

TL;DR

Here is an example where the color of any Text object of matplotlib is configurable as long as the object that creates it is (also) an Artist, with minimal modification of matplotlib.

In [1]:
    %pylab inline
    from IPConfigurable.configurable import Config
    matplotlib.rc('font',size=20)
Populating the interactive namespace from numpy and matplotlib

Here is the interesting part, where one can see that everything is magically configurable.
In [2]:
    matplotlib.config = Config()
    # by default all Text are now purple
    matplotlib.config.Text.t_color = 'purple'
    # Except Text created by X/Y Axes, which will be red/aqua
    matplotlib.config.YAxis.Text.t_color='red'
    matplotlib.config.XAxis.Text.t_color='aqua'
    # If this is the text of a Tick it should be orange
    matplotlib.config.Tick.Text.t_color='orange'
    # unless this is an XTick, then it should be gray-ish:
    # as XTick <: Tick, it will have precedence over Tick
    matplotlib.config.XTick.Text.t_color=(0.4,0.4,0.3)
    # legend
    matplotlib.config.TextArea.Text.t_color='y'
    matplotlib.config.AxesSubplot.Text.t_color='pink'

[Plotting cell and figure: a plt.plot call with label="a sinc", followed by plt.ylabel, plt.xlabel, plt.title, plt.annotate and plt.legend.]

I love Matplotlib

I love Matplotlib and what you can do with it; I am always impressed by how Jake Van Der Plas is able to bend Matplotlib into doing amazing things. That being said, the default colors of matplotlib graphics are not that nice, mainly because of legacy, and even though Matplotlib is highly configurable, many libraries are trying to fix it, and recipes on the net are common.

IPython configuration is magic

I'm not that familiar with Matplotlib's internals, but I'm quite familiar with IPython's. In particular, using a lightweight version of Enthought Traits that we call Traitlets, almost every piece of IPython is configurable. According to Clarke's third law:

Any sufficiently advanced technology is indistinguishable from magic.

So I'll assume that IPython's configuration system is magic. Still, there are some rules you should know.

In IPython, any object that inherits from Configurable can have attributes that are configurable. The name of the configuration option that changes the value of such an attribute is easy: it's Class.attribute = value. If the creator of an object took care of passing a reference to itself, you can nest config as ParentClass.Class.attribute = value; by doing so, only the Class instances created by ParentClass will have the value set.
With a dummy example:

    class Foo(Configurable):
        length = Integer(1,config=True)
        ...

    class Bar(Configurable):
        def __init__(self):
            foo = Foo(parent=self)

    class Rem(Bar):
        pass

The length of every Foo object can be configured with Foo.length=2, or you can target a subset of Foo objects by setting Rem.Foo.length or Bar.Foo.length.

But this might be a little abstract, so let's do a demo with matplotlib.

In [4]:
    cd ~/matplotlib/
/Users/bussonniermatthias/matplotlib

Let's make matplotlib's Artist an IPython Configurable, grab the default config from matplotlib.config if it exists, and pass it to the parent.

    -class Artist(object):
    +class Artist(Configurable):

    -    def __init__(self):
    +    def __init__(self, config=None, parent=None):
    +
    +        c = getattr(matplotlib,'config',Config({}))
    +        if config :
    +            c.merge(config)
    +        super(Artist, self).__init__(config=c, parent=parent)

Now we will define two attributes of Patch (a subclass of Artist), t_color and t_lw, which are respectively a Color and a Float, and use them as the default color and line width of the current Patch.

     class Patch(artist.Artist):
    +    t_color = MaybeColor(None,config=True)
    +    t_lw = MaybeFloat(None,config=True)
    +
         ...
         if linewidth is None:
    -        linewidth = mpl.rcParams['patch.linewidth']
    +        if self.t_lw is not None:
    +            linewidth = self.t_lw
    +        else:
    +            linewidth = mpl.rcParams['patch.linewidth']
         ...
         if color is None:
    -        color = mpl.rcParams['patch.facecolor']
    +        if self.t_color is not None:
    +            color = self.t_color
    +        else :
    +            color = mpl.rcParams['patch.facecolor']

One could also set _t_color_default to mpl.rcParams['patch.facecolor'], but that becomes complicated to explain.

That's enough

This is the minimum needed for this to work: we can now magically and independently configure any subclass of Patch. We know that Wedge, Ellipse,...
and others are part of this category, so let's play with their t_color.

In [5]:
    # some minimal imports
    import matplotlib.pyplot as plt; import numpy as np
    import matplotlib.path as mpath
    import matplotlib.lines as mlines
    import matplotlib.patches as mpatches
    from matplotlib.collections import PatchCollection

In [6]:
    matplotlib.config = Config({
        "Wedge" :{"t_color":"0.4"},
        "Ellipse" :{"t_color":(0.9, 0.3, 0.7)},
        "Circle" :{"t_color":'red'},
        "Arrow" :{"t_color":'green'},
        "RegularPolygon":{"t_color":'aqua'},
        "FancyBboxPatch":{"t_color":'y'},
    })

Let's see what this gives:

In [7]:
    """ example derived from http://matplotlib.org/examples/shapes_and_collections/artist_reference.html """
    fig, ax = plt.subplots()
    grid = np.mgrid[0.2:0.8:3j, 0.2:0.8:3j].reshape(2, -1).T
    patches = []
    patches.append(mpatches.Circle(grid[0], 0.1,ec="none"))
    patches.append(mpatches.Rectangle(grid[1] - [0.025, 0.05], 0.05, 0.1, ec="none"))
    patches.append(mpatches.Wedge(grid[2], 0.1, 30, 270, ec="none"))
    patches.append(mpatches.RegularPolygon(grid[3], 5, 0.1))
    patches.append(mpatches.Ellipse(grid[4], 0.2, 0.1))
    patches.append(mpatches.Arrow(grid[5, 0]-0.05, grid[5, 1]-0.05, 0.1, 0.1, width=0.1))
    patches.append(mpatches.FancyBboxPatch(
        grid[7] - [0.025, 0.05], 0.05, 0.1,
        boxstyle=mpatches.BoxStyle("Round", pad=0.02)))
    collection = PatchCollection(patches, match_original=True)
    ax.add_collection(collection)
    plt.subplots_adjust(left=0, right=1, bottom=0, top=1)
    plt.axis('equal')
    plt.axis('off')
    plt.show()

It works!!! Isn't that great? Free configuration for all Artists, as long as you don't explicitly set the color, of course.

Let's be ugly.

We need slightly more to have nested configuration: each Configurable has to be passed the parent keyword, but Matplotlib is not made to pass the parent keyword to every Artist it creates, which prevents the use of nested configuration. Still, using inspect, we can try to get a handle on the parent by walking up the stack.
Add the following to the Artist constructor:

    import inspect

    def __init__(self, config=None, parent=None):
        i_parent = inspect.currentframe().f_back.f_back.f_locals.get('self',None)
        if (i_parent is not self) and (parent is not i_parent) :
            if (isinstance(i_parent,Configurable)):
                parent = i_parent
        ....

Let's patch Text to also accept a t_color configurable, because Text is a good candidate for nested configurability:

     class Text(Artist):
    +    t_color = MaybeColor(None,config=True)
         if color is None:
    -        color = rcParams['text.color']
    +        if self.t_color is not None:
    +            color = self.t_color
    +        else :
    +            color = rcParams['text.color']
    +        if self.t_color is not None:
    +            color = self.t_color
    +
         if fontproperties is None:
             fontproperties = FontProperties()

Now we should be able to make default Text always purple. The nice thing about Config objects is that, once created, they accept access to any attribute with dot notation.

In [8]:
    matplotlib.config.Text.t_color = 'purple'

In [9]:
    fig,ax = plt.subplots(1,1)
    plt.plot(sinc(arange(0,6,0.1)))
    plt.ylabel('SinC(x)')
    plt.title('SinC of X')

OK, not much further than the current matplotlib configuration, right? We also know that XAxis and YAxis inherit from Axis, which itself inherits from Artist. Both are responsible for creating the x- and y-labels.

In [10]:
    matplotlib.config.YAxis.Text.t_color='r'
    matplotlib.config.XAxis.Text.t_color='aqua'

The same goes for Tick, XTick and YTick. I can of course set a parameter on the root class:

In [11]:
    matplotlib.config.Tick.Text.t_color='orange'

and override it for a specific subclass:

In [12]:
    matplotlib.config.XTick.Text.t_color='gray'

In [13]:
    fig,ax = plt.subplots(1,1)
    plt.plot(sinc(arange(0,6,0.1)))
    plt.ylabel('SinC(x)')
    plt.title('SinC of X')

This is, as far as I know, not possible to do with the current matplotlib configuration system. At least not without adding an rc-param for each and every imaginable combination.
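
The precedence rules used above (a nested ParentClass.Class entry beats a flat Class entry, and a subclass beats its base class) can be sketched in a few lines of plain Python. This is only a toy model written for this explanation, not how traitlets or the patched matplotlib actually resolve values: the resolve and class_names helpers, the plain-dict config and the empty stand-in classes are all assumptions made for the illustration.

```python
class Artist:
    """Toy stand-in for a configurable object that remembers who created it."""

    def __init__(self, parent=None):
        self.parent = parent


class Text(Artist):
    pass


class YAxis(Artist):
    pass


class Tick(Artist):
    pass


class XTick(Tick):
    pass


class YTick(Tick):
    pass


def class_names(obj):
    """Class names of `obj`, base classes first, most specific class last."""
    return [cls.__name__ for cls in reversed(type(obj).__mro__)]


def resolve(obj, attr, config, default=None):
    """Look up `attr` for `obj`; later (more specific) matches override earlier ones."""
    value = default
    own_names = class_names(obj)
    # Flat sections such as config["Text"]: a subclass overrides its bases.
    for name in own_names:
        value = config.get(name, {}).get(attr, value)
    # Nested sections such as config["XTick"]["Text"] win over flat ones.
    if obj.parent is not None:
        for parent_name in class_names(obj.parent):
            section = config.get(parent_name, {})
            for name in own_names:
                value = section.get(name, {}).get(attr, value)
    return value


config = {
    "Text": {"t_color": "purple"},
    "YAxis": {"Text": {"t_color": "red"}},
    "Tick": {"Text": {"t_color": "orange"}},
    "XTick": {"Text": {"t_color": "gray"}},
}

for parent in (None, YAxis(), YTick(), XTick()):
    print(type(parent).__name__, resolve(Text(parent=parent), "t_color", config))
```

In this toy, a Text with no parent falls back to the flat Text entry, a Text created by a YTick picks up the Tick.Text entry through inheritance, and a Text created by an XTick gets the more specific XTick.Text entry.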
What more?

The first thing is that this makes it trivial for external libraries to plug into matplotlib's configuration system to have their own defaults and configurable defaults. You can also, of course, refine configurability by using small no-op classes that inherit from base classes and give them meaning. In particular, right now Ticks are separated into XTick and YTick with a major/minor attribute. They should probably be refactored into MajorTick/MinorTick. With that you can mix and match configuration from the most global Axis.Tick.value=... to the more precise YAxis.MinorTick.

Let's do an example with a custom artist that creates 2 kinds of circles. We'll need a custom no-op class that inherits from Circle.

In [14]:
    class CenterCircle(mpatches.Circle):
        pass

In [15]:
    matplotlib.config = Config()
    matplotlib.config.Circle.t_color='red'
    matplotlib.config.CenterCircle.t_color='aqua'

In [16]:
    from IPConfigurable.configurable import Configurable
    import math

    class MyGenArtist(Configurable):

        def n_circle(self, x,y,r,n=3):
            pi = math.pi
            sin,cos = math.sin, math.cos
            l= []
            for i in range(n):
                l.append(mpatches.Circle( ## here Circle
                    (x+2*r*cos(i*2*pi/n),y+2*r*sin(i*2*pi/n)),
                    r,
                    ec="none",
                ))
            l.append(CenterCircle((x,y),r)) ## Here CenterCircle
            return l

    fig, ax = plt.subplots()
    patches = []
    patches.extend(MyGenArtist().n_circle(4,1,0.5))
    patches.extend(MyGenArtist().n_circle(2,4,0.5,n=6))
    patches.extend(MyGenArtist().n_circle(1,1,0.5,n=5))
    collection = PatchCollection(patches, match_original=True)
    ax.add_collection(collection)
    plt.subplots_adjust(left=0, right=1, bottom=0, top=1)
    plt.axis('equal')
    plt.axis('off')
Out[16]: (-1.0, 6.0, -1.0, 6.0)

What's next?

This configuration system is, of course, not limited to Matplotlib. But to be used elsewhere, it should probably first be decoupled into a separate package, independent of IPython. Also, if it is ever accepted into matplotlib, the current mechanisms will still need to be adapted to work on top of it.
Bonus

This patched version of matplotlib keeps track of all the Configurables it discovers while you use it. Here is a non-exhaustive list.

In [17]:
    from matplotlib import artist
    print "-------------------------"
    print "Some single configurables"
    print "-------------------------"
    for k in sorted(artist.Artist.s):
        print k
    print ""
    print "----------------------------------"
    print "Some possible nested configurables"
    print "----------------------------------;"
    for k in sorted(artist.Artist.ps):
        print k

-------------------------
Some single configurables
-------------------------
Annotation
Arrow
AxesSubplot
CenterCircle
Circle
DrawingArea
Ellipse
FancyBboxPatch
Figure
HPacker
Legend
Line2D
PatchCollection
Rectangle
RegularPolygon
Spine
Text
TextArea
VPacker
Wedge
XAxis
XTick
YAxis
YTick

----------------------------------
Some possible nested configurables
----------------------------------;
AxesSubplot.Legend
AxesSubplot.Text
AxesSubplot.XAxis
AxesSubplot.YAxis
TextArea.Text
XAxis.Text
XTick.Text
YAxis.Text
YTick.Text

Dear DiffLib

2013-08-01 20:00

Dear difflib, What The F@! ?

TL;DR

According to difflib, the following strings: 'baaaaa', 'aaabaa' and 'aaaaba' share 5 characters with 'aaaaaa', but 'aabaaa' shares only 3... but only if the change is in the first half of the string.

In [1]:
    import difflib
    print sum(x.size for x in difflib.SequenceMatcher(None, '-aaaaa', 'aaaaaa').get_matching_blocks())
    print sum(x.size for x in difflib.SequenceMatcher(None, 'a-aaaa', 'aaaaaa').get_matching_blocks())
    print sum(x.size for x in difflib.SequenceMatcher(None, 'aa-aaa', 'aaaaaa').get_matching_blocks())
    print sum(x.size for x in difflib.SequenceMatcher(None, 'aaa-aa', 'aaaaaa').get_matching_blocks())
    print '- Control -'
    print sum(x.size for x in difflib.SequenceMatcher(None, 'aaaaaa', 'aaaaaa').get_matching_blocks())
5
4
3
5
- Control -
6

It only gets weirder and more fractal if you read the rest.

Context

A few weeks back I was at EuroSciPy, where I had to dive a little deeper than usual into difflib. Indeed, one often-requested feature in IPython is to be able to diff notebooks, so I started looking at how this can be done. Thanks to @MinRK for helping me figure out the rest of this post and for giving me some nice ideas for graphs.

Naturally, I turned toward difflib:

This module provides classes and functions for comparing sequences. It can be used for example, for comparing files, and can produce difference information in various formats, including HTML and context and unified diffs. For comparing directories and files, see also, the filecmp module.

More specifically, I am interested in SequenceMatcher, which aims to be:

[...] a flexible class for comparing pairs of sequences of any type, so long as the sequence elements are hashable.

It is also explicitly stated that the generated diff might not be minimal, but might look more "human readable" than a classical diff (emphasis mine).

The basic algorithm [..] is a little fancier than,[...]
“gestalt pattern matching.” The idea is to find the longest contiguous matching subsequence that contains no “junk” elements [...]. The same idea is then applied recursively to the pieces of the sequences to the left and to the right of the matching subsequence. This does not yield minimal edit sequences, but does tend to yield matches that “look right” to people.

To do that, we need to define "junk", which is the things you don't want the algorithm to match on. Let's see a common example in Python: when you add a function with a decorator, the diff often attributes the decorator to the "next" function.

In [2]:
    s1 = """
    @decorate
    def fun1():
        return 1

    @decorate
    def fun3():
        return 3
    """
    s2 = """
    @decorate
    def fun1():
        return 1

    @decorate
    def fun2():
        return 2

    @decorate
    def fun3():
        return 3
    """

Classical diff

In [3]:
    import difflib
    print ''.join(difflib.ndiff(s1.splitlines(1), s2.splitlines(1)))

      @decorate
      def fun1():
          return 1

      @decorate
    + def fun2():
    +     return 2
    +
    + @decorate
      def fun3():
          return 3

Now we will tell it that blank lines are junk:

In [4]:
    blankline = lambda x:x.strip() ==''
    print ''.join(difflib.ndiff(s1.splitlines(1), s2.splitlines(1), linejunk=blankline))

      @decorate
      def fun1():
          return 1

    + @decorate
    + def fun2():
    +     return 2
    +
      @decorate
      def fun3():
          return 3

This is clearly better, as hunks do not have blank lines in the middle, but rather on the sides.

Where it gets weird

The thing is, SequenceMatcher is also the tool that helps you get the proximity between two strings. If the sequences you pass to SequenceMatcher are strings, it will try to match each character.

In [5]:
    from difflib import SequenceMatcher

In [6]:
    print SequenceMatcher('hello world','hello world').ratio()
    print SequenceMatcher('xyz','abc').ratio()
0.0
0.0

Oh, sorry, you need to explicitly pass the isjunk function. Luckily, it accepts None.
In [7]:
    print SequenceMatcher(None, 'hello world','hello world').ratio()
    print SequenceMatcher(None, 'xyz','abc').ratio()
1.0
0.0

OK, that's better: it goes from 0.0 for completely different strings (nothing in common) to 1.0 for perfectly matching strings. The API is weird, but for compatibility we keep the old one... fine with me. Let's try longer strings...

In [8]:
    print SequenceMatcher(None, 'y'+'abc'*150,'x'+'abc'*150).ratio()
0.0

Still, don't pass it strings longer than 200 chars: it automatically detects junk... Yes, this is documented, but not obvious.

In [9]:
    print SequenceMatcher(None, 'y'+'abc'*150,'x'+'abc'*150, autojunk=False).ratio()
0.9977827051

So let's define a custom SequenceMatcher that makes sense for the rest of the post, so that isjunk is None by default and there is no autojunk for n > 200, plus a simple ratio(a, b) function as a shortcut.

In [10]:
    def MySequenceMatcher( seq1, seq2, isjunk=None, autojunk=False):
        return SequenceMatcher(isjunk, seq1, seq2, autojunk=autojunk)

    def ratio(a,b):
        return MySequenceMatcher(a,b).ratio()

In [10]:
    print ratio('abc','abc')
    print ratio('abcd','abcf')
1.0
0.75

Where it gets weirder

I probably won't go into how we found out the following, but for some reason ratio(a,b) was different from ratio(reversed(a),reversed(b)), that is to say ratio('ab...yz','ab...yz') != ratio('zy...ba','zy...ba') in some cases.

So let's look at what happens if we compare a string against itself with only one modification, as a function of the position of the modification, that is to say the ratio of aaaa... vs baaa... vs abaa... vs aaba...

In [11]:
    %matplotlib inline
    import matplotlib
    import matplotlib.pyplot as plt

In [12]:
    n = 100
    step = 1
    s1 = u'a'*n
    r = lambda x : ratio(x, s1)
    def modified(i):
        """ return the string s1, where the i-th is replaced by -"""
        return s1[:i-1]+u'-'+s1[i+1:]
    xx = range(1, n, step)
    print(s1[:10]),'...'
    for i in range(10):
        print(modified(i)[:10]),'...'
    print('..............')
    distance = map( r, [modified(i) for i in xx])
    plt.plot(xx, distance)
    plt.ylabel('similarity')
    plt.xlabel('position of modified character')
aaaaaaaaaa ...
aaaaaaaaaa ...
-aaaaaaaaa ...
a-aaaaaaaa ...
aa-aaaaaaa ...
aaa-aaaaaa ...
aaaa-aaaaa ...
aaaaa-aaaa ...
aaaaaa-aaa ...
aaaaaaa-aa ...
aaaaaaaa-a ...
..............

WWWHHHHAAAAAT ???

Hum... let's look at some other strings, basically the same as before, but with a repeating sequence: ababab.., abcabcabcabc...

In [13]:
    import string

In [14]:
    def test_rep(k=1, n=128):
        s1 = string.letters[:k]*int(n/k*2)
        s1 = s1[:n]
        r = lambda x : ratio(x, s1)
        def modified(i):
            """ return the string s1, where the i-th is replaced by -"""
            return s1[:i-1]+u'-'+s1[i+1:]
        xx = range(1, n, step)
        distance = map( r, [modified(i) for i in xx])
        return xx,distance

    fig,ax = plt.subplots(1,1)
    fig.set_figwidth(16)
    fig.set_figheight(8)
    for k in [1,2,4,8,16,32]:
        xx, distance = test_rep(k, n=64)
        plt.step(xx, distance, where='post', label='k={k}'.format(k=k))
    plt.ylabel('similarity')
    plt.xlabel('position of modified character')
    plt.legend(loc=4)

(The bottom of the graph is at 0.5, not 0.0.)

Hmmm... this definitely does not look like what I was expecting (mainly something constant), but looks more like a Sierpinski triangle to me.

Bottom line: even if difflib makes some pretty-looking diffs, I cannot trust it to give me the proximity of two sequences.

Still some good sides...

Anyway, I knew what the Levenshtein distance was, but not really what the efficient algorithms were. I played around a little bit with them and recoded them in pure Python. I'll post the link soon; I just need time to clean things up and make them a pure-Python module. Below is what it looks like.

The thing is, as far as I can tell, difflib stays an order of magnitude faster in some cases, and computes the matches at the same time.
In [15]:
    def lcs_len3(Seq1 , Seq2):
        """ Compute the LCS length of 2 sequences

        Do not calculate the matrix and try to be as efficient as possible
        in storing only the minimal amount of elements in memory, mainly the previous
        matrix row + 1 element.
        """
        LL1 = len(Seq1)+1
        LL2 = len(Seq2)+1

        ## we will do the big loop over the longest sequence (L1)
        ## and store the previous row of the matrix (L2+1)
        if LL2 > LL1 :
            Seq2, Seq1 = Seq1, Seq2
            LL2, LL1 = LL1, LL2

        previousrow = [0]*(LL2)
        cindex = 0

        for Seq1ii in Seq1:
            for jj in range(1,LL2):
                cindex = (cindex+1) % LL2

                if Seq1ii == Seq2[jj-1]:
                    if jj == 1:
                        previousrow[cindex] = 1
                    else :
                        previousrow[cindex]+=1
                if Seq1ii != Seq2[jj-1] :
                    up = previousrow[(cindex+1) % LL2]
                    if jj != 1 :
                        left = previousrow[(cindex-1) % LL2]
                        if left > up :
                            previousrow[cindex] = left
                            continue
                    previousrow[cindex] = up
        return previousrow[cindex]

In [16]:
    import numpy as np
    import difflib

In [17]:
    def compare(s1,s2):
        m0 = difflib.SequenceMatcher(None, s1, s2, autojunk=False).get_matching_blocks()
        m1 = lcs_len3(s1,s2)
        a,b = sum([x.size for x in m0]),m1
        return a,b

    for k in range(10):
        s1 = np.random.randint(0,250, 100)
        s2 = np.random.randint(0,250, 100)
        a,b = compare(s1,s2)
        print 'random',u'√' if a == b else 'x',a,b
        a,b = compare(sorted(s1),sorted(s2))
        print 'sorted',u'√' if a == b else 'x',a,b
random x 4 8
sorted √ 23 23
random x 4 9
sorted √ 26 26
random x 6 10
sorted √ 25 25
random x 4 12
sorted √ 27 27
random x 5 7
sorted √ 27 27
random x 7 10
sorted √ 30 30
random x 6 13
sorted √ 33 33
random x 9 10
sorted √ 25 25
random x 2 7
sorted √ 28 28
random x 5 10
sorted √ 27 27

Except in the sorted case, SequenceMatcher from the stdlib gives the wrong match length almost all the time (sometimes it is right).

In [18]:
    %timeit difflib.SequenceMatcher(None, s1, s2, autojunk=False).get_matching_blocks()
    %timeit lcs_len3(s1,s2)
    print '--------- sorted ----------'
    s1.sort()
    s2.sort()
    %timeit difflib.SequenceMatcher(None, s1, s2, autojunk=False).get_matching_blocks()
    %timeit lcs_len3(s1,s2)
1000 loops, best of 3: 402 µs per loop
100 loops, best of 3: 9.38 ms per loop
--------- sorted ----------
1000 loops, best of 3: 734 µs per loop
100 loops, best of 3: 9.33 ms per loop

On both sorted and unsorted arrays, SequenceMatcher is way faster (10 to 20 times) than I am, but wrong.

In [19]:
    s1 = 'a'*251
    i = 124
    s2 = 'a'*(i)+'b'+'a'*(250-i)
    %timeit MySequenceMatcher(s1, s2).get_matching_blocks()
    %timeit lcs_len3(s1,s2)
10 loops, best of 3: 26 ms per loop
10 loops, best of 3: 30.8 ms per loop

In [20]:
    sum(x.size for x in MySequenceMatcher(s1, s2).get_matching_blocks())
Out[20]: 126

But in the cases where SequenceMatcher is clearly wrong about the optimal sequence, we take around the same time to execute.

Not that I don't like the current Python difflib, but I hope to be able to build a lower-level library (pure Python, of course) without any surprising behavior, on which maybe one can rebuild the current difflib and avoid relying on regular expressions to parse the output of another function.

I'd be happy to get any tips to make it faster; of course it is not possible in all cases, but I'm sure we can figure it out.

As usual, comments, typo fixes and PRs are welcome on the repo that hosts these notebooks. I should probably learn how to use Pelican now that it supports notebooks.

Edit: At the request of some people, I opened a place to comment on this post until I turn it into a real blog post.
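
For readers on Python 3, the gap between what SequenceMatcher reports and the true longest common subsequence can be checked with the short sketch below. It is not the lcs_len3 function from this post: lcs_length is a plain textbook row-by-row dynamic program, and matched_size is a helper name chosen here for the sum over get_matching_blocks() used in the TL;DR.

```python
from difflib import SequenceMatcher


def matched_size(a, b):
    """Total size of the blocks SequenceMatcher reports as matching."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def lcs_length(a, b):
    """Length of the longest common subsequence, keeping one DP row at a time."""
    previous = [0] * (len(b) + 1)
    for item_a in a:
        current = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


reference = "aaaaaa"
for candidate in ["-aaaaa", "a-aaaa", "aa-aaa", "aaa-aa", "aaaaaa"]:
    print(candidate, matched_size(candidate, reference), lcs_length(candidate, reference))
# -aaaaa 5 5
# a-aaaa 4 5
# aa-aaa 3 5
# aaa-aa 5 5
# aaaaaa 6 6
```

The matching blocks always form a common subsequence, so matched_size can never exceed lcs_length; the interesting cases are the ones where it falls short because the first "longest contiguous match" picked by the recursion leaves nothing to match on one side.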

07-the-sound-of-hydrogen

2013-07-01 20:00

The Sound of Hydrogen

Inspired by minutephysics, and by the explanation of how to do it in Mathematica: The sound of hydrogen. The goal of this notebook is to show how one can play a sound file in the notebook using HTML5.

06-NBconvert-Doc-Draft

2013-06-01 20:00

How to Use NBConvert

NBconvert migration

NBconvert has now been merged into IPython itself. You will need IPython 1.0 or above for this to work (assuming the API has not changed).

Intro

In this post I will introduce you to the programmatic API of nbconvert and show you how to use it in various contexts. For this I will use one of @jakevdp's great blog posts. I've explicitly chosen a post without the JavaScript tricks Jake seems to be fond of right now, because the future of embedding JavaScript in nbviewer, which is based on nbconvert, is not fully decided yet.

This will not focus on using the command-line tool to convert files. The attentive reader will point out that no data is read from, or written to, disk during the conversion process. Indeed, nbconvert has been designed as much as possible to avoid IO operations and to work just as well in a database or web-based environment.

Quick overview

The main principle of nbconvert is to instantiate an Exporter that controls a pipeline through which each notebook you want to export will go.

Let's start by importing what we need from the API, and download @jakevdp's notebook.

In [1]:
    import requests
    response = requests.get('http://jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb')
    response.content[0:60]+'...'
Out[1]: '{\n "metadata": {\n "name": "XKCD_plots"\n },\n "nbformat": 3,\n...'

We read the response into a slightly more convenient format which represents an IPython notebook. There is no real advantage for now, except some convenient methods, but with time this structure should be able to guarantee that the notebook structure is valid.

In [2]:
    from IPython.nbformat import current as nbformat
    jake_notebook = nbformat.reads_json(response.content)
    jake_notebook.worksheets[0].cells[0]
Out[2]: {u'cell_type': u'heading', u'level': 1, u'metadata': {}, u'source': u'XKCD plots in Matplotlib'}

So we have here Jake's notebook in a convenient form, which is mainly nested super-powered dicts and lists.
You don't need to worry about the exact structure.

The nbconvert API exposes some basic exporters for common formats, with default options. We will start by using one of them. First we import it, instantiate it with all the default parameters, and feed it the downloaded notebook.

In [3]:
    import IPython.nbconvert

In [6]:
    from IPython.config import Config
    from IPython.nbconvert import HTMLExporter
    ## I use basic here to have less boilerplate and headers in the HTML.
    ## we'll see later how to pass config to exporters.
    exportHtml = HTMLExporter(config=Config({'HTMLExporter':{'default_template':'basic'}}))

In [7]:
    (body,resources) = exportHtml.from_notebook_node(jake_notebook)

The exporter returns a tuple containing the body of the converted notebook, here raw HTML, as well as a resources dict. The resources dict contains (among many things) the extracted PNG, JPG [...etc] from the notebook when applicable. The basic HTML exporter keeps them embedded as base64 in the notebook, but one can ask for the figures to be extracted; cf. advanced use. So for now the resources dict should be mostly empty, except for 1 key containing some CSS, and 2 others whose content will be obvious.

Exporters are stateless: you won't be able to extract any useful information (except their configuration) from them. You can directly re-use the instance to convert another notebook. For convenience, each exporter also exposes from_file and from_filename methods if you need them.

In [6]:
    print resources.keys()
    print resources['metadata']
    print resources['output_extension']
    # print resources['inlining'] # too long to be shown
['inlining', 'output_extension', 'metadata']
defaultdict(None, {'name': 'Notebook'})
html

In [7]:
    # Part of the body, here the first Heading
    start = body.index('

XKCD plots in Matplotlib¶

This notebook originally appeared as a blog post at {{ super() }}

{%- endblock codecell %} Try to look at what Jinja can do, then learn about Jinja filters and imagine they can magically read your config file. For example, we provide a filter that highlights code by presupposing it is Python, and one that wraps text at a default length of 80 characters... Want a rot13 filter on some code cells when writing exercises for students? See you next time! More...¶ One more example, from a pull request. In [20]: from IPython.nbconvert.filters.highlight import _pygment_highlight from pygments.formatters import HtmlFormatter from IPython.nbconvert.exporters import HTMLExporter from IPython.config import Config from IPython.nbformat import current as nbformat def my_highlight(source, language='ipython'): formatter = HtmlFormatter(cssclass='highlight-ipynb') return _pygment_highlight(source, formatter, language) c = Config({'CSSHtmlHeaderTransformer': {'enabled':True, 'highlight_class':'highlight-ipynb'}}) exportHtml = HTMLExporter( config=c , filters={'highlight': my_highlight} ) (body,resources) = exportHtml.from_notebook_node(jake_notebook) In [21]: from jinja2 import DictLoader dl = DictLoader({'html_full.tpl': """ {%- extends 'html_basic.tpl' -%} {% block footer %} FOOOOOOOOTEEEEER {% endblock footer %} """}) exportHtml = HTMLExporter( config=None , filters={'highlight': my_highlight}, extra_loaders=[dl] ) (body,resources) = exportHtml.from_notebook_node(jake_notebook) for l in body.split('\n')[-4:]: print l

This post was written entirely in an IPython Notebook: the notebook file is available for download here. For more information on blogging with notebooks in octopress, see my previous post on the subject.

FOOOOOOOOTEEEEER
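The rot13 filter mentioned earlier could look something like the sketch below. It is only an illustration: the function name `rot13` is made up here, it is written for Python 3 (unlike the Python 2 cells of this post), and it assumes that such a function can be handed to an exporter through the same `filters={...}` argument used for `my_highlight`.

```python
import codecs


def rot13(source):
    """Return `source` with every ASCII letter rotated by 13 places."""
    # 'rot_13' is a text-to-text codec shipped with the standard library.
    return codecs.encode(source, "rot_13")


hidden = rot13("print('hello')")
print(hidden)
# Applying the filter a second time gives back the original text.
print(rot13(hidden))
```

Since rot13 is its own inverse, students can decode a solution by running the same function on the exported text.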

05-YAML Notebook

2013-05-01 20:00

YAML IPython notebook¶ A little experiment based on the fact that YAML is apparently made to be more readable by humans than JSON. We've also had some complaints that metadata are not kept by nbconvert when round-tripping through markdown; those two things made me want to see what ipynb files stored as YAML would look like. I'll also use this post to experiment with future nbviewer features; if you see anything wrong with the CSS on some device, please tell me. First attempt¶ Apparently JSON is a subset of YAML: cp foo.ipynb foo.ipyamlnb Yeah, mission accomplished! Second try¶ Install PyYAML, and see what we can do. In [42]: import json import yaml In [43]: from IPython.nbformat import current as nbf In [44]: ls Y*.ipynb YAML Notebook.ipynb In [45]: with open('YAML Notebook.ipynb') as f: nbook = nbf.read( f, 'json') In [46]: nbook.worksheets[0].cells[9] Out[46]: {u'cell_type': u'code', u'collapsed': False, u'input': u'from IPython.nbformat import current as nbf', u'language': u'python', u'metadata': {}, u'outputs': []} I'll skip the fiddling around with the YAML converter. In short, you have to explicitly specify the parts you want to dump in literal form, otherwise they are exported as lists of strings, which are a little painful to edit afterward. I'm using the safe_dump and safe_load methods (or passing SafeLoader and SafeDumper). Those should be the default, otherwise you could unserialise arbitrary objects and have code executed. We probably don't want to reproduce the critical Rails vulnerability that happened not so long ago. In [47]: # we'll patch a safe Yaml Dumper sd = yaml.SafeDumper # Dummy class, just to mark the part we want with custom dumping class folded_unicode(unicode): pass class literal_unicode(unicode): pass I know class names should be capitalised, but we just want to hide from the end user the fact that those are classes. At the same time I define a folded variant to use with markdown cells.
When markdown contains really long lines, those will be wrapped in the YAML document. In [48]: def folded_unicode_representer(dumper, data): return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='>') def literal_unicode_representer(dumper, data): return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|') sd.add_representer(folded_unicode, folded_unicode_representer) sd.add_representer(literal_unicode, literal_unicode_representer) with open('YAML Notebook.ipynb') as f: nbjson = json.load(f) Now we patch the parts of the ipynb file we know we want to be literal or folded. In [49]: for tcell in nbjson['worksheets'][0]['cells']: if 'source' in tcell.keys(): tcell['source'] = folded_unicode("".join(tcell['source'])) if 'input' in tcell.keys(): tcell['input'] = literal_unicode("".join(tcell['input'])) In [50]: with open('Yaml.ipymlnb','w') as f: f.write(yaml.dump(nbjson, default_flow_style=False, Dumper=sd)) You can round-trip it to JSON, and it's still a valid ipynb file that can be loaded. I haven't fiddled with it much more. There are just a few gotchas with empty lines and with trailing whitespace at EOL, which can respectively disappear or make the dumper fall back to a quoted string to store values. You can skip down to the end of this notebook to see what it looks like. It's probably much more compact than the JSON we currently emit, and in some cases it might be easier to read, but I don't think it is worth considering for the format specification. ipynb files are meant to be fixable by a human, and I strongly prefer having a consistent format with simple rules to having to explain the meaning of the various shenanigans like : |2+ for literal strings. Also, support across languages is not consistent, and it would probably be too much of a security burden for all the code that will support loading ipynb files to take care of sanitizing YAML.
One area where I woudl use it would be to describe the ipynb format at a talk for example, and/or to have metadata editing more human readable/writable. In [51]: !cat Yaml.ipymlnb metadata: name: YAML Notebook nbformat: 3 nbformat_minor: 0 worksheets: - cells: - cell_type: heading level: 1 metadata: {} source: >- YAML IPython notebook - cell_type: markdown metadata: {} source: "Little experiment base on the fact that apparently YAML is made to be\ \ better readable by Humans than JSON.\nWe've also had some complaint that metadata\ \ are not keep in nbconvert when roundtripping through markdown, those two\n\ made me think that I could try to see what ipynb files stored as YAML would\ \ look like. " - cell_type: heading level: 4 metadata: {} source: >- First atempt - cell_type: markdown metadata: {} source: >- Apparently Json is a subset of YAML: - cell_type: markdown metadata: {} source: >2+ cp foo.ipynb foo.ipyamlnb - cell_type: markdown metadata: {} source: >- Yeah, Mission acomplished ! - cell_type: heading level: 4 metadata: {} source: >- Second try - cell_type: markdown metadata: {} source: "Install PyYaml, and see what we can do. " - cell_type: code collapsed: false input: |- import json import yaml language: python metadata: {} outputs: [] - cell_type: code collapsed: false input: |- from IPython.nbformat import current as nbf language: python metadata: {} outputs: [] - cell_type: code collapsed: false input: |- ls Y*.ipynb language: python metadata: {} outputs: [] - cell_type: code collapsed: false input: |- with open('YAML Notebook.ipynb') as f: nbook = nbf.read( f, 'json') language: python metadata: {} outputs: [] - cell_type: code collapsed: false input: |- nbook.worksheets[0].cells[9] language: python metadata: {} outputs: [] - cell_type: markdown metadata: {} source: >- I'll skipp the fiddling around with the yaml converter. 
In short, you have to specify explicitely the part you want to dump in the literal form, otherwise they are exported as list of strings, which is a little painfull to edit afterward. I'm using the `safe_dump` and `safe_load` methods (or pass safeLoader and Dumper). Those should be default or otherwise you could unserialise arbitrary object, and have code exucuted. We probably don't want to reproduct the recent file Rail's critical vulnerability that append not so long ago. - cell_type: code collapsed: false input: |- # we'll patch a safe Yaml Dumper sd = yaml.SafeDumper # Dummy class, just to mark the part we want with custom dumping class folded_unicode(unicode): pass class literal_unicode(unicode): pass language: python metadata: {} outputs: [] - cell_type: markdown metadata: {} source: >- I know classes should be wit upper case, but we just want to hide the fact that thoses a class to end user. At the same time I define a folded method if I want to use it later. - cell_type: code collapsed: false input: |- def folded_unicode_representer(dumper, data): return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='>') def literal_unicode_representer(dumper, data): return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|') sd.add_representer(folded_unicode, folded_unicode_representer) sd.add_representer(literal_unicode, literal_unicode_representer) with open('YAML Notebook.ipynb') as f: nbjson = json.load(f) language: python metadata: {} outputs: [] - cell_type: markdown metadata: {} source: >- now we patch the part of the ipynb file we know we want to be literal or folded - cell_type: code collapsed: false input: |- for tcell in nbjson['worksheets'][0]['cells']: if 'source' in tcell.keys(): tcell['source'] = folded_unicode("".join(tcell['source'])) if 'input' in tcell.keys(): tcell['input'] = literal_unicode("".join(tcell['input'])) language: python metadata: {} outputs: [] - cell_type: code collapsed: false input: |- with 
open('Yaml.ipymlnb','w') as f: f.write(yaml.dump(nbjson, default_flow_style=False, Dumper=sd)) language: python metadata: {} outputs: [] - cell_type: markdown metadata: {} source: >- You can round trip it to json, and it's still a valid ipynb file that can be loaded. Haven't fiddled with it much more. There are just a few gotchas with empty lines as well as trailing whitespace at EOL that can respectively diseapear or make the dumper fall back to a string quoted methods to store values. One could also try to tiker with `folded_unicode` in markdown cell that tipically have long lines to play a little more nicely with VCS. - cell_type: markdown metadata: {} source: >- You can skip down to the end of this notebook to loko at how it looks like. It's probably much compact than the current json we emit, in **some** cases it might be more easy to read, but I don't think it is worth considering using in the format specification. ipynb files are ment to be humanely fixable, and I strongly prefere having a consistent format with simple rules than having to explain what are the meaning of the differents shenigan like `: |2+` for literal string. Also support across languages are not consistent, and it would probably be too much of a security burden for all code that will support loading ipynb to take care of sanitazing Yaml. One area where I woudl use it would be to describe the ipynb format at a talk for example, and/or to have metadata editing more human readable/writable. - cell_type: code collapsed: false input: |- !cat Yaml.ipymlnb language: python metadata: {} outputs: [] metadata: {}
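The patching step used in this post (joining each cell's list of lines into one string before dumping) can be tried without PyYAML. The sketch below is written for Python 3 and uses only the standard library; the `raw` document is a tiny hand-written stand-in for an nbformat 3 file, and `join_cell_text` is a hypothetical helper name, not part of IPython.

```python
import json

# A tiny hand-written stand-in for an nbformat 3 file (not a real notebook).
raw = """
{"worksheets": [{"cells": [
  {"cell_type": "markdown", "source": ["First line\\n", "second line"]},
  {"cell_type": "code", "input": ["import json\\n", "import yaml"]}
]}]}
"""


def join_cell_text(nb):
    """Collapse list-of-lines 'source'/'input' fields into single strings."""
    # Like the cells above, only the first worksheet is handled.
    for cell in nb["worksheets"][0]["cells"]:
        for key in ("source", "input"):
            if key in cell:
                cell[key] = "".join(cell[key])
    return nb


nb = join_cell_text(json.loads(raw))
for cell in nb["worksheets"][0]["cells"]:
    print(cell["cell_type"], repr(cell.get("source", cell.get("input"))))
```

Once every text field is a single string, wrapping it in a marker class such as `literal_unicode` is what lets the dumper pick the block style.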

04-initialisation-cell

2013-04-01 20:00

IPython Notebook Duck-Punching¶ If it walks like a duck and talks like a duck, it’s a duck. So if this duck is not giving you the noise that you want, you’ve got to just punch that duck until it returns what you expect. A small blog post to answer a question that has been asked on Stack Overflow and on the issue tracker: When I open a saved IPython Notebook, I need to evaluate all the cells with imports, function definitions etc. to continue working on the session. It is convenient to click Cell > Run All to do this. But what If I do not want to re-evaluate all calculations? Do I need to pick the cells to evaluate by hand each time? [There is] the concept of "initialization cells". You can mark some cells in the notebook as initialization cell, and then perform "evaluate initialization cells" after opening the notebook. This feature should allow certain cells to be marked as Initialization Cells and be evaluated together with the appropriate command. I'll let you go there to read the official answer, but in short: not in core IPython; this can be done as an extension. Some warnings before we start. First, as long as you don't shut down the IPython webserver or explicitly ask for a shutdown, a notebook kernel stays alive, meaning that you can leave the page and come back and you will still have the same active namespace. So you might not want to run those init cells again. Second, do not ever, in any case, for whatever reason, automatically run initialisation cells on notebook load. It is a security risk: if you have such an extension and I send you a notebook with an initialisation cell that has rm -rf ~/ in it, you have just lost your home folder by opening this notebook. So this post will show you how to do that in less than 60 lines of javascript; you might want to take a look at the previous post for some background. You know where to find the finished code. As usual it is not perfect, and I'm waiting for PRs to fix the edge cases. Let's start.
Overview¶ This extension will be in two parts, even though the whole thing will fit in one file. So let's break down what we need: the ability to mark a cell with a certain flag (Initialisation Cell), and the ability to run all those initialisation cells when needed. To me this sounds like the need for a custom CellToolbar checkbox and a toolbar button 'Run init Cell'. A checkbox because this is typically a boolean statement: the cell is an initialisation cell, or it is not. And a toolbar button because it is the easiest to reach, and not too complicated to add. The CellToolbar checkbox¶ We will store the boolean of whether or not the cell is an initialisation cell in the cell metadata. To stay clean, we will use a prefixed key to avoid future collisions. As this is a draft, I will prefix the name of the key with an underscore to warn future users that directly accessing this key is not supported and that I reserve the right to change anything without warning. I will store [true|false|undefined] in cell.metadata._draft.init_cell. I will not forget to check that cell.metadata._draft exists before playing with it. The IPython notebook provides a convenient API to generate checkboxes in the cell toolbar. To use it we need to define a getter and a setter for our metadata. The setter takes the cell we act on, and the new value: function(cell, value){ // we check that the _draft namespace exist and create it if needed if (cell.metadata._draft == undefined){cell.metadata._draft = {}} // set the value cell.metadata._draft.init_cell = value } The getter is not much more complicated: function(cell){ var ns = cell.metadata._draft; // if the _draft namespace does not exist return undefined // (will be interpreted as false by checkbox) otherwise // return the value return (ns == undefined)?
undefined: ns.init_cell } The API to generate the checkbox has the following signature: CellToolbar.utils.checkbox_ui_generator(label, setter, getter) I can then create my function easily: var CellToolbar= IPython.CellToolbar; var init_cell = CellToolbar.utils.checkbox_ui_generator('Initialisation Cell', // setter function(cell, value){ // we check that the _draft namespace exist and create it if needed if (cell.metadata._draft == undefined){cell.metadata._draft = {}} // set the value cell.metadata._draft.init_cell = value }, //getter function(cell){ var ns = cell.metadata._draft; // if the _draft namespace does not exist return undefined // (will be interpreted as false by checkbox) otherwise // return the value return (ns == undefined)? undefined: ns.init_cell } ); The label will be used in the UI to put a name in front of the checkbox so that its purpose is clear, so I used a descriptive name. Now we need to register our function; for that we will use CellToolbar.register_callback(name, function);. name should be a string that we will use to refer to the function later, so that we can use it in multiple places if we wish. Here, simply: CellToolbar.register_callback('init_cell.chkb', init_cell); And finally, we use a (for now) private method to generate a CellToolbar preset that the user can choose in the CellToolbar dropdown: CellToolbar.register_preset(label, ['callback_name','callback_name',....]). This allows you to simply mix and match UI elements from different presets for customisation.
Here we only have one checkbox, so we do: CellToolbar.register_preset('Initialisation Cell', ['init_cell.chkb']); With all the extensions I have, I could create a custom CellToolbar simply by adding: CellToolbar.register_preset('My toolbar', ['init_cell.chkb','default.rawedit','slideshow.select']); And you can see below what it looks like. In [19]: from IPython.display import Image Image(filename='/Users/bussonniermatthias/Desktop/Ctoolbar.png') Out[19]: Now you just need to select the Initialisation Cell CellToolbar and check the checkboxes you wish. The Toolbar button¶ Now we need a way to run all cells marked as initialisation cells. Let's first make a function that loops over all cells and runs them if they are marked: var run_init = function(){ var cells = IPython.notebook.get_cells(); for(var i in cells){ var cell = cells[i]; var namespace = cell.metadata._draft|| {}; var isInit = namespace.init_cell; // you also need to check that cell is instance of code cell, // but lets keep it short if( isInit === true){ cell.execute(); } } }; Now we use the API to create a function that registers a callback for a button click on the main toolbar. We use a descriptive label that shows on hover and an icon from jQuery UI ThemeRoller, and assign the previously defined function as the callback: var add_run_init_button = function(){ IPython.toolbar.add_buttons_group([ { 'label' : 'run init_cell', 'icon' : 'ui-icon-calculator', 'callback': run_init } ]); }; Now we just need to run this function late enough for it to have an effect. We will just listen for the loaded event to trigger it: $([IPython.events]).on('notebook_loaded.Notebook',add_run_init_button); That's it. You just have to put all this stuff in one big file, name it correctly, and add the $.getScript call to your custom.js. Reload your page or restart your server (depending on the config you chose), and you should have a new toolbar preset and a button to run your initialisation cells.
I've got less than 60 lines of javascript, comments and blank lines included. As this is a separate file, you can easily fork it and add modifications and options. I'm waiting for PRs. I hope this shows you that extensions for the IPython notebook are not that hard to write, and are easy to share.
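Because the flag lives in the cell metadata, it is saved with the notebook and can also be read outside the browser. The Python 3 sketch below mirrors the selection logic of `run_init` on a notebook loaded as a plain dict. It is only an illustration: it assumes the worksheet-based layout of nbformat 3 and the `_draft.init_cell` key used in this post, `init_cell_indices` is a hypothetical name, and the `notebook` dict is hand-written test data.

```python
def init_cell_indices(nb):
    """Return indices of code cells whose metadata marks them as init cells."""
    indices = []
    for i, cell in enumerate(nb["worksheets"][0]["cells"]):
        # The _draft namespace may be missing, exactly as in the JS getter.
        draft = cell.get("metadata", {}).get("_draft") or {}
        if cell.get("cell_type") == "code" and draft.get("init_cell") is True:
            indices.append(i)
    return indices


# Hand-written example data, not a real notebook.
notebook = {"worksheets": [{"cells": [
    {"cell_type": "code", "metadata": {"_draft": {"init_cell": True}}},
    {"cell_type": "code", "metadata": {}},
    {"cell_type": "markdown", "metadata": {"_draft": {"init_cell": True}}},
    {"cell_type": "code", "metadata": {"_draft": {"init_cell": False}}},
]}]}

print(init_cell_indices(notebook))
```

Unlike the short javascript version, this sketch also performs the code-cell check, so a markdown cell that happens to carry the flag is skipped.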

03-on-notebook-format

2013-03-01 20:00

IPython Notebook Duck-Punching (II)¶ If it walks like a duck and talks like a duck, it’s a duck. So if this duck is not giving you the noise that you want, you’ve got to just punch that duck until it returns what you expect. Some thoughts on the IPython notebook. For once, this post will discuss what can possibly be done with IPython, with only a little really usable code. I hope it will help the reader better understand in what direction the core team wants to drive IPython. This is of course my personal interpretation of the different points we have discussed since I came on board, but I hope it will help the community feel concerned and participate. Terminology¶ I think one of the first things that should be addressed is terminology. As a non-native English speaker I am pretty bad at choosing names for things, and my English is probably horrible. I think there is a big naming problem in the IPython notebook. The problem is the same as what happened with the term "gene": at some point in history it was used to explain a specific effect, namely that some phenotypes were expressed depending on the genes you had; the problem is that with our current technology and knowledge of genetics, the boundaries of what a gene is have become fuzzy. The same applies to the IPython notebook. Let's see. You can probably start the IPython notebook from your command line. You can also ask a friend to send you his or her IPython notebook by email. You probably know that shift-enter executes a cell in an IPython notebook. But you probably don't send a webserver by email, and I haven't seen anyone convert GMail to a PowerPoint. I think you see the point. "IPython notebook" refers at the same time to: the webserver application that you start from the command line, the web frontend that you are used to, the file format that defines the structure of ipynb files, the actual files you exchange, and the representation of those files in the web application.
Add on top the fact that a kernel can speak to multiple frontends, and that a frontend can talk to non-Python kernels, and people get lost. Now you wonder why you should care, as you are only using the whole bundle? Because you will probably want to share and/or reuse what you write, and once you understand the different pieces you will be able to do even more amazing things. The part you don't (really) care about. The webserver. It can mostly be seen as middleware. The configuration options should be left to the sysadmin. It mostly acts as the ZMQ/websocket bridge, and is also an endpoint for the web application to access the remote filesystem (yes remote, even if it is localhost). The part you can do without. The kernel. You have probably had a kernel die once or twice.

02-css-selector

2013-02-01 20:00

IPython Notebook Duck-Punching (II)¶ If it walks like a duck and talks like a duck, it’s a duck. So if this duck is not giving you the noise that you want, you’ve got to just punch that duck until it returns what you expect. Response to comments¶ I might have been unclear in the introduction of the previous post about Better typography for IPython notebooks: I didn't mean to say that good typography and theming were a feature request that we wouldn't do. It is planned; it just needs refactoring, and above all it shouldn't be as difficult as it is now to develop a new theme. If developing a theme is easy, then it can be done in parallel with main IPython development, and we just have to make that theme the default one. But right now, changing CSS is a real pain with all the browsers, OSes and configurations around. I was also a little too optimistic about the version needed to have custom.css working. It apparently appeared after the 0.13 branch but was not backported to 0.13.1, so you will need the development version to have the previous post (and this one) fully working. TL;DR:¶ Last week we saw that custom.css is the way to inject CSS; today we'll use custom.js to asynchronously fetch CSS and replace the default one. The rest is API detail and really just wrapping around the following jQuery call: $('link:nth(7)').attr('href','/static/css/new-style.min.css') Intro¶ This is the second nbviewer post on the IPython notebook, mainly in response to comments on Better typography for IPython notebooks. We saw last time how to locally add custom CSS to a notebook without too much pain. Now we are going to dig deeper into the IPython code and see what we can do. As for the first post, you will still be able to do everything without changing the source, but you still need to get the dev version. For those of you on Linux, you can get a daily build with Julian Taylor's PPA, which should work great.
Be careful though, it has a few custom patches that move the static resources into /usr/share/. This notebook will contain javascript snippets, but unlike last post, I will not put them in the %%javascript magic, for a few reasons: the first being that it will not work on nbviewer, the second that you should stop relying on publishing javascript through the _repr_javascript_ of IPython, as it will be deprecated. This will be the subject of another post, but don't complain: we have bigger plans to replace it. Start Punching¶ Warning¶ I'm doing this post on a dev version, so some things might differ; in particular, in some places you might need to use $('link:nth(5)') instead of $('link:nth(7)') and vice versa. Let's carry on... If you did your homework, you probably know that custom.js is in profile_xxx/static/js/custom.js and that any file in profile_xxx/static/[dirs]/[name] is available at the /static/[dirs]/[name] URL once in the notebook. Here is the doc for custom.js: Placeholder for custom user javascript mainly to be overridden in profile/static/js/custom.js This will always be an empty file in IPython User could add any javascript in the profile/static/js/custom.js file (and should create it if it does not exist). It will be executed by the ipython notebook at load time. Same thing with profile/static/css/custom.css to inject custom css into the notebook. Example : Create a custom button in toolbar that execute %qtconsole in kernel and hence open a qtconsole attached to the same kernel as the current notebook $([IPython.events]).on('notebook_loaded.Notebook', function(){ IPython.toolbar.add_buttons_group([ { 'label' : 'run qtconsole', 'icon' : 'ui-icon-calculator', 'callback': function(){IPython.notebook.kernel.execute('%qtconsole')} } // add more button here if needed. ]); }); Example : ...
You can view the doc using yuidoc, which runs with node and can be installed via npm. Running yuidoc --server in the js directory will provide you with the most up-to-date documentation, and you are welcome to contribute corrections and improvements to it, as well as to help us set up a daily build. Just as custom.css helps us inject CSS into the web notebook, custom.js allows us to inject custom javascript. And as you might guess, maintaining a lot of javascript in one file is painful, so we will just use it as an entry point. Little side note: for those of you who don't do javascript, the dollar sign $ is a valid variable name and is usually bound to jQuery (a library) that does many things. So $(foo).bar() is calling jQuery with the parameter foo (most of the time a selector) and then calling the method bar on $(foo), whereas $.foo refers to the method foo of jQuery itself. Don't worry about the exact meaning or difference; just remember that $ is an object that does things and not part of the JS language, and that it is often referred to in text as jQuery. So we can use jQuery's getScript to just load another file where we will work. In [5]: %%bash touch ~/.ipython/profile_default/static/js/css_selector.js In [25]: %%file /Users/bussonniermatthias/.ipython/profile_default/static/js/css_selector.js console.log("I'm loaded") Overwriting /Users/bussonniermatthias/.ipython/profile_default/static/js/css_selector.js Be careful: the %%file magic does not warn before replacing a file. Now we load this file dynamically by adding the following to our `custom.js`: $.getScript('/static/js/css_selector.js') I cheat a little and don't do it from the notebook, as my config file has more stuff in it, but here is the corresponding line. In [20]: !cat ~/.ipython/profile_default/static/js/custom.js | head -n 6 | tail -n 1 $.getScript('/static/js/css_selector.js') Now you can run your notebook and you should see a message in the javascript console that says "I'm loaded".
So now, the big secret is to find the current CSS file and replace its link with our custom one. This is the first feature that is not supported (yet?) by the notebook, and you will need to adapt to this during development. With time things will hopefully become easier to do. So, select the 5th (or 7th) link element, which contains the current style. You can do that in the js console: $('link:nth(5)') Check that it returns the link element for the current stylesheet. You can now set the new CSS with the following command: $('link:nth(5)').attr('href','/static/css/new-style.min.css') But wait, if you replace the default style, won't this screw everything up by removing the default style? Yes. That's why we'll now see how to compile your own full style :-) Isn't that the goal of these posts? Recompiling a style¶ You can, if you wish, create a style from scratch; this does not require any more tools. It will just probably be painful, so I suggest you start by modifying the IPython ones to create a new theme and compile it. To do so you'll need some extra dependencies: Python fabric, lessc, bower, and a few others that bower will install. Get the IPython dev source tree, go into IPython/frontend/html/notebook/ and issue a $ bower install It will look for all the dependencies in component.json and install them. This includes bootstrap for the CSS and a few more things, which will be installed in the component directory. The IPython repo will move more and more toward requiring things as "components" later, especially for the dev version. You can now use the less or lessc compiler to turn all the files in /static/less/*.less into one big CSS theme file. This can be done with: $ lessc -x less/style.less output_file or: $ fab css to regenerate the IPython one. Now this is more or less up to you; I advise you to "fork" static/less/style.less and static/less/variables.less, then recompile to get your new theme. Those files will probably change a lot during the next few months.
And this is typically the place where we need your help to refactor. You can also go to the end of this post, where I link to a few ugly themes. Writing the selector¶ Using a dropdown list to select the theme we want seems to be the correct UI choice, but writing an API to insert select/option elements in the toolbar would be more complicated than doing it by hand. For simpler actions that require a button, astonishingly, the required code is more difficult, hence we provide some helper methods to do so. I'll let you hit the docs for that. As writing the selector is not the interesting exercise, I'll just give you the code to put in your css_selector.js: In [ ]: var add_css_list = function (element,names) { console.log(element) var label = $('<label/>').text('Css:'); var select = $('<select/>') .addClass('ui-widget-content') .append($('<option/>').attr('value', 'style.min').text('default')); element.append(label).append(select); select.change(function() { var val = $(this).val() $('link:nth(5)').attr('href','/static/css/'+val+'.css') /*********************/ // missing stuff Here /*********************/ }); for (var i=0; i<names.length; i++) { var name = names[i]; select.append($('<option/>').attr('value', name).text(name)); } }; add_css_list(IPython.toolbar.element,['dark','duck']) This does the basic job: it changes the notebook CSS. I tested 2 themes here, "dark" and "duck", in which I only changed the following: @corner_radius: 0px/0px; @notebook_background : yellow/black; @borderwidth : 3px; @fontBaseColor : black/white; and compiled. In [5]: from IPython.display import Image Image(filename='/Users/bussonniermatthias/Desktop/d-and-d.png') Out[5]: look closely, some buttons are unusual for a notebook... So as you can see, not all the CSS supports variables right now, and if you know a little about how the IPython notebook works, you know we use CodeMirror as the editor, and you might want to set the theme of all CodeMirror editors at the same time. Hence the /*********************/ // missing stuff Here /*********************/ in the above snippet.
One should set the default CodeMirror theme for all cells, and also change the theme of the existing editors with: editor.setOption("theme", theme); as well as load the corresponding CSS. This probably needs a loop over IPython.notebook.get_cells(). Anyway, I hope this shows you that you can do a lot with the IPython notebook without diving deep into the source code. For the lazy reader, the needed code and CSS files are available here; I'll probably move them to a more appropriate place later. You will still need to find the right place to copy them to, and add a line to your custom.js (how hard!). Feel free to send me PRs to improve this code or share new CSS. Next time¶ Next time we'll probably go back a little more to the (I)Python side, or explore some thoughts on what can be done with the metadata in the notebook. I might also speak a little about nbconvert and the philosophy behind the notebook format. If you have any comments, feel free to comment on the GitHub repo of this blog post, and/or to send me a PR to fix mistakes.

Blog1

2013-01-01 20:00

IPython Notebook Duck-Punching (I)¶

If it walks like a duck and talks like a duck, it’s a duck. So if this duck is not giving you the noise that you want, you’ve got to just punch that duck until it returns what you expect.

This will try to be a series of blog posts (or nbviewer posts) on the IPython notebook, mainly in response to comments on Better typography for IPython notebooks and some of the comments at the end of that page, especially this one, which will get a full answer in the next post.

First, some introductions: I am Matthias; you can usually find me on GitHub as @carreau, or on Twitter as @mbussonn. I'm a PhD student in biophysics, more physics than bio, and I have contributed mostly to the IPython notebook. This will also probably be the starting point for the IPython advanced tutorial I'll be giving next August at EuroSciPy.

Addendum¶

I was a little optimistic in presupposing that custom.css was available in 0.13.1, so most of what is discussed here needs a more recent development version to work.

A more general issue¶

The question of changing the style of the notebook comes up often, and in my opinion it is part of a broader type of question: Why don't you want to integrate X? Why is it not implemented?

I would like to take some time to respond to it in a general way, and to try to introduce you to some concepts of the notebook that few people know about. Brian Granger already wrote some good posts about it here and here. As I am also responsible for the refactoring of nbconvert and am the principal maintainer of nbviewer, I'll try to share my thoughts on what the future will be, and what is currently in progress.

So, back to our question: why do we not implement some features? Well, a non-negligible part of the time, because they already exist and you are just not aware of it. Please read the docs (or say you did) and ask how to do it because you didn't understand well.
Once you manage to do it, a good way to help us and others is to improve the docs; pull requests are welcome :-)

Second possibility: you don't need code in the core to do it. We provide more or less easy ways to hook into IPython. And as there are many reasons why some things are better outside of IPython than in the core, we will encourage you to ship it separately.

Third: we won't special-case only for you. This does not mean we don't like what you did; more often it means that the internals of IPython should allow what you did to be an extension, they just don't allow it yet. Bear with us for some time, or help us achieve our goal, and it will be better for everyone soon!

What about css theming?¶

Short answer: I haven't seen any requests about theming that cannot be done without patching the IPython code itself. Some parts could be made much easier, and that is in progress. What you ask for (custom themes, sharing them...) is already here; you just don't know how to do it.

Carl said: Warning: I don’t know what I’m doing. Don’t make any of these changes, or any others, without backing up the files first.

As for me, I do know what I'm doing when speaking of IPython, but not when speaking of design, so let's start tinkering with IPython to get custom css that is easy to share. First thing to remember: if you ever think of modifying a file in IPython/frontend/html/notebook/static/*, you are doing it wrong. Let's dive a little into how to customise the IPython notebook.

Adding custom css:¶

How to add custom css to the notebook, starting with the wrong ways. Things you must never do.

Modifying IPython source files.¶

Never do that, except if you want to open a pull request to fix a bug. Actually, you should never modify a file under IPython/frontend/html/notebook/static/*, because you don't need to. We'll see later why, and what to do instead.
This means that if you are not an admin on your machine, or you just don't want to modify system files, you can still read the rest and try it yourself.

CSS in a markdown cell¶

Probably the worst way to do it. You can create a markdown cell with a style tag in it, and write some css that will apply to the notebook. It will most likely break in future versions, and you have to add it every time. You will bother others when sharing your notebook, and it will probably break the conversion process to pdf/rst/markdown when you use nbconvert. HTML markup will not be the same in nbviewer, so your post might be ugly, and if anything is updated at some point, you will have to update all your .ipynb files.

Even if this is a great quick and dirty way to test some css (like I did for this notebook), I strongly advise against it. It is not yet clear from how things are right now, but a notebook is a document that contains data, and the frontend is responsible for the formatting. Right now the notebook server has few frontends:

- the browser one
- and the [emacs client](https://github.com/tkf/emacs-ipython-notebook).

But this is likely to change, so please, no style in markdown cells.

The answer: custom.css¶

So here we are: the right way to add custom css to a notebook when you look at it through the browser interface is to use the custom.css file. This file is not created by default, and can exist on a per-profile basis. If you don't know what IPython profiles are, then you are probably using the default profile. In short, profiles are a way to have different configurations for IPython, which you choose through a command line flag (ipython notebook --profile=).
Let's locate our profile folder:

In [2]:
    %%bash
    ipython locate

    /Users/matthiasbussonnier/.ipython

This tells me that IPython expects profiles in the above directory; more specifically, a profile named foo will have its files in /Users/matthiasbussonnier/.ipython/profile_foo/. Let's create a profile for the sake of this blog post.

In [3]:
    %%bash
    ipython profile create customcss

    [ProfileCreate] Generating default config file: u'/Users/matthiasbussonnier/.ipython/profile_customcss/ipython_config.py'
    [ProfileCreate] Generating default config file: u'/Users/matthiasbussonnier/.ipython/profile_customcss/ipython_qtconsole_config.py'
    [ProfileCreate] Generating default config file: u'/Users/matthiasbussonnier/.ipython/profile_customcss/ipython_notebook_config.py'

This created the folder structure IPython needs, but we won't be interested in those files for now. If you do not want to create a custom profile, you could also modify the files in profile_default, which is the profile IPython uses when nothing is specified. I will now create the file I am interested in: static/custom/custom.css

In [5]:
    %%bash
    mkdir ~/.ipython/profile_customcss/static/
    mkdir ~/.ipython/profile_customcss/static/custom/
    touch ~/.ipython/profile_customcss/static/custom/custom.css

Use the file magic to write something in it.
In [34]:
    %%file /Users/matthiasbussonnier/.ipython/profile_customcss/static/custom/custom.css
    /**write your css in here**/
    /* like */

    Overwriting /Users/matthiasbussonnier/.ipython/profile_customcss/static/custom/custom.css

In [35]:
    cat ~/.ipython/profile_customcss/static/custom/custom.css

    /**write your css in here**/
    /* like */

Now every time you start ipython with:

    $ ipython notebook --profile customcss
    [NotebookApp] Using existing profile dir: u'~/.ipython/profile_customcss'
    [NotebookApp] Serving notebooks from local directory: /Users/matthiasbussonnier/
    [NotebookApp] The IPython Notebook is running at: http://:8889/
    [NotebookApp] Use Control-C to stop this server and shut down all kernels.

you will get the right css. Let's try:

In [36]:
    %%bash
    curl --noproxy localhost http://localhost:8889/static/custom/custom.css

    /**write your css in here**/
    /* like */
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100   240  100   240    0     0   102k      0 --:--:-- --:--:-- --:--:--  234k

Yeah! We now get your custom css loaded in the notebook, without dangerous file modifications, and without using root rights!

Things to come.¶

On IPython notebook-server side¶

I must warn you: do not rely too much on the current css and classes to make your custom theme. We are both refactoring and introducing new tools to make our (and your) life easier. We are progressively moving our css to Bootstrap, and part of it is currently generated by compiling less files. This allows us to introduce css variables, so that you can, for example, set a global hue for the theme and a radius for the corners, recompile, and get your new theme ready. Just use it as a custom css in your profile dir and you are good to go. Here is one example of what you can do.
And as a bonus (I'll let you search), we added a notebook flag to compile css on the fly in the browser, so you can develop your theme with a less folder without triggering compilation yourself.

![img](https://f.cloud.github.com/assets/335567/22913/bb412fdc-4a19-11e2-9a9b-2700e5b24843.png)

I told you I was not good at design. So now our notebook looks more like an ugly duckling, but we know how to pet it so that it behaves more like we want, and you can share it!

On nbconvert/viewer side¶

The notebook format supports metadata, so I don't see any reason not to set a preferred theme for a notebook when viewing it with a specific application/frontend. This might include nbviewer: if we consider that css is safe enough, and get the time to add the concept of users to nbviewer, I don't see any reason not to support external css. @damiamavilla has already built a slideshow version of nbviewer (that we will probably release in the next 6 months) that supports multiple themes for the same notebook. And if you made some themes, feel free to share; I even think that a user-governed repository with multiple css files would be great.

Conclusion¶

Custom css is doable, and will improve, and the more you help us, the faster it will arrive! This also shows you that it does not need to be part of IPython core to exist, and having it separate will allow faster release cycles or even rolling releases of themes. If you have any comments or corrections, I think you'll probably find the gist/github repo that corresponds to this notebook.

Next post trailer.¶

custom.css is not the only file that you can use to inject css into the notebook. Actually, any file that you request from the webserver with the /static/ prefix will first be searched for in your profile dir. You can also add paths with the NotebookApp.extra_static_paths= configuration option. So, as you'll have guessed, custom.css also exists in the directory where the static resources are installed; it only contains comments.
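
To make that lookup order concrete, here is a minimal Python sketch of a first-match search over a list of static directories. It is not IPython's actual implementation: `find_static_file`, the directory names, and the temporary layout are all hypothetical, chosen only to mirror the order described above (profile directory first, installed resources last).

```python
import tempfile
from pathlib import Path


def find_static_file(relative_path, search_dirs):
    """Return the first existing file named relative_path, or None.

    Hypothetical helper: search_dirs is tried in order, so a file in an
    earlier directory shadows a file with the same name in a later one.
    """
    for directory in search_dirs:
        candidate = Path(directory) / relative_path
        if candidate.is_file():
            return candidate
    return None


# Throwaway layout standing in for a profile dir and the installed static
# resources (both directory names are made up for this illustration).
root = Path(tempfile.mkdtemp())
profile_static = root / "profile_customcss" / "static"
installed_static = root / "installed" / "static"
for base, text in [(profile_static, "/* my theme */"),
                   (installed_static, "/* placeholder comments */")]:
    (base / "custom").mkdir(parents=True)
    (base / "custom" / "custom.css").write_text(text)

# Only the installed directory provides this second file.
(installed_static / "style.min.css").write_text("/* default style */")

search_dirs = [profile_static, installed_static]
found_css = find_static_file("custom/custom.css", search_dirs)
found_default = find_static_file("style.min.css", search_dirs)
missing = find_static_file("nope.css", search_dirs)
print(found_css.read_text())
```

In this sketch the profile copy of custom.css wins because its directory comes first, while a file that only exists in the installed resources is still found. Directories added through extra_static_paths would simply be more entries in `search_dirs`; where exactly they fall in the order is not stated here, so the sketch does not assume it.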
Next time, we will use custom.js (I think you can find it by yourself) as an entry point into the notebook to load more javascript dynamically, and look at where we can hook in to create a css selector. I'll dive into the recent API we added to the javascript, and the great things you can do with it. I'll let you prepare a few themes to play with; feel free to share them on ipython-contrib. Not much javascript knowledge will be required; you just need to find the curly brackets on your keyboard.