It's that time of the year again: soon many websites and organisations will push you to contribute to Open Source, for example via Hacktoberfest (I got a nice T-shirt last year), and 24pullrequests seems to get traction each year as well. These are really nice incentives that push users of open source to start contributing, and already seasoned developers to try their hand at new projects.
Here is a request I have for you, whether or not you participate in these events: please close a Pull Request.
Less is More
While I really appreciate having new contributions, there is a point where too many open pull requests can, I think, be harmful. I'm going to go through the various cases, why I think these are harmful, and what can be done.
Here are two specific examples: the SymPy project (as Aaron felt targeted; the authors are absolutely extraordinary and reactive), where the current count of open PRs is 378, and Matplotlib, which is apparently at 207. You can see in the discussion linked here that maintainers feel differently about a high number of PRs.
I open too many pull requests
I currently have 12 open pull requests; see how many you have. This means that I (at least) have to follow up with around 12 projects every day. This is an extremely high cognitive cost of switching. I try not to keep a PR older than 6 months. If it's older, then it's most likely not going to be merged or taken care of by the maintainers. Every time I get to this screen I spend at least 30 seconds wondering what to do about old PRs.
My advice is to stay focused: if you are not going to work on a Pull Request, let the maintainers know about it: close it. It can still be reopened. You might want to leave a message explaining why you are not working on it, and that you would be happy (or not) for someone else to take over.
I'm now back to 8. It fits on one screen, I can be more focused.
Also, if you are a maintainer and know a pull request will likely not get merged, I would prefer that you don't give me false hope, and close it. Explain why, even if it's just that you are busy with something else and would appreciate it being resubmitted later. I'm more likely to get over it and try a few more times than if my first contribution got no response.
I receive too many pull-requests
I strongly encourage you to try minrk.github.io/all-my-pulls: it lets you view all the pull requests you have the ability to merge, and filter out repositories you do not wish to see. After filtering, I have 61 pull requests in 19 repos. That is also too much to stay focused.
Many of these pull requests have stalled, and I would gladly appreciate the authors closing them if they have no intention of working on them. To be honest, many of the oldest pull requests have entered this "awkward state" where I want to close them but don't actually do so, because it can be rough for authors to see their work dismissed.
As a maintainer I should do a better job of saying when a pull request has stalled and is just polluting the PR list. Close it with a nice explanation. It's always possible to reopen it if needed. GitHub allows canned responses; I use one as a template that lists the PR closing policy. I've found that having a clear policy often makes decisions easier. And sometimes closing even allows work to be resubmitted, appear at the top of the pile, and start anew.
There is also the possibility of taking over the author's work and finishing it in a separate PR, or pushing directly to the author's fork if they allow it. I personally rarely do that, as I feel it is a slippery slope toward the maintainer doing everything.
I find myself much more efficient when there are only 5 to 6 open pull requests. I can keep track of each of them, judge whether or not the work will conflict, and give proper care to each. I fail to do so when there are many pages.
I don't contribute to repositories that have too many PRs.
When I come across a repository with more than 20-ish pull requests, I tend to think that the authors are not responding, so why bother contributing? I know that these are often only impressions, and I can get over them because I am lucky enough to often know the maintainers. This feeling is hard to get over, though, on repositories I'm new to.
With a high number of open PRs, I also tend to be discouraged from searching whether someone is already fixing the bug I saw or implementing the feature I want. Moreover, the higher the number of open PRs, the longer the maintainers are likely to take to review my PR, and the higher the chance that I will need to rebase my work, which, regardless of whether you are a git master or not, can be a painful process to go through (and to ask someone to go through).
I'm pretty certain I'm not the only one to be discouraged by a large number of open, inactive pull requests. I've asked on Twitter, and it looks like roughly every other respondent is discouraged from contributing if too many PRs are open.
What do you think ?
The above paragraphs are my thoughts on too many open pull requests. How do you feel about that? As you might have read in the Twitter conversation linked above, different people have different opinions.
If you want to comment, please open an issue on GitHub, and if you have the courage to help improve my English feel free to send me a PR (sic) to make this more readable.
Close a PR !
Thank you for reading this far! If you want to restore part of the
sanity of some maintainers, or want to appeal a bit more to some users, please
go close a PR! Or help finish a PR that has stalled! I can't give you a
free T-shirt like Hacktoberfest does, but feel free to tweet with hashtag
Note, I started this writing in September, it took me a while to finish, so things might have changed a bit during the writing.
This is a follow-up to my previous post about retiring Legacy Python. I got some responses, in particular thanks to Konrad Hinsen, who took the time to ping me on Twitter and who responded in a GitHub issue.
First, I want to apologize if the post felt condescending; that was not my intention, and on re-reading it I still do not feel it. Maybe this is because I'm still not skilled enough with words in English, and this is typically the kind of feedback I appreciate.
Don't break user code
One other thing that seems to leak through my post was the impression that I was advocating breaking user code. What I am advocating is taking a step forward by actually dropping compatibility with previous versions. I really like semantic versioning, and as a tool maintainer I would not drop support for APIs between minor revisions. Keeping backward compatibility is not always the easiest job, and in a lot of cases I have fought hard and complained about changes that shouldn't have happened, and even reverted pull requests. As a developer I would have liked these changes to go in, though.
What I am advocating is for libraries to start thinking of new APIs that would be Python 3 only, and potentially to consider a new major revision that does not support Legacy Python, or at least makes no effort to support it. This is already the path that a growing number of tools are taking. Nikola's developers have decided that the next version (8.0) will be Python 3 only (more background). I suppose the necessary modifications would be minimal, but the overhead and developer time to support Legacy Python at the same time was too much. Scikit-Bio is considering the same thing: whether or not they should stop trying to support Python 2. Some projects have even already started actively removing their Legacy Python compatibility layer.
An increasing number of distributions now also come with Python 3, and Python 3.5 is more and more considered the default installed Python. Ubuntu is considering making Legacy Python persona non grata in its next stable release, and so is Fedora, apparently.
Changing major library versions would not break old code. Or at least not
more than any other upgrade of major versions. Users would still be able to pin
the versions of the libraries in use, and if there are dependency conflicts, your
package manager is there to resolve them. There is a specific
that you can give to your package to indicate whether or not it's Python2/3
compatible (including minor versions), which would just prevent non-compatible
versions from being installed, though it is not commonly used. This is one of the reasons
the Py2/Py3 transition might look complicated to some people. One example is
fabric, which is not compatible with Python 3, but will happily install in a
Python 3 environment. In a perfect world you would just pip install your package
and get the latest compatible version without worrying about Legacy
Python/Python 3 compatibility, but we venture into packaging, so here be dragons.
Python 3 does improve on Python 2
Even if, as a scientist, you might see Python as only a tool, this tool relies on
a lot of work done by others. You might not feel the difference on a day-to-day
basis when using Python, but the features that Python 3 provides do make a
difference for the developers of the software stack you use. During the last few
months, we had the chance to get a new release of
that support the new python 3
__matmul__) operator, and a recent
release of Pandas, that now support more operation with
released its 1.5 version and
should relatively soon release the 2.0 version that changes the default color
scheme. These are only some of the
core packages that support the whole Scientific Python infrastructure, and the
support of both Legacy Python and Python 3 is a huge pain for developers.
Supporting a single major Python version for a code base makes the life of
maintainers much easier, and as the Python 2.x branch is coming to end of life in
2020, we can do our best to allow these maintainers to drop the 2.x branch.
I really think that the features offered by Python 3 make maintaining packages easier, and if it takes me 10 minutes less to track a bug down because I can re-raise exceptions, or even get a bug report with a multiple-stage exception, it gives me 10 more minutes to answer questions.
As a side note, if you think I'm exaggerating that Legacy Python support can be a huge overhead: I've already spent a day on this bug, and I'm giving up on writing a test that works on Python 2.
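To make the "multiple-stage exception" point concrete, here is a minimal sketch of Python 3 exception chaining. The `ConfigError` class and the `parse_port` and `describe_failure` helpers are hypothetical names invented for this illustration; they are not taken from any of the libraries mentioned here.

```python
class ConfigError(Exception):
    """Hypothetical domain-level error, used only for this illustration."""


def parse_port(raw):
    """Convert a raw setting to an int, chaining the low-level failure."""
    try:
        return int(raw)
    except ValueError as err:
        # `raise ... from err` stores the original exception as __cause__,
        # so the traceback shows both stages of the failure.
        raise ConfigError("invalid port: {!r}".format(raw)) from err


def describe_failure(raw):
    """Return the exception type names in the chain for a bad value."""
    try:
        parse_port(raw)
    except ConfigError as err:
        return [type(err).__name__, type(err.__cause__).__name__]
    return []
```

A bug report produced from such code carries both the high-level message and the underlying `ValueError`, which is the kind of information that saves those 10 minutes.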
Also, for the record, re-raising exceptions,
yield, keyword-only arguments
are things that made the science I did during my PhD easier. Maybe not better,
but easier for sure. I have been playing more and more with
yield from, the Python 3 AST,
generic unpacking since then, and they also help (read: I wish they
had been there, or that I had known how to use them, at the time). I strongly believe that the
quality and the speed at which you do science are influenced by your
environment, and Python 3 is a much nicer environment than Python 2 was. Of
course there is a learning curve, and yes, it is not easy: I went through it
more than a year ago, when it was harder than today. It was not fun, nor was I
convinced after a week, but after spending some time with Python 3, I really
feel impaired when having to go back to Legacy Python.
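For readers who have not met these features yet, here is a small sketch of three of them: keyword-only arguments, `yield from`, and unpacking. The `smooth` and `flatten` functions are made-up examples for illustration, and the `[*a, *b]` form assumes Python 3.5 or later.

```python
def smooth(signal, *, window=3):
    """Moving average; `window` can only be passed by keyword."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def flatten(nested):
    """Yield the leaves of arbitrarily nested lists using `yield from`."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the sub-generator
        else:
            yield item


first, *middle, last = [1, 2, 3, 4]  # extended unpacking
merged = [*middle, *[10, 20]]        # generalized unpacking (Python 3.5+)
```

Calling `smooth(data, 5)` fails with a `TypeError`, which forces call sites to spell out `window=5` and keeps analysis scripts readable months later.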
Just one Python
So while not all scientists see the immediate effects of Python 3, I would really hope for all the core packages to be able to just forget about Legacy Python compatibility and focus their energy only on pushing their libraries forward. It would really help push the scientific stack forward if all the SciPy-related libraries could be released more often. And one of the ways to help with that is to push Python 3 forward. We won't get end users to migrate to Python 3 if the lists of available features of packages are identical. The carrots are new features.
Mostly what I would like is only one Python. I honestly don't think that Python 2 has a long term future. New distribution
For the in-house legacy code that scientists cannot update, I'm really sorry for them, and it would be nice to find a solution. Maybe something close to PyMetabiosis, which would allow running Legacy Python modules inside Python 3? I understand the lack of funding and/or technical competence/time to do the migration. But asking a volunteer-based project, which also lacks funding, to maintain backward compatibility indefinitely also seems unreasonable. Not all projects have funding like IPython/Jupyter, and even those which get funding also have deliverables: if we don't ship the new features we promised to our generous sponsors, we will likely not get our grants renewed. More generally, this leads to the question of software sustainability in science. Is it unreasonable to ask a specific piece of software to work only in a restricted environment? It is definitely convenient, and that's why many papers now rely on VMs for results to be reproduced. But if you want your software to be used you have to make it work on new hardware and on new OSes, which often means new drivers and newer libraries, so why not a new version of the language you use? How many users are still not upgrading from XP? Is it unreasonable to ask them to install the versions of libraries that were distributed at that time?
A good single resource on how to handle the Python 2 to 3 transition is likely needed. Matt Davis created an example repository on how to write a Python 2 and 3 compatible extension. CFFI is in general something we should point users to, as it seems to be becoming the way which, in the long term, will lead to the least pain for future upgrades, and even to more backward compatibility with 2.6. Cython also allows writing pseudo-Python that can run at close to C speed. Cython also has a pure Python mode, which I think is under-used. I suppose that with Python 3 type hinting, function annotations could be used in the same way that the Cython pure Python mode's magic attributes work, to provide Cython with some of the type annotations needed to generate fast, efficient code. How to tackle the human problem is more complicated. It is hard to organize a community to help port scientific software to Python 3, as everyone is short of time and/or money. There is a need to realise that in-house software has a cost, and this cost at some point needs to be financed by institutions. The current cost is partially hidden, as it goes into the time of the graduate students and post-docs who in the end write the software. Graduate students and post-docs often lack the best practices of a good software engineer, which leads to technical debt accumulation.
Which path forward
The Python scientific community is still growing. Laboratories are still starting to adopt Python. Whether or not it is a good thing, Python 2 will reach its end of life in a couple of years. And despite the many in-house libraries that still haven't been ported to Python 3, it is still completely possible and reasonable to do science with only a Python 3 stack. It is important to make this new generation of Python programmers understand that the Python 3 choice is perfectly reasonable. In a perfect world the question of which Python to use should have an obvious answer, which is Python 3. Of course, as time passes, we will always have some need for developers speaking Legacy Python, in the same way that NASA is looking for programmers fluent in 60-year-old languages.
Reviving Python 2.7
I'm playing devil's advocate here, to try to understand how this could go forward. The scientific community is likely not the only one to use old versions of Python. Google recommends installing Python 1.6 to run its Real-time Google Drive collaboration examples. So once 2.7 support is officially dropped by the PSF, one can imagine Google (or any other company) taking over. I am not a lawyer, but I guess that in the case of a language revival, trademark might be an issue, so this new version might need to change its name.
It would not be the first time a fork of a project has become successful. Off the top of my head I can think of Gnome 2, which gave birth to Mate; Gcc, which was forked to Egcs (which in the end was renamed Gcc); as well as LibreOffice, a successful alternative to its "Open" sibling.
Seeing that the scientific community already lacks time and funding, I find the chance of this happening slim.
One of the main pains in the transition to Python 3 is keeping compatibility with both Pythons. Maybe one of the solutions that will come up is actually jumping over this whole Python 2/3 problem to an alternative interpreter like PyPy, which got a brand new release. The promise of PyPy is to make your Python code as fast as C, removing most of the need to write in lower-level languages. Other alternative interpreters like Pyston and Pyjion are also trying to use Just-In-Time compilation to improve Python performance. For sure it would still need some manual intervention, but it could greatly decrease the amount of work needed. I'm still unsure that this is a good solution, as C-API/ABI compatibility with CPython 2.x is a complex task that does hinder language development. More generally, CPython implementation details do leak into the language definition and make alternative implementations harder.
The Julia community dealt with migrating code from Fortran by writing a Fortran-to-Julia transpiler. Python has 2to3 and now the python-modernize tools; it might be possible to write better conversion tools that handle common use cases more seamlessly, or even import hooks that would allow importing packages across versions. Maybe having a common AST module for all versions of Python, and having tools work together more, is the way forward.
Make Python 3 adoption easier, or make it clearer what is not Python 3 compatible, both for humans and machines. There are some simple steps that can be taken: make sure your favorite library runs its tests on the latest Python version; it's often not too much work.
Point the default docs at Python 3 using canonical links. I too often come across examples that point to the Python 2 docs, because they are better referenced by Google. Often the difference between LPy and Py3 is small enough that the example works, but when it does not, it gives the impression that Python 3 is broken.
Tell us what you think, what you use, what you want, what you need, what you can't do.
This is a bit of an angry post. I won't explicitly give names, but if you know me you should be able to put the pieces together.
I'm angry because almost a year ago I warned a well-known London startup that they had a large security flaw in their software. They responded that they were aware and were working on a fix. Here we are almost a year later: the fix is still not implemented, all their users' passwords and credit card numbers are potentially in the wild, and they are responding to me on Twitter that basically I don't understand security.
I have a tip for them. If people disagree whether a system is secure or not, then the system is most likely not secure.
Do you trust your postman ?
To be more accessible, I'll take the example of Alice and Bob communicating by snail mail, instead of using https/ssl encryption jargon.
Alice and Bob have never met, though they communicate regularly by mail. Bob is good at French, and he is helping Alice improve her French by correcting her translations. To thank Bob, Alice decides to send him a gift, for now in the form of money. As Alice does not trust the postman not to look into the mail and steal cash, she decides to send a paper check. This does not prevent the postman from stealing the check, but if Bob's account number is written on the check, the postman will not be able to cash it.
The solution implemented by the above not mentioned company is the following:
Alice sends a letter asking Bob for his account number. Alice receives a response with an account number. Alice writes a check for this account and sends it to Bob's bank, whose address came with the account number.
According to ANMC (Above Not Mentioned Company) this is secure, as Alice does send a check with Bob's bank account number to Bob's bank.
So let's see what the postman can do:
The postman intercepts Alice's letter to Bob asking for his account number. He does not even bother forwarding the mail to Bob, and responds to Alice with a letter impersonating Bob, containing the postman's bank address and the postman's account number. Alice sends a paper check to the postman's bank without even realising that Bob was not receiving her mail.
So in the end, ANMC, your implementation is wrong. If your login page is served over http, and you tell me that the password is sent over https, nothing prevents me from man-in-the-middling your login form and having it submitted to a different address.
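The rule behind this argument fits in a few lines. The sketch below is only an illustration of the reasoning, not a security tool: the function name `login_form_is_tamper_proof` and the example URLs are hypothetical, and a real audit would need to look at much more than the URL schemes.

```python
from urllib.parse import urljoin, urlparse


def login_form_is_tamper_proof(page_url, form_action):
    """Return True only if both the page and its form target use HTTPS.

    An HTTPS form target does not help when the page carrying the form is
    served over plain HTTP: an attacker on the network can rewrite the
    `action` attribute before the browser ever sees it.
    """
    target = urljoin(page_url, form_action)  # resolve relative actions
    return (urlparse(page_url).scheme == "https"
            and urlparse(target).scheme == "https")
```

With this check, `login_form_is_tamper_proof("http://example.com/login", "https://example.com/session")` is `False`: it is exactly the postman scenario, where the letter asking for the account number travels unprotected.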
I'm not a clown
This is basic security if you care even a bit about what you are doing, and please be aware that people who take the time to submit such bug reports to you are not clowns, and actually take time to explain these things to you. Please respect them instead of just raising funds from venture capitalists.
You have a statement on your website that you care about user privacy; apparently you don't.
As usual, I'm not good at writing English, so I appreciate any Pull Request that fixes grammar, spelling and English expressions.
On September 18 and 19, 2015, the Data Structure for Data Science workshop gathered at UC Berkeley's BIDS [Berkeley Institute for Data Science]. It was a productive two days of presentation, discussion and working groups — a collaborative effort aimed at expanding what data science can do.
Despite having mostly Python developers, the workshop reached out and included members from many other programming communities (e.g., C, C++, Julia, R, etc.) as the workshop's explicit goal was to improve cross language operability. In particular, the goal was to enable python's scientific computing tools (numpy, scipy, pandas, etc.) to have a consensus backbone data-structure that would enable easier interaction with other programming languages.
Out of the discussion arose a topic that has long plagued the Python community at large: code that requires legacy Python 2.7 is holding back the development of data-science toolsets and, by extension, the progress of data science as a whole. Python 2.7 was an important part of the history of scientific computing, but now it should be left as part of that history. Thus, we convened a small working group to plan an early death for Legacy Python.
Move over Legacy Python once and for all.
In collaboration with many developers, among whom @jiffyclub, @tacasswell, @kbarbary, @teoliphant, @pzwang and @ogrisel, we discussed different options to push Legacy Python more or less gently through the door. We understand that some people still require Legacy Python in their code base, or use some libraries which are still only available on Legacy Python, and we don't blame them. We understand that Legacy Python was a great language and that it's hard to move on from it. But the retirement of Legacy Python is in 2020; you will have to make the transition by then, and it will be even harder to transition at that point.
So what are the steps we can take to push the transition forward?
Choose your words.
The choice of words you make on the internet and in real life will influence the vision people have of Legacy Python vs Python. Assume that Python 3 is just Python, and refer to Python 2 as Legacy Python. IDEs and the Twittersphere are starting to do that; join the movement.
Refer to Legacy Python in the past tense. It will reinforce the old and deprecated state of Legacy Python. I still don't understand why people would like to stay with a language which had that many defects:
- did not protect you from mixing Unicode and bytes
- tripped you up with integer division
- did not allow you to replace the printing function
- had a range object which was not memory efficient
- did not permit re-raising exceptions
- had bad asynchronous support, without yield from
- forced you to repeat the current class in
- let you mix tabs and spaces
- did not support function annotations
Legacy Python was missing many other features which are now part of Python.
Do not state what's better in Python 3; state that it was missing or broken in Legacy Python. For example, the matrix multiplication operator was missing: Legacy Python was preventing people from using efficient numeric libraries which rely on that operator.
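A few of the points above can be checked directly in a Python 3 interpreter. The snippet below is a small sketch; the `Matrix2` class is a toy written only to show the `@` operator (which needs Python 3.5 or later) and is not meant as a replacement for a real numeric library.

```python
class Matrix2:
    """Tiny 2x2 matrix, hypothetical, only here to show the `@` operator."""

    def __init__(self, a, b, c, d):
        self.rows = ((a, b), (c, d))

    def __matmul__(self, other):
        (a, b), (c, d) = self.rows
        (e, f), (g, h) = other.rows
        return Matrix2(a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)


def mixing_bytes_and_text_fails():
    """Python 3 refuses to silently concatenate bytes and str."""
    try:
        b"bytes" + "text"
    except TypeError:
        return True
    return False


half = 7 / 2            # true division gives 3.5
floor = 7 // 2          # floor division has to be asked for explicitly
lazy = range(10 ** 12)  # no giant list is built in memory
product = Matrix2(1, 2, 3, 4) @ Matrix2(0, 1, 1, 0)
```

On Legacy Python, `7 / 2` silently returned `3`, `range(10 ** 12)` tried to allocate the whole list, and the `@` line was a syntax error.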
Don't respond to "and on Python 2"
Personally, during talks I plan not to pay attention to questions regarding Legacy Python, and will treat such questions as if someone were asking whether I support Windows Vista. Next question, please. The less you talk about Legacy Python, the more you imply Legacy Python is not a thing anymore.
Drop support for Legacy Python (at least on paper)
If you are a library author, you have probably had to deal with users trying your software on Legacy Python, and spent a lot of time making your codebase compatible with both Python 3 and Legacy Python. There are a few steps you can take to push users toward Python 3.
If a user is not willing to update to a new version of Python and decides to stay on Legacy Python, they can most likely pin your library to versions which support Legacy Python.
Make your examples/documentation Python 3 only
Or at least do not make effort to have examples that work using Legacy Python.
Sprinkle with function annotation, and
await keyword can help with
communicating your example are Python 3 only.
You can even avoid mention of Legacy Python in your documentation and assume your users are using Python 3, this will make writing documentation much easier, and increase the chances to get examples right.
Ask users to reproduce on an up-to-date Python version.
Have you ever had a bug report where you asked users to upgrade your library's dependencies? Do the same with Python. If a user files a bug report with Python 2.7, ask them if they can reproduce it with an up-to-date version of Python, even if the bug is obviously on your side. If they really can't upgrade, they will know; if they do and can reproduce, then you'll have at least converted one user from Legacy Python (and in the meantime you might have already fixed the bug).
Defer 2.7 support to companies like Continuum
This is already what Nick Coghlan recommends for Python 2.6, and that's what you can do for Legacy Python fixes. If you have a sufficient number of users asking for 2.7 support, accept the bug report, but as an open source maintainer do not work on it. You can partner with companies like Continuum or Enthought, from which users would "buy" 2.7 support for your libraries, in exchange for which the companies could spend some of their developer time fixing your Legacy Python bugs.
After a quick discussion with Peter Wang, it seems this would be possible, but the details need to be worked out.
Make Python 3 attractive
Create new features
Plan your new features explicitly for Python 3. Even if a feature would be simple to make Legacy Python compatible, disable it on old platforms, and issue a warning indicating that the feature is not available on Legacy Python installs.
You will be able to use all the shiny Python features which are lacking on Legacy Python like Unicode characters !
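Gating a feature this way takes very little code. Here is a minimal sketch, assuming a made-up feature called `fancy_report`; the name, the warning text and the `LEGACY` flag are all hypothetical.

```python
import sys
import warnings

# True when running on Legacy Python (any 2.x interpreter).
LEGACY = sys.version_info < (3,)


def fancy_report(data):
    """Hypothetical new feature that is only enabled on Python 3."""
    if LEGACY:
        warnings.warn("fancy_report is not available on Legacy Python; "
                      "please upgrade to Python 3.")
        return None
    return ", ".join("{}={}".format(k, v) for k, v in sorted(data.items()))
```

The module still imports on Legacy Python, so existing users are not broken; they simply see what they are missing.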
Create new Python packages
Make new packages Python 3 only, and make all the design decisions you didn't make on your previous package. Pure Python libraries are much easier to create and build once you are not held back by Legacy Python.
Helping other projects
Despite all the good will in the world, the migration path from Legacy Python can be hard. There are still a lot of things that can be done to help current and new projects push forward the adoption of Python.
Make sure that all the projects you care about have continuous integration on Python 3, and if possible even have the documentation built with Python 3; help make Python 3 the default.
With continuous integration, check that your favorite projects are tested on
Python nightly. Most CI providers allow the tests to be run on nightly without
making the status of the project turn red if the tests are failing. See
on Travis-CI for example.
Porting C-extensions, move to Cython
The path to migrate C extensions is not well documented. The preferred approach is to use CFFI, but there is still a lack of a well-written, centralised document on how to integrate with Python 3. If you are knowledgeable in this domain, your help is welcome.
The things we will (probably) not do.
Make a Twitter account that shames people who use Legacy Python, though we might do a parody account which says funny things and pushes people toward Python 3.
Slow code on purpose and obviously on Legacy Python:
if sys.version_info < (3,):
Though it would be fun.
Ask users at IDE/CLI startup time if they want to upgrade to Python 3:
```bash
$ python2
Python 2.7.10 (default, Jul 13 2015, 12:05:58)
Type "help", "copyright", "credits" or "license" for more information.

Warning you are starting a Legacy Python interpreter. Aren't you sure you don't want not to upgrade to a newer version ? [y]:_
```
Delay Legacy Python package releases by a few weeks to incentivize people to migrate. Or should we actually consider the people on Python 2 as guinea pigs and release nightly?
Remember, Legacy Python is responsible for global warming, encourages people to stay with IE6, and is voting for Donald Trump.
If you have any ideas, please send me a Pull Request, I'll be happy to discuss.
As usual my English is far from perfect, so Pull Requests are welcome on this blog post. Thanks to @michaelpacer, who already did some rereading/rephrasing of the first draft.
As you may know, we have a Jupyter logo, as an SVG. But just like the kilogram, which has a prototype kilogram that acts as a reference, we suffer from the fact that the logo does not have an abstract description that could explain how to construct it, which is bad.
So with a little bit of reverse engineering and Inkscape I was able to extract some geometric primitives to build the Jupyter logo.
There is still work that needs to be done, especially for the gradient, to get the endpoints and the colors.
But this allows us to do some nice things, like plotting the logo in matplotlib :-)
import numpy
from matplotlib.patches import Circle, Wedge
import matplotlib.pyplot as plt

def Jupyter(w=5, blueprint=False, ax=None, moons=True, steps=None, color='orange'):
    ## x, y center and diameter of primitive circles,
    ## thanks to Inkscape.
    xyrs = numpy.array((
        (315, 487, 406),
        (315, 661, 630),
        (315, 315, 630),
        (178, 262, 32),
        (146, 668, 20),
        (453, 705, 27),
    ))
    # move the origin to the center of the main circle
    center = (315, 487)
    xyrs[:, 0] -= center[0]
    xyrs[:, 1] -= center[1]
    if ax is None:
        fig, axes = plt.subplots()
        fig.set_figheight(w)
        fig.set_figwidth(w)
    else:
        axes = ax
    # set_facecolor replaces the older set_axis_bgcolor
    axes.set_facecolor('white')
    axes.set_aspect('equal')
    axes.set_xlim(-256, 256)
    axes.set_ylim(-256, 256)
    t = -1
    ec = 'white'
    if blueprint:
        ec = 'blue'
    for xyr in xyrs[:steps]:
        # first two columns are the center, the third is the diameter
        xy, r = xyr[:2], xyr[2]
        if r == 630:
            fill = True
            c = 'white'
        else:
            fill = True
            c = color
        if r == 630:
            # the two large circles only contribute an arc each
            a = 40
            axes.add_artist(Wedge(xy, r/2, a+t*180, 180-a+t*180,
                                  fill=fill, color=c, width=145, ec=ec))
            t = t+1
        elif r == 406:
            # main disc
            axes.add_artist(Circle(xy, r/2, fill=fill, color=c, ec=ec))
        else:
            if r < 100 and moons:
                axes.add_artist(Circle(xy, r/2, fill=fill, color='gray', ec=ec))
        if blueprint:
            # mark the center of each primitive
            axes.plot(xy[0], xy[1], '+b')
    axes.xaxis.set_tick_params(color='white')
    axes.yaxis.set_tick_params(color='white')
    axes.xaxis.set_ticklabels([])
    axes.yaxis.set_ticklabels([])
    return axes
ax = Jupyter(10, blueprint=False)
And let's make the outline apparent:
ax = Jupyter(10, blueprint=True)
Colors and gradients are not right yet, but it looks great!
For small graphs you can also remove the moons.
fig, axes = plt.subplots(2, 3)
for ax in axes.flatten():
    Jupyter(ax=ax, moons=False)
If you run it locally, below you will have an interactive version that will show you the different steps of drawing the logo.
from IPython.html.widgets import interact cc = (0,1,0.01) @interact(n=(0,6),r=cc,g=cc, b=cc) def fun(n=6, outline=False, r=1.0, g=0.5, b=0.0): return Jupyter(steps=n, blueprint=outline, color=(r,g,b))
As usual, PRs welcome!
One of the annoying things when I want to publish a blog post like this one is that I never remember how to deploy my blog. So why not automate it completely with a script?
Well, that's one step, but you know what is really good at running scripts? Travis.
Travis has the nice ability to run scripts in the after_success category and to encrypt files, which allows a nice deployment setup.
The first step is to create an ssh key with an empty passphrase.
I like to add it (encrypted) to a .travis folder in my repository.
Travis has nice docs for that.
Copy the public key to the deploy keys of the target GitHub repository, in its settings.
In my particular setup the tricky bits were:
To get IPython and nikola master:
- pip install -e git+https://github.com/ipython/ipython.git#egg=IPython
- pip install -e git+https://github.com/getnikola/nikola.git#egg=nikola
Get the correct layout of folders:
- blog (gh/carreau/blog)
- posts (gh/carreau/posts)
- output (gh/carreau/carreau.github.io)
I had to soft link posts, as this is the repo which triggers the Travis build, and it is cloned by default into ~/posts by Travis.
carreau/carreau.github.io is cloned via https to allow pull requests to build (and not push), as the ssh key can only be decrypted on the master branch.
In the after_success step (you might want to run unit tests first, like checking for broken links on your blog) you want to check that you are not in a PR, and that you are on the master branch, before trying to decrypt the ssh key and push the built website:
The following snippet works well.
if [[ $TRAVIS_PULL_REQUEST == false && $TRAVIS_BRANCH == 'master' ]];
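If the deploy step is driven from a Python script rather than from the shell, the same guard can be sketched as follows. This is only a minimal sketch and not part of my actual setup: `should_deploy` is a hypothetical helper name, and it relies only on the two environment variables used in the snippet above (Travis sets `TRAVIS_PULL_REQUEST` to the string `false` when the build is not a pull request).

```python
import os


def should_deploy(env=None):
    """Return True only for a non-PR build of the master branch.

    `env` is any mapping of environment variables; it defaults to
    os.environ so the function can be called as-is on Travis.
    """
    env = os.environ if env is None else env
    not_a_pull_request = env.get("TRAVIS_PULL_REQUEST") == "false"
    on_master = env.get("TRAVIS_BRANCH") == "master"
    return not_a_pull_request and on_master


# Hypothetical environments, passed explicitly so the check can be tried locally.
print(should_deploy({"TRAVIS_PULL_REQUEST": "false", "TRAVIS_BRANCH": "master"}))
print(should_deploy({"TRAVIS_PULL_REQUEST": "12", "TRAVIS_BRANCH": "master"}))
```

Passing the environment as a plain dictionary keeps the check testable outside of Travis, which helps given that the Travis environment might differ from yours.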
The Travis CLI tool gave you the instructions to decrypt the ssh key; you now "just" have to add it to the keyring.
eval `ssh-agent -s`
chmod 600 .travis/travis_rsa
ssh-add .travis/travis_rsa
cd ../blog/output
And add an ssh remote whose user is git:
git remote add ghpages ssh://git@github.com/Carreau/carreau.github.io
Push, and you are done (don't forget to commit though)!
When testing, do not push to master; push to another branch and open the PR manually to see the diff. The Travis environment might differ a bit from yours.
Nikola reads metadata from a .meta file, which is annoying. I should patch it to read metadata from the notebook metadata. It also needs a JS extension to make that easy.
PRs and comments welcome as usual.
It occurred to me recently that we (the IPython team) are relatively bad at communicating. Not that we don't respond to questions, or welcome pull requests, or won't spend a couple of hours teaching you git even if you just want to update a URL. We are bad at writing good recaps of what has been done. Of course we have weekly meetings on YouTube, plus notes on Hackpad, but they are painful to listen to, and not always friendly.
I will hence try to gather in this small post some of the latest questions I think need to be answered.
What is Jupyter?
You might have seen Fernando's talk at SciPy last year that introduced Jupyter. It was a rough overview, a bit historical and technical at some points. In even less technical words, it's anything in IPython that could apply to languages other than Python. We are creating this to welcome non-Pythonistas into the environment and promote inter-language interoperability.
Can I be part of Jupyter?
Yes. Right now only the IPython team are Jupyter members (we are drafting the legal stuff), but if you want to be part of the steering council we'll be happy for you to join. We would especially like you to join if you have a different background from any of us. Are you using Jupyter/IPython a lot for teaching? Do you maintain a Frobulate-lang kernel? Your point of view will be really useful!
Yes, here again we are still drafting things, but we want more visibility for sponsors and/or companies that develop products based on Jupyter, work with Jupyter, and contribute back. If you have a specific request or question, please contact us, we'd like to know your point of view!
Do I need technical expertise?
No. You prefer to write blog posts? Do user testing? Record demos on YouTube of how to use a piece of software? That's helpful. You are a designer? You want to translate things? That's awesome too. You like doing paperwork and challenging administrative tasks? We can probably ask you how to set up a non-profit organisation to organise all that. You don't really know anything, but understand a bit who deals with what on the project? Your help will be useful for assigning people tasks on the project! You write better English than I do? Great, English is not my first language and I definitely need someone to point out my mistakes.
When will the first public release of Jupyter be?
Well, technically Jupyter is not only a project but an organisation. We already have a long list of projects you might have heard of.
If you are referring to The Notebook, then current IPython master is the closest thing to Jupyter 1.0.
There will probably be no IPython Notebook 4.0, as it might be named Jupyter 1.0.
Will IPython disappear?
No, it will not. IPython the console REPL will stay IPython. The IPython kernel will probably become a separate project called IPython-kernel that will be usable in the IPython notebook. Basically most of the features of IPython (traitlets, widgets, config, kernel, qtconsole...) would disappear from the IPython repo.
You are removing Traitlets/Widgets/Magics...?
Not quite, we will make them separate repositories, sometimes under the Jupyter organisation, sometimes under IPython.
For example the QtConsole is not Python specific (well, it is written in Python), but can be used with Julia, Haskell... Thus it will be transferred to the Jupyter org, where it will have its own dedicated team and release schedule. This will allow quicker releases, and make it easier for contributors to get commit rights. The team in charge will also have more flexibility on the features shipped. Right now we would probably refuse a qtconsole background being a pink fluffy unicorn; once the split is done the responsible team can decide to do so.
Traitlets, on the other hand (which are used everywhere from config to widgets), will be under the IPython organisation as they are specific to the IPython kernel. They could be used in matplotlib to configure loads of things. Once split, we will probably add things like numpy traitlets, and a lot of other requested things.
IPython will probably start depending on (some of) these subpackages.
Can you describe the future packages that will exist?
Roughly, we will have a package/repo for:
- The Notebook
- maybe even 2 packages, 1 for the Python side, the other for the JS side
- IPython as a kernel
- IPython as a CLI REPL.
Splitting this will mostly be the work we will do in between IPython 3.0 and Jupyter 1.0.
Can you give a timescale?
IPython 3.0 beta in early February, release at the end of February, Jupyter 1.0 three months later. If we get more help from people, maybe sooner.
What about new features?
Starting with IPython 3.0, you can have a multi-user server with Jupyter-hub. I, in particular, will be working on real-time collaboration and integration with Google Drive (think of something like Google Docs). As it will not be Python specific, it will be under the Jupyter umbrella.
Only Google Drive?
Google gave us enough money to hire me for a year. I'll do my best. With more money and more people we could go faster and have more backends. Development is also open, and PRs and testing are welcome!
It would be awesome if something like IPython existed for <a language>!
Well, it's not a question, but it's called Jupyter. Please don't reinvent the wheel, come talk to us, we would love to have interoperability or give you that hook you need. Who knows, you could help steer the project in a better direction.
Since the last post lots of things have happened. I had 2 pull requests on GitHub to fix some typos in previous posts (now updated). We had yet another Jupyter/IPython developer meeting in Berkeley. By the way, part of IPython will be renamed Jupyter. See Fernando Perez's slides and talk. I got my PhD in my spare time when not working on IPython. I went to EuroSciPy, hacked in London at the Bloomberg headquarters for a week-end organized by BloombergLab, got a postdoc at Berkeley BIDS under the direction of the above-cited IPython BDFL, and am about to move almost 9000 km from my current location to spend a year working on improving your everyday scientific workflow from the West of the United States. I think that's not too bad.
Since I now have a little more time than usual to do some Python every day, as this is officially my day job, I was confronted with a few Python problems, and also saw some amazing stuff.
E.g., James Powell's AST expression hack of CPython: (original tweet)
>>> e = $(x+y) * w * x
>>> e(x = $z+10, w = 20)(30, 40)
(×, [(×, [(+, [(+, [40, 10]), 30]), 20]), (+, [40, 10])])
>>> _()
80000
It gives you unbound Python expressions that you can manipulate as you like. Pretty awesome, and it would allow some neat metaprogramming. It would be nice if he could release his patch so that we could play with it.
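Without a patched interpreter, a rough approximation of the idea can be had from the standard `ast` module: parse the expression from a string, keep the tree around, and only bind the names at evaluation time. This is only a sketch of the concept and certainly not how the hack above is implemented; `unbound` and `evaluate` are hypothetical helper names.

```python
import ast


def unbound(source):
    """Parse an expression and return its tree plus its free variable names."""
    tree = ast.parse(source, mode="eval")
    names = sorted({node.id for node in ast.walk(tree)
                    if isinstance(node, ast.Name)})
    return tree, names


def evaluate(tree, **bindings):
    """Evaluate a parsed expression with the given name bindings."""
    code = compile(tree, "<unbound expression>", "eval")
    # No builtins are exposed: only the bindings passed in are visible.
    return eval(code, {"__builtins__": {}}, bindings)


tree, names = unbound("(x + y) * w * x")
print(names)

# Same numbers as in the session above: x = 40 + 10, y = 30, w = 20.
value = evaluate(tree, x=40 + 10, y=30, w=20)
print(value)
```

Because the tree is an ordinary `ast` object, it can be inspected or rewritten before being compiled, which is where the metaprogramming would come in.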
A discussion with Olivier Grisel (scikit-learn) raised the point that a good package to deal with deprecation would be nice. Not just deprecation warnings, but a decorator/context manager that takes as a parameter the version of the software at which a feature is removed, and warns you that you have some code you can clean up.
Using grep to find deprecation warnings doesn't always work, and sometimes you would like a flag to just raise instead of printing a deprecation warning, or even do nothing when you are developing against a dev library.
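To make the idea a bit more concrete, here is a minimal sketch of what such a decorator could look like. It is not an existing package: the name `deprecated`, its parameters, and the convention of passing versions as tuples of integers are all assumptions made for the example, and only the decorator half of the idea (not the context manager) is shown.

```python
import functools
import warnings


def deprecated(removed_in, current_version, mode="warn"):
    """Mark a function as scheduled for removal in version `removed_in`.

    Versions are tuples of ints such as (3, 0).
    `mode` is "warn", "raise" or "ignore".
    """
    if mode not in ("warn", "raise", "ignore"):
        raise ValueError("unknown mode: %r" % (mode,))

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if mode != "ignore":
                target = ".".join(str(part) for part in removed_in)
                if current_version >= removed_in:
                    # The removal version has been reached: dead code to clean.
                    message = ("%s was due for removal in %s: this code can "
                               "be cleaned up" % (func.__name__, target))
                else:
                    message = ("%s is deprecated and will be removed in %s"
                               % (func.__name__, target))
                if mode == "raise":
                    raise DeprecationWarning(message)
                warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(removed_in=(3, 0), current_version=(2, 5))
def old_api(x):
    return 2 * x


@deprecated(removed_in=(3, 0), current_version=(3, 1), mode="raise")
def stale_api(x):
    return 2 * x


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api(21)  # still works, but records a DeprecationWarning

stale_message = None
try:
    stale_api(21)
except DeprecationWarning as error:
    stale_message = str(error)

print(result, caught[0].message)
print(stale_message)
```

In a real package `current_version` would presumably be read from the package's own version, and `mode` from a global flag or environment variable, so that a test suite can switch everything to "raise" in one place.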
Jupyter/IPython 3.0 is on its way. Slowly, we are late as usual. A recap will come at the right time.
I am working on integration with Google Drive, and will try to add real-time synchronisation to Jupyter 1.0/IPython 4.0. We plan on adding other Operational Transform library backends, but working on Google Drive first got me my postdoc. (Yes, Google signed a check to UC Berkeley.)
Of course new projects mean more packaging, and as you all know Python packaging is painful. Especially when starting a new project, you need to get all the basic files, etc.
Of course you can use projects like audreyr/cookiecutter, but then you still have the hassle to:
- set up the GitHub repository,
- find a name,
- hook up Travis-CI and find the annoying setting that lets you trigger builds,
- hook up Readthedocs,
- ...
- and so on and so forth: everything that every project should do.
I strongly believe that intelligence should be in the package manager, not the packager, and my few experiences with Julia's Pkg, and a small adventure with nodejs's npm, convinced me that there is really something wrong with Python packaging.
I'm not for a complete rewrite of it, and I completely understand the need for really custom setup scripts, but the complexity just to create a Python package is too high. So, as a week-end hack, introducing...
So yes, I know that cookiecutter has pre- and post-hooks that might allow you to do that, but that's not the point. I just want a (few) simple commands I can remember and which can do most of the heavy lifting for me.
I shamelessly call it PipCreate, because I hope that at some point in the future you will be able to just do a $ pip create to generate a new package.
So to work, beyond dependencies, it just needs a GitHub private token (and maybe that you log in to TravisCI once, but I haven't tested). The best thing is, if you don't have a GitHub token available, it will even open the browser for you at the right page. Here are the steps:
$ python3 -m pipcreate # need to create a real executable in future
Now either it grabs the GitHub token from your keychain, or asks you for it, and then does all the rest for you:
$ python3 -m pipcreate
will use target dir /Users/bussonniermatthias/eraseme
Comparing "SoftLilly" to other existing package name...
SoftLilly seem to have a sufficiently specific name
Logged in on GitHub as Matthias Bussonnier
Workin with repository Carreau/SoftLilly
Clonning github repository locally
Cloning into 'SoftLilly'...
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
I am now in /Users/bussonniermatthias/eraseme/SoftLilly
Travis user: Matthias Bussonnier
syncing Travis with Github, this can take a while...
.....
syncing done
Enabling travis hooks for this repository
Travis hook for this repository are now enabled.
Continuous interation test shoudl be triggerd everytime you push code to github
/Users/bussonniermatthias/eraseme/SoftLilly
[master (root-commit) f4a9a5d] 'initial commit of SoftLilly'
 28 files changed, 1167 insertions(+)
 create mode 100644 .editorconfig
 create mode 100644 .gitignore
 create mode 100644 .travis.yml
 create mode 100644 AUTHORS.rst
 create mode 100644 CONTRIBUTING.rst
 create mode 100644 HISTORY.rst
 create mode 100644 LICENSE
 create mode 100644 MANIFEST.in
 create mode 100644 Makefile
 create mode 100644 README.rst
 create mode 100644 docs/Makefile
 create mode 100644 docs/authors.rst
 create mode 100755 docs/conf.py
 create mode 100644 docs/contributing.rst
 create mode 100644 docs/history.rst
 create mode 100644 docs/index.rst
 create mode 100644 docs/installation.rst
 create mode 100644 docs/make.bat
 create mode 100644 docs/readme.rst
 create mode 100644 docs/usage.rst
 create mode 100644 requirements.txt
 create mode 100644 setup.cfg
 create mode 100755 setup.py
 create mode 100755 softlilly/__init__.py
 create mode 100755 softlilly/softlilly.py
 create mode 100755 tests/__init__.py
 create mode 100755 tests/test_softlilly.py
 create mode 100644 tox.ini
Counting objects: 32, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (25/25), done.
Writing objects: 100% (32/32), 13.41 KiB | 0 bytes/s, done.
Total 32 (delta 0), reused 0 (delta 0)
To git@github.com:Carreau/SoftLilly.git
 * [new branch]      master -> master
It finishes up by opening Travis on the test page after a push, so that you can see the first test passing after a few seconds.
Hope that will give you some ideas; patches welcome.
Almost two weeks ago was the beginning of PyCon 2014. I took a break during the writing of my dissertation to attend, and even if I was a little concerned because it was the first non-scientific-centric Python conference I attended, I am really happy to have made the choice to go.
I'll write what will hopefully be a (short) recap of (my view of) the conference. Maybe not in perfect chronological order though.
So this year, PyCon US was... in Montreal, Canada, a 7 hour flight from where I live, in a seat narrower than me, and I'm not that wide. Fortunately there was video on demand. I watched Gravity, but I must say I was really disappointed. Being a physicist, I saw a lot of mistakes that ruined the movie. Guys, it is worthless to program an engine that simulates physics if you give it the wrong rules. From basic physics classes you will learn the following:
- Angular momentum conservation also applies in space.
- There is no buoyancy convection, hence no "flame" like the one on earth.
- Usually the laws of physics don't change with time.
Landing in Canada, customs were more thorough than US customs. They actually looked up what PyCon was, and asked me questions about it.
Going from the airport to the center of Montreal was quite easy, direct by bus 747. Finding the right stop was harder, as the bus driver was announcing stop numbers and not the names of the stations, and no map was available in the bus.
The hotel was a block away from the conference center (but not that easy to find when you enter by the given street address and not the main entrance).
So I was finally settled around 18 hours before the beginning of the tutorials, tweeted to know if anybody was around, met with Agam, got dinner, slept.
During the first day I was able to attend two tutorials, both of which I enjoyed, and which allowed me to adjust to the right timezone. The 1h lunch is just enough to meet with people and discuss; fortunately you get (some) time to discuss and meet people in the evening.
I caught up with Fernando, who was giving the IPython tutorial the next morning.
By the way, do not forget to update your phone's apps before leaving the country, especially the ones that need to send you a text message to work, as you may not have roaming when you are abroad.
The IPython tutorial on Thursday went great, except for some really weird installation issues where:
Anaconda refused to install because it was already installed, yet was nowhere on the hard drive, and in the end you force a miniconda install and do all the dependencies by hand.
Of course you brought pen drives to spare users from downloading the 300 MB installer over the conference wifi, but you forgot the Win-XP-32bit installer.
I was really surprised to see people still using IPython 0.12, and the modal interface seems to make sense to people who have never used IPython.
One thing that really makes PyCon (and the Python community in general) so great is the people. I basically missed many tutorials and talks during the week only because I was discussing with people. I must say that I learned a lot by discussing with Nick Coghlan and Fernando Pérez (OK, as a non-native English speaker I might have been really quiet, but I learned a lot, and I'm getting better, I'm starting to hear people's accents).
If you want something in Python core, you need to be persistent. By default, anything you propose to the CPython core community/PSF will be refused, so you need to persevere. If you persevere, you show the community that you are ready to put time into supporting what you are proposing.
Also keep in mind that even if there are obvious issues (ahem, packaging), people are working on them, and you probably don't know the implications:
So please either be patient, or help them with the most basic thing missing: documentation! As seen later in the conference, CPython/packaging/documenting/pip, which once were difficult to distinguish, are becoming separate spaces; for now the people tend to be the same, but things are shifting.
Becoming a member of the PSF is now also open to everybody, so you can now get even more involved in Python.
Even after missing one tutorial, the afternoon can get busy: volunteers packed stuff in bags for the (real) opening of the conference the day after. It is also the time when sponsors are starting to set up their booths and when you can talk to people before the big crowd. Usually it is also the time to finally see in flesh and blood people you only know by their twitter/github/... handle. Often it even takes you a day or two to find out you already know each other.
Unlike other meals, there was (a limited number per person of) free beer; I still found it a little annoying that you had to pay for soft drinks afterward. It is also the right time to look at the different competitions around the different booths to win stuff. These go from LinkedIn, which had an informal "do this puzzle as fast as you can", to Google, where you actually had five minutes on the clock to program an AI for your snake in a "Snake Arena" game.
The reception ended pretty early as the conference had not yet started, but if you plan on arriving the day before, it is a must-attend if you want to start meeting people and start adjusting to the timezone when you are flying from further west.
Still on an empty stomach (except for 3 beers), I met with Jess Hamrick (who describes her version of PyCon here) to eat dinner, before crashing to sleep. I won't repeat what Jess has said, as she is much more skillful with words than I am.
I really appreciated the mornings during the main conference, as breakfast was served in the convention center, which allows you to socialize early in the morning.
from IPython.display import Image
Image('img/breakfast.jpg', width=600)
As you can see there is room for plenty of people!
I won't spend too much time on the keynotes and talks that were presented; they are already all uploaded on PyVideo / YouTube, and I must say that the PyCon team in charge of filming/organising the events did a pretty awesome job!
In general I felt like PyCon was really different from SciPy, first by the number of people present, which is huge:
from IPython.display import Image
Image('img/crowd.jpg')
Main conference room before keynote (stolen from twitter at some point; if you find the author again, please tell me so that I can update the post)
Also unlike SciPy, PyCon is much less oriented toward science, which both gave me the opportunity to discover lots of projects and made me find out that in the end, IPython is not that widely used and known. Having been once in Texas and once in Canada, in my own experience conference food during breaks is always better in Austin than in Montreal.
In the evenings we either went to restaurants, or to parties organised by speakers or sponsors. Note that even if it is great to have an open bar (and not cheap alcohol, but nice wine) and unlimited cake, I felt like real food was missing.
When going to another continent, I usually prefer to optimise and stay as long as I can, hence I also stayed for the sprints (I'll write my PhD dissertation later). On the other hand, Fernando Perez is a busy man and was only present for the first day and a half. Hence we were not supposed to have a real IPython sprint on the first day.
So we arrived at the sprint on Monday morning off the cuff, expecting to have only a few people for the sprint. To our surprise, after 1h we were around 20 around the table, with our number growing. We were more and more in need of a blackboard for a global explanation of the IPython internals. Fortunately our hacker spirit led us to a PyCon sign that we stole, using its back as a blackboard: