Writing an async REPL - Part 1
This is the first part in a series of blog posts which explain how I implemented the ability to await
code at the top-level scope in the IPython REPL. Don't expect the second part soon, or bother me for it. I know I should write it, but time is a rare luxury.
It is an interesting adventure into how Python code gets executed, and I must admit it changed quite a bit how I understand Python code nowadays and made me even more excited about async/await in Python.
It should also dive quite a bit into the internals of Python/CPython, if you are ever interested in what some of these things are.
# we cheat and deactivate the new IPython feature to match Python repl behavior
%autoawait False
Async or not async, that is the question
You might not have noticed it, but since Python 3.5 the following is valid Python syntax:
async def a_function():
    async with contextmanager() as f:
        result = await f.get('stuff')
        return result
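To run something with that shape without any network library, here is a minimal sketch. `FakeClient` is a hypothetical stand-in invented for illustration (it is not part of aiohttp or any real library), and the sketch assumes Python 3.7 or later for `asyncio.run`.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for a network client; not a real library API."""

    async def __aenter__(self):
        # Called when entering the `async with` block.
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Called when leaving the block; returning False lets exceptions propagate.
        return False

    async def get(self, key):
        # Give control back to the event loop once, like real I/O would.
        await asyncio.sleep(0)
        return "value for " + key


async def a_function():
    async with FakeClient() as f:
        result = await f.get("stuff")
        return result


if __name__ == "__main__":
    print(asyncio.run(a_function()))
```

The two `await` points (inside `get`, and on the call to `get`) are where the event loop is free to run something else.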
So you've been curious and read a lot about asyncio, and may have come across a few new libraries like aiohttp and all the aio-libs, heard about sans-io, read complaints and how we could take different approaches, and maybe even do better. You vaguely understand the concept of loops and futures, but the term coroutine is still unclear. So you decide to poke around yourself in the REPL.
import aiohttp
print(aiohttp.__version__)
coro_req = aiohttp.get('https://api.github.com')
coro_req
import asyncio
res = asyncio.get_event_loop().run_until_complete(coro_req)
res
res.json()
json = asyncio.get_event_loop().run_until_complete(res.json())
json
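What the session above illustrates is that calling an async function does not run it: you get a coroutine object back, and only an event loop drives it to completion. Here is a network-free sketch of the same thing, where `answer` is a made-up coroutine function and a fresh loop is created explicitly rather than relying on a default one:

```python
import asyncio
import inspect


async def answer():
    # Hypothetical coroutine function standing in for a network call.
    await asyncio.sleep(0)
    return 42


coro = answer()                       # nothing has run yet
is_coro = inspect.iscoroutine(coro)   # we only hold a coroutine object

loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(coro)  # the loop actually executes the body
finally:
    loop.close()

print(is_coro, result)
```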
It's a bit painful to pass everything to run_until_complete. You know how to write an async-def function and pass it to an event loop:
loop = asyncio.get_event_loop()
run = loop.run_until_complete
url = 'https://api.github.com/rate_limit'
async def get_json(url):
    res = await aiohttp.get(url)
    return await res.json()
run(get_json(url))
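The same wrap-it-in-a-function pattern can be tried offline. In this sketch `FakeResponse` and `fake_get` are hypothetical stand-ins for the response object and the request call, so nothing here reflects the real aiohttp API:

```python
import asyncio


class FakeResponse:
    """Hypothetical response object; only mimics an awaitable .json()."""

    def __init__(self, payload):
        self._payload = payload

    async def json(self):
        await asyncio.sleep(0)
        return self._payload


async def fake_get(url):
    # Pretend to fetch `url` and hand back a response object.
    await asyncio.sleep(0)
    return FakeResponse({"url": url})


async def get_json(url):
    res = await fake_get(url)      # first await: the "request"
    return await res.json()        # second await: decoding the body


loop = asyncio.new_event_loop()
run = loop.run_until_complete
data = run(get_json("https://api.github.com/rate_limit"))
loop.close()
print(data)
```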
Good! And then you wonder: why do I have to wrap things in a function? If I have a default loop,
isn't it obvious where I want to run my code? Can't I await things directly? So you try:
await aiohttp.get(url)
What? Oh, that's right, there is no way in Python to set a default loop... but a SyntaxError? Well, that's annoying.
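The error can be reproduced outside any REPL by handing the same kind of statement to the built-in `compile`. This is a small sketch; `some_coroutine` is a placeholder name that is never looked up, because compilation fails first:

```python
source = "await some_coroutine()"  # placeholder name, never evaluated

try:
    compile(source, "<repl>", "exec")
except SyntaxError as err:
    error = err
else:
    error = None

print(type(error).__name__, error)
```

The exact message may differ between Python versions, but the rejection happens before a single line of the code runs.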
Outsmart Python
Fortunately you (in this case, me) are in control of the REPL. You can bend it to your will. Surely you can do something. First you try to remember how a REPL works:
mycode = """
a = 1
print('hey')
"""
def fake_repl(code):
    import ast
    # source text -> AST -> bytecode -> execution in fresh namespaces
    module_ast = ast.parse(code)
    bytecode = compile(module_ast, '<fakefilename>', 'exec')
    global_ns = {}
    local_ns = {}
    exec(bytecode, global_ns, local_ns)
    return local_ns
fake_repl(mycode)
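To make the roles of the two namespaces concrete, here is a variant sketch (`fake_repl_ns` is a hypothetical name) that returns both dictionaries. Only the keys of `global_ns` are inspected, since its values are large:

```python
import ast


def fake_repl_ns(code):
    """Like the fake REPL above, but also hand back the globals dict."""
    module_ast = ast.parse(code)
    bytecode = compile(module_ast, "<fakefilename>", "exec")
    global_ns = {}
    local_ns = {}
    exec(bytecode, global_ns, local_ns)
    return global_ns, local_ns


global_ns, local_ns = fake_repl_ns("a = 1\nb = a + 1\n")
print(sorted(local_ns))    # names assigned by the executed code
print(sorted(global_ns))   # exec() adds a '__builtins__' entry on its own
```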
We don't show global_ns, as it is huge: it contains everything that's available by default in Python. Let's see where it fails if you try a top-level async statement:
import ast
mycode = """
import aiohttp
await aiohttp.get('https://aip.github.com/')
"""
module_ast = ast.parse(mycode)
Ouch, so we can't even compile it. Let's be smart: can we get the inner code if we wrap it in an async-def?
mycode = """
async def fake():
    import aiohttp
    await aiohttp.get('https://aip.github.com/')
"""
module_ast = ast.parse(mycode)
ast.dump(module_ast)
ast.dump(module_ast.body[0])
As a reminder, AST stands for Abstract Syntax Tree. You may construct an AST which is not a valid Python program, like an if-else-else, and an AST can be modified. What we are interested in is the body of the function, which itself is the first object of a dummy module:
body = module_ast.body[0].body
body
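Both claims, that an AST can be edited and that it can describe a program Python refuses to compile, can be checked with a short sketch. The example programs are made up for illustration, and passing `type_ignores=[]` assumes Python 3.8 or later:

```python
import ast

# 1. Edit a tree: turn `x = 1` into `x = 2` before compiling.
tree = ast.parse("x = 1")
tree.body[0].value = ast.Constant(value=2)
ast.fix_missing_locations(tree)   # new nodes need line/column info
ns = {}
exec(compile(tree, "<edited>", "exec"), ns)
print(ns["x"])

# 2. Build a tree that is not a valid program: a bare `return` at module level.
bad = ast.Module(body=[ast.Return(value=None)], type_ignores=[])
ast.fix_missing_locations(bad)
try:
    compile(bad, "<invalid>", "exec")
except SyntaxError as err:
    invalid_error = err
else:
    invalid_error = None
print(invalid_error)
```

The second case is the interesting one here: the tree itself is perfectly constructible, and it is only the compile step that complains.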
Let's pull out the body of the function and put it at the top level of a newly created module:
async_mod = ast.Module(body)
ast.dump(async_mod)
Mouahahahahahahahahah, you managed to get a valid top-level async AST! Victory is yours!
bytecode = compile(async_mod, '<fakefile>', 'exec')
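For reference, the whole extraction and the failing compile can be reproduced in one go. This sketch swaps the aiohttp call for a placeholder coroutine name `fetch` (hypothetical, never executed) and passes `type_ignores=[]` to `ast.Module`, which assumes Python 3.8 or later:

```python
import ast

wrapped = """
async def fake():
    await fetch('https://api.github.com/')
"""

# Pull the body out of the async function and make it the body of a module.
body = ast.parse(wrapped).body[0].body
async_mod = ast.Module(body=body, type_ignores=[])

try:
    compile(async_mod, "<fakefile>", "exec")
except SyntaxError as err:
    compile_error = err   # the compiler still rejects a top-level await
else:
    compile_error = None

print(compile_error)
```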
Grumlgrumlgruml. You haven't said your last word. You're going to take your revenge later. Let's see what we can do in Part II, not written yet.