Python is Weird

Dave Halter (@jedidjah_ch)

I'm an engineer at cloudscale.ch. We provide cloud services (mostly IaaS) with a focus on simplicity. In my free time I work a lot on parsers and type inference for the Python world. Jedi is probably the project people know me for.

Abstract

Tags: tokenizer parser python

A lot of people think that Python is a really simple and straightforward language. Python hides a lot of peculiarities very well, but for the sake of this talk we will try to uncover them.

Is ++4; valid Python? And what does it do? Let me give you an introduction to tokenizers and parsers.

Description

I will explain how the whole pipeline of tokenizing -> parsing -> AST creation -> bytecode generation works, and I will use odd Python code to give you an insight into the internals. Do you think ++4; is valid Python? How about 0jif.1else-2? There are no spaces in it. Go figure! "Edge cases" like these help us understand the inner workings of Python. Both are possible because of how the tokenizer splits the source. More "weird" Python code turns up as soon as you start looking at the grammar file. Have you, for example, heard about lambda generators?
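To make the pipeline concrete, here is a minimal sketch that pushes ++4; through the standard library's tokenize, ast and dis modules. It is an illustration, not material from the talk itself. The reading of "lambda generator" as a lambda whose body contains yield is an assumption, and the bytecode listing depends on the CPython version, so no particular instructions are expected.

```python
import ast
import dis
import io
import tokenize

source = "++4;\n"

# Step 1: tokenizing. Python has no "++" token, so two "+" operators come out.
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Step 2: parsing into an AST. The expression is a unary plus applied twice.
tree = ast.parse(source)
outer = tree.body[0].value  # the expression inside the statement
inner = outer.operand       # the nested unary plus

# Step 3: compiling to bytecode. The exact instructions differ between
# CPython versions, so they are only collected here, not interpreted.
code = compile(tree, "<example>", "exec")
opnames = [instr.opname for instr in dis.get_instructions(code)]

# Evaluating the bare expression (without the ";" statement separator).
value = eval("++4")

# Assumed meaning of "lambda generator": a lambda whose body contains
# "yield" returns a generator object when called.
gen = (lambda: (yield "weird"))()
first = next(gen)

print(tokens[:4])
print(type(outer).__name__, type(inner).__name__)
print(opnames)
print(value, first)
```

The token list shows why ++4 is not an increment: the tokenizer hands the parser two separate "+" operators, and the parser nests them as two unary plus operations around the number.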

We will look at how modules, classes and instances are really just fancy dictionaries, and how importing is really nothing more than storing a module in a dictionary (sys.modules).
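The dictionary view can be sketched in a few lines. The names demo_module and Point are hypothetical and exist only for this illustration. The sketch is also a simplification: a class exposes its namespace as a read-only mappingproxy rather than a plain dict, and instances of classes that define __slots__ have no __dict__ at all.

```python
import sys
import types

# A module keeps its global names in a dictionary.
mod = types.ModuleType("demo_module")  # hypothetical module name
mod.answer = 42
module_answer = mod.__dict__["answer"]


class Point:  # hypothetical example class
    dims = 2  # class attribute, stored in Point.__dict__

    def __init__(self, x):
        self.x = x  # instance attribute, stored in the instance's __dict__


p = Point(3)

# Writing into the instance dictionary creates an attribute.
p.__dict__["y"] = 5

# Importing consults sys.modules first, so a module object placed there
# by hand is what the import statement returns.
sys.modules["demo_module"] = mod
import demo_module as imported

# Clean up so the fake module does not leak into other code.
del sys.modules["demo_module"]

print(module_answer)
print(vars(p), p.y)
print(Point.__dict__["dims"])
print(imported is mod)
```

Because the import statement finds an entry in sys.modules, it never searches the file system for demo_module, which is why a module that exists only in memory can be imported here.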

There is a lot we can learn from diving deep into the details of our beloved languages. This talk will give you a very small introduction to how languages are built and explain how Python itself is defined by its tokenizer, parser and bytecode generation. Knowing how those abstractions work makes you a better Python programmer, because you will better understand how the language behaves.