class: center, middle

# Python 3

Lumír Balhar (@lumirbalhar)

lbalhar@redhat.com

Python SW engineer at Red Hat

???

Python maintenance team

Our main responsibility is to help various upstream projects with issues related to supporting Python 2 and 3.

---
class: center, middle

# Why is there Python in Samba?

???

To lure external developers - Python is easy to read, write and understand.

---
class: center, middle

# History of Python

### Implementation started in 1989

### First published version (0.9.0) in 1991

Released in the same year as the **first Web page**, the **first Linux kernel** (0.01) and **Visual Basic**

???

It is also the year of my birth, which cannot be a coincidence :D

---
class: center, middle

# Python 3.0

### First backward-incompatible release

### Released in 2008

In the same year, **GitHub** was founded, the first mobile phone with **Android** (HTC Dream) was released and Intel announced the **Intel Atom** family of processors.

???

Dropping backward compatibility may have looked unreasonable, but today I want to talk about the reasons for this step and show you why it was a good idea.

---
class: center, middle

# Why backward incompatible?

### "There should be one-- and preferably only one --obvious way to do it"

???

Python 3.0 was developed with the same philosophy as prior versions. However, over the years Python had accumulated new and redundant ways to program the same task. This rule is one of the most important in the Zen of Python, and a lot of changes in Python itself are driven by it.

If you don't know the Zen of Python and want to read the rest, just run `import this` in the Python console.

---

# `print`

???

Let's start with some basics.

--

## Python 2

```python
print "Hello world"
```

???

A very simple usage of print.

--

## Python 3

```python
print("Hello world")
```

???

OK, you may argue: was it really necessary to change print from a statement to a function, even though print is the first thing everybody learns in Python?
This may look unreasonable, but what about a more complex print?

---

# More complex `print`

--

## Python 2

```python
print >> sys.stderr, 'Hello world',
```

???

If you want to print to the standard error output and you don't want a line break at the end of the line, you end up with something like this. It looks horrible and it is really hard to understand what this construct does.

--

## Python 3

```python
print("Hello world", end="", file=sys.stderr)
```

???

But in Python 3 you have just a function with keyword arguments. You may ask why print wasn't a function from the beginning - the reason is that in the early stages of Python development functions didn't support a variable number of arguments.

---

# Handling exceptions

--

#### Single exception

```python
try:
    a = 1/0
except ZeroDivisionError:
    print "You can't divide by zero!"
```

???

If I want to handle a single exception without storing it in a variable, it is quite simple.

--

#### Two exceptions

```python
try:
    a = 1/"test"
except ZeroDivisionError, TypeError:
    print "You can't divide by zero or by string!"
```

???

I remember that a comma separates things in exception handling, so separating the exception types by a comma looks like a good idea. But does it work? No! It catches only ZeroDivisionError and uses TypeError as the name of the variable where the exception will be stored.

--

#### Correct way

```python
try:
    a = 1/"test"
except (ZeroDivisionError, TypeError), e:
    print "You can't divide by zero or by string!"
```

???

This is the correct way to handle more than one exception type in one except statement, and I am pretty sure that the syntax with commas only could be confusing for a new Python user.

---

# Python 3 syntax (backported to 2.6+)

```python
try:
    a = 1/"test"
except (ZeroDivisionError, TypeError) as e:
    print("You can't divide by zero or by string!")
```

???
The situation described on the previous slide is a good reason to make the syntax more straightforward than bare commas, so Python 3 came with the `as` keyword as a separator between the exception types and the name of the variable the exception is stored in.

---

# Exception chains

```python
class ConfigNotFoundError(Exception):
    pass

def open_config_file():
    open('missing.conf')

def c():
    try:
        open_config_file()
    except IOError as e:
        raise ConfigNotFoundError(e)

c()
```

???

Imagine you have this simple module created to load configuration from a file. You want to catch the IOError exception and raise your own exception saying that the configuration file wasn't found.

...

---

# Python 2 traceback

```python
Traceback (most recent call last):
  File "/tmp/pasted.py", line 13, in <module>
    c()
  File "/tmp/pasted.py", line 11, in c
    raise ConfigNotFoundError(e)
```

???

In Python 2 you have only this short traceback, and it can be really hard to find a bug in a big codebase where one exception may occur while another exception is being handled.

---

# Python 3 traceback

```python
Traceback (most recent call last):
  File "/tmp/pasted.py", line 9, in c
    open_config_file()
  File "/tmp/pasted.py", line 5, in open_config_file
    open('missing.conf')
FileNotFoundError: [Errno 2] No such file or directory: 'missing.conf'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/pasted.py", line 13, in <module>
    c()
  File "/tmp/pasted.py", line 11, in c
    raise ConfigNotFoundError(e)
__main__.ConfigNotFoundError: [Errno 2] No such file or directory: 'missing.conf'
```

???

In Python 3 you have a complete traceback with all the exceptions raised. This behaviour is possible because in Python 3 an exception is stored completely in one object, which contains its name, type and complete traceback. That makes it possible to create a chain of multiple exceptions, which makes finding a bug much easier.

---

# Sorting

## using `cmp` (Python 2)

```python
def compare(a, b):
    """Comparison of last names"""
    return cmp(a.split()[-1], b.split()[-1])

names = ['Adam Smith', 'Donald Brown', 'John Silver']
print(sorted(names, cmp=compare))
```

???

The `cmp` keyword argument of the .sort() method and the sorted() function compares two items during sorting. A `cmp` function has a visible disadvantage: it is called every time two items need to be compared during sorting. This may cause a performance problem when the comparison function is complex or the dataset is big.

--

## using `key` (Python 2 and 3)

```python
def get_last_name(full_name):
    return full_name.split()[-1]

names = ['Adam Smith', 'Donald Brown', 'John Silver']
print(sorted(names, key=get_last_name))
```

???

So there is another way to achieve this. The `key` keyword argument takes a function that returns a key for each element. The visible advantage is that the function returning the key has to be executed only once for every item during sorting. And because these two approaches do the same task, the cmp function and the `cmp` keyword argument of the .sort() method and the sorted() function have been removed in Python 3.

---

# Comparison of objects

```python
class Orderable(object):

    def __init__(self, firstname, lastname):
        self.first = firstname
        self.last = lastname

    def __cmp__(self, other):
        return cmp((self.last, self.first),
                   (other.last, other.first))

sorted([Orderable('Donald', 'Duck'), Orderable('Paul', 'Anka')])
```

???
There is one more thing related to comparison and sorting that has been removed from Python 3 - the special `__cmp__` method for implementing comparison for your own objects.

--

Using `cmp`, we can only implement total ordering

---

# Old `cmp`-based comparison

* `__cmp__` for everything

# Rich comparison (since Python 2.1)

* `__lt__` for **`<`**
* `__le__` for **`<=`**
* `__eq__` for **`==`**
* `__ne__` for **`!=`**
* `__gt__` for **`>`**
* `__ge__` for **`>=`**
* `__cmp__` as a fallback

???

Because in special cases you may want to implement only some of the comparison operators, rich comparison has been in Python since version 2.1. And again we had more than one way to implement comparison, so the `__cmp__` method has been removed.

--

# Python 3

* `cmp` was removed

---
class: center

# Data formats

## Images

JPG, PNG, BMP

## Music

MP3, OGG, WAV

## Text?

--

UTF-8, ASCII, Shift-JIS

???

All of you know a lot of data formats for different file types, for example for images or music. A data format is just a way to store data byte by byte on disk or to send the data through a network. But what about text? For a lot of developers text is only a sequence of bytes, but is that true?

---
class: center

# You can pretend that text is only a sequence of ASCII chars and store it as bytes

--

# And then you come to Göttingen

--

## Also, you cannot ignore emoji 😎

### that would make your users 😢

---
class: center, middle

# Plain text is a myth

---
class: center, middle

## Python 3 contains:

### \- `str` type for text

### \- `bytes` type for binary data

???

Because Python upstream developers know that text is much more than only bytes, Python 3 contains two different types - `str` (Unicode) for text and `bytes` for raw data. This may be the most important part of porting a codebase to Python 3, because for every variable you have to decide whether it stores binary data or text. This is also a hard nut to crack when porting Samba, because you need to understand the code very well before you can decide which type is better.

---
class: middle

# Changes in the standard library

--

| Python 2 name          | Python 3 name        |
|------------------------|----------------------|
| `__builtin__`          | `builtins`           |
| `ConfigParser`         | `configparser`       |
| `cStringIO.StringIO()` | `io.StringIO()`      |
| `raw_input()`          | `input()`            |
| `xrange()`             | `range()`            |
| `reduce()`             | `functools.reduce()` |

???

There are also a lot of changed names in the standard library. This table is just an example of a few of them.

---
class: center, middle

# How to handle the changes?

---
class: center

# Porting strategies

???

OK, now you know that you really want to support Python 3, and you probably cannot drop support for Python 2 in one day. So there are some strategies to achieve this goal.

--

## Support only Python 3

--

## Maintain separate codebases (branches)

--

## Convert with 2to3 or 3to2

--

## Write compatible code

---
class: center, middle

# Conservative porting guide

## [portingguide.readthedocs.io](http://portingguide.readthedocs.io/en/latest/)

???

The Conservative porting guide is an online documentation project that contains a lot of interesting information about supporting a codebase on both Python 2 and Python 3, and also a guide on how to use the python-modernize tool to make your code more compatible.
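---

# Compatible code: a sketch

A minimal sketch of the "write compatible code" strategy, not taken from the Samba codebase. It uses only the standard library and sticks to constructs shown earlier that should behave the same on Python 2.6+ and 3. `last_name` and `safe_divide` are hypothetical helpers made up for this slide.

```python
import sys

try:
    import configparser                  # Python 3 name
except ImportError:
    import ConfigParser as configparser  # Python 2 name


def last_name(full_name):
    """Sort key: the last word of the full name."""
    return full_name.split()[-1]


def safe_divide(a, b):
    """Return a / b, or None when the operands cannot be divided."""
    try:
        return a / b
    except (ZeroDivisionError, TypeError) as e:  # `as` works on 2.6+ and 3
        sys.stderr.write("Cannot divide: %s\n" % e)
        return None


names = ['Adam Smith', 'Donald Brown', 'John Silver']
print(sorted(names, key=last_name))  # a single argument is valid on 2 and 3
```

???

The sketch avoids everything that was removed in Python 3: it sorts with `key` instead of `cmp`, catches exceptions with `as` and imports the renamed module under its Python 3 name. Real code would usually also put `from __future__ import print_function` at the very top of the file, so that `print` accepts `end=` and `file=` on Python 2 as well.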
---
class: center, middle

# `six` library

## [pythonhosted.org/six/](https://pythonhosted.org/six/)

### simple utilities for wrapping over differences between Python 2 and 3

#### (2 * 3 == 6)

???

When you want to write compatible code, sooner or later you will need six. Six is a Python library which may help you a lot when writing compatible code. It consists of a single Python file with a lot of handy variables and wrappers.

---
class: center, middle

# Python C extensions

### Exposing Samba's C code to Python

???

We decided to use Python because it makes writing readable code comfortable, so why would anybody want to write Python modules in C?

---
class: center, middle

# How to make a C extension

## Cython, CFFI, Python C API

???

Why is the C API in last position? I'll get to that in a while, but for now you should know that Cython is not only the simplest way to create Python C extensions, it is also the way preferred by Python upstream developers.

---
class: center

# Python C API incompatible changes

--

## Module initialization

--

## Object initialization

--

## Python `long` type was merged with `int`

--

## Comparison-related changes

## etc.

---

# Module initialization

???

Module initialization is a good example of how big the difference between the Python 2 and Python 3 C API is, and how complicated it may be to support both versions in one codebase.

--

## Python 2

```c
m = Py_InitModule3("themodulename", module_functions, "This is a module");
```

???

In Python 2 there is a simple function to initialize a module.

--

## Python 3

```c
static struct PyModuleDef moduledef = {
    PyModuleDef_HEAD_INIT,
    .m_name = "themodulename",
    .m_doc = "This is a module.",
    .m_size = -1,
    .m_methods = module_functions,
};
m = PyModule_Create(&moduledef);
```

???

But in Python 3 you have to pass a special structure to a new function to initialize the module.

---
class: center, middle

# `py3c` project

## “`six` for Python C extensions”

## + porting guide

## [py3c.readthedocs.io](https://py3c.readthedocs.io/en/latest)

???
But don't be scared. Help is coming :D The py3c project is something like the six library for C extensions. It contains a set of macros that rename some incompatible names and create missing ones for the Python C API. It was created by Petr specifically for Samba, but then it looked like it could be useful for other projects too, so py3c is released as a separate project and we use a small subset of it in Samba.

---
class: center, middle

# Thank you!

--

# Questions?

## [portingguide.readthedocs.io](http://portingguide.readthedocs.io/en/latest/)

## [py3c.readthedocs.io](https://py3c.readthedocs.io/en/latest)
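---

# Backup: text vs. bytes

A small sketch for the "plain text is a myth" part, assuming UTF-8 as the encoding (any other encoding would give different bytes). It should run on Python 2.7 and on Python 3.3+, where the `u""` prefix is accepted again.

```python
text = u"G\u00f6ttingen"        # text: 9 Unicode characters
data = text.encode("utf-8")     # binary data: the UTF-8 encoded form

assert len(text) == 9
assert len(data) == 10          # "ö" needs two bytes in UTF-8
assert data.decode("utf-8") == text

try:
    data.decode("ascii")        # pretending the bytes are ASCII text
except UnicodeDecodeError as e:
    print("not ASCII: %s" % e.reason)
```

???

The same name has a different length as text and as bytes, and treating the bytes as ASCII fails. This is the decision Python 3 forces you to make explicitly for every value: is it text (`str`) or binary data (`bytes`)?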