| Author: | Christopher Arndt |
|---|---|
| Date: | 2008-04-13 |
| Location: | RuPy Conference Poznań, Poland |
| Copyright: | CC Attribution/Share-Alike |
This talk is based in large part on the slides of David Goodger's talk Code Like a Pythonista: Idiomatic Python and Jeff Hinrich's adaptation with the same title. I picked the topics I could most relate to and then added a few of my own favorite Python idioms. The presentation is released under the Creative Commons Attribution/Share-Alike License.
The Zen of Python
Try this at your Python prompt:
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly
Programs must be written for people to read, and only incidentally for machines to execute.
— Abelson & Sussman, Structure and Interpretation of Computer Programs
Every Python programmer should know PEP 8:
http://www.python.org/dev/peps/pep-0008/
PEP = Python Enhancement Proposal
The Python community has its own standards for what source code should look like, codified in PEP 8. These standards are different from those of other communities, like C, C++, C#, Java, VisualBasic, etc.
Because indentation and whitespace are so important in Python, the Style Guide for Python Code is as good as a standard.
Most open-source projects and (hopefully) in-house projects follow the style guide quite closely and there are even tools to check whether code adheres to the standard.
- 4 spaces per indentation level.
- No hard tabs.
- Never mix tabs and spaces.
- One blank line between functions.
- Two blank lines between classes.
def make_squares(key, value=0):
    """Return a dictionary and a list..."""
    d = {key: value}
    l = [key, value]
    return d, l
I never use the __private form, and you probably won't either.
Keep lines below 80 characters in length.
Use implied line continuation inside parentheses/brackets/braces:
def __init__(self, first, second, third,
             fourth, fifth, sixth):
    output = (first + second + third
              + fourth + fifth + sixth)
Use backslashes as a last resort:
VeryLong.left_hand_side \
    = even_longer.right_hand_side()
Backslashes are fragile; they must end the line they're on. If you add a space after the backslash, it won't work any more. Also, they're ugly.
Adjacent literal strings are concatenated by the parser:
>>> print 'o' 'n' "e"
one
A string prefixed with an "r" is a "raw" string. Backslashes are not evaluated as escapes in raw strings, which makes them useful for regular expressions and Windows filesystem paths.
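To make the raw-string remark concrete, here is a small sketch. It is written in Python 3 syntax (the slides use Python 2), and the Windows path is a made-up example, not one from the talk.

```python
# Raw strings: a backslash stays a backslash instead of starting an escape.
escaped = 'a\tb'    # \t becomes one tab character: 3 characters in total
raw = r'a\tb'       # backslash and 't' are both kept: 4 characters in total

print(len(escaped), len(raw))

# Adjacent literals are joined by the parser, raw or not.
pattern = r'\d+' r'\.\d+'

# Hypothetical Windows path; without the r prefix, \t and \n would be escapes.
windows_path = r'C:\temp\new_file.txt'
```

Without the "r" prefix, the path above would silently contain a tab and a newline.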
Note that named string objects are not concatenated:
>>> a = 'three'
>>> b = 'four'
>>> a b
  File "<stdin>", line 1
    a b
      ^
SyntaxError: invalid syntax
That's because this automatic concatenation is a feature of the Python parser/compiler, not the interpreter. You must use the "+" operator to concatenate strings at run time.
text = ('Long strings can be made up '
        'of several shorter strings.')
The parentheses allow implicit line continuation.
Multiline strings use triple quotes:
"""\
Triple
double
quotes"""
Good:
if foo == 'blah':
    do_something()
do_one()
do_two()
do_three()
Bad:
if foo == 'blah': do_something()
do_one(); do_two(); do_three()
Docstrings = How to use code
Comments = Why (rationale) & how code works
Simple is Better Than Complex
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
— Brian W. Kernighan
In other words, KISS!
General Python Idioms
In other languages:
temp = a
a = b
b = temp
In Python:
b, a = a, b
We saw that the comma is the tuple constructor, not the parentheses. Example:
>>> 1,
(1,)
The Python interpreter shows the parentheses for clarity, and I recommend you use parentheses too:
>>> (1,)
(1,)
Don't forget the comma!
>>> (1)
1
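The swap idiom is one case of tuple packing and unpacking: the right-hand side is packed into a tuple first and then unpacked into the names on the left. A minimal sketch in Python 3 syntax, with hypothetical sample data:

```python
a, b = 1, 2
a, b = b, a          # the tuple (b, a) is built first, then unpacked

# Hypothetical record, unpacked into three names at once
person = ('David', 'Pythonista', '+1-514-555-1234')
name, title, phone = person

# Unpacking also works in the target of a for loop
pairs = [(1, 'one'), (2, 'two')]
labels = []
for number, word in pairs:
    labels.append('%d=%s' % (number, word))
```

The number of names on the left must match the number of items on the right, otherwise Python raises a ValueError.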
This is a really useful feature that surprisingly few people know.
In the interactive interpreter, whenever you evaluate an expression or call a function, the result is bound to a temporary name, _ (an underscore):
>>> 1 + 1
2
>>> _
2
_ stores the last printed expression.
When a result is None, nothing is printed, so _ doesn't change. That's convenient!
Note
This only works in the interactive interpreter, not within a module.
Start with a list of strings and then join() it:
colors = ['red', 'blue', 'green', 'yellow']
print ", ".join(colors)
We want to join all the strings together into one large string, and join() does this efficiently, especially when the number of substrings is large.
Don't do this:
result = ''
for s in colors:
    result += s
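As a sketch of the join() idiom in Python 3 syntax: the separator string is the object whose method is called, and the items must already be strings. The numbers list is a hypothetical addition for illustration.

```python
colors = ['red', 'blue', 'green', 'yellow']

# join() builds the result in one go instead of creating a new string per step
result = ''.join(colors)
listing = ', '.join(colors)

# Non-string items have to be converted first; a generator expression
# does the conversion without building an intermediate list.
numbers = [1, 2, 3]          # hypothetical data
numbers_text = ', '.join(str(n) for n in numbers)
```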
Good:
if foo is None:
    do_something()
Maybe:
if not foo:
    do_something()
Bad:
if foo == None:
    do_something()
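The "Maybe" case deserves a closer look, because `not foo` is also true for 0, empty strings and empty containers, not only for None. A small sketch (Python 3 syntax; describe() is a hypothetical helper) of how the two tests differ:

```python
def describe(foo):
    """Hypothetical helper that separates 'no value' from 'empty value'."""
    if foo is None:
        return 'missing'
    if not foo:
        return 'empty or zero'
    return 'present'

results = [describe(value) for value in (None, 0, '', [], [0], 'x')]
```

Use `is None` when None is the only value that should count as "nothing was passed".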
Good:
for i, item in enumerate(mylist):
    if i >= 1:
        print mylist[i-1] + item
Bad:
for i in range(len(mylist)):
    if i >= 1:
        print mylist[i-1] + mylist[i]
Also bad:
i = 0
for item in mylist:
    if i >= 1:
        print mylist[i-1] + item
    i += 1
Good:
for key in d:
    print key
Dicts have a setdefault method that is very useful to initialise dicts:
navs = {}
for (portfolio, equity, position) in data:
    navs.setdefault(portfolio, 0)
    navs[portfolio] += position * prices[equity]
The setdefault dictionary method returns the value stored under the key, but we ignore it here. We're taking advantage of setdefault's side effect: it sets the dictionary value only if there is no value already.
Tip
Python 2.5 has the defaultdict class. Look it up in the standard library reference!
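For comparison, here is how the same accumulation might look with collections.defaultdict. This is a sketch in Python 3 syntax; the sample data and prices are invented for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (portfolio, equity, position) and a price per equity
data = [('growth', 'ACME', 10), ('growth', 'INIT', 5), ('income', 'ACME', 2)]
prices = {'ACME': 3.0, 'INIT': 10.0}

navs = defaultdict(int)      # a missing key starts out as int() == 0
for portfolio, equity, position in data:
    navs[portfolio] += position * prices[equity]
```

The factory passed to defaultdict (here int) is called without arguments whenever a missing key is looked up, so the explicit setdefault call is no longer needed.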
In many other languages, assigning to a variable puts a value into a box.
int a = 1;
Box "a" now contains an integer 1.
Assigning another value to the same variable replaces the contents of the box:
a = 2;
Now box "a" contains an integer 2.
Assigning one variable to another makes a copy of the value and puts it in the new box:
int b = a;
"b" is a second box, with a copy of integer 2. Box "a" has a separate copy.
In Python, a "name" or "identifier" is like a parcel tag (or nametag) attached to an object.
a = 1
Here, an integer 1 object has a tag labelled "a".
If we reassign to "a", we just move the tag to another object:
a = 2
If we assign one name to another, we're just attaching another nametag to an existing object:
b = a
The name "b" is just a second tag bound to the same object as "a".
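With immutable integers the difference between boxes and tags is hard to observe. It shows up as soon as the object is mutable. A minimal sketch in Python 3 syntax:

```python
a = [1, 2, 3]
b = a                 # a second tag on the same list object
b.append(4)           # the change is visible through both names

c = list(a)           # an explicit copy: a new object with equal contents
c.append(5)           # does not affect a

same_object = a is b
copy_is_distinct = c is not a
```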
This is a common mistake that beginners often make. Even more advanced programmers make this mistake if they don't understand Python names.
def bad_append(new_item, a_list=[]):
    a_list.append(new_item)
    return a_list
The problem here is that the default value of a_list, an empty list, is evaluated at function definition time. So every time you call the function, you get the same default value. Try it several times:
>>> print bad_append('one')
['one']
>>> print bad_append('two')
['one', 'two']
Lists are mutable objects; you can change their contents. The correct way to get a default list (or dictionary, or set) is to create it at run time instead, inside the function:
def good_append(new_item, a_list=None):
    if a_list is None:
        a_list = []
    a_list.append(new_item)
    return a_list
List comprehensions ("listcomps" for short) are syntax shortcuts for this general pattern:
The traditional way, with for and if statements:
new_list = []
for item in a_list:
    if condition(item):
        new_list.append(fn(item))
As a list comprehension:
new_list = [fn(item) for item in a_list
            if condition(item)]
Let's sum the squares of the numbers up to 100. As a loop:
total = 0
for num in range(1, 101):
    total += num * num
We can use the sum function to quickly do the work for us, by building the appropriate sequence. As a list comprehension:
total = sum([num * num for num in range(1, 101)])
As a generator expression:
total = sum(num * num for num in xrange(1, 101))
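The practical difference between the two forms is when the values are computed. The sketch below uses Python 3 syntax, where range() is already lazy and xrange() no longer exists:

```python
squares_list = [num * num for num in range(1, 101)]   # all 100 values exist now
squares_gen = (num * num for num in range(1, 101))    # nothing computed yet

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)      # values are produced one at a time

# A generator can be consumed only once; afterwards it is exhausted.
leftover = list(squares_gen)
```

The list can be iterated again and indexed. The generator cannot, but it never holds more than one value in memory.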
Rule of thumb:
Here's a useful generator à la find(1):
import fnmatch
import os

def walkfiles(startdir, pattern=None):
    """Return generator for full paths of all files below startdir.

    Optionally filters out files not matching pattern.
    """
    for dir, dirlist, filelist in os.walk(startdir):
        for fname in filelist:
            if pattern and not fnmatch.fnmatch(fname, pattern):
                continue
            yield os.path.join(dir, fname)
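A possible way to try the generator out: the sketch below restates walkfiles in Python 3 syntax so that it runs on its own, and walks a small throwaway directory tree. The file names are hypothetical.

```python
import fnmatch
import os
import shutil
import tempfile

def walkfiles(startdir, pattern=None):
    """Yield full paths of all files below startdir, optionally filtered."""
    for dirpath, dirnames, filenames in os.walk(startdir):
        for fname in filenames:
            if pattern and not fnmatch.fnmatch(fname, pattern):
                continue
            yield os.path.join(dirpath, fname)

# Build a small temporary tree (hypothetical file names) to walk over
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'pkg'))
for relpath in ('a.py', 'notes.txt', os.path.join('pkg', 'b.py')):
    with open(os.path.join(root, relpath), 'w') as handle:
        handle.write('')

# Nothing is scanned until the generator is iterated, here by sorted()
found = sorted(os.path.relpath(path, root) for path in walkfiles(root, '*.py'))

shutil.rmtree(root)   # clean up the temporary tree
```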
DSU = Decorate-Sort-Undecorate
Instead of creating a custom comparison function, we create an auxiliary list that will sort naturally:
alist = [(4, 5), (3, 2), (2, 1), (6, 7)]
# Decorate:
to_sort = [(item[1], item) for item in alist]
# Sort:
to_sort.sort()
# Undecorate:
alist = [item[-1] for item in to_sort]
In Python 2.4 and above, you can use the key parameter to sort to do this in one step.
from operator import itemgetter
alist.sort(key=itemgetter(1))
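The key parameter is also accepted by the sorted() builtin, which returns a new list and leaves the original alone. A short sketch in Python 3 syntax; the words list is a hypothetical example:

```python
from operator import itemgetter

alist = [(4, 5), (3, 2), (2, 1), (6, 7)]

by_second = sorted(alist, key=itemgetter(1))     # new list, alist is unchanged
by_second_desc = sorted(alist, key=itemgetter(1), reverse=True)

# Any one-argument callable works as a key, e.g. for case-insensitive sorting
words = ['banana', 'Cherry', 'apple']            # hypothetical data
case_insensitive = sorted(words, key=str.lower)
```

The key function is called once per item, which is the decorate step of DSU done for you.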
It's easier to ask forgiveness than permission
Look before you leap
Good:
try:
    return str(x)
except TypeError:
    ...
Bad:
if isinstance(x, basestring):
    do_something(x)
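To illustrate why asking forgiveness tends to be more robust, here is a sketch with two hypothetical helpers that convert text to an integer (Python 3 syntax). The look-before-you-leap check has to anticipate every valid input and misses some:

```python
def to_int_eafp(text, default=0):
    """EAFP: just try the conversion and handle the failure."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return default

def to_int_lbyl(text, default=0):
    """LBYL: check first. This check rejects valid input like '-7' or ' 42 '."""
    if isinstance(text, str) and text.isdigit():
        return int(text)
    return default
```

Both agree on '42' and on None, but only the EAFP version accepts everything int() itself accepts.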
Program structure
- (Shebang)
- Source encoding declaration
- Module docstring
- Imports (stdlib, third-party, private modules)
- Global constants and initialization code
- Exceptions
- Module-level functions
- Classes
- main function
#!/usr/bin/env python
# examples/script-template.py

def main(args):
    if not args:
        print "Usage: foo ARG1 [ARG2...]"
        return 2
    return 0

if __name__ == '__main__':
    import sys
    status = main(sys.argv[1:])
    sys.exit(status)
    # or combined
    # sys.exit(main(sys.argv[1:]))
OO-Programming
Bad:
class Foo:
    def __init__(self, spamm, eggs):
        self.spamm = spamm
        self.eggs = eggs

    def get_spamm(self):
        return self.spamm

    def set_spamm(self, value):
        self.spamm = value

    # et cetera

f = Foo('bar', 'baz')
myspamm = f.get_spamm()
Good:
class Foo:
    def __init__(self, spamm, eggs):
        self.spamm = spamm
        self.eggs = eggs

f = Foo('bar', 'baz')
myspamm = f.spamm
But what if you need to make your attribute dynamic later?
Bad:
class Foo:
    # ...
    def get_spamm(self):
        return make_spamm()

f = Foo()
myspamm = f.get_spamm()
Solution: use the property builtin:
Good:
class Foo:
    # ...
    def _spamm(self):
        """Return fresh portion of spamm."""
        return make_spamm()
    spamm = property(_spamm)

f = Foo()
myspamm = f.spamm
Or, using property as a decorator:
Very good:
class Foo:
    # ...
    @property
    def spamm(self):
        return make_spamm()

f = Foo()
myspamm = f.spamm
Warning
Both forms will turn spamm into a read-only attribute!
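The read-only behaviour can be checked directly. In this sketch (Python 3 syntax) make_spamm() is a hypothetical stand-in for the helper the slides leave undefined:

```python
def make_spamm():
    """Hypothetical stand-in for the make_spamm() helper used on the slides."""
    return 'fresh spamm'

class Foo(object):
    @property
    def spamm(self):
        return make_spamm()

f = Foo()
try:
    f.spamm = 'eggs'          # no setter was defined, so this fails
except AttributeError:
    assignment_failed = True
else:
    assignment_failed = False
```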
The same works for setting attributes:
Bad:
class Foo:
    def set_spamm(self, value):
        if is_valid(value):
            self.spamm = value
        else:
            raise ValueError('I want my spamm!')

f = Foo()
f.set_spamm("Eggs")
Good (again using property):
# examples/property_01.py

class Foo(object):
    def _get_spamm(self):
        return make_spamm()

    def _set_spamm(self, value):
        if is_valid(value):
            self.__dict__['spamm'] = value
        else:
            raise ValueError('I want my spamm!')

    spamm = property(_get_spamm, _set_spamm, None,
                     "Tasty spamm")
Static methods don't receive the instance as the first argument. They can be thought of as functions living in the namespace of the class, similar to static methods in Java or C++. They are rarely needed in Python (a normal function usually does the job), but they suit helper functions that don't need access to self and are of no use outside the class.
class Foo:
    # ...
    @staticmethod
    def _format_name(name):
        # str.replace() needs both arguments; replacing '_' with a space is an assumption
        return name.strip().replace('_', ' ').capitalize()
Class methods receive the class object as the first argument, not the instance. It is therefore good practice to name the first parameter cls (class is a keyword!) instead of self. A good use for class methods is factory functions, i.e. alternative, convenient ways to create pre-configured class instances.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | # examples/classmethod_01.py
class Template:
def __init__(self, template, **data):
self.template = template
self.data = data
@classmethod
def from_file(cls, filename, **data):
"""Return Template with template string read
from filename."""
return cls(open(filename).read(), data)
def render(self, **data):
subst = self.data.copy()
subst.update(data)
return self.template % data
|
Note
Classmethods can be called on the class:
Template.from_file(...)
or the instance with the same effect:
t.from_file(...)
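Because the method receives cls rather than a fixed class, the factory also works unchanged for subclasses. The sketch below (Python 3 syntax) uses a hypothetical from_lines factory instead of from_file so that it needs no file on disk. HtmlTemplate is likewise made up for illustration.

```python
class Template(object):
    def __init__(self, template, **data):
        self.template = template
        self.data = data

    @classmethod
    def from_lines(cls, lines, **data):
        """Hypothetical factory: build the template string from a list of lines."""
        return cls('\n'.join(lines), **data)

    def render(self, **data):
        subst = self.data.copy()
        subst.update(data)
        return self.template % subst

class HtmlTemplate(Template):
    """Hypothetical subclass; it inherits the factory unchanged."""

t = Template.from_lines(['Hello %(name)s'], name='world')
t2 = t.from_lines(['Bye %(name)s'], name='world')    # called on an instance
h = HtmlTemplate.from_lines(['<p>%(name)s</p>'])     # cls is HtmlTemplate here
```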
Summary
- Follow PEP 8
- Read the standard library reference
- Know your lists, dicts and iterators
- Import this
And now for something completely different...
Number 1: The Larch