Collections
To see if a collection (list, set, tuple, dictionary) is empty, just test its truth value:
an empty collection is logically false. So rather than
if len(myList) == 0:
    doSomething()
instead do:
if not myList:
    doSomething()
To copy a list you can use the list() constructor or the [:] slice notation:
duplicateList = list(firstList)
duplicateList = firstList[:]
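Both forms make a shallow copy: the new list is a separate object, but it refers to the same elements. A minimal sketch of what that means for nested lists, using copy.deepcopy from the standard library as one option when the inner lists must be independent too (the data here are made up):

```python
import copy

firstList = [[1, 2], [3, 4]]

shallowCopy = list(firstList)        # New outer list, same inner lists
deepCopy = copy.deepcopy(firstList)  # New outer list and new inner lists

firstList.append([5, 6])   # Affects only the original outer list
firstList[0].append(99)    # Affects the inner list shared with shallowCopy

print(shallowCopy)  # [[1, 2, 99], [3, 4]]
print(deepCopy)     # [[1, 2], [3, 4]]
```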
Slice notation can also be used to get a reversed copy of a list (remembering that the
last part of the slice notation is the step):
revList = firstList[::-1]
This is more compact than copying and then using reverse():
revList = list(firstList)
revList.reverse()
or using the reversed() iterator, which is handy when going through loops in reverse order
(for example, for x in reversed(aList):), but needs an explicit conversion to make a
duplicate list:
revList = list(reversed(firstList))
For dictionaries don’t forget the .get() and .setdefault() methods. So:
if x in myDict:
    y = myDict[x]
else:
    y = defaultValue
becomes:
y = myDict.get(x, defaultValue)
or if the default value is None simply:
y = myDict.get(x)
If the default value should be actually put in the dictionary then you can do:
y = myDict.setdefault(x, defaultValue)
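One common use of .setdefault() is collecting values into lists stored under a key. A small sketch with made-up data:

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

byLetter = {}
for word in words:
    # Fetch the list for this initial letter, creating it if absent
    byLetter.setdefault(word[0], []).append(word)

print(byLetter['a'])  # ['apple', 'avocado']
print(byLetter['c'])  # ['cherry']
```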
In Python 2 if you want to simply enquire whether something is present in a dictionary
it is simpler, and slightly faster, to use in rather than call has_key().
if myDict.has_key(key):
    doSomething()
becomes:
if key in myDict:
    doSomething()
In Python 3 dictionaries no longer have the has_key() method.
It may sometimes be helpful to construct a dictionary from a list. Rather than going
through a loop, a list of 2-tuples with (key,value) pairs can be used:
listData = [(1,'Apples'), (2, 'Bananas'), (3, 'Cherries')]
dictData = dict(listData)
print(dictData[2])
In Python 2 to do the reverse you can use .items() to get a list of all pairs, or .iteritems()
to get an iterator object, which can be looped through like a list but which yields one item
at a time, and so saves memory by not making the complete list:
for k, v in dictData.items(): # Makes a list
    print('Key: %d, Value: %s' % (k,v))
for k, v in dictData.iteritems(): # Uses an efficient iterator
    print('Key: %d, Value: %s' % (k,v))
In Python 3 there is no .iteritems() method and .items() returns an iterable view on the
items in the dictionary, rather than a list.
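A brief sketch of what this means in practice in Python 3: the view reflects later changes to the dictionary, while an explicit list() call gives an independent snapshot when a real list is needed.

```python
dictData = {1: 'Apples', 2: 'Bananas'}

pairs = dictData.items()           # A view, not a list
snapshot = list(dictData.items())  # An independent list of pairs

dictData[3] = 'Cherries'

print(len(pairs))     # 3, the view tracks the dictionary
print(len(snapshot))  # 2, the list does not
```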
The built-in zip function can be used to combine corresponding elements from multiple lists,
which is handy for dictionaries when you initially have separate lists for the keys and
values:
keys = [1, 2, 3]
values = ['Apples', 'Bananas', 'Cherries']
listData = zip(keys, values) # [(1,'Apples'), (2,'Bananas'), (3,'Cherries')]
dictData = dict(listData)
The next tip was mentioned before, but we repeat it in the compendium, and it can
reverse the above operation (although for dictionaries, .keys() and .values() also do the
job). If you already have data in a list of lists (or tuples) then zip can neatly extract the
elements which share the same index.
listData = [(1,'Apples'), (2,'Bananas'), (3,'Cherries')]
numbers, fruits = zip(*listData)
The way to imagine this one is that the call is actually zip((1,'Apples'), (2,'Bananas'),
(3,'Cherries')), with the * unpacking the items in the list as separate arguments. The zip
then combines the first elements and the second elements together, exactly as above. This
is neater than using the equivalent list comprehension:
listData = [(1,'Apples'), (2,'Bananas'), (3,'Cherries')]
numbers = [x[0] for x in listData]
fruits = [x[1] for x in listData]
zip can also come in handy as a compact notation for looping through two lists,
although in Python 2 it does make a new list, so is not so space efficient (in Python 3 it
returns an iterator instead). Accordingly, something like:
for i, aValue in enumerate(aList):
    bValue = bList[i]
    print(aValue, bValue)
could become:
for aValue, bValue in zip(aList, bList):
    print(aValue, bValue)
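One point to keep in mind is that zip stops at the end of the shortest input, whereas the index-based loop would raise an error if bList were shorter. A short sketch, using itertools.zip_longest (the Python 3 name; it is izip_longest in Python 2) for cases where the leftover items matter:

```python
from itertools import zip_longest

aList = [1, 2, 3]
bList = ['x', 'y']

shortPairs = list(zip(aList, bList))
print(shortPairs)  # [(1, 'x'), (2, 'y')]

# Missing values are padded with the fill value
longPairs = list(zip_longest(aList, bList, fillvalue=None))
print(longPairs)   # [(1, 'x'), (2, 'y'), (3, None)]
```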
In Python the set data type is sometimes overlooked, especially by those who started
with early versions of Python. Nonetheless, it is exceedingly useful and can avoid the need
to do looping with lists, as long as order is not important (or can be reconstructed). There
is a caveat to such set operations, however: the elements must be hashable, which in practice
means they cannot be internally modifiable, a requirement to keep things unique. In essence, sets
can contain most objects, including numbers, strings, tuples and frozen sets, but cannot contain
other sets, lists or dictionaries.
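A quick sketch of the hashability rule in action; the values are arbitrary and the failed flag is only there to show what happens:

```python
okSet = set()
okSet.add(42)
okSet.add('text')
okSet.add((1, 2))
okSet.add(frozenset([1, 2]))

try:
    okSet.add([1, 2])  # A list is mutable, so it is not hashable
    failed = False
except TypeError:
    failed = True

print(len(okSet), failed)  # 4 True
```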
Looking up elements in a set is fast, so where you have lots of look-ups to do, instead
of:
for x in firstList:
    if x in veryLongList:
        doSomething()
you can make things quicker with:
bigSet = set(veryLongList)
for x in firstList:
    if x in bigSet:
        doSomething()
Note this assumes that the speed gained using bigSet for look-up makes up for the time
spent creating the set in the first place.
Sets provide a neat way of removing duplicates from a list, as long as you don't need to
preserve order. You just convert to a set and back to a list again:
myList = ['apple', 'banana', 'lemon', 'apple', 'lemon', 'lemon']
uniqueList = list(set(myList)) # e.g. ['lemon', 'apple', 'banana'], in arbitrary order
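If the original order does matter, one alternative, assuming Python 3.7 or later where dictionaries keep insertion order, is dict.fromkeys():

```python
myList = ['apple', 'banana', 'lemon', 'apple', 'lemon', 'lemon']

# Keys are unique and, from Python 3.7, kept in insertion order
orderedUnique = list(dict.fromkeys(myList))

print(orderedUnique)  # ['apple', 'banana', 'lemon']
```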
Getting the common elements of several lists using set operations is neat and efficient,
although it may be prudent to simply work with sets in the first place:
a = ['G','S','T','P','A']
b = ['A','V','I','L','P']
intersection = set(a) & set(b)
commonList = list(intersection) # ['A', 'P'] in some order
Likewise to find elements that are present in either list:
a = ['G','S','T','P','A']
b = ['A','V','I','L','P']
union = set(a) | set(b)
combinedList = list(union) # 'A', 'G', 'I', 'L', 'P', 'S', 'T', 'V' in some order
When constructing lists it can be quicker and more compact to use list comprehensions
than loops. For example:
squares = []
for x in range(1001): # in Python 2 use xrange(1001)
    squares.append( x * x )
is slower than:
squares = [x*x for x in range(1001)]
Also, if we don't need the whole list, but just need to iterate through it, we can use
round parentheses to make a generator object (which has no length as such and does not
have indices).
squares = (x*x for x in range(1001)) # Using () not []
for y in squares:
    doSomething()
squares[3] # Fail: This will not work on () generators.
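Another consequence worth keeping in mind is that a generator can only be run through once. A small sketch:

```python
squares = (x*x for x in range(5))

first = list(squares)   # Consumes the generator
second = list(squares)  # Nothing left to yield

print(first)   # [0, 1, 4, 9, 16]
print(second)  # []
```

If the values are needed more than once, a list comprehension (or list(squares) stored in a variable) is the safer choice.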
It is sometimes overlooked that list comprehensions can contain several for clauses (and
conditions), although it is easy to take this sort of thing too far:
[(x,y) for x in range(3) for y in range(3)]
# Gives [(0, 0), (0, 1), (0, 2),
# (1, 0), (1, 1), (1, 2),
# (2, 0), (2, 1), (2, 2)]
[(x,y) for x in range(3) for y in range(x,3) if x+y >1]
# Gives [(0,2), (1,1), (1,2), (2,2)]
Sometimes you may wish to construct a list of blank lists, to put items into later. For
this it is tempting to do:
data = [[]] * 3
print(data) # Gives [[], [], []]
but here the same list object was repeated three times internally:
data[1].append(True)
print(data) # Gives [[True], [True], [True]]
so try a list comprehension instead:
data = [[] for x in range(3)]
data[1].append(True)
print(data) # Gives [[], [True], []]
Although perhaps not such common operations, the built-in any and all functions can be used
to find whether any or all elements in a list satisfy a certain condition. Accordingly:
for x in myList:
    if x < 2:
        doSomething()
        break
becomes:
if any(x < 2 for x in myList):
    doSomething()
Likewise:
if len(myList) == len([x for x in myList if x < 2]):
    doSomething()
is the same as:
if all(x < 2 for x in myList):
    doSomething()
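Both functions stop as soon as the answer is known, which is part of why a generator expression suits them. A sketch, where isSmall is a hypothetical helper that records which items were actually tested, and which also shows the results for an empty list:

```python
checked = []

def isSmall(x):
    checked.append(x)  # Record which items were actually tested
    return x < 2

myList = [5, 1, 7, 0]

found = any(isSmall(x) for x in myList)
print(found)    # True
print(checked)  # [5, 1], the loop stopped at the first match

print(all(x < 2 for x in []))  # True, no element fails the test
print(any(x < 2 for x in []))  # False, no element passes it
```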
For obtaining a sorted list, the inbuilt sorted function is useful when you don't want to
modify the original list. So instead of:
b = list(a)
b.sort()
you can do:
b = sorted(a)
If you want to sort a list on something other than the items’ innate value you can
construct a list of 2-tuples which will be sorted on the first item (which contains the values
to sort on). Here we sort according to the length of the strings:
aList = ['homer', 'bart', 'maggie', 'lisa', 'marge']
bList = [(len(x), x) for x in aList]
bList.sort()
aList = [x for (lenX, x) in bList]
# Gives ['bart', 'lisa', 'homer', 'marge', 'maggie']
However, the key option of sort() is much more nifty and allows you to pass in the
function that is used to generate the sort key:
aList = ['homer', 'bart', 'maggie', 'lisa', 'marge']
aList.sort(key=len)
Sometimes when dealing with objects we would like to sort on the value of a particular
attribute. You can readily write a function to fetch that attribute (for any object in the list,
as required by the sort operation), and thus generate a key for the sort. So, for example:
def getSortAttr(obj):
    return obj.something
objList = [objA, objB, objC]
objList.sort(key=getSortAttr)
However, you can also use the key option in combination with the operator module.
The function operator.attrgetter() uses the name of an attribute to create a separate
on-the-fly function which sends back the value of an attribute, which in this case is the
value to sort with. So an alternative to the above is:
from operator import attrgetter
objList = [objA, objB, objC]
objList.sort(key=attrgetter('something')) # Name of attribute as a string
The functions operator.itemgetter (for selecting items in a collection) and
operator.methodcaller (for invoking methods) can also be used in a similar manner.
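A short sketch of both, with made-up data: itemgetter(0) sorts tuples on their first item, and methodcaller('lower') sorts strings on the result of calling their .lower() method.

```python
from operator import itemgetter, methodcaller

pairs = [(3, 'Cherries'), (1, 'apples'), (2, 'Bananas')]
byNumber = sorted(pairs, key=itemgetter(0))
print(byNumber)  # [(1, 'apples'), (2, 'Bananas'), (3, 'Cherries')]

names = ['Cherries', 'apples', 'Bananas']
byLower = sorted(names, key=methodcaller('lower'))
print(byLower)   # ['apples', 'Bananas', 'Cherries']
```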
Loops
We’ve been using enumerate() throughout the book, but it is still something novices
occasionally overlook. So instead of:
myList = ['e', 'f', 'g']
for i in range(len(myList)):
    print(i, myList[i])
do:
myList = ['e', 'f', 'g']
for i, val in enumerate(myList):
    print(i, val)
And from Python 2.6 you can use a second argument to specify the start point for the
index:
for i, val in enumerate(myList, 5):
    print(i, val)
# Gives:
# 5 e
# 6 f
# 7 g
In Python 2 when looping through sequential numbers, such as indices, consider using
xrange() rather than range(). This saves space because it only yields numbers on demand.
Helpfully an xrange still has a length and can be indexed. In Python 3 xrange is effectively
renamed range, replacing the old function that built a complete list.
for x in xrange(100, 1000000): # Doesn't make all the numbers (Python 2)
    doSomething()
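In Python 3 the same properties hold for range itself. A quick sketch:

```python
r = range(100, 1000000)  # No list of numbers is built

print(len(r))       # 999900
print(r[5])         # 105
print(500000 in r)  # True
```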
To make an indefinite loop, use a while loop that tests something that is logically true,
although don't forget to break out of the loop eventually:
while True:
    test = doSomething()
    if test:
        break
Because loops are constructs that allow you to repeat operations many times, when
thinking about speed a general principle is to put as few operations into the loop as
possible. For example, when doing function calls in a loop to construct a list using
.append(), a speed improvement can be made if the dot notation look-up is done only once
outside the loop. Thus:
aList = []
for x in someBigList:
    if testFunc(x):
        aList.append(x)
becomes the faster:
aList = []
addToList = aList.append
for x in someBigList:
    if testFunc(x):
        addToList(x)
Related to the above, if you know how long a list will be it is faster to pre-construct it in
a quick manner and fill it in using indices, rather than appending repeatedly.
aList = [0] * n
bList = [0] * n
for i in range(n):
    aList[i] = someCall(i)
    bList[i] = anotherCall(i)
If you need two loops and have to break out of both of them, cunning use of else,
continue and break can do the job without having to set any flags:
for a in oneList:
    for b in anotherList:
        if discoverSomething(a,b):
            # Quit inner loop and subsequently the outer too
            break
    else:
        # Without a break we get here at the end of the inner loop
        # Continuing the outer loop the next break is skipped
        continue
    # Only get here due to the first break
    break
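An alternative that some find easier to read is to move the nested loops into a function and use return to leave both at once. A sketch, where findPair and the test passed to it are hypothetical names used only for illustration:

```python
def findPair(oneList, anotherList, test):
    """Return the first (a, b) for which test(a, b) is true, else None."""
    for a in oneList:
        for b in anotherList:
            if test(a, b):
                return a, b  # Leaves both loops immediately
    return None

result = findPair([1, 2, 3], [10, 20, 30], lambda a, b: a * b == 40)
print(result)  # (2, 20)
```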
If you have a loop that may cause an error (throw an exception) then it may be tempting
to do a precautionary check to stop errors before they occur. However, it is generally
quicker to let the exception happen and then catch it in a safe way. This is because with
try: there is no repeated checking and extra time is taken only if an error is encountered.
So, for example:
for x in bigList:
    if rareEvent(x):
        rareEventOccurred(x)
    else:
        commonTask(x)
can be modified into:
for x in bigList:
    try:
        commonTask(x)
    except SpecialException:
        rareEventOccurred(x)
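A concrete sketch of the same idea, counting items with a dictionary where a missing key is the rare event. The data are made up, and whether this is actually faster depends on how rare the exception is, since raising one is comparatively costly.

```python
bigList = ['A', 'C', 'A', 'G', 'A', 'C']

counts = {}
for x in bigList:
    try:
        counts[x] += 1  # Common case: key already present
    except KeyError:
        counts[x] = 1   # Rare case: first time the key is seen

print(counts)  # {'A': 3, 'C': 2, 'G': 1}
```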