Python Programming for Biology: Bioinformatics and Beyond



[Tim J. Stevens, Wayne Boucher] Python Programming

Collections

To see if a collection (list, set, tuple, dictionary) is empty, just test its truth value: an empty collection is logically false. So rather than:

if len(myList) == 0:
    doSomething()

instead do:

if not myList:
    doSomething()
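As a quick check of this idiom, the following sketch (the variable names are illustrative only) shows that each kind of empty built-in collection is logically false, while a collection with items is logically true:

```python
# Empty built-in collections are logically false
emptyCollections = [[], (), set(), {}]
for coll in emptyCollections:
    assert not coll              # Same outcome as len(coll) == 0

myList = ['G', 'C']
if myList:                       # True because the list has items
    print('List has', len(myList), 'items')

if not myList:
    print('List is empty')       # Not reached for this list
```

Note that this applies to the built-in collections; other container types may define truth testing differently.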

To copy a list you can use the built-in list() function or the [:] slice notation (both make a shallow copy):

duplicateList = list(firstList)

duplicateList = firstList[:]

A slice notation can also be used to get a reversed copy of a list (remembering that the

last element of the slice notation is the step):

revList = firstList[::-1]

This is more compact than copying and then using reverse():

revList = list(firstList)

revList.reverse()

or using the reversed() iterator, which is handy when going through loops in reverse order (for example, for x in reversed(aList):), but needs an explicit conversion to make a duplicate list:

revList = list(reversed(firstList))
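The three approaches can be compared side by side. This is only a small sketch with made-up data, but it shows that they give the same result and that the original list is left untouched in each case:

```python
firstList = ['A', 'C', 'G', 'T']

revA = firstList[::-1]              # Slice with a step of -1
revB = list(reversed(firstList))    # Iterator converted to a list
revC = list(firstList)              # Copy, then reverse in place
revC.reverse()

print(revA)                   # ['T', 'G', 'C', 'A']
print(revA == revB == revC)   # True
print(firstList)              # ['A', 'C', 'G', 'T']
```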

For dictionaries don’t forget the .get() and .setdefault() methods. So:




if x in myDict:
    y = myDict[x]
else:
    y = defaultValue



becomes:

y = myDict.get(x, defaultValue)

or if the default value is None simply:

y = myDict.get(x)

If the default value should be actually put in the dictionary then you can do:

y = myDict.setdefault(x, defaultValue)
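The difference between the two methods is whether the dictionary is changed. A minimal sketch, using illustrative data that is not from the text, might look like this:

```python
myDict = {'A': 1}

y = myDict.get('C', 0)           # Returns the default; dictionary unchanged
print(y, 'C' in myDict)          # 0 False

z = myDict.setdefault('C', 0)    # Returns the default and also stores it
print(z, 'C' in myDict)          # 0 True

# A common use: grouping items into lists without testing for the key first
groups = {}
for name in ['alanine', 'arginine', 'glycine']:
    groups.setdefault(name[0], []).append(name)

print(groups)   # {'a': ['alanine', 'arginine'], 'g': ['glycine']}
```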

In Python 2, if you simply want to enquire whether something is present in a dictionary, it is simpler, and slightly faster, to use in rather than call has_key(). So:

if myDict.has_key(key):
    doSomething()

becomes:

if key in myDict:
    doSomething()

In Python 3 dictionaries no longer have the has_key() method.

It may sometimes be helpful to construct a dictionary from a list. Rather than going through a loop, a list of 2-tuples with (key, value) pairs can be used:

listData = [(1,'Apples'), (2, 'Bananas'), (3, 'Cherries')]

dictData = dict(listData)

print(dictData[2])

In Python 2, to do the reverse you can use .items() to get a list of all pairs, or .iteritems() to get an iterator object, which can be looped through like a list but which yields one item at a time, and so saves memory by not making the complete list:

for k, v in dictData.items(): # Makes a list
    print('Key: %d, Value: %s' % (k,v))

for k, v in dictData.iteritems(): # Uses an efficient iterator
    print('Key: %d, Value: %s' % (k,v))

In Python 3 there is no .iteritems() method and .items() returns an iterable view on the

items in the dictionary, rather than a list.
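To illustrate what a view means in practice, here is a short Python 3 sketch (assuming Python 3.7 or later, where dictionaries keep insertion order). The view follows later changes to the dictionary, and an explicit list() call is needed if a real list is wanted:

```python
dictData = {1: 'Apples', 2: 'Bananas'}

pairs = dictData.items()       # A view object, not a list
dictData[3] = 'Cherries'       # Change the dictionary afterwards

print(len(pairs))              # 3, as the view reflects the new item

pairList = list(pairs)         # Explicit conversion gives a real list
print(pairList[0])             # (1, 'Apples')
```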

The zip() function can be used to combine corresponding elements from multiple lists, which is handy for dictionaries when you initially have separate lists for the keys and values:

keys = [1, 2, 3]

values = ['Apples', 'Bananas', 'Cherries']



listData = zip(keys, values) # Yields (1,'Apples'), (2,'Bananas'), (3,'Cherries')

dictData = dict(listData)
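One point to be aware of in Python 3 is that zip() gives an iterator rather than a list, so it can only be gone through once. The following sketch (the names pairs, leftOver and dictData2 are illustrative) shows the effect:

```python
keys = [1, 2, 3]
values = ['Apples', 'Bananas', 'Cherries']

pairs = zip(keys, values)      # In Python 3 this is an iterator
dictData = dict(pairs)         # Consumes the iterator
leftOver = list(pairs)         # Nothing remains to iterate over

print(dictData)    # {1: 'Apples', 2: 'Bananas', 3: 'Cherries'}
print(leftOver)    # []

dictData2 = dict(zip(keys, values))   # The usual one-line form
```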

The next tip was mentioned before, but we repeat it in the compendium, and it can reverse the above operation (although for dictionaries, .keys() and .values() also do the job). If you already have data in a list of lists (or tuples) then zip can neatly extract the elements which share the same index:

listData = [(1,'Apples'), (2,'Bananas'), (3,'Cherries')]

numbers, fruits = zip(*listData)

The way to imagine this one is that the call is actually zip((1,'Apples'), (2,'Bananas'), (3,'Cherries')), with the * extracting the items in the list as separate arguments. The zip then combines the first elements and the second elements together, exactly as above. This is neater than using the equivalent list comprehensions:

listData = [(1,'Apples'), (2,'Bananas'), (3,'Cherries')]

numbers = [x[0] for x in listData]

fruits = [x[1] for x in listData]
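A small difference between the two approaches is that zip() gives tuples whereas the comprehensions give lists. The sketch below shows this, with numberList as an illustrative name for the converted result:

```python
listData = [(1, 'Apples'), (2, 'Bananas'), (3, 'Cherries')]

numbers, fruits = zip(*listData)
print(numbers)    # (1, 2, 3)
print(fruits)     # ('Apples', 'Bananas', 'Cherries')

# Convert explicitly if a modifiable list is needed
numberList = list(numbers)
numberList.append(4)
print(numberList) # [1, 2, 3, 4]
```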

The zip function can also come in handy as a compact notation for looping through two lists, although in Python 2 it does make a new list, so is not so space efficient. Accordingly, something like:

for i, aValue in enumerate(aList):
    bValue = bList[i]
    print(aValue, bValue)

could become:

for aValue, bValue in zip(aList, bList):
    print(aValue, bValue)

In Python the set data type is sometimes overlooked, especially by those who started with early versions of Python. Nonetheless, it is exceedingly useful and can avoid the need to do looping with lists, as long as order is not important (or can be reconstructed). There is a caveat to such set operations, however: the elements must be hashable, which in practice means they cannot be internally modifiable, a requirement to keep things unique. In essence, sets can contain numbers, strings, tuples (of hashable items), frozen sets and most other objects, but cannot contain other sets, lists or dictionaries.

Looking up elements in a set is fast, so where you have lots of look-ups to do, instead

of:


for x in firstList:
    if x in veryLongList:
        doSomething()

you can make things quicker with:

bigSet = set(veryLongList)

for x in firstList:
    if x in bigSet:
        doSomething()

Note this assumes that the speed gained using bigSet for look-up makes up for the time

spent creating the set in the first place.
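Whether the set pays for itself can be checked with the standard timeit module. The following is only a rough sketch: the list sizes and the function names countWithList and countWithSet are made up for illustration, and the timings will differ from machine to machine.

```python
from timeit import timeit

veryLongList = list(range(20000))
firstList = list(range(0, 40000, 400))   # 100 queries; half are present

def countWithList():
    # Each look-up scans the list from the start
    return sum(1 for x in firstList if x in veryLongList)

def countWithSet():
    bigSet = set(veryLongList)           # Creation cost is included
    return sum(1 for x in firstList if x in bigSet)

listTime = timeit(countWithList, number=5)
setTime = timeit(countWithSet, number=5)
print('List: %.4f s, Set: %.4f s' % (listTime, setTime))
```

Both functions count the same matches, so only the time taken differs.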

Sets provide a neat way of removing duplicates from a list, as long as you don’t need to preserve order. Simply convert to a set and back to a list again:

myList = ['apple', 'banana', 'lemon', 'apple', 'lemon', 'lemon']

uniqueList = list(set(myList)) # e.g. ['lemon', 'apple', 'banana'], in arbitrary order
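If the original order does matter, one possible alternative (not from the text above, and assuming Python 3.7 or later for the first form) is to use dictionary keys, or to keep track of what has been seen in a set:

```python
myList = ['apple', 'banana', 'lemon', 'apple', 'lemon', 'lemon']

# Dictionary keys are unique and (from Python 3.7) keep insertion order
uniqueInOrder = list(dict.fromkeys(myList))
print(uniqueInOrder)   # ['apple', 'banana', 'lemon']

# Alternative for older versions: remember what has been seen already
seen = set()
orderedUnique = []
for item in myList:
    if item not in seen:
        seen.add(item)
        orderedUnique.append(item)

print(orderedUnique)   # ['apple', 'banana', 'lemon']
```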

Getting the common elements of several lists using set operations is neat and efficient, although it may be prudent to simply work with sets in the first place:

a = ['G','S','T','P','A']

b = ['A','V','I','L','P']

intersection = set(a) & set(b)

commonList = list(intersection) # ['A', 'P'] in some order

Likewise to find elements that are present in either list:

a = ['G','S','T','P','A']

b = ['A','V','I','L','P']

union = set(a) | set(b)

combinedList = list(union) # ['A', 'G', 'I', 'L', 'P', 'S', 'T', 'V'] in some order

When constructing lists it can be quicker and more compact to use list comprehensions

than loops. For example:

squares = []

for x in range(1001): # in Python 2 use xrange(1001)
    squares.append( x * x )

is slower than:

squares = [x*x for x in range(1001)]

Also, if we don’t need the whole list, but just need to iterate through it, we can use round parentheses to make a generator object (which has no length as such and does not have indices):

squares = (x*x for x in range(1001)) # Using () not []

for y in squares:
    doSomething()

squares[3] # Fail: raises TypeError, as generators cannot be indexed.
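A further point worth knowing is that a generator can only be gone through once. The short sketch below uses a smaller range than the example above so the numbers are easy to check:

```python
squares = (x*x for x in range(5))    # Generator: values made on demand

total = sum(squares)                 # Consumes the generator
print(total)                         # 30

remaining = list(squares)            # Already exhausted
print(remaining)                     # []

try:
    squares[3]
except TypeError as err:
    print('Cannot index:', err)
```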

It is sometimes overlooked that list comprehensions can contain more than one for clause, acting like nested loops, although it is easy to take this sort of thing too far:

[(x,y) for x in range(3) for y in range(3)]

# Gives [(0, 0), (0, 1), (0, 2),

# (1, 0), (1, 1), (1, 2),

# (2, 0), (2, 1), (2, 2)]




[(x,y) for x in range(3) for y in range(x,3) if x+y >1]

# Gives [(0,2), (1,1), (1,2), (2,2)]

Sometimes you may wish to construct a list of blank lists, to put items into later. For this it is tempting to do:

data = [[]] * 3

print(data) # Gives [[], [], []]

but here the same list object was repeated three times internally:

data[1].append(True)

print(data) # Gives [[True], [True], [True]]

so try a list comprehension instead:

data = [[] for x in range(3)]

data[1].append(True)

print(data) # Gives [[], [True], []]
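The sharing can be confirmed directly with the is operator, which tests whether two names refer to the same object. The sketch below (variable names are illustrative) also shows the same approach for a small grid of rows:

```python
shared = [[]] * 3
print(shared[0] is shared[1])        # True: one list, three references

separate = [[] for x in range(3)]
print(separate[0] is separate[1])    # False: three distinct lists

# The same idea gives independent rows in a grid
grid = [[0] * 2 for i in range(3)]
grid[1][0] = 5
print(grid)    # [[0, 0], [5, 0], [0, 0]]
```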

Although perhaps not such common operations, the built-in any() and all() functions can be used to find whether any or all elements in a list satisfy a certain condition. Accordingly:

for x in myList:
    if x < 2:
        doSomething()
        break

becomes:

if any(x < 2 for x in myList):
    doSomething()

Likewise:

if len(myList) == len([x for x in myList if x < 2]):
    doSomething()

is the same as:

if all(x < 2 for x in myList):
    doSomething()
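A brief sketch with concrete numbers (chosen only for illustration) shows both functions at work, including what happens with an empty list:

```python
myList = [5, 3, 1, 8]

anySmall = any(x < 2 for x in myList)
allSmall = all(x < 2 for x in myList)
print(anySmall)    # True, because 1 is less than 2
print(allSmall)    # False, because 5 is not

# Edge case: an empty list has no elements to pass or fail the test
print(any(x < 2 for x in []))    # False
print(all(x < 2 for x in []))    # True
```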

For obtaining a sorted list, the built-in sorted() function is useful when you don’t want to modify the original list. So instead of:

b = list(a)

b.sort()


you can do:


b = sorted(a)

If you want to sort a list on something other than the items’ innate value you can construct a list of 2-tuples which will be sorted on the first item (which contains the values to sort on). Here we sort according to the length of the strings:

aList = ['homer', 'bart', 'maggie', 'lisa', 'marge']

bList = [(len(x), x) for x in aList]

bList.sort()

aList = [x for (lenX, x) in bList]

# Gives ['bart', 'lisa', 'homer', 'marge', 'maggie']

However, the key option of sort() is much more nifty and allows you to pass in the function that is used to generate the sort key:

aList = ['homer', 'bart', 'maggie', 'lisa', 'marge']

aList.sort(key=len)

Sometimes when dealing with objects we would like to sort on the value of a particular

attribute. You can readily write a function to fetch that attribute (for any object in the list,

as required by the sort operation), and thus generate a key for the sort. So, for example:

def getSortAttr(obj):
    return obj.something

objList = [objA, objB, objC]

objList.sort(key=getSortAttr)

However, you can also use the key option in combination with the operator module. The function operator.attrgetter() uses the name of an attribute to create a separate on-the-fly function which sends back the value of an attribute, which in this case is the value to sort with. So an alternative to the above is:

from operator import attrgetter

objList = [objA, objB, objC]

objList.sort(key=attrgetter('something')) # Name of attribute as a string

The functions operator.itemgetter() (for selecting items in a collection) and operator.methodcaller() (for invoking methods) can also be used in a similar manner.
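As a minimal sketch of these two functions, using made-up data, itemgetter(0) makes a key from the first item of each tuple and methodcaller('lower') makes a key by calling the lower() method of each string:

```python
from operator import itemgetter, methodcaller

pairs = [(3, 'Cherries'), (1, 'apples'), (2, 'Bananas')]

# Sort on the first item of each tuple
byNumber = sorted(pairs, key=itemgetter(0))
print(byNumber)    # [(1, 'apples'), (2, 'Bananas'), (3, 'Cherries')]

names = [name for number, name in pairs]

# Sort ignoring upper or lower case
byName = sorted(names, key=methodcaller('lower'))
print(byName)      # ['apples', 'Bananas', 'Cherries']
```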



Loops

We’ve been using enumerate() throughout the book, but it is still something novices occasionally overlook. So instead of:

myList = ['e', 'f', 'g']

for i in range(len(myList)):
    print(i, myList[i])

do:

myList = ['e', 'f', 'g']

for i, val in enumerate(myList):
    print(i, val)

And from Python 2.6 you can use a second argument to specify the start point for the

index:


for i, val in enumerate(myList, 5):
    print(i, val)

# Gives:

# 5 e


# 6 f

# 7 g


In Python 2, when looping through sequential numbers, such as indices, consider using xrange() rather than range(). This saves space because it only yields numbers on demand. Helpfully, an xrange still has a length and can be indexed. In Python 3, xrange is effectively renamed range and replaces the old list-making function.

for x in xrange(100, 1000000): # Doesn't make all the numbers (Python 2)
    doSomething()

To make an indefinite loop, use a while loop that tests something that is logically true, although don’t forget to break out of the loop eventually:

while True:
    test = doSomething()
    if test:
        break

Because loops are constructs that allow you to repeat operations many times, when thinking about speed a general principle is to put as few operations into the loop as possible. For example, when doing function calls in a loop to construct a list using .append(), a speed improvement can be made if the dot notation call is done only once outside the loop. For example:

aList = []

for x in someBigList:
    if testFunc(x):
        aList.append(x)

becomes the faster:

aList = []
addToList = aList.append

for x in someBigList:
    if testFunc(x):
        addToList(x)

Related to the above, if you know how long a list will be it is faster to pre-construct it in a quick manner and fill it in using indices, rather than appending repeatedly:

aList = [0] * n
bList = [0] * n

for i in range(n):
    aList[i] = someCall(i)
    bList[i] = anotherCall(i)

If you need two loops and have to break out of both of them, cunning use of else, continue and break can do the job without having to set any flags:

for a in oneList:
    for b in anotherList:
        if discoverSomething(a,b):
            # Quit inner loop and subsequently the outer too
            break
    else:
        # Without a break we get here at the end of the inner loop
        # Continuing the outer loop the next break is skipped
        continue
    # Only get here due to the first break
    break
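Since the indentation is what makes this pattern work, a complete runnable sketch may help. Here findPair() and its target argument are hypothetical, standing in for the discoverSomething() test:

```python
def findPair(oneList, anotherList, target):
    """Return the first (a, b) where a + b equals target, else None."""
    found = None
    for a in oneList:
        for b in anotherList:
            if a + b == target:
                found = (a, b)
                break          # Quit the inner loop
        else:
            continue           # Inner loop finished without a break
        break                  # Reached only after the inner break
    return found

print(findPair([1, 2, 3], [10, 20, 30], 22))   # (2, 20)
print(findPair([1, 2, 3], [10, 20, 30], 99))   # None
```

The else belongs to the inner for loop, not to the if statement, which is why it lines up with for b in anotherList.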


If you have a loop that may cause an error (throw an exception) then it may be tempting to do a precautionary check to stop errors before they occur. However, it is generally quicker to let the exception happen and then catch it in a safe way. This is because with try: there is no repeated checking and extra time is taken only if an error is encountered. So, for example:

for x in bigList:
    if rareEvent(x):
        rareEventOccurred(x)
    else:
        commonTask(x)

can be modified into the following, provided that commonTask() raises SpecialException when the rare event occurs:

for x in bigList:
    try:
        commonTask(x)
    except SpecialException:
        rareEventOccurred(x)
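As a concrete sketch of the same idea, the hypothetical function parseCounts() below converts text to integers. Here the built-in ValueError plays the part of SpecialException, and the occasional bad value is dealt with only when the conversion actually fails:

```python
def parseCounts(fields):
    """Convert text fields to integers, collecting any that fail."""
    counts = []
    rejected = []
    for field in fields:
        try:
            counts.append(int(field))    # Common case: no prior checking
        except ValueError:
            rejected.append(field)       # Rare case: handled on error only
    return counts, rejected

counts, rejected = parseCounts(['12', '7', 'n/a', '40'])
print(counts)      # [12, 7, 40]
print(rejected)    # ['n/a']
```

It is generally wise to catch only the specific exception expected, as here, so that unrelated errors are not hidden.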



