Python Programming for Biology: Bioinformatics and Beyond

Date: 30.12.2021
[Tim J. Stevens, Wayne Boucher] Python Programming

Simple operations

When taking powers of numbers, the pow() function is slower than the ** operator or simply using multiplication. So instead of:

v = pow(w, 1.5)

x = pow(y, 3)

try:

v = w**1.5 # Faster

x = y*y*y # Faster still
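The speed difference can be checked on your own interpreter. The following is a minimal sketch using the standard timeit module; the test value and the repeat count are arbitrary assumptions, and the timings will vary between Python versions and machines.

```python
from timeit import timeit

# Arbitrary illustrative value; any float would do.
y = 2.5

# All three expressions compute the cube of y.
cubeByPow = pow(y, 3)
cubeByOperator = y ** 3
cubeByMultiply = y * y * y

# Time each form; absolute numbers depend on the machine and Python version.
setup = 'y = 2.5'
for statement in ('pow(y, 3)', 'y ** 3', 'y * y * y'):
    seconds = timeit(statement, setup=setup, number=100000)
    print('%-10s %.4f s' % (statement, seconds))
```

It is worth measuring before rewriting code, since the relative ordering is not guaranteed to be the same everywhere.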

When it comes to taking square roots, sometimes you can avoid the operation entirely, e.g. by comparing squared values instead, which are quicker to calculate. A common application of this is in finding distances; the smallest squared distance will still be the smallest overall. Thus, instead of doing:

from math import sqrt

for x, y in data:
    if sqrt(x) < y:
        doSomething()

consider the following, which gives the same result provided y is not negative:

for x, y in data:
    if x < y*y:
        doSomething()
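The distance application mentioned above might look something like the following sketch. The coordinates and the squareDistance() helper are hypothetical, introduced only for illustration; the point is that ranking by squared distance picks out the same closest point without calling sqrt() for every candidate.

```python
from math import sqrt

# Hypothetical 2D coordinates, purely for illustration.
points = [(3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
origin = (0.0, 0.0)

def squareDistance(a, b):
    """Return the squared Euclidean distance between two 2D points."""
    dx = a[0] - b[0]
    dy = a[1] - b[1]
    return dx*dx + dy*dy

# Rank by squared distance: no square root is needed to find the closest point.
closest = min(points, key=lambda p: squareDistance(p, origin))

# Take a single square root at the end, only if the true distance is wanted.
closestDistance = sqrt(squareDistance(closest, origin))
```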

The values of two variables can be swapped cleanly in one line by assigning the elements of two implicit tuples. Hence the three lines involving a temporary variable:

temp = x
x = y
y = temp

become one neat swap:

x, y = y, x

Several variables can be set to the same value with a chained assignment, so instead of:

x = 0

y = 0

z = 0

you can do:

x = y = z = 0

Take care when the value is a mutable object such as a list, because all the names then refer to the same object. Thus:

u = []

v = []

means something different to:

u = v = []

because in the first example we have two separate lists, and in the second example both names refer to the same list.
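The difference is easy to demonstrate with the is operator, which tests whether two names refer to the same object. This is a small sketch; the names p, q and sameObject are chosen here only for illustration.

```python
# Chained assignment binds both names to one list object.
u = v = []
u.append('A')
sameObject = u is v            # True: v sees the appended item too

# Separate assignments create two independent lists.
p = []
q = []
p.append('A')
separateObjects = p is not q   # True: q is still empty
```

Chained assignment remains safe for immutable values such as numbers and strings, because those cannot be changed in place.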

Comparison operators can be chained, e.g. to check that a value lies between a lower and an upper limit. So instead of:

if (lower < x) and (x <= upper):
    doSomething()

the following is simpler:

if lower < x <= upper:
    doSomething()
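As a small worked case, a chained comparison can be used directly inside a list comprehension to keep only the values within the limits. The limits and values below are assumed purely for illustration.

```python
# Hypothetical limits and measurements, for illustration only.
lower, upper = 0.0, 1.0
values = [-0.5, 0.0, 0.25, 1.0, 1.5]

# lower < x <= upper is evaluated as (lower < x) and (x <= upper),
# with x evaluated only once.
inRange = [x for x in values if lower < x <= upper]
```

Note that 0.0 is excluded because the lower test is strict, while 1.0 is kept because the upper test is inclusive.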

The Boolean and ternary (one-line if clause) operators can avoid simple multi-line clauses like:

aList = [1, 0, 0, 1, 1, 0]

for x in aList:
    if x:
        print('Yes')
    else:
        print('No')

This first alternative using Boolean logic works but is a little dangerous, because it depends on the first of the two alternatives being logically true (which is fortunately the case here for the string ‘Yes’):

aList = [1, 0, 0, 1, 1, 0]

for x in aList:
    print(x and 'Yes' or 'No')

A better alternative is to use the ternary operator:

aList = [1, 0, 0, 1, 1, 0]

for x in aList:
    print('Yes' if x else 'No')
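The danger of the Boolean form shows up as soon as the value intended for the true case is itself logically false. In this sketch an empty string is used for the true case (an assumption made only to trigger the problem), and the two forms then disagree.

```python
flags = [1, 0, 1]

# The and/or idiom fails when the 'true' alternative is itself false, e.g. ''.
viaBoolean = [x and '' or 'empty' for x in flags]

# The ternary operator always picks the alternative selected by the condition.
viaTernary = ['' if x else 'empty' for x in flags]
```

Here viaBoolean contains 'empty' for every item, because the empty string falls through to the or branch, whereas viaTernary gives the intended result.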

It is sometimes convenient to use or to provide a default value, so in the following example the variable is set to None if the function call returns zero:

x = sum(myList) or None
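A short sketch of how this behaves for different inputs follows. The sumOrNone() wrapper is a hypothetical helper added for illustration; note that the default is used for any logically false result, so a list whose items cancel out to zero is treated the same as an empty list.

```python
def sumOrNone(values):
    """Hypothetical helper: return the sum of values, or None if the sum is zero."""
    return sum(values) or None

emptyTotal = sumOrNone([])           # sum is 0, so None
cancelTotal = sumOrNone([2, -2])     # also 0, hence None
normalTotal = sumOrNone([1, 2, 3])   # 6
```

If zero is a meaningful result that must be kept, an explicit test (for example on the length of the list) is the safer choice.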

Strings

When joining strings into longer text it is faster to use a separator.join() call than to redefine the string each time. Thus for big lists avoid:

seqList = ['C','A','T','G','G','C','T','C','T','C']

seqString = ''

for letter in seqList:
    seqString += letter

when you can do:

seqList = ['C','A','T','G','G','C','T','C','T','C']

seqString = ''.join(seqList)

though in later versions of Python the latter is only about 10% quicker.
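The same call works with any separator string, not just an empty one. The sketch below uses made-up data; it also shows that the items must already be strings, so other values need converting first.

```python
seqList = ['C', 'A', 'T', 'G']

# The string the method is called on is placed between the items.
commaSeparated = ','.join(seqList)

# join() only accepts strings, so convert other values first.
positions = [1, 2, 3]
positionText = ' '.join(str(p) for p in positions)
```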

Similarly, using a formatted string is faster than string concatenation:

line = 'First:' + s1 + ' Second:' + s2 + ' Third:' + s3 # Slower

line = 'First:%s Second:%s Third:%s' % (s1, s2, s3) # Faster
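The two forms produce identical text, as the following sketch shows. The values given to s1, s2 and s3 are hypothetical, and the str.format() variant is included as an equivalent alternative rather than something the text above compares for speed.

```python
# Hypothetical values standing in for s1, s2 and s3.
s1, s2, s3 = 'ATG', 'GCC', 'TAA'

byConcatenation = 'First:' + s1 + ' Second:' + s2 + ' Third:' + s3
byPercent = 'First:%s Second:%s Third:%s' % (s1, s2, s3)
byFormat = 'First:{} Second:{} Third:{}'.format(s1, s2, s3)

# Formatting also converts non-string values, which + would reject.
count = 3
summary = 'Found %d codons' % count
```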

If you need to get sequential letters you can use the handy ord() and chr() functions, which respectively fetch and decode character numbers (according to the Unicode scheme).

startNum = ord('A')

tenLetters = [chr(startNum+i) for i in range(10)]

# Gives ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
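The same idea can be wrapped in a small function that lists every character between two end points. The letterRange() name is a hypothetical helper introduced here for illustration, and it assumes the first character does not come after the last.

```python
def letterRange(first, last):
    """Hypothetical helper: list the characters from first to last inclusive."""
    return [chr(code) for code in range(ord(first), ord(last) + 1)]

lowerCase = letterRange('a', 'e')
digits = letterRange('0', '9')
```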

Removing characters from a string can be done with the often forgotten .translate() method, though this is used slightly differently in Python 3 compared to Python 2. Here we remove ‘A’, ‘T’ and ‘,’ characters:

seq = 'A,T,G,A,C,A,T,C,A,T,G,G,C,T,C,T,C'

# Python 2

seq = seq.translate(None, 'AT,')

# Python 3

transTable = str.maketrans('', '', 'AT,')

seq = seq.translate(transTable)

# Both give 'GCCGGCCC'
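In Python 3 the first two arguments of str.maketrans() can also map characters onto replacements, so substitution and removal can be done in one pass. As a sketch, assuming we want the complementary DNA bases (an example chosen here for illustration, not taken from the text above):

```python
seq = 'A,T,G,A,C'

# Map each base to its complement and delete the commas in the same table.
complementTable = str.maketrans('ATGC', 'TACG', ',')
complement = seq.translate(complementTable)
```

Characters that appear in neither the mapping nor the deletion string are passed through unchanged.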
