Python Programming for Biology: Bioinformatics and Beyond

Figure 16.3 (Plate 4). The results of various array comparison procedures

[Tim J. Stevens, Wayne Boucher] Python Programming

Figure 16.3 (Plate 4). The results of various array comparison procedures. Shown from left to right are: the product of red and green channels, displayed as yellow, to illustrate the coincidence between channels; the G-test scores of red × log(red/green), which is designed to show where the values in the two channels are different and of significant value; hierarchical clustering to produce shuffled rows and columns in the test data, which here shows that there are replicates for each row.

As another example, we next find the logarithm (here in base 2) of the ratio of the red and green channels. Combining two-channel red and green microarray data in this way is commonplace (for example, when making ‘MA’ plots). To achieve this we first define a small helper function, log2Ratio, which accepts two input data arrays and gives back the combined comparison array. Note that we take a copy of the original data and add a small amount to each array, to ensure that we neither divide by zero nor take the logarithm of zero:

from numpy import array, log2

def log2Ratio(data1, data2):
  # Copy the inputs and shift them slightly away from zero
  data1 = array(data1) + 1e-3
  data2 = array(data2) + 1e-3

  return log2(data1/data2)
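As a quick check of the helper, the sketch below applies the same calculation to two small made-up intensity arrays. It also computes the 'M' and 'A' values conventionally used for MA plots, where M is the log ratio and A is the mean log intensity. The sample values and the names red, green, mValues and aValues are illustrative assumptions, not data from the text.

```python
from numpy import array, log2

def log2Ratio(data1, data2):
  # Same helper as above: shift both inputs away from zero first
  data1 = array(data1) + 1e-3
  data2 = array(data2) + 1e-3
  return log2(data1/data2)

# Made-up spot intensities, for illustration only
red = array([[8.0, 2.0], [4.0, 0.0]])
green = array([[2.0, 8.0], [4.0, 0.0]])

# M: log ratio of the two channels (positive where red exceeds green)
mValues = log2Ratio(red, green)

# A: average log intensity, using the same small offset
aValues = 0.5 * (log2(red + 1e-3) + log2(green + 1e-3))

print(mValues)
print(aValues)
```

Spots where both channels are zero give a ratio of exactly zero rather than an error, which is the purpose of the small offset.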

We can use this function to combine the red and green channels (index 0 and 1), placing the result in the blue channel (index 2).

rgArray = loadArrayImage(imgFile, 'TwoChannel', 18, 17)
rgArray.combineChannels(0, 1, combFunc=log2Ratio, replace=2)

The result can be visualised by selecting only the blue channel, here making a greyscale image.

rgArray.makeImage(20, channels=(2,2,2)).show()
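The internals of combineChannels and makeImage are not shown in this section, so the following is only a minimal sketch of the idea using a plain NumPy array. It assumes the data is stored with shape (channels, rows, columns). The function combineChannelData and the sample array are hypothetical stand-ins, not the class's actual methods.

```python
from numpy import array, log2, stack, zeros

def log2Ratio(data1, data2):
  data1 = array(data1) + 1e-3
  data2 = array(data2) + 1e-3
  return log2(data1/data2)

def combineChannelData(data, indexA, indexB, combFunc, replace):
  # Hypothetical helper: combine two channels, store the result in a third
  data = array(data, float)  # Work on a copy of the input
  data[replace] = combFunc(data[indexA], data[indexB])
  return data

# Assumed layout: (channels, rows, columns); channel 2 (blue) starts empty
rgbData = zeros((3, 2, 2))
rgbData[0] = [[8.0, 2.0], [4.0, 1.0]]  # red
rgbData[1] = [[2.0, 8.0], [4.0, 1.0]]  # green

combined = combineChannelData(rgbData, 0, 1, log2Ratio, 2)

# Repeating channel 2 for red, green and blue gives a grey pixel array
greyPixels = stack([combined[2]] * 3, axis=-1)

print(combined[2])
print(greyPixels.shape)
```

Using the same channel three times means every pixel has equal red, green and blue components, which is why the image appears grey.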

A further alternative for showing differences between the two channel intensities is to use logarithms to calculate x*log(x/y), where x and y are two colour channels. This is a convenient way of showing the information content of one distribution over another, and it is used in the G-test (see Chapter 22). As in the previous example, the red and green channel arrays are both shifted away from zero by a small amount, so that zeros do not occur in the division that follows.

from numpy import array, log2

def gScore(data1, data2):
  # Copy the inputs and shift them slightly away from zero
  data1 = array(data1) + 1e-3
  data2 = array(data2) + 1e-3

  return data1 * log2(data1/data2)
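To see how this score behaves, the sketch below evaluates the same calculation on a few made-up values. The score is zero where the channels are equal and positive where the first channel exceeds the second. Because it is weighted by the first channel's intensity, it is not symmetric in its two arguments. The sample arrays are assumptions for illustration.

```python
from numpy import array, log2

def gScore(data1, data2):
  data1 = array(data1) + 1e-3
  data2 = array(data2) + 1e-3
  return data1 * log2(data1/data2)

# Made-up intensities for four spots
red = array([4.0, 4.0, 1.0, 0.0])
green = array([4.0, 1.0, 4.0, 0.0])

redScores = gScore(red, green)    # From the perspective of the red channel
greenScores = gScore(green, red)  # From the perspective of the green channel

print(redScores)
print(greenScores)
```

For the second spot the red-perspective score is large and positive, while the green-perspective score is smaller in magnitude and negative, which illustrates the asymmetry.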

This can be tested as before, though we normalise the values so the logarithms are scaled into the same range as the other channels:

rgArray = loadArrayImage(imgFile, 'TwoChannel', 18, 17)
rgArray.combineChannels(0, 1, combFunc=gScore, replace=2)
rgArray.normaliseMax(perChannel=True)
rgArray.makeImage(20, channels=(2,2,2)).show()
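The implementation of normaliseMax is not given in this section. One plausible reading of per-channel normalisation, sketched here as an assumption, is to divide each channel by its own maximum so that every channel peaks at 1.0. The function normaliseChannelMax and the test data are hypothetical.

```python
from numpy import array

def normaliseChannelMax(data):
  # Hypothetical stand-in: scale each channel so its largest value is 1.0
  data = array(data, float)  # Work on a copy of the input
  for i in range(len(data)):
    maxVal = data[i].max()
    if maxVal > 0:  # Leave empty or non-positive channels unchanged
      data[i] = data[i] / maxVal
  return data

# Made-up data with shape (channels, rows, columns)
channels = array([[[10.0, 5.0], [0.0, 2.5]],
                  [[200.0, 50.0], [100.0, 0.0]],
                  [[8.0, -2.0], [0.0, 4.0]]])

scaled = normaliseChannelMax(channels)

print(scaled.max(axis=(1, 2)))
```

Note that under this assumption negative scores stay negative after scaling, so an image routine would still need to clip or offset them before display.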

With values calculated in this way we can perform significance tests, as described in Chapter 22, given that the random expectation is that values will be chi-square distributed. Here the comparative term has been calculated from the perspective of the red channel, but we could also do it for green (i.e. green * log(green/red)).
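As a small worked illustration of the link to significance testing, the sketch below sums terms of the form observed × ln(observed/expected) over two categories to give a G statistic, then converts it to a p-value. It assumes natural logarithms, made-up counts and one degree of freedom, for which the chi-square tail probability can be written using erfc from the standard library. It is not the procedure from Chapter 22, and the function names are hypothetical.

```python
from math import erfc, log, sqrt

def gStatistic(observed, expected):
  # G = 2 * sum(O * ln(O/E)), skipping empty categories
  total = 0.0
  for obs, exp in zip(observed, expected):
    if obs > 0:
      total += obs * log(obs / exp)
  return 2.0 * total

def chiSquareTail1(value):
  # Upper tail probability for chi-square with 1 degree of freedom
  return erfc(sqrt(value / 2.0))

observed = [60, 40]  # Made-up counts
expected = [50, 50]  # Counts expected under the null hypothesis

g = gStatistic(observed, expected)
pValue = chiSquareTail1(g)

print(g, pValue)
```

With more than two categories the degrees of freedom increase and a general chi-square distribution function would be needed instead of the erfc shortcut.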
