Python Programming for Biology: Bioinformatics and Beyond


Correlation and covariance



Date: 30.12.2021
[Tim J. Stevens, Wayne Boucher] Python Programming


For the last part of this chapter we move from studying distributions of one type of measurement to the comparison of two different types, each with a different random variable. We can imagine the random variables to correspond to different dimensions or axes. Hence, a data point will be composed of two values, one for each axis. An approach here might be to apply statistical tests to a two-dimensional, joint probability distribution, employing the methods already discussed. However, we are often interested in the relatively simple question of whether the values for the two axes vary together in some way. In other words if the value of one measurement increases we would like to know whether the other measurement also increases, decreases or stays the same overall. This is what we call correlation. Naturally, this is also subject to significance testing because the variation associated with sampling of the probability distributions impinges on our measures of correlation. In particular, because of the variation arising from a small number of samples we may observe an apparent correlation and need to know the likelihood that it was generated by a random process.
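To make the point about small samples concrete, the following is a minimal sketch rather than a formal significance test. It repeatedly draws two independent normal samples and counts how often their correlation coefficient appears large purely by chance. The function name chance_correlation_rate, the threshold of 0.5, the trial count and the seed are all illustrative assumptions, not taken from the text.

```python
import numpy as np

def chance_correlation_rate(n_points, threshold=0.5, n_trials=2000, seed=1):
    """Fraction of trials in which two independent samples of size
    n_points give a correlation coefficient beyond +/- threshold."""
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_trials):
        x = rng.normal(0.0, 1.0, n_points)
        y = rng.normal(0.0, 1.0, n_points)  # Independent of x
        r = np.corrcoef(x, y)[0, 1]         # Pearson correlation coefficient
        if abs(r) > threshold:
            count += 1
    return count / n_trials

rate_small = chance_correlation_rate(5)    # Very few data points
rate_large = chance_correlation_rate(100)  # Many data points
print(rate_small, rate_large)
```

With only five points per sample a sizeable fraction of the trials exceed the threshold even though the variables are unrelated, whereas with 100 points this almost never happens.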

Covariance

Covariance is a measure of whether two random variables vary simultaneously as their values increase or decrease. The covariance is calculated by subtracting the means of the random variables, so they are effectively centred on zero, and then finding the average product of the two coordinates. Hence for two probability distributions, described by random variables X and Y with sample points $x_i$ and $y_i$ respectively, the covariance may be written as:

\[ \mathrm{cov}(X, Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_X)(y_i - \mu_Y) \]

where $\mu_X$ and $\mu_Y$ are the means of X and Y and $n$ is the number of sample points.
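As a rough sketch of this calculation (subtract the means, then average the products of the centred coordinates), the function below computes the covariance by hand. The function name covariance, its sample option and the small test arrays are illustrative assumptions. Note that numpy.cov() divides by n-1 by default (the sample estimate), so bias=True is passed here to obtain the plain average.

```python
import numpy as np

def covariance(x, y, sample=False):
    """Covariance of two equal-length sequences.
    Divides by n, or by n-1 if sample is True."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # Centre each variable on zero
    dy = y - y.mean()
    n = len(x)
    denom = n - 1 if sample else n
    return (dx * dy).sum() / denom

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]

manual = covariance(x, y)                 # Average product: 2.875
numpy_pop = np.cov(x, y, bias=True)[0, 1] # Same value, dividing by n
numpy_default = np.cov(x, y)[0, 1]        # Divides by n-1 instead
print(manual, numpy_pop, numpy_default)
```

For large numbers of points the difference between dividing by n and by n-1 is small, but it is worth knowing which convention a given function uses.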

The idea is that if there is a positive correlation then the positions from both axes will tend to be on the same side of their means, giving consistently positive products. For a negative correlation they will tend to be on opposite sides, giving consistently negative products. If there is no correlation the products will be both positive and negative, averaging towards zero. In Python there is the handy numpy.cov() function to do the work for us. Here we illustrate with two test combinations for random xVals: yVals1 is completely random and yVals2 is derived from xVals by applying a gradient and an offset, with small random deviations added:

from numpy import random, cov

xVals = random.normal(0.0, 1.0, 100)
yVals1 = random.normal(0.0, 1.0, 100) # Random, independent of xVals

deltas = random.normal(0.0, 0.75, 100)
yVals2 = 0.5 + 2.0 * (xVals - deltas) # Derived from xVals

cov1 = cov(xVals, yVals1)
# The exact values below depend on the random numbers
# Cov 1: [[0.848, 0.022]
#         [0.022, 1.048]]

cov2 = cov(xVals, yVals2)
# Cov 2: [[0.848, 1.809]
#         [1.809, 5.819]]

The result here is the covariance matrix, rather than just a single value. This is a generalisation of the process: if you pass in several arrays it will give back a matrix of the covariances for all possible pairs. Hence for our two input arrays we will get a matrix with four values, i.e.

\[ \begin{pmatrix} \mathrm{cov}(X, X) & \mathrm{cov}(X, Y) \\ \mathrm{cov}(Y, X) & \mathrm{cov}(Y, Y) \end{pmatrix} \]

so the diagonal is simply the variances for X and Y, and the two off-diagonal values are both equal to the covariance we generally want. Here the interesting covariances are 0.022 and 1.809 for yVals1 and yVals2 respectively.
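The layout of the returned matrix can be checked with a short sketch such as the one below. The variable names, the seed and the third array z are illustrative assumptions. It picks out the single covariance value from the off-diagonal, confirms that the diagonal matches the variances (using ddof=1, because numpy.cov() divides by n-1 by default), and shows that stacking three arrays as rows gives a 3x3 matrix covering all pairs.

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(0.0, 1.0, 100)
y = 0.5 + 2.0 * x + rng.normal(0.0, 1.5, 100)  # Derived from x
z = rng.normal(0.0, 1.0, 100)                  # Independent of x and y

matrix = np.cov(x, y)
cov_xy = matrix[0, 1]                # The covariance we generally want
var_x, var_y = matrix[0, 0], matrix[1, 1]

# The matrix is symmetric and its diagonal holds the variances
symmetric = np.isclose(matrix[0, 1], matrix[1, 0])
diag_ok = (np.isclose(var_x, np.var(x, ddof=1)) and
           np.isclose(var_y, np.var(y, ddof=1)))

# Several variables at once: each row is treated as one variable
matrix3 = np.cov(np.vstack([x, y, z]))
print(cov_xy, symmetric, diag_ok, matrix3.shape)
```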



