
Biostatistics Assignment: Dna Microarray: AN

This document discusses DNA microarrays and statistical analysis methods. It introduces DNA microarrays, describes the basic procedure and types. It then discusses the steps involved, sources of variation, experimental design approaches like linear models and single channel designs. Finally, it covers hypothesis testing, significance tests like t-tests, Welch's t-test, Mann-Whitney U test, and the significance analysis of microarrays (SAM) method.

Uploaded by

Akhil Nair
Copyright
© Attribution Non-Commercial (BY-NC)

BIOSTATISTICS ASSIGNMENT

Akhil Nair
4/14/2010

DNA MICROARRAY: AN INTRODUCTION
PROCEDURE & TYPES OF DNA MICROARRAY
STEPS OF DNA MICROARRAY
SOURCES OF VARIATION
EXPERIMENTAL DESIGN
METHODS OF SIGNIFICANCE TESTING
REFERENCES

DNA MICROARRAY: AN INTRODUCTION


Before I describe the statistical analysis of microarray technology, let me explain what a DNA microarray is and how it works.

The condition of a cell at any given instant depends on its transcriptional state, that is, on which mRNAs are present in the cell and in what amounts. This is because mRNA is the information carrier that directs the production of the necessary proteins; hence, by analysing the information stored in the mRNA, we can draw conclusions about the state of the cell. This approach is followed because extracting and comparing every translated protein is very tedious and time consuming. Basically, a DNA microarray is used to understand the expression patterns of various segments of DNA. For example, to compare a cancerous cell with a normal cell, we compare the mRNA transcribed in both cells to find the differences, thereby spotting abnormal or faulty transcripts.

Now for the procedure of a DNA microarray: a glass slide is divided into a grid of many rectangular spots, and each spot holds a solution containing a known DNA segment (the probe). A second solution, containing fluorescently labelled segments from the cell under study, is then allowed to interact with it, and the segments that are complementary to each other hybridise. The rest are washed away. The slide is then scanned with laser beams, and the hybridised spots glow because of the fluorescent label, which allows them to be detected.

Now let us look briefly at the types of microarray experiment.

Experiment Type 1: Tissue-specific gene expression
Cells of one tissue behave differently from cells of another tissue, and the difference is basically due to the enzymes and proteins present in the respective tissues. Microarrays can be used to find out which genes are expressed in which tissues, thereby helping to understand the biochemical mechanisms of the tissue concerned.

Experiment Type 2: Developmental genetics
It has been found that sets of genes are expressed differentially at different stages of life. There is an early set of genes which is used and then reused at different stages of life. One important application is the behaviour of growth factors, as this may be linked to the proto-oncogenes. Microarrays can be used to track changes in the gene expression profile of the organism to understand the differences.

Experiment Type 3: Genetic diseases
These diseases occur for a variety of reasons, from mutation and gene inactivation to even reverse transcription. Microarrays can be used to identify the gene segments that differ between normal and diseased cells. As cancer has different forms that follow different courses, microarrays can be used to identify the affected segments so that drugs can be prescribed accordingly.

STEPS INVOLVED IN DNA MICROARRAY


A typical microarray experiment involves the following steps:
1. Preparation of the microarray
2. Labelling of the microarray
3. Hybridising the microarray and then washing the array
4. Scanning the array
5. Interpreting the scanned image
6. Statistically analysing the interpretation

The part of the experiment that concerns us here is inference and statistical analysis. This starts with a statistical experimental design, followed by a hypothesis and then a test to analyse it statistically. To go through this procedure we first need to know the sources responsible for unwanted variation.

Sources of variation:
There are two dyes used for this experiment, Cy5 and Cy3, and their sensitivities differ; there can also be a problem of the dyes getting swapped.
Another source is the length of the base segments in each of the probes and replicates.
Chance variation in the form of measurement error.
Even the experimenter, the time of day and the use of different equipment have an effect on variation.
Errors in hybridisation and scanning also account for variation.

Once the sources of variation are known, the experimental design can be worked out. For the experimental design there are three things to be kept in mind.
Blocking: Divide the units into blocks by grouping units that are similar, and assign the conditions within each block.
Randomization: The order in which the treatments are given should be unbiased and should follow a random procedure.
Crossing: To know the effect of a combination of conditions, prepare all combinations of the factors in question and then estimate the interaction. This removes the effect of confounding.
Confounding: The difficulty that arises when the effect of a condition cannot be separated from that of another factor, for example when one cannot tell whether an observed variation is real or due to the blocking.
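The blocking and randomization ideas above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the block names (`day_1`, `day_2`), the slide names and the two conditions are hypothetical, and each block is assumed to hold exactly one unit per condition.

```python
import random


def randomized_block_assignment(blocks, conditions, seed=0):
    """Assign every condition once per block, in random order within the block."""
    rng = random.Random(seed)  # fixed seed so the plan can be reproduced
    plan = {}
    for block_name, units in blocks.items():
        if len(units) != len(conditions):
            raise ValueError("each block needs exactly one unit per condition")
        shuffled = list(conditions)
        rng.shuffle(shuffled)  # randomization happens inside the block
        plan[block_name] = dict(zip(units, shuffled))
    return plan


# Hypothetical blocks: slides processed on the same day are treated as similar.
blocks = {
    "day_1": ["slide_a", "slide_b"],
    "day_2": ["slide_c", "slide_d"],
}
plan = randomized_block_assignment(blocks, ["normal", "cancer"])
for block_name, assignment in plan.items():
    print(block_name, assignment)
```

Because every condition appears once in every block, a day-to-day difference affects both conditions equally and is not confounded with the comparison of interest.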

Design of experiments followed:

Linear model: the traditional method (Kerr et al. 2000)

$$\log(y_{ijkg}) = \mu + A_i + D_j + V_k + G_g + (AG)_{ig} + (VG)_{kg} + \epsilon_{ijkg}$$

It can estimate the nuisance effects as well as evaluate differential expression.
$A_i$ - array effect
$D_j$ - dye effect
$V_k$ - variety (condition) effect
$G_g$ - gene effect
$(AG)_{ig}$ - spot effect
$(VG)_{kg}$ - variety-by-gene interaction, the term that captures differential expression
The model is then expressed in the form of a scatter plot and analysed further. This model turns out to be ambiguous (I must admit I do not understand the concept so well myself). The problem with the model is both its simplicity and its complexity.

Single channel microarray design: This design is interesting; its applications range from a single comparative experiment to complex multiple comparisons and even diagnostic tests.

[Figure: single channel design layout, Condition 1 to Condition 4]

[1] A single channel design that compares four conditions using three microarrays for each condition, with each hybridisation mixture containing 2 biological samples.

For example, take the expression of a gene in mice over 4 conditions, namely virgin, pregnancy, lactation and involution of the breasts, with three biological samples for each day. This results in 54 microarrays over 18 time points. This design is my personal favourite, as it not only analyses the interaction between conditions but also captures the effect of each condition, resulting in more specificity, even though it may become hectic with a large number of samples.

There is also a design called the optimal design of dual channel arrays. It seems interesting but was too complex for me to follow, so I am afraid I cannot present my understanding of it here.

Now, once the design is selected and the experiment is performed, the next job is to analyse the results statistically and draw inferences.

HYPOTHESIS FORMATION AND SIGNIFICANCE TESTING

To start the analysis, we first set up the hypotheses.
Null hypothesis: There is no difference in gene expression between the two groups.
Alternate hypothesis: There is a considerable difference between the gene expressions of the two groups.

Now let us take a look at the statistical tests used for significance testing of microarrays.

A two sample t-test: The test statistic is

$$T = \frac{(\bar{x} - \bar{x}') - \delta}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$ [2]

where $\delta = \log 2$ for a twofold difference and $s_p$ is the pooled standard deviation of the two groups. Here the numerator corresponds to the signal (the observed difference) and the denominator to the noise (the variability of the system).
Degrees of freedom: df = n1 + n2 - 2
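As a rough illustration of the two sample t statistic, the sketch below computes it with the pooled standard deviation and n1 + n2 - 2 degrees of freedom. The function name `pooled_t` and the expression values are made up for the example, and `delta` defaults to 0 (no hypothesised shift) as an assumption.

```python
import math
from statistics import mean, variance


def pooled_t(x, y, delta=0.0):
    """Two sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    t = (mean(x) - mean(y) - delta) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical log expression values of one gene in two groups.
normal = [7.1, 6.9, 7.3, 7.0]
tumour = [8.2, 8.0, 8.5, 8.1]
t, df = pooled_t(tumour, normal)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

The calculated t would then be compared with the tabulated value at the chosen significance level and the reported degrees of freedom.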

What I learnt from this test is that if the data have heavier tails than the normal distribution, the denominator of the t-test is inflated and it becomes hard to reject the null hypothesis. Because of this the test is prone to type II error.

Welch t-test: When a high false rate is expected, the unequal variance form of the t-test is adopted:

$$T = \frac{(\bar{x} - \bar{x}') - \delta}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$ [3]

The degrees of freedom $\nu$ associated with this variance estimate are approximated using the Welch-Satterthwaite equation:

$$\nu \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2 \nu_1} + \frac{s_2^4}{n_2^2 \nu_2}}$$ [4]

Here $\nu_i = n_i - 1$ is the degrees of freedom associated with the i-th variance estimate.
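The Welch statistic and its approximate degrees of freedom can be sketched as follows. The function name `welch_t` and the data are hypothetical, and `delta` again defaults to 0 as an assumption; the degrees of freedom follow the Welch-Satterthwaite approximation with n_i - 1 for each group.

```python
import math
from statistics import mean, variance


def welch_t(x, y, delta=0.0):
    """Unequal variance t statistic and Welch-Satterthwaite df; returns (t, df)."""
    n1, n2 = len(x), len(y)
    v1 = variance(x) / n1  # s1^2 / n1
    v2 = variance(y) / n2  # s2^2 / n2
    t = (mean(x) - mean(y) - delta) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical log expression values; the tumour group is more variable.
normal = [7.1, 6.9, 7.3, 7.0]
tumour = [8.9, 7.4, 9.6, 8.0, 8.6]
t, df = welch_t(tumour, normal)
print(f"t = {t:.2f}, approximate df = {df:.2f}")
```

The approximate degrees of freedom are usually not a whole number and always lie between the smaller of n1 - 1 and n2 - 1 and the pooled value n1 + n2 - 2.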

If (t calculated >= t table), the gene is considered to be differentially expressed relative to its counterpart. At low degrees of freedom the test becomes ineffective. When the data do not follow any known distribution, a more effective approach is non-parametric analysis. Here the preferred test is the Mann-Whitney U test:

The test involves the calculation of a statistic, usually called U, whose distribution under the null hypothesis is known. U is then given by:

$$U = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - \sum R_i$$ [5]

Where:
U = Mann-Whitney U statistic
n1 = sample size one
n2 = sample size two
Ri = rank of the i-th observation of sample two in the combined data, summed over sample two
The smaller value of U is the one used when consulting significance tables. At the 0.05 level of significance, the null hypothesis is rejected if the calculated U is less than or equal to the critical value tabulated for n1 and n2.
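To make the ranking step concrete, here is a small sketch of the U calculation. It assumes the usual rank-sum form U = n1 n2 + n2 (n2 + 1) / 2 - R2, where R2 is the sum of the ranks of sample two, gives tied values their average rank, and returns the smaller of the two possible U values. The function name `mann_whitney_u` and the data are hypothetical.

```python
def mann_whitney_u(x, y):
    """Smaller Mann-Whitney U for samples x and y, using average ranks for ties."""
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_2 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 1)
    u = n1 * n2 + n2 * (n2 + 1) / 2 - rank_sum_2
    return min(u, n1 * n2 - u)


# Hypothetical log expression values of one gene in two groups.
normal = [7.1, 6.9, 7.3, 7.0]
tumour = [8.2, 8.0, 8.5, 8.1]
u = mann_whitney_u(normal, tumour)
print("U =", u)
```

When the two groups do not overlap at all, as in this made-up data, the smaller U is 0, which is the strongest evidence against the null hypothesis that these sample sizes can give.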

Significance analysis of microarrays (SAM): This is a statistical technique, established in 2001 by Tusher, Tibshirani and Chu, for determining whether changes in gene expression are statistically significant. The amount of data generated is considerable, so a method for sorting out what is significant and what is not is essential.

SAM is developed and distributed around the world by Stanford University as a software package [6]. SAM identifies statistically significant genes by carrying out gene-specific t-tests. The analysis is non-parametric, since the data may not follow a normal distribution. The response variable describes and groups the data based on experimental conditions. In this method, repeated permutations of the data are used to determine whether the expression of any gene is significantly related to the response. The use of permutation-based analysis accounts for correlations between genes and avoids parametric assumptions about the distribution of individual genes. This is an advantage over other techniques. I personally feel that this testing technique has a future, because it takes care of the correlation between expressions and comes as more specialised software to work with than the other tests mentioned above.
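The permutation idea can be illustrated with a much simplified SAM-style sketch. It is not the Stanford package: the function names, the toy data, the fixed constant `s0` added to the denominator and the threshold `delta` are all assumptions for the example, whereas the real method chooses its constant from the data, applies its cutoff differently and also estimates a false discovery rate. The sketch computes a t-like score per gene, rebuilds the scores under shuffled array labels, and flags genes whose observed score is far from the score expected at the same rank.

```python
import math
import random
from statistics import mean, variance


def d_statistic(x, y, s0):
    """t-like score with a small constant s0 added to the denominator."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    s = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / (s + s0)


def sam_sketch(data, labels, s0=0.1, delta=1.0, n_perm=200, seed=0):
    """data maps gene -> values per array; labels holds 0 or 1 per array."""
    rng = random.Random(seed)

    def all_scores(current_labels):
        scores = {}
        for gene, values in data.items():
            x = [v for v, g in zip(values, current_labels) if g == 1]
            y = [v for v, g in zip(values, current_labels) if g == 0]
            scores[gene] = d_statistic(x, y, s0)
        return scores

    observed = all_scores(labels)
    genes = sorted(observed, key=observed.get)  # genes ordered by observed score
    expected = [0.0] * len(genes)
    for _ in range(n_perm):
        permuted = labels[:]
        rng.shuffle(permuted)  # shuffling keeps the group sizes unchanged
        for rank, score in enumerate(sorted(all_scores(permuted).values())):
            expected[rank] += score / n_perm
    # Flag genes whose observed score departs from the expected score at that rank.
    return [g for g, e in zip(genes, expected) if abs(observed[g] - e) > delta]


# Toy data: three arrays per group; only the gene "up" really differs.
labels = [0, 0, 0, 1, 1, 1]
data = {
    "flat1": [2.0, 2.2, 1.8, 2.1, 1.9, 2.0],
    "flat2": [3.0, 3.1, 2.9, 3.0, 3.2, 2.8],
    "up": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],
}
significant = sam_sketch(data, labels, s0=0.1, delta=5.0)
print(significant)
```

Because whole arrays are relabelled at once, all genes are permuted together, which is how a permutation scheme of this kind keeps the correlation between genes intact.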

REFERENCES
[1] Ernst Wit, John D. McClure, Statistical design for microarrays, page 42, fig 3.4. ISBN 0-470-84993-2

[2], [3], [4] Dhammika Amaratunga, Javier Cabrera, Exploration and analysis of microarray and protein array data, page 102. ISBN 0-471-27398-8

[5] http://www.statisticssolutions.com/methods-chapter/statisticaltests/mann-whitney-u-test/

[6] http://en.wikipedia.org/wiki/Significance_analysis_of_microarrays with further reading at http://wwwstat.stanford.edu/~tibs/SAM/
