Penalized Likelihood Regression Approach for Quantitative Trait Loci Mapping From Samples with Related Individuals

Research Article

Austin Biom and Biostat. 2014;1(2): 7.

Penalized Likelihood Regression Approach for Quantitative Trait Loci Mapping From Samples with Related Individuals

Ku HC1 and Zhu L2*

1McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, USA

2Department of Statistics, Oklahoma State University, USA

*Corresponding author: Zhu L, Department of Statistics, Oklahoma State University, 301C MSCS Bldg, Stillwater, 74078, USA.

Received: November 11, 2014; Accepted: December 04, 2014; Published: December 11, 2014

Abstract

Identifying Quantitative Trait Loci (QTL) by association mapping is critical for understanding the genetic architecture of complex traits or diseases. Many statistical methods have been developed to locate genes and estimate the effects of these genes that are responsible for quantitative traits. Penalized maximum likelihood method is one of the powerful statistical tools for QTL mapping, especially in dealing with the problem of p >n, where p is the number of genetic effects and n is the sample size. Most methods derived from it are limited to analyzing single trait from samples with independent individuals. Genetic inheritable complex diseases usually affect family members and are expressed by multiple correlated traits. The purpose of this study is to develop a statistical method (penalized likelihood regression approach) to target QTL from samples in a general setting, that is, arbitrary related individuals, for both single and multiple traits. Simulation studies show that the proposed method has great performance in detecting QTL in both single- and two-trait scenarios with related and unrelated individuals.

Keywords: Quantitative trait loci; Penalized maximum likelihood; Related individuals; Genome-wide association; False discovery rate; Multiple-trait association mapping

Introduction

Identifying Quantitative Trait Loci (QTL) is critical for understanding the genetic architecture of complex diseases or inheritable traits. Thus, QTL mapping aims to locate genomic variants and estimate the effects of these variants that are responsible for quantitative traits. One of initial methods for QTL mapping is Single Marker Analysis (SMA) based on a simple regression model [1,2]. The basic concept of this method is to consider each marker individually and check if there is an association between the trait and a marker. SMA provides the valuable framework for QTL mapping since the model is simple and easy to be extended to the multiplemarker analysis. However, this method tends to underestimate QTL effects and is not powerful unless sample size is relatively large [3]. Moreover, this method may not be able to detect the accurate position of QTL, as it is unlikely that QTL is right at the marker position if the density of markers does not cover all variants in the genome [4]. Interval Mapping (IM) method, an extension of SMA, was introduced by Thoday [5] and mathematically developed by Lander and Botstein [3]. IM can estimate the location and effect of QTL between two flanking markers if only one QTL on a tested region is assumed. The estimated location and effect of QTL are likely biased since the test statistics may be affected by other putative QTL out of the tested region. For multiple QTL methods, an extended and improved version of IM, called Composite Interval Mapping (CIM) [6-9] that accounted markers outside of the tested interval, has been widely applied in practice. The main idea of CIM is to combine IM with multiple regression analysis to detect multiple QTL. Some markers outside of the tested region are selected as genetic background to increase the resolution of IM. However, model-selection is somewhat subjective, depending on what variables are included or excluded [10- 13]. Moreover, for SNP datasets, the number of SNPs (p) is usually larger than the number of individuals (n), which makes the QTL mapping more challenging.

Bayesian shrinkage methods have become important computational tools to overcome the problems in CIM. Xu [14] proposed a method called Bayesian analysis implemented via the Markov Chain Monte Carlo (MCMC). In this model, individuals were assumed independent. Each marker was treated as a putative QTL and included in the model as one variable. The variance of QTL effects was then assumed to be different across QTL as a prior parameter. To perform satisfactorily, a sample size of 600 independent individuals is suggested in the method of Bayesian analysis implemented via the MCMC [15]. In addition, the MCMC algorithm requires a large number of iteration to converge to the stationary distribution. Both large numbers of sample size and iteration require intensive computational time, which becomes a major concern for the method. To reduce the computational burden, Zhang and Xu [16] developed an extended method of the Bayesian analysis implemented via the MCMC, called the penalized maximum likelihood method. It is similar in spirit to the method proposed by Xu [14] in that both methods shrink the null marker effects to zero, where the null marker effects are defined as the effects of the markers that are truly not QTL. The key of this method is to impose a prior normal distribution on the effect of each marker as a penalty, allowing the penalty to vary across each marker.

Both Bayesian shrinkage methods consider all markers simultaneously and include as many QTL as the model can handle. However, these methods assume independence among individuals and can only deal with one trait at a time. In addition, these methods ignore the issue of multiple tests, which result in an increased rate of overall type I error (i.e., false positive). Studies in humans, animals, or plants may involve related individuals such as trio families and inbred pedigrees. Furthermore, complex traits usually have multiple phenotypic measurements and these traits may be correlated. Statistical methods for QTL mapping that considering the relatedness among individuals as well as multiple traits are still under development. In this study, we propose extended penalized maximum likelihood methods for single- and multiple-trait analysis on arbitrary related individuals and retain the feature of handling the p >n problem. Multiplicity issue is also considered by selecting a threshold of LOD score that controls FDR at 0.05.

Methods

Single trait analysis

Let Zi denote the quantitative trait of individual i in an arbitrary pedigree and express as a linear function of the genetic effects. Assuming no interaction among effects, the model is

z i = b 0 + j=1 p x ij b j + j=1 p w ij d j + e i i=1,2,...,n                         (1) [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOEamaaBaaaleaacaWGPbaabeaakiabg2da9iaadkgadaWgaaWcbaGaaGimaaqabaGccqGHRaWkdaaeWbqaaiaadIhadaWgaaWcbaGaamyAaiaadQgaaeqaaOGaamOyamaaBaaaleaacaWGQbaabeaaaeaacaWGQbGaeyypa0JaaGymaaqaaiaadchaa0GaeyyeIuoakiabgUcaRmaaqahabaGaam4DamaaBaaaleaacaWGPbGaamOAaaqabaGccaWGKbWaaSbaaSqaaiaadQgaaeqaaaqaaiaadQgacqGH9aqpcaaIXaaabaGaamiCaaqdcqGHris5aOGaey4kaSIaamyzamaaBaaaleaacaWGPbaabeaakiaabYcacaqGGaGaamyAaiabg2da9iaaigdacaGGSaGaaGOmaiaacYcacaGGUaGaaiOlaiaac6cacaGGSaGaamOBaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabcca[email protected][email protected]

where b0> is the overall mean; p is the total number of markers; xij and wij are dummy variables indicating the genotype of the jth marker for individual i and defined as

x ij = 2  and  w ij =1 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiEamaaBaaaleaacaWGPbGaamOAaaqabaGccqGH9aqpdaGcaaqaaiaaikdaaSqabaGccaqGGaGaaeyyaiaab6gacaqGKbGaaeiia[email protected][email protected] for genotype A1A1, xij=0 and wij=1 for genotype A1A2, and x ij = 2  and  w ij =1 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiEamaaBaaaleaacaWGPbGaamOAaaqabaGccqGH9aqpcqGHsisldaGcaaqaaiaaikdaaSqabaGccaqGGaGaaeyyaiaab6gacaqGKbGaaeiia[email protected][email protected]

for genotype A1A2, such that they have a zero expectation and a unity variance [14]; bj and dj are additive and dominant effects associated with marker j, respectively. We assume bj and dj are independent; and ei˜N(0,s2) are the random environmental effect.

In the matrix form, when all characteristics of n individuals are included, formula (1) expands to

Z=1b1+X*B+W*D+E*, (2)

where Z is an nx1 vector of the quantitative trait; 1 is an nx1 vector of 1s; X* and W* are nxp matrices of dummy variables; and B and D are px1 vectors of additive and dominant effects, respectively.

We propose to take the relationship of individuals in a pedigree into account in the penalized maximum likelihood method by considering the relatedness coefficient ωω, which is defined as two times the kinship coefficient. The kinship coefficients of any arbitrary pedigree can be calculated based on the relationships between individuals. For instance, the relatedness coefficient for parentoffspring relationship is 0.5, meaning that theoretically 50% of the offspring's genome comes from that parent. Thus, in formula (1), we assume that Cov(eu,ev)=s2ωuv where ωuv is the relatedness coefficient for individual u and v. The distribution of E* for unrelated individuals is assumed to follow a multivariate normal distribution with mean vector 0 and covariance matrix s2In, where In is the nxn identity matrix.

When we take the relatedness among individuals into account, we assume E*˜N(0, s2Ω), where [Ω]uvuv. Given the covariance matrix, it is possible to find a transformation matrix A [17] such that ATA=Ω-1 and the model then becomes Y=AZ=A1b0+AX*B+AW*D+AE*=Cb0+ XB+WD+E. Note that E˜N(0,s2In).

Suppose that θ = (b0,b1,...,bp,d1,...,dp,s2) is the vector of parameters of interest. Under the assumption of multivariate normality for the quantitative trait, the likelihood function of the pedigree is given by

L( θ )=ϕ( Y;β, σ 2 I n )= i=1 n ϕ ( Y i ; β i , σ 2 ) [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamitamaabmaabaGaeqiUdehacaGLOaGaayzkaaGaeyypa0Jaeqy1dy2aaeWaaeaacaWGzbGaai4oaiabek7aIjaacYcacqaHdpWCdaahaaWcbeqaaiaaikdaaaGccaWGjbWaaSbaaSqaaiaad6gaaeqaaaGccaGLOaGaayzkaaGaeyypa0ZaaebCaeaacqaHvpGzaSqaaiaadMgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHpis1aOWaaeWaaeaacaWGzbWaaSbaaSqaaiaadMgaaeqaaOGaai4oaiabek7aInaaBaaaleaacaWGPbaa[email protected][email protected]

where Φ is the normal density, β = Cb0+XB+WD, and

β i = c i b 0 + j=1 p x ij b j + j=1 p w ij d j [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqOSdi2aaSbaaSqaaiaadMgaaeqaaOGaeyypa0Jaam4yamaaBaaaleaacaWGPbaabeaakiaadkgadaWgaaWcbaGaaGimaaqabaGccqGHRaWkdaaeWbqaaiaadIhadaWgaaWcbaGaamyAaiaadQgaaeqaaOGaamOyamaaBaaaleaacaWGQbaabeaaaeaacaWGQbGaeyypa0JaaGymaaqaaiaadchaa0GaeyyeIuoakiabgUcaRmaaqahabaGaam4DamaaBaaaleaacaWGPbGaamOAaaqabaGccaWGKbWaaSbaaSqaa[email protected][email protected]

the main idea of penalty in the penalized maximum likelihood method is to have prior densities of the parameters, that is, hyper parameters, in the Bayesian framework. Since b0 and s2 are always in the model, their inclusion should not be penalized [16].

In this study, prior densities of parameters are defined similarly as Xu [14]. Assume that additive and dominant effects are normally distributed, then bj˜N(μbj,s2 bj) and dj˜N(μdj,s2 dj) for j = 1,..., p. The hyper parameters μbj, μdj, s2 bj and s2 dj in the prior distributions are very important in the oversaturated model, from the experience of Zhang and Xu [16] these parameters should be estimated from the data by assigning prior distributions to μbj and μdj such that μbj˜N(0, s2 bj /?) and μdj˜N(0, s2 dj /?) for j = 1,..., p, where ? is a positive prior value for accessing μbj and μdj. It is useful in the shrinking process because it controls the convergence rate.

Now suppose that ξ=(μb1,...,μbpd1,...,μdp,s2 b1,...,s2 bp,s2 d1,...,s2 dp) is the vector of the hyper parameters of interest in the prior distribution. The prior density is

P( θ,ξ )= j=1 p [ ϕ( b j ; μ b j , σ b j 2 )ϕ( d j ; μ d j , σ d j 2 )ϕ( μ b j ;0, σ b j 2 /η )ϕ( μ d j ;0, σ d j 2 /η ) ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaabmaabaGaeqiUdeNaaiilaiabe67a4bGaayjkaiaawMcaaiabg2da9maarahabaWaamWaaeaacqaHvpGzdaqadaqaaiaadkgadaWgaaWcbaGaamOAaaqabaGccaGG7aGaeqiVd02aaSbaaSqaaiaadkgadaWgaaadbaGaamOAaaqabaaaleqaaOGaaiilaiabeo8aZnaaDaaaleaacaWGIbWaaSbaaWqaaiaadQgaaeqaaaWcbaGaaGOmaaaaaOGaayjkaiaawMcaaiabew9aMnaabmaabaGaamizamaaBaaaleaacaWGQbaabeaakiaacUdacqaH8oqBdaWgaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqabaGccaGGSaGaeq4Wdm3aa0baaSqaaiaadsgadaWgaaadbaGaamOAaaqabaaaleaacaaIYaaaaaGccaGLOaGaayzkaaGaeqy1dy2aaeWaaeaacqaH8oqBdaWgaaWcbaGaamOyamaaBaaameaacaWGQbaabeaaaSqabaGccaGG7aGaaGimaiaacYcacqaHdpWCdaqhaaWcbaGaamOyamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaGccaGGVaGaeq4TdGgacaGLOaGaayzkaaGaeqy1dy2aaeWaaeaacqaH8oqBdaWgaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqabaGccaGG7aGaaGimaiaacYcacqaHdpWCdaqhaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaGccaGGVaGaeq4TdGgacaGLOaGaayzkaaaacaGL[email protected][email protected]

and the penalized likelihood function is ψ(?,?)=L(?)P(?,?). The parameters in the penalized likelihood function are estimated by taking the derivative of logψ(?,?) with respect to ? and ? and then set the derivatives equal to zero. The solutions (PMLE) are performed by an iterative algorithm in the following steps.

Step 1. Initialization: set ?>0 and initialize ? and ? values

Step 2. Updating b0: b 0 = ( i=1 n c i 2 ) 1 [ i=1 n c i ( y i j=1 p x ij b j j=1 p w ij d j ) ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOyamaaBaaaleaacaaIWaaabeaakiabg2da9maabmaabaWaaabCaeaacaWGJbWaa0baaSqaaiaadMgaaeaacaaIYaaaaaqaaiaadMgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aaGccaGLOaGaayzkaaWaaWbaaSqabeaacqGHsislcaaIXaaaaOWaamWaaeaadaaeWbqaaiaadogadaWgaaWcbaGaamyAaaqabaaabaGaamyAaiabg2da9iaaigdaaeaacaWGUbaaniabggHiLdGcdaqadaqaaiaadMhadaWgaaWcbaGaamyAaaqabaGccqGHsisldaaeWbqaaiaadIhadaWgaaWcbaGaamyAaiaadQgaaeqaaOGaamOyamaaBaaaleaacaWGQbaabeaaaeaacaWGQbGaeyypa0JaaGymaaqaaiaadchaa0GaeyyeIuoakiabgkHiTmaaqahabaGaam4DamaaBaaaleaacaWGPbGaamOAaaqabaGccaWGKbWaaSbaaSqaaiaadQgaaeqaaaqaaiaadQgacqGH9aqpcaa[email protected][email protected]

Step 3. Updating bj: b j = ( i=1 n x i 2 + σ 2 σ b j 2 ) 1 [ i=1 n x ij ( y i c i b 0 k1 p x ik b k j=1 p w ij d j )+ σ 2 σ b j 2 μ b j ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOyamaaBaaaleaacaWGQbaabeaakiabg2da9maabmaabaWaaabCaeaacaWG4bWaa0baaSqaaiaadMgaaeaacaaIYaaaaaqaaiaadMgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aOGaey4kaSYaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaamOyamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaaaaaGccaGLOaGaayzkaaWaaWbaaSqabeaacqGHsislcaaIXaaaaOWaamWaaeaadaaeWbqaaiaadIhadaWgaaWcbaGaamyAaiaadQgaaeqaaOWaaeWaaeaacaWG5bWaaSbaaSqaaiaadMgaaeqaaOGaeyOeI0Iaam4yamaaBaaaleaacaWGPbaabeaakiaadkgadaWgaaWcbaGaaGimaaqabaGccqGHsisldaaeWbqaaiaadIhadaWgaaWcbaGaamyAaiaadUgaaeqaaOGaamOyamaaBaaaleaacaWGRbaabeaaaeaacaWGRbGaeyiyIKRaaGymaaqaaiaadchaa0GaeyyeIuoakiabgkHiTmaaqahabaGaam4DamaaBaaaleaacaWGPbGaamOAaaqabaGccaWGKbWaaSbaaSqaaiaadQgaaeqaaaqaaiaadQgacqGH9aqpcaaIXaaabaGaamiCaaqdcqGHris5aaGccaGLOaGaayzkaaGaey4kaSYaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaamOyamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaaaaOGaeqiVd02aaSbaaSqaaiaadkgadaWgaaadbaGaamOAaaqabaaaleqaaaqaaia[email protected][email protected]

Step 4. Updating dj: d j = ( i=1 n w i 2 + σ 2 σ d j 2 ) 1 [ i=1 n w ij ( y i c i b 0 j=1 p x ij b j k1 p w ik d k )+ σ 2 σ d j 2 μ d j ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamizamaaBaaaleaacaWGQbaabeaakiabg2da9maabmaabaWaaabCaeaacaWG3bWaa0baaSqaaiaadMgaaeaacaaIYaaaaaqaaiaadMgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aOGaey4kaSYaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaaaaaGccaGLOaGaayzkaaWaaWbaaSqabeaacqGHsislcaaIXaaaaOWaamWaaeaadaaeWbqaaiaadEhadaWgaaWcbaGaamyAaiaadQgaaeqaaOWaaeWaaeaacaWG5bWaaSbaaSqaaiaadMgaaeqaaOGaeyOeI0Iaam4yamaaBaaaleaacaWGPbaabeaakiaadkgadaWgaaWcbaGaaGimaaqabaGccqGHsisldaaeWbqaaiaadIhadaWgaaWcbaGaamyAaiaadQgaaeqaaOGaamOyamaaBaaaleaacaWGQbaabeaaaeaacaWGQbGaeyypa0JaaGymaaqaaiaadchaa0GaeyyeIuoakiabgkHiTmaaqahabaGaam4DamaaBaaaleaacaWGPbGaam4AaaqabaGccaWGKbWaaSbaaSqaaiaadUgaaeqaaaqaaiaadUgacqGHGjsUcaaIXaaabaGaamiCaaqdcqGHris5aaGccaGLOaGaayzkaaGaey4kaSYaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaaaaOGaeqiVd02aaSbaaSqaaiaadsgadaWgaaadbaGaamOAaaqabaaaleqaaaqaaia[email protected][email protected]

Step 5. Updating s2: σ 2 = i=1 n ( y i c i b 0 j=1 p x ij b j j=1 p w ij d j ) 2 n [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeq4Wdm3aaWbaaSqabeaacaaIYaaaaOGaeyypa0ZaaSaaaeaadaaeWbqaamaabmaabaGaamyEamaaBaaaleaacaWGPbaabeaakiabgkHiTiaadogadaWgaaWcbaGaamyAaaqabaGccaWGIbWaaSbaaSqaaiaaicdaaeqaaOGaeyOeI0YaaabCaeaacaWG4bWaaSbaaSqaaiaadMgacaWGQbaabeaakiaadkgadaWgaaWcbaGaamOAaaqabaaabaGaamOAaiabg2da9iaaigdaaeaacaWGWbaaniabggHiLdGccqGHsisldaaeWbqaaiaadEhadaWgaaWcbaGaamyAaiaadQgaaeqaaOGaamizamaaBaaaleaacaWGQbaabeaaaeaacaWGQbGaeyypa0JaaGymaaqaaiaadchaa0GaeyyeIuoaaOGaayjkaiaawMcaamaaCaaaleqabaGaaGOmaaaaaea[email protected][email protected]

Step 6. Updating μbj: μ b j = b j η+1 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiVd02aaSbaaSqaaiaadkgadaWgaaadbaGaamOAaaqabaaaleqaaOGaeyypa0ZaaSaaaeaa[email protected][email protected]

Step 7. Updating μdj: μ d j = d j η+1 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiVd02aaSbaaSqaaiaadsgadaWgaaadbaGaamOAaaqabaaaleqaaOGaeyypa0ZaaSaaaeaa[email protected][email protected]

Step 7. Updating μdj: σ d j 2 = ( d j μ d j ) 2 +η μ d j 2 2 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeq4Wdm3aa0baaSqaaiaadsgadaWgaaadbaGaamOAaaqabaaaleaacaaIYaaaaOGaeyypa0ZaaSaaaeaadaqadaqaaiaadsgadaWgaaWcbaGaamOAaaqabaGccqGHsislcqaH8oqBdaWgaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqabaaakiaawIcacaGLPaaadaahaaWcbeqaaiaaikdaaaGccqGHRaWkcqaH3oaAcqaH8oqBdaqhaaWcb[email protected][email protected]

Step 8. Updating s2bj: σ b j 2 = ( b j μ b j ) 2 +η μ b j 2 2 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeq4Wdm3aa0baaSqaaiaadkgadaWgaaadbaGaamOAaaqabaaaleaacaaIYaaaaOGaeyypa0ZaaSaaaeaadaqadaqaaiaadkgadaWgaaWcbaGaamOAaaqabaGccqGHsislcqaH8oqBdaWgaaWcbaGaamOyamaaBaaameaacaWGQbaabeaaaSqabaaakiaawIcacaGLPaaadaahaaWcbeqaaiaaikdaaaGccqGHRaWkcqaH3oaAcqaH8oqBdaqhaaWcb[email protected][email protected]

Step 9. Updating s2dj: σ d j 2 = ( d j μ d j ) 2 +η μ d j 2 2 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeq4Wdm3aa0baaSqaaiaadsgadaWgaaadbaGaamOAaaqabaaaleaacaaIYaaaaOGaeyypa0ZaaSaaaeaadaqadaqaaiaadsgadaWgaaWcbaGaamOAaaqabaGccqGHsislcqaH8oqBdaWgaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqabaaakiaawIcacaGLPaaadaahaaWcbeqaaiaaikdaaaGccqGHRaWkcqaH3oaAcqaH8oqBdaqhaaWcb[email protected][email protected]

Step 10. Repeat 2-9 until a certain criterion of convergence is satisfied. The terms σ 2 σ s 2  and  σ 2 σ s 2 μ s [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaam4CaaqaaiaaikdaaaaaaOGaaeiiaiaabggacaqGUbGaaeizaiaabccadaWcaaqaaiabeo8aZnaaCaaaleqabaGaaGOmaaaaaOqaaiabeo8aZnaaDaa[email protected][email protected] for s=bj or s=dj, in Steps 3 and 4 are important in the iterative algorithm. s2 s's are defined in Steps 8 and 9 as the average of squared deviance and a squared mean effect multiplied by a constant. If s2 s is large (large effect), the estimated additive effect (bj) or dominant effect (dj) is expected to be unaffected (i.e. no shrinkage). However, if s2 sis small (small effect or no effect), the estimates will be shrunk towards zero.

Theoretically non-QTL effects are shrunk to zero whereas QTL with effects subject to no shrinkage. This makes the signals of QTL very clear. The estimated additive and dominant effects should be visualized by plotting the estimated effects over all markers along the genome. To ensure that the estimated effects are significant, a likelihood ratio test can be performed. Due to over parameterization, the usual likelihood ratio test is not appropriate. Therefore, we follow a two-stage process proposed by Zhang and Xu [16]. In the first stage, markers that have estimated effects | b ^ j |/ σ ^ > 10 6  or | d ^ j |/ σ ^ > 10 6 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaqWaaeaaceWGIbGbaKaadaWgaaWcbaGaamOAaaqabaaakiaawEa7caGLiWoacaGGVaGafq4WdmNbaKaacqGH+aGpcaaIXaGaaGimamaaCaaaleqabaGaeyOeI0IaaGOnaaaakiaabccacaqGVbGaaeOCaiaabccadaabdaqaaiqadsgagaqcamaaBaaaleaacaWGQbaabeaaaOGaay5bSlaawIa7aiaac+cacuaHd[email protected][email protected] are selected for the second stage of analysis. This is good because biologically we are not interested in the effects that are relatively small. In the second stage of analysis, since the dimension of markers is greatly reduced, a likelihood ratio test can be performed on the markers that have passed the first round of selection. To test the null hypothesis of no additive or dominant effects, we apply the LOD LO D j = log 10 ( L( θ ^ ) L( θ ^ j ) )= L R j 2ln10              (3) [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamitaiaad+eacaWGebWaaSbaaSqaaiaadQgaaeqaaOGaeyypa0JaciiBaiaac+gacaGGNbWaaSbaaSqaaiaaigdacaaIWaaabeaakmaabmaabaWaaSaaaeaacaWGmbWaaeWaaeaacuaH4oqCgaqcaaGaayjkaiaawMcaaaqaaiaadYeadaqadaqaaiqbeI7aXzaajaWaaSbaaSqaaiabgkHiTiaadQgaaeqaaaGccaGLOaGaayzkaaaaaaGaayjkaiaawMcaaiabg2da9maalaaabaGaamitaiaadkfadaWgaaWcbaGaamOAaaqabaaakeaacaaIYaGaciiBaiaac6gacaaIXaGaaGimaaaacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabccacaqGGaG[email protected][email protected] score test [3],

where j is the index of the marker after the first round of selection,

L R j =2ln( L( θ ^ j ) L( θ ^ ) ), L( θ ^ j ) [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamitaiaadkfadaWgaaWcbaGaamOAaaqabaGccqGH9aqpcqGHsislcaaIYaGaciiBaiaac6gadaqadaqaamaalaaabaGaamitamaabmaabaGafqiUdeNbaKaadaWgaaWcbaGaeyOeI0IaamOAaaqabaaakiaawIcacaGLPaaaaeaacaWGmbWaaeWaaeaacuaH4oqCgaqcaaGaayjkaiaawMcaaaaaaiaawIcacaGLPaaacaGGSaGaaeiiaiaadYeadaqadaqaaiq[email protected][email protected] is the likelihood under the null hypothesis, and L(?) is the likelihood without restriction on the parameters. The null hypothesis is rejected if LODj exceeds a threshold that controls the FDR at 0.05.

Multiple traits analysis

For studies that involve multiple traits (e.g. in clinics), singletrait model will simply ignore the correlation among these traits and detect QTL for each trait separately. However, complex traits, especially for complex diseases, these traits may be correlated. To analyze correlated traits in one model, we expect to gain more statistical power in detecting QTL. In this study, we also propose an algorithm that incorporates the correlation between traits in the model. This algorithm can be easily extended to a model with more than two traits. The definitions of notations in the two-trait model are similar to those in the single-trait model, except now they are 2x1 vectors. We distinguish notations for the two-trait model from the single-trait model by having underscores on matrices for the two-trait model. That is, let Zi=[Zi (1),Zi (2)]T be a 2x1 vector of quantitative traits (1) and (2) of individual i in a pedigree. The model is z ¯ i = 1 ¯ b 0 + j=1 p x ¯ ij b j + j=1 p w ¯ ij d j + e ¯ i      (4) [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaWaaaeaacaWG6baaamaaBaaaleaacaWGPbaabeaakiabg2da9maamaaabaGaaGymaaaacaWGIbWaaSbaaSqaaiaaicdaaeqaaOGaey4kaSYaaabCaeaadaadaaqaaiaadIhaaaWaaSbaaSqaaiaadMgacaWGQbaabeaakiaadkgadaWgaaWcbaGaamOAaaqabaaabaGaamOAaiabg2da9iaaigdaaeaacaWGWbaaniabggHiLdGccqGHRaWkdaaeWbqaamaamaaabaGaam4DaaaadaWgaaWcbaGaamyAaiaadQgaaeqaaOGaamizamaaBaaaleaacaWGQbaabeaakiabgUcaRmaamaaabaGaamyzaaaadaWgaaWcbaGaamyAaaqabaaabaGaamOAaiabg2da9iaaigdaaeaacaWGWbaaniabg[email protected][email protected] where 1 is a 2x1 vector of 1s; xij and wij are 2x1 vectors of dummy variables; bj and dj are additive and dominant effects associated with maker j, respectively; and ei is a 2x1 vector of the random environmental effect. Note that ei ˜N(0,s2R), R is a 2x2 correlation matrix with correlation coefficient r between two quantitative traits on the off- diagonal. In the matrix form, the statistical model can be expressed as Z=1b0+X*B+W*D+E, where W˜N(0,(Ω⊗s2R)) and ⊗ is the Kronecker product. Similar to the single-trait model, we use A as a transformation matrix such that Y=(A⊗I2) Z=Cb0+XB+WD+E and E˜N(0,(In⊗s2R)).

The likelihood function is

L ¯ ( θ )=ϕ( Y ¯ ;β,( I n σ 2 R ) )= i=1 n ϕ( Y ¯ i ; β i , σ 2 R ) [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaWaaaeaacaWGmbaaamaabmaabaGaeqiUdehacaGLOaGaayzkaaGaeyypa0Jaeqy1dy2aaeWaaeaadaadaaqaaiaadMfaaaGaai4oaiabek7aIjaacYcadaqadaqaaiaadMeadaWgaaWcbaGaamOBaaqabaGccqGHxkcXcqaHdpWCdaahaaWcbeqaaiaaikdaaaGccaWGsbaacaGLOaGaayzkaaaacaGLOaGaayzkaaGaeyypa0ZaaebCaeaacqaHvpGzdaqadaqaamaamaaabaGaamywaaaadaWgaaWcbaGaamyAaaqabaGccaGG7aGaeqOSdi2aaSbaaSqaaiaadMgaaeqaaOGaaiilaiabeo8aZnaaCaaaleqabaGaaGOmaaaakiaadkfaaiaaw[email protected][email protected]

where now ?=(b0,b1,...,bp,d1,...,dp,r,s2), β=Cb0+XB+WD and i=Cibi+XiB+WiD. Note that b0 r, and s2 are not penalized. The prior density is the same as that in the single-trait analysis. The penalized log likelihood function is ψ(?,?)=L(?)P(?,?). The derivation of the maximum likelihood estimates for two-trait model is similar to the single-trait model with an additional step of updating r, the coefficient of correlation between two traits.

Step 1. Initialization: set ?>0 and initialize ? and ? values.

Step 2. Updating b0: b 0 = ( i=1 n C ¯ i T R 1 C ¯ i ) 1 [ i=1 n C ¯ i T R 1 ( Y ¯ i X ¯ i B W ¯ i D ) ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOyamaaBaaaleaacaaIWaaabeaakiabg2da9maabmaabaWaaabCaeaadaadaaqaaiaadoeaaaWaa0baaSqaaiaadMgaaeaacaWGubaaaOGaamOuamaaCaaaleqabaGaeyOeI0IaaGymaaaakmaamaaabaGaam4qaaaadaWgaaWcbaGaamyAaaqabaaabaGaamyAaiabg2da9iaaigdaaeaacaWGUbaaniabggHiLdaakiaawIcacaGLPaaadaahaaWcbeqaaiabgkHiTiaaigdaaaGcdaWadaqaamaaqahabaWaaWaaaeaacaWGdbaaamaaDaaaleaacaWGPbaabaGaamivaaaakiaadkfadaahaaWcbeqaaiabgkHiTiaaigdaaaGcdaqadaqaamaamaaabaGaamywaaaadaWgaaWcbaGaamyAaaqabaGccqGHsisldaadaaqaaiaadIfaaaWaaSbaaSqaaiaadMgaaeqaaOGaamOqaiabgkHiTmaamaaabaGaam4vaaaadaWgaaWcbaGaamyAaaqabaGccaWGebaacaGLOaGaayzkaaaaleaac[email protected][email protected]

Step 3. Updating bj: b j = ( i=1 n X ¯ ij T R 1 X ¯ ij + σ 2 σ b j 2 ) 1 [ i=1 n X ¯ ij T R 1 ( Y ¯ i C ¯ i b 0 X ¯ i(j) B j W ¯ i D) + σ 2 σ b j 2 μ b j ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOyamaaBaaaleaacaWGQbaabeaakiabg2da9maabmaabaWaaabCaeaadaadaaqaaiaadIfaaaWaa0baaSqaaiaadMgacaWGQbaabaGaamivaaaakiaadkfadaahaaWcbeqaaiabgkHiTiaaigdaaaGcdaadaaqaaiaadIfaaaWaaSbaaSqaaiaadMgacaWGQbaabeaaaeaacaWGPbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoakiabgUcaRmaalaaabaGaeq4Wdm3aaWbaaSqabeaacaaIYaaaaaGcbaGaeq4Wdm3aa0baaSqaaiaadkgadaWgaaadbaGaamOAaaqabaaaleaacaaIYaaaaaaaaOGaayjkaiaawMcaamaaCaaaleqabaGaeyOeI0IaaGymaaaakmaadmaabaWaaabCaeaadaadaaqaaiaadIfaaaWaa0baaSqaaiaadMgacaWGQbaabaGaamivaaaakiaadkfadaahaaWcbeqaaiabgkHiTiaaigdaaaGccaGGOaWaaWaaaeaacaWGzbaaamaaBaaaleaacaWGPbaabeaakiabgkHiTmaamaaabaGaam4qaaaadaWgaaWcbaGaamyAaaqabaGccaWGIbWaaSbaaSqaaiaaicdaaeqaaOGaeyOeI0YaaWaaaeaacaWGybaaamaaBaaaleaacaWGPbGaaiikaiabgkHiTiaadQgacaGGPaaabeaakiaadkeadaWgaaWcbaGaeyOeI0IaamOAaaqabaGccqGHsisldaadaaqaaiaadEfaaaWaaSbaaSqaaiaadMgaaeqaaOGaamiraiaacMcaaSqaaiaadMgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aOGaey4kaSYaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaamOyamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaaaaOGaeqiVd02aaSb[email protected][email protected]

Step 4. Updating dj: d j = ( i=1 n W ¯ ij T R 1 W ¯ ij + σ 2 σ d j 2 ) 1 [ i=1 n W ¯ ij T R 1 ( Y ¯ i C ¯ i b 0 X ¯ i B W ¯ i( j ) D j )+ σ 2 σ d j 2 μ d j ] [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamizamaaBaaaleaacaWGQbaabeaakiabg2da9maabmaabaWaaabCaeaadaadaaqaaiaadEfaaaWaa0baaSqaaiaadMgacaWGQbaabaGaamivaaaakiaadkfadaahaaWcbeqaaiabgkHiTiaaigdaaaGcdaadaaqaaiaadEfaaaWaaSbaaSqaaiaadMgacaWGQbaabeaakiabgUcaRmaalaaabaGaeq4Wdm3aaWbaaSqabeaacaaIYaaaaaGcbaGaeq4Wdm3aa0baaSqaaiaadsgadaWgaaadbaGaamOAaaqabaaaleaacaaIYaaaaaaaaeaacaWGPbGaeyypa0JaaGymaaqaaiaad6gaa0GaeyyeIuoaaOGaayjkaiaawMcaamaaCaaaleqabaGaeyOeI0IaaGymaaaakmaadmaabaWaaabCaeaadaadaaqaaiaadEfaaaWaa0baaSqaaiaadMgacaWGQbaabaGaamivaaaakiaadkfadaahaaWcbeqaaiabgkHiTiaaigdaaaGcdaqadaqaamaamaaabaGaamywaaaadaWgaaWcbaGaamyAaaqabaGccqGHsisldaadaaqaaiaadoeaaaWaaSbaaSqaaiaadMgaaeqaaOGaamOyamaaBaaaleaacaaIWaaabeaakiabgkHiTmaamaaabaGaamiwaaaadaWgaaWcbaGaamyAaaqabaGccaWGcbGaeyOeI0YaaWaaaeaacaWGxbaaamaaBaaaleaacaWGPbWaaeWaaeaacqGHsislcaWGQbaacaGLOaGaayzkaaaabeaakiaadseadaWgaaWcbaGaeyOeI0IaamOAaaqabaaakiaawIcacaGLPaaacqGHRaWkaSqaaiaadMgacqGH9aqpcaaIXaaabaGaamOBaaqdcqGHris5aOWaaSaaaeaacqaHdpWCdaahaaWcbeqaaiaaikdaaaaakeaacqaHdpWCdaqhaaWcbaGaamizamaaBaaameaacaWGQbaabeaaaSqaaiaaikdaaaaaaOGaeqiVd02aaSb[email protected][email protected]

Step 5. Updating r:

E(1)=Y(1)-C(1)b0-X(1)B-W(1)D E(2)=Y(2)-C(2)b0-X(2)B-W(2)D r=corr(E(1),E(2)).

Step 6. Updating s2:

σ 2 = i=1 n ( Y ¯ i C ¯ i b 0 X ¯ i B W ¯ i D ) T R 1 ( Y ¯ i C ¯ i b 0 X ¯ i B W ¯ i D ) 2n [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeq4Wdm3aaWbaaSqabeaacaaIYaaaaOGaeyypa0ZaaSaaaeaadaaeWbqaamaabmaabaWaaWaaaeaacaWGzbaaamaaBaaaleaacaWGPbaabeaakiabgkHiTmaamaaabaGaam4qaaaadaWgaaWcbaGaamyAaaqabaGccaWGIbWaaSbaaSqaaiaaicdaaeqaaOGaeyOeI0YaaWaaaeaacaWGybaaamaaBaaaleaacaWGPbaabeaakiaadkeacqGHsisldaadaaqaaiaadEfaaaWaaSbaaSqaaiaadMgaaeqaaOGaamiraaGaayjkaiaawMcaamaaCaaaleqabaGaamivaaaakiaadkfadaahaaWcbeqaaiabgkHiTiaaigdaaaGcdaqadaqaamaamaaabaGaamywaaaadaWgaaWcbaGaamyAaaqabaGccqGHsisldaadaaqaaiaadoeaaaWaaSbaaSqaaiaadMgaaeqaaOGaamOyamaaBaaaleaacaaIWaaabeaakiabgkHiTmaamaaabaGaamiwaaaadaWgaaWcbaGaamyAaaqabaGccaWGcbGaeyOeI0YaaWaaaeaacaWGxbaaamaaBaaaleaacaWGPbaabeaakiaadseaaiaawIcacaGLPaaaaSqaaiaadMg[email protected][email protected]

Hyperparameters for two-trait model in Steps 7 to 10 are the same as Steps 6 to 9 in single-trait model.

Step 11. Repeat steps 2-10 until a certain criterion of convergence is satisfied.

We first choose candidate QTL with either | b ^ j |/ | σ ^ 2 R ^ | > 10 6  or | d ^ j |/ | σ ^ 2 R ^ | > 10 6 [email protected]@[email protected]@+=feaaguart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLnhiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=xfr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaqWaaeaaceWGIbGbaKaadaWgaaWcbaGaamOAaaqabaaakiaawEa7caGLiWoacaGGVaWaaOaaaeaadaabdaqaaiqbeo8aZzaajaWaaWbaaSqabeaacaaIYaaaaOGabmOuayaajaaacaGLhWUaayjcSdaaleqaaOGaeyOpa4JaaGymaiaaicdadaahaaWcbeqaaiabgkHiTiaaiAdaaaGccaqGGaGaae4BaiaabkhacaqGGaWaaqWaaeaaceWGKbGbaKaadaWgaaWcbaGaamOAaaqabaaakiaawEa7caGLiWoacaGGVaWaaOaaaeaadaabdaqaaiqbeo8aZzaajaWaaWbaaSqabeaacaaIYaaaaOGabmOuayaajaaacaGLhWUaayjcSda[email protected][email protected] in the first stage of analysis. In the second stage, we perform the likelihood ratio test with markers that have passed the first stage of selection. The LOD score test in formula (3) is then applied to check if there is significant additive or dominant effect at a given marker.

Controlling false discovery rate

Multiplicity issue is an important problem in testing many hypotheses simultaneously. In this study, we first took the commonly used threshold, LOD = 3, proposed by Morton [18], and found that 4 out of 24 scenarios that we explored in the single trait analysis and 13 out of 36 scenarios that we explored in the two-trait analysis had the Monte Carlo estimated FDRs greater than 0.05. We then tried the threshold of LOD = 3.3 as suggested by Lander and Kruglyak [19] and found that 3 out of 24 scenarios in the single trait analysis and 2 out 36 scenarios in the two-trait analysis with the estimated FDRs exceeded 0.05. Finally, when we used LOD = 3.5 as the threshold, the estimated FDRs in all scenarios in both single- and two-trait analysis are controlled at 0.05.

Simulation studies

We conducted computer simulations using Matlab software [20] to investigate the performance of the proposed methods. For singletrait model, 201 markers were simulated by SimPed program [21] with two levels of sample sizes, n = {150,300}. The Minor Allele Frequency (MAF) across all markers is assumed to be uniformly distributed, MAF ˜ Unif (0.1, 0.5). Markers were evenly spaced with 1cM between two adjacent markers and each marker was assumed to be associated with two parameters, an additive and a dominant effect. The number of parameters in the model (402 in total) was larger than the number of individuals (n). We explored and compared the performance of our proposed method at three levels of heritability (h2 = 0.4, 0.6, and 0.8) and four pedigree structures (I (a), II (b), III (a+b+c), and IV (d)), where pedigree structure IV could be considered as a family with inbred individuals that is commonly seen in animals or plants. These pedigree structures are illustrated in Figure 1.