VSNi Team
10 July 2024
In genetic analyses, incorporating pedigree information in models plays a crucial role in accurately estimating the additive genetic effects of a group of individuals for the trait(s) under study. Indeed, in a model without pedigree information, only direct progeny contribute to the estimation of these effects for the parents. However, when pedigree information is incorporated, every relative contributes to the estimation of an individual’s effect, with a weight that depends on how closely the two are related. This methodology enables more precise estimates, which in turn allows us to make better-informed breeding decisions. Commonly used models that benefit from this type of information are the Animal Model and the Parental Model.
Both the Animal and Parental models are linear mixed models that include fixed effects (e.g., year, herd, management practice, and gender) and random effects (e.g., genotype, blocking structure, pen, and residual). They also incorporate a numerator relationship matrix (A), derived from the pedigree information, which is used when solving Henderson’s well-known mixed model equations to estimate the fixed effects (BLUEs) and predict the random effects (BLUPs). In this way, the additive genetic effects are separated from unwanted systematic effects and background noise.
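For reference, the mixed model equations mentioned above can be written out for the simplest Animal Model. This is a sketch in common textbook notation, not taken from this article: the symbols y, X, Z, b, u, and λ are assumptions introduced here, with a single additive random effect and independent residuals of constant variance.

```latex
% Model: y = Xb + Zu + e, with u ~ N(0, A \sigma_a^2) and e ~ N(0, I \sigma_e^2)
\begin{equation}
\begin{bmatrix}
X^{\top}X & X^{\top}Z \\
Z^{\top}X & Z^{\top}Z + \lambda A^{-1}
\end{bmatrix}
\begin{bmatrix}
\hat{b} \\ \hat{u}
\end{bmatrix}
=
\begin{bmatrix}
X^{\top}y \\ Z^{\top}y
\end{bmatrix},
\qquad
\lambda = \frac{\sigma_e^2}{\sigma_a^2}
\end{equation}
```

Here the solution for b gives the BLUEs of the fixed effects and the solution for u gives the BLUPs of the additive genetic effects. Note that only the inverse of A enters the system, not A itself.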
The main difference between the two types of models is that the Animal Model estimates breeding values for all the individuals in the pedigree (e.g., grand-parents, parents, and offspring), while the Parental Model estimates the general combining abilities (GCAs, half of the breeding values) of the parents (and grand-parents if included in the pedigree).
For further reading on these models, take a look at the following blog by Dr. Valérie Poupon: “Parental versus Animal Model: What is the difference and how do we choose?”.
In a model where a set of genotypes is studied (e.g., lines or varieties in a plant breeding program), without parental or pedigree information, the genetic estimates for these genotypes are referred to as “genotypic values” or “total genetic values” (TGVs). These values estimate, for a given individual, the summed effect of both the additive and non-additive genetic effects.
An Animal Model, as mentioned above, necessarily includes the pedigree information and will provide, in contrast, breeding value estimates. Breeding values correspond to what parents pass on to their progeny, assuming that they are crossed with parents of equal genetic worth. In other words, a breeding value quantifies the average effect of an individual’s alleles on its progeny.
A Parental Model, with or without pedigree information, will always provide a GCA for each parent, which, as mentioned earlier, is equal to half of the breeding value. A GCA represents the deviation of the mean value of the progeny of a parent from the population mean. Note that a Parental Model without a numerator relationship matrix still requires parental information, which computationally is the equivalent of a two-generation pedigree (parents and progeny, which are included in the dataset to fit the model).
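The relationship between the two quantities can be summarized in a short expression. The notation below is an assumption introduced for illustration (μ for the population mean, i and j for two parents), and the second line ignores specific combining ability and other non-additive effects.

```latex
\begin{align}
\mathrm{GCA}_i &= \tfrac{1}{2}\,\mathrm{BV}_i \\
\mathrm{E}\!\left[\bar{y}_{i \times j}\right] &\approx \mu + \mathrm{GCA}_i + \mathrm{GCA}_j
  = \mu + \tfrac{1}{2}\left(\mathrm{BV}_i + \mathrm{BV}_j\right)
\end{align}
```

Under these assumptions, the expected mean of the progeny from a cross is the population mean plus the sum of the two parental GCAs.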
Both the breeding values and GCAs are estimates of the additive genetic effects, and thus provide information on the individuals’ genetic worth for selection in breeding programs. TGVs also provide relevant information; however, their calculation does not incorporate relatives’ information, and in addition, the non-additive effects they contain, such as dominance and epistasis, cannot be passed on to the progeny. Hence, TGVs are less useful for selecting future parents, as they give a biased picture of what a parent will actually transmit.
Heritability, with values ranging from 0 to 1, is an extremely important statistic in breeding as it quantifies the relative importance of the genetic effects on the phenotype. A value close to 0 indicates that the phenotype is largely not under genetic control. In this case, selection has almost no effect, and thus, it will not lead to better offspring. However, a higher value of heritability indicates that relevant potential genetic gains can be achieved in a breeding program.
We need to differentiate between “broad-sense heritability” and “narrow-sense heritability”. The former is the ratio of the total genetic variance (additive and non-additive) to the phenotypic variance. In contrast, the latter is the ratio of only the additive genetic variance to the phenotypic variance.
In a simplified situation, where a parent has only one offspring and there is no information on any other relatives, the predicted breeding value of that offspring corresponds to the narrow-sense heritability multiplied by the deviation of its phenotypic value from the population mean.
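The two definitions can be written compactly. The symbols below are assumptions chosen for illustration (they do not appear in the text), and the decomposition assumes no covariance or interaction between genetic and environmental effects.

```latex
\begin{align}
\sigma_P^2 &= \sigma_A^2 + \sigma_{NA}^2 + \sigma_E^2 \\
H^2 &= \frac{\sigma_A^2 + \sigma_{NA}^2}{\sigma_P^2} \qquad \text{(broad sense)} \\
h^2 &= \frac{\sigma_A^2}{\sigma_P^2} \qquad \text{(narrow sense)}
\end{align}
```

Here σ²_A is the additive genetic variance, σ²_NA the non-additive genetic variance (dominance and epistasis), σ²_E the environmental variance, and σ²_P the phenotypic variance. It follows that h² can never exceed H².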
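The single-record case can be sketched in a few lines of Python. This is an illustration only: the function names and the variance values are made up for the example, and the record is taken to be the individual's own phenotype, with no information from relatives.

```python
def heritabilities(var_additive, var_non_additive, var_residual):
    """Return (broad-sense, narrow-sense) heritability from variance components."""
    var_phenotypic = var_additive + var_non_additive + var_residual
    broad = (var_additive + var_non_additive) / var_phenotypic
    narrow = var_additive / var_phenotypic
    return broad, narrow


def single_record_breeding_value(h2, phenotype, population_mean):
    """Predict a breeding value from one own record: h2 * (P - mean)."""
    return h2 * (phenotype - population_mean)


# Illustrative (made-up) variance components: additive, non-additive, residual.
H2, h2 = heritabilities(4.0, 1.0, 5.0)

# An individual scoring 110 in a population with mean 100.
ebv = single_record_breeding_value(h2, phenotype=110.0, population_mean=100.0)
print(f"H2 = {H2:.2f}, h2 = {h2:.2f}, predicted breeding value = {ebv:.2f}")
```

With these illustrative numbers, only 40% of the 10-unit phenotypic deviation is attributed to additive genetic effects. The rest is regressed back towards the population mean.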
ASReml-R is a powerful software package designed for fitting linear mixed models, including pedigree-based models. Its flexibility and robust statistical framework make it an excellent choice for leveraging pedigree information in your analyses. Because pedigree-derived relationship structures are generally sparse, and therefore less computationally demanding, ASReml-R can handle and accurately fit very large datasets with complex hierarchical structures.
Pedigree data is typically formatted into three columns, which correspond to the individual, sire, and dam identifiers. ASReml-R is able to obtain the numerator relationship matrix (A) required for the analyses using these three columns, and furthermore, it can process and incorporate many types of additional information such as gender, level of inbreeding, selfing, maternal grand-sire, and genetic groups. ASReml-R can also handle messy pedigrees and some level of missing pedigree information.
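To make the link between the three pedigree columns and the matrix A concrete, here is a minimal Python sketch of the classical tabular method. It is not how ASReml-R computes the matrix internally. The function name and the toy pedigree are hypothetical, and the sketch assumes that parents are listed before their offspring and that unknown parents are coded as None.

```python
def numerator_relationship_matrix(pedigree):
    """Build A with the tabular method.

    pedigree: list of (individual, sire, dam) tuples, ordered so that
    parents appear before their offspring; None marks an unknown parent.
    """
    index = {ind: k for k, (ind, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]

    for i, (ind, sire, dam) in enumerate(pedigree):
        parents = []
        for p in (sire, dam):
            if p is None:
                continue
            if p not in index or index[p] >= i:
                raise ValueError(f"parent {p!r} of {ind!r} must appear earlier")
            parents.append(index[p])

        # Off-diagonals: half the sum of j's relationships with i's known parents.
        for j in range(i):
            A[i][j] = A[j][i] = 0.5 * sum(A[j][p] for p in parents)

        # Diagonal: 1 + inbreeding coefficient, where F = 0.5 * a(sire, dam).
        inbreeding = 0.5 * A[parents[0]][parents[1]] if len(parents) == 2 else 0.0
        A[i][i] = 1.0 + inbreeding
    return A


# Toy pedigree: two founders, two full sibs, and one offspring of the full sibs.
pedigree = [
    ("A", None, None),
    ("B", None, None),
    ("C", "A", "B"),
    ("D", "A", "B"),
    ("E", "C", "D"),
]
A = numerator_relationship_matrix(pedigree)
for (ind, _, _), row in zip(pedigree, A):
    print(ind, row)
```

In this toy pedigree the full sibs C and D have a relationship of 0.5, and the diagonal element for E exceeds 1 because its parents are related, which reflects inbreeding.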
ASReml-R generates comprehensive outputs, including estimates of variance components, additive effects, and their associated standard errors. These results are essential for making informed decisions in breeding programs and genetic studies. The estimated additive effects can be used for the selection of superior individuals, while the variance components provide insights into the genetic architecture of the trait(s) under investigation as well as the genetic diversity of the population.
Incorporating pedigree information into statistical analyses is crucial for accurate estimation of variance components and genetic effects. Pedigree-based models, such as Animal and Parental Models, can leverage the genetic relationships among individuals (not just direct progeny) to partition the phenotypic variance into genetic and environmental components. The distinction between genotypic values and breeding values or GCAs, as well as between broad-sense and narrow-sense heritability, highlights the importance of pedigree information in genetic improvement programs.
The asr013 recipe in the Cookbook describes the ainverse() function that is used to transform a pedigree into the inverse of a numerator relationship matrix (used in the model), including options that are used to incorporate additional information (gender, level of inbreeding, selfing, maternal grand-sire, and genetic groups…).
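To give a feel for what such a transformation involves, the sketch below builds the inverse directly from the pedigree using Henderson's rules. It is a simplified Python illustration and not the ainverse() implementation: the function name is hypothetical, inbreeding is ignored, and every known parent is assumed to appear as an individual in the pedigree.

```python
def a_inverse_henderson(pedigree):
    """Inverse of A from (individual, sire, dam) tuples, ignoring inbreeding.

    None marks an unknown parent; known parents must be listed as individuals.
    """
    index = {ind: k for k, (ind, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    Ainv = [[0.0] * n for _ in range(n)]

    # Weight depends on the number of known parents (non-inbred case).
    alpha_by_known = {0: 1.0, 1: 4.0 / 3.0, 2: 2.0}

    for ind, sire, dam in pedigree:
        i = index[ind]
        parents = [index[p] for p in (sire, dam) if p is not None]
        alpha = alpha_by_known[len(parents)]

        Ainv[i][i] += alpha                # individual with itself
        for p in parents:
            Ainv[i][p] -= alpha / 2        # individual with each parent
            Ainv[p][i] -= alpha / 2
            for q in parents:
                Ainv[p][q] += alpha / 4    # parents among themselves
    return Ainv


# Toy pedigree: two unrelated founders and their offspring.
trio = [("A", None, None), ("B", None, None), ("C", "A", "B")]
Ainv = a_inverse_henderson(trio)
for (ind, _, _), row in zip(trio, Ainv):
    print(ind, row)
```

Each individual only touches the entries for itself and its parents, so most of the inverse stays at zero in a large pedigree. This is the sparsity that makes pedigree-based models computationally tractable.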
Recipes asr011, asr012, asr017, and asr018 are examples using pedigree data.
While the benefits of incorporating pedigree data are clear, navigating the world of complex pedigree data can be daunting. Here's where our services can be invaluable. Partnering with VSNi professionals can help you: