Associations for whole-exome sequencing profiling with carcass traits in crossbred pigs

Jae Young Yoo1Sang-Mo Kim2,3Dong Hyun Lee2Gye-Woong Kim2Jong-Young Lee3


Industrial pig breeding has used the Duroc breed and terminal sires in a three-way crossbred system in Korea. This study identified the gene variation patterns related to carcass quality in crossbred pigs ([Landrace × Yorkshire] × Duroc) using whole-exome sequencing (WES). This study used crossbred pigs and divided them into two groups (first plus grade, n = 5; second grade, n = 5). Genomic DNA samples extracted from the loin muscles of both groups were submitted for WES. A set of validated single-nucleotide polymorphisms (SNPs: n = 102) were also subjected to the Kompetitive allele-specific polymerase chain reaction (KASP) to confirm the WES results in the loin muscles. Based on the WES, SNPs associated with meat quality were found on chromosomes 5, 10, and 15. We identified variations in three of the candidate genes, including kinesin family member 5B (KIF5B), GLI family zinc finger 2 (GLI2), and KIF26B, that were associated with meat color, marbling score, and backfat thickness. These genes were associated with meat quality and the mitogen-activated protein kinase (MAPK) and Hedgehog (Hh) signaling pathways in the crossbred pigs. These results may help clarify the mechanisms underlying high-quality meat in pigs.



In Korea, breeding has used the Duroc breed and terminal sires in a three-way crossbred system. Korean pork consumption has shown a unique pattern with a strong preference for high-fat cuts such as belly, Boston butt, and rib (Choe et al., 2015).

Since the last century, traditional selective pig breeding methods have aimed to significantly improve economically important genetic traits such as growth rate, backfat thickness, and feed conversion ratio (Choi et al., 2015). In addition, animal improvement programs have dramatically accelerated the search for complex traits using genome-wide association studies (GWAS), such as quantitative trait loci (QTL) (Liu et al., 2020). The August 23, 2021 release 45 of the Animal QTL Database listed 33,540 QTLs in pigs (Pig QTLdb, 2021).

In pigs, GWAS has been conducted based on cDNA microarrays, whole genome sequencing (WGS), RNA sequencing (RNAseq) and the single-nucleotide polymorphism (SNP) chips for meat quality-related traits (Lee et al., 2012; Jeong et al., 2015; Wu et al., 2020; Zappaterra et al., 2021). In the Yorkshire breeds, genome-wide research has identified growth traits related to the candidate genes mitogen-activated protein kinase 6 (MAP2K6), phospholipase C beta 1 (PLCB1), rho GTPASE activating protein 24 (ARHGAP24), GLI family zinc finger 2 (GLI2), neuronal tyrosine-phosphorylated phosphoinositide-3-kinase adaptor 2 (NYAP2), and zinc finger protein multitype 2 (ZFPM2) (Meng et al., 2017). Researchers also identified a regulatory gene network using GWAS with the AT-rich interaction domain 3B (ARID3B), glial cell missing homolog 1 (GCM1), and GLI2 genes for economic pig characteristics such as body fat thickness in purebred pigs (Lee et al., 2019).

Generally, WGS can detect genetic variants, but did not accuracy is lower that larger indels are, indeed, only detected poorly in DNA (Daetwyler et al., 2014; Falardeau et al., 2016). Despite recent decreases in cost, WGS of association panels comprising large numbers of samples remains cost-prohibitive. However, GWAS results showed that the leading SNPs for complex traits were located in non-coding regions and the exonic parts of the whole genome, representing 1.5 - 2% of the whole genome in pigs and humans (Choi et al., 2015; Liu et al., 2020). Whole-exome sequencing (WES) usually identifies around 20,000 different variants in the DNA (Salmaninejad et al., 2019).

However, no WES analysis has been conducted to date to detect variants of genes associated with meat qualities in purebred and crossbred pigs (Landrace × Yorkshire × Duroc [LYD]). This study used WES methods to associate genetic variants with loin muscle carcass quality in crossbred pigs.

Materials and Methods

Animals and sample collection

Pork carcasses are graded in terms of loin muscle quality and conformation based on the Korea Institute for Animal Products Quality Evaluation (KAPE) and scored as first plus (1+), first (1), or second (2), based on marbling, lean color, backfat thickness, and body weight (KAPE, 2018). Ten crossbred pigs (LYD) were determined to be grades 1+ (Class I, n = 5) and 2 (Class II, n = 5) after being slaughtered according to the KAPE standards. Quality details of the carcass tenderloin muscle characteristics were reported previously (Kim et al., 2020). For WES analysis, ten samples of loin muscles were immediately frozen and stored at -80℃ until DNA extraction.

DNA sample preparation and sequencing

Genomic DNA was extracted from the tissue samples (50 µg·sample-1) using the Roche DNA Extraction Kit (Roche Diagnostics, Mannheim, Germany) following the manufacturer’s standard protocol. The procedure included a proteinase K digestion followed by 56℃ column purification to extract high-quality DNA. DNA degradation and contamination were monitored on 1% agarose gels, and the DNA concentration was measured using the Qubit® DNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA).

Library preparation for sequencing

A total amount of 1.0 μg of genomic DNA per sample was used as input material for the DNA library preparation. Sequencing libraries were generated using 130104_Sscr_10_2_MW_EZ_HX1 (Roche NimbleGen Inc., Pleasanton, CA, USA) following the manufacturer’s recommendations, and index codes were added to each sample. Briefly, fragmentation was carried out using a hydrodynamic shearing system (Covaris Inc., Woburn, MA, USA) to generate 180 - 280 base pair (bp) fragments. The remaining overhangs were converted into blunt ends via exonuclease/polymerase activities, and the enzymes were removed. Adapter oligonucleotides were ligated after adenylation of the 3' ends of the DNA fragments. DNA fragments with ligated adapter molecules on both ends were selectively enriched by polymerase chain reaction (PCR). Following PCR, liquid-phase hybridization was performed on the library using a biotin-labeled probe, after which streptomycin-coated magnetic beads were used to capture the gene exons. Captured libraries were enriched via PCR, and index tags were added to prepare for hybridization. The products were purified using the AMPure XP system (Beckman Coulter, Brea, CA, USA) and quantified using the Agilent High Sensitivity DNA Kit on the Agilent Bioanalyzer 2100 system (Agilent, Santa Clara, CA, USA).

Whole-exome sequencing (WES) bioinformatics analysis

Sequence reads were filtered for base quality and adapter trimming using Trimmomatic ver. 0.33 (Bolger et al., 2014). Reads were trimmed if four consecutive bases had an average Phred quality score of < 20. After trimming, only pairs of DNA sequences for which each read exceeded 35 bp were retained for analysis. Sequence reads were aligned to the pig reference genome (Sus scrofa 10.2) using the Burrows-Wheeler Aligner (BWA-MEM) ver. 0.7, which is used to map low-divergent sequences against a large reference genome (Li and Durbin, 2009). The alignments were sorted according to their coordinates and converted into the BAM format using SAMtools ver. 1.2 (Ramirez-Gonzalez et al., 2012). Next, the data were formatted for variant calling using Picard ver. 1.135, and duplicate reads were marked (Li et al., 2009).

We used the Genome Analysis Toolkit (GATK) ver. 2.6-4 (McKenna et al., 2010; DePristo et al., 2011; Van der Auwera et al., 2013) for downstream SNP and indel calling. We used Realigner Target Creator to identify poorly mapped regions (nearby indels) from the alignments and realigned these regions using Indel Realigner. Next, Unified Genotyper was used to call the SNPs and indels with a Phred scale quality > 30. After SNP calling, we used VariantFiltration to discard sequencing and alignment artifacts from the SNPs with the parameters “MQ0 ≥ 4 && ((MQ0/(1.0 * DP)) > 0.1)”, “SB ≥ -1.0, QUAL<10,” and “QUAL <30.0 || QD < 5.0 || HRun> 5 || SB >-0.10” and from the indels with the parameters “QD < 2.0,” “FS > 200.0,” and “Read Pos Rank Sum< -20.0.” All the variants that passed the above filtering criteria were used in the downstream analysis. ANNOVAR (Wang et al., 2010) was used to annotate the functions of the variants (exonic, intronic, 5' and 3' untranslated regions, splicing, and intergenic). SNPs identified in the exonic regions were classified as synonymous or non-synonymous SNPs.


A set of validated SNPs was used that the Kompetitive allele specific PCR (KASP) were used to confirm the WES results in the loin muscles (102 SNPs). We have used the flanking sequences with evaluated SNPs indicated by brackets to design KASP assays (Supplementary Table 1). All custom SNP assays were designed by the corresponding company (LGC Genomics, Hoddesdon, Herts, UK). The KASP assays were performed in a volume of 5 μL using 384-well plates, and low ROX was used as a passive reference dye. Two probes ensured the allelic specificity of the TaqMan assay; one was labeled with FAM and the other with VIC. In the KASP assays, bi-allelic discrimination was achieved through the competitive binding of two allele-specific forward primers, one labeled with FAM and the other with HEX. These different reporter dyes were detected independently on real-time quantitative PCR instruments with excitation sources and emission filters in the respective wavelengths. Genotyping was carried out on the QuantStudio 5 Real-Time PCR System (Applied Biosystems, Waltham, MA, USA).

Table 1. Single-nucleotide polymorphisms (SNP) annotation of whole exome sequencing for Landrace × Yorkshire × Duroc (LYD) breeds.

Chr., number of chromosomes, Ref, reference genotype; Alt, alteration genotype; NUP214, nucleoporin 214; GLI2, GLI family zinc finger 2; DTX4, deltex E3 ubiquitin ligase 4; CRB1, crumbs cell polarity complex component 1; KIF5B, kinesin family member 5B; RPTOR, regulatory associated protein of MTOR complex 1; NME9, NME/NM23 family member 9; NECH1, neutral cholesterol ester hydrolase 1; AMPH, amphiphysin; KIF26B, kinesin family member 26B; CALML5, calmodulin like 5; C2orf54, chromosome 2 open reading frame 54; SEPT3, SEPTIN3.

z We compared with prevalence rates of first and second grades in rcossbred pigs by the whole-exome sequencing (WES).

Statistical analysis

We performed analysis of variance and χ2 tests for continuous and categorical variables on our data. All statistical analyses were conducted using the SAS software (version 9.3, SAS Institute, Inc., Cary, NC, USA).


On average, approximately 1.0 × 108 unique good-quality reads were generated per exome for the first plus-grade and second-grade groups (Supplementary Table 2). In the first-grade group, the total number of reads per sample ranged from 20,061,291 to 25,268,416, (median: 22,865,861). In the second-grade group, the number of reads ranged from 18,447,182 to 24,652,411 (median: 21,475,165). The sequence reads of each breed were aligned to the pig reference genome (Sus scrofa 10.2) from the Ensembl database. The mean coverage depth was 57.07 - 84.14 with 61 - 63% uniformity for amplicon coverage in the targeted regions with at least 4 times the target bases, and coverage of the median proportion with > 20 times coverage was 77.76% (Supplementary Table 3). Whole-exome variant analysis identified 489,279 variants in each of the first plus-grade and second-grade groups. The hetero rate (%) and Ts/Tv ratios for each whole exome ranged from 0.089 to 0.108 and from 2.366 to 2.400, respectively (Supplementary Table 4).

Table 2. Basic characteristics of the validation subjects in Landrace × Yorkshire × Duroc (LYD) pigs.

Data are means ± SD.

CI, confidence interval; CIE, International Commission on Illumniation; L*, indicates lightness; a*, the red/green coordinate;b *, the yellow/ blue coordinate.

zThe animal products grading system of the Korea Institute for Animal Products Quality Evaluation (KAPE).

For additional quality control, we selected SNPs with a minor allele frequency of none or lower samples using the χ2 test. We identified 439 variants in each of calling rate with ≥ 100% and depth > 340. Using log-linear models, we selected an association with meat quality at an unadjusted significance level of p< 0.05. We assessed 13 candidate genes for changes in carcass characteristics in loin muscle to determine whether the candidate genes played a role in meat quality (Fig. 1). The SNPs associated with meat quality were located on chromosomes 1, 2, 5, 9, 10, 12, 13 and 15 (Table 1). Three suggestive SNPs were detected in the first plus grade. These SNPs located on the SSC1: 305,240,304, SSC15: 355,155,021, SSC2: 11,839,450, and SSC10: 24,762,182. Based on the pig genome assembly version 10.2, these were mapped to nucleoporin 214 (NUP214), GLI2, and deltex E3 ubiquitin ligase 4 (DTX4). Additionally, the six first plus grade genes had dominant SNPs in the regions of SSC10: 24,762,182, SSC10: 47,337,624, SSC12: 1,689,010, and SSC13: 86,695,398, 120,292,202, where we detected the crumbs cell polarity complex component 1 (CRB1), kinesin family member 5B (KIF5B), regulatory associated protein of MTOR complex 1 (RPTOR), NME/NM23 family member 9 (NME9), neutral cholesterol ester hydrolase 1 (NECH1), and amphiphysin (AMPH). In addition, in the second-grade group, we detected dominant genes for KIF26B, calmodulin-like 5 (CALML5), and septin-type G domain-containing protein (SEPT3).

Fig. 1. The analytical framework for the identification of carcass traits genes in crossbred pigs. The diagram shows the processes used to filter the variants identified by whole exome sequencing. KASP, Kompetitive allele-specific polymerase chain reaction; KIF5B, kinesin family member 5B; GLI2, GLI family zinc finger 2; KIF26B, kinesin family member 26B; CIE, International Commission on Illumination; L*, indicates lightness.

The variants identified in the LYD breed were genotyped using KASP (n = 102). The LYD pig characteristics are listed in Table 2. We identified variations in three candidate genes, including KIF5B, GLI2, and KIF26B (Fig. 2). For KIF5B, the CIE L* (lightness) value of the C/C genotype (46.85, n = 29) and the C/T genotype (48.54, n = 59) was significantly higher than that in the T/T genotype (45.86) (p < 0.05) (Fig. 2a). For GLI2, the heterozygous G/A genotype was significantly higher than that of the homologous genotypes (G/G and A/A) for marbling score and meat color (Fig. 2b and c). In addition, the G/G genotype (21.09 mm, n = 64) and G/A genotype (21.37 mm, n = 30) were significantly higher than the A/A genotype (17.20 mm, n = 5) in KIF26B (p = 0.083) for backfat thickness (Fig. 2d).

Fig. 2. Validation of KIF5B, GLI2, and KIF26B genes by KASP PCR. (a) KIF5B gene (b), (c) GLI2 gene (d) KIF26B. CIE, International Commission on Illumination; L*, indicates lightness; KIF5B, kinesin family member 5B; GLI2, GLI family zinc finger 2; KIF26B, kinesin family member 26B; KASP PCR, Kompetitive allele-specific polymerase chain reaction.


This study determined SNPs associated with meat quality through WES and genotyping in crossbred pigs. Using genotyping, we identified the associations of KIF5B, GLI2, and KIF26B with meat qualities such as meat color, marbling score, and backfat thickness.

Kinesins are ATP-dependent microtubule plus-end-directed motor proteins and are divided into 14 subfamilies that share similarities in the motor and structural domains (Park et al., 2017). The Kinesin-1 subfamily is a tetramer consisting of two kinesin heavy chains such as KIF5A, KIF5B, or KIF5C (Kanai et al., 2000). Generally, protein kinases can regulate the intracellular transport of organelles, proteins, and lipids through direct phosphorylation with KIF5s (Gibbs et al., 2015). The KIF5B and KIF26B genes interact with extracellular signal-regulated kinase (ERK) and mitogen-activated protein kinase (MAPK) signals to regulate cell proliferation, differentiation, migration, senescence, and apoptosis (Sun et al., 2015; Liang and Yang, 2019). In pigs, MAPK signaling is associated with intermuscular fat contents to regulate lipid deposition, such as polyunsaturated fatty acids (Zappaterra et al., 2021). Our previous study showed that significantly enriched KEGG pathways were related to the signaling pathways, including the cyclic adenosine monophosphate (cAMP) signaling pathway, the MAPK signaling pathway, and the MAPK cascade pathway (Kim et al., 2020). Thus, KIF family (e.g., KIF5B and KIF26B) variations may be associated with meat qualities such as color and backfat thickness.

The GLI2 was associated with marbling score and meat color in the heterozygous type. The GWAS showed that the GLI2 functions as a transcription regulator alongside the Hedgehog (Hh) pathway in growth in purebred pigs (Meng et al., 2017; Lee et al., 2019). The Hh signaling pathway is directly linked to the control of body fat mass and plays a crucial role in lipid metabolism, adipogenesis, and obesity (Wei et al., 2019). These data suggest that GLI2 and the Hh pathway regulate meat quality in pigs.


This study had some limitations. First, the statistical power is low because of the small number of samples; some samples could not be genotyped using the KASP assay with genomic DNA. Second, we did not assess any joint effects of SNPs and expression status. Future studies are needed with more samples. Despite these limitations, this study has demonstrated the benefits of combining WES and KASP data, which are low cost but still effective. In conclusion, genetic variations in the KIF family (KIF5B and KIF26B) and GLI2 genes, related to MAPK and Hh signaling pathways, were associated with meat quality in crossbred pigs. Our findings may help clarify the mechanisms underlying high-quality meat production in pigs.

Conflict of Interests

No potential conflict of interest relevant to this article was reported.

Authors Information

Jae Young Yoo,

Sang-Mo Kim,

Dong Hyun Lee,

Gye-Woong, Kim,

Jong-Young Lee,


1 Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30:2114-2120.  

2 Choe JH, Yang HS, Lee SH, Go GW. 2015. Characteristics of pork belly consumption in South Korea and their health implication. Journal of Animal Science and Technology 57:22.  

3 Choi JW, Chung WH, Lee KT, Cho ES, Lee SW, Choi BH, Lee SH, Lim W, Lim D, Lee YG, et al. 2015. Whole-genome resequencing analyses of five pig breeds, including Korean wild and native, and three European origin breeds. DNA research 22:259-267.  

4 Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nature Genetics 46:858-865.  

5 DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics 43:491-498.  

6 Falardeau F, Camurri M, Campeau P. 2016. Genomic approaches to diagnose rare bone disorders. Bone 102:5-14.  

7 Gibbs KL, Greensmith L, Schiavo G. 2015. Regulation of axonal transport by protein kinases. Trends in Biochemical Sciences 40:597-610.  

8 Jeong H, Song KD, Seo M, Caetano-Anolles K, Kim J, Kwak W, Oh JD, Kim E, Jeong DK, Cho S, et al. 2015. Exploring evidence of positive selection reveals genetic basis of meat quality traits in Berkshire pigs through whole genome sequencing. BMC genetics 16:104.  

9 Kanai Y, Okada Y, Tanaka Y, Harada A, Terada S, Hirokawa N. 2000. KIF5C, a novel neuronal kinesin enriched in motor neurons. The Journal of Neuroscience 20:6374-6384.  

10 KAPE (Korea Institute for Animal Products Quality Evaluation). 2018. Livestock products grading Accessed in on 1 December 2018.  

11 Kim SM, Markkandan K, Lee JY, Kim GW, Yoo JY. 2020. Transcriptome profiling associated with carcass quality of loin muscles in crossbred pigs. Animals (Basel) 10:1279.  

12 Lee J, Kang JH, Kim JM. 2019. Bayes factor-based regulatory gene network analysis of genome-wide association study of economic traits in a purebred swine population. Genes (Basel) 10:293.  

13 Lee KT, Lee YM, Alam M, Choi BH, Park MR, Kim KS, Kim TH, Kim JJ. 2012. A whole genome association study on meat quality traits using high density SNP chips in a cross between Korean native pig and landrace. Asian-Australasian Journal of Animal Sciences 25:1529-1539.  

14 Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754-1760.  

15 Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25:2078-2079.  

16 Liang YJ, Yang WX. 2019. Kinesins in MAPK cascade: How kinesin motors are involved in the MAPK pathway? Gene 684:1-9.  

17 Liu Y, Liu X, Zheng Z, Ma T, Liu Y, Long H, Cheng H, Fang M, Gong J, Li X, et al. 2020. Genome-wide analysis of expression QTL (eQTL) and allele-specific expression (ASE) in pig muscle identifies candidate genes for meat quality traits. Genetics, Selection, Evolution 52:59.  

18 McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20:1297-1303.  

19 Meng Q, Wang K, Liu X, Zhou H, Xu L, Wang Z, Fang M. 2017. Identification of growth trait related genes in a Yorkshire purebred pig population by genome-wide association studies. Asian-Australasian Journal of Animal Sciences 30:462-469.  

20 Park SJ, Seog JS, Moon IS, Seog DH. 2017. The regulation mechanisms of kinesin motor proteins. Journal of Life Science 27:840-848.  

21 Pig QTLdb. 2021. The animal QTL database. Accessed in on 23 August 2021.  

22 Ramirez-Gonzalez RH, Bonnal R, Caccamo M, Maclean D. 2012. Bio-samtools: Ruby bindings for SAMtools, a library for accessing BAM files containing high-throughput sequence alignments. Source Code for Biology and Medicine 7:6.  

23 Salmaninejad A, Motaee J, Farjami M, Alimardani M, Esmaeilie A, Pasdar A. 2019. Next-generation sequencing and its application in diagnosis of retinitis pigmentosa. Ophthalmic Genetics 40:393-402.  

24 Sun Y, Liu WZ, Liu T, Feng X, Yang N, Zhou HF. 2015. Signaling pathway of MAPK/ERK in cell proliferation, differentiation, migration, senescence and apoptosis. Journal of Receptor and Signal Transduction Fesearch 35:600-604.  

25 Van der Auwera GA, Carneiro MO, Hartl C, Poplin R, Del Angel G, Levy-Moonshine A, Jordan T, Shakir K, Roazen D, Thibault J, et al. 2013. From FastQ data to high confidence variant calls: The genome analysis toolkit best practices pipeline. Current Protocols in Bioinformatics 43:11.10.1-11.10.33.  

26 Wang K, Li M, Hakonarson H. 2010. ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Research 38:e164.  

27 Wei H, Li J, Shi S, Zhang L, Xiang A, Shi X, Yang G, Chu G. 2019. Hhip inhibits proliferation and promotes differentiation of adipocytes through suppressing hedgehog signaling pathway. Biochemical and Biophysical Research Communications 514:148-156.  

28 Wu P, Wang K, Zhou J, Chen D, Yang X, Jiang A, Shen L, Zhang S, Xiao W, Jiang Y, et al. 2020. Whole-genome sequencing association analysis reveals the genetic architecture of meat quality traits in Chinese Qingyu pigs. Genome 63:503-515.  

29 Zappaterra M, Gioiosa S, Chillemi G, Zambonelli P, Davoli R. 2021. Dissecting the gene expression networks associated with variations in the major components of the fatty acid semimembranosus muscle profile in large white heavy pigs. Animals (Basel) 11:628.