Using the random forest approach as a first step to filter candidate SNPs, without specifying a statistical model, is advantageous in genome-wide association studies when little is known about candidate regions and the genetic architecture of the trait. Furthermore, the fact that the two strategies (Common SNPs and Highest 1% SNPs) produced very similar results lends reliability to the random forest methodology, as can be seen in the previous study.
With the exception of four SNPs selected in the Highest 1% SNPs strategy (chr 12: rs136348926; chr 11: rs110833507; chr 2: rs42923911; chr 9: rs110025080), all SNPs lie in a chromosomal region with a described fat-related QTL. In addition, only one SNP, on chr 3 (rs42021729), is not close to any described gene in the surrounding area (±250 kb) (Table
In a previous genome-wide association study in Canchim, 100 SNPs on several chromosomes were considered the optimal set of SNPs to differentiate the 30 individuals with extreme phenotypes for backfat thickness. Among these SNPs, two haplotypes on chr 14 were genotyped and their association with the phenotype was validated in the whole population
. In the current study, even though SNPs from chr 14 were associated with backfat thickness by the random forest approach (in the Common SNP and Highest 1% SNP strategies, data not shown), these SNPs were not selected in the stepwise regression model. Conflicting results and studies that cannot be replicated are not uncommon in the post-genomic era
[56–59], and these differences can be attributed to partially insufficient power, false-positive results, bias, sample size, and to differences in populations, controls, and methodologies
[56–58], or to true heterogeneity in the associations
. In these two GWA studies with Canchim, the base population is very similar, but the sample size and methodologies are not, which could explain the difference in the findings. A future option to help clarify the inconsistency in these findings would be to perform a meta-analysis, which combines data across studies to increase sample size and power while reducing the risk of error.
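As a rough illustration of how a meta-analysis pools evidence, the sketch below applies fixed-effect inverse-variance weighting to the effect estimates of a single SNP from two studies. This is only one common pooling scheme, not necessarily the one that would be used for the Canchim data, and the effect sizes and standard errors are hypothetical values chosen for the example.

```python
import math

def fixed_effect_meta(estimates):
    """Pool (effect, standard_error) pairs by inverse-variance weighting."""
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_effect = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_effect, pooled_se

# Hypothetical effect estimates (effect, standard error) for one SNP
# in two independent studies; these numbers are illustrative only.
studies = [(0.20, 0.10), (0.40, 0.10)]
pooled_effect, pooled_se = fixed_effect_meta(studies)
print(f"pooled effect = {pooled_effect:.3f}, pooled SE = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of either study, which is the gain in power referred to above. A fixed-effect model assumes the same underlying effect in every study; if true heterogeneity is suspected, a random-effects model would be the more appropriate choice.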
Another outcome from this study and the previous one
 is the possibility of including these SNPs in the development of a low-density SNP (LD-SNP) panel for implementation of genomic selection in Canchim beef cattle. The most widespread strategy for developing small panels is to apply variable selection methods to identify a small set of SNPs that have good predictive power for the trait or breeding value
. The increase in accuracy of genomic breeding values obtained by using LD-SNP panels can be highly similar (around 90%) to that obtained with high-density panels
[62, 63], at a lower cost. Therefore, LD-SNP panels are more likely to be adopted by farmers and the beef industry
. Furthermore, LD-SNP panels developed with SNPs selected on the basis of their effects perform better than LD-SNP panels with SNPs evenly spaced
[62, 63]. Importantly, SNPs identified in these studies first need to be validated in a population of animals not included in the population used for SNP discovery (the training population), to give confidence in genomic predictions for future populations.
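The validation step described above can be sketched as follows: SNP effects estimated in the training population are applied to the genotypes of animals that were not used for SNP discovery, and the predictions are correlated with their observed values. This is a minimal sketch, assuming genotypes coded as 0/1/2 allele counts and a simple additive prediction; the effects, genotypes, and observed dEBV values are all hypothetical.

```python
import math

def predict_values(genotypes, effects):
    """Additive prediction: sum of allele count times SNP effect per animal."""
    return [sum(g * e for g, e in zip(row, effects)) for row in genotypes]

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical effects of two panel SNPs, estimated in the training population.
snp_effects = [0.5, -0.2]

# Hypothetical validation animals (rows) genotyped at the two SNPs (columns).
validation_genotypes = [[0, 0], [1, 0], [2, 0], [2, 1]]
observed_debv = [0.1, 0.4, 1.1, 0.7]

predicted = predict_values(validation_genotypes, snp_effects)
accuracy = pearson(predicted, observed_debv)
print(f"predictive correlation in validation animals: {accuracy:.2f}")
```

Because the validation animals played no part in choosing the SNPs or estimating their effects, the correlation is not inflated by the selection step itself.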
From the SNPs identified in this study, there were two on chr 10 (rs133046994, rs135638125) associated with backfat thickness, which together accounted for almost 12% of the dEBV variation (Table
2). These two SNPs are in the same chromosomal region as fat-related QTLs identified in previous studies
[65, 66], and they map to the same genes (THSD4 - thrombospondin, type I, domain containing 4, and LRRC49 - leucine-rich repeat-containing protein 49), thereby indicating THSD4, LRRC49 and the surrounding areas as strong candidates for further investigation and validation. The LRRC49 gene has been linked to breast cancer in humans, but very little is known about the biological function of the protein encoded by this gene.
The THSD4 gene in Bos taurus and in Homo sapiens has a provisional status from RefSeq
, which, by definition, supports that this gene is both transcribed and expressed. Further evidence for the annotation of this gene is given by its sequence identity in the UniGene database
 when compared to orthologous sequences from M. musculus (95.1%), which has a validated status in RefSeq, and to H. sapiens (93.1%), suggesting a well-conserved homology of the THSD4 gene in these species.
The THSD4 gene encodes a protein with conserved disintegrin and metalloprotease domains, which it shares with the ADAM-TS1 protein family, and plays an important role in adipogenesis
. Previous studies have shown that this protein family interferes with the availability of differentiation-inducing or differentiation-inhibiting growth factors, either by modifying the extracellular matrix, affecting cell migration and adhesion, or by activating other pathways, which are key for regulating the differentiation of adipocytes, allowing their growth and expansion during adipogenesis.
The subcutaneous fat percentage QTL reported on chr 10 (Table
2) is from a Charolais × Holstein crossbred cattle population, and is described as highly significant with additive effects estimated to be 0.5 phenotypic standard deviation units
. The study also reveals that the Charolais allele was associated with higher fat levels.
The SNP on chr 1 (rs137294146) associated with backfat thickness is responsible for approximately 9.4% of the dEBV variation (Table
2). There is also a reported QTL for fat thickness over the 12th rib
 and another for intramuscular fat percentage
, indicating that there should be one or more genes in this area affecting fat metabolism. In the 500 kb window surrounding this SNP, three genes are annotated: SOX14 (sex determining region Y – box 14), CLDN18 (claudin 18), and DZIP1L (DAZ interacting protein 1-like). The SOX14 gene seems to be involved in the regulation of embryonic development, whereas CLDN18 belongs to a multigene family that encodes a tetraspanning membrane protein acting on components at tight junctions, but its regulatory mechanisms and its roles in physiology and pathology are still under investigation
. The DZIP1L gene encodes a zinc finger protein, but how it affects either adipogenesis or lipid metabolism has not been described in the current literature. Nonetheless, the functions of these gene products are still being elucidated.
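The window-based gene search used throughout this discussion can be sketched as an interval overlap check: a gene is reported if any part of it falls within ±250 kb of the SNP, which gives the 500 kb window. The function name, the placeholder gene names, and all coordinates below are hypothetical and serve only to illustrate the lookup; they are not taken from the bovine genome annotation.

```python
def genes_in_window(snp_pos, genes, flank=250_000):
    """Return names of genes overlapping [snp_pos - flank, snp_pos + flank]."""
    window_start, window_end = snp_pos - flank, snp_pos + flank
    return [
        name
        for name, gene_start, gene_end in genes
        # Two intervals overlap when each one starts before the other ends.
        if gene_start <= window_end and gene_end >= window_start
    ]

# Hypothetical annotations on one chromosome: (name, start, end) in base pairs.
annotated_genes = [
    ("GENE_A", 900_000, 950_000),      # fully inside the window
    ("GENE_B", 1_240_000, 1_300_000),  # straddles the window edge
    ("GENE_C", 1_300_001, 1_400_000),  # outside the window
]

nearby = genes_in_window(1_000_000, annotated_genes)
print(nearby)
```

Counting partial overlaps keeps genes that straddle the window boundary; requiring full containment instead would be a stricter, equally defensible choice.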
The 500 kb window around the SNP on chr 3 (rs109349988) contains many annotated genes, some of which have been reported to participate in lipid metabolism. For example, PMVK (phosphomevalonate kinase) catalyzes the conversion of mevalonate 5-phosphate with ATP to form mevalonate 5-diphosphate and ADP, which is one of the initial reactions involved in the cholesterol biosynthetic pathway
. Other genes in this region include ADAR (adenosine deaminase, RNA-specific), which encodes an enzyme that edits RNA by site-specific deamination of adenosines, resulting in changes in protein function or gene expression. A study in humans found that ADAR enzymes were associated with serum triglyceride and adiponectin levels, abdominal circumference, and body mass index
. Interestingly, this region also contains SHC1 (Src homology 2 domain containing – transforming protein 1) which has been reported as having a role in human obesity
, and as being one of the mediators for regulating the insulin-like growth factor 1 (IGF-1) pathway, which plays a key role in regulating cell proliferation, differentiation and apoptosis
. Lastly, this region contains ADAM15 (ADAM metallopeptidase domain 15), which belongs to the ADAM protein family previously discussed. These studies corroborate our findings, and further investigation is required to elucidate how these genes affect the deposition of subcutaneous fat in bovines.
The SNP associated with backfat thickness on chr 19 (rs136717249) is responsible for approximately 4.88% of the dEBV variance. This region contains the PHOSPHO1 (phosphatase, orphan 1) gene, which encodes a phosphatase enzyme that has been implicated in the mineralization of the extracellular matrix, a key process for skeletal development
. The PHOSPHO1 gene product has high activities toward phosphoethanolamine (PEA) and phosphocholine (PCho)
, which are the main metabolites involved in the pathway for the formation of phosphatidylcholine and phosphatidylethanolamine
. These compounds are implicated in the metabolism of complex glycerolipids, prostaglandins, leukotrienes, glycosylphosphatidylinositol-anchors, and some amino acids, such as glycine, serine and threonine. Also included in this region is the PHB gene (prohibitin), which is thought to be involved in regulating cell proliferation, gene transcription, and apoptosis. In recent studies, deficient PHB activity in the liver has been associated with non-alcoholic steatohepatitis and obesity, although the mechanism remains unknown
[80, 81]. Other examples include the IGF2BP1 (insulin-like growth factor 2 mRNA binding protein 1) gene, which encodes a protein that binds to the mRNAs of certain genes and regulates their translation. Lastly, the GIP (gastric inhibitory polypeptide, also known as the glucose-dependent insulinotropic polypeptide) gene has a known effect on stimulating the release of insulin from pancreatic β cells, but also has an insulin-like effect on adipocytes, suggesting that the GIP gene product enhances adipocyte glucose uptake, and that, at least in humans, it has an important role in the development of nutrition-induced obesity
. A recent study suggests that the GIP gene product has an effect on reducing free fatty acid release from adipose tissues, either by increasing reesterification or by inhibition of lipolysis
. Indeed, QTL studies reveal oleic acid content (OAC) and palmitoleic acid content (PAC) QTLs
[84, 85] in close proximity to the GIP gene in the bovine genome, which further suggests an association between this gene and free fatty acid processing.
The SNP rs134790147 on chr 13 was also associated with backfat thickness and accounts for 3.51% of the dEBV variation. Within this SNP region, a QTL for fat thickness over the 12th rib was found and described in an Angus population
. In addition, four genes are located within the ±250 kb window around the SNP position. The CCDC7 gene (coiled-coil domain containing 7) seems to be associated with human cancer
[86, 87], but no information is available for bovines. The ARL5B gene product (ADP-ribosylation factor-like 5B), also known as ARL8, belongs to a family of proteins that are structurally similar to ADP-ribosylation factors (the ARF family). ARLs and ARFs belong to the RAS superfamily of small GTPases, which function as modulators of complex and diverse cellular processes
[88, 89], of which the most canonical are cell proliferation and differentiation. However, they are also involved in protein trafficking through the trans-Golgi network (TGN). The TGN has a central role in protein sorting and directs the transport of newly synthesized proteins to different transport vesicles
[90–92], and also receives recycled molecules and extracellular materials by retrograde transport. Recently, it was observed that ARL5B enhances retrograde transport from endosomes to the TGN
. No information is available on the function of the products of the MGC152301 (uncharacterized LOC783682) and LOC524240 (Alk-like) genes, but both show the same two conserved domains: cd00112 (LDLa) and cd06263 (MAM)
. The LDLa is a low-density lipoprotein receptor class A domain that plays an important role in mammalian cholesterol metabolism: the receptor protein binds LDL and transports it into the cell by endocytosis
. The MAM is an extracellular domain that mediates protein-protein interactions, and is found in a variety of proteins, of which many are known to function in cell adhesion
. The remaining 16 SNPs, which are not described in detail here, accounted for 19.14% of the dEBV variation for backfat thickness and, as seen in Table 2, most of them have a fat-related QTL described within their regions
[29, 65, 66, 85, 97–99]. They are therefore of interest for future investigations into how these SNPs may influence backfat thickness deposition in Canchim beef cattle.