Strabismus refers to misalignment of the two eyes, which point in different directions, and is classified into comitant (concomitant) strabismus and incomitant (noncomitant or paralytic) strabismus. Structural anomalies of extraocular muscles, such as anomalous insertion, hypoplasia and aplasia, have long been recognized as congenital causes of hereditary incomitant strabismus (Matsuo et al., 1988; Uchiyama et al., 2010; Okano et al., 1990; Matsuo et al., 2009a). More recently, genomic mutations or polymorphisms have been identified in families with hereditary incomitant strabismus, including congenital fibrosis of extraocular muscles (CFEOM), Duane syndrome (Engle, 2007; Graeber, Hunter & Engle, 2013), and congenital (idiopathic) superior oblique muscle palsy (Jiang et al., 2004; Jiang et al., 2005; Imai et al., 2008; Ohkubo et al., 2012). Congenital progressive external ophthalmoplegia is also well recognized as a mitochondrial disease with a hereditary background. Acquired incomitant strabismus is caused by vascular, traumatic or compressive paralysis of the cranial nerves that control ocular motility. It may also develop as a consequence of muscle diseases, or result from hyper- or hypothyroidism, myasthenia gravis and other rare conditions.
Primary and non-syndromic comitant strabismus is a multifactorial disorder with both genetic and environmental backgrounds, whose relative contributions remain undefined (Michaelides & Moore, 2004; Maconachie, Gottlob & McLean, 2013; Ye et al., 2014). Genetic influence is evidenced by family history (Abrahamsson, Magnusson & Sjostrand, 1999; Matsuo, Yamane & Ohtsuki, 2001; Taira et al., 2003) and phenotypic concordance between monozygotic twins (Podgor, Remaley & Chew, 1996; Matsuo et al., 2002; Sanfilippo et al., 2012). Environmental influence is supported by the association with premature birth and perinatal hypoxia, as strabismus occurs with a higher incidence in cerebral palsy (Cotter et al., 2011; Jacobson & Dutton, 2000). At present, no gene has been identified as responsible for the development of comitant strabismus. American and British researchers reported 7p22.1 as a chromosomal susceptibility locus for esotropia in Caucasian families (Parikh et al., 2003; Rice et al., 2009). Our previous research identified susceptibility loci in the 4q28.3 and 7q31.2 regions for comitant strabismus, comprising both esotropia and exotropia, in Japanese families (Fujiwara et al., 2003; Shaaban et al., 2009a; Shaaban et al., 2009b). Other chromosomal loci have also been reported to be associated with comitant strabismus in other ethnic groups (Khan et al., 2011; Bosten et al., 2014).
Given this background, we conducted single nucleotide polymorphism (SNP) analyses to narrow the chromosomal loci down to a single gene in Japanese families. As an analytical method, we previously used an association study that examined the relationship between several polymorphic markers and the strabismus phenotype in the chromosomal regions (Matsuo, 2015). In this study, we used three different methods for linkage analysis: the transmission disequilibrium test (TDT) (Spielman & Ewens, 1996), TDT allowing for errors (TDTae) (Gordon et al., 2004), and linkage analysis under dominant and recessive inheritance (Lathrop et al., 1984). We hypothesized that the results of these different methods of linkage analysis might localize a specific gene responsible for comitant strabismus.
Materials and Methods
This study involved 108 affected subjects and 96 unaffected subjects in 58 Japanese families with primary and non-syndromic comitant strabismus, including both esotropia and exotropia. These subjects mostly overlapped with those in the previous study for chromosomal loci identification (Shaaban et al., 2009a), which used 55 families with at least four members in each family. Some of the genomic DNA samples used in the previous study were no longer available because the DNA had been used up. Thus, the present study involved new affected and unaffected subjects in new families as well as the available subjects of the previous study. The features of the 58 families are summarized in Table 1. The study followed the tenets of the Declaration of Helsinki, and was approved by the Ethics Committee of Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences (Approval No. Genome 215).
|Affected (%)||Unaffected (%)|
|The number of individuals||108 (52.9%)||96 (47.1%)|
|Male||45 (53.6%)||39 (46.4%)|
|Female||63 (52.5%)||57 (47.5%)|
|The number of families||55 (94.8%)||3 (5.2%)|
|Mixed phenotypes (exotropia and esotropia)||8|
|The number of individuals||108 (52.9%)||96 (47.1%)|
|Accommodative or partially accommodative esotropia||14|
SNP selection and typing
Tag SNPs in the two chromosomal regions were first picked from the JSNP database for Japanese (Hirakawa et al., 2002). Finally, 24 rsSNPs were chosen at the 4q28.3 region and 233 rsSNPs were chosen at the 7q31.2 region from the dbSNP database of the US National Center for Biotechnology Information (NCBI). Genomic DNA that was isolated from peripheral blood leukocytes was amplified by multiplex polymerase chain reaction (PCR).
The MassARRAY system is a high-throughput matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) platform for the detection of nucleic acids. After the PCR-based multiplex reaction and clean-up to remove unincorporated dNTPs, third primers that anneal to the DNA template immediately adjacent to the polymorphic sites were introduced into the reactions. A single nucleotide base extension reaction was performed with mass-modified nucleotides. SNPs were then identified on the MassARRAY Analyzer 4 (96 well) iPlex SNP Genotyping platform (Sequenom, San Diego, CA, USA). The overall call rate was 87%. We then proceeded to quality control of SNPs and samples.
Hardy-Weinberg equilibrium and principal component analysis
We first performed Hardy-Weinberg equilibrium (HWE) testing for data quality control. Principal component analysis (PCA) was performed with the genome-wide complex trait analysis (GCTA) program (Yang et al., 2011) to calculate eigenvectors, which were then entered into the model as covariates to determine whether there was population substructure among families.
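As an illustration of the HWE check, the following minimal sketch computes the one-degree-of-freedom chi-square test from the genotype counts of a single bi-allelic SNP. The function name and the example counts are hypothetical and are not taken from the study data; the text does not state which software or which individuals were used for this step.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are the observed counts of the three genotypes of one
    bi-allelic SNP. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNP).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: a SNP is kept when p_value > 0.05.
chi2, p_value = hwe_chi_square(30, 50, 20)
```

A SNP would be regarded as being in equilibrium when the returned P value exceeds the chosen threshold (P > 0.05 in the Results).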
Family-based association study: TDT and TDTae
The transmission disequilibrium test (TDT) is a test for association in the presence of linkage in a case-parent trio. Thus, two parents had to be present in a pedigree with one affected subject. Because it is customary to show only the father (or the mother) in a pedigree drawing, a parent who was not shown was entered in the genetic analysis with all information set to unknown except for sex. We did not expect parent-specific effects, and therefore treated the maternal and paternal genotypes symmetrically. Parents who were both homozygous were not informative, as there was no genetic variation at the locus in the progeny. Only subjects with at least one heterozygous parent were informative. This situation reduced the effective sample size.
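The TDT statistic itself is simple to state. The sketch below assumes that the transmissions from heterozygous parents have already been counted (the transmitted and untransmitted allele counts reported in Table 2) and computes the McNemar-type chi-square, its nominal P value, and the odds ratio. The function name and example counts are illustrative only.

```python
import math

def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test for one allele of one SNP.

    transmitted:   times heterozygous parents passed the allele to an
                   affected child
    untransmitted: times heterozygous parents did not pass it on
    Returns (chi2, p_value, odds_ratio).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)              # McNemar-type statistic, 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 df
    odds_ratio = b / c if c > 0 else float("inf")
    return chi2, p_value, odds_ratio

# Hypothetical counts, not taken from Table 2.
chi2, p_value, odds_ratio = tdt(30, 15)
```

Homozygous parents contribute to neither count, which is why the effective sample size shrinks as described above.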
Plink version 1.9 (Purcell et al., 2007; Purcell & Chang, 2015) was run to detect genotypes that violated Mendelian inheritance in the TDT analysis. Plink can handle such errors by rendering the offending genotypes “unknown”. Therefore, it was run on the original data that had passed the initial cleaning, without deletion of any SNPs or families. TDT is a form of linkage analysis that is powerful only in the presence of genetic association (Ott, 1989).
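To make the notion of a Mendelian error concrete, the sketch below checks one father-mother-child trio at a single autosomal SNP and blanks inconsistent trios. It only mirrors the general idea of rendering offending genotypes unknown; it is not Plink's implementation, and the choice to blank all three genotypes of a trio is an assumption made for this example.

```python
from itertools import product

def is_mendelian_consistent(father, mother, child):
    """Check one trio at one autosomal SNP.

    Genotypes are 2-tuples of alleles, e.g. ("A", "G"); None means missing.
    A trio with any missing genotype cannot be checked and is accepted.
    """
    if father is None or mother is None or child is None:
        return True
    # Every child genotype obtainable from one paternal and one maternal allele.
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible

def mask_errors(trios):
    """Set inconsistent trios to unknown (None); return (cleaned, n_errors)."""
    cleaned, n_errors = [], 0
    for father, mother, child in trios:
        if is_mendelian_consistent(father, mother, child):
            cleaned.append((father, mother, child))
        else:
            n_errors += 1
            cleaned.append((None, None, None))  # assumption: blank the whole trio
    return cleaned, n_errors

# Hypothetical trios: the second one is impossible under Mendelian inheritance.
trios = [
    (("A", "A"), ("A", "G"), ("A", "G")),
    (("A", "A"), ("A", "A"), ("A", "G")),
]
cleaned, n_errors = mask_errors(trios)
```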
In contrast, the transmission disequilibrium test allowing for errors (TDTae) is an implementation of TDT that allows errors to be present and estimates their rates in the course of the analysis with the TDTae program (Gordon et al., 2004). When the TDTae program was run, errors were estimated under one of several error models. We considered the two most reasonable and economical error models, which require few parameters: DSB (Douglas-Skol-Boehnke) allows for genotype errors (Douglas, Skol & Boehnke, 2002) and GHLO (Gordon-Heath-Liu-Ott) allows for allele errors (Gordon et al., 2001; Yang et al., 2008). DSB is analogous to error models proposed many years ago and has only been implemented in the last decade. Under the two error models, we ran the TDTae program for dominant (d), recessive (r) and multiplicative (m) inheritance.
Furthermore, we defined linkage disequilibrium (LD) blocks on chromosomes 4q28.3 and 7q31.2 using Haploview 4.2 (Gabriel et al., 2002). Haplotype analysis was performed on the basis of these haplotype blocks to determine whether any haplotypes were associated with the strabismus phenotype.
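For readers unfamiliar with pairwise LD, the following sketch computes the two standard measures, D' and r², for a pair of bi-allelic SNPs. It assumes the haplotype frequency is already known, whereas software such as Haploview has to estimate it from unphased genotypes, and it does not attempt the block-partitioning step itself. All names and input values are illustrative.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Pairwise LD between two bi-allelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and
          allele B at SNP 2
    p_a, p_b: frequencies of alleles A and B (each strictly between 0 and 1)
    Returns (d_prime, r_squared).
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max                  # normalised to the range [0, 1]
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared

# Hypothetical frequencies for two SNPs in partial LD.
d_prime, r_squared = linkage_disequilibrium(0.40, 0.50, 0.50)
```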
Linkage analysis in large pedigrees
Linkage analysis estimates recombination fractions between a putative disease locus and marker loci (Lathrop et al., 1984), and the results are output as LOD (logarithm of odds) scores (Ott, 1999; Terwilliger & Ott, 1994; Terwilliger & Ott, 2016). The Pseudomarker program (Gertz et al., 2014; Goring & Terwilliger, 2000; Hiekkalinna et al., 2011) estimates allele frequencies by maximum likelihood, separately under linkage and no linkage, which makes the results virtually independent of allele frequencies. Since the Pseudomarker program requires error-free data, we had to remove as many SNPs as necessary to obtain an error-free dataset (Lathrop et al., 1984). In addition, this program can take linkage disequilibrium (LD) between a SNP and the disease into account, thus gaining additional power. Linkage analysis generally requires the absence of errors. Mendelian inconsistencies could be allowed in linkage analysis with a suitable choice of penetrance for SNPs, but the procedure is cumbersome and rarely done.
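The meaning of a LOD score can be illustrated with the simplest textbook case: phase-known, fully informative meioses in which recombinants can be counted directly. This is a deliberate simplification and not the pedigree likelihood evaluated by Pseudomarker; the function names, the grid of recombination fractions, and the counts are assumptions made for the example.

```python
import math

def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10[ L(theta) / L(0.5) ], where
    L(theta) = theta**r * (1 - theta)**(n - r).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    r, n = recombinants, meioses
    log_l_theta = r * math.log10(theta) + (n - r) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)          # no linkage: theta = 0.5
    return log_l_theta - log_l_null

def max_lod(recombinants, meioses):
    """Scan theta = 0.01 ... 0.50 and return (best_theta, best_lod)."""
    grid = [i / 100 for i in range(1, 51)]
    best_theta = max(grid, key=lambda t: lod_score(recombinants, meioses, t))
    return best_theta, lod_score(recombinants, meioses, best_theta)

# Hypothetical data: 1 recombinant among 10 informative meioses.
best_theta, best_lod = max_lod(1, 10)
```

With so few informative meioses the maximum LOD score stays small, which is the same reason the scores reported for the reduced, error-free dataset are modest.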
Results
Hardy-Weinberg equilibrium and principal component analysis
After quality control of SNPs and individuals, 19 SNPs with monomorphic or undetected types or at a low call rate were excluded, and one SNP was merged to another in the database of NCBI. Finally, all individuals and 237 SNPs remained. The flow chart of analytical process is shown in Fig. 1.
Twenty-one of 23 SNPs on chromosome 4q28.3 and 208 of 214 SNPs on chromosome 7q31.2 were in Hardy-Weinberg equilibrium (P > 0.05, chi-square test). The PCA showed no population substructure among families or between affected and unaffected individuals, and detected no outliers (Fig. 2).
TDT and TDTae
A total of 261 Mendelian errors were detected. Errors could be due to faulty genotyping, clerical mistakes, pedigree mismatch (e.g., adopted child), or a new mutation which the program treated as an inherited variant. The dataset with Mendelian errors was applied to TDT directly. Mendelian errors were reduced when some families with large error rates were deleted. After five SNPs (rs4148690, rs3757807, rs3807975, rs3779551, and rs213987, all on chromosome 7) and four families (37, 8, 10, and 43) with the highest error rates were deleted, 186 individuals in 54 families and 232 SNPs with 71 errors remained, which were used as the analysis set for TDTae.
The best results of TDT (P < 0.08) are shown in Table 2, and the TDTae results with significant P values (P < 0.05) are shown in Table 3. Figures 3 and 4 show the results of TDT and TDTae, with haplotype analysis and LD blocks adjusted for families, on chromosome 4q28.3 and on chromosome 7q31.2, respectively. In the 4q28.3 locus, all significant SNPs were located in the intron of MGST2. In the 7q31.2 locus, several SNPs with significant P values (P < 0.05) were dispersed across TES, ST7, WNT2, CAV1, and CFTR. Haplotype analysis in the 4q28.3 locus did not yield a positive result. In the 7q31.2 locus, haplotypes with significant P values were located in TES, CAV1, or WNT2.
|Chromosome||dbSNP||Chromosomal location||Gene||Minor allele||Major allele||Transmitted alleles (no.)||Untransmitted alleles (no.)||Odds ratio||χ2||P value|
P values are nominal and not corrected for testing multiple SNPs.
|Chromosome||dbSNP||Chromosomal location||Gene||aR2||cMinimum of corrected P value|
|Dominant model||Recessive model||Multiplicative model|
The corrected P value is given by $1 - (1 - p)^{k-1}$ (Gordon et al., 2004).
R1 = Pr(aff | + d)/Pr(aff | + +) and R2 = Pr(aff |dd)/Pr(aff | + +) are genotypic relative risks for a di-allelic trait locus with low-risk (wild-type) allele + and high-risk (disease) allele d. If both R1 and R2 are less than 1, the genotypic relative risk value of the other allele would be calculated by R1′ = R1/R2 and R2′ = 1/R2. A few strange results are omitted from this table (e.g., R > 10,000, or R = 0).
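The two footnote formulas can be written out as a short sketch. The exponent k is used exactly as it appears in the corrected P value formula; its definition is given in Gordon et al. (2004) and not in this text, so it is treated here as a plain input. Function names and example values are hypothetical.

```python
def corrected_p(p, k):
    """Corrected P value, 1 - (1 - p)**(k - 1), as in the table footnote.

    k is taken as given by the footnote formula; see Gordon et al. (2004)
    for its definition.
    """
    return 1.0 - (1.0 - p) ** (k - 1)

def reorient_relative_risks(r1, r2):
    """Express genotypic relative risks for the other allele when both are < 1.

    r1 = Pr(aff | +d) / Pr(aff | ++), r2 = Pr(aff | dd) / Pr(aff | ++).
    If both are below 1, return (r1 / r2, 1 / r2); otherwise return them
    unchanged.
    """
    if r2 == 0:
        raise ValueError("r2 = 0 cannot be re-expressed (such results are omitted)")
    if r1 < 1.0 and r2 < 1.0:
        return r1 / r2, 1.0 / r2
    return r1, r2

# Hypothetical values: both risks below 1, so the reference allele is swapped.
r1_new, r2_new = reorient_relative_risks(0.5, 0.25)
```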
Four families and 46 SNPs with Mendelian errors were deleted. Furthermore, linkage analysis requires valid data from complete family trios, so the program extracted only 90 individuals in 21 families, which remained in the final analysis set. The dominant mode was less plausible because few parents in the present families were themselves affected; rather, the trait under consideration was frequently consistent with recessive inheritance. Table 4 shows all significant SNPs under dominant and recessive inheritance. The linkage analysis employed the error-free dataset, which could be obtained only by reducing the number of families, individuals, and SNPs. Because of this methodological limitation and the small number of families, the LOD scores in the present analysis were only just over 1.
|Chromosome||dbSNP||Chromosomal location||Gene||LOD score||Linkage P value||Model|
LOD, logarithm of odds.
P values are nominal and not corrected for testing multiple SNPs.
Expression quantitative trait locus (eQTL)
The significant SNPs in the 4q28.3 locus were related to MGST2 transcription in the search for expression quantitative trait locus (eQTL) in the Human Genetic Variation Database (HGVD) which displays the Japanese genetic variations and the association between the variations and transcription levels of genes (Higasa et al., 2016).
Discussion
In our preceding study, we used a method that did not depend on kinship, namely an association study adjusted by family, to examine the relationship between several polymorphic markers and the strabismus phenotype in the chromosomal regions (Matsuo, 2015). This strategy was based on the fact that some of the family trios were incomplete and that there were many Mendelian errors, which might be attributed to adoption. The preceding results showed significant SNPs in MGST2 at the 4q28.3 locus and in WNT2 at the 7q31.2 locus. However, the false discovery rate (FDR) was too high, and power was reduced by the multiple comparisons among SNPs (Benjamini & Hochberg, 1995); therefore, in the present study we turned to methods for linkage analysis.
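For reference, the Benjamini-Hochberg procedure cited above can be sketched in a few lines. This is a generic implementation of the step-up adjustment, not the code used in the preceding study, and the example P values are made up.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (step-up FDR control).

    Returns the adjusted values in the same order as the input.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical nominal P values for four SNPs.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

With many SNPs tested, even nominal P values below 0.05 can have adjusted values above the chosen FDR threshold, which is the loss of power referred to above.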
When a few families or SNPs seem to contribute the majority of errors, it is best to delete these families or SNPs first and then to carry out TDTae. In contrast, TDT by Plink handles errors by ignoring the offending genotypes. The TDTae program generally furnishes much smaller P values than the TDT because its error model is parametric. There is some agreement between the outputs from Plink and TDTae, although it is not very strong. In the present analyses, the two error models in TDTae furnished similar results, suggesting that the results are not unduly dependent on the assumptions regarding errors.
The P values shown in the linkage studies were nominal and not corrected for the testing of multiple SNPs. A correction for multiple comparisons was somewhat difficult to make since these SNPs are presumably highly correlated with each other. The Pseudomarker program requires strictly error-free data with no Mendelian errors. Therefore, some potential candidate SNPs might have been deleted on the basis of merely a single faulty genotype in the process of preparing the error-free data. Furthermore, it is worth noting that genotyping errors inflate the recombination fraction between the disease and marker loci, so that recombination fractions may appear larger than they truly are (Lincoln & Lander, 1992).
In the present study, we clearly demonstrated that MGST2 is a candidate for the chromosomal 4q28.3 locus. As for the 7q31.2 locus, in contrast, the results of the different statistical analyses could not narrow the locus to a single gene. Nevertheless, the distribution of significant SNPs in the locus showed that only the ST7 to WNT2 region contained significant SNPs for all three methods of linkage analysis (Fig. 5). In the 7q31.2 locus, ST7 is indeed in the same large haplotype block as WNT2.
Primary and non-syndromic comitant strabismus contains several different clinical entities: esotropia includes infantile esotropia, accommodative esotropia, partially accommodative esotropia, late-onset (acute-onset) esotropia, and microtropia (microesotropia) while exotropia includes intermittent exotropia, constant exotropia and congenital (infantile) exotropia (Matsuo et al., 2003; Matsuo et al., 2005; Matsuo & Matsuo, 2005; Matsuo & Matsuo, 2007; Matsuo et al., 2007). Furthermore, patients with the same clinical entity or clinical diagnosis show varying degrees of manifestations, not only in horizontal and vertical deviations but also in the state of binocular vision. Under the circumstances, one way to define comitant strabismus is as a disease with abnormal binocular vision, namely abnormalities in simultaneous perception, fusion and stereopsis.
In our ongoing research, different clinical entities of primary and non-syndromic comitant strabismus were analyzed altogether in chromosomal mapping and SNP typing (Fujiwara et al., 2003; Shaaban et al., 2009a; Shaaban et al., 2009b; Matsuo, 2015). In other words, the presence or the absence of a phenotype “strabismus” was used as a single phenotypic descriptor in the genetic statistical analysis. This approach was justified by the fact that in our previous study the same chromosomal susceptibility loci were replicated in stratified groups of the families either with esotropia or with exotropia (Shaaban et al., 2009a).
In the Japanese population, exotropia is more prevalent than esotropia (Matsuo & Matsuo, 2005; Matsuo & Matsuo, 2007; Matsuo et al., 2009b; Matsuo et al., 2010), in contrast with the Caucasian population which shows higher prevalence of esotropia. A common genetic mechanism is assumed to give rise to exotropia and esotropia since both entities of comitant strabismus share abnormal binocular vision as a phenotype. In addition, there are indeed families which show mixed phenotypes of exotropia and esotropia, as observed in this study: one member shows exotropia and another member shows esotropia in a family. Abnormal activities of unknown genes in the central nervous system might be responsible for the abnormal binocular vision in patients with comitant strabismus.
The large number of Mendelian errors is a limitation of this study. The original data sets with Mendelian errors were used in TDT analyses (Spielman & Ewens, 1996; Gordon et al., 2004). In contrast, the error-free data sets were used in TDTae and linkage analyses under dominant and recessive inheritance (Lathrop et al., 1984). Starting from the same original data sets should favor consistent results, but it does not necessarily mean that the data actually entering each analysis are the same. The difference lies merely in how the errors are handled: either before the software is applied, by deleting a whole family or SNP, or during the analysis, by ignoring a single genotype. In the present study, permutation tests were done to check the robustness of the results. Therefore, the results should be affected mainly by the methods, and would not be determined by the number of Mendelian errors.
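The text does not describe how the permutation tests were set up. One common way to obtain an empirical P value for a TDT result is sketched below: under the null hypothesis each heterozygous parent transmits either allele with probability 0.5, so the transmitted count is re-drawn at random many times and the observed statistic is compared with the simulated ones. This is an illustrative Monte Carlo scheme under that assumption, not the procedure used in the study; the function name, the number of replicates, and the seed are arbitrary.

```python
import random

def tdt_permutation_p(transmitted, untransmitted, n_perm=10000, seed=0):
    """Empirical P value for the TDT statistic by random re-assignment.

    Each informative transmission is re-drawn as transmitted/untransmitted
    with probability 0.5 (the null hypothesis of no linkage or association).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    rng = random.Random(seed)                 # fixed seed for reproducibility
    observed = (transmitted - untransmitted) ** 2 / n
    hits = 0
    for _ in range(n_perm):
        b = sum(rng.random() < 0.5 for _ in range(n))  # simulated transmitted count
        stat = (2 * b - n) ** 2 / n                    # (b - c)**2 / (b + c), c = n - b
        if stat >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical counts, not taken from the study.
p_empirical = tdt_permutation_p(30, 15, n_perm=2000, seed=1)
```

An empirical P value that agrees with the asymptotic chi-square P value suggests the result does not hinge on the large-sample approximation.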
Both MGST2 and WNT2 are known to be expressed in the brain (Jakobsson, Mancini & Ford-Hutchinson, 1996; Cadigan & Nusse, 1997) and are likely to be involved in the development of comitant strabismus. Different analytical methods shed light on the data from different angles; therefore, it is useful to apply more than one type of analysis. Strict Mendelian application, as in this study, might not be appropriate in multifactorial disorders such as comitant strabismus, but would certainly provide a step toward detecting genetic risks of the disease. Since proof of a responsible gene in a multifactorial disorder is difficult to obtain in animal experiments, a different approach in patients, such as whole exome sequencing, would provide support for the present results of SNP typing. Further functional studies are necessary to clarify the mechanisms by which the two genes influence susceptibility to comitant strabismus. Finally, it should be noted that there is a limitation in applying the eQTL data to the present study, since they were obtained in analyses of blood cells (Higasa et al., 2016).
This study, using different analytical methods for genetic statistics, provides evidence that MGST2 and WNT2 are potential candidate genes for comitant strabismus in the Japanese population.