Note that a Preprint of this article also exists, first published January 11, 2015.


Selenium (Se) is an essential trace element required for selenocysteine (Sec) residues inserted during mRNA translation into Se dependent proteins, termed selenoproteins (Brigelius-Flohé, 1999). Selenocysteine is a relatively rare Se containing analogue of the amino acid cysteine (Cys) (Papp et al., 2007; Penglase, 2014). The number of genes coding for selenoproteins varies among species, with mammals having 24 to 25, birds 25, and bony fish 35 to 38 (Mariotti et al., 2012). Most selenoproteins are redox enzymes that contain a single Se atom present within a catalytically active Sec residue (Papp et al., 2007). An exception is the Se rich glycoprotein, selenoprotein P (SEPP1; aka SeP, SEPP, SEPP1a), which in vertebrates contains 7 to 18 Sec residues, depending on the species (Lobanov, Hatfield & Gladyshev, 2008). The high Sec content of SEPP1 is thought to facilitate Se distribution throughout the body. In mammals, the liver is a major site of SEPP1 expression, where it is synthesised utilising Se obtained from food. Hepatic SEPP1 is then secreted into the blood plasma (Kato et al., 1992). Of the Se present in the bioavailable pool, plasma SEPP1 accounts for around 80% of the total Se in plasma (Hill et al., 1996; Hill et al., 2007) and 8% of the total body Se (Read et al., 1990). Tissues utilise a combination of receptor mediated endocytosis and pinocytosis to obtain SEPP1 from the plasma, after which it is catabolised to release Se for de novo selenoprotein synthesis (Burk & Hill, 2009; Burk et al., 2013).

Several features of SEPP1 are conserved among vertebrates, including (i) a single N-terminal domain Sec residue present within a thioredoxin like motif (UXXC, where U is Sec), (ii) a histidine rich region in the mid region of the protein, and (iii) an apolipoprotein E receptor-2 (APOER2; aka LRP8) binding site followed by five Sec residues in proximity to the C-terminal (Fig. 1) (Lobanov, Hatfield & Gladyshev, 2008). APOER2 is widely expressed in human tissues (Kim et al., 2014). APOER2 facilitated uptake of plasma SEPP1 is an essential (testes) or important (brain and foetus) pathway in some, but not all (muscle, kidney, liver or whole body) tissues for maintaining Se homeostasis in vivo (Burk et al., 2007; Olson et al., 2007; Hill et al., 2012; Burk et al., 2013). In contrast, the histidine rich regions of SEPP1 presumably interact with multiple receptors, including megalin (LRP2). A megalin facilitated uptake pathway minimises excretion of Se by binding SEPP1 fragments in the kidney (Olson et al., 2008; Kurokawa et al., 2014) and plays a role in maintaining tissue Se homeostasis (Steinbrenner et al., 2006; Chiu-Ugalde et al., 2010). Additionally, the histidine rich regions are associated with the heparin binding properties of SEPP1. It is postulated that these heparin binding properties allow the N-terminal Sec of SEPP1 to provide antioxidant protection for endothelial cells at sites of inflammation (Hondal et al., 2001; Saito et al., 2004).

Figure 1: The receptor binding sites and selenocysteine (Sec) residues of vertebrate selenoprotein P (SEPP1).

From the N-terminal side, SEPP1 comprises a conserved N-terminal domain Sec residue, followed by several proposed heparin binding sites which include a histidine rich region. Following this, there is the shorter Sec residue rich C-terminal domain which contains an APOER2 binding site. The C-terminal domain can be further divided into two subdomains. The first subdomain exists on the N-terminal side of the APOER2 binding site and contains a region with a low conservation of Sec residues among vertebrates (mainly due to Sec to cysteine (Cys) conversions (Lobanov, Hatfield & Gladyshev, 2008)). The second subdomain is located downstream of the APOER2 binding site and contains five Sec residues that are conserved across vertebrate species. Several species of amphibians also have an additional Sec residue in the C-terminal end of this region (Lobanov, Hatfield & Gladyshev, 2008). The proposed heparin binding sites/histidine rich regions are based on those found in rat SEPP1 by Hondal et al. (2001). Similar histidine rich regions are found in the SEPP1s of other species. Cys residues outside the C-terminal domain are not shown. Red lines, conserved Sec residues; Black lines, Cys or Sec residues; Green lines, Cys/Sec residues within the APOER2 binding site; Green box grids, proposed heparin binding sites.

In contrast, other domains in SEPP1 have low conservation among species. For example, single-nucleotide mutations causing Sec to cysteine (Cys) substitutions in the SEPP1 C-terminal domain upstream of and including the APOER2 binding site have occurred frequently throughout the vertebrate lineage (Fig. 1) (Lobanov, Hatfield & Gladyshev, 2008). The reason why Sec content plasticity is observed only within this region of SEPP1 is unclear, but it appears to be responsible for most of the variation in SEPP1 Sec content among vertebrates (Lobanov, Hatfield & Gladyshev, 2008). Furthermore, why SEPP1 Sec content differs among species also remains unknown. Several lines of evidence suggest vertebrate SEPP1 Sec number may be a direct function of Se utilisation. For instance, vertebrate SEPP1 Sec content correlates positively with selenoproteome size, tissue Se levels, and Se bioavailability in the environment (Lobanov, Hatfield & Gladyshev, 2008).

If a direct relationship between SEPP1 Sec content and Se requirements exists, the SEPP1 Sec content of a species could predict its Se requirements, or vice versa. In doing so, this would provide a new insight into how the genome affects nutrient utilisation. Additionally, such a relationship would allow considerable scope for implementing the 3R’s (replace, reduce, refine). For example, this relationship would indicate the dietary Se levels to focus on when investigating the Se requirements for novel species. Such knowledge would reduce both the number of animals required and the risk of exposure to Se levels that may compromise animal welfare in such experiments.

In the following work, we compared the Sec content of mammalian, avian and bony fish SEPP1s predicted in silico with their Se requirements determined in vivo. We found a strong positive non-linear correlation (R² = 0.78) between the two, suggesting Se requirements can be predicted from the Sepp1 gene sequence. The correlation was dictated by the Sec content within the C-terminal domain upstream of and including the APOER2 binding site of SEPP1s. The model was limited, as it could not predict Se requirements in species whose SEPP1 Sec content was >15 residues, as found in the majority of bony fish species. The predicted Se requirements for vertebrate species based on their SEPP1 Sec content are provided.

Materials and Methods

The in silico predicted species specific Sec contents of SEPP1 (SEPP1a in fish) were obtained from Lobanov, Hatfield & Gladyshev (2008), the open access selenoprotein database (Romagné et al., 2014) or by analysing genomic Sepp1 sequences (NCBI) for Sec content with open access online software for this purpose (Mariotti et al., 2013). The SEPP1 Sec contents of five bony fish species, loach (Paramisgurnus dabryanus), cobia (Rachycentron canadum), grouper (Epinephelus malabaricus), gibel carp (Carassius auratus gibelio) and yellowtail kingfish (Seriola lalandi), were assumed to be within the 15 to 17 residue range found for fish in general (Lobanov, Hatfield & Gladyshev, 2008) (see Table S2). Protein alignments and a phylogenetic tree for vertebrate SEPP1 are provided in Figs. S2 and S3, respectively. The species specific Se requirement data were obtained from published studies and from the National Research Council of the USA (NRC) nutrient requirement reports (NRC, 1963; Hilton, Hodson & Slinger, 1980; Gatlin & Wilson, 1984; NRC, 1985; NRC, 1994; NRC, 1995; Weiss et al., 1996; NRC, 1997; Weiss et al., 1997; Lei et al., 1998; Wedekind, Yu & Combs, 2004; Lin & Shiau, 2005; Fischer et al., 2008; Jensen & Pallauf, 2008; Sunde et al., 2009; Liu et al., 2010; Sunde & Hadley, 2010; Han et al., 2011; NRC, 2011; Le & Fotedar, 2013; Hao, Ling & Hong, 2014; Penglase et al., 2014). See Table S1 for further information regarding these animal Se requirement studies. Where multiple Se requirement studies for a species were available, the dietary Se level that fulfilled the requirements of the actively growing juvenile stage was selected. Data were analysed in GraphPad Prism (GraphPad Software, San Diego, California, USA, V. 5.04). Data were fitted with a horizontal line (null hypothesis) and then tested against more complex models in the following sequence: first order polynomial, second order polynomial and five parameter logistic equation (5PL) asymmetric sigmoidal, until the simplest model that explained the data was found (p < 0.05). Other vertebrate classes (reptiles and amphibians) were excluded from the analyses because of the absence of Se requirement studies.
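The stepwise comparison of a simpler model against a more complex one can be sketched as an extra sum-of-squares F test. This is an assumption about the kind of test behind the reported p values; the text names only GraphPad Prism and the p < 0.05 threshold. The data points below are hypothetical placeholders, not values from Table S1, and only the first step (horizontal line versus first order polynomial) is shown.

```python
# Sketch of a nested-model comparison (extra sum-of-squares F test).
# ASSUMPTION: the data below are invented for illustration only.

def ss_horizontal(ys):
    """Residual sum of squares for a horizontal line (the mean of y)."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)

def ss_straight_line(xs, ys):
    """Residual sum of squares for an ordinary least squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def extra_ss_f(ss_simple, df_simple, ss_complex, df_complex):
    """F ratio: relative drop in residual SS per parameter added,
    divided by the residual variance of the more complex model."""
    numerator = (ss_simple - ss_complex) / (df_simple - df_complex)
    denominator = ss_complex / df_complex
    return numerator / denominator

# Hypothetical Se requirements (x) and Sec counts (y)
xs = [0.05, 0.10, 0.15, 0.20, 0.25]
ys = [7.0, 10.0, 12.0, 15.0, 16.0]

n = len(xs)
ss0 = ss_horizontal(ys)          # 1 fitted parameter -> n - 1 degrees of freedom
ss1 = ss_straight_line(xs, ys)   # 2 fitted parameters -> n - 2 degrees of freedom
f_ratio = extra_ss_f(ss0, n - 1, ss1, n - 2)
print(f"F({1}, {n - 2}) = {f_ratio:.1f}")
```

The F ratio would then be compared with an F distribution on (1, n − 2) degrees of freedom to obtain a p value; the same function applies to each later step in the sequence once the residual sum of squares of the more complex model is known.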

Results and Discussion

The selenocysteine content of selenoprotein P correlates strongly with selenium requirements

The Sec content of SEPP1 was identified for a total of 14 species (three bony fish, three birds and eight mammals) for which the Se requirements are also published (Table S1). Using these data, a positive non-linear correlation (R² = 0.78) was found between Se requirements and SEPP1 Sec number (Fig. 2). This reflects the positive correlation between SEPP1 Sec content and selenoprotein number in vertebrates found previously (Kryukov & Gladyshev, 2000; Lobanov, Hatfield & Gladyshev, 2008). A linear relationship between Se requirements and SEPP1 Sec content was moderately strong (R² = 0.68) but was statistically rejected (p = 0.048) in favour of the non-linear model mentioned above.

Figure 2: The relationship between the selenocysteine content of selenoprotein P and selenium requirements in vertebrates.

The solid line with the solid circles (●) is the best fit model for the SEPP1 Sec content versus Se requirements (mg Se/kg dry matter (DM)) from 14 species with representatives from the mammalian, bird and bony fish classes where the genome sequences were available (second order polynomial, R² = 0.78, y = 3.3 + 93x − 175x²). The broken line represents the same data modelled with an additional five bony fish species with known Se requirement levels (○), but unannotated genomes. SEPP1 Sec content in these fish was assumed to be within the likely range of 15–17 Sec residues found for fish in general (5PL asymmetric sigmoidal, R² = 0.86, y = −9.98 + 26.9/(1 + 10^((−2.23397 − X) × 4.661))^1.910). Shaded boxes group animals within classes. The X axis is log transformed.

All fish annotated to date have SEPP1 (aka SEPP1a in fish) with 15 to 17 Sec residues (see Table S2). Based on this, an additional five bony fish species with known Se requirements were assumed to have SEPP1s with 17 Sec residues and added to the data set, which was then re-analysed. This resulted in an asymmetric sigmoidal trend with a plateau at 17.0 (Fig. 2), suggesting that a species' SEPP1 is only useful for predicting Se requirements prior to this plateau (≤16 Sec residues). When a species' SEPP1 has >16 Sec residues, as is found in many fish species, this curve predicts a minimum requirement (0.24 mg Se/kg dry matter (DM)) but not a maximum (there is no correlation between SEPP1 Sec content and Se requirements above this level). Modelling the data with alternative SEPP1 Sec contents (15 or 16 Sec) for these five fish species shifts the plateau height towards those values, but retains the general features of the model. The asymmetric sigmoidal model (Fig. 2, broken line) differs from the second order polynomial model (Fig. 2, solid line), which only predicts Se requirements for species with SEPP1s containing up to 15 Sec residues (0.20 mg Se/kg, Table 1).

Table 1: The Se requirements (mg Se/kg DM) predicted by the model (Fig. 2, solid line) with changes in the selenocysteine (Sec) content of selenoprotein P (SEPP1).

Class        Sec no.    Predicted Se requirement (a)
? (b)        6          0.03 ± 0.03
Mammals      7          0.04 ± 0.03
             8          0.06 ± 0.02
             9          0.07 ± 0.02
             10         0.09 ± 0.01
             11         0.10 ± 0.02
             12         0.12 ± 0.03
             13         0.14 ± 0.04
             14         0.17 ± 0.05
             15         0.20 ± 0.04
Bony fish    16+        >0.20

(a) mg Se/kg feed DM, mean (±95% confidence interval, when shown).
(b) There are currently no known species with full length SEPP1 containing 6 Sec residues.

DOI: 10.7717/peerj.1244/table-1

The model (Fig. 2) demonstrates the broad range of Se requirements found for bony fish (0.25 to 5.56 mg Se/kg dry feed) that occurs over a small range of SEPP1 Sec contents (15 to 17 Sec residues). The reasons for this are unknown. Limitations to increasing SEPP1 Sec content above 17 residues may have led fish to utilise regulatory mechanisms to increase Se supply to peripheral tissues. For example, Sepp1 mRNA expression is elevated in fish, particularly in the kidneys, in comparison to mammals (Lobanov, Hatfield & Gladyshev, 2008). This suggests plasma SEPP1 in fish may be replenished by SEPP1 synthesised from Se scavenged in the kidneys. On the other hand, the single nucleotide mutation required to change a Sec to a Cys codon (Lobanov, Hatfield & Gladyshev, 2008) may have allowed mammals to decrease SEPP1 Sec content in line with Se requirements, resulting in the large range of SEPP1 Sec contents (7 to 15 Sec residues) found in mammals. The Se requirements versus SEPP1 Sec content in vertebrates predicted by the second order polynomial model (Fig. 2, solid line) are provided in Table 1.
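As a rough illustration of how the predictions in Table 1 follow from the solid-line model, the sketch below inverts the second order polynomial from the Fig. 2 legend, read here as Sec = 3.3 + 93x − 175x², where x is the Se requirement in mg Se/kg DM (interpreting the legend's flattened "x2" as x squared is an assumption). The function names are hypothetical, and the confidence intervals in Table 1 are not reproduced.

```python
import math

# Coefficients as read from the Fig. 2 legend (ASSUMPTION: "x2" means x squared):
# Sec = A + B*x + C*x**2, with x the Se requirement in mg Se/kg DM.
A, B, C = 3.3, 93.0, -175.0

def sec_from_se(se_requirement):
    """Forward model: predicted SEPP1 Sec number for a given Se requirement."""
    return A + B * se_requirement + C * se_requirement ** 2

def se_from_sec(sec_number):
    """Inverse model on the rising branch of the parabola.

    Solves C*x**2 + B*x + (A - sec_number) = 0 and returns the smaller
    root, or None when the Sec number lies above the curve's maximum
    (no real solution), i.e. where this model cannot predict.
    """
    discriminant = B * B - 4.0 * C * (A - sec_number)
    if discriminant < 0:
        return None
    return (-B + math.sqrt(discriminant)) / (2.0 * C)

for sec in range(6, 17):
    predicted = se_from_sec(sec)
    if predicted is None:
        print(f"{sec} Sec: outside the range of the polynomial model")
    else:
        print(f"{sec} Sec: {predicted:.2f} mg Se/kg DM")
```

Rounded to two decimals, the values for 6 to 15 Sec residues agree with the means listed in Table 1, and the absence of a real root at 16 Sec residues matches the statement that the polynomial only predicts requirements for SEPP1s with up to 15 Sec residues.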

It is essential to note that the correlation between SEPP1 Sec content and Se requirements does not prove causation. Other factors may be involved in the simultaneous increase in SEPP1 Sec content and Se requirements observed in this study, such as the environmental availability of Se. For example, within vertebrate classes, species with Sec poor SEPP1s are often found in habitats with lower background levels of Se. Both guinea pigs and naked mole rats (Heterocephalus glaber) have Sec poor SEPP1s (7 residues), low Se requirements (Jensen & Pallauf, 2008; Kasaikina et al., 2011) and inhabit the Andes or East Africa respectively, both regions of low Se status (FAO, 1992; Rachel et al., 2013). Freshwater habitats often have lower background levels of Se than marine habitats (Combs & Combs, 1986; Santos et al., 2015) and freshwater fish have on average less Sec in SEPP1 than marine fish (Table S2). Furthermore, SEPP1 appears to have originated in invertebrates, but thus far SEPP1 (along with a greater number of selenoproteins) has only been found in invertebrates inhabiting marine environments (Lobanov, Hatfield & Gladyshev, 2009; Liang, Jiazuan & Qiong, 2012). Added to this, if a direct relationship does exist between SEPP1 Sec content and Se requirements, it is unclear which factor is causing the other.

Overall, we hypothesise that environmental Se availability was an evolutionary pressure to decrease Se utilisation as animals progressed from Se rich marine environments into fresh water and terrestrial habitats where environmental Se levels are generally lower. Selection then occurred for decreased Se utilisation (Se requirements), which resulted in decreased selection pressure on maintaining, and then decreases in, SEPP1 Sec number. The results were new species-specific equilibriums between environmental Se availabilities, Se requirements and SEPP1 Sec contents.

A hypothesis for the Sec number plasticity or conservation in different domains of vertebrate SEPP1

As discussed, most of the difference in the SEPP1 Sec content between species is a result of differences in the Sec content found upstream of and including the APOER2 binding site within the C-domain of SEPP1 (Fig. 1 and Table S2). When we analysed the Sec content in this region in relation to a species' Se requirement (Fig. S1), we found a positive correlation similar to that found for full-length SEPP1 and Se requirements (Fig. 2), supporting this statement. Recently it was found that SEPP1 Sec residues closer to the C-terminal are translated with greater efficiency than those towards the N-terminal (Shetty, Shah & Copeland, 2014). Premature termination of SEPP1 translation at Sec codons appears to be a common event. For instance, four rat SEPP1 isoforms have been identified in plasma, whereby in addition to the full length protein, shorter variants are synthesised when translation is terminated at the second, third or seventh Sec codon (Ma et al., 2002). Thus, on average each plasma SEPP1 in mice contains 5 Sec residues, not the 10 Sec residues expected if only the full length protein were present (Hill et al., 2007). As a consequence of this, a proportion of translated SEPP1 proteins will not contain the APOER2 binding site (Fig. 1).

Thus, as discussed, we hypothesise that decreases in Se requirements are an evolutionary adaptation to Se availability. Secondly, we hypothesise that the Se requirements of the brain are similar among species on a weight basis, despite differences in the Se requirements of the whole body. For instance, compared to mice, naked mole rats have lower levels (−30 to −75%) of Se in most tissues except the brain (Kasaikina et al., 2011). Lastly, low Se availability can stall translation of selenoproteins at Sec codons (Weiss Sachdev & Sunde, 2001), and may be a reason for the truncated forms of SEPP1 translated in vivo. Thus, Sec to Cys substitutions in SEPP1 may have occurred specifically in the region upstream of and including the APOER2 binding site because they aid the translation of full-length protein under Se limiting conditions, such as those faced by naked mole rats and guinea pigs. The subsequent retention of the APOER2 binding site would allow the continuation of a controlled Se supply to critical organs, such as the brain, that utilise APOER2 mediated uptake of SEPP1.


The Sec content of SEPP1 correlates with Se requirements in vertebrates with ≤15 Sec residue SEPP1s. No correlation occurred between SEPP1 Sec content and Se requirements for species with >15 Sec residue SEPP1s; however, a minimum Se requirement of 0.20 mg Se/kg DM for these species was predicted. This study suggests that genome evolution is affected directly by nutrient availability in the environment, and provides novel evidence that the genomic sequence can be used to predict a nutrient requirement.

Supplemental Information

(Raw data equivalent). The SEPP1 Sec content, selenium requirements, and the biomarkers, statistical methods and the selenium species used to assess the selenium requirements of species included in this study

Abbreviations: Sec, selenocysteine; SEPP1, selenoprotein P; TXNRD, thioredoxin reductase; GPX, glutathione peroxidase; BLR, broken line regression; Na2SeO4, sodium selenate; Na2SeO3, sodium selenite; Se-yeast, selenoyeast; NaHSeO3, sodium hydrogen selenite; SeMet, selenomethionine.
* Methods utilised to analyse tissue GPX activity are unable to distinguish between isoforms, so are listed as total GPX activity. However, in mammals GPX1 is responsible for the majority of total GPX activity (Brigelius-Flohe et al., 2002).
1 The authors of the guinea pig study state a Se requirement of 0.08 mg Se/kg DM, which includes a safety margin above the 0.06 mg Se/kg DM predicted with BLR.
2 Data from actively growing juvenile animals were utilised in preference to data from adults.
3 Sec contents of these species were based on closely related species (gibel carp and loach are both cyprinids, as are common carp (Cyprinus carpio) and zebrafish, which both have SEPP1 (SEPP1a) with 17 Sec residues) or on salt water fish (both green spotted pufferfish (Tetraodon nigroviridis) and fugu (Takifugu rubripes) have SEPP1 with 17 Sec residues). Irrespective of this, the range of Sec residues found in fish SEPP1 is small, being 15 to 17 (Lobanov, Hatfield & Gladyshev, 2008).

DOI: 10.7717/peerj.1244/supp-1

The amino acid sequences of SEPP1 (aka SEPP1a) in vertebrate species included in this study and closely related species (fish)

The total Sec (U) and the Sec content upstream of and including the APOER2 binding site (E-CQC—-A; shaded in yellow) within the C-terminal domain (SEPP1←APOER2), and the region downstream of the APOER2 binding site (SEPP1APOER2) are also shown.
* Sequence obtained at then searched for Sec and SECIS elements using
** Sequences obtained from Lobanov, Hatfield & Gladyshev (2008).
Abbreviations: SEPP1←APOER2, Sec residues in the C-terminal domain of full length SEPP1 in and upstream of the APOER2 binding site (E-CQC—-A; in fish may range between E-CQC–A to E-CQC—–A); SEPP1APOER2, Sec residues in full length SEPP1 upstream of the APOER2 binding site.

DOI: 10.7717/peerj.1244/supp-2

The relationship between the selenocysteine content within specific domains of selenoprotein P and selenium requirements

The solid line with the solid circles (●) is the best fit model for the number of Sec residues found upstream of and including the APOER2 binding site in the C-terminal of SEPP1 versus the selenium requirements (mg Se/kg DM) in mammals and bony fish. The broken line represents the same data modelled with an additional five bony fish species with known Se requirement levels (○), but unannotated genomes as described in Fig. 2. The solid line is linear, R² = 0.82, y = 1 + 35x, while the broken line is 5PL asymmetric sigmoidal, R² = 0.92, y = −6.54 + 17.5/(1 + 10^((−1.75538 − X) × 5.851))^2.99910. The X axis is log transformed.

DOI: 10.7717/peerj.1244/supp-3

Multiple sequence alignments for vertebrate SEPP1

Multiple sequence alignment of vertebrate SEPP1. Alignment performed with Jalview (Andrew et al., 2009), using MUSCLE (Robert, 2004) with standard settings. Sequences not listed in Table S2 were obtained from the selenoprotein database (Romagné et al., 2014). MUSCLE does not recognize the amino acid symbol for selenocysteine (u), replacing u with x during alignment analysis. All x's within the sequences represent selenocysteine residues. Animals included in the manuscript are provided in the top list of sequences, followed by the invertebrate Lottia gigantea sequence (Liang, Jiazuan & Qiong, 2012) used as an outgroup for building a phylogenetic tree, and additional mammal sequences available on selenoDB (version 2.0) or the naked mole rat (Heterocephalus glaber, XP_004848622.1) genome database.

DOI: 10.7717/peerj.1244/supp-4

Phylogenetic tree of vertebrate selenoprotein P (SEPP1)

A UPGMA tree is displayed. Support values shown are the UPGMA bootstrap values. The tree was configured using standard settings in MEGA Ver. 6.06 (Koichiro et al., 2013). Sequences were obtained as described in Fig. S3. Lottia gigantea was utilised as a non-vertebrate outgroup.

DOI: 10.7717/peerj.1244/supp-5