The relationship between cells and tissue microenvironments is a topic of vital importance for cancer biology. Because of rapid cellular proliferation and irregular vascularization, tumors often develop regions of hypoxia (Höckel & Vaupel, 2001). Tumor microenvironments also exhibit abnormal ranges of other physical-chemical variables, including hydration state (McIntyre, 2006; Abramczyk et al., 2014).
Some aspects of the complex metazoan response to hypoxia are mediated by hypoxia-inducible factor 1 (HIF-1). HIF-1 is a transcription factor that is tagged for degradation in normoxic conditions. Under hypoxia, the degradation of HIF-1 is suppressed; HIF-1 can then enter the nucleus and activate the transcription of downstream targets (Semenza, 2003). Indeed, transcriptional targets of HIF-1 are found to be differentially expressed in proteomic datasets for laboratory hypoxia (Cifani et al., 2011; McMahon et al., 2012). However, proteomic studies of cells in hypoxic conditions provide many examples of proteins that are not directly regulated by HIF-1 (McMahon et al., 2012; Fuhrmann et al., 2013), and cancer proteomic datasets also include many proteins that are not known to be regulated by HIF-1.
The complexity of the underlying regulatory mechanisms (McMahon et al., 2012) and the large differences between levels of gene expression and protein abundance (van den Beucken et al., 2011; Cifani et al., 2011; Ho et al., 2016) present many difficulties for a bottom-up understanding of global proteomic trends. As a counterpart to molecular explanations, a systems perspective can incorporate higher-level constraints (Drack & Wolkenhauer, 2011). A commonly used metaphor in systems biology is attractor landscapes. The basins of attraction are defined by dynamical systems behavior, but in many cases are analogous to minimum-energy states in thermodynamics (Emmeche, Koppe & Stjernfelt, 2000; Enver et al., 2009). Nevertheless, little attention has been given to the thermodynamic potential that is inherent to the compositional difference between the up-expressed and down-expressed proteins in proteomic experiments. Such a high-level perspective may require concepts and language that differ from those applicable to molecular interactions (Ellis, 2015).
To better understand the microenvironmental context for compositional changes, this study uses proteomic data as input into a descriptive thermodynamic model. First, a compositional analysis of differentially (up- and down-) expressed proteins identifies consistent trends in the oxidation and hydration states of proteomes of colorectal cancer (CRC), pancreatic cancer, and cells exposed to hypoxia or hyperosmotic stress. These results lay the groundwork for using a thermodynamic model to quantify environmental constraints on the potential for proteomic transformation. Finally, the Discussion section explores some implications of the hypothesis that elevated synthesis of lipids provides an electron sink for the oxidation of proteomes. In this situation, some cancer systems may develop an abnormally large redox disproportionation between pools of cellular biomacromolecules.
Tables 1–4 present the sources of data. Protein IDs and expression (up/down or abundance ratios) were found in the literature, often being reported in the supporting information (SI) or supplementary (suppl.) tables. In some cases, source tables were further processed, using fold-change and significance cutoffs that, where possible, are based on statements made in the primary publication. The data are stored as *.csv files in the R package canprot, which was developed during this study (see http://github.com/jedick/canprot) and is provided as Dataset S1.
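Table 1. Proteomic datasets for colorectal cancer (n1: down-expressed proteins; n2: up-expressed proteins).

| Dataset | n1 | n2 | Description | Note |
|---|---|---|---|---|
| ΩcAⒶ | 87 | 81 | CIN C/A | a |
| ΩdAⒶ | 157 | 76 | MIN C/A | a |
| ΩeAⒶ | 43 | 56 | biomarkers up/down | |
| ΩfAⒶ | 48 | 166 | stage I/normal | b |
| ΩgAⒶ | 77 | 321 | stage II/normal | b |
| ΩhAⒶ | 61 | 57 | microdissected T/N | b |
| ΩlAⒶ | 63 | 131 | stage III/normal | a |
| ΩmAⒶ | 42 | 26 | stage IV/normal | a |
| ΩuAⒶ | 55 | 68 | CM T/N | b |
| ΩvAⒶ | 33 | 37 | stromal T/N | a |
| ΩwAⒶ | 51 | 55 | chromatin-binding C/A | |
| ΩxAⒶ | 58 | 65 | epithelial A/N | |
| ΩyAⒶ | 44 | 210 | tissue secretome T/N | a |
| ΩzAⒶ | 113 | 66 | membrane enriched T/N | |
| ΩDAⒶ | 123 | 75 | stromal AD/NC | a |
| ΩEAⒶ | 125 | 60 | stromal CIS/NC | a |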
carcinoma or adenocarcinoma
adenomatous colon polyps
carcinoma in situ
invasive colonic carcinoma
non-neoplastic colonic mucosa
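Table 2. Proteomic datasets for pancreatic cancer.

| Dataset | n1 | n2 | Description | Note |
|---|---|---|---|---|
| ΩeAⒶ | 28 | 29 | T/N | |
| ΩgAⒶ | 207 | 152 | FFPE T/N | a |
| ΩiAⒶ | 38 | 47 | FFPE T/N | c |
| ΩjAⒶ | 78 | 57 | T/N | a |
| ΩkAⒶ | 257 | 456 | T/N | a |
| ΩpAⒶ | 208 | 219 | T/N (no DM) | a |
| ΩrAⒶ | 227 | 148 | LCM PDAC/ANT | c |
| ΩtAⒶ | 35 | 51 | mouse 2.5 w T/N | a |
| ΩuAⒶ | 40 | 73 | mouse 3.5 w T/N | a |
| ΩvAⒶ | 49 | 84 | mouse 5 w T/N | a |
| ΩwAⒶ | 37 | 108 | mouse 10 w T/N | a |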
Abbreviations: PDAC, pancreatic ductal adenocarcinoma; ANT, adjacent normal tissue.
Table 3. Proteomic datasets for hypoxia or 3D culture.

| Dataset | n1 | n2 | Description | Note |
|---|---|---|---|---|
| ΩbAⒶ | 41 | 22 | placental secretome | |
| ΩdAⒶ | 87 | 28 | DU145 | a |
| ΩeAⒶ | 29 | 21 | SK-N-BE(2)c; IMR-32 | |
| ΩfAⒶ | 53 | 65 | H9C2 | b |
| ΩgAⒶ | 409 | 337 | MCF-7 SPH P5 | |
| ΩhAⒶ | 248 | 214 | MCF-7 SPH P2 | |
| ΩiAⒶ | 48 | 52 | SPH perinecrotic | a |
| ΩjAⒶ | 101 | 186 | SPH necrotic | a |
| ΩlAⒶ | 178 | 77 | A431 Hx48 | |
| ΩnAⒶ | 48 | 36 | A431 ReOx | |
| ΩoAⒶ | 141 | 64 | SH-SY5Y | |
| ΩpAⒶ | 65 | 34 | A431 Hx48-S | |
| ΩqAⒶ | 137 | 61 | A431 Hx72-S | |
| ΩrAⒶ | 56 | 49 | A431 ReOx-S | |
| ΩsAⒶ | 74 | 44 | A431 Hx48-P | |
| ΩtAⒶ | 67 | 53 | A431 Hx72-P | |
| ΩwAⒶ | 127 | 292 | HepG2/C3A SPH | |
| ΩyAⒶ | 137 | 64 | U87MG and 786-O | |
| ΩzAⒶ | 129 | 141 | HCT116 transcription | a |
| ΩAAⒶ | 469 | 1024 | HCT116 translation | a |
| ΩBAⒶ | 66 | 50 | adipose-derived SC | a |
| ΩCAⒶ | 65 | 27 | cardiomyocytes CoCl2 | a |
| ΩDAⒶ | 35 | 69 | cardiomyocytes SAL | a |
| ΩEAⒶ | 116 | 225 | HT29 SPH | |
acute promonocytic leukemic cells
rat neuroblastoma cells
prostate carcinoma cells
- SK-N-BE(2)c; IMR-32; SH-SY5Y
rat heart myoblast
breast cancer cells
epithelial carcinoma cells
hypoxia 48 h
hypoxia 72 h
hypoxia 48 h followed by reoxygenation for 24 h
hepatocellular carcinoma cells
renal clear cell carcinoma cells
- HCT116; HT29
colon cancer cells
Table 4. Proteomic datasets for hyperosmotic stress.

| Dataset | n1 | n2 | Description | Note |
|---|---|---|---|---|
| ΩaAⒶ | 38 | 44 | S. cerevisiae VHG 2 h | a |
| ΩbAⒶ | 33 | 62 | S. cerevisiae VHG 10 h | a |
| ΩcAⒶ | 18 | 65 | S. cerevisiae VHG 12 h | a |
| ΩdAⒶ | 63 | 94 | mouse pancreatic islets | |
| ΩeAⒶ | 148 | 44 | adipose-derived stem cells | |
| ΩfAⒶ | 17 | 11 | ARPE-19 25 mM | |
| ΩgAⒶ | 21 | 24 | ARPE-19 100 mM | |
| ΩhAⒶ | 114 | 61 | ECO57 25 °C, aw 0.985 | a |
| ΩiAⒶ | 238 | 61 | ECO57 14 °C, aw 0.985 | a |
| ΩjAⒶ | 263 | 56 | ECO57 25 °C, aw 0.967 | a |
| ΩkAⒶ | 372 | 73 | ECO57 14 °C, aw 0.967 | a |
| ΩlAⒶ | 32 | 39 | Chang liver cells 25 mM | |
| ΩmAⒶ | 19 | 50 | Chang liver cells 100 mM | |
| ΩnAⒶ | 49 | 28 | eel gill | a |
| ΩoAⒶ | 78 | 77 | S. cerevisiae t30a | b |
| ΩpAⒶ | 67 | 67 | S. cerevisiae t30b | b |
| ΩqAⒶ | 87 | 87 | S. cerevisiae t30c | b |
| ΩrAⒶ | 25 | 38 | IOBA-NHC | |
| ΩsAⒶ | 105 | 96 | CAUCR succinate tr. | a |
| ΩtAⒶ | 209 | 142 | CAUCR NaCl tr. | a |
| ΩuAⒶ | 33 | 33 | CAUCR succinate pr. | a |
| ΩvAⒶ | 33 | 27 | CAUCR NaCl pr. | a |
| ΩwAⒶ | 294 | 205 | CHO all | a |
| ΩxAⒶ | 66 | 75 | CHO high | a |
| ΩyAⒶ | 14 | 28 | Yarrowia lipolytica | b |
| ΩzAⒶ | 160 | 141 | Paracoccidioides lutzii | a |
Abbreviations: VHG, very high glucose; ARPE-19, human retinal pigmented epithelium cells; ECO57, Escherichia coli O157:H7 Sakai; IOBA-NHC, human conjunctival epithelial cells; CHO, Chinese hamster ovary cells.
Sequence IDs were converted to UniProt IDs using the UniProt mapping tool (http://www.uniprot.org/mapping/) or the gene ID conversion tool of DAVID 6.7 (https://david.ncifcrf.gov/conversion.jsp). For proteins where the automatic conversions produced no matches, manual searches in UniProt were performed using the gene names or protein descriptions. If specified (i.e., as UniProt IDs with suffixes), particular isoforms of the proteins were used. Obsolete or secondary IDs reported for some proteins were updated to reflect current, primary IDs (uniprot_updates.csv in Dataset S1). Any duplicated IDs listed as having opposite expression ratios were excluded from the comparisons here.
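The exclusion of IDs reported with opposite expression ratios can be sketched as follows. This is a minimal illustration in Python rather than the R code of Dataset S2; the function name `drop_conflicting_ids` and the example IDs are hypothetical.

```python
def drop_conflicting_ids(up_ids, down_ids):
    """Exclude IDs that are listed as both up- and down-expressed.

    Returns the filtered up list, the filtered down list, and the
    sorted list of excluded (conflicting) IDs.
    """
    conflicting = set(up_ids) & set(down_ids)
    up = [i for i in up_ids if i not in conflicting]
    down = [i for i in down_ids if i not in conflicting]
    return up, down, sorted(conflicting)


# Hypothetical IDs, for illustration only
up, down, excluded = drop_conflicting_ids(
    ["ID_A", "ID_B", "ID_C"], ["ID_C", "ID_D"]
)
print(up, down, excluded)
```

Only IDs present in both groups are removed; IDs repeated within a single group are left untouched in this sketch.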
Amino acid sequences of human proteins were taken from the UniProt human reference proteome. Sequences of proteins in other organisms and of human proteins not contained in the reference proteome were downloaded from UniProt or the NCBI website (for one study reporting GI numbers; see Table 4). Amino acid compositions were computed using functions in the CHNOSZ package (Dick, 2008) or the ProtParam tool on the UniProt website. The amino acid compositions are stored in *.Rdata files in Dataset S1.
R (R Core Team, 2016) and R packages canprot (this study) and CHNOSZ (Dick, 2008) were used to process the data and generate the figures with code specifically written for this study, which is provided in Dataset S2.
Measures of compositional oxidation and hydration state
Two compositional metrics that afford a quantitative description of proteomic data, the average oxidation state of carbon (ZC) and the water demand per residue ($\bar{n}_{\mathrm{H_2O}}$), are briefly described here.
The oxidation state of atoms in molecules quantifies the degree of electron redistribution due to bonding; a higher oxidation state signifies a lower degree of reduction. Although calculations of oxidation state from molecular formulas necessarily make simplifying assumptions regarding the internal electronic structure of molecules, such calculations may be used to quantify the flow of electrons in chemical reactions, and the oxidation state concept is useful for studying the transformations of complex mixtures of organic molecules. For example, calculations of the average oxidation state of carbon provide insight on the processes affecting the decomposition of carbohydrate, protein and lipid fractions of natural organic matter (Baldock et al., 2004). Moreover, oxidation state can be regarded as an ensemble property of organic systems (Kroll et al., 2015). See Dick (2016) for additional references where organic and biochemical reactions have been characterized using the average oxidation state of carbon.
Despite the large size of proteins, their relatively simple primary structure means that ZC can be computed using the elemental abundances in any particular amino acid sequence (Dick, 2014):

$$Z_\mathrm{C} = \frac{z - h + 3n + 2o + 2s}{c} \tag{1}$$

In this equation, c, h, n, o, and s are the elemental abundances in the chemical formula for a specific protein with total charge z. Note, however, that ionization by gain or loss of protons alters charge and the number of H equally, so has no effect on the value of ZC; for ease of computation, ZC is calculated here for proteins in their completely non-ionized forms.
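As an illustration, the average oxidation state of carbon can be computed directly from elemental abundances. The following Python sketch is not the CHNOSZ implementation; the function name `zc` is an assumption made for this example, and the small molecules serve only as checks.

```python
def zc(c, h, n, o, s, z=0):
    """Average oxidation state of carbon from elemental abundances.

    c, h, n, o, s: numbers of C, H, N, O, S in the chemical formula
    z: total charge (0 for the completely non-ionized form)
    """
    if c <= 0:
        raise ValueError("formula must contain carbon")
    return (z - h + 3 * n + 2 * o + 2 * s) / c


print(zc(1, 4, 0, 0, 0))   # methane, CH4
print(zc(1, 0, 0, 2, 0))   # carbon dioxide, CO2
print(zc(2, 5, 1, 2, 0))   # glycine, C2H5NO2
```

Adding one proton raises both h and z by one, so the value is unchanged; for example, `zc(2, 6, 1, 2, 0, z=1)` equals `zc(2, 5, 1, 2, 0)`.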
In contrast to the elemental stoichiometry in Eq. (1), a calculation of the hydration state must account for the gain or loss of H2O. In the biochemical literature, “protein hydration” or water of hydration refers to the effective (time-averaged) number of water molecules that interact with a protein (Timasheff, 2002). These dynamically interacting molecules form a hydration shell that has important implications for crystallography and enzymatic function, but hydration numbers have been measured for few proteins and are difficult to compute, especially for the many proteins with unknown tertiary structure. Thus, the structural hydration of proteins identified in proteomic datasets generally remains unquantified.
A different concept of hydration state arises by considering the chemical components that make up proteins. A componential analysis is a method of projecting the composition of a molecule using specified chemical formula units as the components, or basis species. The notion of components is central to chemical thermodynamics (Gibbs, 1875); the choice of components determines the thermodynamic variables (chemical potentials), and a careful choice leads to more convenient representations of the compositional and energetic constraints on reactions (e.g. Zhu & Anderson, 2002).
The components, or basis species, consist of a minimum number of species whose compositions can be linearly combined to represent the composition of any protein. The 20 proteinogenic amino acids are together composed of five elements (C, H, N, O, S), so five basis species are needed to represent the primary sequences of proteins. As noted previously (see references in Dick, 2016), all possible combinations of basis species lead to thermodynamically consistent models, but are differently suited to making interpretations. Dick (2016) proposed using C5H10N2O3, C5H9NO4, C3H7NO2S, O2, and H2O as a basis for assessing compositional differences in proteomes. The first three formulas correspond to glutamine (Q), glutamic acid (E), and cysteine (C).
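To account for protein ionization, a proton can be included in the basis, which is now referred to as “QEC+”. Using the QEC+ basis, the stoichiometric projection of a protein with formula $\mathrm{C}_c\mathrm{H}_{h+z}\mathrm{N}_n\mathrm{O}_o\mathrm{S}_s^{z}$, where z is the charge of the protein and h is the number of H in the fully nonionized protein, is represented by

$$n_{\mathrm{Gln}}\mathrm{C_5H_{10}N_2O_3} + n_{\mathrm{Glu}}\mathrm{C_5H_9NO_4} + n_{\mathrm{Cys}}\mathrm{C_3H_7NO_2S} + n_{\mathrm{O_2}}\mathrm{O_2} + n_{\mathrm{H_2O}}\mathrm{H_2O} + n_{\mathrm{H^+}}\mathrm{H^+} \rightarrow \mathrm{C}_c\mathrm{H}_{h+z}\mathrm{N}_n\mathrm{O}_o\mathrm{S}_s^{z} \tag{R1}$$

To compare the compositions of different-sized proteins, the stoichiometric coefficients in Reaction (R1) can be divided by the sequence length (number of amino acids) of the protein. The length-normalized coefficients, written with an overbar, include the per-residue water demand for formation of a protein ($\bar{n}_{\mathrm{H_2O}}$). This componential “hydration state” is used in this study, and should not be confused with the structural biochemical “protein hydration” mentioned above.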
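The projection into the QEC+ basis amounts to solving the mass and charge balance of the formation reaction. The sketch below solves the balance in closed form with exact fractions; it is an illustration rather than the CHNOSZ code, and the names `qec_projection`, `per_residue`, and the dictionary keys are assumptions made for this example.

```python
from fractions import Fraction


def qec_projection(c, h, n, o, s, z=0):
    """Coefficients of basis species needed to form C_c H_(h+z) N_n O_o S_s^z.

    Basis: glutamine C5H10N2O3, glutamic acid C5H9NO4, cysteine C3H7NO2S,
    O2, H2O, and H+. Here h is the number of H in the nonionized protein.
    """
    c, h, n, o, s, z = (Fraction(v) for v in (c, h, n, o, s, z))
    n_cys = s                            # S balance
    n_gln = n - c / 5 - 2 * s / 5        # from the C and N balances
    n_glu = 2 * c / 5 - s / 5 - n        # from the C and N balances
    n_hplus = z                          # charge balance
    # H balance: 10 Gln + 9 Glu + 7 Cys + 2 H2O + H+ = h + z
    n_h2o = (h - 10 * n_gln - 9 * n_glu - 7 * n_cys) / 2
    # O balance: 3 Gln + 4 Glu + 2 Cys + 2 O2 + H2O = o
    n_o2 = (o - 3 * n_gln - 4 * n_glu - 2 * n_cys - n_h2o) / 2
    return {"Gln": n_gln, "Glu": n_glu, "Cys": n_cys,
            "O2": n_o2, "H2O": n_h2o, "H+": n_hplus}


def per_residue(coeffs, length):
    """Divide each coefficient by the sequence length."""
    return {name: value / length for name, value in coeffs.items()}


# Glycine (C2H5NO2) as a small check of the balance
print(qec_projection(2, 5, 1, 2, 0))
```

Projecting a basis species onto itself returns a coefficient of one for that species and zero for the others, which is a convenient sanity check.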
The primary reason for choosing the QEC+ basis instead of others lies in the relation of the compositional variables representing oxidation and hydration state ($\bar{n}_{\mathrm{O_2}}$ and $\bar{n}_{\mathrm{H_2O}}$) with each other and with ZC. It is important to note that ZC is a measure of oxidation state that is independent of the choice of basis species. Smoothed scatter plots of $\bar{n}_{\mathrm{O_2}}$ vs ZC and $\bar{n}_{\mathrm{H_2O}}$ vs ZC are shown in Fig. S1 for the 21,006 human proteins in the UniProt reference proteome. The plots in the top row of this figure are made using the QEC basis (which is equivalent to the QEC+ basis for the plotted variables) while those in the bottom row are made using the basis species CO2, NH3, H2S, H2O, and O2; these inorganic species are often used to balance reactions in geochemical models. It is apparent from Fig. S1 that, using the QEC basis, $\bar{n}_{\mathrm{O_2}}$ is highly positively correlated with ZC, and $\bar{n}_{\mathrm{H_2O}}$ shows a slight negative correlation with ZC. Accordingly, in the QEC basis, $\bar{n}_{\mathrm{O_2}}$ is a strong indicator of oxidation state, while $\bar{n}_{\mathrm{H_2O}}$ represents a distinct compositional variable. In contrast, the plots in the bottom row of Fig. S1 show a moderate positive correlation between $\bar{n}_{\mathrm{O_2}}$ and ZC and a stronger negative correlation between $\bar{n}_{\mathrm{H_2O}}$ and ZC. Using that basis would therefore weaken the interpretation of $\bar{n}_{\mathrm{O_2}}$ as an indicator of oxidation state and of $\bar{n}_{\mathrm{H_2O}}$ as a distinct compositional variable. The relations among $\bar{n}_{\mathrm{O_2}}$, $\bar{n}_{\mathrm{H_2O}}$, and ZC also vary between basis species consisting of different combinations of amino acids; those differences together with biological considerations support the choice of QEC instead of other amino acids (Dick, 2016).
In summary, Reaction (R1) is not a mechanism for protein synthesis, but is a projection of any protein’s elemental composition into chemical components, i.e., the basis. Compared to a basis composed of simpler inorganic species, the QEC+ basis reduces the projected codependence of oxidation and hydration state in proteins, unfolding a compositional dimension that can enrich a thermodynamic model.
The progression of colorectal cancer (CRC) begins with the formation of numerous non-cancerous lesions (adenoma), which may remain undetectable. Over time, a small fraction of adenomas develop into malignant tumors (carcinoma) (Jimenez et al., 2010; Wiśniewski et al., 2015). Publicly available datasets reporting a minimum of ca. 30 up- and 30 down-expressed proteins for tissue samples of CRC, and one meta-analysis of serum biomarkers, were compiled recently (Dick, 2016). These same datasets are listed in Table 1, with one newer addition (dataset ΩGAⒶ; Liu et al., 2016).
Many aspects of the experimental methods, statistical tests, and bioinformatics analyses used to identify significantly up-expressed and down-expressed proteins vary considerably among studies. The comparisons here are made without any control of this variability. Although particular comparisons may reflect study-specific conditions and methods, visualization of the chemical compositions of proteins for many datasets can reveal general features of the cancer phenotype.
For each dataset, Table 1 lists the numbers of down-expressed (n1) and up-expressed (n2) proteins in cancer relative to normal tissue. For datasets comparing different stages of cancer progression, groups n1 and n2 correspond to the down- and up-expressed proteins in the more advanced stage (e.g., carcinoma) compared to the less advanced stage (e.g., adenoma). Mean values of average oxidation state of carbon (ZC; Eq. (1)) and water demand per residue ($\bar{n}_{\mathrm{H_2O}}$; Reaction (R1)) were calculated for the up- and down-expressed groups of proteins, together with the corresponding mean differences (ΔZC and $\Delta\bar{n}_{\mathrm{H_2O}}$ for the means of up- minus down-expressed groups), p-values, and effect sizes. These values are listed in Table S1. Figure S2 shows the mean values of ZC and $\bar{n}_{\mathrm{H_2O}}$ for the up- and down-expressed proteins together in a single plot (lettered point symbols for down-expressed and arrowheads for up-expressed proteins). Because of the high variability of mean values among datasets, compositional trends between up- and down-expressed proteins are difficult to interpret using Fig. S2. Therefore, the differences in mean values between up- and down-expressed proteins (ΔZC and $\Delta\bar{n}_{\mathrm{H_2O}}$) are plotted in this paper.
Figure 1A shows $\Delta\bar{n}_{\mathrm{H_2O}}$ vs ΔZC for the CRC datasets. The gray boxes cover the range from −0.01 to 0.01 for each of the variables. To draw attention to the largest and most significant changes, filled points and dashed lines indicate mean differences with a p-value (Wilcoxon test) less than 0.05; solid lines indicate mean differences with a common language effect size (CLES) ≥60% or ≤40%. The common language statistic “is the probability that a score sampled at random from one distribution will be greater than a score sampled from some other distribution” (McGraw & Wong, 1992). Here, CLES is calculated as the percentage of pairings of individual proteins with a positive difference in ZC or $\bar{n}_{\mathrm{H_2O}}$ between the up- and down-expressed groups from all possible pairings between the groups. Point symbols are squares if the p-values for both ZC and $\bar{n}_{\mathrm{H_2O}}$ are less than 0.05, or circles otherwise.
The plot illustrates that proteins up-expressed in carcinoma relative to normal tissue most often have significantly higher ZC [ΩgAⒶ ΩkAⒶ ΩlAⒶ ΩnAⒶ ΩpAⒶ ΩrAⒶ ΩsAⒶ ΩuAⒶ ΩvAⒶ ΩlAⒶ], $\bar{n}_{\mathrm{H_2O}}$ [ΩeAⒶ ΩoAⒶ ΩtAⒶ ΩxAⒶ ΩyAⒶ ΩDAⒶ ΩGAⒶ ΩHAⒶ], or both [ΩqAⒶ ΩAAⒶ ΩCAⒶ] (see also Dick, 2016). The red points in the plot highlight the datasets for adenoma/normal comparisons [ΩiAⒶ ΩoAⒶ ΩxAⒶ ΩAAⒶ ΩDAⒶ ΩHAⒶ]. Most of these exhibit a significant positive $\Delta\bar{n}_{\mathrm{H_2O}}$ but not the large increase in ZC found for many of the carcinoma/normal comparisons.
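The mean difference and the common language effect size described above can be sketched in a few lines of Python. This is an illustration, not the canprot code; the function names are assumptions, and counting tied pairs as non-positive is also an assumption, since the text specifies only pairings "with a positive difference".

```python
from statistics import mean


def mean_difference(up, down):
    """Mean of the up-expressed group minus mean of the down-expressed group."""
    return mean(up) - mean(down)


def cles_percent(up, down):
    """Percentage of all up/down pairings in which the up value is larger.

    Tied pairs are counted as non-positive (an assumption of this sketch).
    """
    positive = sum(1 for u in up for d in down if u > d)
    return 100 * positive / (len(up) * len(down))


# Toy values, for illustration only
up_values = [1, 2, 3]
down_values = [0, 2]
print(mean_difference(up_values, down_values))
print(cles_percent(up_values, down_values))
```

With this definition, a CLES of 50% indicates no tendency for either group to have higher values, which is why the thresholds of 60% and 40% are symmetric about 50%.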
Many proteomic studies have been performed to investigate the differences between normal pancreas (NP) and pancreatic adenocarcinoma (PDAC). Proteomic studies also address the inflammatory conditions of autoimmune pancreatitis, which is sometimes misidentified as carcinoma (Paulo et al., 2013), and chronic pancreatitis, which is associated with increased cancer risk (Chen et al., 2007). Searches for proteomic data were aided by the reviews of Pan et al. (2013) and Ansari et al. (2014). Table 2 lists selected datasets reporting at least ca. 25 up-expressed and 25 down-expressed proteins.
The compositional comparisons in Fig. 1B show that up-expressed proteins in pancreatic cancer often have significantly higher ZC [ΩbAⒶ ΩeAⒶ ΩgAⒶ ΩiAⒶ ΩoAⒶ ΩpAⒶ ΩqAⒶ ΩrAⒶ]. A dataset obtained for pancreatic cancer associated with diabetes mellitus (Wang et al., 2013a) [ΩqAⒶ] has both significantly higher ZC and $\bar{n}_{\mathrm{H_2O}}$. Only one dataset, from a study that targeted accessible proteins (Turtoi et al., 2011) [ΩhAⒶ], is characterized by a large negative mean difference of ZC. Some other datasets that do not have significantly different ZC exhibit higher $\bar{n}_{\mathrm{H_2O}}$ in cancer compared to non-cancerous (normal or pancreatitis) tissue [ΩaAⒶ ΩjAⒶ ΩkAⒶ ΩmAⒶ ΩuAⒶ]. Two of the four datasets with negative $\Delta\bar{n}_{\mathrm{H_2O}}$ [ΩdAⒶ ΩhAⒶ ΩnAⒶ ΩsAⒶ] were obtained from studies of chronic pancreatitis (Chen et al., 2007) or low-grade tumors (Wang et al., 2013b) (red points in Fig. 1B); another used a procedure to isolate accessible proteins (Turtoi et al., 2011) [ΩhAⒶ], while the remaining low-$\Delta\bar{n}_{\mathrm{H_2O}}$ dataset [ΩsAⒶ] may be an outlier in terms of mean chemical composition (Fig. S2). Therefore, the datasets with positive $\Delta\bar{n}_{\mathrm{H_2O}}$ and/or ΔZC likely reflect a general characteristic of pancreatic cancer.
Hypoxia and 3D culture
Hypoxia refers to oxygen concentrations that are lower than normal physiological levels. Hypoxia is a factor in many pathological conditions, including altitude sickness, stroke, and cardiac ischemia (e.g., Datta et al., 2010; Li et al., 2012; Fuhrmann et al., 2013). In tumors, irregular vascularization and abnormal perfusion contribute to the formation of hypoxic regions (Höckel & Vaupel, 2001). A related situation is the growth in the laboratory of 3D cell cultures (e.g., tumor spheroids), instead of two-dimensional growth on a surface. In 2D monolayers, all cells are exposed to the gas phase, but interior regions of 3D cultures are often diffusion-limited, leading to oxygen deprivation and necrosis (McMahon et al., 2012). There are some overlaps, but also many differences, between gene expression in 3D culture and hypoxic conditions (DelNero et al., 2015). These studies emphasize that growth in 3D culture is associated with heterogeneous oxygen concentrations and have found an interdependence between the effects of hypoxia and 3D growth on gene expression. The proteomic changes likely reflect not only oxygen limitation but also other processes connected with 3D growth (e.g., nutrient deprivation, extracellular architecture, and even light penetration). Although the comparisons made here do not address these individual factors, they do provide information on whether hypoxia and 3D culture lead to similar changes in the overall chemical composition of proteomes.
Table 3 lists selected proteomic datasets with a minimum of ca. 20 up- and 20 down-expressed proteins in hypoxia or 3D growth. The differences in chemical composition of the differentially expressed proteins are plotted in Fig. 2A. In many experiments, hypoxia or 3D growth induces a proteomic transformation with a significant and/or large decrease of ZC [ΩaAⒶ ΩbAⒶ ΩcAⒶ ΩgAⒶ ΩhAⒶ ΩjAⒶ ΩmAⒶ ΩoAⒶ ΩwAⒶ ΩAAⒶ ΩEAⒶ]. These datasets cluster around a narrow range of ΔZC (−0.032 to −0.021), except for dataset ΩEAⒶ (3D growth of colon cancer cells) with much lower ΔZC. As extracellular proteins have relatively high ZC (Dick, 2014), the observation in some experiments that hypoxia decreases the abundance of proteins associated with the extracellular matrix (ECM) (Blankley et al., 2010) is compatible with the overall expression of more reduced (low- ZC) proteins. Conversely, reoxygenation leads to the formation of more oxidized proteins in the supernatant (-S) and pellet (-P) fractions of isolated chromatin [ΩrAⒶ ΩuAⒶ].
While most studies controlled gas composition to generate hypoxia, two datasets [ΩCAⒶ ΩDAⒶ] are from a study that used cobalt chloride (CoCl2) to induce hypoxia in rat cardiomyocytes; treatment with salidroside (SAL) had anti-hypoxic effects (Xu et al., 2016). The CoCl2 and SAL treatments result in the expression of somewhat more reduced and more oxidized proteins, respectively, in agreement with the general trends for hypoxia and reoxygenation experiments.
Two datasets oppose the general trends, showing large and significantly higher ZC under hypoxia. These datasets were obtained using particular analytical methods or cell types. One of the nonconforming datasets is for the supernatant in a chromatin isolation procedure [ΩpAⒶ], and the other is for adipose-derived stem cells [ΩBAⒶ] (see below).
Hyperosmotic stress refers to a condition that increases the extracellular tonicity, or osmolality. The addition of osmolytes (or “cosolvents”) lowers the water activity in the medium (Timasheff, 2002). Equilibration with hypertonic solutions drives water out of cells, causing cell shrinkage. The selected datasets listed in Table 4 include at least ca. 20 up-expressed and 20 down-expressed proteins in response to high concentrations of NaCl (five studies), glucose (six studies), succinate (one study), KCl (one study), or adaptation to seawater (one study). The proteomic analyses used bacterial, yeast, or mammalian cells, or fish (eel) gills (Tse et al., 2013). One study varied temperature along with NaCl concentration (Kocharunchitt et al., 2012), and one study reported both transcriptomic and proteomic ratios (Kohler et al., 2015).
In the study of Giardina, Stanley & Chiang (2014) [ΩoAⒶ ΩpAⒶ ΩqAⒶ], the reported expression ratios for extracellular proteins after transfer from low glucose to high glucose media are nearly all less than 1. Therefore, the “up-expressed” proteins in the comparisons here are taken to be those that have a higher expression ratio than the median in a given experiment. To achieve a sufficient sample size using data from Chen et al. (2015) [ΩrAⒶ], the comparisons here use a combined set of proteins, i.e., those identified to have the same direction of change in the two treatment conditions (380 and 480 mOsm NaCl) and a significant change in at least one of the conditions.
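The two selection rules described above can be sketched as follows. This Python illustration is not the processing code of Dataset S2; the function names, the data layout, and the example IDs are hypothetical. Treating proteins below the median as "down-expressed" and leaving out proteins exactly at the median are assumptions of the sketch.

```python
from statistics import median


def split_by_median(ratios):
    """Split proteins by expression ratio relative to the median.

    ratios: dict mapping protein ID to expression ratio.
    Proteins exactly at the median are left out of both groups (assumption).
    """
    m = median(ratios.values())
    up = [pid for pid, r in ratios.items() if r > m]
    down = [pid for pid, r in ratios.items() if r < m]
    return up, down


def combine_conditions(cond1, cond2):
    """Keep proteins with the same direction of change in both conditions
    and a significant change in at least one of them.

    cond1, cond2: dicts mapping protein ID to (direction, significant),
    where direction is "up" or "down" and significant is a bool.
    """
    combined = {}
    for pid in cond1.keys() & cond2.keys():
        dir1, sig1 = cond1[pid]
        dir2, sig2 = cond2[pid]
        if dir1 == dir2 and (sig1 or sig2):
            combined[pid] = dir1
    return combined


# Hypothetical ratios, all less than 1 as in the glucose-transfer data
print(split_by_median({"ID_A": 0.2, "ID_B": 0.5, "ID_C": 0.9}))
```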
Figure 2B shows that hyperosmotic stress strongly (CLES ≤40%) and/or significantly (p-value < 0.05) induces the formation of proteins with relatively low water demand per residue in 11 datasets [ΩaAⒶ ΩbAⒶ ΩdAⒶ ΩfAⒶ ΩiAⒶ ΩmAⒶ ΩsAⒶ ΩtAⒶ ΩuAⒶ ΩvAⒶ ΩzAⒶ]. Five of these datasets, including four for bacteria [ΩsAⒶ ΩtAⒶ ΩuAⒶ ΩvAⒶ] and one for human cells [ΩmAⒶ], also show an increase in ZC. These trends are found in both the transcriptomic [ΩsAⒶ ΩtAⒶ] and proteomic [ΩuAⒶ ΩvAⒶ] data from the study of Kohler et al. (2015).
Four datasets obtained for mammalian cells have low ΔZC with no significant $\Delta\bar{n}_{\mathrm{H_2O}}$ [ΩrAⒶ ΩwAⒶ ΩxAⒶ] or a significantly negative mean difference of $\bar{n}_{\mathrm{H_2O}}$ [ΩfAⒶ]. Six datasets [ΩhAⒶ ΩkAⒶ ΩnAⒶ ΩoAⒶ ΩpAⒶ ΩqAⒶ] from one study each of yeast and E. coli, and of Japanese eels adapted to seawater, have very small mean differences in ZC and a negative $\Delta\bar{n}_{\mathrm{H_2O}}$ that follows the trends of most of the other datasets, but with lower significance (p-value > 0.05).
The comparisons here show that hyperosmotic stress consistently induces the formation of proteins with lower water demand per residue. In some, but not all, cases, this coincides with an increase in average oxidation state of carbon. Less often, and perhaps specific to mammalian cells, the proteomic composition is shifted toward lower oxidation state of carbon. There are only a couple of datasets, using NaCl treatment [ΩeAⒶ ΩjAⒶ], that show an increase in water demand per residue.
Notably, two datasets for adipose-derived stem cells oppose the general trends for hypoxic and hyperosmotic conditions (see Fig. 2A [ΩBAⒶ] and Fig. 2B [ΩeAⒶ]). This intriguing result shows that these stem cells respond to external stresses with proteomic transformations that are chemically similar to those in cancer (Fig. 1).
The correlations of compositional differences (negative ΔZC and $\Delta\bar{n}_{\mathrm{H_2O}}$) with hypoxia and hyperosmotic stress can be proposed as resulting from attraction of the proteomes to a context-specific low-energy state. Thermodynamic models can help to illuminate the possible microenvironmental constraints on the observed proteomic transformations. Here, the chemical affinities of stoichiometric formation reactions of proteins were calculated, grouped, and compared in order to estimate the thermodynamic potential for the overall process of proteomic transformation.
The chemical affinity quantifies the potential, or propensity, for a reaction to proceed. It is the infinitesimal change with respect to reaction progress of the negative of the Gibbs energy of the system. The chemical affinity is numerically equal to the “non-standard” or actual (Warn & Peters, 1996), “real” (Zhu & Anderson, 2002), or “overall” (Shock, 2009) negative Gibbs energy of reaction. These energies are not constant, but vary with the chemical potentials, or chemical activities, of species in the reaction. Chemical activity (a) and potential (μ) are related through μ = μ∘ + RTlna, where the standard chemical potentials of particular species (μ∘ = G∘, i.e., standard Gibbs energies) depend only on temperature and pressure.
The equilibrium constant (K) for a reaction is given by ΔG∘ = − 2.303RTlogK, where ΔG∘ is the standard Gibbs energy of the reaction, 2.303 approximates the natural logarithm of 10, R is the gas constant, T is temperature in Kelvin, and log denotes the decadic logarithm. The equation used for affinity (A) is A = 2.303RTlog(K∕Q), where Q is the activity quotient of the reaction (e.g., Helgeson, 1979, Eq. 11.27; Warn & Peters, 1996, Eq. 7.14; Shock, 2009). Accordingly, the per-residue affinity of Reaction (R1) can be written as

$$\frac{A}{2.303RT} = \log K + \bar{n}_{\mathrm{Gln}}\log a_{\mathrm{Gln}} + \bar{n}_{\mathrm{Glu}}\log a_{\mathrm{Glu}} + \bar{n}_{\mathrm{Cys}}\log a_{\mathrm{Cys}} + \bar{n}_{\mathrm{O_2}}\log f_{\mathrm{O_2}} + \bar{n}_{\mathrm{H_2O}}\log a_{\mathrm{H_2O}} + \bar{n}_{\mathrm{H^+}}\log a_{\mathrm{H^+}} - \log a_{\mathrm{residue}} \tag{2}$$

where the abbreviations of the amino acids have been substituted for their formulas and $a_{\mathrm{residue}}$ is the activity of the protein residue. Here, a and f stand for chemical activity and fugacity (e.g., aH2O is water activity, and fO2 is oxygen fugacity). The fugacity, rather than activity, of O2 is used because gaseous oxygen is the reference state most commonly used in previous thermodynamic models. If aO2 were used instead, its values would differ from fO2 according to the solubility of oxygen in water at the given temperature but otherwise the two models would be thermodynamically equivalent. The overbar notation (e.g., $\bar{n}_{\mathrm{O_2}}$ and $\bar{n}_{\mathrm{H_2O}}$) signifies that the coefficients in Reaction (R1) are each divided by the length (number of amino acids) of the protein sequence. Likewise, the elemental composition and standard Gibbs energy per residue are those of the ionized protein (with formula $\mathrm{C}_c\mathrm{H}_{h+z}\mathrm{N}_n\mathrm{O}_o\mathrm{S}_s^{z}$) divided by the length of the protein.
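The relations A = 2.303RTlog(K∕Q) and ΔG∘ = −2.303RTlogK can be illustrated numerically. The Python sketch below evaluates the dimensionless affinity A/(2.303RT) for a formation reaction written with the basis species as reactants; the function names, dictionary keys, and input values are hypothetical, and the numbers do not correspond to any real protein.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def log_k(delta_g0, temperature):
    """Decadic log of the equilibrium constant from the standard Gibbs
    energy of reaction (J mol^-1) and temperature (K)."""
    return -delta_g0 / (math.log(10) * R * temperature)


def affinity_per_2303rt(logK, coeffs, log_act, log_a_residue=0.0):
    """Dimensionless affinity A/(2.303RT) = log K - log Q.

    coeffs: per-residue coefficients of the basis species (reactants)
    log_act: log activity (or log fugacity for O2) of each basis species
    log_a_residue: log activity of the protein residue (product)
    """
    log_q = log_a_residue - sum(coeffs[k] * log_act[k] for k in coeffs)
    return logK - log_q


# Hypothetical inputs, for illustration only
coeffs = {"O2": -0.5, "H2O": 0.4}
log_act = {"O2": -60.0, "H2O": 0.0}
print(affinity_per_2303rt(-10.0, coeffs, log_act))
```

Because the coefficient on O2 multiplies logfO2 directly, a protein with a more negative O2 coefficient gains more affinity as logfO2 decreases, which is the basis for comparing groups of proteins across logfO2–logaH2O space.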
The standard Gibbs energies of species at 37 °C and 1 bar were calculated with CHNOSZ (Dick, 2008) using equations and data taken from Wagman et al. (1982) and Kelley (1960) (gaseous O2), Johnson, Oelkers & Helgeson (1992) and references therein (H2O), and using the Helgeson–Kirkham–Flowers equations of state (Helgeson, Kirkham & Flowers, 1981) with data taken from Amend & Helgeson (1997) and Dick, LaRowe & Helgeson (2006) (amino acids), and from Dick, LaRowe & Helgeson (2006) and LaRowe & Dick (2012) (amino acid group additivity for proteins).
In previous calculations, activities of the amino acid basis species and protein residues were set to $10^{-4}$ and $10^{0}$, respectively (Dick, 2016). As long as constant total activity of residues is assumed, the specific value does not greatly affect the outcome of the calculations; here it is kept at $10^{0}$. Revised activities of the amino acid basis species, corresponding to mean concentrations in human plasma (Tcherkas & Denisenko, 2001), are used here: $10^{-3.6}$ (cysteine), $10^{-4.5}$ (glutamic acid) and $10^{-3.2}$ (glutamine). Adopting these activities of basis species, instead of $10^{-4}$, lowers the calculated equipotential lines for proteomic transformations by about 0.5 to 1 logaH2O (see below). Accounting for protein ionization, with pH set to 7, also lowers the equipotential lines, by about 1 logaH2O compared to calculations for nonionized proteins.
It follows from Eq. (2) that varying the fugacity of O2 and activity of H2O alters the chemical affinity for formation of proteins by a specific amount depending on their chemical composition. For example, Figure 5A of Dick (2016) shows that decreasing logfO2 is relatively more favorable for the formation of up-expressed than down-expressed proteins in a particular cancer dataset (Knol et al., 2014; ΩwAⒶ in Table 1). This tendency is consistent with the lower ZC of these up-expressed proteins, which is unlike most other datasets for CRC (Fig. 1A).
How can the affinities of groups, rather than individual proteins, be compared? One method is based on differences in the ranks of chemical affinities of proteins between groups (Dick, 2016). Using this method, the affinities of all of the proteins in a dataset are ranked; the ranks are then summed for proteins in the up- and down-expressed groups (rup and rdown). Before taking the difference, the ranks are multiplied by a weighting factor to account for the different numbers of proteins in the groups (n = nup + ndown). This weighted rank difference (WRD) of affinity summarizes the estimates of the differential potential for formation: (3)
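A weighted rank difference of this kind can be sketched in Python as follows. This is an illustration and not the canprot implementation. The specific weights (n_down/n applied to the rank sum of the up-expressed group and n_up/n applied to that of the down-expressed group) and the overall factor of 2 are assumptions chosen to be consistent with the description above; ranking from lowest to highest affinity and using average ranks for ties are also assumptions.

```python
def average_ranks(values):
    """1-based ranks from lowest to highest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def weighted_rank_difference(aff_up, aff_down):
    """Weighted rank difference of affinity between two groups of proteins.

    Positive values indicate rank-wise higher affinities in the up group.
    The weighting scheme is an assumption of this sketch.
    """
    n_up, n_down = len(aff_up), len(aff_down)
    n = n_up + n_down
    ranks = average_ranks(list(aff_up) + list(aff_down))
    r_up = sum(ranks[:n_up])
    r_down = sum(ranks[n_up:])
    return 2 * (n_down / n * r_up - n_up / n * r_down)


# Hypothetical affinities, for illustration only
print(weighted_rank_difference([3.0, 4.0], [1.0, 2.0, 2.5]))
```

With these weights, two groups whose ranks are evenly interleaved give a difference of zero regardless of group size, which corresponds to the rank-wise equal affinity that defines an equipotential line.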
On a contour diagram of the WRD of affinity (referred to here as a “potential diagram”), the line of zero WRD represents a rank-wise equal affinity (or “equipotential line”) for formation of proteins in the two groups.
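The ranking procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for the study: the function names are hypothetical, ties are assumed to receive averaged ranks, and the weighting is assumed to scale each rank sum by the other group's share of n, with an overall factor of 2. With that weighting, groups of unequal size whose ranks are evenly mixed give a WRD of zero.

```python
def rank_values(values):
    """Return 1-based ascending ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def weighted_rank_difference(affinity_up, affinity_down):
    """WRD of affinity between up- and down-expressed groups (assumed weighting)."""
    n_up, n_down = len(affinity_up), len(affinity_down)
    n = n_up + n_down
    # Rank all proteins together, then split the ranks back into the two groups
    ranks = rank_values(list(affinity_up) + list(affinity_down))
    sum_up = sum(ranks[:n_up])
    sum_down = sum(ranks[n_up:])
    return 2 * (n_down / n * sum_up - n_up / n * sum_down)


# Hypothetical affinities: the up-expressed proteins occupy the top ranks
wrd_separated = weighted_rank_difference([3.0, 4.0], [1.0, 2.0])
# Evenly mixed ranks in groups of unequal size
wrd_mixed = weighted_rank_difference([2.0], [1.0, 3.0])
```

A positive value indicates that the up-expressed group holds the higher affinity ranks; the sign, not the magnitude, determines where the equipotential line falls.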
To characterize the general trends, diagrams were made for groups of proteomic datasets with similar compositional features. For pancreatic cancer, there are 11 datasets with ΔZC > 0.01 (i.e., to the right of the gray box in Fig. 1B) and for which the mean difference of water demand per residue is neither significant (low p-value) nor large (high CLES). Conversely, there are 8 datasets for pancreatic cancer with high differential water demand per residue and for which the mean difference of ZC is neither large nor significant. Similarly, weighted rank-difference diagrams were constructed for 13 (ΔZC > 0.01) and 10 (high differential water demand) datasets for CRC, 8 datasets for hypoxia (ΔZC < −0.01), and 12 datasets for hyperosmotic stress (low differential water demand). The individual diagrams for each of these groups are presented in Fig. S3.
In order to observe the central tendencies among the various datasets, the potential diagrams for each group in Fig. S3 were combined by taking the arithmetic mean of the WRD at all grid points in logfO2–logaH2O space. The resulting diagrams (Fig. 3) have equipotential lines, shown in white, and zones of positive and negative WRD of affinity, i.e., greater relative potential for formation of up- and down-expressed groups of proteins, colored red and blue, respectively.
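The merging step amounts to a pointwise average of grids followed by a search for sign changes. The sketch below assumes that every diagram is stored as a list of rows on the same logfO2 grid; the function names and the numerical values are hypothetical, and linear interpolation between grid points is an assumption rather than the contouring method used for Fig. 3.

```python
def mean_grid(grids):
    """Pointwise arithmetic mean of equally shaped 2-D grids (lists of rows)."""
    n = len(grids)
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [
        [sum(g[i][j] for g in grids) / n for j in range(n_cols)]
        for i in range(n_rows)
    ]


def zero_crossings(xs, ws):
    """x positions where the values ws cross zero, by linear interpolation."""
    crossings = []
    for i in range(len(xs) - 1):
        w0, w1 = ws[i], ws[i + 1]
        if w0 == 0:
            crossings.append(xs[i])  # grid point lies exactly on the line
        elif w0 * w1 < 0:
            frac = -w0 / (w1 - w0)   # fractional distance to the zero
            crossings.append(xs[i] + frac * (xs[i + 1] - xs[i]))
    if ws[-1] == 0:
        crossings.append(xs[-1])
    return crossings


# Hypothetical WRD grids for two datasets: rows are logaH2O levels,
# columns are the logfO2 values in log_f_o2
log_f_o2 = [-70, -68, -66, -64]
grid_a = [[-2, -1, 1, 2], [-1, 1, 2, 3]]
grid_b = [[-4, -3, -2, 2], [-3, -1, 0, 1]]

merged = mean_grid([grid_a, grid_b])
equipotential = [zero_crossings(log_f_o2, row) for row in merged]
```

Each entry of `equipotential` lists the logfO2 positions of the zero-WRD line at one logaH2O level of the merged diagram.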
The solid black lines in Fig. 3 show the median position along the x- or y-axis for the equipotential lines in each group (Fig. S3), and the dashed black lines are positioned at the 1st and 3rd quartiles. The interquartile ranges for the cancer groups are smaller than those for hypoxia and, to a lesser extent, than those for hyperosmotic stress. The smaller range would be expected if the cancer datasets reflected a somewhat narrower set of conditions than the datasets for experiments with hypoxia; the latter represent a wide variety of organisms, cell types, and laboratory conditions (Table 3).
Calculations of the average oxidation state of carbon and water demand per residue, derived from elemental stoichiometry, provide information on the microenvironmental factors affecting differential protein expression in cancer and laboratory experiments. Hypoxia or hyperosmotic stress generally induces the expression of proteins with lower overall oxidation state of carbon or lower water demand per residue, respectively, compared to down-expressed proteins. In contrast, proteomes of CRC and pancreatic cancer are often characterized by greater water demand per residue or oxidation state of carbon. The formation of more highly oxidized proteins despite the hypoxic conditions of many tumors hints at a complex set of microenvironmental–cellular interactions in cancer.
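For reference, the average oxidation state of carbon follows directly from the elemental formula. The sketch below assumes the usual formal oxidation states (H +1, N −3, O −2, S −2), so that ZC = (Z − nH + 3nN + 2nO + 2nS) / nC for a species with net charge Z; the function name is hypothetical.

```python
def carbon_oxidation_state(c, h, n=0, o=0, s=0, charge=0):
    """Average oxidation state of carbon (ZC) for a formula C_c H_h N_n O_o S_s.

    Assumes formal oxidation states of +1 (H), -3 (N), -2 (O) and -2 (S).
    """
    if c <= 0:
        raise ValueError("the formula must contain carbon")
    return (charge - h + 3 * n + 2 * o + 2 * s) / c


zc_glycine = carbon_oxidation_state(c=2, h=5, n=1, o=2)          # C2H5NO2
zc_alanine = carbon_oxidation_state(c=3, h=7, n=1, o=2)          # C3H7NO2
zc_palmitic_acid = carbon_oxidation_state(c=16, h=32, o=2)       # C16H32O2
```

Glycine and alanine differ by one CH2 group, which lowers ZC from +1 to 0; the fatty acid is far more reduced than either amino acid.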
Plots of data from experiments with hypoxia and hyperosmotic stress illuminate two dimensions of possible compositional attraction to a low-energy state (Fig. 2). A thermodynamic model quantifies the altered potential for proteomic transformation in response to changing oxygen fugacity and water activity. The equipotential lines for cancer proteomes with high differential water demand lie between logaH2O = −1 and −3, while the potential threshold for transformation of proteomes in hyperosmotic stress is closer to unit activity of water (logaH2O = 0 to −2) (Figs. 3D–3F). Although there is considerable variability among the individual datasets (Fig. S3), the merged diagrams demonstrate a physiologically realistic range for the activity of water. Water activity in cells is close to one, but restricted diffusion of H2O in “osmotically inactive” regions of cells (Model, 2014) could result in locally lower water activities. The present findings provide evidence that the molecular processes regulating proteomic transformations operate within the chemical constraints of subcellular regions of depleted water activity.
The finding of a frequently positive water demand for the transformation between normal and cancer proteomes offers a new perspective on the biochemistry of hydration in cancer. The thermodynamic calculations predict that, in contrast to hyperosmotic stress, proteomes of cancer tissues are stabilized by increasing water activity. A higher than normal water activity would be consistent with the greater hydration of tissue that is apparent in spectroscopic analysis of breast cancer tissue (e.g., Abramczyk et al., 2014). Speculatively, the relatively high water content needed for embryonic development (Moulton, 1923) could be recreated in cancer cells if they revert to an embryonic mode of growth (McIntyre, 2006).
The equipotentials for transformation of proteomes in cancer cluster near an oxygen fugacity of ca. 10^−68 to 10^−66. The oxygen fugacity should be interpreted not as an actual oxygen concentration, but rather as an internal scale of oxidation potential. Oxygen fugacity and water activity can be converted to the Eh scale for redox potential, giving values that are comparable to other biochemical measurements (Dick, 2016).
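One way to make the conversion to Eh concrete is through the half-reaction H2O(l) = 1/2 O2(g) + 2 H+ + 2 e−. The sketch below is an illustration under stated assumptions, not the calculation of Dick (2016): it is fixed at 25 °C and 1 bar, uses a standard Gibbs energy of formation of liquid water of −237.13 kJ/mol, and the function name is hypothetical.

```python
import math

R = 8.314462618       # gas constant, J mol^-1 K^-1
F = 96485.33212       # Faraday constant, C mol^-1
T = 298.15            # assumed temperature, K (25 degrees C)
DG0_F_H2O = -237130.0 # assumed Gibbs energy of formation of H2O(l) at 25 degrees C, J mol^-1


def eh_from_log_f_o2(log_f_o2, log_a_h2o=0.0, ph=7.0):
    """Eh (volts) from logfO2, logaH2O and pH at 25 degrees C.

    Based on H2O(l) = 1/2 O2(g) + 2 H+ + 2 e-, for which
    logK = 0.5 logfO2 - 2 pH - 2 pe - logaH2O.
    """
    # The reaction consumes one H2O, so its Gibbs energy is -DG0_F_H2O
    log_k = DG0_F_H2O / (math.log(10) * R * T)
    pe = 0.25 * log_f_o2 - ph - 0.5 * log_a_h2o - 0.5 * log_k
    return pe * math.log(10) * R * T / F


eh_mid = eh_from_log_f_o2(-67.0)             # middle of the range quoted above
eh_standard = eh_from_log_f_o2(0.0, ph=0.0)  # standard O2/H2O potential
```

Under these assumptions, logfO2 = −67 at pH 7 and unit water activity corresponds to an Eh of roughly −0.18 V. Values at physiological temperature would differ somewhat because both logK and the Nernst slope depend on temperature.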
Although cancer proteomes are obtained from tissues that are likely derived from hypoxic tumor environments, their differential expression is most often in favor of oxidized proteins (Figs. 1A and 1B). What are some explanations for this finding? Perhaps the relatively high logfO2 threshold for chemical transformation of hypoxia-responsive proteins could support a buffering action that potentiates the formation of relatively oxidized proteins in cancer (compare the median and quartiles in Fig. 3C with those in Figs. 3A and 3B). This speculative hypothesis requires a division of the cellular proteome into localized, chemically interacting subsystems. Alternatively, the development of a high oxidation potential in cancer cells may be associated with a higher concentration of mitochondrially produced reactive oxygen species (ROS). Neither of these possibilities addresses the magnitude of the chemical differences in the proteomes, and the question remains: where do the electrons go?
A plausible hypothesis comes from considering the different oxidation states of biomolecules. Fatty acids are reduced compared to amino acids, nucleotides, and saccharides (Amend et al., 2013). In parallel with the formation of more reduced proteins, hypoxia induces the accumulation of lipids in cell culture (Gordon, Barcza & Bush, 1977). Cancer cells are also known for increased lipid synthesis. Lipid droplets, which are derived from the endoplasmic reticulum (ER), form in great quantities in cancer cells (Koizume & Miyagi, 2016). Assuming that lipids are synthesized from relatively oxidized metabolic precursors, their formation requires a source of electrons. These considerations lead to the hypothesis that increased lipid synthesis is coupled to the oxidation of the proteome.
Calculations that combine proteomic and cellular data can be used to quantify a hypothetical redox balance between cellular lipids and proteins. The major assumptions in the calculations here are that the overall cellular oxidation state of carbon is the same in cancer and hypoxia, and that changes in this cellular oxidation state are brought about by altering only the numbers of lipid and protein molecules. The overall chemical composition of the lipids is assumed to be constant, but the proteins are assigned different values of ZC. These simplifying assumptions are meant to pose quantifiable “what if” questions, to serve as points of reference about the range of molecular composition of cells (Milo & Phillips, 2015).
The worked-out calculation is shown in Fig. 4. The lipid:protein ratio in hypoxia is taken from Gordon, Barcza & Bush (1977), and ballpark values for the differences in ZC of proteins in hypoxia and cancer are from the present study. Notably, the lipid:protein weight ratio in hypoxia (0.19) is higher than in normal cells (i.e., 0.15 using data from Gordon, Barcza & Bush, 1977 or 0.16 using data compiled by Milo & Phillips, 2015 for E. coli). The calculation indicates that an increase of the lipid:protein weight ratio in cancer cells by ca. 20% over that in hypoxic normal cells could provide an electron sink large enough to take up the electrons released when the proteome of hypoxic normal cells is oxidized to that of hypoxic cancer cells. That proteomic transformation is quantified here by an increase of ΔZC from ca. −0.03 to 0.03, both relative to non-hypoxic normal cells (Fig. 4).
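The logic of this balance can be sketched as a carbon-weighted average of the ZC of lipids and proteins, held constant between the two states. The sketch is not a reproduction of Fig. 4. Only the lipid:protein ratio of 0.19 and the ΔZC values of −0.03 and 0.03 come from the text; the lipid ZC (−1.75, that of palmitic acid), the carbon mass fractions (0.75 for lipid, 0.53 for protein) and the baseline protein ZC (−0.15) are assumptions made here for illustration, as is the function name.

```python
def balanced_lipid_protein_ratio(w_ref, zc_prot_ref, zc_prot_new,
                                 zc_lipid, fc_lipid, fc_protein):
    """Lipid:protein weight ratio that keeps the overall ZC unchanged.

    w_ref is the weight ratio in the reference state; fc_lipid and fc_protein
    are carbon mass fractions used to convert weight ratios to carbon ratios.
    """
    scale = fc_lipid / fc_protein  # weight ratio -> carbon ratio
    x_ref = w_ref * scale
    # Carbon-weighted mean ZC of lipid plus protein in the reference state
    zc_cell = (x_ref * zc_lipid + zc_prot_ref) / (x_ref + 1)
    # Solve (x_new * zc_lipid + zc_prot_new) / (x_new + 1) = zc_cell for x_new
    x_new = (zc_cell - zc_prot_new) / (zc_lipid - zc_cell)
    return x_new / scale


ZC_PROTEIN_NORMAL = -0.15  # assumed baseline, not a value from the source
w_hypoxia = 0.19           # lipid:protein weight ratio in hypoxia (from the text)

w_cancer = balanced_lipid_protein_ratio(
    w_ref=w_hypoxia,
    zc_prot_ref=ZC_PROTEIN_NORMAL - 0.03,  # hypoxia: delta ZC of ca. -0.03
    zc_prot_new=ZC_PROTEIN_NORMAL + 0.03,  # cancer: delta ZC of ca. 0.03
    zc_lipid=-1.75,                        # assumed: palmitic acid
    fc_lipid=0.75,                         # assumed carbon mass fraction
    fc_protein=0.53,                       # assumed carbon mass fraction
)
relative_increase = w_cancer / w_hypoxia - 1
```

With these assumed inputs the required increase in the lipid:protein ratio is a little under 20%, the same order as the value quoted above; the result shifts with the assumed lipid composition and carbon fractions.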
As found by Raman spectroscopy, levels of both lipids and proteins are elevated in colorectal cancer (Stone et al., 2004). Lipid droplets are formed extensively in CRC stem cells (Tirinato et al., 2015), suggestive of a higher lipid:protein ratio than either cancer or normal epithelial cells. In contrast to CRC, lipids are decreased in breast cancer compared to normal breast tissue (Frank, McCreery & Redd, 1995; Stone et al., 2004). Given a lower lipid content, and therefore smaller electron sink, one might expect that proteomes in breast cancer are oxidized to a lesser extent than those in CRC and pancreatic cancer. Other factors that affect the systemic redox balance, such as a more reduced gut microbiome in CRC (Dick, 2016) and metabolic coupling between epithelial and stromal cells, may be important for an accurate account of the compositional relationships among biomacromolecules.
These compositional and thermodynamic analyses support the notion that changes in bulk chemical composition of cells and the microenvironment have a significant role in shaping the differential expression of proteins. The analysis done here is primarily concerned with top-down causal factors (physical constraints on protein synthesis and degradation), but does not preclude a major role for bottom-up factors (e.g., regulation of gene expression). Speculatively, further applications of these methods could be used to predict the ability of chemotherapy or other treatments to reduce or reverse the potential for formation of the proteins required by cancer cells. Based on the current findings, a decreased proteomic oxidation and/or hydration state may emerge as one aspect of beneficial treatments.
This approach to the data differs from conventional interpretations of proteomic data that are based on the functions of proteins. Nevertheless, the scope of explanations dealing with functions and molecular interactions offers limited insight on the high-level organization of proteomes in a cellular and microenvironmental context. Although a variety of bioinformatics tools are available for functional interpretations (Laukens, Naulaerts & Berghe, 2015), none so far addresses the overall chemical requirements of proteomic transformations. The compositional and thermodynamic descriptions presented here encourage a fresh look at the question, “What is cancer made of?”
Although many hypoxia experiments induce the formation of proteins with lower oxidation state of carbon (ZC), the up-expressed proteins in colorectal and pancreatic cancer are often relatively oxidized compared to the down-expressed ones. Hyperosmotic stress in the laboratory leads to the formation of proteins with relatively low water demand per residue, but cancer proteomes often show the opposite trend, with up-expressed proteins having higher average water demand per residue than down-expressed ones.
The global proteomic differences can be described as compositional changes in terms of chemical basis species and quantified in a thermodynamic framework. A positive thermodynamic potential for each proteomic transformation is predicted in a specific range of oxidation and hydration potential. However, the distribution of biomolecules other than proteins should also be considered to account for changes in cellular redox balance. An electron sink associated with a ca. 20% greater lipid to protein ratio in cancer compared to normal hypoxic cells would be sufficient to balance the electrons released by the formation of more oxidized proteins in CRC and pancreatic cancer. It thus appears possible that a redox disproportionation develops in some cancers, leading to pools of both more reduced and more oxidized macromolecules compared to normal conditions.