Systematic review and meta-analysis of 50 years of coral disease research visualized through the scope of network theory

Coral disease research encompasses five decades of undeniable progress. Since the first descriptions of anomalous signs, we have come to understand multiple processes and environmental drivers that interact with coral pathologies. To gain better insight into existing knowledge, we used network analysis to explore how key topics in coral disease research have been related to each other. We reviewed 719 papers and conference proceedings published from 1965 to 2017. From each study, four elements determined our network nodes: (1) studied disease(s); (2) host genus; (3) marine ecoregion(s) associated with the study site; and (4) research objectives. Basic properties of this network confirmed that a specific set of topics comprises the majority of research. The top five diseases, genera, and ecoregions studied each accounted for over 48% of the research effort. The community structure analysis identified 15 clusters of topics with different degrees of overlap among them. These clusters represent the typical sets of elements that appear together in a given study. Our results show that while some coral diseases have been studied from multiple perspectives, most diseases have been examined under a limited range of approaches; e.g., bacterial assemblages have been studied considerably in Yellow and Black Band diseases, while the immune response has been better examined in the aspergillosis-Gorgonia system. Thus, our challenge in the near future is to identify and resolve potential gaps in order to achieve more comprehensive progress in coral disease research.


Introduction
Coral diseases have been an important factor in the decline of coral reefs in some areas in recent decades (Rogers and J. Miller 2013). Although pathogens and diseases are part of the natural dynamics of ecosystems, including coral reefs, their interaction with other environmental stressors aggravates their negative effects (Ban et al. 2014), leading to substantial losses of live coral cover (Lewis et al. 2017; Precht et al. 2016; Randall and Woesik 2015). This loss of coral cover has been particularly important in the Caribbean, which has been frequently called a "coral

Lafferty 2004 analyzed the frequency of coral disease papers as a potential proxy of coral disease incidence. Finally, Ban et al. 2014 used network theory to review experimental research that addressed at least two stressors simultaneously. These types of syntheses, although scarce, offer important benefits over narrative reviews: they provide a wider and more objective perspective of the research landscape, integrate the trends in subfields of a discipline, and can identify research gaps and opportunities for new questions to be explored (Lortie 2014).

Here we present a systematic review aimed at identifying groups of coral disease research topics that have been explored more frequently than others. To do this, we used a network analysis approach. Network analysis is frequently used in systematic reviews (Borrett et al. 2014; Ohniwa et al. 2010) and allows researchers to address multiple issues in coral disease research, such as the evaluation of phage-bacteria interactions (Soffer et al. 2014), the synergistic effects of environmental stressors (Ban et al. 2014), the analysis of positive and negative microbial interactions in healthy and diseased conditions (Sweet and Bulling 2017; Meyer et al. 2016), and the analysis of gene expression and regulation (Wright et al. 2015). We hypothesized that if the research topics typically addressed in coral research lack uniformity across several key aspects of epizootiology, then a network representing the co-occurrence of these topics in coral disease research papers would exhibit a community structure, where the communities (also called clusters) of nodes would represent the different themes that have characterized most coral disease research in the last 50 years.

Data acquisition
We reviewed a total of 719 publications spanning the period from 1965 to 2017. We searched for these in the search engines and databases Google Scholar, Meta, and Peerus, using combinations of keywords with search modifiers, including "coral disease", "syndrome", "yellow", "black", "purple", "spot", "band", "pox", "dark", "plague", "growth", "trematode", "anomalies", "ciliate", "soft coral", and "aspergillosis". This included peer-reviewed papers and conference proceedings, since the latter also provide information about the research questions explored at a given time and location. We excluded theses, preprints, and book chapters. Additionally, we compared our database with
questions, each one corresponding to one node category:

• What was the studied disease? Each disease was listed as an individual node, and in cases where there was no specific disease of interest (e.g. general surveys), we assigned the node "multiple diseases". Additionally, we classified as "White Syndrome" all descriptions of pathologies involving tissue loss from the Pacific, sensu Bourne et al. (2015). We applied similar criteria and classified as "Pink syndrome" the references to Pink spots, Pink line syndrome, Pink-Blue syndrome, and Pink-Blue spot syndrome, as these diseases have not been clearly distinguished from one another. We excluded papers specifically concerning thermally-induced bleaching but included the node "bleaching effects" for papers investigating the effects of bleaching on some disease topic or vice versa.

• Where did the samples come from, or where was the study conducted? We reported the corresponding marine ecoregion sensu Spalding et al. (2007). When no explicit sampling site was stated, we contacted the respective authors to verify the location.

• What was the genus of the affected host? If the study involved multiple specific genera, each one was listed as an individual node. In the cases of baselines and similar studies that lacked specific taxa of interest, we assigned the node "multiple genera".

Using the extracted topics as vertices and their co-occurrence in a given paper as edges, we built an undirected weighted network. The weight w_ij was the frequency of co-occurrence of the keywords i and j in the same article (Fig. 2). The network was constructed and analyzed using the R package igraph (Csardi and Nepusz 2006; R Core Team 2016). Considering the size and complexity of the resulting graph, we used the network reduction algorithm proposed by Serrano et al. (2009), implemented in the package disparityfilter for R (Bessi 2015), to extract the backbone of our network.

This resulted in a smaller graph that retains the multiscale properties of the original network.
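
The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' R code (which relies on igraph and disparityfilter). The topic labels are hypothetical and modeled on the example of Fig. 2, and alpha = 0.05 is an assumed significance threshold, since the text does not report the value used. The filter follows the disparity criterion of Serrano et al. (2009): an edge of weight w attached to a node of degree k and strength s is retained if (1 - w/s)^(k-1) < alpha for at least one of its endpoints.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(papers):
    """Return {(i, j): w_ij}, where w_ij counts the papers containing both topics."""
    weights = Counter()
    for topics in papers:
        # Each paper contributes a fully connected graph over its own topics.
        for i, j in combinations(sorted(set(topics)), 2):
            weights[(i, j)] += 1
    return dict(weights)


def disparity_backbone(weights, alpha=0.05):
    """Keep edges whose weight is significant for at least one endpoint."""
    strength = Counter()  # sum of edge weights per node
    degree = Counter()    # number of edges per node
    for (i, j), w in weights.items():
        for node in (i, j):
            strength[node] += w
            degree[node] += 1

    def p_value(node, w):
        k = degree[node]
        if k < 2:
            return 1.0  # the null model is not informative for degree-one nodes
        return (1.0 - w / strength[node]) ** (k - 1)

    return {
        edge: w
        for edge, w in weights.items()
        if min(p_value(edge[0], w), p_value(edge[1], w)) < alpha
    }


# Hypothetical papers A and B, each reduced to its set of topics.
papers = [
    {"Yellow Band Disease", "Orbicella", "Southern Caribbean", "Prevalence", "Temperature"},
    {"Black Band Disease", "Orbicella", "Temperature"},
]
network = build_cooccurrence_network(papers)

# Hypothetical star-shaped network with one dominant edge, to exercise the filter.
star = {("hub", "a"): 100, ("hub", "b"): 1, ("hub", "c"): 1, ("hub", "d"): 1, ("hub", "e"): 1}
backbone = disparity_backbone(star)
```

In this toy example the link between Orbicella and 'Temperature' receives a weight of two because both papers share it, and the filter keeps only the dominant edge of the star network.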

Next, we used the link communities approach (Ahn et al. 2010) to obtain groups of closely connected nodes (hereafter referred to as communities), using the linkcomm package (Kalinka and Tomancak 2011). With this method, overlapping communities may appear, allowing several nodes to belong to multiple communities (the scripts for the functions used, with their modifications, are available at https://github.com/luismmontilla/CoDiRNet). We explored the similarity among the obtained communities by representing them as a new network in which each community was a node and the edges had the value of the Jaccard coefficient for the number of shared nodes (hence, communities without shared topics would be disconnected).
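
The network of communities described above can be illustrated with a short Python sketch. It assumes that the overlapping communities have already been obtained (for example, with linkcomm) and are available as sets of topics. The community labels and their members below are hypothetical and are not the communities reported in this study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient: shared members divided by total distinct members."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_similarity_network(communities):
    """Return {(community_a, community_b): jaccard} for pairs that share nodes."""
    edges = {}
    for name_a, name_b in combinations(sorted(communities), 2):
        score = jaccard(communities[name_a], communities[name_b])
        if score > 0:
            # Pairs without shared topics get no edge and stay disconnected.
            edges[(name_a, name_b)] = score
    return edges


# Hypothetical overlapping communities of topics.
communities = {
    "Ca": {"Black Band Disease", "Temperature", "Orbicella"},
    "Cb": {"Temperature", "Orbicella", "Prevalence"},
    "Cc": {"Aspergillosis", "Gorgonia"},
}
similarity = community_similarity_network(communities)
```

Here "Ca" and "Cb" share two of four distinct topics, so they are linked with a weight of 0.5, while "Cc" shares no topics and remains an isolated node.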

Additionally, we used the community centrality (Kalinka and Tomancak 2011) as a measure of the importance of each node within its respective communities.

To test the statistical robustness of the obtained communities, we measured the assortativity of the communities (r_com) using a modification of the method proposed by Shizuka and Farine (2016). In

A represents a hypothetical paper focusing on the prevalence of Yellow Band Disease in a country of the Southern Caribbean. Each paper produces a fully connected graph. In B, a second hypothetical paper generates its own graph, which has a link between Orbicella and 'Temperature' in common with A. C represents the resulting network, with the link between Orbicella and 'Temperature' representing the co-occurrence in two papers and the remaining links representing one paper.

From the reduced network, we obtained a total of 15 node communities (hereafter referred to as C1 to C15). We obtained an r_com value of 0.23, indicating that the network possesses a community structure that departs from randomness. These groups ranged from small communities with low overlap to rich and highly interconnected communities, e.g. C5 and C14 shared almost 50% of their members (Fig. 4).

The smaller communities were composed of six or fewer nodes. C3 represented four topics related to questions about temporal patterns; C6 represented four genera typically affected by Dark Spot

Figure 6. Membership matrix of communities C1, C7, C11, C13, and C15. Each column represents a different community and the rows indicate the membership of a given topic to one or multiple communities. Circle size represents the importance of the topic within a community, measured as overall node centrality.

PeerJ reviewing PDF | (2018:11:33001:1:1:NEW 9 Apr 2019)

Figure 7. Membership matrix of communities C2, C4, C5, C12, and C14. Each column represents a different community and the rows indicate the membership of a given topic to one or multiple communities. Circle size represents the importance of the topic within a community, measured as overall node centrality.

with the highest centrality. There are several reasons contributing to this. Black Band Disease infection has been studied since the emergence of the field (Antonius 1976; Garrett and Ducklow 1975), and it has consistently accumulated a body of knowledge that makes it the most studied coral disease so far. It affects a substantial number of coral hosts across different ecoregions and produces extensive mortality during epizootics (Diraviya Raj et al. 2016; Aeby et al. 2015; Yang et al. 2014; Sato et al. 2009; Hobbs et al. 2015), especially when combined with seasonal or anomalous temperature increases and certain light conditions (Chen et al. 2017; Lewis et al. 2017; Bhedi et al. 2017; A. W. Miller and Richardson 2015; Kuehl et al. 2011; Sato et al. 2011; Boyett et al. 2007
concept of pathogenic consortia as etiological agents of coral diseases (Carlton and Richardson 1995), and the accumulated findings are allowing the proposal of etiological models (Sato et al. 2016).

In contrast, other diseases have been important enough to appear in their own communities, but

Another relevant aspect of this field that can be improved in the future is the incorporation of open science practices, making available annotated coral disease-specific datasets (represented in our network as the 'database' topic), e.g. Caldwell et al. 2016 Coral Disease Database (http://gcdd.tinypla.net/), especially considering that we found the topic 'baseline' to be an important node in the network, yet most of these data are not freely available and would represent an invaluable resource for further analysis.

In summary, we obtained a generalized representation of the most explored topics in coral disease research; however, these predominant themes and questions are yet to be extended to the full range of potential coral hosts and their diseases. We expect that future reviews using this approach will find better connected communities as a consequence of studying every disease from a broader range of research objectives. Future analysis of this data set will include additional data on coral immunity and a more detailed analysis of the temporal trends of the field.