Trends in hepatocellular carcinoma research from 2008 to 2017: a bibliometric analysis

Objectives To comprehensively analyse the global scientific output of hepatocellular carcinoma (HCC) research. Methods Publication data were downloaded from the Web of Science Core Collection. We used CiteSpace IV and Excel 2016 to analyse literature information, including journals, countries/regions, institutes, authors, citation reports and research frontiers. Results As of March 31, 2018, a total of 24,331 papers in HCC research published between 2008 and 2017 were identified. Oncotarget published the most papers. China contributed the most publications, and the United States led in H-index value and number of ESI top papers. Llovet JM had the most co-citations. The keyword “transarterial chemoembolization” ranked first among the research frontiers. Conclusions The number of papers published in HCC research has kept increasing since 2008. China showed vast progress in HCC research, but the United States was still the dominant country. Transarterial chemoembolization, epithelial-mesenchymal transition, and cancer stem cells were the latest research frontiers and deserve more attention.


INTRODUCTION
Hepatocellular carcinoma (HCC) is one of the most common types of primary liver malignancy and ranks third among the world's leading causes of cancer death (Balogh et al., 2016). The highest incidence rates of HCC worldwide are in East Asia and sub-Saharan Africa, with over 20 cases per 100,000 individuals (Ghouri, Mian & Rowe, 2017; Zhu et al., 2016). According to data from the Surveillance, Epidemiology, and End Results (SEER) program in the United States, HCC incidence remains relatively low compared to other primary cancers (Mittal & El-Serag, 2013). However, the incidence rate has risen nearly fivefold, from 1.4 to 6.7 per 100,000 individuals, over the past decades (El-Serag & Kanwal, 2014; White et al., 2017). HCC is more prevalent in males than in females worldwide, with a sex ratio that varies from 4:1 to 2:1 depending on geographical area (Hefaiedh et al., 2013; Massarweh & El-Serag, 2017). Because of its high prevalence, HCC places a substantial economic burden on public health and medical systems in both developed and developing countries.
Academic journals have published a large number of papers in HCC research over the past decade. However, no attempts have been made to analyse the publication data systematically. Bibliometric analysis is defined as a quantitative analysis combining mathematical and statistical methods (Pritchard, 1969), and is a good choice for assessing trends in research activities (Dalpé, 2002). Moreover, bibliometric analysis focuses on the metrological characteristics of the research literature within a certain field (Ellegaard & Wallin, 2015), which helps investigators to grasp how the field has developed over time and to guide their follow-up work. In recent years, an increasing number of bibliometric studies have been published in high-impact medical journals (Aggarwal et al., 2016; Almeida-Guerrero et al., 2018; Azer, 2015; Baek et al., 2018; Bruggmann et al., 2017; Khan et al., 2018). Journals have also gradually shifted from publishing only conventional research to including bibliometric research (Wakeling et al., 2017).
The present study systematically evaluated HCC research from 2008 to 2017. We aimed to identify publication patterns, construct research collaboration networks, and assess research trends and frontiers over time.

Data collection
Raw data from the Web of Science Core Collection (WoSCC) were initially downloaded and verified by two authors (YZ and YM) independently. Any discrepancies were resolved by discussion. The data were then imported into Excel 2016 (Redmond, WA, USA) and CiteSpace IV (Drexel University, Philadelphia, PA, USA), and systematically analysed.

Statistical methods
The WoSCC literature analysis report was used to analyse publication characteristics, including countries/regions, institutes, authors, journal sources, citation counts, number of annual publications, impact factor and H-index. The impact factor is an indicator that reflects the average number of yearly citations for recent papers published in the journal (Garfield, 2006). It is used to assess the quantity of research output in most bibliometric studies. H-index is a measure calculating both the productivity and citation impact per publication of a country, institute, scholar, and so forth (Costas & Bordons, 2007). It usually serves as an indicator for assessing the quality of scientific output. Excel 2016 was used to analyse the publication trend. The polynomial model $f(x) = ax^3 + bx^2 + cx + d$ was applied to forecast the growth of publications in the following year. Variable $x$ stands for the publication year and $f(x)$ stands for the number of publications.
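As an illustration of the forecasting step, the following minimal sketch fits the cubic model by ordinary least squares and extrapolates one year ahead. It is not the Excel procedure used in the study. Only the 2008 and 2017 counts (1,348 and 3,572) come from the text; the intermediate yearly counts are hypothetical placeholders, and shifting the year axis to start at zero is an assumption made here for numerical stability.

```python
from typing import List, Sequence


def fit_cubic(xs: Sequence[float], ys: Sequence[float]) -> List[float]:
    """Least-squares fit of f(x) = a*x^3 + b*x^2 + c*x + d; returns [a, b, c, d]."""
    # Design matrix rows are [x^3, x^2, x, 1]; build the normal equations
    # (V^T V) coef = V^T y as an augmented 4x5 matrix.
    rows = [[x ** 3, x ** 2, x, 1.0] for x in xs]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(4)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(4)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(4):
        pivot = max(range(col, 4), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(4):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][4] / aug[i][i] for i in range(4)]


def predict(coef: Sequence[float], x: float) -> float:
    """Evaluate the cubic at x using Horner's rule."""
    a, b, c, d = coef
    return ((a * x + b) * x + c) * x + d


def r_squared(coef: Sequence[float], xs: Sequence[float], ys: Sequence[float]) -> float:
    """Coefficient of determination of the fitted curve."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(coef, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


years = list(range(2008, 2018))
# Hypothetical yearly counts; only the first (1,348) and last (3,572)
# values are taken from the paper.
counts = [1348, 1560, 1790, 2020, 2260, 2510, 2790, 3080, 3340, 3572]

xs = [year - 2008 for year in years]  # shifted axis (assumption)
coef = fit_cubic(xs, counts)
r2 = r_squared(coef, xs, counts)
forecast_2018 = predict(coef, 2018 - 2008)
print(f"R^2 = {r2:.4f}, forecast for 2018 = {forecast_2018:.0f}")
```

Because the intermediate counts are invented, the printed values only illustrate the procedure and should not be compared with the results reported in the paper.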
CiteSpace IV was used to analyse the association between journals, explore collaboration networks between authors/institutes/countries, identify co-cited authors/references, capture keywords with strong citation bursts, and construct visualization maps of all items mentioned above. In the present study, each individual network was derived from the 50 most highly cited papers in a one-year slice (Chen, 2004). Moreover, we used TF-IDF weighting to analyse the contents of each cluster. TF-IDF, an abbreviation of term frequency-inverse document frequency, is a statistical measure reflecting how important a word is to a document within a corpus of documents (Ramos, 2003).
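To make the TF-IDF weighting concrete, the sketch below computes one common variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency). The exact formula implemented in CiteSpace IV is not stated in this paper, so this variant is an assumption, and the tokenised documents are hypothetical toy examples.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenised document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of three tokenised titles.
docs = [
    ["hepatocellular", "carcinoma", "sorafenib", "sorafenib"],
    ["hepatocellular", "carcinoma", "chemoembolization"],
    ["hepatocellular", "carcinoma", "stem", "cell"],
]
weights = tf_idf(docs)
for doc_weights in weights:
    top_term = max(doc_weights, key=doc_weights.get)
    print(top_term, round(doc_weights[top_term], 3))
```

Terms that occur in every document, such as "hepatocellular" here, receive a weight of zero, which is why TF-IDF tends to surface the terms that distinguish one cluster from the others.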

Annual publications and growth forecast
In total, 24,331 papers (Fig. 1, Fig. S1) matched the retrieval criteria, including 93 systematic reviews, 541 meta-analyses, and 132 combined systematic reviews and meta-analyses. The number of publications by year is presented in Fig. 2A; it rose steadily from 1,348 articles in 2008 to 3,572 articles in 2017.
The polynomial curve fitting of publication growth in HCC research showed a significant correlation (coefficient of determination $R^2$ = 0.9985) between publication year and

Distribution of journals
In total, 1,681 academic journals (Dataset S1) have published papers in HCC research. Figure 3 displayed the dual-map overlay of journals. The left and right sides corresponded to the citing and cited journals maps, respectively. The labels represented the disciplines covered by the journals. The lines on the map started from the left and ended on the right, representing the citation links. There were three main citation paths shown on the map.

Distribution of countries and institutes
The 24,331 papers in HCC research were contributed by 116 countries/regions (Dataset S2). Extensive collaborations were observed between countries/regions (Fig. 4A). According to the list of top 10 countries/regions (Table 2) engaged in HCC research, China contributed the most publications (10,755), followed by the United States (3,993), Japan (3,296), and South Korea (1,937). Nearly 11,000 institutes (Dataset S3) made contributions to HCC research. Collaborations between institutes were less evident than those between countries (Fig. 4B). The top 10 institutes (Table 2) contributed more than 20% of total publications. In the list, Fudan University ranked first, followed by Sun Yat-Sen University, Second Military Medical University and Shanghai Jiao Tong University.

Analysis of ESI top papers, H-index, and citations
Among the four most productive countries (Fig. 5), the United States contributed the largest number of Essential Science Indicators (ESI) top papers (160) and achieved the highest H-index value (136). Owing to its large volume of publications, China had the highest citation count (145,060). The other two countries, Japan and South Korea, did not lead in any of the three measures mentioned above.
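For readers unfamiliar with the metric, the following minimal sketch shows how an H-index is computed under the standard definition: the largest h such that h papers each have at least h citations. The citation counts are invented for illustration and are not taken from the WoSCC report.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for five papers.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))
```

A country-level value such as 136 therefore means that 136 of that country's papers in the dataset were each cited at least 136 times.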

Analysis of references
We used CiteSpace IV to construct a network of co-cited references (Fig. 7A) that revealed the relevance between papers. The values of Modularity Q and Mean Silhouette were both greater than 0.5 (Fig. S3), indicating that the distributivity and homogeneity of clusters were reasonable and acceptable. All clusters were named after terms extracted from the references of publications (Fig. S4). In this network, the largest cluster was named "#0 meta-analysis", followed by the second, named "#1 advanced hepatocellular carcinoma", and the third, named "#2 second-line treatment". The timeline view of these clusters is shown in Fig. 7B.

Analysis of burst keywords
We identified keywords with strong citation bursts through CiteSpace IV (Fig. 8, Fig. S5). Among them, the keywords that had citation bursts after 2014 were listed as follows:

General information
At the beginning of the study, we searched for HCC-related papers published from 2000 to 2017 in the Web of Science. The overall trend of publications between 2000 and 2007 was stable, with only a slow increase (Fig. S6). The total number of publications during this period was relatively small compared with the later period (2008-2017). For this study, we aimed to capture a period with obvious changes in the number of publications. Because the time span needed to include a sufficient number of articles without being too long, we decided to limit the search to 2008 onwards.
Among the top 10 contributing countries/regions in HCC research, China was the only one from the developing world, showing its vast progress in life science over the past decade. China had an absolute advantage in the number of papers published, which also received a large number of citations. However, the United States ranked first in both ESI top papers and H-index. In terms of research quality, the United States was the dominant country in HCC research. The most active collaborations were observed between Saudi Arabia and Egypt, Australia and Scotland, and England and Brazil. Moreover, the collaborations among European countries were much stronger than those among Asian countries.
The top 10 institutes contributed 5,362 papers, which accounted for 22.03% of total publications. In this list, the top five institutes were all from China. Additionally, 3,994 Chinese institutes were involved in HCC research (Dataset S3), accounting for 36.34% of the total number of research institutes worldwide. According to recent reports, China accounted for more than half of the world's HCC patients, including over 466,000 new cases in 2015 (Zhu et al., 2016). The estimated incidence rate of HCC was 30.62 per 100,000 standard population, making it the second most common malignancy in China (Tanaka et al., 2011). In addition, treatment options for patients with unresectable HCC are limited and the overall prognosis of HCC is very poor (Xie et al., 2017), making it a serious health issue in China today. This explains why a considerable number of Chinese institutes are engaged in HCC research and why China leads in the number of papers published.

Citation information
Each of the top 10 active authors published at least 225 papers and can be regarded as a prolific author. Nevertheless, none of them ranked among the top co-cited authors, suggesting that prolific authors should focus not only on the number of publications but also on the quality of research. Regarding co-cited authors, those with more than 5,000 co-citations have made significant contributions to this field, including Llovet JM, who discovered the critical role of mTOR signaling in HCC pathogenesis (Villanueva et al., 2008), Bruix J, who provided guidelines for HCC management (Bruix & Sherman, 2011), and El-Serag HB, who elaborated the epidemiology of HCC and viral hepatitis (El-Serag, 2012).
The co-citation clusters in the timeline view demonstrated that the top co-cited references were mainly concentrated between 2008 and 2012. Likewise, among the top 100 co-cited references identified by CiteSpace IV, 72 were published in the period from 2008 to 2012 (Fig. S4). Given this result, the period 2008-2012 could be considered a "golden phase" of HCC research within the past decade. Table 3 presents the top 10 co-cited references in HCC research. The paper by Bruix J (2011) in Hepatology had the most co-citations (1,730), followed by those of Jemal A (2011; 1,578 co-citations) in CA: A Cancer Journal for Clinicians, Llovet JM (2012; 1,206 co-citations) in Journal of Hepatology, and Forner A (2012; 1,196 co-citations) in The Lancet. Additionally, other journals with the highest impact factors have also contributed papers on HCC research during the past decade (Dataset S1), such as New England Journal of Medicine (5 papers), The Lancet (12 papers), Nature (10 papers), and Cell (9 papers). These papers form the foundation of this field.

Research frontiers
We used CiteSpace IV to capture the burst keywords, which could be considered a prediction of research frontiers. As shown in Fig. 8, the blue line represents the time interval and the red line represents the period of a citation burst. Here, we list three frontiers of HCC research as follows:
i. Transarterial chemoembolization: Transarterial chemoembolization (TACE) is considered an effective treatment for intermediate or advanced HCC patients (Han & Kim, 2015). In the United States, TACE is not only the most common therapy for HCC patients but also the most common bridging therapy for patients waitlisted for liver transplantation (Shah et al., 2011; Thuluvath et al., 2010). According to a retrospective study of the SEER database, TACE utilization significantly improved survival for HCC patients, especially those at an intermediate stage (Gray et al., 2017). Additionally, a recent study showed that TACE can be a useful treatment option for HCC patients with segmental portal vein tumour thrombus (Choi et al., 2017).
ii. Epithelial-mesenchymal transition: Epithelial-mesenchymal transition (EMT) is a biological process in which epithelial cells gradually change into a mesenchymal-like type (Van Zijl et al., 2009). This process has been shown to be involved in various pathological conditions, including inflammation, fibrosis and cancer (Barriere et al., 2015; Skrypek et al., 2017). Increasing evidence demonstrates that EMT plays a vital role in the spread of malignant hepatocytes during the progression of HCC (Giannelli et al., 2016; Huaman et al., 2018; Nitta et al., 2008). The association between EMT and HCC creates a demand for novel diagnostic and therapeutic strategies against HCC progression.
iii. Cancer stem cell: Solid tumours contain a small fraction of tumorigenic cells, known as cancer stem cells (CSCs) (Valent et al., 2012). CSCs play a crucial role in tumour metastasis/recurrence and have been identified in many malignant tumours, including HCC (Valent et al., 2012). Accumulating studies have illustrated that HCC CSCs could be enriched by several different markers, including CD133, CD90, CD24, CD13 and EpCAM (Feng et al., 2014; Kim & Park, 2014; Sainz Jr & Heeschen, 0000). HCC CSCs could partially explain the heterogeneity of HCC, metastasis after hepatic tumour resection and chemotherapeutic resistance in advanced HCC cells (Ji & Wang, 2012; Zheng et al., 2018), which provides the potential to develop novel therapeutic strategies based on stem cell biology.

Strengths and limitations
To the best of our knowledge, this paper is the first bibliometric analysis of HCC research trends over the past decade. The data analysis process was relatively objective. However, most publications retrieved from the database were written in English, which makes the analysis incomplete to some extent. Furthermore, this study consisted exclusively of original and review articles published between 2008 and 2017 and indexed by the Web of Science. It may not represent all HCC literature, because other document types published in journals, books, and conferences were not included.
The analysis in this study was based on articles recorded in the Science Citation Index-Expanded (SCI-E) of the Web of Science Core Collection (WoSCC). Each journal to which the SCI-E articles belong had its corresponding citation report provided by the Web of Science. Although other databases such as PubMed, Scopus, and Embase could provide a broader range of coverage, much of the "extra coverage" could be attributed to journals with potentially limited readership. Given that our objective was to conduct a high-quality bibliometric analysis to identify research trends in the core of the HCC field, the SCI-E articles from WoSCC may be the only appropriate choice. Therefore, the results from other databases were not included.
As for China ranking first in publications, besides the active participation of Chinese institutes, strong science funding support in China could be another important reason. A recent study has shown that the proportion of funded papers in China is the highest compared to other countries. Nearly 80% of SCI-E papers are supported by science funding (Sun et al., 2013). Therefore, China's advantage in the number of publications is particularly prominent, which may create an illusion that the gap between western countries and China is widening.
Finally, although the Web of Science database is continually updated, this study covers the vast majority of papers in HCC research since 2008, so new data are unlikely to change the final results.

CONCLUSIONS
The number of publications in HCC research has been increasing over the past decade. China, the United States, and Japan were the top three countries contributing to HCC studies. There were active collaborations between developed countries. Although many Chinese institutes were engaged in HCC research, the United States was still the dominant country. Llovet JM, Bruix J and El-Serag HB may be ideal candidates for academic cooperation. Transarterial chemoembolization, epithelial-mesenchymal transition and cancer stem cells may be frontiers in this field, and researchers should pay close attention to relevant studies in the coming years.