Molecular detection and genotyping of intestinal protozoa from different biogeographical regions of Colombia

Background Intestinal parasitic protozoa represent a serious public health problem, particularly in developing countries. Protozoa such as Blastocystis, Giardia intestinalis, Entamoeba histolytica and Cryptosporidium spp. are associated with diarrheal symptoms. In Colombia, there is little region-specific data on the frequency and circulating genotypes/species of these microorganisms. Therefore, the main objective of our study was to employ molecular detection and genotyping of G. intestinalis, Blastocystis, Cryptosporidium and Entamoeba spp. in samples from different biogeographical regions of Colombia. Methods We collected 649 human fecal samples from five biogeographical regions of Colombia: the Amazon, Andean, Caribbean, Orinoco and Pacific regions. Blastocystis, G. intestinalis, Cryptosporidium spp. and the Entamoeba complex were detected by microscopy and conventional PCR. Molecular genotyping was conducted to identify Blastocystis subtypes (STs) (18S), G. intestinalis assemblages (triose phosphate isomerase and glutamate dehydrogenase) and Cryptosporidium species (18S). Genetic diversity indices were determined using DnaSP v5. Results We detected G. intestinalis in 45.4% (n = 280) of samples, Blastocystis in 54.5% (n = 336) of samples, Cryptosporidium spp. in 7.3% (n = 45) of samples, Entamoeba dispar in 1.5% (n = 9) of samples, and Entamoeba moshkovskii in 0.32% (n = 2) of samples. Blastocystis STs 1–4, 8 and 9 and G. intestinalis assemblages AII, BIII, BIV, D and G were identified. The following Cryptosporidium species were identified: C. hominis, C. parvum, C. bovis, C. andersoni, C. muris, C. ubiquitum and C. felis. The Caribbean region had the highest frequency for each of the microorganisms evaluated (91.9% for G. intestinalis, 97.3% for Blastocystis, 10.8% for Cryptosporidium spp., 13.5% for E. dispar and 2.7% for E. moshkovskii). The Orinoco region had a high frequency of Blastocystis (97.2%) and the Andean region had a high frequency of G. intestinalis (69.4%). High and active transmission was apparent in several regions of the country, implying that mechanisms for prevention and control of intestinal parasitosis in different parts of the country must be improved.


INTRODUCTION
Infectious diseases are major public health challenges worldwide. Despite efforts to reduce human morbidity and mortality, shortcomings in prevention and control measures allow the continued transmission of pathogens in the human population, hindering the management of some diseases in endemic areas (Morens, Folkers & Fauci, 2004). Infectious diseases caused by intestinal parasites are widely distributed worldwide. In 2001, approximately 3,500 million people were infected by protozoa and intestinal helminths, with children being the most affected by protozoal infections (MinSalud, 2015). Members of the genus Blastocystis are the most common eukaryotic microorganisms in the human and animal intestine (Stensvold, Alfellani & Clark, 2012), followed by Giardia intestinalis (synonyms: G. duodenalis and G. lamblia) and various Cryptosporidium and Entamoeba species (Haque, 2007). Together, these are the main protozoan causative agents of diarrheal disease in humans worldwide (Caccio & Ryan, 2008; Haque, 2007; Jacobsen et al., 2007).
Worldwide, approximately 200 million individuals are infected by Giardia species, while the frequency of Cryptosporidium infection ranges from 0.1% in developed countries to 10% in developing countries (WHO, 2010). The frequency of amebiasis, caused mainly by Entamoeba histolytica, is often reported as near 20%, but can vary greatly depending on the region and the techniques used to differentiate the E. histolytica/dispar/moshkovskii complex (Silva et al., 2014; Tasawar, Kausar & Lashari, 2010). The frequency of Blastocystis ranges between 0.5% and 24% in industrialized countries and between 30% and 76% in developing countries (Wawrzyniak et al., 2013). However, other studies have identified populations of children where the frequency of Blastocystis approaches 100% (El Safadi et al., 2014). In Colombia, the latest national survey by the Ministry of Health revealed that Blastocystis was the most commonly identified protozoan in human feces, with a nationwide frequency of 52%. Blastocystis was followed by Entamoeba (17%), Giardia (15%) and Cryptosporidium (0.5%) spp. (MinSalud, 2015).
Molecular tools have been developed to assess the genetic diversity of protozoan parasites at the intra-species level. In the case of G. intestinalis, eight genotypes or assemblages (A-H) have been identified and are distributed worldwide. Within these assemblages, sub-assemblages (AI-AIII and BIII-BIV) have been established (Faria et al., 2017; Lasek-Nesselquist, Mark Welch & Sogin, 2010; Ryan & Cacciò, 2013). In Latin America, similar frequencies of assemblages A and B were observed in Brazil (Coronato Nunes et al., 2016) and Cuba (Pelayo et al., 2008), while the frequencies of the AI, AII, AIII, BIII and BIV sub-assemblages varied in Brazil, Argentina, Peru, Colombia and Mexico (Coronato Nunes et al., 2016; Minvielle et al., 2008; Molina et al., 2011; Perez Cordon et al., 2008; Sánchez et al., 2017; Torres-Romero et al., 2014). Members of the genus Blastocystis, in turn, can be classified into 17 subtypes (STs) (Stensvold et al., 2007) based on polymorphisms of 18S rDNA (Scicluna, Tawari & Clark, 2006). In humans, STs 1-3 are common in both Europe and South America (Del Coco et al., 2017; Malheiros et al., 2011; Ramírez et al., 2016; Santin et al., 2011), while ST4 is commonly found in Europe (Stensvold et al., 2011; Wawrzyniak et al., 2013) and was possibly associated with an enzootic cycle in nonhuman primates in Latin America (Ramírez et al., 2014; Santin et al., 2011). Approximately 20 different species have been identified within the genus Cryptosporidium, of which Cryptosporidium hominis and Cryptosporidium parvum are the most common pathogens infecting humans (Feng & Xiao, 2017). Two markers (the small subunit rRNA gene and gp60) have been used to discriminate species and STs (Khan, Shaik & Grigg, 2018). Ten STs of C. hominis (Ia-Ik) and 16 STs of C. parvum (IIa-IIp) have been described (Garcia-R et al., 2017; Xiao, 2010).
In Colombia, infections by Cryptosporidium viatorum (Sánchez et al., 2017), Cryptosporidium galli and Cryptosporidium molnari have been reported (Sánchez et al., 2018). Lastly, within the genus Entamoeba, the only pathogenic species is E. histolytica. However, the morphological similarities between the three species of the E. histolytica/moshkovskii/dispar complex (Pritt & Clark, 2008) mean that molecular tools are required for species identification (Ximénez et al., 2009). A study in Colombia of children under 16 years old found a frequency of infection of 49.1%, with E. dispar being the most frequently detected species and E. moshkovskii also reported (López et al., 2015).
Colombia has a wide variety of climates and biogeographical regions classified according to epidemiological features. For instance, the biogeographical regions of Colombia are characterized by different climatic and ecosystem conditions, ranging from temperate zones to permanent snow in the mountain peaks. Moreover, the number of inhabitants and economic activities are increasing and the availability of resources is decreasing in some of these regions (IDEAM et al., 2007; Barón, 2002), affecting the ecological niches where some pathogens could be circulating. Also, variation in socioeconomic conditions may be associated with behavioral factors that drive the transmission of microorganisms through contact with animals from both urban and rural areas, as well as the consumption of food and water under inadequate sanitary conditions. All these features make Colombia a country where the transmission of intestinal microorganisms is very likely (MinSalud, 2015). For this reason, it is essential to establish programs that identify which protozoa are being transmitted, including their biological and molecular characteristics, to improve control and prevention plans. Therefore, the main objective of this study was to conduct molecular detection and genotyping of Giardia, Blastocystis, Cryptosporidium and Entamoeba species from samples collected in different biogeographical regions of Colombia. We also assessed the concordance between PCR and microscopy results in the analyzed samples.

Ethics approval and consent to participate
This study was a minimum risk investigation for participants. Both the ethical standards of the Colombian Ministry of Health (Youth Code) and the Helsinki Declaration of 2013 were followed. The parents or guardians of minors participating in the study signed informed consent forms and gave their permission to obtain samples. This study was approved by the research ethics committee of the Universidad del Rosario (registered in Act No. 394 of the CEI-UR), the ethics committee of the Department of Internal Medicine of the Universidad del Cauca (number VRI024/2016), and the INCCA University of Colombia (number 237894).

Study area
Colombia is a country with significant geographical, ethnic, cultural and socioeconomic diversity. Based on climatic, territorial and ecosystem diversity, the country is subdivided into six natural regions: Insular, Caribbean, Amazon, Andean, Orinoco and Pacific. These regions are not precisely geographical but coincide with clusters that are the recipients of government budget funds. Except for the Insular region, all regions were included in this study.
The Caribbean region is located in the north of the country and includes seven departments. In the department of Córdoba, eight samples were obtained from Montería city and in the department of Bolívar, 30 samples were collected in the city of Mompós. This coastal region has some mountainous areas, contains both tropical and dry forest ecosystems, and is strongly influenced by the presence of bodies of water. The Amazon region has the smallest human population but the greatest diversity of flora and fauna. Its biome mainly comprises tropical forest and is characterized by warm weather and abundant precipitation. The human population is primarily indigenous. The Amazon region is located in southern Colombia and consists of six departments (Rangel-Ch & Aguilar, 1995). Fifty samples each were obtained from the departments of Guainía and Amazonas. The participating municipalities were Caño Conejo, Coco Nuevo, Coco Viejo and Puerto Inírida in Guainía and the cities of Leticia and Puerto Nariño in Amazonas.
The Andean region is the most populous in the country, housing half of the Colombian population. This region comprises the northern zone of the Andes and contains three mountain ranges that contribute to high climatic variability because of the different altitudes found within the region. The Andean region contains the departments of Antioquia, Boyacá and Cundinamarca; 50 samples were collected from each of these departments. The municipalities or cities contributing samples were Medellín, Río Negro, Amalfi, Bello, Caldas and El Santuario (Antioquia); Paipa, Villa de Leyva, Arcabuco, Tuta and Tunja (Boyacá) and the municipalities of Chaguaní, San Francisco, Fómeque, Soacha and the city of Bogotá (Cundinamarca). Other departments, including Quindío, Risaralda, Caldas and part of Tolima, make up the Coffee Axis. Another 50 stool samples were obtained from inhabitants of the municipalities of Calarcá, Armenia, Pereira, Córdoba and the Corregimiento Barcelona, all located within the Coffee Axis.
The Orinoco contains a large number of rivers, warm ecosystems and tropical and subtropical forests. This region is sparsely populated and comprises four departments. One of these is Casanare, where we collected 53 samples from the municipalities of Poré, Yopal and Tamara.
Finally, the Pacific region, one of the wettest in the world, is characterized by tropical forest and high species diversity. Although it is naturally resource rich, the region has poor urban development and infrastructure. The Pacific region contains four departments. One of these is the Cauca department, where 258 samples were collected from inhabitants of commune 8 in the city of Popayán. These samples were collected within the study of Villamizar and colleagues (Villamizar et al., 2019), and are, in turn, part of the current study.

Study population
In total, 649 stool samples were collected from adults and children in different biogeographical regions of Colombia (Fig. 1). Convenience sampling was conducted to obtain samples from five different regions. The average age was 5 years (standard deviation: 6 years; range: 1-70 years). Microscopy is the gold standard used in Colombia to detect intestinal parasites; therefore, most of the samples were evaluated by microscopy following the protocol of Villamizar and collaborators (Villamizar et al., 2019). The exceptions were the samples from Casanare, Bolívar and Córdoba, which could only be tested by PCR because the entire sample portion was preserved in 100% ethanol. The individuals included in the study lived in both rural and urban areas of different municipalities/cities. The percentages of samples obtained in each biogeographical region were: Amazon (15.4%, n = 100), Andean (30.8%, n = 200), Caribbean (5.9%, n = 38), Orinoco (8.2%, n = 53) and Pacific (39.8%, n = 258). The majority (86%, n = 558) of samples were assessed for the presence of intestinal protozoa by microscopy. All samples were subjected to molecular detection of intestinal protozoa, and those that were positive underwent further molecular characterization.

DNA extraction
Prior to DNA extraction, approximately 300 µL of each sample was washed with sterile phosphate-buffered saline. Genomic DNA was extracted from stool samples using the Norgen Stool DNA Isolation Kit (Norgen Biotek Corp.), following the recommendations of the manufacturer. During the lysis step, 10 µL of a recombinant plasmid, pZErO-2, was added (final concentration: 100 pg/µL). This plasmid contained the Arabidopsis thaliana aquaporin gene as an internal control for heterologous extrinsic amplification (Duffy et al., 2013).

Conventional PCR
Initially, an internal amplification control (IAC) PCR was performed to verify that there was no inhibition of this technique using stool samples as template. All samples were subjected to this amplification control, except those from Cauca, which were collected and extracted directly in Popayán and therefore did not have the IAC added. Overall, 358 (91.6%) samples were validated by positive amplification of the IAC and were subjected to PCR to detect G. intestinalis, Blastocystis, Cryptosporidium and Entamoeba complex DNA. The 33 (8.4%) samples that were negative for the IAC were discarded. IAC PCRs and molecular detection of Giardia, Blastocystis and Cryptosporidium spp. were performed in a final volume of 9 µL containing 3.5 µL of GoTaq Green Master Mix (Promega), 2 µL of template DNA, and primers. For the Entamoeba complex, a conventional multiplex PCR was performed using previously reported conditions and primers EntF (5′-ATGCACGAGAGCGAAAGCAT-3′), EhR (5′-GATCTAGAAACAATGCTTCTCT-3′), EdR (5′-CACCACTTACTATCCCTACC-3′) and EmR (5′-TGACCGGAGCCAGAGACAT-3′) (Mahmoudi, Nazemalhosseini-Mojarad & Karanis, 2015). Differentiation between E. histolytica, E. dispar and E. moshkovskii was based on the size of the amplicon using these primers. DNA extracted from axenic cultures of each protozoan, provided by The University of Texas Medical Branch, was used as a positive control.
Once all PCRs were performed, the size of each amplicon was assessed using 2% agarose gel electrophoresis followed by staining with SYBR Safe. Subsequently, each product was purified with ExoSAP-IT® following the manufacturer's recommendations. Both strands of each amplicon were sequenced using the Sanger method by Macrogen (Seoul, South Korea). Sequences were edited in MEGA 7.0 (Kumar, Stecher & Tamura, 2016) and compared with publicly available sequences using BLAST to verify that they corresponded to the expected taxonomic unit.

Indices of genetic diversity
To assess the degree of DNA polymorphism, we constructed a multiple alignment of concatenated sequences for each of the loci evaluated for both G. intestinalis and Blastocystis using MAFFT v7. For the gdh and tpi loci of G. intestinalis, we analyzed 30 (295 sites including gaps) and 25 (465 sites including gaps) sequences, respectively. For the 18S gene of Blastocystis, we analyzed 114 sequences (1,635 sites including gaps). All these sequences were used to calculate the indices of diversity (π and Θ), the number of polymorphic (segregating) sites (S), the number of haplotypes (h), and the haplotype diversity by department. DnaSP v5 software was used for these analyses.
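The indices named above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: the function name `diversity_indices`, the choice to drop every alignment column containing a gap or ambiguous base, and the toy sequences are all assumptions made for illustration. It computes S, the number of haplotypes h, haplotype diversity Hd (Nei's unbiased estimator), per-site nucleotide diversity π (mean pairwise differences) and per-site Watterson's Θ.

```python
from collections import Counter
from itertools import combinations


def diversity_indices(seqs):
    """Summary statistics for aligned sequences of equal length.

    Assumes at least two sequences and at least one gap-free column.
    Columns with gaps or ambiguous bases are excluded (an assumption
    of this sketch, not a statement about any particular software).
    """
    n = len(seqs)
    keep = [i for i in range(len(seqs[0])) if all(s[i] in "ACGT" for s in seqs)]
    clean = ["".join(s[i] for i in keep) for s in seqs]
    sites = len(keep)

    # S: columns where more than one nucleotide is observed
    S = sum(1 for i in range(sites) if len({s[i] for s in clean}) > 1)

    # pi: average number of pairwise differences, per site
    pairs = list(combinations(clean, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    pi = diffs / len(pairs) / sites

    # Watterson's theta per site: S / a1 / sites, a1 = sum_{i=1}^{n-1} 1/i
    a1 = sum(1.0 / i for i in range(1, n))
    theta = S / a1 / sites

    # h and haplotype diversity: Hd = n/(n-1) * (1 - sum p_i^2)
    counts = Counter(clean)
    h = len(counts)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    return {"S": S, "h": h, "Hd": hd, "pi": pi, "theta": theta}


# Toy alignment (hypothetical sequences, not study data)
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
result = diversity_indices(toy)
```

Grouping sequences by department before calling the function would give per-department values analogous to those reported in Table 2.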

Statistical analysis
Data were summarized using univariate statistics in Stata 14 (StataCorp, 2015, Stata Statistical Software: Release 14). Subsequently, Cohen's kappa indices were calculated to assess agreement between the results of microscopy and molecular techniques, both globally and for each of the parasites individually.
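As an illustration of the agreement statistic, the following Python sketch computes Cohen's kappa from paired calls. The function name `cohens_kappa` and the example vectors are hypothetical and are not the study data; the analysis itself was run in Stata.

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the marginal
    frequencies. Undefined (division by zero) when p_e equals 1.
    """
    n = len(calls_a)
    labels = set(calls_a) | set(calls_b)
    p_o = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    p_e = sum((calls_a.count(lab) / n) * (calls_b.count(lab) / n) for lab in labels)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical results for ten samples (1 = positive, 0 = negative)
microscopy = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
pcr = [1, 0, 0, 0, 1, 1, 1, 0, 1, 0]
kappa = cohens_kappa(microscopy, pcr)
```

Values near 1 indicate agreement well beyond chance, values near 0 indicate agreement no better than chance, and negative values indicate agreement worse than chance.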

Sample description and detection of protozoa
The ages of individuals from which samples were collected ranged between 1 and 70 years (average, 4.8 years; standard deviation, 5.5 years). The largest number of samples (39.8%) were collected in the Pacific region (Department of Cauca), while 30.8% were collected in the Andean region (Departments of Antioquia, Boyacá, Cundinamarca and the Coffee Axis). The majority (74.9%) of samples came from rural areas.

Comparison of protozoan detection by microscopy and PCR
The majority of samples were positive by microscopy (68.3%) and molecular methods (71.2%). The frequency of positive samples by PCR (n = 616) vs microscopy (n = 649) was calculated for each protozoan: G. intestinalis (PCR 41.1% vs microscopy 24.5%), Blastocystis (PCR 49.0% vs microscopy 33.6%), Cryptosporidium (PCR 5.6% vs microscopy 27.3%), and the Entamoeba complex (PCR 22.9% vs microscopy 0.2%). The concordance between direct microscopy and conventional PCR was analyzed both globally and for each protozoan. In all cases, concordance between the detection techniques was low, with kappa indices of 0.3807 for detection of all protozoa and 0.2699, 0.1478, 0.0149 and −0.0036 for G. intestinalis, Blastocystis, Cryptosporidium spp. and the Entamoeba complex, respectively.

Giardia intestinalis
Using molecular detection by PCR, 43.1% of samples tested positive for G. intestinalis (Fig. 2A). The Caribbean region showed the highest frequency at 89.5% (95% CI [83.1-100.7]), followed by the Andean region (mainly the Department of Antioquia and the Coffee Axis) and the Amazon region (municipalities of Coco Viejo and Caño Conejo, Department of Guainía), in which the majority of sampled areas had a frequency greater than 60% (Table 1; Fig. 2B). Certain areas such as Yopal in Casanare and Paipa in Boyacá were distinguished by their extremely high G. intestinalis frequency (89.5% and 100%, respectively). Among positive samples, assemblages were identified using the gdh marker for 33 samples as follows: AII (3.0%), BIII (36.3%), BIV (48.8%), D (3.0%) and G (9.1%). Sub-assemblage BIV was the most frequent, mainly in the municipality of Poré (Casanare), followed by sub-assemblage BIII, which was frequent in the cities of Yopal (Casanare) and Popayán (Cauca); the latter city had the greatest variety of assemblages. In the city of Montería, a high frequency of assemblage G (75%) was observed. Twenty-five samples were genotyped using the tpi marker, and the AII, BIII and BIV assemblages were detected at frequencies of 8%, 56% and 36%, respectively. The highest frequency (82%) was observed for the BIII sub-assemblage in Mompós, followed by the BIV sub-assemblage (75%) in the city of Yopal (Figs. 2C-2K). For one sample collected in the city of Yopal, the assemblages assigned by the two markers were inconsistent. With the gdh marker, this sample clustered between the AI and AII sub-assemblages and its assemblage could not be determined, whereas with the tpi marker it clustered with the BIV sub-assemblage.
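An upper confidence bound above 100%, as in the interval reported for the Caribbean region, is the kind of result a normal-approximation (Wald) interval produces when a proportion is close to 1. The Python sketch below contrasts the Wald interval with the Wilson score interval, which stays within 0 and 1. The counts used (34 positives out of 37) are an assumption chosen only because they yield a Wald interval of roughly 83.1% to 100.7%; whether the study used these counts or this method is not stated in the text.

```python
from math import sqrt


def wald_interval(k, n, z=1.96):
    """Normal-approximation interval for a proportion k/n.

    Can extend below 0 or above 1 when the proportion is near a boundary.
    """
    p = k / n
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a proportion k/n; always within [0, 1]."""
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Assumed counts for illustration only (not confirmed by the source)
wald_lo, wald_hi = wald_interval(34, 37)
wilson_lo, wilson_hi = wilson_interval(34, 37)
```

With small regional sample sizes and frequencies near 100%, a bounded interval such as Wilson's may be easier to interpret than one that crosses 100%.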

Diversity indices
Genetic diversity indices by department were calculated based on these alignments. For G. intestinalis, the number of segregating (polymorphic) sites (S) was 202 for gdh and 105 for tpi, with haplotypic diversities of 0.977 and 0.903, respectively. The nucleotide diversity indices π and Θ, as well as haplotypic diversity (Hd), were high for the population in Córdoba for both loci and in Casanare for tpi. The lowest diversity in gdh was found among sequences from Cauca. Unfortunately, tpi sequences from Cauca showed electropherograms of poor quality and were not analyzed. For Blastocystis sequences, the departments of Bolívar and Córdoba showed a greater number of polymorphic (segregating) sites (S). In particular, the Bolívar sequences showed the highest number of haplotypes (24), with a haplotypic diversity of 0.989 and higher nucleotide diversity indices compared with Casanare and Cauca. The latter had the lowest sequence diversity (Table 2).

Cryptosporidium and Entamoeba spp.

DISCUSSION
Colombia is a privileged country with natural wealth, geographical variety and ecosystem diversity. However, the climatic conditions and location of the country, in addition to the unequal distribution of resources in different regions, give rise to some primarily rural areas with unfavorable socioeconomic conditions and inadequate sanitary conditions. These factors directly influence the transmission of parasitic diseases among the residents of a given region (Ortiz, López & Rivas, 2012). Another factor that plays a major role in transmission of infectious protozoa is age: children tend to be the most common hosts and adults are likely to be an important source of transmission to children (Carvajal-Restrepo et al., 2019). Age may be associated with susceptibility to infection due to age-dependent immunological conditions that favor colonization by protozoa as well as age-dependent malnutrition and behavioral factors that affect transmission (Harhay, Horton & Olliaro, 2010). Likewise, these factors can influence the transmission of helminths, which explains the finding of some of them in the samples evaluated by microscopy. It is important to clarify that although the objective of our study was not the detection of these geohelminths, we wanted to report them due to the great importance they have mainly in the child population, associated not only with immunological and malnutrition problems but also with growth and development (Papier et al., 2014).
Our findings also support a high transmission rate of helminths in the country, which has severe implications for control programs across the country. Our results showed a high frequency of intestinal protozoa in different regions of the country. Using microscopic detection (with the exception of the Caribbean and Orinoco), we observed that the Andean region had the highest frequency of G. intestinalis, Blastocystis, members of the Entamoeba complex and Cryptosporidium spp. Using molecular tests, the region with the highest frequency of all protozoa evaluated was the Caribbean. There was a low concordance (kappa index = 0.38) between the two techniques evaluated (microscopy and PCR). However, it is not possible to assert that the Andean or Caribbean regions truly had a higher frequency of these protozoa, as our study had an important selection bias: convenience sampling was used, with a higher number of samples obtained in regions such as the Andean and Pacific regions. Nevertheless, it is important to note that the Andean region concentrates the largest population in Colombia and the Pacific region is one of the rainiest areas in the world, with low economic conditions, inadequate health conditions and poor access to education, which might to some extent explain our findings (Barón, 2002). Thus, future studies should collect a larger and comparable number of samples in each region. Despite this, our results are in agreement with those of a survey conducted using microscopy by the Ministry of Health in 2015 (MinSalud, 2015). For example, using molecular tests, we observed that the areas with higher frequencies of these protozoa coincided with the Caribbean region and the Andean region in most cases, with the exception of G. intestinalis, which in the survey by the Ministry of Health was observed at higher frequency in the Colombian Amazon.
In Colombia, most reports on protozoan pathogens have focused solely on microscopic detection (Agudelo-Lopez et al., 2008;Carvajal-Restrepo et al., 2019). Several studies have shown differences in detection rates using molecular tests, which allow identification of cryptic species and their genotypes in addition to detection (Morgan et al., 1998;Stensvold et al., 2018). Thus, there is clear value in using complementary techniques (Beyhan & Taş Cengiz, 2017;Sri-Hidajati et al., 2018;Mateo et al., 2014) for molecular epidemiological studies, which may help to better elucidate the transmission dynamics of microorganisms and to establish better prevention and control plans. Another advantage of using molecular techniques is their sensitivity in cases of polyparasitism (Meurs et al., 2017). Polyparasitism is an important factor in the transmission of parasitic diseases, and the presence of different infectious agents, including helminths and protozoa, may serve as an indicator of inadequate sanitary conditions, immune suppression, nutritional deficiencies and continual reinfection (Supali et al., 2010). In our study, 29.3% of samples evaluated were positive for both Blastocystis and G. intestinalis, 1.7% were positive for Blastocystis, G. intestinalis and Cryptosporidium spp., 3.4% were positive for G. intestinalis and Cryptosporidium spp., and 3.4% for Blastocystis and Cryptosporidium spp. The remaining co-infection combinations occurred at less than 1.4%. None of these combinations showed any geographical associations.
Few studies of these protozoan pathogens have been conducted in Colombia. Studies of samples from indigenous communities in the Amazon (Sánchez et al., 2017), from a rural region in La Vírgen (Ramírez et al., 2015), and from children in rural schools in the municipality of Apulo (Hernández et al., 2019), Cundinamarca, found G. intestinalis in human fecal samples. Sub-assemblages AI, AII, BIII and BIV and sub-assemblages AII, BIII and BIV were detected in the feces of children in nurseries of the Colombian Institute of Family Welfare. Assemblages C and D were detected in samples from dogs in Tolima (Rodríguez et al., 2014). These results are consistent with the detection of assemblages AII, BIII and BIV in the Orinoco, Pacific and Caribbean regions in our study, with the exception of the presence of assemblage D in the Pacific and assemblage G in the Caribbean (Figs. 2C-2K). As in other studies outside Colombia, we observed no restriction of assemblages to specific geographic regions (Broglia et al., 2013; Feng & Xiao, 2011). Assemblage D is typically associated with dogs, while assemblage G infects rodents, including rats and mice (Caccio & Ryan, 2008). Thus, these assemblages have the potential to infect humans, where they could maintain an active cycle of transmission or generate transient infections (Heyworth, 2016). The association between these assemblages and the development of disease is not clear (Sprong, Cacciò & Van Der Giessen, 2009), but they may be acquired through the consumption of untreated water in rural regions where potable drinking water systems are absent, allowing closer contact with animal feces and increasing the risk of zoonotic transmission (Fantinatti et al., 2016).
In agreement with this, assemblage H was detected in a study of water supplied by treatment plants in Nariño (southwest Colombia) (Sánchez et al., 2018), suggesting that water or the feces of wild animals that have not been studied as possible reservoirs could explain the presence of these assemblages. It is also important to consider that in the Caribbean region consumption of exotic animals and animal products, including iguana eggs, small crocodiles, freshwater turtles and armadillos, is very common. The potential role of these foods in the transmission of infections is unknown.
Another protozoan detected with high frequency was Blastocystis, mainly in the Caribbean and Orinoco regions. Frequency rates were often above 80%, especially in some areas of the Amazon such as Caño Conejo and Puerto Inírida (Guainía), and of the Andean region such as Amalfi (Antioquia), the city of Bogotá and Soacha (Cundinamarca), and Calarcá, Corregimiento Barcelona and Pereira (Coffee Axis) (Fig. 3B). These findings are in agreement with results obtained using microscopy in Colombia that showed significant frequency in Caribbean areas such as Santa Marta (62.6%) and in Andean areas such as Santander (25%), Bogotá (22.4%), Quindío (36.4%) and Cundinamarca (34.8%) (Londono-Franco et al., 2014). When subtyping Blastocystis, we identified STs 1-4, 8 and 9, with STs 1-3 having the highest frequency, as reported by Del Coco and collaborators in a 2017 review and by another study in Brazil (Del Coco et al., 2017; Malheiros et al., 2011). The municipality of Poré, Casanare showed the greatest diversity of STs. Similarly, the lower proportion of ST4 observed in the Caribbean and in the Pacific coincides with previous reports suggesting that this subtype is of recent origin in humans from the Americas and of ethnic origin in Colombia associated with the enzootic cycle (Jiménez, Jaimes & Ramírez, 2019; Ramírez et al., 2014; Santin et al., 2011). Surprisingly, one sample was positive for ST8 in the Caribbean and another for ST9 in Casanare in the Orinoco region; these STs are rarely detected in humans (Stensvold & Clark, 2016). This is the first report of ST9 in Colombia; previously, one study in Italy reported the presence of ST9 in samples from symptomatic humans (Meloni et al., 2011). However, more studies are required to evaluate the potential zoonotic origin of this ST and its relationship with the presence of symptoms (Stensvold et al., 2009).
A previous study reported the presence of ST8 in Colombia in marsupial stool samples (Ramírez et al., 2014), while another study detected this ST in arboreal nonhuman primates in Asia and South America (Alfellani et al., 2013). Few studies have reported the presence of this ST in humans, but it could apparently be involved in zoonotic transmission to humans (Meloni et al., 2011; Stensvold et al., 2007), in which a great variety of animals could participate given the great diversity of fauna and ecosystems in the country. For instance, the Caribbean and Orinoco regions contain diverse ecosystems including savannah, mountainous forest, bodies of water, jungles and moorland (Barón, 2002; IDEAM et al., 2007; Vergara, 2018). These are exploited by each department to generate economic resources, and the presence there of nonhuman primates, rodents, birds and pigs infected with intestinal protozoa could increase the risk of zoonotic transmission in rural areas, as has been reported in the country and in Brazil (Rondón et al., 2017; Valença-Barbosa et al., 2019).
We also characterized the alleles of each of the STs. No geographical associations were observed for STs or alleles. Allele 4 of ST1 was detected in the regions of Cauca (44.7%), Casanare (25.5%) and Bolívar (14.9%) (Fig. 3I), and was the most frequently observed, as previously reported (Ramírez et al., 2014; Sánchez et al., 2017). In addition, alleles 8, 80, 88 and 141 were found in ST1, alleles 9, 11, 12, 15 and 64 in ST2, and alleles 31, 34 and 36 in ST3. The presence of alleles 38, 47, 52, 57, 136 and 151 provided evidence of the great intra-subtype diversity present and mostly agreed with studies of STs circulating in Ecuador, Peru, Bolivia, Colombia, Brazil and Argentina in samples from humans, domestic animals and the enzootic cycle (Ramírez et al., 2014, 2016). For ST4, alleles 42, 91 and 133 were identified, which had previously been reported in Colombia; in particular, allele 91 is possibly of European origin (Ramírez et al., 2014; Stensvold et al., 2011). For ST8 isolates reported in Colombia and Brazil (Ramírez et al., 2016), allele 21 probably had a zoonotic origin. For ST9 we detected allele 129, of which there is no previous report in Colombia. The origin of this ST has not been established, so it is not possible to make inferences about its transmission. As mentioned above, great diversity was present among the STs characterized for Blastocystis, and establishing the transmission dynamics for several of the STs detected at low frequency would be a useful task.
In addition to detecting protozoa and determining the frequencies of STs and assemblages, the genetic diversity among sequences of G. intestinalis and Blastocystis was evaluated by department and by marker. For G. intestinalis, diversity indices were higher for assemblages from the Casanare and Córdoba departments for gdh (Table 2). However, the low number of sequences obtained from Córdoba means that the degree of diversity would need to be verified using more samples in a future study. For Blastocystis, greater diversity was found in the Caribbean region, in the departments of Bolívar and Córdoba. This was expected since Blastocystis usually has high inter-subtype variability, as reported in a study of SSU rDNA genes conducted in Mexico, where the results showed similar diversity indices within each subtype despite their different geographical regions, different inter-subtype indices (Villegas-Gómez et al., 2016), and greater diversity among the STs of a control group than among those of a group with irritable bowel syndrome (Vargas-Sanchez et al., 2015). As with G. intestinalis, the number of sequences for Córdoba was very small, which precludes any strong conclusions regarding this population.
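As an illustration of the indices discussed above, the following minimal Python sketch computes haplotype diversity (Hd, using Nei's sample-size-corrected estimator) and the number of segregating sites (S) from a set of aligned sequences. It is not the procedure used in this study, where the indices were obtained with DnaSP; the function names, the handling of gaps and ambiguous bases, and the toy alignment are assumptions made only for this example.

```python
from collections import Counter


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: Hd = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # each distinct sequence is treated as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def segregating_sites(seqs):
    """Count alignment columns that show more than one nucleotide state."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    s = 0
    for column in zip(*seqs):
        states = set(column)
        # Assumption for this sketch: skip columns with gaps or ambiguous bases.
        if "-" in states or "N" in states:
            continue
        if len(states) > 1:
            s += 1
    return s


# Hypothetical toy alignment (not data from this study).
toy = ["ACGTAC", "ACGTAC", "ACGTTC", "ATGTTC"]
hd = haplotype_diversity(toy)
S = segregating_sites(toy)
print(f"Hd = {hd:.3f}, S = {S}")
```

Because the n/(n-1) correction grows as n shrinks, estimates based on very few sequences, as was the case for Córdoba, are unstable, which is consistent with the caution expressed above.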
Finally, for the less frequently detected protozoa Cryptosporidium and Entamoeba spp., we were able to identify the species present in some of the positive samples. For Cryptosporidium spp., C. andersoni was detected in the Amazon; C. muris, C. ubiquitum and C. andersoni were detected in the Andean region; C. hominis, C. muris and C. felis were detected in the Caribbean; and C. hominis and C. parvum were detected in the Pacific. These results agreed with previous studies (Galván-Díaz, 2018; Sánchez et al., 2017, 2018), except for C. ubiquitum and C. andersoni, which had not previously been reported in human feces in Colombia. These two species are associated with a wide variety of animal hosts including domestic and wild ruminants, rodents, omnivores and primates (Fayer, Santín & Macarisin, 2010). The low host specificity of C. ubiquitum, along with the shared habitats of different animals, can contribute to its wide distribution and therefore to possible infections in humans, especially immunocompromised patients (Fayer, Santín & Macarisin, 2010). C. parvum and C. hominis are the species most frequently detected in humans. However, cattle and other domestic and wild animals infected with different species can be of great importance for public health and for the transmission of this parasite (Ryan, Fayer & Xiao, 2014). The detection in humans of species associated with bovine and feline hosts suggests that the transmission of Cryptosporidium spp. in the regions evaluated is zoonotic and possibly also from human to human. Likewise, these findings point to the need to evaluate prevention and control measures for parasitic infections and to improve the sanitation infrastructure for water intended for human consumption.
For the Entamoeba complex, we only obtained samples positive for E. dispar in the departments of Bolívar and Casanare and samples positive for E. moshkovskii in Bolívar. This indicated probable orofecal transmission in the areas evaluated. Our findings are consistent with a study showing a high frequency of E. dispar and E. moshkovskii and a low frequency of E. histolytica in La Virgen, Cundinamarca, Colombia (López et al., 2015). Because of the low number of samples positive for Cryptosporidium and Entamoeba spp., it was not possible to establish any geographical associations for these parasites. For these microorganisms, the number of samples collected in each region should be expanded to establish with greater certainty the true frequencies of the circulating variants in the country.

CONCLUSIONS
In conclusion, ours is the first study to assess the frequency and genotypes of intestinal protozoa using sampling areas located in five biogeographical regions of Colombia. Our results showed frequent transmission of intestinal protozoa and high genetic diversity of G. intestinalis and Blastocystis, mainly in the Caribbean and the Andean regions. The sampled areas need to be expanded to establish the transmission rates and genetic characteristics of these microorganisms more accurately. Although the present results are an important contribution to understanding the frequencies of these parasites in the country, new studies with more samples and more cities/municipalities are required to obtain information that is more representative of the biogeographical regions and to establish the real frequency of these microorganisms in each of them. Future studies could also evaluate samples from other countries in South America, which would permit assessment of the frequency of intestinal parasites, as well as their STs and assemblages, at the continental level. The high frequency of Blastocystis and G. intestinalis in the samples analyzed likely represents active orofecal transmission involving hosts other than humans within the life cycles of these protozoa. This might put populations living in vulnerable socioeconomic conditions at risk, and it is therefore necessary to implement new strategies for the control and prevention of these microorganisms.

Abbreviations
Blast: Basic Local Alignment Search Tool
gdh: glutamate dehydrogenase
Hd: haplotypic diversity
Mega: Molecular evolutionary genetics analysis
S: polymorphic (segregating) sites
SSUrRNA: Small subunit ribosomal ribonucleic acid
ST: subtype
tpi: triose phosphate isomerase

Author Contributions
Ximena Villamizar performed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
Giovanny Herrera performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.
Julio Cesar Giraldo performed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
Luis Reinel Vasquez-A performed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
Plutarco Urbano analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
Oswaldo Villalobos analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
Catalina Tovar analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
Juan David Ramírez conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.

Human Ethics
The following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers): This study was a minimum risk investigation for participants. Both the ethical standards of the Colombian Ministry of Health (Youth Code) and the Helsinki Declaration of 2013 were followed. The parents or guardians of minors participating in the study signed informed consent forms and gave their permission to obtain samples. This study was approved by the research ethics committee of the Universidad del Rosario (registered in Act No. 394 of the CEI-UR), the ethics committee of the Department of Internal Medicine of the Universidad del Cauca (number VRI024/2016), and the INCCA University of Colombia (number 237894).

DNA Deposition
The following information was supplied regarding the deposition of DNA sequences: The raw data is available as a Supplemental File. All sequences are available at GenBank: MN877659-MN877714.

Data Availability
The following information was supplied regarding data availability: The raw data is available as a Supplemental File.

Supplemental Information
Supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.8554#supplemental-information.