Apparent source levels and active communication space of whistles of free-ranging Indo-Pacific humpback dolphins (Sousa chinensis) in the Pearl River Estuary and Beibu Gulf, China

Zhi-Tao Wang; Whitlow W.L. Au; Luke Rendell; Ke-Xiong Wang; Hai-Ping Wu; Yu-Ping Wu; Jian-Chang Liu; Guo-Qin Duan; Han-Jiang Cao; Ding Wang

doi:10.7717/peerj.1695

Zhi-Tao Wang¹,²,³,⁹, Whitlow W.L. Au³, Luke Rendell⁴, Ke-Xiong Wang¹, Hai-Ping Wu⁵, Yu-Ping Wu⁶, Jian-Chang Liu⁷, Guo-Qin Duan⁸, Han-Jiang Cao⁸, Ding Wang¹

¹ The Key Laboratory of Aquatic Biodiversity and Conservation of the Chinese Academy of Sciences, Institute of Hydrobiology of the Chinese Academy of Sciences, Wuhan, Hubei, China

² University of Chinese Academy of Sciences, Beijing, China

³ Marine Mammal Research Program, Hawaii Institute of Marine Biology, University of Hawaii, Hawaii, HI, United States of America

⁴ Sea Mammal Research Unit, School of Biology, University of St. Andrews, Fife, United Kingdom

⁵ School of Marine Sciences, Qinzhou University, Guangxi, China

⁶ School of Marine Sciences, Sun Yat-Sen University, Guangzhou, China

⁷ Transport Planning and Research Institute, Ministry of Transport, Guangzhou, China

⁸ Hongkong-Zhuhai-Macao Bridge Authority, Guangzhou, China

⁹ Division of Marine Science and Conservation, Nicholas School of the Environment, Duke University, Beaufort, NC, United States of America

Published: 2016-02-15
Accepted: 2016-01-26
Received: 2015-11-25

Academic Editor: Joseph Pawlik

Subject Areas: Animal Behavior, Conservation Biology, Ecology, Marine Biology, Zoology
Keywords: Active communication space, Pearl River Estuary, Sound propagation model, Whistles, Indo-Pacific Humpback dolphins, Hydrophone arrays, Beibu Gulf, Apparent source level, Sousa chinensis

Copyright: © 2016 Wang et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.

Cite this article: Wang Z, W.L. Au W, Rendell L, Wang K, Wu H, Wu Y, Liu J, Duan G, Cao H, Wang D. 2016. Apparent source levels and active communication space of whistles of free-ranging Indo-Pacific humpback dolphins (Sousa chinensis) in the Pearl River Estuary and Beibu Gulf, China. PeerJ 4:e1695 https://doi.org/10.7717/peerj.1695

The authors have chosen to make the review history of this article public.

Abstract

Background. Knowledge of species-specific vocalization characteristics and their associated active communication space, the effective range over which a communication signal can be detected by a conspecific, is critical for understanding the impacts of underwater acoustic pollution, as well as other threats.

Methods. We used a two-dimensional cross-shaped hydrophone array system to record the whistles of free-ranging Indo-Pacific humpback dolphins (Sousa chinensis) in shallow-water environments of the Pearl River Estuary (PRE) and Beibu Gulf (BG), China. Using hyperbolic position fixing, which exploits time differences of arrival of a signal between pairs of hydrophone receivers, we obtained source location estimates for whistles that had a good signal-to-noise ratio (SNR ≥ 10 dB) and were not contaminated by other sounds, and back-calculated their apparent source levels (ASL). Combining these with the masking levels (including simultaneous noise levels, masking tonal threshold, and the Sousa auditory threshold) and custom-made site-specific sound propagation models, we further estimated their active communication space (ACS).

Results. Humpback dolphins produced whistles with average root-mean-square ASL of 138.5 ± 6.8 (mean ± standard deviation) and 137.2 ± 7.0 dB re 1 µPa in PRE (N = 33) and BG (N = 209), respectively. We found statistically significant differences in ASLs among different whistle contour types. The mean and maximum ACS of whistles were estimated to be 14.7 ± 2.6 (median ± quartile deviation) and 17.1 ± 3.5 m in PRE, and 34.2 ± 9.5 and 43.5 ± 12.2 m in BG. Using just the auditory threshold as the masking level produced mean and maximum ACS_at of 24.3 ± 4.8 and 35.7 ± 4.6 m for PRE, and 60.7 ± 18.1 and 74.3 ± 25.3 m for BG. The small ACSs were due to the high ambient noise level. Significant differences in ACSs were also observed among different whistle contour types.

Discussion. Besides informing the evaluation of appropriate noise exposure levels and the regulation of underwater acoustic pollution, these baseline data can also be used to aid the passive acoustic monitoring of dolphin populations, to define the boundaries of separate groups in a more biologically meaningful way during field surveys, and to guide the appropriate approach distance for local dolphin-watching boats and research boats during focal group follows.

Introduction

Human activities have profoundly changed the world’s aquatic environment. The International Union for the Conservation of Nature (IUCN) suggests that nearly half of the extant marine mammal species are threatened by two or more human impacts, and that a quarter of marine mammals have been classified as threatened with extinction (Davidson et al., 2012). The Indo-Pacific humpback dolphin (Sousa chinensis, locally called the Chinese white dolphin) is widely distributed throughout shallow, coastal waters from eastern India in the west to the Southern China Sea in the east and throughout Southeast Asia (Jefferson & Rosenbaum, 2014; Reeves et al., 2008). However, marine mammal species occurring in coastal areas are most susceptible to risk, and the coastal distribution of the humpback dolphin makes it highly vulnerable to the impact of human activity (Davidson et al., 2012). Its conservation status was categorized as Near Threatened by the IUCN Red List of Threatened Species (Reeves et al., 2008) and as a Grade One National Key Protected Animal in China. Five resident populations of Indo-Pacific humpback dolphins have been identified in Chinese coastal waters: the Pearl River Estuary (PRE) (Chen et al., 2010), Leizhou Bay (Xu et al., 2015) of Guangdong, the Beibu Gulf (BG) of Guangxi (Chen et al., 2009; Pan et al., 2006), Xiamen harbor of Fujian (Chen et al., 2009), and the West coast of Taiwan (Wang et al., 2012).

The PRE region (Fig. 1) is among the most economically developed regions in China (Yeung & Shen, 2008) and is also home to the world’s largest known population of humpback dolphins (Chen et al., 2010; Preen, 2004), with the population size estimated to be over 2,500 (CVs: 19–89%) (Chen et al., 2010). The BG region (Fig. 1) is, in comparison, relatively undeveloped, with a smaller human population, and the humpback dolphin population there was estimated to be 251 (95% CI [136–794]) (Chen et al., 2009; Pan et al., 2006). Concern about the effects of anthropogenic noise on aquatic life is growing worldwide (Popper & Hawkins, 2012), and economic growth in China has been accelerating human damage to coastal ecosystems (He et al., 2014). The recent construction of the Hongkong-Zhuhai-Macao bridge (Wang et al., 2014b), the Zhuhai wind-farm project in the Pearl River Estuary, and the flourishing year-round dolphin-watching industry in Beibu Gulf (Wang et al., 2013) all have potentially adverse effects on aquatic life. Pile-driving is likely to cause acoustic disturbance (Wang et al., 2014b), and the intense dolphin-watching industry makes the dolphins susceptible to close approaches by high-speed dolphin-watching vessels. High-speed vessels can seriously affect the dolphins’ natural behavior (Ng & Leung, 2003), introduce masking noise (Sims, Hung & Wursig, 2012a), and cause injury or even death (Jefferson, 2000) to resident cetaceans. Hence, concerns regarding the conservation of these Chinese white dolphin populations are increasing.

Figure 1: Map of the study area.
Acoustic recordings of underwater sounds produced by humpback dolphins were made in Pearl River Estuary and Beibu Gulf. Dashed line area shows the sound recording region.

DOI: 10.7717/peerj.1695/fig-1

Marine mammals, especially cetaceans, have evolved sophisticated sound production and reception mechanisms to aid in meeting their requirements for a series of vital processes, including communication, navigation, and foraging (Au, 1993; Au & Hastings, 2008; Surlykke et al., 2014). Dolphins use frequency modulated narrowband sounds, also called whistles, for communication with conspecifics (Janik, 2000b; Janik & Slater, 1998). Both whistle source level (SL), defined as the amplitude at 1 m from the animal on the acoustic axis (Janik, 2000a) and its associated active communication space, the effective range over which a communication signal can be detected by a conspecific (Marten & Marler, 1977; Tervo et al., 2012) are fundamental parameters in animal communication systems. The source level is important because it can provide information on the biological ambient noise caused by conspecifics to which an animal is exposed (Janik, 2000a), which can shed some light on evaluating the appropriate exposure level of dolphins to anthropogenic noise. Knowledge of the statistical distribution of whistle source levels can help in planning passive acoustic monitoring studies of habitat use, as well as abundance estimates (Frankel et al., 2014; Širović, Hildebrand & Wiggins, 2007). However, the distance commonly used to identify dolphins as members of a group was either the ‘10-m chain rule’ (any individuals considered part of the same group were within 10 m of at least one other member of the group, regardless of behavior) (Acevedo-Gutiérrez, 2002; Acevedo-Gutiérrez & Stienessen, 2004; Connor, Smolker & Bejder, 2006; Quick & Janik, 2008; Quick & Janik, 2012; Smolker et al., 1992) or a radius of 100 m (a collection of individuals within which no dolphins were separated by greater than 100 m) (Barco et al., 1999; Lewis, Wartzok & Heithaus, 2011), which may not be biologically meaningful. 
Using passive acoustic localization, many whistles recorded from a dolphin focal group (defined by the 10-m chain rule) were confirmed to have been produced by nearby non-focal groups rather than by the defined focal group (Quick & Janik, 2008). Also, the whistle active spaces estimated in previous studies of odontocetes did not match, and were always greater than, the separation distances commonly used to define the boundaries of separate groups (Janik, 2000a; Miller, 2006; Quintana-Rizzo, Mann & Wells, 2006). Additionally, with the increasing threat of acoustic masking in marine ecosystems by anthropogenic noise (Clark et al., 2009), the active communication space can help to define the boundaries of separate dolphin groups in a more biologically meaningful way.

Humpback dolphins can emit pulsed sounds with a peak frequency of 114 ± 12 kHz and an apparent source level of 199 ± 3 dB re 1 µPa @ 1 m (peak-to-peak) (Freitas et al., 2015). They can also produce whistles with fundamental frequencies averaging 6.4 kHz, and minimum and maximum fundamental frequencies averaging 5.1 kHz and 7.7 kHz, respectively (Wang et al., 2013). Although S. chinensis is a common species in many waters, information about its vocal behavior remains sparse (Hoffman et al., 2015; Kimura et al., 2014; Li et al., 2012; Wang et al., 2013; Wang et al., 2015). The regulation of underwater acoustic pollution is currently constrained by sparse data, especially the scarcity of quantitative data on animal vocalization characteristics and on the effects of anthropogenic noise on biological functions, such as acoustically mediated social interactions (NRC, 2005). In order to avoid or mitigate possible detrimental impacts and to better protect these Sousa populations, basic acoustic information is needed.

While the apparent source level of whistles, defined as the back-calculated sound pressure level at 1 m distance from the sound source at an unknown angle from the acoustic axis (Jensen et al., 2009b), and its active communication space have been estimated in many cetaceans, such as the bottlenose dolphin (Tursiops truncatus) (Jensen et al., 2012) and the white-beaked dolphin (Lagenorhynchus albirostris) (Rasmussen et al., 2006), little relevant information is available for the humpback dolphin. In this study, the apparent source levels of whistles produced by free-ranging S. chinensis in the Pearl River Estuary and Beibu Gulf were measured using passive acoustic localization. The active communication space of the whistles was further estimated by integrating whistle source parameters, real-time measurements of environmental background noise spectrum levels, and modeling of the sound propagation loss for the habitat in question with the animals' physiological hearing capabilities and critical ratios.

Methods

Data collection

Acoustic recordings were made during June–July 2014 in PRE (22°06′–22°11′N; 113°40′–113°45′E) and August 2014 in BG (21°30′–21°37′N; 108°40′–108°58′E), China (Fig. 1). Surveys were conducted from a 7.5 m recreational powerboat with a 140 hp outboard engine in PRE, or a 6.8 m dolphin-watching vessel powered by a 40 hp outboard engine in BG, under Beaufort sea states ≤3 (on a scale of 12), following a randomly selected route rather than structured transects.

When a group of dolphins was sighted with the majority of its members engaged in slow or moderate movements (resting, milling, socializing or feeding) (Hawkins & Gartside, 2009), the vessel moved into position beside the group. Groups were defined by the ‘10 m chain rule’ (Quick & Janik, 2012). If the dolphin group was traveling fast (Hawkins & Gartside, 2009), the boat moved swiftly ahead of the group along its direction of travel and waited for it to pass by. During sound recording, the vessel’s engine was turned off. For each animal group, the GPS time, location (latitude and longitude), dolphin species, and behavior (traveling, socializing, milling, resting, and feeding) (Hawkins & Gartside, 2009) were recorded. The water depth and water quality, including temperature, salinity, and pH, were measured with a Horiba Multi-parameter Water Quality Monitoring System (model W-22XD; Horiba, Ltd., Kyoto, Japan) for sound propagation modeling. Recording was stopped when none of the dolphins of a group were within 50 m of the hydrophone array.

The two-dimensional cross-shaped array consisted of five Reson piezoelectric hydrophones, one in the middle and one at the end of each of the four arms (model TC-4013, frequency range 1 Hz–170 kHz, sensitivity: −211 dB ± 3 dB re 1 V/µPa; Reson Inc., Slangerup, Denmark) (Fig. 2). Each hydrophone was equipped with a 1 MHz bandwidth Reson EC6081 voltage pre-amplifier with a band-pass filter (model VP2000, pass-band 0.1 to either 100 kHz or 250 kHz depending on sampling rate). The EC6081 employs first-order (one-pole) filters, which give a filter slope of 6 dB/octave. The hydrophones were connected via a 16-channel synchronized analogue-to-digital (A/D) converter to a laptop computer running LabVIEW 2011 SP1 software (National Instruments (NI), Austin, TX, USA). The A/D converter consisted of four high-speed, 16 bit resolution, data acquisition (DAQ) modules (NI 9223), incorporated in a compact DAQ four-slot USB chassis (NI cDAQ-9174). Each NI 9223 was a four-channel simultaneous A/D converter with a sample rate of up to 1 MHz for each channel. Both the VP2000 amplifier and the NI cDAQ-9174 were powered by external battery packs.

Figure 2: Schematic of experimental apparatus and the array design.
Acoustic signals were picked up by the hydrophones, conditioned by the amplifier, and filtered before storage on the PC via the DAQ system. The distance from each of H1, H2, H3, and H4 to H5 was 1.47 m and 1.54 m for the Pearl River Estuary and Beibu Gulf, respectively. The distance between adjacent hydrophones among H1, H2, H3, and H4 was 2.08 m and 2.18 m for the Pearl River Estuary and Beibu Gulf, respectively. The inset shows a detailed view of the hydrophone array.

DOI: 10.7717/peerj.1695/fig-2

A steel bracket was used to fix the distance between hydrophones. The bracket was made from a stainless cylinder-shaped bar with a cross structure as its backbone (bar diameter: 2.5 cm) and a reinforcing stainless bar (bar diameter: 2 cm) in each quadrant (Fig. 2). A 5 cm extension bar (bar diameter: 0.3 cm) was affixed perpendicularly to the bracket plane at the center and at the end of each arm to mount the hydrophones, minimizing interference of the bracket with the sound (including reflection and/or shadowing) (Fig. 2). Inter-hydrophone distance along the backbone structure of the bracket was 1.47 m in PRE and 1.54 m in BG.

During sound recording, the hydrophone array was deployed from the side of the boat so that the array lay in the horizontal plane at a depth of 1 m. Floats and attached weights limited array movement to reduce noise due to water flow (Fig. 2). The acoustic data were stored directly on the hard drive of a computer in technical data management streaming (TDMS) format and sampled at a rate of either 200 kHz or 512.828 kHz, giving a Nyquist frequency of 100 kHz and 256.414 kHz, respectively. The presence of signals was monitored in real time using both the PC screen for waveform monitoring and a headphone connected to the center hydrophone. To minimize the chance of missing good signals, a three-second pre-recording buffer was employed. Upon detecting a signal, a manual trigger was used to initiate a recording with the buffer included.

The Reson hydrophones were calibrated prior to shipment from the factory (Fig. S1). The remaining components of the recording system, including the amplifier, filter, A/D converter and laptops, were calibrated in the lab prior to the field survey by inputting a calibration signal generated by an OKI underwater sound level meter (model SW1020; OKI Electric Industry Co., LTD., Tokyo, Japan). Signal flow was also simultaneously monitored with an oscilloscope (model TDS1002C; Tektronix Inc., Beaverton, OR, USA). The noise floor of the recording system was about 65 and 55 dB re 1 µPa²/Hz at 100 Hz and 1 kHz, respectively, and flat at about 50 dB re 1 µPa²/Hz between 10 kHz and 100 kHz, which were lower than the ambient noise level at sea state 0 in our study (Fig. S1), and suitable for noise monitoring.

Sound propagation modeling

Multi-path propagation is inevitable in shallow waters, as bottom and surface reflections interfere with the signal propagating along the direct path. Following standard sound propagation theory (Au & Hastings, 2008; Aubauer, Lammers & Au, 2000; Urick, 1983), a custom-compiled sound propagation model (File S1) was adopted for this study (Fig. 3). The model targeted the impact of multi-path propagation on the original signals and took into account the hydrophone-animal geometry (such as animal depth, hydrophone depth, and distance between hydrophone and animal) and site-specific environmental and bathymetric characteristics (such as water depth and bottom sediment contents).

Figure 3: Schematic of multipath propagation.
The dw, da and dh were the depth of the water, the animal, and the receiving hydrophone, respectively. “A” denotes the animal location, and “H” denotes the hydrophone, Aa was the horizontal separation distance between the animal and the hydrophone, r₀ was the direct signal propagation path, r_s(m) and r_b(m) were the signal propagation lengths for multipath propagation signal with a total number of m reflection points and the initial reflection point at the air–water and water–bottom interface, respectively, θ_s(m) and Φ_s(m) were the incident (same as reflected) and transmitted angle, respectively, for multipath propagation signal with a total number of m reflection points and the initial reflection point at the air–water interface, θ_b(m) and Φ_b(m) were the incident (same as reflected) and transmitted angle, respectively, for multipath propagation signal with a total number of m reflection points and the initial reflection point at the water–bottom interface, h_s(m) and h_b(m) were the vertical propagation length of the multipath propagation signal with a total number of m reflection points and the initial reflection point at the air–water and water–bottom interface, respectively, by referencing the animal location. The insets show the sound transmission at the air–water interface and at the water–bottom interface, respectively.

DOI: 10.7717/peerj.1695/fig-3

Since the energy flux density (EFD, dB re 1 µPa² s) is more meaningful in situations where considerable signal distortion occurs during propagation (Urick, 1983), the estimated transmission loss (TL) for each location was derived from the difference between the energy flux density of the received signal (EFD_r) and the energy flux density at the signal source (EFD_s) by the equations:

(1) $TL = EFD_{r} - EFD_{s}$

(2) $EFD = 10 \times \log_{10} \left\{ \int_{0}^{T} \left( p^{2}(t) / p_{ref1}^{2} \right) dt \right\}$

where p(t) is the sound pressure in µPa and the reference value (written p_ref1² in Eq. (2)) is 1 µPa² s. For each hydrophone and animal depth combination at a given water depth, the transmission losses obtained above at varied separation distances between the hydrophone and animal were fitted to a geometric spreading loss model to estimate the environment-dependent transmission loss coefficient by the equation:

(3) $TL = k \times \log_{10} (r / r_{0})$

where k is the transmission loss coefficient, r is the distance between the animal and hydrophone, and r_0 is the reference range, set as 1 m (Fig. 4). Frequency-dependent absorption was ignored here, since the sound absorption losses for a standard Sousa whistle, with a mean fundamental frequency of 6.35 kHz (Wang et al., 2013), as a function of site-specific temperature and pressure at the PRE and BG were 0.31 and 0.30 dB/km, respectively, according to the Fisher and Simmons equation (Fisher & Simmons, 1977), and would be negligible over the ranges at which we actually recorded signals.
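The two numerical steps in this paragraph, computing an EFD (Eq. (2)) and fitting the coefficient k (Eq. (3)), can be illustrated with a minimal Python sketch. This is not the authors' model from File S1: the function names, the Riemann-sum approximation of the integral, and the closed-form least-squares fit through the origin are assumptions made here for illustration, and k is treated as a positive loss coefficient.

```python
import math

def energy_flux_density(pressure_upa, fs_hz):
    """EFD in dB re 1 uPa^2 s (Eq. 2); the integral is approximated by a Riemann sum."""
    dt = 1.0 / fs_hz
    energy = sum(p * p for p in pressure_upa) * dt
    return 10.0 * math.log10(energy)

def fit_tl_coefficient(ranges_m, tl_db, r0_m=1.0):
    """Least-squares k for TL = k * log10(r / r0), i.e. a line through the origin."""
    x = [math.log10(r / r0_m) for r in ranges_m]
    numerator = sum(xi * ti for xi, ti in zip(x, tl_db))
    denominator = sum(xi * xi for xi in x)
    return numerator / denominator

# Synthetic check (hypothetical data): losses generated with k = 15 are recovered.
example_ranges = [2.0, 5.0, 10.0, 20.0, 50.0]
example_tl = [15.0 * math.log10(r) for r in example_ranges]
k_fit = fit_tl_coefficient(example_ranges, example_tl)
```

A k near 20 would correspond to spherical spreading and a k near 10 to cylindrical spreading; the fitted value for a shallow site would be expected to depend on the geometry and bottom properties the model describes.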

Figure 4: Sound transmission loss coefficient as a function of animal depth and distance between hydrophone and animals at given hydrophone and water depth.
The blue curve was the modeled transmission loss of the whistle with a peak frequency of 6.6 kHz (see spectrogram in Fig. 5) at water depth of 4.5 m with hydrophone at 1 m depth and animal located at (A) surface, (B) middle section and (C) bottom of the water in Beibu Gulf. The red curve in each graph represents logarithmic curve fit of the blue curve.

DOI: 10.7717/peerj.1695/fig-4

Figure 5: Schematic of acoustic localization of humpback dolphins whistle.
(A) Oscillograms of the same signal received at four different hydrophones (H1, H2, H3, and H4). (B) Cross-correlation functions; the legend in the top left corner of each panel indicates which two hydrophones were cross-correlated. The peak of each correlation function corresponds to the difference in time of arrival of the whistle between the two compared hydrophones (the first-listed hydrophone minus the second). (C) Hyperbolic fixing; the legend next to each hyperbola indicates which hydrophone pair it corresponds to. Points of intersection of the hyperbolae indicate the position of the sound source. Closed blue circles (in C) indicate the positions of the hydrophones of the array. Point (0, 0) was located at the center of the acoustic array. The slider in the top right corner of (C) optimizes the estimated depth of the animal.

DOI: 10.7717/peerj.1695/fig-5

Acoustic data analysis

The four peripheral hydrophone channels were used for the acoustic localization of phonating animals, and the center hydrophone channel was used for detailed whistle characteristic measurement. Raven Pro Bioacoustics Software (version 1.4; Cornell Laboratory of Ornithology, NY, USA) was used to analyze the acoustic data as spectrograms (window type: Hann; FFT size: 8,192 and 16,384 samples for sampling frequencies of 200 and 512 kHz, respectively; frame overlap: 80%). Only whistles with good signal-to-noise ratios (SNR ≥ 10 dB) on all five hydrophones that did not overlap with echolocation signals or with whistles from different individuals were analyzed. In order to make the data more independent and to reduce the possibility of using multiple whistles from the same animal, for each dolphin encounter, we extracted only one signal for each whistle tonal type (Wang et al., 2013) for further analysis.

Acoustic localization

Passive acoustic localization of vocalizing animals based on differences in the time of arrival of the same sound between all pairs of hydrophone receivers is a well-established technique (Au & Benoit-Bird, 2003; Janik, 2000b; Jensen et al., 2009a; Spiesberger, 1997; Spiesberger & Fristrup, 1990; Wahlberg, Møhl & Madsen, 2001; Watkins & Schevill , 1972). In this study, a custom-written package based on Matlab software (version R2010b; The Mathworks, Inc., Natick, MA, USA), named TOADY (King, Harley & Janik, 2014; Quick, Rendell & Janik, 2008; Quick & Janik, 2012; Quick & Janik, 2008; Schulz, Whitehead & Rendell, 2006; Schulz et al., 2008) was adopted for localizing phonating animals. The time delays were preserved on the simultaneous multi-track recording of signal input from all hydrophones. Signal waveforms from the different recording channels were cross-correlated to determine the difference in arrival time of a sound at each hydrophone pair. Before cross-correlation processing, a digital high-pass filter set to start rolling off just below the minimum frequency of the fundamental frequency contour of each whistle was used to eliminate any low-frequency background noise interference. The position of the largest peak in the resulting cross-correlation vector represents the amount by which the two signals are offset in time (Hayes et al., 2000). Signals with more than one equivalent peak and/or low cross-correlation maxima were discarded (Lammers & Au, 2003). The time delays were used to generate hyperboloid surfaces of possible source locations.
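The cross-correlation step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the TOADY implementation: the function name, the brute-force search over lags, and the sign convention (a positive value means the second channel receives the sound later) are assumptions, and the high-pass filtering and the rejection of ambiguous peaks described in the text are omitted.

```python
def tdoa_by_cross_correlation(sig_a, sig_b, fs_hz):
    """Delay (s) of sig_b relative to sig_a, taken at the largest cross-correlation peak.

    Both channels are assumed to have equal length and a common sampling clock.
    """
    n = len(sig_a)
    best_lag, best_value = 0, float("-inf")
    for lag in range(-(n - 1), n):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += sig_a[i] * sig_b[j]
        if total > best_value:
            best_lag, best_value = lag, total
    return best_lag / fs_hz

# Synthetic check (hypothetical data): the same pulse arrives 5 samples later on channel B.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0]
channel_a = [0.0] * 20
channel_b = [0.0] * 20
channel_a[3:8] = pulse
channel_b[8:13] = pulse
delay_s = tdoa_by_cross_correlation(channel_a, channel_b, fs_hz=1000.0)
```

The direct double loop is quadratic in the signal length, so for recordings at the sampling rates used here an FFT-based correlation would be the practical choice; the loop form is kept only because it makes the definition explicit.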

The standard hyperboloid can be obtained by rotating a standard hyperbola about its transverse axis. In detail, the standard hyperbola can be constructed by the equations:

(4) $x^{2} / a_{ij}^{2} - y^{2} / b_{ij}^{2} = 1$

(5) $a_{ij} = c \times t_{i-j} / 2$

(6) $d_{ij} = \sqrt{(x_{i} - x_{j})^{2} + (y_{i} - y_{j})^{2} + (z_{i} - z_{j})^{2}}$

(7) $b_{ij} = \sqrt{(d_{ij} / 2)^{2} - a_{ij}^{2}}$

where (x, y) represent the two-dimensional coordinates of points on the hyperbola lying in the hydrophone array plane, and a_ij and b_ij represent, respectively, the distance from the center to either vertex and the length of a segment perpendicular to the transverse axis drawn from each vertex to the asymptotes of the hyperbola for hydrophones i and j. The symbols (x_i, y_i, z_i) and (x_j, y_j, z_j) represent the three-dimensional coordinates of hydrophones i and j, respectively, c represents the speed of sound in water in m/s, and t_i−j represents the time delay between hydrophones i and j in seconds. The maximum allowable time delay between a pair of hydrophones in the array is limited to the direct-path propagation time between them (Helble et al., 2015) as:

(8) $\max (t_{i-j}) = d_{ij} / c$

where d_ij represents the separation of hydrophones i and j in m. The standard hyperboloid was then rotated and translated to the center of the spatial geometry of the corresponding hydrophone pair.
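As a worked illustration of Eqs. (5) to (8), the following Python sketch computes the hyperbola parameters for one hydrophone pair and rejects delays longer than the direct-path travel time. The function name, the example coordinates, and the sound speed of 1,500 m/s are assumptions for illustration; the source does not state the sound speed used.

```python
import math

def hyperbola_parameters(h_i, h_j, delay_s, c_ms=1500.0):
    """Return (a_ij, b_ij, d_ij) from Eqs. (5)-(7) for hydrophones at h_i and h_j.

    c_ms is an assumed speed of sound in water (m/s).
    """
    d_ij = math.dist(h_i, h_j)                 # Eq. (6): hydrophone separation
    if abs(delay_s) > d_ij / c_ms:             # Eq. (8): direct-path limit
        raise ValueError("time delay exceeds the direct-path travel time")
    a_ij = c_ms * delay_s / 2.0                # Eq. (5)
    b_ij = math.sqrt(max(0.0, (d_ij / 2.0) ** 2 - a_ij ** 2))  # Eq. (7)
    return a_ij, b_ij, d_ij

# Two peripheral hydrophones 1.47 m from the array center on perpendicular arms
# (the PRE arm length), with a hypothetical 0.5 ms delay between them.
a_ij, b_ij, d_ij = hyperbola_parameters((1.47, 0.0, 0.0), (0.0, 1.47, 0.0), 0.0005)
```

With these inputs d_ij is about 2.08 m, which matches the stated spacing between adjacent peripheral hydrophones in PRE, and the largest admissible delay is about 1.4 ms under the assumed sound speed.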

Once all the hyperboloids were established, contours of the hyperboloids (hyperbolae) at varied assumed animal depths, ranging from the water surface to the bottom in 0.5 m increments, were displayed in the graphical interface of the TOADY software for visual inspection of the hyperbolic fixing (Fig. 5C). Four hydrophones resulted in six hyperbolae and yielded four points of intersection (for each independent combination of a hydrophone triad, only two of the three time differences were linearly independent, and all three hyperbolae intersected at a single point) (Laurinolli et al., 2003). The localization accuracy was increased by inclusion of the depth function (Quick, Rendell & Janik, 2008), and animal depth was estimated as the depth at which the surface area of the polygon formed by the hyperbola intersections was at its minimum (Quick, Rendell & Janik, 2008). The average of the hyperboloid intersections was taken as the best estimate of the sound source’s location (Clark & Ellison, 2000; Laurinolli et al., 2003; Schulz, Whitehead & Rendell, 2006; Schulz et al., 2008).

Ideally, all four intersections occurred at one point (Fig. 5C). The location error was assessed by a linear error propagation model (Taylor, 1997), and the root-mean-square (rms) location error was estimated using the equation:

(9) $ε_{rms} = \sqrt{ε_{x}^{2} + ε_{y}^{2} + ε_{z}^{2}}$

where ε_x, ε_y, and ε_z are the standard deviations (SD) of the hyperbolae intersections in the zonal, meridional, and vertical directions, respectively (Laurinolli et al., 2003; Schulz et al., 2008; Wahlberg, Møhl & Madsen, 2001).
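A minimal Python sketch of the location estimate and Eq. (9) follows. The function name and the example intersection coordinates are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was applied.

```python
import math
from statistics import mean, stdev

def location_and_rms_error(intersections):
    """Mean position of the hyperbola intersections and the rms error of Eq. (9).

    intersections: iterable of (x, y, z) tuples in metres.
    The sample standard deviation is assumed here.
    """
    xs, ys, zs = zip(*intersections)
    centre = (mean(xs), mean(ys), mean(zs))
    eps_x, eps_y, eps_z = stdev(xs), stdev(ys), stdev(zs)
    eps_rms = math.sqrt(eps_x ** 2 + eps_y ** 2 + eps_z ** 2)
    return centre, eps_rms

# Four hypothetical intersection points at an assumed animal depth of 2 m.
points = [(10.0, 20.0, 2.0), (12.0, 20.0, 2.0), (10.0, 22.0, 2.0), (12.0, 22.0, 2.0)]
centre, eps_rms = location_and_rms_error(points)
```

When all four intersections coincide, as in the ideal case described above, each standard deviation is zero and so is the rms error.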

Signal extraction

Whistles with successful source location estimates were extracted for sound parameter analysis using the center hydrophone channel. Each extracted whistle was assigned to one of the following six tonal types based on its fundamental time-frequency contour: flat, down, rise, U-shape, concave and sine. All tonal types were mutually exclusive (Fig. 6; for detailed definitions, see Wang et al., 2013). A three-step procedure was applied to extract the candidate whistles (Fig. 7). A 2-s signal was extracted for each candidate whistle (the whole signal in Figs. 7A and 7B). The actual whistle was subsequently measured from the start and end points of the fundamental contour (Fig. 7C) and further extracted as the portion containing 98% of the total cumulative energy, which started at the time when 1% of the cumulative signal energy was reached (t_1%ce, in Fig. 7E) and ended when 99% of the cumulative signal energy was reached (t_99%ce, in Fig. 7E). Whistle duration was derived from the time difference between the 1st and 99th cumulative energy percentiles (in Fig. 7E). A 500 ms ambient noise selection was extracted either before or after (in Fig. 7B) each whistle from the 2-s signal, with a gap of over 0.2 ms from either side of the whistle fundamental contour (in Fig. 7B). All spectrograms were computed with 25 ms Hann windows (5,000 and 12,820 samples, zero-padded to 8,192 and 16,384 samples for sampling frequencies of 200 and 512 kHz, respectively) for FFT computation with 80% overlap, giving a temporal resolution of 5 ms and an interpolated spectral frequency resolution of 24.4 and 31.3 Hz, respectively.
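The 98% cumulative-energy window described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' extraction code: the function name is hypothetical, energy is taken as the running sum of squared samples, and the window edges are the first samples at which the cumulative fraction reaches 1% and 99%.

```python
def energy_window(samples, fs_hz, lower=0.01, upper=0.99):
    """Indices and duration (s) of the window between the 1% and 99% cumulative-energy points."""
    total = sum(s * s for s in samples)
    if total == 0.0:
        raise ValueError("signal has no energy")
    cumulative = 0.0
    start = end = None
    for i, s in enumerate(samples):
        cumulative += s * s
        fraction = cumulative / total
        if start is None and fraction >= lower:
            start = i                      # sample where 1% of the energy is reached
        if fraction >= upper:
            end = i                        # sample where 99% of the energy is reached
            break
    return start, end, (end - start) / fs_hz

# Synthetic check (hypothetical data): a 1 s constant-amplitude tone padded with silence.
test_signal = [0.0] * 100 + [1.0] * 1000 + [0.0] * 100
start_idx, end_idx, duration_s = energy_window(test_signal, fs_hz=1000.0)
```

For this padded signal the window trims the silent padding and roughly 1% of the energy at each end, so the measured duration is close to 0.98 s.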

Figure 6: Spectrogram of the six whistle tonal types.
Spectrogram configuration (window type: Hanning; temporal grid resolution 5 ms; overlap samples per frame 80%; frequency grid spacing 24.4 Hz; window size 5,000; FFT size 8,192; Nyquist frequency 100 kHz). Note that spectrogram maximum frequency was scaled to 25 kHz for a detailed view of the whistle fundamental frequency.

DOI: 10.7717/peerj.1695/fig-6

Figure 7: Three-step whistle extraction.
(A) waveform and (B) spectrogram of the 2 s signal extracted for each whistle. Candidate whistle was extracted from the starting and ending point of the trace of the whistle fundamental frequency contour (in C) and further extracted as the portion containing 98% of the total cumulative energy (between ce_1% and ce_99% in E), whistle duration was defined as the time between the 1st and 99th cumulative energy percentiles (between t_99%ce and t_1%ce in E). A 500 ms ambient noise selection was extracted ahead of or following (in A and B) each whistle as the matched noise. Spectrogram configuration (window type: Hanning; temporal grid resolution 5 ms; overlap samples per frame 80%; frequency grid spacing 31.3 Hz; window size 12,821; FFT size 16,384; Nyquist frequency 256.414 kHz). Note that spectrogram maximum frequency was scaled to 20 kHz for a detailed view of the fundamental frequency.

DOI: 10.7717/peerj.1695/fig-7

Apparent source levels and source energy flux density

For each whistle, the root-mean-square sound pressure level (SPL_rms, dB re 1 µPa) and energy flux density (EFD) were calculated using the following equations (Au & Hastings, 2008): (10) $SPL_{rms} = 10 \times \log_{10} \left\{ \frac{1}{T} \int_{0}^{T} \frac{p^{2}(t)}{p_{ref2}^{2}} \, dt \right\}$ where p(t) was the sound pressure in µPa, and p_ref2 was 1 µPa. SPL_rms depends critically on the signal window size (T) in Eq. (10) (Madsen, 2005). Bottlenose dolphins integrate pure-tone acoustic energy in the same way as humans (Johnson, 1968b), with an integration time constant of approximately 200 ms for pure tones from 1 kHz to 8 kHz (Johnson, 1968b; Plomp & Bouman, 1959). The fundamental frequency of Sousa whistles averages 6.4 kHz, with the minimum and maximum fundamental frequencies averaging 5.1 kHz and 7.7 kHz, respectively (Wang et al., 2013). Here, we assumed that the integration time constant from bottlenose dolphins also applied to Sousa. Both whistles and matched noise samples were cut into consecutive 200 ms slices, with adjacent slices overlapping by 95%. A measure termed SPL_rms200 was taken as the maximum SPL_rms value from the 200 ms slices of each whistle, and SPL_noi was derived from the average SPL_rms value of the 200 ms slices of each matched noise sample. Absolute pressure levels were derived by incorporating the sensitivity of the hydrophone and the amplifier gain (Au & Hastings, 2008). Apparent source levels (ASLs) and source energy flux density (SEFD) were estimated from the received apparent sound pressure levels and energy flux density by compensating for the transmission loss using the site-specific transmission loss model.
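The sliding-window computation of SPL_rms200 and SPL_noi can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the input is assumed to be calibrated pressure in µPa, a signal shorter than one window is treated as a single slice, and SPL_noi is taken as the arithmetic mean of the slice levels in dB (the text does not state whether averaging was done in dB or in linear power).

```python
import math


def spl_rms(pressure_upa, p_ref=1.0):
    """RMS sound pressure level (dB re p_ref) of one slice, following Eq. (10)."""
    mean_square = sum(p * p for p in pressure_upa) / len(pressure_upa)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (p_ref * p_ref))


def sliding_spl(pressure_upa, fs, win_s=0.2, overlap=0.95):
    """SPL_rms of consecutive slices of `win_s` seconds with fractional `overlap`."""
    n_win = int(round(win_s * fs))
    hop = max(1, int(round(n_win * (1.0 - overlap))))
    if len(pressure_upa) < n_win:
        # Assumption: short signals are treated as one slice.
        return [spl_rms(pressure_upa)]
    last_start = len(pressure_upa) - n_win
    return [spl_rms(pressure_upa[i:i + n_win]) for i in range(0, last_start + 1, hop)]


def spl_rms200(whistle_upa, fs):
    """Maximum slice level of a whistle."""
    return max(sliding_spl(whistle_upa, fs))


def spl_noi(noise_upa, fs):
    """Mean slice level of a matched noise sample (assumption: mean of dB values)."""
    levels = sliding_spl(noise_upa, fs)
    return sum(levels) / len(levels)


# Example: a 50 Hz tone with 1000 µPa peak amplitude sampled at 1 kHz for 0.5 s.
fs = 1000
tone = [1000.0 * math.sin(2 * math.pi * 50 * n / fs) for n in range(500)]
print(round(spl_rms200(tone, fs), 2))
```

A steady tone gives the same level in every slice, so the maximum and the mean coincide. For a real whistle with amplitude modulation, the maximum slice level will exceed the mean.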

Power spectral density and one-third octave band levels

Power spectral density (dB re 1 µPa²Hz⁻¹), the averaged sound power in each 1 Hz band (Sims et al., 2012b), was calculated using the Welch approach for each whistle over its 98% energy window and for the corresponding noise sample to assess the detailed acoustic energy distribution. One-third octave band levels (dB re 1 µPa²) were further calculated to assess how cetacean auditory systems perceive sound and are impacted by ambient noise (Madsen et al., 2006). All power spectral densities and one-third octave band levels were computed with a 0.2 s slice window and 95% overlap between adjacent slices for FFT computation, resulting in an interpolated spectral frequency resolution of 3.05 and 3.91 Hz for sampling frequencies of 200 and 512 kHz, respectively.
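Two pieces of arithmetic in this paragraph can be checked with a short sketch: the interpolated frequency resolution that results from zero-padding a 0.2 s window to the next power of two, and the conversion of a power spectral density into a one-third octave band level. The function names are hypothetical, the band edges use the base-2 definition (center frequency times 2^(±1/6)), which is an assumption because the text does not state which definition was used, and the summation over a uniform frequency grid is a simplification rather than the authors' procedure.

```python
import math


def interpolated_resolution(fs, win_s):
    """Frequency grid spacing (Hz) after zero-padding to the next power of two."""
    n = int(round(fs * win_s))
    nfft = 1 << (n - 1).bit_length()
    return fs / nfft


def third_octave_level(freqs_hz, psd_db, fc_hz):
    """Band level (dB re 1 µPa²) from a PSD (dB re 1 µPa²/Hz) on a uniform grid."""
    f_lo = fc_hz * 2.0 ** (-1.0 / 6.0)
    f_hi = fc_hz * 2.0 ** (1.0 / 6.0)
    df = freqs_hz[1] - freqs_hz[0]
    # Sum linear power over all frequency bins that fall inside the band.
    power = sum(
        10.0 ** (level / 10.0) * df
        for f, level in zip(freqs_hz, psd_db)
        if f_lo <= f < f_hi
    )
    if power <= 0:
        raise ValueError("no frequency bins fall inside the band")
    return 10.0 * math.log10(power)


print(round(interpolated_resolution(200000, 0.2), 2))
print(round(interpolated_resolution(512821, 0.2), 2))

# Flat PSD of 60 dB re 1 µPa²/Hz on a 1 Hz grid, evaluated in the 1 kHz band.
freqs = [float(f) for f in range(0, 20001)]
psd = [60.0] * len(freqs)
print(round(third_octave_level(freqs, psd, 1000.0), 1))
```

For a flat spectrum, the band level exceeds the spectral density by ten times the base-10 logarithm of the bandwidth, so band levels rise with center frequency even when the spectral density is constant.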

Active communication space

A tonal signal presented against a continuous broad-band noise background is effectively masked only by a relatively narrow band of frequencies centered on the tone, namely the critical bandwidth (Fletcher, 1940). The critical ratio is another measure of auditory filter width and an indirect method for estimating critical bandwidth (Au & Moore, 1990). At the detection threshold, the signal power equals the noise power within the auditory filter, so that the auditory filter width is the ratio of the threshold intensity of a tone to the ambient noise power spectral density at the frequency in question (Fletcher, 1940).

The active communication space is a combined function of the signal source level, the dolphin auditory threshold, the habitat-specific transmission loss, and the masking level at the one-third octave band center frequency in question (Janik, 2000a; Jensen et al., 2012; Quintana-Rizzo, Mann & Wells, 2006). The masking level is determined by the noise one-third octave band level or the masked tone threshold, whichever is higher. The masked tone threshold is the sum of the noise power spectral density and the critical ratio at the frequency in question (Janik, 2000a; Jensen et al., 2012; Quintana-Rizzo, Mann & Wells, 2006). The active communication space of each whistle is estimated as the maximum range at which the signal can still be detected in at least one of the one-third octave bands analyzed after accounting for the transmission loss (Miller, 2006). For whistle signals, the one-third octave band that determines the maximum range is always at the peak frequency of the signal one-third octave band levels (Fig. 8).

Figure 8: Schematic for active communication space calculation.
The mean(Sig_TOBL) and mean(Noi_TOBL), each surrounded by gray shading showing the 95% CI, were calculated from a running average of the one-third octave band levels for each whistle and the matched noise, respectively, with a window size of 200 ms and 95% overlap between steps. fp was the peak frequency determined by the mean(Sig_TOBL). The max(Sig_TOBL) and mean(Noi_PSD) were calculated from a running maximum of the one-third octave band levels of the whistle and a running average of the power spectral density of the matched noise, respectively, both with a window size of 200 ms and 95% overlap between steps. The *Sousa* audiogram with a frequency span of 500 Hz–38 kHz was obtained by fitting a third-order polynomial curve to the *Sousa* auditory thresholds between 5 kHz and 38 kHz. The dolphin critical ratio was adopted from Johnson, McManus & Skaar (1989). The inset shows a detailed portion of the max(Sig_TOBL) and mean(Sig_TOBL) at the peak frequency determined by the averaged one-third octave band levels for all the 200 ms slices for each whistle.

DOI: 10.7717/peerj.1695/fig-8

The active communication space for each whistle can be modeled by the equations: (11) $k \times \log_{10} [\mathrm{mean}(ACS)] = TL = [\mathrm{mean}(Sig_{TOBL})](f_p) - \max \{ML(f_p), AT(f_p)\}$ (12) $k \times \log_{10} [\mathrm{max}(ACS)] = TL = [\mathrm{max}(Sig_{TOBL})](f_p) - \max \{ML(f_p), AT(f_p)\}$ (13) $ML(f_p) = \max \{[\mathrm{mean}(Noi_{TOBL})](f_p), MTT(f_p)\}$ (14) $MTT(f_p) = [\mathrm{mean}(Noi_{PSD})](f_p) + CR(f_p)$ where ACS was the active communication space of the whistle in the near-simultaneous ambient noise conditions obtained from the matched noise sample, TL was the transmission loss, k was the transmission loss coefficient, mean(Sig_TOBL) and max(Sig_TOBL) were the averaged and maximum one-third octave band levels for all the 200 ms slices for each whistle, f_p was the peak frequency of the averaged one-third octave band levels for all the 200 ms slices for each whistle, mean(Noi_PSD) and mean(Noi_TOBL) were the averaged power spectral density and one-third octave band levels of all the 200 ms slices from the matched noise sample for each whistle, ML was the masking level, MTT was the masked tone threshold, and AT was the Sousa auditory threshold. The Sousa audiogram with a frequency span of 500 Hz–38 kHz (which covers the fundamental contour range of Sousa whistles of 520 Hz–33 kHz (Wang et al., 2013)) was estimated by fitting a third-order polynomial curve to the auditory thresholds between 5 kHz and 38 kHz (Li et al., 2012). CR was the dolphin critical ratio (Johnson, McManus & Skaar, 1989), and was obtained from the equation: (15) $CR = 19.8 + 0.075 \times f^{1/2}$ where CR was in dB and f was the frequency in Hz. The equation was obtained by applying a least-squares fit to the bottlenose dolphin critical ratio data (Johnson, 1968a; Moore & Au, 1982).
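Equations (11) to (15) can be chained into a small calculation, sketched below. This is a minimal sketch under stated assumptions rather than the authors' code: the function and argument names are hypothetical, the signal band level is assumed to be referenced to the source, the range is assumed to be in meters, and the numerical inputs in the example (including k = 20) are invented purely to make the arithmetic easy to follow.

```python
import math


def critical_ratio_db(f_hz):
    """Dolphin critical ratio in dB at frequency f_hz, following Eq. (15)."""
    return 19.8 + 0.075 * math.sqrt(f_hz)


def masking_level_db(noise_tobl_db, noise_psd_db, f_hz):
    """Masking level at f_hz, following Eqs. (13) and (14)."""
    masked_tone_threshold = noise_psd_db + critical_ratio_db(f_hz)  # Eq. (14)
    return max(noise_tobl_db, masked_tone_threshold)                # Eq. (13)


def active_communication_space(sig_tobl_db, noise_tobl_db, noise_psd_db,
                               auditory_threshold_db, f_hz, k):
    """Range at which the transmission loss k*log10(range) uses up the excess
    of the signal band level over the detection limit, following Eq. (11)."""
    ml = masking_level_db(noise_tobl_db, noise_psd_db, f_hz)
    allowed_loss = sig_tobl_db - max(ml, auditory_threshold_db)
    return 10.0 ** (allowed_loss / k)


# Hypothetical inputs at a peak frequency of 10 kHz.
acs = active_communication_space(
    sig_tobl_db=137.3,           # signal one-third octave band level
    noise_tobl_db=75.0,          # noise one-third octave band level
    noise_psd_db=50.0,           # noise power spectral density
    auditory_threshold_db=60.0,  # auditory threshold at f_p
    f_hz=10000.0,
    k=20.0,                      # invented transmission loss coefficient
)
print(round(acs))
```

In this example the masked tone threshold (50 + 27.3 = 77.3 dB) exceeds both the noise band level and the auditory threshold, so the range is noise-limited. Passing the maximum rather than the mean signal band level gives the corresponding value for Eq. (12).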

In cases where the masking level was always higher than the relevant Sousa auditory threshold, i.e., the active communication space was noise-limited, the theoretical active communication space determined by the Sousa auditory threshold alone was also calculated. The active communication space determined by auditory threshold alone (ACS_at) can be modeled as: (16) $k \times \log_{10} [\mathrm{mean}(ACS_{at})] = TL = [\mathrm{mean}(Sig_{TOBL})](f_p) - AT(f_p)$ (17) $k \times \log_{10} [\mathrm{max}(ACS_{at})] = TL = [\mathrm{max}(Sig_{TOBL})](f_p) - AT(f_p)$.

Statistical analysis

Descriptive statistics of all measured acoustic parameters were presented as mean, SD, and range if they were normally distributed. For parameters with a grossly skewed distribution, the median, quartile deviation (QD), 5th percentile (P5), and 95th percentile (P95) were used instead. Levene’s test for equality of error variances and the Kolmogorov–Smirnov goodness-of-fit test were used to analyze homogeneity of variance and the distributions of the data, respectively. Nonparametric statistical analyses (Zar, 1999) were adopted if data were not normally distributed. The Kruskal–Wallis test (Zar, 1999) was used to examine differences in the transmission loss coefficient among the different test signals run in the sound propagation model. The Mann–Whitney U-test (Zar, 1999) was used to analyze differences in transmission loss coefficients, as well as in acoustic parameters, between sites. Differences in apparent source levels and energy flux density across whistle tonal types were analyzed by the Kruskal–Wallis test (Zar, 1999), and Duncan’s multiple comparison test (Zar, 1999) was used for post hoc comparisons of acoustic parameters among tonal types. Statistical analyses were performed using SPSS 16.0 for Windows (SPSS Inc., Chicago, USA). Differences were considered significant at p < 0.05.
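The descriptive statistics used for skewed parameters can be sketched in a few lines. The analyses in this study were run in SPSS, so the following is only an illustration: the function name is hypothetical, the quartile deviation is taken as half the interquartile range, and the percentiles use linear interpolation between data points, which may differ slightly from the SPSS default.

```python
import statistics


def skewed_summary(values):
    """Median, quartile deviation, 5th and 95th percentiles of a sample."""
    quartiles = statistics.quantiles(values, n=4, method="inclusive")
    percentiles = statistics.quantiles(values, n=100, method="inclusive")
    return {
        "median": statistics.median(values),
        # Quartile deviation: half the distance between Q1 and Q3.
        "QD": (quartiles[2] - quartiles[0]) / 2.0,
        "P5": percentiles[4],    # 5th of the 99 cut points
        "P95": percentiles[94],  # 95th of the 99 cut points
    }


# Example with the integers 1 to 100 standing in for measured levels.
summary = skewed_summary(list(range(1, 101)))
print(summary)
```

Reporting P5 and P95 rather than the minimum and maximum makes the summary less sensitive to a few extreme whistles.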

Ethical statement

Permission to conduct the study was granted by the Ministry of Science and Technology of the People’s Republic of China. The research permit was issued to the Institute of Hydrobiology of the Chinese Academy of Sciences (Permit number: 2011BAG07B05).

Results

Six hundred and thirty-four whistles were recorded during 14 observation days, of which 33 whistles from two days in the Pearl River Estuary and 209 whistles from eight days in the Beibu Gulf were successfully localized and selected for further analysis (Table 1).

Table 1: Summary of 14 survey days in Pearl River Estuary and Beibu Gulf.

Each successfully localized whistle was grouped according to tonal types.

Site	Date	Sample rate (Hz)	Recorded whistles	Localized whistles
				Flat	Down	Rise	U-shape	Convex	Sine	Sum
PRE	20140605	200,000	78	21	0	2	0	1	5	29
	20140708	512,821	5	0	0	0	0	0	0	0
	20140710	512,821	19	1	0	0	0	0	3	4
	20140711	512,821	6	0	0	0	0	0	0	0
BG	20140804	512,821	35	4	3	2	3	3	4	19
	20140805	512,821	49	2	2	1	15	1	2	23
	20140806	512,821	28	5	5	1	1	0	0	12
	20140813	200,000	107	6	2	2	1	3	1	15
	20140814	200,000	55	8	1	4	9	1	0	23
	20140815	200,000	8	1	0	0	0	0	1	2
	20140816	200,000	4	0	0	0	0	0	0	0
	20140820	200,000	66	5	3	2	2	2	9	23
	20140821	200,000	18	0	0	0	0	0	0	0
	20140822	200,000	156	13	12	14	33	4	16	92
	Sum		634	66	28	28	64	15	41	242

DOI: 10.7717/peerj.1695/table-1