PeerJ Preprints: Computational Science
https://peerj.com/preprints/index.atom?journal=peerj&subject=3670
Computational Science articles published in PeerJ Preprints

Extreme degeneracy of inputs in firing a neuron leads to loss of information when neuronal firing is examined
https://peerj.com/preprints/27228
2018-09-20 / 2018-09-20
Kunjumon I Vadakkan
Possible combinations of inputs of the order of 10^100 can fire a neuron (generate an axonal spike, or action potential) that has nearly 10^4 inputs (dendritic spines). This extreme degeneracy of inputs that can fire a neuron entails a significant loss of information when examination is limited to neuronal firing. Excitatory postsynaptic potentials (EPSPs) propagating from remote locations on the dendritic tree attenuate, in proportion to the distance they travel, by the time they arrive at the axon hillock. Moreover, some EPSPs from remote locations will not reach the axon hillock at all. In this context, an operational mechanism at the location of origin of these EPSPs is necessary to preserve information for efficient storage. A similar mechanism is also expected at the location of origin of EPSPs that generate dendritic spikes.
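The order of magnitude of the degeneracy is easy to check: with roughly 10^4 spines, even a modest number of coincidently active inputs yields an astronomical count of distinct subsets. A minimal sketch (the choice of 35 active inputs is purely illustrative, not from the abstract):

```python
from math import comb, log10

N_SPINES = 10_000   # ~10^4 dendritic spines (inputs) on the neuron
K_ACTIVE = 35       # illustrative count of coincidently active inputs

# Number of distinct input subsets of this size that could, in principle,
# drive the same indistinguishable output event (an axonal spike).
n_subsets = comb(N_SPINES, K_ACTIVE)
print(f"subsets of size {K_ACTIVE}: about 10^{len(str(n_subsets)) - 1}")
```

Around 35 coincident inputs already gives on the order of 10^100 combinations, which is the sense in which observing only the firing event discards nearly all information about which inputs caused it.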
Latent factors and dynamics in motor cortex and their application to brain-machine interfaces
https://peerj.com/preprints/27217
2018-09-16 / 2018-09-16
Chethan Pandarinath, K. Cora Ames, Abigail A Russo, Ali Farshchian, Lee E Miller, Eva L Dyer, Jonathan C Kao
In the fifty years since Evarts first recorded single neurons in motor cortex of behaving monkeys, great effort has been devoted to understanding their relation to movement. Yet these single neurons exist within a vast network, the nature of which has been largely inaccessible. With advances in recording technologies, algorithms, and computational power, the ability to study network-level phenomena is increasing exponentially. Recent experimental results suggest that the dynamical properties of these networks are critical to movement planning and execution. Here we discuss this dynamical systems perspective, and how it is reshaping our understanding of the motor cortices. Following an overview of key studies in motor cortex, we discuss techniques to uncover the “latent factors” underlying observed neural population activity. Finally, we discuss efforts to leverage these factors to improve the performance of brain-machine interfaces, promising to make these findings broadly relevant to neuroengineering as well as systems neuroscience.
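The core idea behind uncovering "latent factors" is that high-dimensional population activity is often generated by a much lower-dimensional process. A toy sketch (not any of the specific methods the review discusses, such as LFADS): simulate a 2-D latent oscillation observed through many neurons, then recover the low-dimensional subspace with PCA.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 2-D latent oscillation (a toy stand-in for the rotational dynamics
# reported in motor cortex), observed through 50 noisy "neurons".
t = np.linspace(0, 2 * np.pi, 200)
latents = np.stack([np.sin(t), np.cos(t)], axis=1)        # (200, 2)
readout = rng.normal(size=(2, 50))                        # latent -> neurons
rates = latents @ readout + 0.1 * rng.normal(size=(200, 50))

# PCA via SVD: the leading components recover the latent subspace.
centered = rates - rates.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
var_explained = (s ** 2) / (s ** 2).sum()
print(var_explained[:3])
```

Despite 50 observed channels, nearly all variance concentrates in the first two components — the population activity is effectively two-dimensional, which is the phenomenon latent-factor methods exploit.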
Serverless OpenHealth at data commons scale - traversing the 20 million patient records of New York's SPARCS dataset in real-time
https://peerj.com/preprints/27209
2018-09-15 / 2018-09-15
Jonas S Almeida, Janos Hajagos, Joel Saltz, Mary Saltz
In a previous report, we explored the serverless OpenHealth approach to the Web as a Global Compute space. That approach relies on the modern browser full stack, and, in particular, its configuration for application assembly by code injection. The opportunity, and need, to expand this approach has since increased markedly, reflecting a wider adoption of Open Data policies by Public Health Agencies. Here, we describe how the serverless scaling challenge can be achieved by the isomorphic mapping between the remote data layer API and a local (client-side, in-browser) operator. This solution is validated with an accompanying interactive web application (bit.ly/loadsparcs) capable of real-time traversal of New York’s 20 million patient records of the Statewide Planning and Research Cooperative System (SPARCS), and is compared with alternative approaches. The results obtained strengthen the argument that the FAIR reproducibility needed for Population Science applications in the age of P4 Medicine is particularly well served by the Web platform.
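The "isomorphic mapping between the remote data layer API and a local operator" can be pictured as streaming paged results from the remote API and applying a client-side operator to each record as it arrives, so no server-side aggregation is needed. The actual application runs in the browser; the sketch below is a language-neutral illustration with an entirely hypothetical paged data source and field names:

```python
from typing import Callable, Iterator

def traverse(fetch_page: Callable[[int], list], operator: Callable) -> Iterator:
    """Stream records page by page, applying a local operator to each -
    the client-side counterpart of the remote data-layer paging API."""
    page = 0
    while True:
        records = fetch_page(page)
        if not records:          # empty page signals the end of the dataset
            return
        yield from map(operator, records)
        page += 1

# Hypothetical stand-in for a paged open-data endpoint (not SPARCS fields).
DATA = [{"county": "Kings", "charges": 1200.0},
        {"county": "Queens", "charges": 800.0},
        {"county": "Kings", "charges": 400.0}]

def fake_fetch(page: int, size: int = 2) -> list:
    return DATA[page * size:(page + 1) * size]

total = sum(r["charges"] for r in traverse(fake_fetch, lambda r: r))
print(total)  # 2400.0
```

Because the operator is applied per record as pages stream in, memory stays bounded regardless of dataset size — the property that makes real-time traversal of 20 million records feasible in a browser tab.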
Data organization in spreadsheets
https://peerj.com/preprints/3183
2018-09-11 / 2018-09-11
Karl W Broman, Kara H. Woo
Spreadsheets are widely used software tools for data entry, storage, analysis, and visualization. Focusing on the data entry and storage aspects, this paper offers practical recommendations for organizing spreadsheet data to reduce errors and ease later analyses. The basic principles are: be consistent, write dates like YYYY-MM-DD, don't leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, don't include calculations in the raw data files, don't use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.
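Several of these principles — a single rectangle with one header row, ISO dates as text, no empty cells, no embedded calculations, plain-text output — can be shown in a few lines. The column names below are hypothetical, chosen only to illustrate the layout:

```python
import csv
import io

# A single rectangle: one header row, subjects as rows, variables as
# columns, dates written YYYY-MM-DD, no empty cells, no calculations.
rows = [
    {"subject_id": "S001", "visit_date": "2018-09-11", "glucose_mg_dl": 95},
    {"subject_id": "S002", "visit_date": "2018-09-12", "glucose_mg_dl": 103},
]

# Saving as plain-text CSV keeps the data readable by any tool.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["subject_id", "visit_date", "glucose_mg_dl"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Writing to `io.StringIO` here simply stands in for a file; in practice one would pass an opened text file to `csv.DictWriter`.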
Geomorphometry - 10 years after the book - challenges ahead?
https://peerj.com/preprints/27157
2018-08-29 / 2018-08-29
Hannes Isaak Reuter
In 2008 the Geomorphometry book was published after several years of work. Ten years have passed since its publication, and many more since the early work of the grandfathers of this domain. One of the key definitions in the book was the following: geomorphometry is the science of digital terrain modelling, analysis and quantitative land surface analysis. The author argues that this definition still holds true. The paper discusses past developments and future questions, and argues that we need to move to an approach based on predicted space-time geomorphometry parameters.
A data-driven method for the determination of water-flow velocity in watershed modelling
https://peerj.com/preprints/27155
2018-08-29 / 2018-08-29
Qiming Zhou, Fangli Zhang, Liang Cheng
Physically-based distributed hydrological models have long played an important role in watershed hydrology. Existing hydrological modeling applications have focused more on the estimation of water balance and less on the simulation of water transport within a catchment. Unlike the prediction of flow production, the dynamic simulation of flow concentration depends largely on the field distribution of water-flow velocity. However, it remains difficult to determine water-flow velocity with terrain analysis techniques, which has long hampered the application of hydrological models to surface water transport simulation. This study therefore proposes a data-driven method for creating a field map of overland flow velocity based on Manning's equation. A case study on a gauged watershed is undertaken to validate the spatial distribution of flow velocity. The preliminary results indicate that the proposed empirical method can reasonably determine the spatial distribution of water-flow velocity. Further efforts are still required to capture the space-time variation of flow velocity under the control of microtopography and instantaneous water depth.
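The method rests on Manning's equation, which in SI units gives mean velocity as v = (1/n) · R^(2/3) · S^(1/2). A minimal sketch, with parameter values that are purely illustrative (not taken from the study):

```python
def manning_velocity(n: float, radius_m: float, slope: float) -> float:
    """Mean flow velocity (m/s) from Manning's equation in SI units:
    v = (1/n) * R^(2/3) * S^(1/2), where n is the Manning roughness
    coefficient, R the hydraulic radius (m), and S the energy slope."""
    return (1.0 / n) * radius_m ** (2.0 / 3.0) * slope ** 0.5

# Illustrative values: shallow overland flow over a moderately rough surface.
v = manning_velocity(n=0.05, radius_m=0.02, slope=0.01)
print(f"{v:.3f} m/s")
```

Applied per grid cell, with n, R and S estimated from land cover and terrain, the same relation yields a field map of velocity rather than a single channel value.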
Optimal exponent-pairs for the Bertalanffy-Pütter growth model
https://peerj.com/preprints/27152
2018-08-28 / 2018-08-28
Katharina Renner-Martin, Norbert Brunner, Manfred Kühleitner, Werner-Georg Nowak, Klaus Scheicher
The Bertalanffy-Pütter growth model describes mass m at age t by means of the differential equation dm/dt = p·m^a − q·m^b. The special case using the Bertalanffy exponent-pair a = 2/3 and b = 1 is most common (it corresponds to the von Bertalanffy growth function VBGF for length in the fishery literature). For data fitting with general exponents, five model parameters need to be optimized: the pair a < b of non-negative exponents, the non-negative constants p and q, and a positive initial value m_0 for the differential equation. For the case b = 1 it is known that for most fish data any exponent a < 1 could be used to model growth without significantly affecting the fit to the data (when the other parameters p, q, m_0 were optimized). Data fitting used the method of least squares, minimizing the sum of squared errors (SSE). It was conjectured that optimizing both exponents would result in a significantly better fit of the optimal growth function to the data, and thereby reduce SSE. This conjecture was tested on a data set for the mass-growth of walleye (Sander vitreus), a fish from Lake Erie, USA. Compared to the Bertalanffy exponent-pair, the optimal exponent-pair achieved a 10% reduction in SSE. However, when the optimization of additional parameters was penalized using the Akaike information criterion (AIC), the optimal exponent-pair model had a higher (worse) AIC than the Bertalanffy exponent-pair. SSE and AIC are thus different ways to compare models: SSE is used when predictive power alone is needed, and AIC when simplicity and explanatory power of the model are needed as well.
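The SSE-versus-AIC trade-off is easy to make concrete. For a least-squares fit, AIC can be written n·ln(SSE/n) + 2k, where k is the number of optimized parameters (some conventions add a constant for the error variance, which cancels when comparing models on the same data). The numbers below are illustrative, not the paper's: a 10% SSE reduction bought with two extra free parameters (the optimized exponent pair) can still lose on AIC.

```python
import math

def aic_least_squares(sse: float, n: int, k: int) -> float:
    """AIC for a least-squares fit: n*ln(SSE/n) + 2k, with k the number
    of optimized parameters (additive constants cancel in comparisons)."""
    return n * math.log(sse / n) + 2 * k

n = 30                        # illustrative number of observations
sse_bert, k_bert = 100.0, 3   # Bertalanffy pair fixed: p, q, m_0 optimized
sse_opt,  k_opt  = 90.0, 5    # exponents a, b also optimized: 10% lower SSE

print(aic_least_squares(sse_bert, n, k_bert))
print(aic_least_squares(sse_opt, n, k_opt))
```

Here the richer model's AIC comes out higher (worse) despite its lower SSE, mirroring the paper's finding that the 2-parameter penalty can outweigh a modest gain in fit.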
Machine learning for cross-scale geomorphometric classification of landforms: a day at the beach
https://peerj.com/preprints/27112
2018-08-13 / 2018-08-13
Ashton Shortridge, Clayton Queen, Alan Arbogast
This paper investigates the use of random forests and spatial random forests (RFsp) for the classification of coastal dune areas along 41 km of Lake Michigan's shoreline using a lidar-derived DEM. Terrain variables across a range of spatial neighborhood scales are utilized, at two different cell resolutions. Distance is explicitly incorporated into the RFsp models through the calculation of buffer distances around small numbers (6-13) of gridded points in the study area. While classification accuracy is generally high, RFsp produced much more accurate results. At the fine scale, topographic variables and their neighborhood ranges were not predictive of dune areas, perhaps because large (> 0.1 hectare) neighborhoods were not tested at that scale. At the coarse scale these variables were much more important. The use of small numbers of gridded (non-sample) points to improve spatial prediction warrants further investigation.
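The distinctive step in RFsp is constructing the buffer-distance predictors: each grid cell gets, as extra features, its distance to each of a small set of reference points, and a standard random forest then learns spatially structured patterns from them. A NumPy sketch of just that feature-construction step, with a toy grid and hypothetical dimensions (the paper's actual pipeline and point placement are not shown here):

```python
import numpy as np

rng = np.random.default_rng(1)

ny, nx = 50, 80                               # toy grid of DEM cells
yy, xx = np.mgrid[0:ny, 0:nx]
cells = np.column_stack([yy.ravel(), xx.ravel()]).astype(float)

# A handful of reference points, echoing the paper's 6-13 gridded points.
ref_pts = rng.uniform([0, 0], [ny, nx], size=(9, 2))

# One buffer-distance predictor per reference point: shape (n_cells, 9).
dist_features = np.linalg.norm(cells[:, None, :] - ref_pts[None, :, :], axis=2)
print(dist_features.shape)
```

These distance columns would then be stacked alongside the terrain variables and passed to an ordinary random forest classifier; the forest's splits on distances let it carve out spatially coherent regions.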
Hyper-scale analysis of surface roughness
https://peerj.com/preprints/27110
2018-08-13 / 2018-08-13
John B Lindsay, Daniel R Newman
Surface roughness is frequently measured using DEMs to characterize the ruggedness and topographic complexity of landscapes. Roughness maps have been applied in geological mapping, ecological modeling, and other environmental applications. These maps are typically derived using a roving-window approach, where kernel size dictates the scale at which roughness is assessed. The pattern of roughness is strongly scale dependent, and this roughness-scaling relation can reveal useful information about the geomorphologic character of landscapes. This study applied hyper-scale analysis of a normal-vector-based roughness metric to a LiDAR DEM of Rondeau Bay, Canada. The use of integral images, a data structure for computationally efficient filtering operations, allowed for the fine-scale resolution of the analysis. The unique roughness-scale signature of each grid cell in the DEM was derived for all spatial scales ranging from 3 to 5000 cells (7.5 m to 12,502.5 m). Maps of maximum roughness and the scale of maximum roughness were created for the study site. This cell-specific scaling approach to the characterization of surface roughness contrasts with the use of single, often arbitrarily selected, kernel sizes to map topographic attributes. The additional information provided by the scale map was found to provide valuable ancillary data for landscape interpretation.
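The integral image (summed-area table) is what makes sweeping thousands of window sizes tractable: after one cumulative-sum pass, the sum over any window is four lookups, so per-cell cost is independent of window size. A minimal sketch of windowed statistics built this way (the paper's metric is normal-vector based; the local-variance line below is just a simple stand-in roughness proxy):

```python
import numpy as np

def window_mean(z: np.ndarray, r: int) -> np.ndarray:
    """Mean of z over a (2r+1)x(2r+1) window centred on every cell,
    computed in O(1) per cell from an integral image."""
    w = 2 * r + 1
    zp = np.pad(z, r, mode="edge")                 # replicate edges at borders
    integ = np.zeros((zp.shape[0] + 1, zp.shape[1] + 1))
    integ[1:, 1:] = zp.cumsum(axis=0).cumsum(axis=1)   # summed-area table
    # Window sum from four corner lookups, for all cells at once.
    s = integ[w:, w:] - integ[:-w, w:] - integ[w:, :-w] + integ[:-w, :-w]
    return s / (w * w)

z = np.arange(16, dtype=float).reshape(4, 4)       # toy 4x4 "DEM"
m = window_mean(z, 1)                              # 3x3 mean at every cell
# Local variance (a simple roughness proxy) at any scale r:
# var_r = window_mean(z**2, r) - window_mean(z, r)**2
print(m)
```

Because the table is built once per attribute, evaluating every scale from 3 to 5000 cells only repeats the cheap four-lookup step, which is what permits a full roughness-scale signature per grid cell.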
Hydraulic modeling of megaflooding using terrestrial and Martian DEMs
https://peerj.com/preprints/27107
2018-08-10 / 2018-08-10
Tao Liu, Victor R Baker
Megaflooding generated from Glacial Lake Missoula (GLM) during the late Pleistocene swept across the Columbia Plateau and Columbia Basin regions of the northwestern U.S., producing the Channeled Scabland, an assemblage of landforms comprising a regional anastomosing complex of overfit stream channels scoured into basalt bedrock. This region provides the best-studied example of a landscape created by catastrophic flooding. Using DEM data and the HEC-RAS 2-D hydraulic model, we analyzed the GLM flood propagation from the Clark Fork in northern Idaho to the eastern Pacific Ocean. The GLM flood simulation generally covers the tracts of the Channeled Scabland and captures the paleohydraulic conditions that have been inferred in the field and documented by previous hydraulic studies. A test simulation of the Columbia Gorge suggests that other sources of water besides Lake Missoula may have been involved in producing the megaflooding. Initial hydraulic analyses of the megafloods and their relations to the field evidence provide important insights into cataclysmic flood processes and associated landforms.