Estimating nitrogen and phosphorus concentrations in streams and rivers across the contiguous United States: a machine learning framework
- Subject Areas
- Biochemistry, Freshwater Biology, Aquatic and Marine Chemistry, Environmental Contamination and Remediation, Spatial and Geographic Information Science
- Keywords
- Nitrogen, Phosphorus, freshwater quality, freshwater biochemistry, nutrients, machine learning
- Copyright
- © 2019 Shen et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2019. Estimating nitrogen and phosphorus concentrations in streams and rivers across the contiguous United States: a machine learning framework. PeerJ Preprints 7:e27585v1 https://doi.org/10.7287/peerj.preprints.27585v1
Abstract
Nitrogen (N) and phosphorus (P) are essential nutrients for life processes in water bodies, but in excessive quantities they are a significant source of aquatic pollution. Eutrophication has become widespread as a result of this imbalance and is largely attributed to anthropogenic activity. In view of this, we present a new dataset and statistical method for estimating and mapping elemental and compound concentrations of N and P at a resolution of 30 arc-seconds (∼1 km) for the conterminous US. The model is based on a Random Forest (RF) machine learning algorithm fitted with environmental variables and seasonal N and P concentration observations from 230,000 stations across US stream networks. Accounting for spatial and temporal variability improves accuracy in the analysis of N and P cycles. The algorithm was validated with internal and external validation procedures, and the model explains 70-83% of the variance. The dataset is ready for use as input to a variety of environmental models and analyses, and the methodological framework can be applied to large-scale studies of N and P pollution, including water quality, species distribution and water ecology research worldwide.
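As a rough illustration of what "variance explained" means here, the sketch below computes the coefficient of determination (R²) on a handful of held-out values. It assumes R² is the metric behind the 70-83% figure, which the abstract does not state; the function name `r_squared` and the sample numbers are hypothetical and are not taken from the dataset.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up held-out concentrations and model predictions (illustrative only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
score = r_squared(observed, predicted)  # about 0.8 for these made-up numbers
print(f"R^2 = {score:.2f}")
```

A score of 1.0 would mean the predictions reproduce the held-out observations exactly, while 0.0 would mean they do no better than predicting the mean of the observations.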
Author Comment
Preprint of a manuscript submitted to Nature Scientific Data
Estimation of nitrogen (N) and phosphorus (P) concentrations at a resolution of 30 arc-seconds (∼1 km) for the conterminous US, with an assessment of predictor importance and of the spatial correlation between N and P concentrations and agricultural and urban land cover.
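The "30 arc-seconds (∼1 km)" figure can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km, which is a simplification and not something the preprint specifies; the function name `cell_size_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def cell_size_km(arc_seconds, latitude_deg):
    """Return (north-south, east-west) size in km of a grid cell of the
    given angular resolution, centred at the given latitude."""
    angle_rad = math.radians(arc_seconds / 3600.0)
    north_south = EARTH_RADIUS_KM * angle_rad
    # Meridians converge towards the poles, so east-west size shrinks
    # with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns_equator, ew_equator = cell_size_km(30, 0.0)
ns_40, ew_40 = cell_size_km(30, 40.0)
print(f"Equator: {ns_equator:.2f} km x {ew_equator:.2f} km")
print(f"40 deg N: {ns_40:.2f} km x {ew_40:.2f} km")
```

Under this assumption a 30 arc-second cell is a little under 1 km from north to south, and its east-west width narrows with latitude, so "∼1 km" is a nominal size for the conterminous US.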