Integrating active learning and crowdsourcing into large-scale supervised landcover mapping algorithms
- Subject Areas
- Human-Computer Interaction, Algorithms and Analysis of Algorithms, Computer Vision, Spatial and Geographic Information Systems
- Keywords
- computer vision, machine learning, active learning, crowdsourcing, landcover, agriculture, human-computer interaction, remote sensing
- Copyright
- © 2017 Debats et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.
- Cite this article
- 2017. Integrating active learning and crowdsourcing into large-scale supervised landcover mapping algorithms. PeerJ Preprints 5:e3004v1 https://doi.org/10.7287/peerj.preprints.3004v1
Abstract
Sub-Saharan Africa and other developing regions of the world are dominated by smallholder farms, which are characterized by small, heterogeneous, and often indistinct field patterns. In previous work, we developed an algorithm for mapping both smallholder and commercial agricultural fields that includes efficient extraction of a vast set of simple, highly correlated, and interdependent features, followed by a random forest classifier. In this paper, we demonstrate how active learning can be incorporated into the algorithm to create smaller, more efficient training data sets, which reduces computational requirements, minimizes the need for humans to hand-label data, and boosts performance. We designed a patch-based uncertainty metric to drive the active learning framework, based on the regular grid of a crowdsourcing platform, and demonstrated how subject matter experts can be replaced with fleets of crowdsourcing workers. Our active learning algorithm achieved performance similar to that of an algorithm trained with randomly selected data, but with 62% fewer data samples.
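The general pattern the abstract describes, training a random forest and querying the samples it is least certain about, can be sketched as follows. This is a minimal illustration of least-confidence uncertainty sampling on synthetic data using scikit-learn; the paper's actual patch-based uncertainty metric, crowdsourcing grid, and feature extraction pipeline are not reproduced here, and all sizes and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for extracted per-pixel/per-patch features and labels.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Seed the labeled set with a small random sample; the rest form the
# unlabeled pool from which the active learner queries annotations.
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=50, replace=False))
pool = [i for i in range(len(X)) if i not in set(labeled)]

for _ in range(5):  # active-learning rounds
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X[labeled], y[labeled])

    # Least-confidence uncertainty: 1 minus the top class probability.
    proba = clf.predict_proba(X[pool])
    uncertainty = 1.0 - proba.max(axis=1)

    # Query the 20 most uncertain pool samples (sent to annotators
    # in the real pipeline) and move them into the labeled set.
    query = np.argsort(uncertainty)[-20:]
    for q in sorted(query, reverse=True):
        labeled.append(pool.pop(q))

print(len(labeled))  # 150 labeled samples after 5 rounds of 20
```

In the paper's setting, each "sample" corresponds to a grid patch from the crowdsourcing platform rather than an individual point, and uncertainty is aggregated over the patch before deciding which patches to send to workers.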
Author Comment
This is a preprint submission to PeerJ Preprints.