Article Spotlight: Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects

Apr 9, 2025 | Article Spotlight

Historically, the extensive involvement of citizen scientists in palaeontology and archaeology has resulted in many discoveries and insights. More recently, machine learning has emerged as a broadly applicable tool for analysing large datasets of fossils and artefacts. In the digital age, citizen science (CS) and machine learning (ML) prove to be mutually beneficial, and a combined CS-ML approach is increasingly successful in areas such as biodiversity research.

 

“With smartphones in every pocket and with increasing collection digitisation efforts, more and more natural history data becomes available and the CS-ML approach provides an excellent opportunity to collect, validate and analyse those datasets.”

 

Isaak Eijkelboom

Utrecht University

For All Readers - AI Explainer

What is this research about?
This study explores how machine learning (ML) and citizen science (CS) can work together to improve the discovery and analysis of fossils and artefacts. While citizen scientists have long played a key role in palaeontology and archaeology, new technology—particularly ML-powered tools—offers a way to process large datasets more efficiently. The study outlines a workflow to help researchers design effective CS-ML projects, ensuring both scientific accuracy and public engagement.
Why combine machine learning with citizen science?
Citizen science projects generate vast amounts of data, but sorting, verifying, and analysing this information can be time-consuming. Machine learning algorithms can help by:
  • Identifying patterns in large datasets of fossils and artefacts.
  • Automating classification tasks, reducing the burden on researchers.
  • Improving data accuracy through AI-assisted validation.
  • Enhancing public engagement, making discoveries more accessible to non-experts.
However, not all CS-ML projects are successful, and without a structured approach, they may fail to reach their full potential.
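As a rough illustration of what AI-assisted validation could look like, the sketch below accepts a citizen-submitted identification only when an ML model agrees with it and is confident, and sends everything else to an expert. The `Submission` fields, the `triage` function, the 0.9 threshold and the example finds are all hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """One citizen-submitted find (hypothetical structure for illustration)."""
    find_id: str
    volunteer_label: str     # identification given by the citizen scientist
    model_label: str         # identification predicted by an ML model
    model_confidence: float  # model confidence between 0.0 and 1.0


def triage(sub: Submission, threshold: float = 0.9) -> str:
    """Accept only when volunteer and model agree and the model is confident."""
    agrees = sub.volunteer_label == sub.model_label
    if agrees and sub.model_confidence >= threshold:
        return "accept"
    # Disagreement or low confidence: a human expert should take a look.
    return "expert_review"


# Made-up example submissions.
submissions = [
    Submission("find-001", "shark tooth", "shark tooth", 0.97),
    Submission("find-002", "shark tooth", "ray tooth", 0.95),
    Submission("find-003", "flint flake", "flint flake", 0.55),
]

results = {s.find_id: triage(s) for s in submissions}
for find_id, decision in results.items():
    print(find_id, decision)
```

In this sketch only the first find is accepted automatically. The second is flagged because the volunteer and the model disagree, and the third because the model is unsure, so expert time goes to the uncertain cases.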
What challenges do CS-ML projects face?
While biodiversity monitoring has successfully integrated ML tools, object-based research in palaeontology and archaeology presents unique challenges:
  • Legislation on archaeological and palaeontological finds differs between countries and regions and needs to be taken into account when setting up a project.
  • Fossils and artefacts require different classification methods than living organisms.
  • The quality and consistency of citizen-collected data can vary.
  • Many ML tools are designed for image-based datasets, but fossils often require 3D analysis.
  • Public participation is crucial, so user-friendly tools and engagement strategies are needed.
How can researchers design successful CS-ML projects?
The study proposes a four-phase workflow to guide researchers:
  1. Preparation – Define objectives, choose the right ML models, and engage stakeholders.
  2. Execution – Train volunteers, collect data, and integrate ML tools for processing.
  3. Implementation – Validate and analyse the data, ensuring accuracy.
  4. Reiteration – Refine and improve the system based on feedback and new discoveries.
Each phase ensures that scientific goals, technology development, and public engagement are aligned.
What does a well-designed CS-ML project look like in practice?
The LegaSea project, which studies fossils and artefacts from sand nourishments in the Netherlands, demonstrates how this approach can work. By combining citizen-collected data with ML-driven analysis, researchers can streamline discoveries and improve classification accuracy.
Why is this research important?
By developing a structured approach to CS-ML projects, researchers can:
  • Maximise scientific discoveries from citizen-collected data.
  • Reduce errors and improve data reliability using AI.
  • Boost public participation, making research more inclusive and engaging.
  • Expand machine learning applications in palaeontology and archaeology.
What is the key takeaway?
Machine learning and citizen science have huge potential to transform how we study fossils and artefacts, but to succeed, projects need careful planning and integration. This study provides a workflow and best practices to help researchers build more effective CS-ML collaborations.

What are Article Spotlights?

PeerJ Article Spotlights feature research published in PeerJ journals that is of interest to non-specialists and the general public.

Spotlighted articles receive a press release and feature author interviews, AI explainers and more.

If you have published in PeerJ and would like to be featured in an Article Spotlight, please contact PeerJ.