Towards the portability of knowledge in reinforcement learning-based systems for automatic drone navigation

PeerJ Computer Science


Introduction

  • It proposes a mechanism that isolates the knowledge an agent acquires during learning and separates it from the agent's other tasks (perception, action, etc.). This was not possible in the earlier version of the system, which limited the portability of that knowledge.

  • It presents an exhaustive study of knowledge portability between agents in different scenarios, which gives an idea of how portable the knowledge obtained by means of RL techniques is.

System Description

Antecedents

Adaptation to drones

Knowledge exportation and importation

Experimental Design

Description of the learning scenarios

Metrics used

  • Average number of cycles per attempt (Avg_CI): The number of cycles the agent spends on each attempt, averaged over the attempts. The lower the value, the more quickly the agent performs its task, which implies better learning.

  • Success rate (%Success): The number of successful attempts relative to the total number of attempts in each simulation.

  • Simulation time (T_Sim): The amount of time the agent (or agents) requires to complete the simulation. The shorter the time, the more quickly the agent has learnt.
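The three metrics above can be computed from a simple per-attempt log. The following is a minimal sketch, not the authors' implementation: the `Attempt` record, the `summarize` function, the choice to average cycles over all attempts (successful or not), and the reporting of the success rate as a percentage are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Attempt:
    """Hypothetical record of one attempt within a simulation."""
    cycles: int      # cycles the agent spent on this attempt
    success: bool    # whether the agent reached its destination


def summarize(attempts: List[Attempt], t_sim_seconds: float) -> Dict[str, float]:
    """Compute Avg_CI, %Success and T_Sim for one simulation.

    Assumption: Avg_CI averages over every attempt, not only successful ones.
    """
    if not attempts:
        raise ValueError("at least one attempt is required")
    n = len(attempts)
    avg_ci = sum(a.cycles for a in attempts) / n
    pct_success = 100.0 * sum(1 for a in attempts if a.success) / n
    return {"Avg_CI": avg_ci, "%Success": pct_success, "T_Sim": t_sim_seconds}


# Illustrative values only; these are not results from the article.
example_log = [
    Attempt(cycles=10, success=True),
    Attempt(cycles=20, success=False),
    Attempt(cycles=30, success=True),
    Attempt(cycles=40, success=True),
]
example_metrics = summarize(example_log, t_sim_seconds=12.5)
```

With this layout, comparing an agent that learns from scratch with one that imports knowledge amounts to calling `summarize` on each agent's log and comparing the resulting dictionaries.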

Results

Scenario 1 (Sc1)

Scenario 2 (Sc2)

Scenario 3 (Sc3)

Scenario 4 (Sc4)

Scenario 5 (Sc5)

Conclusion

  • To expand the study to include a greater number of scenarios, especially scenarios with multiple drones interacting with each other and with large obstacles present in the trajectory of these drones.

  • To take a first step towards moving the implemented system (a simulation) into the real world. This requires real drones whose on-board processing includes the logic of the presented system. These drones must also be equipped with a perception system (sensors) that allows them to detect obstacles and other drones, and to know their position relative to the destination point (possibly via GPS). It must be possible to map the set of movements of the simulated agent onto the movement commands of a drone, probably by means of a movement interface provided by its operating system.

  • To make the agents' learning process more efficient and sustainable, one option would be to reduce the number of perception patterns, which would require less storage and leave fewer decisions to analyse. This might be achieved, for example, by grouping very similar patterns using data mining (clustering) techniques. It remains to be studied whether this reduction in patterns, along with the improved sustainability, still maintains an acceptable quality of learning (success rate).
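The pattern-reduction idea in the last point can be illustrated with a small sketch. It rests on assumptions the article does not state: a perception pattern is represented here as a tuple of discrete sensor readings, similarity is measured with the Hamming distance, and grouping uses a simple greedy "leader" scheme rather than any specific clustering algorithm chosen by the authors. The names `hamming` and `group_patterns` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[int, ...]  # assumed encoding of one perception pattern


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b))


def group_patterns(
    patterns: List[Pattern], max_distance: int = 1
) -> Tuple[List[Pattern], Dict[Pattern, Pattern]]:
    """Greedy leader clustering.

    Each pattern is mapped to the first existing representative within
    max_distance; otherwise it becomes a new representative.
    """
    representatives: List[Pattern] = []
    assignment: Dict[Pattern, Pattern] = {}
    for p in patterns:
        for rep in representatives:
            if hamming(p, rep) <= max_distance:
                assignment[p] = rep
                break
        else:
            # No representative is close enough: p starts a new group.
            representatives.append(p)
            assignment[p] = p
    return representatives, assignment


# Illustrative patterns only; the encoding is an assumption.
observed = [(0, 0, 1, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 0, 1), (0, 0, 1, 0)]
reps, mapping = group_patterns(observed, max_distance=1)
```

In such a scheme the agent would store learned values only for the representatives and look up any new perception through the mapping, so a larger `max_distance` trades storage for the loss of learning quality that the point above says still needs to be studied.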

Additional Information and Declarations

Competing Interests

Juan A. Lara is an Academic Editor for PeerJ.

Author Contributions

José M. Barreiro conceived and designed the experiments, performed the experiments, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Juan A. Lara conceived and designed the experiments, analyzed the data, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Daniel Manrique analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Peter Smith performed the experiments, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The code is available at GitHub and Zenodo:

https://github.com/in2latoj/MARSADA/tree/Marsada

in2latoj. (2023). in2latoj/MARSADA: Marsada2.0 (Marsada). Zenodo. https://doi.org/10.5281/zenodo.7614681

The data is available at Zenodo: Juan Alfonso Lara. (2023). Data used in Knowledge Transfer Research for Drone Navigation [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7600381

Funding

The authors received no funding for this work.
