Smart learning: A search-based approach to rank change and defect prone classes

Carol V Alexandru; Annibale Panichella; Sebastiano Panichella; Alberto Bacchelli; Harald C Gall

doi:10.7287/peerj.preprints.1160v1

NOT PEER-REVIEWED

"PeerJ Preprints" is a venue for early communication or feedback before peer review. Data may be preliminary.

Carol V Alexandru¹, Annibale Panichella², Sebastiano Panichella¹, Alberto Bacchelli², Harald C Gall¹

¹ Department of Informatics, University of Zurich, Zurich, Switzerland

² SERG, Delft University of Technology, Delft, Netherlands

Published: 2015-06-05
Accepted: 2015-06-05

Subject Areas: Data Mining and Machine Learning, Software Engineering
Keywords: defect prediction, code change, genetic algorithm

Copyright: © 2015 Alexandru et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ PrePrints) and either DOI or URL of the article must be cited.

Cite this article: Alexandru CV, Panichella A, Panichella S, Bacchelli A, Gall HC. 2015. Smart learning: A search-based approach to rank change and defect prone classes. PeerJ PrePrints 3:e1160v1 https://doi.org/10.7287/peerj.preprints.1160v1

Abstract

Research has yielded approaches for predicting future changes and defects in software artifacts based on historical information, helping developers allocate their (limited) resources effectively. Developers are unlikely to be able to focus on all predicted software artifacts, so the ordering of predictions is important for choosing the right artifacts to concentrate on. We propose using a Genetic Algorithm (GA) to tailor prediction models so that they prioritize classes with more changes/defects. We evaluate the approach on two models, regression tree and linear regression, predicting changes/defects between multiple releases of eight open source projects. Our results show that regression models calibrated by GA significantly outperform their traditional counterparts, improving the ranking of classes with more changes/defects by up to 48%. In many cases the top 10% of predicted classes can contain up to twice as many changes or defects.

Author Comment

This is currently submitted to a Software Engineering conference for peer review.

Supplemental Information

Replication package

A replication package for our study is publicly available for download. In the replication package, we provide: (i) the scripts for the extraction process on a specific dataset, (ii) the datasets used in our experimentation, and (iii) the raw data for the experimented predictors.

DOI: 10.7287/peerj.preprints.1160v1/supp-1
