EuroPython 2017

Linespots: Predicting Bugs in your Code

Speaker: Maximilian Scholz
Sub Community: PyData

In times of increased awareness of technical debt, reviewing and auditing code becomes more important. The main problem with code review is the amount of time spent searching for the needle in the haystack: you don’t know what you are looking for or where to find it. One possible solution is bug prediction. If we could somehow know where the bugs in our code are, focusing review effort on those areas should, in theory, make reviews more effective: more bugs uncovered in less reviewing time.

This is what Linespots tries to offer. It is an algorithm developed during my thesis that analyses a project’s history and calculates a probability value for each line of code in the project, representing the likelihood of a bug existing in that line. Using these probabilities, reviewers can focus on the areas at higher risk of containing bugs and spend less time on robust code.

The research done so far showed that, by analysing the 0.5% of a project’s lines with the highest risk values, an average of 50% of the bugs fixed in the next 150 commits were correctly predicted by Linespots. This is an improvement by a factor of 10 compared to Bugspots, an algorithm developed at Google, on which Linespots is based.
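To make the idea of line-level risk scoring concrete, here is a minimal Python sketch. It is not the Linespots implementation: it assumes the recency weighting published for Bugspots (a logistic curve over normalized commit time) and simply applies it to individual lines instead of whole files. The function names, the input format, and the choice to ignore line movement between commits are all assumptions made for illustration, and the resulting scores are relative risk values rather than calibrated probabilities.

```python
import math
from collections import defaultdict


def recency_weight(t):
    """Bugspots-style weight for a normalized commit time t in [0, 1].

    Recent fixes (t close to 1) count far more than old ones (t close to 0).
    """
    return 1.0 / (1.0 + math.exp(-12.0 * t + 12.0))


def score_lines(fix_commits, first_ts, last_ts):
    """Sum recency weights per line over all bug-fix commits.

    fix_commits: iterable of (timestamp, [(path, line_no), ...]) pairs
    (hypothetical input format). Line numbers are assumed to already refer
    to the current revision; tracking lines as they move between commits
    is left out of this sketch.
    """
    span = last_ts - first_ts
    scores = defaultdict(float)
    for ts, touched in fix_commits:
        t = (ts - first_ts) / span if span else 1.0
        for line in touched:
            scores[line] += recency_weight(t)
    return dict(scores)


def top_fraction(scores, fraction):
    """Return the highest-scoring lines, keeping the given fraction (at least one)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical history: one old fix and two recent ones.
history = [
    (0, [("a.py", 1)]),
    (100, [("a.py", 2), ("b.py", 5)]),
    (100, [("a.py", 2)]),
]
scores = score_lines(history, first_ts=0, last_ts=100)
review_first = top_fraction(scores, 0.3)
```

A reviewer would then start with the lines in `review_first`, which mirrors the idea above of inspecting only the small share of lines with the highest risk values.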

Outline:

  • Basics and functionality of Linespots
  • Research results
  • Pros and cons of Linespots
  • Results of a case study

Friday 14 July at 14:00
