
Structured Code Reviews using NLG

This idea is part of the A Dollar Worth of Ideas series, with potential open source, research or data science projects or contributions for people to pursue. I would be interested in mentoring some of them. Just contact me for details.


The value of code reviews cannot be overstated. However, they take time and finesse, particularly to avoid toxic comments.

Machine learning has been applied to code review for a while, particularly to finding the most appropriate person to do the review.

Here, the idea is to go in the direction of GitHub Copilot and help the reviewer write the code review texts. But instead of just training on large amounts of data, the idea described here follows Reiter and Sripada (2002) in aspiring to always generate the best possible text rather than merely reproducing the training text. (For more on the issues around NLG and ML, see Prof. Reiter's blog post.)

Indeed, if we were to train on all available code reviews, we would end up with text whose toxicity level is similar to that of its training data. Using open source reviews would make this worse, since that community is often criticized for its high levels of toxicity.

Instead, a corpus of code reviews can be clustered and the resulting categories used to build a structured editor, where the reviewer chooses what type of comment they want to make (the categories can be further ranked by trained classifiers). Once a category has been selected, a piece of text (which can be customized for different communities) serves as a starting point for the review. This way, the communication starts on a good footing, with texts that have been selected to be culturally neutral, non-toxic and overall effective.
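To make the shape of such an editor concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the category names, the signal terms, the template texts and the community labels are all hypothetical, and the keyword-overlap scorer is only a stand-in for the trained classifiers mentioned above. In a real system the categories would come out of clustering an actual review corpus.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Category:
    """One cluster of review comments, given a human-readable name."""
    name: str
    signal_terms: set[str]  # hypothetical terms hinting that the category applies
    templates: dict[str, str] = field(default_factory=dict)  # community -> starter text


# Hypothetical categories; in practice these would be derived from clustering.
CATEGORIES = [
    Category(
        name="missing-tests",
        signal_terms={"assert"},
        templates={"default": "Could we add a test covering {subject}?"},
    ),
    Category(
        name="documentation",
        signal_terms={"def", "class"},
        templates={"default": "Would a short docstring for {subject} be useful here?"},
    ),
    Category(
        name="naming",
        signal_terms={"tmp", "foo", "x"},
        templates={
            "default": "Would a more descriptive name for {subject} help future readers?",
            "terse": "Consider renaming {subject}.",
        },
    ),
]


def rank_categories(change_text, categories):
    """Order categories by term overlap with the change.

    This is a toy scorer standing in for a trained classifier.
    """
    tokens = set(re.findall(r"[a-z_]+", change_text.lower()))
    return sorted(categories, key=lambda c: len(c.signal_terms & tokens), reverse=True)


def starter_text(category, subject, community="default"):
    """Return the vetted starter text, falling back to the default community."""
    template = category.templates.get(community, category.templates["default"])
    return template.format(subject=subject)


ranked = rank_categories("def f(x): tmp = x + 1; return tmp", CATEGORIES)
print([c.name for c in ranked])
print(starter_text(ranked[0], "`tmp`"))
```

The reviewer would see the ranked categories, pick one, and then edit the starter text rather than write from a blank box. Keeping the templates as plain data, keyed by community, is what lets each community vet and adapt the wording without touching the ranking logic.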

Such functionality can be integrated into existing code review tools and help keep the peace, much as automatic code formatters do.