7 Code Quality and Refactoring

7.1 Code quality

Software quality is a critical factor in software development: it affects maintainability, scalability, and reliability. Writing clean code means writing code that is simple to understand and easy to read for others, not just for the original author. Clean code should be easy to debug and collaborate on, and straightforward to modify and extend. Clean code will:

Clearly communicate its purpose and functionality, avoiding obscure names or complex constructions.
Be as simple as it possibly can be, with no unnecessary elements.
Have a clear structure and flow that is understandable to others.
Have a consistent naming convention.
Be easily testable.
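As a small illustration of the points above, the sketch below contrasts an obscure function with a cleaner equivalent. The function names, the constant, and the 21% rate are hypothetical and chosen only for this example.

```python
# Obscure version: cryptic names and a "magic number" hide the intent.
def f(l, t):
    r = []
    for x in l:
        if x > t:
            r.append(x * 1.21)
    return r


# Cleaner version: descriptive names, a named constant, and a docstring.
VAT_MULTIPLIER = 1.21  # hypothetical tax rate, for illustration only


def prices_with_vat(net_prices, minimum_price):
    """Return VAT-inclusive prices for all net prices above minimum_price."""
    return [
        price * VAT_MULTIPLIER
        for price in net_prices
        if price > minimum_price
    ]


# Both versions behave identically; the cleaner one is easier to read and test.
assert f([5, 10, 20], 8) == prices_with_vat([5, 10, 20], 8)
```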

Both refactoring and clean code practices aim to make software easier to manage and enhance, making it more reliable and robust over time.

“Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?”

Brian W. Kernighan

Additional reading for quality criteria

A set of Common Software Quality Assurance Baseline Criteria for Research Projects
https://www.eosc-synergy.eu/for-developers/

7.1.1 Python Enhancement Proposals (e.g. PEP8)

Just as there are ISO standards for a wide range of industries, there are guidelines and best practices for writing Python code. The main goal of PEP8 is to improve the readability and consistency of Python code. It does so by establishing conventions for code layout, naming, and general programming recommendations, which makes it easier for developers to understand and share each other's code across diverse projects worldwide.
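To make these conventions concrete, here is a minimal sketch of a few PEP8 recommendations: lowercase names with underscores for functions and variables, CapWords for classes, upper case for constants, and `is None` rather than `== None`. The class, function, and constant names are hypothetical.

```python
# A version that ignores PEP8 (kept as a comment for comparison):
#
#   def CalcArea(Width,Height=None):
#       if Height==None: Height=Width
#       return Width*Height

DEFAULT_UNIT = "m2"  # constants: UPPER_CASE_WITH_UNDERSCORES


class Rectangle:  # classes: CapWords
    def __init__(self, width, height=None):
        # Compare to None with "is", not "==".
        if height is None:
            height = width
        self.width = width
        self.height = height

    def area(self):  # functions and methods: lowercase_with_underscores
        return self.width * self.height


def describe_area(rectangle):
    return f"{rectangle.area()} {DEFAULT_UNIT}"


print(describe_area(Rectangle(2, 3)))  # prints "6 m2"
```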

Consider looking into these PEP8 resources

https://realpython.com/python-pep8/
https://peps.python.org/pep-0008/
The pycodestyle library (see its GitHub repository), which checks your Python code against the style conventions in PEP8.

7.1.2 Error handling

You can enhance the reliability of your code through proactive error handling. While it’s beneficial that errors alert users that something is wrong, the downside is that the error messages they receive are often not helpful or informative.
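A minimal sketch of proactive error handling is shown below: inputs are validated early, and exceptions carry messages that say what went wrong and what was expected. The function `parse_temperature` and its sanity check are hypothetical, chosen only for illustration.

```python
def parse_temperature(text):
    """Convert user input to a temperature in degrees Celsius."""
    try:
        value = float(text)
    except ValueError:
        # Re-raise with a message that tells the user what was expected.
        raise ValueError(
            f"Expected a number such as '21.5', but got {text!r}."
        ) from None

    # Hypothetical sanity check: nothing can be colder than absolute zero.
    if value < -273.15:
        raise ValueError(
            f"Temperature {value} °C is below absolute zero (-273.15 °C)."
        )
    return value


try:
    parse_temperature("warm")
except ValueError as error:
    print(f"Invalid input: {error}")
```

Compared with the default `could not convert string to float` message, the custom message tells the user which value was rejected and what a valid value looks like.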

Tip

Utrecht University offers a good introduction to handling errors and exceptions in their workshop on Writing Reproducible Code.

7.1.3 Checking code quality

Linters for Python

Linters are tools that analyze source code to flag programming errors, bugs, syntax errors, and suspicious constructs. For Python, linters play a crucial role in ensuring code quality and adherence to coding standards. Moreover, linters can be integrated into most IDEs and they can also be part of a Continuous Integration workflow.

Two common linters are pylint and flake8.

When to use pylint and when to use flake8?

pylint is one of the most popular and comprehensive linters for Python. It checks for errors in the code, tries to enforce a coding standard, and looks for code smells. It can also be customized to adjust to any coding style and supports plugins to add additional checks.
flake8 is a wrapper around PyFlakes (checks Python code for errors such as unused imports and undefined names), pycodestyle (checks whether Python code complies with PEP8 conventions), and McCabe (checks code complexity). It is highly configurable, with options to ignore certain checks and errors, and is often used in continuous integration systems.

While pylint provides a more thorough analysis, flake8 offers speed and simplicity for basic style enforcement. Many projects combine both tools, using flake8 for rapid checks and pylint for more detailed examination.
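To give a feel for the kind of static check a linter performs, the sketch below uses Python's standard `ast` module to flag imports that are never used. It is a toy illustration of a single check, not how pylint or flake8 are implemented, and the function name `find_unused_imports` is hypothetical.

```python
import ast


def find_unused_imports(source):
    """Return the sorted names that are imported but never referenced."""
    tree = ast.parse(source)
    imported = set()
    used = set()

    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds the name "a"; "import a as x" binds "x".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)

    return sorted(imported - used)


example = """
import os
import sys

print(sys.argv)
"""

print(find_unused_imports(example))  # prints ['os']
```

Note that the code is analyzed without being executed, which is what distinguishes static analysis from testing.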

Formatters for Python

Formatters are tools that automatically adjust the formatting of your code to make it consistent and readable according to predefined style guidelines. They do not identify errors in the logic of the code but instead restructure the whitespace, line breaks, and indentation so that the code is more uniform across a project. For Python, Black is commonly used as a formatter.

👉 Black Documentation

Tools for MATLAB

MISS_HIT is a compiler framework designed for MATLAB, accompanied by a suite of tools aimed at enhancing code quality and accuracy. It provides a range of tools suitable for various levels of static analysis.

SonarCloud

SonarCloud is a cloud-based service that performs automatic code reviews using static code analysis to detect bugs, code smells, and security vulnerabilities in a project. It supports many programming languages and integrates with GitHub, GitLab, and Bitbucket as part of Continuous Integration workflows. SonarCloud is particularly useful for projects that require compliance with coding standards or need regular feedback on the quality of the code.

Consideration

While SonarCloud offers valuable features for code quality analysis, be aware that it is a paid service for projects that are not open source, and that the pricing depends on how many lines of code you want to check.

Code coverage

Code coverage measures the proportion of source code that is executed by a program's (unit) test suite (see also the Testing guide). It helps identify which parts of the codebase have been tested, and high coverage generally indicates a lower likelihood of hidden bugs. However, high code coverage does not necessarily translate to high code quality: it only tells us how much of the codebase is exercised by tests, not how well the results are checked.
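The caveat about coverage can be illustrated with a small sketch. The hypothetical function below contains a bug, yet a single test executes every line of it, so line coverage would be 100% while the bug goes unnoticed.

```python
def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0
    for value in values:
        total += value
    # Bug: dividing by a hard-coded 2 instead of len(values).
    return total / 2


def test_mean():
    # This test runs every line of mean(), so line coverage is 100%,
    # but it only passes because the input happens to have two elements.
    assert mean([2, 4]) == 3


test_mean()

# A second test with a different input length would expose the bug:
# mean([1, 2, 3]) returns 3.0 instead of the correct 2.0.
```

A tool such as coverage.py can measure coverage for a real test suite, for example with `coverage run -m pytest` followed by `coverage report`.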

Recommended services:

OpenSSF

The Open Source Security Foundation (OpenSSF) Best Practices badge provides a way for Free/Libre and Open Source Software (FLOSS) projects to demonstrate their adherence to best practices. Projects can self-certify for free. Inspired by the numerous badges available on GitHub, the OpenSSF Best Practices Badge makes it easy to identify which FLOSS projects are committed to best practices and are therefore more likely to deliver high-quality and secure software.

The criteria for earning the passing badge and additional details about the OpenSSF Best Practices Badging program can be found on GitHub.

7.2 Refactoring

Refactoring is the process of restructuring existing code without changing its external behavior. Refactoring helps make the code more maintainable and understandable, which in turn makes it easier to build on and less likely to develop bugs. This can include:

Improving readability - making code easier to understand, which helps future maintainers and external partners.
Reducing complexity - simplifying complex code structures, which can involve breaking down large functions into smaller, more manageable pieces or removing unnecessary dependencies.
Optimizing software design - making it more robust and adaptable for future needs.
Identifying and eliminating redundancies - removing duplicated or unnecessary code.
Ensuring consistency - adhering to a consistent coding style across the codebase so that the code is uniform.
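As a minimal sketch of reducing complexity and eliminating redundancy, the example below extracts duplicated logic into a single helper without changing external behavior. All function names are hypothetical.

```python
# Before: the same cleaning logic is written twice.
def load_names_before(lines):
    return [line.strip().lower() for line in lines if line.strip()]


def load_emails_before(lines):
    return [line.strip().lower() for line in lines if line.strip()]


# After: the shared logic lives in one well-named helper.
def clean_lines(lines):
    """Strip whitespace, lower-case, and drop empty lines."""
    stripped = (line.strip() for line in lines)
    return [line.lower() for line in stripped if line]


def load_names(lines):
    return clean_lines(lines)


def load_emails(lines):
    return clean_lines(lines)


# External behavior is unchanged.
raw = ["  Alice ", "", "BOB\n"]
assert load_names(raw) == load_names_before(raw) == ["alice", "bob"]
```

If the cleaning rules change later, only `clean_lines` needs to be updated.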

7.2.1 Refactoring workflow

When to refactor code?

Rule of three: Begin refactoring when the same or very similar code is being written for the third time.
When adding a feature: Refactoring existing code before adding new features can help simplify the integration of new functionality.
When fixing a bug: Cleaning up code in the areas around a bug can make it easier to identify and fix the issue.
During a code review: Refactoring during code reviews can prevent issues from becoming part of the public codebase and streamline the development process.
When finding a code smell (see below).

👉 Refactoring.Guru - When to refactor?

How to refactor code?

Refactoring should be done in small steps, none of which breaks the existing behavior of the code. Each iteration should make your code slightly better, and can follow this checklist:

Maintain clean code: Refactor with the aim to make the code cleaner and more efficient.
Avoid adding new features: Refactor without introducing new functionalities.
Keep tests passing: All existing tests should still be passing after refactoring, ensuring no new bugs are introduced.
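The checklist above can be sketched as follows: the same checks are run against the code before and after a refactoring step, and no new functionality is added in between. The functions `summarize`, `summarize_refactored`, and `check_behavior` are hypothetical.

```python
def summarize(scores):
    """Original implementation: one loop doing everything."""
    total = 0
    count = 0
    for score in scores:
        if score is not None:
            total += score
            count += 1
    if count == 0:
        return "no data"
    return f"mean={total / count:.1f} (n={count})"


def summarize_refactored(scores):
    """Refactored implementation: same behavior, simpler structure."""
    valid = [score for score in scores if score is not None]
    if not valid:
        return "no data"
    return f"mean={sum(valid) / len(valid):.1f} (n={len(valid)})"


def check_behavior(implementation):
    # The same checks are run before and after refactoring.
    assert implementation([1, 2, None, 6]) == "mean=3.0 (n=3)"
    assert implementation([]) == "no data"
    assert implementation([None]) == "no data"


check_behavior(summarize)             # passes before refactoring
check_behavior(summarize_refactored)  # must still pass afterwards
```

In a real project the checks would live in the test suite, and the refactored code would replace the original rather than sit next to it.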

👉 Refactoring.Guru - How to refactor?