8  Code Smells

CC-BY-4.0 © 2021 Balaban et al.

Code smells are software characteristics that suggest there might be an issue with the code’s design or implementation. While code smells themselves might not always indicate a bug or malfunction, they can make the code harder to understand, maintain, and extend, which can lead to bugs and other issues down the line. Code smells are usually noticed and addressed during code reviews, when writing tests, adding new features, fixing bugs, and during automated code analysis.

The “long method”

A “long method” is a common code smell that refers to a method/function that is excessively long and contains a large number of lines of code. Long methods can make code difficult to understand, maintain, and debug.

def load_data(filepath: str):
    # Check data file exists
    # If file extension is .json: load json data
    # If file extionsion is .pickle: load pickled data
    # If file extionsion is .csv: load cvs data
    # Verify content of data set
Solution

Identify logical blocks of code within the long method/function and extract them into separate methods with descriptive names. We should aim to make each method responsible for a singular task and compose more complex functionalities from modular components.

“Functions should do one thing. They should do it well. They should do it only.”

Robert C. Martin

def load_data(filepath: str) -> Data:
    verify_filepath(filepath: str)  
    data = read_data(filepath: str)
    verify_data(data)
    return data

  
def verify_filepath(filepath: str): pass

def read_data(filepath: str) -> Data:
    _, extension = os.path.splitext(filepath)
    data_types = {
        ".json": read_from_json,
        ".pickle": read_from_pickle,
        ".csv": read_from_csv,
    }
    return data_types[extension](filepath)

def read_from_json(filepath: str): pass
def read_from_pickle(filepath: str): pass
def read_from_csv(filepath: str): pass
def verify_data(data: Data) -> bool: pass

Monolithic design and large classes

A monolithic design is where the entire application or system is built as a single, tightly coupled unit, without clear separation of responsibilities or modularization. Often, this smell is encountered as large classes and indicates that a single class in the codebase is responsible for handling too many responsibilities or has grown too complex.

In this example, the KiteController class is responsible for both adjusting the kite angle based on wind speed and generating power from the kite.

class WindSensor:
    def measure_wind_speed(self):
        # Placeholder for wind speed measurement
        return 10

class KiteController:
    def __init__(self):
        self.wind_sensor = WindSensor()

    def adjust_kite_angle(self):
        wind_speed = self.wind_sensor.measure_wind_speed()
        # Logic to adjust kite angle based on wind speed
        if wind_speed > 15:
            print("Strong wind detected. Adjusting kite angle for stability.")
        else:
            print("Moderate wind detected. Maintaining kite angle.")

    def generate_power(self):
        # Logic to generate power based on kite angle and wind speed
        wind_speed = self.wind_sensor.measure_wind_speed()
        if wind_speed > 15:
            print("Generating high power from kite.")
        else:
            print("Generating moderate power from kite.")

def main():
    kite_controller = KiteController()
    kite_controller.adjust_kite_angle()
    kite_controller.generate_power()

if __name__ == "__main__":
    main()
Solution
  • Follow the Single Responsibility Principle (SRP) - Ensure that each class has only one job. If a class is doing too much, split its responsibilities into separate classes.
  • Use dependency injection: Reduce class coupling by calling dependencies as arguments (injecting dependencies) rather than hard-coding them. This promotes modularity and testability, as well as making it easier to swap out components.

The main function serves as the entry point and demonstrates dependency injection by creating instances of WindSensor, KiteController, and PowerGenerationSystem externally and passing them to each other’s constructors.

class WindSensor:
    def measure_wind_speed(self):
        # Simulate wind speed measurement
        return 10  # Placeholder value for demonstration purposes

class KiteController:
    def __init__(self, wind_sensor):
        self.wind_sensor = wind_sensor

    def adjust_kite_angle(self):
        wind_speed = self.wind_sensor.measure_wind_speed()
        # Logic to adjust kite angle based on wind speed
        if wind_speed > 15:
            print("Strong wind detected. Adjusting kite angle for stability.")
        else:
            print("Moderate wind detected. Maintaining kite angle.")

class PowerGenerationSystem:
    def __init__(self, kite_controller):
        self.kite_controller = kite_controller

    def generate_power(self):
        self.kite_controller.adjust_kite_angle()
        # Logic to generate power based on kite angle and wind speed
        print("Generating power from kite.")

# Dependency Injection
def main():
    wind_sensor = WindSensor() 
    kite_controller = KiteController(wind_sensor) # Through dependency injection, the WindSensor can be easily replaced with another sensor
    power_generation_system = PowerGenerationSystem(kite_controller)

    power_generation_system.generate_power()

if __name__ == "__main__":
    main()

Code duplication

Duplicated code refers to instances where similar or identical blocks of code appear in multiple places within a codebase. This code smell can increase maintenance efforts, as changes in one place might require corresponding changes in other places.

Solution
  • Refactor the code to accept parameters as arguments, instead of hard-coding them.
  • Extract common functionality into functions or methods.
  • Refactor duplicated code into higher-level abstractions.
  • Make use of utility functions to centralize common code and avoid duplication.

Hard coding and magic numbers

This happens when constants and specific values are directly embedded into the code instead of being defined as variables or passed as arguments. Hard-coding makes the code less flexible and harder to maintain because changing the behavior requires modifying the source code rather than adjusting parameters. This smell is often noticed when you need to make changes to the source code in order to change the behaviour of the software at runtime.

def calculate_area(radius):
    # Hard-coded value of pi
    return 3.14 * radius * radius

def check_temperature(temperature):
    # Hard-coded magic numbers for temperature thresholds
    if temperature > 30:
        print("It's too hot!")
    elif temperature < 10:
        print("It's too cold!")
Solution

Using configurable parameters or constants can make the code more adaptable and easier to maintain.

def calculate_area(radius, pi):
    return np.pi * radius * radius

def check_temperature(temperature, hot_threshold=30, cold_threshold=10):
    # If needed, you can use default values
    if temperature > hot_threshold:
        print("It's too hot!")
    elif temperature < cold_threshold:
        print("It's too cold!")

Deep nesting

Deep nesting occurs when there are too many levels of indentation in the code, making it harder to understand, maintain, and debug. It can lead to spaghetti code and decreased readability.

def validate_model_convergence(model: Model) -> bool:
    if model.convergence > 1:
        if model.convergence < 0.1:
            if model.secondary_condition == True
                return True
            else:
                return False
        else:
            return False
    else:
        return False
Solution

Refactoring to reduce nesting levels and employing techniques like early returns or breaking down complex logic into smaller, more modular functions can help alleviate this code smell.

def validate_model_convergence(model: Model) -> bool:
    if model.convergence <= 1:
        return False
    if model.convergence >= 0.1:
        return False
    if model.secondary_condition == False
        return False
    return True
        

Or alternatively using the any/all built-in functions

def validate_model_convergence(model: Model) -> bool:
    return all([
        model.convergence > 1,
        model.convergence < 0.1,
        model.secondary_condition == True,
    ])

with the equivalent in MATLAB

function result = validate_model_convergence(model)
    result = all([model.convergence > 1, model.convergence < 0.1, model.secondary_condition == true]);
end

Long parameter list

A function or method accepts parameters that are not necessary for its operation, leading to increased coupling and decreased readability. It can make the code harder to understand and maintain, as well as increase the risk of errors due to the misuse of parameters. Refactoring to reduce the number of parameters and passing only the necessary data can help improve code clarity and maintainability.

def process_kite_flight(kite_name: str, wind_speed: float, kite_angle: float, power_generated: float):
    # Process kite flight data
    print("Kite Name:", kite_name)
    print("Wind Speed:", wind_speed)
    print("Kite Angle:", kite_angle)
    print("Power Generated:", power_generated)
    # Additional processing logic...

# Usage
process_kite_flight(kite_name = "Kite_1", wind_speed=20.5, kite_angle=45.0, power_generated=150.0)
Solution
  • If a function requires a large number of parameters, it may be a sign that it’s doing too much. Break down the function into smaller, more focused functions or classes with clearer responsibilities.
  • Instead of passing a long list of parameters, encapsulate related data into meaningful objects or data structures. By passing objects or data structures, you can reduce the number of parameters while still providing necessary information to the function.
from dataclasses import dataclass

# Create a dataclass to encapsulate kite flight data which can be 
# passed as a single parameter 
@dataclass
class KiteFlightData:
    name: str
    wind_speed: float
    kite_angle: float
    power_generated: float

def process_kite_flight(kite_data: KiteFlightData):
    # Process kite flight data
    print("Kite Name:", kite_data.name)
    print("Wind Speed:", kite_data.wind_speed)
    print("Kite Angle:", kite_data.kite_angle)
    print("Power Generated:", kite_data.power_generated)
    # Additional processing logic...

# Usage 
flight_data = KiteFlightData(name="Kite_1", wind_speed=20.5, kite_angle=45.0, power_generated=150.0)
process_kite_flight(flight_data)
Tip

You can combine dataclasses with data validation through Pydantic.

Divide and conquer

Be careful not to create too large datastructures as this increases complexity and may introduce tight coupling. Instead, break down large datastructures into smaller, more manageable pieces based on logical grouping or functionality. Use composition to combine smaller data classes into larger ones where necessary.

Inappropriate intimacy

This smell occurs when one class knows too much about the internal structure of another class, leading to tight coupling. Tight coupling makes the code harder to understand, maintain, and refactor because changes to one class can have ripple effects on other classes.

class GroundStation:
    def __init__(self, station_name):
        self.station_name = station_name
        self.kite = Kite()

    def get_kite_name(self):
        # Violation of the Law of Demeter:
        # Accessing a property of an object returned by another object
        return self.kite.name        

class Kite:
    def __init__(self):
        self.name = "Kite_1"

# Usage
ground_station = GroundStation("TUD")
kite_name = ground_station.get_kite_name()
Solution

Follow the principles of least knowledge (Law of Demeter). Each unit should have only limited knowledge about other units, i.e. don’t talk to strangers.

class GroundStation:
    def __init__(self, station_name, kite):
        self.station_name = station_name
        self.kite = kite

    def get_kite_name(self):
        return self.kite.get_name()

class Kite:
    def __init__(self, name):
        self.name = name
        
    def get_name(self):
        return self.name    

# Usage
kite = Kite("Kite_1")
ground_station = GroundStation("TUD", kite)
kite_name = ground_station.get_kite_name()
class GroundStation:
    def __init__(self, station_name, kite_name):
        self.station_name = station_name
        self.kite_name = kite_name

    def get_kite_name(self):
        return self.kite_name

# Usage
ground_station = GroundStation("TUD", "Kite_1")
kite_name = ground_station.get_kite_name()

Side effects and external state

Side effects refer to observable changes or interactions that a function or expression has on the external world beyond its return value. Non-pure functions are functions that have side effects or rely on external state, making their behavior dependent on factors other than their inputs.

Tip

Pure functions are deterministic and have no side-effects.

Instead, non pure functions may

  • modify state outside their scope, such as changing the value of a global variable, printing to the console, or modifying files
  • produce different results for the same input depending on the state of external (global) variables or resources
  • use random number generation and are thus non-deterministic
  • read input from the user or write output to a display
  • interact with databases, APIs, or other external services
Solution

Ensure that each function or module has a single responsibility. Break down complex functions into smaller, focused functions that perform specific tasks. This helps in isolating non-pure functions with side effects from pure functions.

CC-BY-4.0 CodeRefinery

Dead and commented code

Dead and commented code refers to parts of the code that are no longer in use or have been superseded but are not deleted, only commented out. Such code can clutter the codebase, making it hard to read and maintain.

Solution

Remove it. Commit the removal of the commented-out code with a meaningful commit message explaining why it was removed. This allows you to track the change and easily revert it if necessary.