4 Commenting Code

Effective commenting is an essential aspect of writing clean and maintainable R code. Comments help explain what your code does, why certain decisions were made, and how different parts of the code relate to each other. Good comments make your code easier to understand, debug, and maintain, especially when working in teams or revisiting code after a long time.

In this chapter, we will discuss best practices for commenting your R code, including when to comment, how to write clear and concise comments, and common pitfalls to avoid.

4.1 The Purpose of Comments

4.1.1 Why Comment Your Code?

Comments serve several key purposes:

  • Clarification: Comments clarify complex or non-obvious parts of your code. This helps others (and your future self) understand the logic and purpose behind certain lines of code.
  • Documentation: Comments can document the purpose and usage of functions, variables, and scripts. This is particularly useful for APIs or shared codebases.
  • Debugging Aid: Comments can serve as reminders or notes during debugging, helping you track issues and solutions.
  • Collaboration: When working in teams, comments help ensure that everyone understands the code’s intent and functionality, reducing the chances of misinterpretation.

4.1.2 When to Comment

Not every line of code needs a comment. The key is to comment where necessary and where it adds value:

  • Complex Logic: If a section of code involves complex logic, algorithms, or calculations, add comments to explain the steps.
  • Unusual Workarounds: If you implement a workaround or a non-standard approach due to a limitation or bug, comment on why this was necessary.
  • Function Definitions: Always comment on what a function does, its inputs, and its outputs. Use roxygen2 for formal documentation if needed.
  • Configuration and Parameters: Comment on configuration settings, parameters, or magic numbers that control the script’s behavior.
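
As a minimal sketch of the last three cases, the function below documents its input and output, names a magic number, and explains a workaround. The function name, the variable names, and the 60 degree limit are hypothetical and chosen only for illustration.

```r
# Readings above this value are treated as sensor faults (assumed limit)
max_valid_temp_c <- 60

# Convert temperatures from Fahrenheit to Celsius and drop faulty readings.
# Input:  temp_f, a numeric vector of temperatures in degrees Fahrenheit
# Output: a numeric vector in degrees Celsius, without NA or faulty readings
clean_temperatures <- function(temp_f) {
    temp_c <- (temp_f - 32) * 5 / 9
    # Workaround: comparing NA with a number returns NA, and subsetting
    # with NA would keep NA entries, so missing readings are excluded
    # explicitly.
    temp_c[!is.na(temp_c) & temp_c <= max_valid_temp_c]
}

clean_temperatures(c(32, 212, NA, 50))
```

Here the comments carry information the code alone does not: where the limit comes from and why the missing-value check is needed.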

4.2 Best Practices for Commenting

4.2.1 Writing Clear and Concise Comments

  • Be Brief but Descriptive: Comments should be short and to the point, but still provide enough detail to be informative. Avoid unnecessary verbosity.

    # Calculate the mean of the 'value' column, excluding NA values
    mean_value <- mean(data$value, na.rm = TRUE)
  • Use Proper Grammar and Spelling: Good grammar and spelling improve the readability of your comments, making them easier to understand.

    # Create a scatter plot of height versus weight
    plot(height ~ weight, data = df)
  • Focus on the Why, Not the What: Instead of stating what the code does (which is usually evident), explain why a particular approach was taken.

    # Using a logarithmic transformation to normalise skewed data
    log_data <- log(raw_data)

4.2.2 Comment Placement

  • Inline Comments: Place inline comments on the same line as the code they refer to. Use inline comments sparingly and only for clarifying specific parts of the line.

    df <- df[df$value > 0, ]  # Keep only rows with positive values
  • Block Comments: For sections of code that require more explanation, use block comments placed above the relevant code block.

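    library(dplyr)

    # Clean the data by:
    # 1. Removing rows with missing values
    # 2. Removing outliers using the conventional 1.5 * IQR rule
    quartiles <- quantile(df$value, c(0.25, 0.75), na.rm = TRUE)
    iqr <- quartiles[[2]] - quartiles[[1]]
    lower_bound <- quartiles[[1]] - 1.5 * iqr
    upper_bound <- quartiles[[2]] + 1.5 * iqr
    clean_data <- df %>%
        filter(!is.na(value)) %>%
        filter(value >= lower_bound & value <= upper_bound)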
  • Function Comments: Place comments above function definitions to describe their purpose, inputs, and outputs. This is especially important for functions that will be reused or shared.

    #' Calculate the mean of a numeric vector, excluding NA values
    #'
    #' @param x A numeric vector
    #' @return The mean of the vector, excluding NA values
    calculate_mean <- function(x) {
        mean(x, na.rm = TRUE)
    }

4.2.3 Consistency in Commenting

  • Use a Consistent Style: Adopt a consistent style for your comments across all scripts. This includes how you structure your comments, where you place them, and how detailed they are.
  • Update Comments as Code Changes: Ensure that comments are updated whenever the corresponding code is modified. Outdated comments can be misleading and harmful.
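
To illustrate the second point, the sketch below shows a comment that was left behind when the code changed, followed by a corrected version. The data and the switch from a mean to a median are hypothetical.

```r
# Hypothetical response times, in milliseconds
response_times <- c(120, 135, 128, 950)

# Outdated: the comment below still describes the old calculation.
# Summarise response times using the mean
typical_time_stale <- median(response_times)

# Updated: the comment matches the code and records why it changed.
# Summarise response times using the median, which is less sensitive
# to the occasional very slow request than the mean
typical_time <- median(response_times)
```

Both assignments compute the same value; only the second comment tells the reader the truth about how it was obtained.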

4.3 Common Pitfalls to Avoid

4.3.1 Over-Commenting

While comments are valuable, too many comments can clutter your code and make it harder to read. Avoid commenting on obvious code or restating what the code does.

  • Example of Over-Commenting:

    # Assign the value 10 to variable x
    x <- 10
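
By contrast, a comment earns its place when it records something the code cannot say for itself. In the sketch below, the variable name and the reason given are hypothetical:

```r
# Cap retries at 10 because the (hypothetical) upstream service blocks
# clients that fail more often than that within a minute
max_retries <- 10
```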

4.3.2 Commenting Out Code

Commenting out code is a common practice during development and debugging, but avoid leaving large blocks of commented-out code in your final scripts. This can confuse others and clutter your script.

Instead: Remove unused code or move it to a different file if you think you might need it later.

4.3.3 Vague or Uninformative Comments

Avoid vague comments that don’t add value or clarify the code. Comments like “This is important” or “Fix this later” are not helpful because they say neither what is important nor what needs fixing.

  • Example of Vague Comment:

    # Do something here
    result <- some_function(data)
  • Improved Comment:

    # Apply the custom function to transform the data based on business rules
    result <- some_function(data)

4.4 Advanced Commenting Techniques

4.4.1 Using roxygen2 for Documentation

For more formal and detailed documentation, especially for packages or functions intended for reuse, consider using the roxygen2 package. With roxygen2 you write documentation in a structured format directly above your functions, and it is then converted into R’s standard help files.

  • Example of a roxygen2 Comment Block:

    #' Filter a data frame by a threshold
    #'
    #' Keeps the rows whose value in a specified column is greater than
    #' a threshold. Rows where that value is NA are dropped.
    #'
    #' @param df A data frame to be filtered
    #' @param threshold Numeric value used as the filtering threshold
    #' @param column_name Name of the column to apply the threshold to,
    #'   as a character string
    #' @return A data frame containing only the rows that exceed the threshold
    #' @export
    filter_by_threshold <- function(df, threshold, column_name) {
        df[which(df[[column_name]] > threshold), , drop = FALSE]
    }
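
As a usage sketch, the block below calls filter_by_threshold() on a small, made-up data frame. The function is defined again so that the block runs on its own; this version uses which() and drop = FALSE so that NA values and single-column data frames are handled predictably. In a package, running roxygen2::roxygenise() (or devtools::document()) would turn the comment block into a help file; that step is not shown here because it requires a package directory.

```r
filter_by_threshold <- function(df, threshold, column_name) {
    # which() ignores NA comparisons; drop = FALSE keeps a data frame
    # even when df has a single column
    df[which(df[[column_name]] > threshold), , drop = FALSE]
}

# Made-up measurements for illustration
measurements <- data.frame(
    id = 1:4,
    value = c(2.5, 7.1, NA, 9.8)
)

# Keep only the rows whose 'value' exceeds 5
filtered <- filter_by_threshold(measurements, threshold = 5, column_name = "value")
filtered
```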

4.4.2 Writing “To-Do” Comments

It’s often useful to leave “to-do” comments in your code to remind yourself or others of tasks that need to be completed later. These comments should be concise and clear.

  • Example:

    # TODO: Optimise this function for large datasets
    slow_function <- function(x) {
        # current implementation
    }
  • Note: Use a consistent set of tags, such as TODO, FIXME, or NOTE, so that these comments can be easily searched for and addressed.
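
A consistent tag makes such comments easy to find with a plain text search. The sketch below writes a small example script to a temporary file and lists its tagged lines using base R only. The script contents and the find_tags() helper are hypothetical, and the pattern assumes each tag is followed by a colon.

```r
# Write a small example script to a temporary file
script_path <- tempfile(fileext = ".R")
writeLines(c(
    "# TODO: Optimise this function for large datasets",
    "slow_function <- function(x) x",
    "# FIXME: Handle empty input",
    "y <- slow_function(1)"
), script_path)

# Return the line numbers and text of comments that carry a tag
find_tags <- function(path, tags = c("TODO", "FIXME", "NOTE")) {
    lines <- readLines(path)
    # Matches, for example, "# TODO:" anywhere in a line
    pattern <- paste0("#[[:space:]]*(", paste(tags, collapse = "|"), "):")
    hits <- grep(pattern, lines)
    data.frame(line = hits, text = lines[hits], stringsAsFactors = FALSE)
}

tagged <- find_tags(script_path)
tagged
```

Many editors offer a similar search across a whole project, which is another reason to keep the tags uniform.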

4.5 Summary

In this chapter, we’ve explored best practices for commenting your R code effectively. Comments are a vital tool for making your code more understandable and maintainable. By commenting thoughtfully and consistently, you can greatly improve the clarity of your code, making it easier to work with and share.

In the next chapter, we will look at code syntax and spacing practices to further enhance the readability and structure of your R code.