
Distinguish commented code vs valid comments [closed]

I have to work with a project that has tons of commented code everywhere. Before I introduce any changes I would like to do a basic clean-up and remove old unused code.

I could use the solution from this accepted answer to remove all comments, but...

There are legitimate comments (not commented-out code) that explain things, and I don't want to remove them. For example:

// Those parameters control foo and bar... <- valid comment
int t = 5;
// int t = 10;  <- commented code
int k = 2*t;

Only line 3 should be removed.

What are the possible ways of analyzing the code to distinguish between natural-language comments and commented-out lines of code?

hans asked Apr 12 '26 15:04


1 Answer

This is a basic approach, but it serves as a proof of concept of what can be done. It uses Bash together with GCC's -fsyntax-only option.

Here is the bash script:

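#!/bin/bash
# Usage: ./script.sh file.cpp
while IFS='' read -r line || [[ -n "$line" ]]; do
    # Text after the first "//" on the line (empty if the line has no "//").
    LINE=$(printf '%s\n' "$line" | grep -oP '(?<=//).*')
    # Skip lines with no comment, or with an empty or blank one.
    if [[ "$LINE" =~ [^[:space:]] ]]; then
        # Exit status 0 means GCC parsed the text as valid C.
        if printf '%s\n' "$LINE" | gcc -fsyntax-only -xc -; then
            # Escape regex metacharacters so sed matches the text literally.
            pattern=$(printf '%s\n' "$LINE" | sed 's|[][\\/.^$*]|\\&|g')
            sed -i "/\/\/$pattern/d" "$1"
        fi
    fi
done < "$1"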

The script reads the code file line by line. For each line, it greps the text after the // delimiter (if there is one) with the regex (?<=//).* and passes that text to gcc -fsyntax-only to check whether it is a valid C/C++ statement. The argument -xc - makes GCC read its input from stdin (see my answer here for more details). Note that the c in -xc - specifies the language, which is C in this case; for C++, change it to -xc++.

Then, if GCC parses the statement successfully (i.e., it is a legitimate C/C++ statement), the script removes that line from the given file using sed -i.
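
The extraction step described above can also be done without grep, using Bash parameter expansion. The following is only a sketch of that one step; comment_text is a hypothetical helper name that does not appear in the script above.

```shell
#!/bin/bash
# comment_text is a hypothetical helper, not part of the original script.
# It prints the text that follows the first "//" on a line and
# returns 1 when the line contains no "//" at all.
comment_text() {
    local line=$1
    [[ $line == *//* ]] || return 1
    # "${line#*//}" strips the shortest prefix ending in "//".
    printf '%s\n' "${line#*//}"
}

comment_text '// int t = 10;'    # prints " int t = 10;"
comment_text 'int k = 2*t;' || echo "no comment on this line"
```

The leading space is kept on purpose: GCC ignores it, and it lets the text be matched back against the original line exactly.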


Running it on your example (but after removing <- commented code from the third line to make it a legitimate statement):

// Those parameters control foo and bar... <- valid comment
int t = 5;
// int t = 10;
int k = 2*t;

Output (in the same file):

// Those parameters control foo and bar... <- valid comment
int t = 5;
int k = 2*t;

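(If you want the modified version written to a different file, remove the -i from sed -i and redirect its output to that file. Note that each sed call then prints the whole file with only one line removed, so this works as-is only when a single line is deleted.)

The script is called like this: ./script.sh file.cpp. It may print several GCC errors; these come from the valid comments, which GCC cannot parse.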
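
Since the deletion is done in place, it can help to review the candidates before removing anything. The sketch below is a dry-run variant of the same idea, under the assumption that gcc is on the PATH: it only reports the lines GCC accepts, prefixed with their line numbers, and discards GCC's error messages. The name list_commented_code is hypothetical.

```shell
#!/bin/bash
# list_commented_code is a hypothetical name; it reports candidates
# and never modifies the file.
list_commented_code() {
    local file=$1 line text n=0
    while IFS='' read -r line || [[ -n "$line" ]]; do
        n=$((n + 1))
        # Only lines that contain a "//" comment are of interest.
        [[ $line == *//* ]] || continue
        text=${line#*//}
        # Skip empty or blank comments: GCC accepts empty input.
        [[ $text =~ [^[:space:]] ]] || continue
        # Exit status 0 means GCC parsed the text as C; errors are hidden.
        if printf '%s\n' "$text" | gcc -fsyntax-only -xc - 2>/dev/null; then
            printf '%d:%s\n' "$n" "$line"
        fi
    done < "$file"
}
```

Called as list_commented_code file.cpp on the example above, this should print 3:// int t = 10; and leave the file untouched.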


Update.

A simpler version of the same logic is:

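#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
    # Only lines with non-blank text after a "//".
    if [[ "$line" =~ //.*[^[:space:]] ]]; then
        LINE=${line#*//}    # text after the first "//"
        # Escape regex metacharacters so sed matches the text literally.
        pattern=$(printf '%s\n' "$LINE" | sed 's|[][\\/.^$*]|\\&|g')
        printf '%s\n' "$LINE" | gcc -fsyntax-only -xc - && sed -i "/\/\/$pattern/d" "$1"
    fi
done < "$1"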
ndrwnaguib answered Apr 15 '26 06:04


