
How to count lines of code including sub-directories [duplicate]

Tags: linux, bash, unix, wc

Suppose I want to count the lines of code in a project. If all of the files are in the same directory I can execute:

cat * | wc -l

However, if there are sub-directories, this doesn't work. For this to work, cat would need a recursive mode. I suspect this might be a job for xargs, but I wonder if there is a more elegant solution?

863

asked Nov 25 '08 07:11

speciousfool

2 Answers

First, you do not need cat to count lines. This is an antipattern called Useless Use of Cat (UUoC). To count lines in the files in the current directory, use wc directly:

wc -l *

Then the find command recurses into the sub-directories:

find . -name "*.c" -exec wc -l {} \;

. is the name of the top directory to start searching from
-name "*.c" is the pattern of the file you're interested in
-exec gives a command to be executed
{} is replaced by each file name that find matches and is passed to the command (here wc -l)
\; indicates the end of the command

This command produces a list of all files found, each with its line count. If you want the sum for all the files found, you can use find to list the files (with the -print option) and then use xargs to pass this list as arguments to wc -l.

find . -name "*.c" -print | xargs wc -l

EDIT to address Robert Gamble's comment (thanks): if you have spaces or newlines (!) in file names, then you have to use the -print0 option instead of -print, together with xargs -0 (spelled --null in GNU xargs), so that the file names are passed as null-terminated strings.

find . -name "*.c" -print0 | xargs -0 wc -l
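As a rough illustration of the null-delimited pipeline, the sketch below builds a throwaway tree and pulls out only the grand total. The directory and file names are made up for the demo, and the awk filter is an addition that is not part of the original answer.

```shell
# Build a throwaway tree (hypothetical file names) to try the pipeline on.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'a\nb\n' > "$demo/main.c"
printf 'c\nd\ne\n' > "$demo/sub dir/my file.c"

# Null-delimited names survive the spaces in "sub dir/my file.c".
# With more than one file, wc prints a final "total" line; keep only that.
total=$(find "$demo" -name "*.c" -print0 | xargs -0 wc -l | awk '$2 == "total" { print $1 }')
echo "$total"   # prints 5

rm -rf "$demo"
```

Note that if the file list is long enough for xargs to split it across several wc invocations, there will be several "total" lines rather than one.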

The Unix philosophy is to have tools that do one thing only, and do it well.

181

answered Sep 24 '22 01:09

philant

If you want a code-golfing answer:

grep '' -R . | wc -l

The problem with just using wc -l on its own is that it can't descend into sub-directories, and the one-liners using

find . -exec wc -l {} \;

won't give you a total line count because they run wc once for every file (lol!), and

find . -exec wc -l {} +

will get confused as soon as find hits the ~200k-character argument limit for parameters (see the footnote on limits) and instead calls wc multiple times, each time giving you only a partial summary.
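If only the grand total matters, one possible way around the chunking (a sketch, not something the original answer proposes) is to let find feed file contents into a single pipe and count once at the end. The demo tree and its file names are invented for illustration.

```shell
# Throwaway tree with hypothetical files holding 2 + 1 + 3 lines.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
printf '1\n2\n' > "$demo/x.txt"
printf '3\n' > "$demo/a/y.txt"
printf '4\n5\n6\n' > "$demo/a/b/z.txt"

# However many times find has to invoke cat, all output lands in one pipe,
# so the single wc at the end sees every line exactly once.
# tr strips the padding that some wc implementations add.
total=$(find "$demo" -type f -exec cat {} + | wc -l | tr -d ' ')
echo "$total"   # prints 6

rm -rf "$demo"
```

Here cat is doing real work (concatenating many files), so this is not the useless kind.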

Additionally, the above grep trick will not add more than 1 line to the output when it encounters a binary file, which could be circumstantially beneficial.

For the cost of 1 extra command character, you can ignore binary files completely:

 grep '' -IR . | wc -l

If you want to run line counts on binary files too:

 grep '' -aR . | wc -l
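One difference worth knowing when comparing the grep trick against wc -l: wc -l counts newline characters, while grep counts lines, so a final line without a trailing newline is counted by grep but not by wc. A small sketch, using a made-up file name:

```shell
demo=$(mktemp -d)
# Two lines of text, but the last one lacks a trailing newline.
printf 'one\ntwo' > "$demo/no_newline.txt"

wc_count=$(wc -l < "$demo/no_newline.txt" | tr -d ' ')   # counts newline characters
grep_count=$(grep -c '' "$demo/no_newline.txt")           # counts lines
echo "wc: $wc_count, grep: $grep_count"   # prints "wc: 1, grep: 2"

rm -rf "$demo"
```

So the two approaches can disagree slightly on trees that contain files without a final newline.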

Footnote on limits:

The docs are a bit vague as to whether it's a string size limit or a number-of-tokens limit.

cd /usr/include; find -type f -exec perl -e 'printf qq[%s => %s\n], scalar @ARGV, length join q[ ], @ARGV' {} +
# 4066 => 130974
# 3399 => 130955
# 3155 => 130978
# 2762 => 130991
# 3923 => 130959
# 3642 => 130989
# 4145 => 130993
# 4382 => 130989
# 4406 => 130973
# 4190 => 131000
# 4603 => 130988
# 3060 => 95435

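This implies it's going to chunk very, very easily: the batches are capped by total string length (roughly 131,000 characters each), not by the number of arguments.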
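To see the system-wide ceiling on your own machine, getconf can report ARG_MAX. This is only a rough point of comparison: the batch size find actually uses may be well below it, as the numbers above suggest.

```shell
# ARG_MAX is the system limit on the combined size of the arguments and
# environment passed to a new process.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes"
```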

answered Sep 22 '22 01:09

Kent Fredric
