I have a file which contains "title" written in it many times. How can I find the number of times "title" is written in that file using the sed command provided that "title" is the first string in a line? e.g.
# title
title
title
should output the count = 2, because in the first line "title" is not the first string.
Update
I used awk to find the total number of occurrences as:
awk '$1 ~ /title/ {++c} END {print c}' FS=: myFile.txt
But how can I tell awk to count only those lines having title the first string as explained in example above?
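One possible way to restrict the awk count to lines that start with "title" is to anchor the pattern with ^ and match against the whole line rather than $1. This is only a sketch: count_title_awk is a hypothetical helper name, and the sample file is a throwaway that mirrors the example above. The c + 0 makes awk print 0 instead of an empty line when nothing matches.

```shell
# count_title_awk is a hypothetical helper name used only for this sketch.
count_title_awk() {
    # /^title/ matches only lines whose first characters are "title".
    # c + 0 forces a numeric 0 when no line matched.
    awk '/^title/ { c++ } END { print c + 0 }' "$1"
}

# Throwaway sample mirroring the example in the question.
sample=$(mktemp)
printf '%s\n' '# title' 'title' 'title' > "$sample"
count_title_awk "$sample"
rm -f "$sample"
```

For the three-line sample this prints 2, since the "# title" line is skipped.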
Using grep -c alone will count the number of lines that contain the matching word instead of the number of total matches. The -o option is what tells grep to output each match in a unique line and then wc -l tells wc to count the number of lines. This is how the total number of matching words is deduced.
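The difference between counting lines and counting matches can be seen with a small experiment. The sketch below uses a made-up sample file (not the asker's data) and assumes a grep that supports -o, as GNU and BSD grep do. The last command anchors the pattern with ^, which is closer to what the question asks for.

```shell
# Hypothetical sample file, for illustration only.
sample=$(mktemp)
printf '%s\n' '# title' 'title' 'title title' > "$sample"

# Number of lines that contain "title" anywhere.
lines_with_match=$(grep -c 'title' "$sample")

# Every occurrence printed on its own line, then counted.
# tr strips the padding some wc implementations add.
total_matches=$(grep -o 'title' "$sample" | wc -l | tr -d ' ')

# Number of lines that begin with "title".
lines_starting=$(grep -c '^title' "$sample")

echo "$lines_with_match $total_matches $lines_starting"
rm -f "$sample"
```

With this sample the three numbers are 3, 4 and 2: three lines contain the word, it occurs four times in total, and two lines start with it.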
You can tell sed to carry out multiple operations by just repeating -e (or -f if your script is in a file). sed -i -e 's/a/b/g' -e 's/b/d/g' file makes both changes in the single file named file, in place.
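Note that the expressions run in order on each line, so the second one sees the result of the first. A minimal sketch, reading from standard input instead of using -i so that no file is modified:

```shell
# The first expression turns "a" into "b"; the second then turns
# every "b" (including the new ones) into "d".
result=$(printf 'abc\n' | sed -e 's/a/b/g' -e 's/b/d/g')
echo "$result"   # prints: ddc
```

If the intent were to swap or map characters independently, the y command or a different ordering would be needed.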
On Linux and Unix-like operating systems, the wc command allows you to count the number of lines, words, characters, and bytes of each given file or standard input and print the result.
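As a rough illustration, wc can be combined with sed to answer the original question: sed prints only the lines that begin with "title", and wc -l counts them. The helper name count_title_sed_wc and the sample inputs are assumptions made for this sketch.

```shell
# Basic counts on a two-line input; tr strips padding some wc versions add.
lines=$(printf 'one two\nthree\n' | wc -l | tr -d ' ')   # 2 lines
words=$(printf 'one two\nthree\n' | wc -w | tr -d ' ')   # 3 words
bytes=$(printf 'one two\nthree\n' | wc -c | tr -d ' ')   # 14 bytes
echo "$lines $words $bytes"

# count_title_sed_wc is a hypothetical helper name.
count_title_sed_wc() {
    # -n suppresses default output; p prints only lines matching /^title/.
    sed -n '/^title/p' "$1" | wc -l | tr -d ' '
}

sample=$(mktemp)
printf '%s\n' '# title' 'title' 'title' > "$sample"
count_title_sed_wc "$sample"
rm -f "$sample"
```

This is not pure sed, since the counting is delegated to wc, but it avoids the arithmetic that sed lacks.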
Never say never. Pure sed (although it may require the GNU version).
#!/bin/sed -nf
# based on a script from the sed info file (info sed)
# section 4.8 Numbering Non-blank Lines (cat -b)
# modified to count lines that begin with "title"

/^title/! be

x
/^$/ s/^.*$/0/
/^9*$/ s/^/0/
s/.9*$/x&/
h
s/^.*x//
y/0123456789/1234567890/
x
s/x.*$//
G
s/\n//
h

:e

$ {x;p}
Explanation:
#!/bin/sed -nf             # run sed without printing output by default (-n)
                           # using the following file as the sed script (-f)

/^title/! be               # if the current line doesn't begin with "title" branch to label e

x                          # swap the counter from hold space into pattern space
/^$/ s/^.*$/0/             # if pattern space is empty start the counter at zero
/^9*$/ s/^/0/              # if the counter consists only of nines, prepend a zero
s/.9*$/x&/                 # mark the position of the last digit before a sequence of nines (if any)
h                          # copy the marked counter to hold space
s/^.*x//                   # delete everything before the marker
y/0123456789/1234567890/   # increment the digits that were after the mark
x                          # swap pattern space and hold space
s/x.*$//                   # delete everything after the marker leaving the leading digits
G                          # append hold space to pattern space
s/\n//                     # remove the newline, leaving all the digits concatenated
h                          # save the counter into hold space

:e                         # label e

$ {x;p}                    # if this is the last line of input, swap in the counter and print it
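The increment steps explained above can be tried in isolation by feeding a number to the same sequence of commands. This is a sketch, not part of the original script: increment is a hypothetical helper name, and the leading x and trailing h are left out because a standalone run has no counter stored in hold space.

```shell
# increment is a hypothetical helper: it adds 1 to a decimal number
# using only the sed commands from the counter script.
increment() {
    printf '%s\n' "$1" | sed \
        -e '/^9*$/ s/^/0/' \
        -e 's/.9*$/x&/' \
        -e 'h' \
        -e 's/^.*x//' \
        -e 'y/0123456789/1234567890/' \
        -e 'x' \
        -e 's/x.*$//' \
        -e 'G' \
        -e 's/\n//'
}

increment 0     # prints: 1
increment 129   # prints: 130
increment 99    # prints: 100
```

For 129 the marker lands before "29" (giving 1x29), the y command turns "29" into "30", and the untouched leading "1" is glued back on. For 99 the prepended zero gives the carry a digit to land on.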
Here are excerpts from a trace of the script using sedsed:
$ echo -e 'title\ntitle\nfoo\ntitle\nbar\ntitle\ntitle\ntitle\ntitle\ntitle\ntitle\ntitle\ntitle' | sedsed-1.0 -d -f ./counter
PATT:title$
HOLD:$
COMM:/^title/ !b e
COMM:x
PATT:$
HOLD:title$
COMM:/^$/ s/^.*$/0/
PATT:0$
HOLD:title$
COMM:/^9*$/ s/^/0/
PATT:0$
HOLD:title$
COMM:s/.9*$/x&/
PATT:x0$
HOLD:title$
COMM:h
PATT:x0$
HOLD:x0$
COMM:s/^.*x//
PATT:0$
HOLD:x0$
COMM:y/0123456789/1234567890/
PATT:1$
HOLD:x0$
COMM:x
PATT:x0$
HOLD:1$
COMM:s/x.*$//
PATT:$
HOLD:1$
COMM:G
PATT:\n1$
HOLD:1$
COMM:s/\n//
PATT:1$
HOLD:1$
COMM:h
PATT:1$
HOLD:1$
COMM::e
COMM:$ {
PATT:1$
HOLD:1$
PATT:title$
HOLD:1$
COMM:/^title/ !b e
COMM:x
PATT:1$
HOLD:title$
COMM:/^$/ s/^.*$/0/
PATT:1$
HOLD:title$
COMM:/^9*$/ s/^/0/
PATT:1$
HOLD:title$
COMM:s/.9*$/x&/
PATT:x1$
HOLD:title$
COMM:h
PATT:x1$
HOLD:x1$
COMM:s/^.*x//
PATT:1$
HOLD:x1$
COMM:y/0123456789/1234567890/
PATT:2$
HOLD:x1$
COMM:x
PATT:x1$
HOLD:2$
COMM:s/x.*$//
PATT:$
HOLD:2$
COMM:G
PATT:\n2$
HOLD:2$
COMM:s/\n//
PATT:2$
HOLD:2$
COMM:h
PATT:2$
HOLD:2$
COMM::e
COMM:$ {
PATT:2$
HOLD:2$
PATT:foo$
HOLD:2$
COMM:/^title/ !b e
COMM:$ {
PATT:foo$
HOLD:2$
.
.
.
PATT:10$
HOLD:10$
PATT:title$
HOLD:10$
COMM:/^title/ !b e
COMM:x
PATT:10$
HOLD:title$
COMM:/^$/ s/^.*$/0/
PATT:10$
HOLD:title$
COMM:/^9*$/ s/^/0/
PATT:10$
HOLD:title$
COMM:s/.9*$/x&/
PATT:1x0$
HOLD:title$
COMM:h
PATT:1x0$
HOLD:1x0$
COMM:s/^.*x//
PATT:0$
HOLD:1x0$
COMM:y/0123456789/1234567890/
PATT:1$
HOLD:1x0$
COMM:x
PATT:1x0$
HOLD:1$
COMM:s/x.*$//
PATT:1$
HOLD:1$
COMM:G
PATT:1\n1$
HOLD:1$
COMM:s/\n//
PATT:11$
HOLD:1$
COMM:h
PATT:11$
HOLD:11$
COMM::e
COMM:$ {
COMM:x
PATT:11$
HOLD:11$
COMM:p
11
PATT:11$
HOLD:11$
COMM:}
PATT:11$
HOLD:11$
The ellipsis represents lines of output I omitted here. The line with "11" on it by itself is where the final count is output. That's the only output you'd get when the sedsed debugger isn't being used.
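To reproduce that final count without sedsed, the script can be written to a file and run with plain sed on the same input as the trace. This sketch assumes GNU sed, as noted earlier; the temporary file stands in for the ./counter file used in the trace, and the shebang line is dropped because -n and -f are passed on the command line.

```shell
# Write the counter script to a temporary file (stand-in for ./counter).
script=$(mktemp)
cat > "$script" <<'EOF'
/^title/! be
x
/^$/ s/^.*$/0/
/^9*$/ s/^/0/
s/.9*$/x&/
h
s/^.*x//
y/0123456789/1234567890/
x
s/x.*$//
G
s/\n//
h
:e
$ {x;p}
EOF

# Same thirteen input lines as the trace: eleven begin with "title".
count=$(printf '%s\n' title title foo title bar \
    title title title title title title title title |
    sed -n -f "$script")
echo "$count"
rm -f "$script"
```

The only line printed is the count, 11, matching the lone "11" in the trace.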