Formatting git log output with sed/awk/grep

Tags: git, regex, bash, sed, awk

Summary (the gist):

If I have a set of messages, each with a subject [SUB] and a body [BODY] like below, how can I add a newline after the subject only if the body is non-empty, and replace the [SUB] placeholder with *?

[SUB] some subject. [BODY] some body lines 
with newline chars and !@@# bunch of other *#@ chars
 without [(BODY)] or [(SUB)]... and more stuff
[SUB] Another subject. with no body [BODY] 
[SUB] another [BODY] some body.

I want this to be formatted like

* some subject.

some body lines 
with newline chars and !@@# bunch of other *#@ chars
 without [(BODY)] or [(SUB)]... and more stuff
* Another subject. with no body 
* another 

some body.

What I really want to do:

I am trying to auto-generate my CHANGELOG.md file from the git log output. The problem is that I need to add a newline character only if the body of the commit message is non-empty.

The current code looks like this (broken into two lines here):

git log v0.1.0..v0.1.2 --no-merges --pretty=format:'* %s -- %cn | \
[%h](http://github.com/../../commit/%H) %n%b' | grep -v Minor | grep . >> CHANGELOG.md

and here is a sample output:

* Added run information display (0.1.2) -- ... | [f9b1f6c](http://github.com/../../commit/...) 
+ Added runs page to show a list of all the runs and run inforation, include sorting and global filtering.
+ Updated run information display panel on the run-info page
+ Changed the links and their names around.

* Update README.md -- abc | [2a90998](http://github.com/../../commit/...) 

* Update README.md -- xt | [00369bd](http://github.com/../../commit/...) 

As you can see, the lines starting with * are the commits, and the lines starting with + are part of the body of the first commit. Right now the format adds a %n (newline) before every body, whether or not it is empty. I want to add it ONLY if the body is non-empty (ideally after stripping whitespace).

How would I achieve this? My knowledge of sed and awk is almost non-existent, and trying to learn them didn't help much.

(I can make sure all the code in the body is indented, so it won't confuse the list of commits with lists in the body.)


My Answer

I'm sure jthill's answer is correct (and maybe even a better way to do it), but while I was trying to figure out what it meant, I came up with this. I hope it helps me or someone else in the future.

Here is the full shell script that I used:

# $1 = new tag, $2 = previous tag
mv CHANGELOG.md CHANGELOG.md.temp
printf '### Version %s\n\n' "$1" > CHANGELOG.md
# Keep the format in a variable so no backslash/newline ends up inside it
fmt='[SUB]%s -- %cn | [%h](http://github.com/<user>/<gitrepo>/commit/%H) [BODY]%b'
git log "$2..$1" --no-merges --pretty=format:"$fmt" | grep -v Minor | \
    sed '{:q;N;s/[[:space:]]*\[BODY\][[:space:]]*\[SUB\]/\n[SUB]/;b q}' | \
    sed 's/\[SUB\]/* /g' |
    sed 's/\[BODY\]/\n\n/' >> CHANGELOG.md
cat CHANGELOG.md.temp >> CHANGELOG.md
rm CHANGELOG.md.temp

I am basically prepending the new commit log to CHANGELOG.md using the temp file. Please feel free to suggest shorter versions of these three sed commands.
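
One possible shorter version, offered as a sketch rather than a drop-in replacement: the three sed invocations can be folded into a single one that first buffers the whole input and then applies the substitutions in order. The function name format_commits is made up for illustration, and GNU sed is assumed (for -E and for \n in the replacement text). Unlike the script above, it also trims whitespace around [BODY] and drops an empty [BODY] at the very end of the input.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not part of the original script).
# Reads [SUB]/[BODY]-tagged text on stdin and writes the changelog list to stdout.
# Assumes GNU sed. The commands inside the sed script, in order:
#   1. buffer every input line into the pattern space
#   2. drop an empty [BODY] at the very end of the input
#   3. drop an empty [BODY] that is directly followed by the next [SUB]
#   4. turn each remaining (non-empty) [BODY] into a blank line
#   5. turn each [SUB] into a list bullet
format_commits() {
    sed -E '
        1h;1!H;$!d;g
        s/[[:space:]]*\[BODY\][[:space:]]*$//
        s/[[:space:]]*\[BODY\][[:space:]]*\[SUB\]/\n[SUB]/g
        s/[[:space:]]*\[BODY\][[:space:]]*/\n\n/g
        s/\[SUB\][[:space:]]*/* /g
    '
}

# Small demonstration with made-up commits instead of real git log output
printf '[SUB]one [BODY]\n[SUB]two [BODY]body text\n[SUB]three [BODY]\n' | format_commits
```

With that in place, the pipeline in the script would end in "| grep -v Minor | format_commits >> CHANGELOG.md" instead of the three separate sed calls.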

xcorat asked Oct 23 '25 02:10


2 Answers

Tag the structure in the git log output itself. This handles inserting the newlines properly; the rest you know:

git log --pretty=tformat:'%s%xFF%x01%b%xFF%x02' \
| sed '1h;1!H;$!d;g              # buffer it all (see comments for details)
       s/\xFF\x01\xff\x02//g     # strip null bodies
       s/\xFF\x01/\n/g           # insert extra newline before the rest
       s/\xFF.//g                # cleanup
'

(edit: quote/escape typos)
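
To see what the sed script does without a repository at hand, the sketch below runs the same commands on hand-written input shaped like that tformat output. The function name and the two sample commits are made up. It assumes GNU sed, that an empty %b expands to nothing, and that a non-empty %b ends with a newline. LC_ALL=C is an addition here so that \xFF is handled as a plain byte regardless of the locale.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed commands from the answer above.
# Each commit arrives as: subject, \xFF\x01, body, \xFF\x02, newline.
# Assumes GNU sed; LC_ALL=C is an added precaution, not part of the answer.
strip_empty_bodies() {
    LC_ALL=C sed '1h;1!H;$!d;g
        s/\xFF\x01\xFF\x02//g
        s/\xFF\x01/\n/g
        s/\xFF.//g'
}

# Simulated log: one commit with a one-line body, one commit with no body.
# \377, \001 and \002 are the octal spellings of the bytes 0xFF, 0x01 and 0x02.
printf 'first subject\377\001body line\n\377\002\nsecond subject\377\001\377\002\n' |
    strip_empty_bodies
```

For this input, the commit with a body comes out as its subject, then the body on the next line, then an empty line, while the commit without a body comes out as its subject line alone.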

jthill answered Oct 25 '25 16:10


For the first example in your question, you could try the following:

awk -f r.awk input.txt 

where input.txt is the input file, and r.awk is:

{
    line=line $0 ORS
}

END {
    while (getSub()) {
        getBody()
        print "* " subj
        if (body) {
            print ""
            print body
        }
    }
}

function getBody(ind) {
    ind=index(line,"[SUB]")
    if (ind) {
        body=substr(line,1,ind-1)
        line=substr(line,ind)
    }
    else
        body=line
    sub(/^[[:space:]]*/,"",body)
    sub(/[[:space:]]*$/,"",body)
}

function getSub(ind,ind2) {
    ind=index(line,"[SUB]")
    if (ind) {
        ind=ind+5
        ind2=index(line,"[BODY]")
        subj=substr(line, ind, ind2-ind)
        line=substr(line,ind2+6)
        return 1
    }
    else
        return 0
}

gives output:

*  some subject. 

some body lines 
with newline chars and !@@# bunch of other *#@ chars
 without [(BODY)] or [(SUB)]... and more stuff
*  Another subject. with no body 
*  another 

some body.
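
The output above has two spaces after each * and a trailing space on the subject lines, because the subject keeps the blanks that surround it in the input. A minimal variation on the same idea, wrapped in a shell function so it can sit at the end of a pipeline, trims the subject as well as the body. The function name format_tagged is hypothetical, and this is a sketch rather than a replacement for r.awk.

```shell
#!/bin/sh
# Hypothetical wrapper (name made up for illustration); reads tagged text on stdin.
format_tagged() {
    awk '
    { text = text $0 ORS }                      # buffer the whole input
    END {
        # parts[1] is whatever precedes the first [SUB]; it is skipped
        n = split(text, parts, /\[SUB\]/)
        for (i = 2; i <= n; i++) {
            p = index(parts[i], "[BODY]")
            subj = p ? substr(parts[i], 1, p - 1) : parts[i]
            body = p ? substr(parts[i], p + 6) : ""
            # trim leading and trailing blanks, tabs and newlines
            gsub(/^[ \t\n]+|[ \t\n]+$/, "", subj)
            gsub(/^[ \t\n]+|[ \t\n]+$/, "", body)
            print "* " subj
            if (body != "") {                   # blank line only before a real body
                print ""
                print body
            }
        }
    }'
}

# Small demonstration with made-up messages
printf '[SUB] a [BODY] \n[SUB] b [BODY] body1\nbody2\n' | format_tagged
```

It could be used as "format_tagged < input.txt", or at the end of a pipeline that produces the [SUB]/[BODY]-tagged text.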

Håkon Hægland answered Oct 25 '25 17:10