
Remove comments without affecting values in the config file

Tags:

awk

perl

I have a config file and I need to remove the comments, which run from a # to the end of the line. The removal must not affect # characters inside values that are in single or double quotes.

My input file:

# comment1
# comment2
#hbase_table_name=mytable # hbase table.
hbase_table_name=newtable # hbase table.
hbase_txn_family=txn
app_name= "cust#100"  # Name of the application
app_user= 'all#50,all2#100'  # users
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181

The perl command that I'm trying:

perl -0777 -pe ' s/^\s*$//gms ; s/#.*?$//gm; s/^\s*$//gms;s/^$//gm' config.txt

The output I'm getting is

hbase_table_name=newtable
hbase_txn_family=txn
app_name= "cust
app_user= 'all
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181

But the required output is

hbase_table_name=newtable
hbase_txn_family=txn
app_name= "cust#100"
app_user= 'all#50,all2#100'
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181

I'm looking for a solution that runs from bash, using any tool such as awk or perl.

A rare scenario is a config entry whose comment itself contains quotes, like

app_user= 'all#50,all2#100'  # users - "all" of them

and the result should be app_user= 'all#50,all2#100'

stack0114106 asked Jan 24 '26 01:01


2 Answers

Here is a perl script:

#!/usr/bin/perl

use strict;

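while (<DATA>){
    next if m/^\h*#/;    # skip lines that contain only a comment
    # Quoted value: keep everything up to the closing quote
    # ($+[0] is the offset where the match ends).
    if (m/(['"]).*?\1/) { print substr($_, 0, $+[0]), "\n"; next; }
    s/#.*$//; print;     # otherwise cut from the first # to the end of the line
}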

__DATA__
# comment1
# comment2
#hbase_table_name=mytable # hbase table.
hbase_table_name=newtable # hbase table.
hbase_txn_family=txn
app_name= "cust#100"  # Name of the application
#app_name= "cust#100"  # Name of the application
app_user= 'all#50,all2#100'  # users
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
# from comments, other lines
hbase_table_name=newtable ## hbase table.
app_user= 'all#50,all2#100'  # users - "all" of them

Output:

hbase_table_name=newtable 
hbase_txn_family=txn
app_name= "cust#100"
app_user= 'all#50,all2#100'
hbase.zookeeper.quorum=localhost
zookeeper.znode.parent=/hbase-secure
hbase.zookeeper.property.clientPort=2181
hbase_table_name=newtable 
app_user= 'all#50,all2#100'

To use this on a file, change <DATA> to <> and pass the file name as an argument to the script.

dawg answered Jan 25 '26 16:01


Could you please try the following (written and tested with the shown samples).

awk '
/^#/{
  next
}
/".*"|\047.*\047/{
  match($0,/.*#/)
  print substr($0,RSTART,RLENGTH-1)
  next
}
{
  sub(/#.*/,"")
}
1
'  Input_file

Explanation: a detailed, line-by-line explanation of the above code.

awk '                                   ##Start of the awk program.
/^#/{                                   ##If the line starts with #, do the following.
  next                                  ##next skips all further statements for this line.
}
/".*"|\047.*\047/{                      ##If the line contains a double-quoted or single-quoted (\047) string, do the following.
  match($0,/.*#/)                       ##Match everything up to and including the last # in the line.
  print substr($0,RSTART,RLENGTH-1)     ##Print the matched part, with -1 on the length to drop the # itself.
  next                                  ##next skips all further statements for this line.
}
{
  sub(/#.*/,"")                         ##Runs for every remaining line (not starting with # and without a quoted string): remove from the first # to the end.
}
1                                       ##1 prints the edited/non-edited line.
'  Input_file
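
Note that the quoted-line branch relies on the last # in the line being the start of a comment, so a quoted value containing # with no trailing comment would be cut short. A minimal sketch of one way around that, assuming values never contain escaped or unbalanced quotes, is to scan each line character by character and cut at the first # seen outside quotes. The function name `strip_comments_awk` is hypothetical, and this is a different technique from the regex matching above, not a drop-in equivalent.

```shell
# Hypothetical helper: quote-aware comment stripping with a character scanner.
strip_comments_awk() {
  awk '
  {
    quote = ""; out = ""
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (quote == "") {
        if (c == "#") break                            # comment starts outside quotes
        if (c == "\"" || c == "\047") quote = c        # opening quote (\047 is a single quote)
      } else if (c == quote) {
        quote = ""                                     # matching closing quote
      }
      out = out c
    }
    sub(/[ \t]+$/, "", out)                            # trim trailing whitespace
    if (out != "") print out                           # drop comment-only and blank lines
  }' "$@"
}
```

It can be called as `strip_comments_awk Input_file`, or used at the end of a pipeline. Scanning keeps the logic in one place at the cost of being slower than a single regex on very large files.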
RavinderSingh13 answered Jan 25 '26 17:01