I've got a shell script outputting data like this:
1234567890 * 1234567891 *
I need to remove JUST the last three characters " *". I know I can do it via
(whatever) | sed 's/\(.*\).../\1/'
But I DON'T want to use sed for speed purposes. It will always be the same last 3 characters.
Any quick way of cleaning up the output?
In this method, you use the rev command together with cut. The rev command reverses a line of text character by character. Here, rev reverses the string, cut -c 4- removes the first three characters (which were originally the last three), and a second rev reverses the string back again, giving you your output.
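A minimal sketch of that rev/cut pipeline (the sample string here is just an illustration, not the asker's actual output):

```shell
# rev reverses the line, cut -c 4- drops the first three characters
# (formerly the last three), and the second rev restores the order.
echo '987654321XXX' | rev | cut -c 4- | rev
# prints: 987654321
```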
sed is a free and open-source stream editor for Linux, macOS, *BSD, and other Unix-like systems. It is well suited to removing/deleting the last character, or performing other operations, on your files or shell variables.
To remove the last n characters of a string, we can use the parameter expansion syntax ${str::-n} in the Bash shell, where n is the number of characters to remove from the end of the string.
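For example, a quick sketch in Bash (note that the negative-length form ${str::-n} needs Bash 4.2 or later; the sample string is illustrative):

```shell
str='987654321XXX'
# Drop the last three characters via parameter expansion.
echo "${str::-3}"
# prints: 987654321
```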
Here's an old-fashioned unix trick for removing the last 3 characters from a line that makes no use of sed OR awk...
> echo 987654321 | rev | cut -c 4- | rev
987654
Unlike the earlier example using 'cut', this does not require knowledge of the line length.
I can guarantee you that bash alone won't be any faster than sed for this task. Starting up external processes in bash is a generally bad idea, but only if you do it a lot. So, if you were starting a sed process for each line of your input, I'd be concerned. But you're not. You only need to start one sed, which will do all the work for you.
You may however find that the following sed will be a bit faster than your version:

(whatever) | sed 's/...$//'
All this does is remove the last three characters on each line, rather than substituting the whole line with a shorter version of itself. Maybe more modern RE engines can optimise your command, but why take the risk?
To be honest, about the only way I can think of that would be faster would be to hand-craft your own C-based filter program. And the only reason that may be faster than sed is that you can take advantage of the extra knowledge you have about your processing needs (sed has to allow for generalised processing, so it may be slower because of that).
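As a quick sanity check (a sketch; the sample line is made up), both sed forms strip the same three trailing characters:

```shell
# The capture-group version substitutes the whole line with a shorter copy;
# the anchored version just deletes the last three characters.
printf 'abcdefXXX\n' | sed 's/\(.*\).../\1/'
printf 'abcdefXXX\n' | sed 's/...$//'
# both print: abcdef
```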
Don't forget the optimisation mantra: "Measure, don't guess!"
If you really want to do this one line at a time in bash (and I still maintain that it's a bad idea), you can use:
pax> line=123456789abc
pax> line2=${line%%???}
pax> echo ${line2}
123456789
pax> _
You may also want to investigate whether you actually need a speed improvement. If you process the lines as one big chunk, you'll see that sed is plenty fast. Type in the following:
#!/usr/bin/bash
echo This is a pretty chunky line with three bad characters at the end.XXX >qq1
for i in 4 16 64 256 1024 4096 16384 65536 ; do
    cat qq1 qq1 >qq2
    cat qq2 qq2 >qq1
done
head -20000l qq1 >qq2
wc -l qq2
date
time sed 's/...$//' qq2 >qq1
date
head -3l qq1
and run it. Here's the output on my (not very fast at all) R40 laptop:
pax> ./chk.sh
20000 qq2
Sat Jul 24 13:09:15 WAST 2010

real    0m0.851s
user    0m0.781s
sys     0m0.050s
Sat Jul 24 13:09:16 WAST 2010
This is a pretty chunky line with three bad characters at the end.
This is a pretty chunky line with three bad characters at the end.
This is a pretty chunky line with three bad characters at the end.
That's 20,000 lines in under a second, pretty good for something that's only done every hour.