Use zcat and sed or awk to edit compressed .gz text file

I am trying to edit compressed fastq.gz text files by removing the first six characters of lines 2, 6, 10, 14, and so on. I have two ways of doing this, one with awk and one with sed, but they only seem to work on uncompressed files. I would like to edit the files without unzipping them first, and tried the following code without success. Thanks.

Using sed:

zcat /dir/* | sed -i~ '2~4s/^.\{6\}//'

Using awk:

zcat /dir/* | awk 'NR%4==2 {gsub(/^....../,"")} 1'

asked Feb 17 '15 17:02

The Nightman

2 Answers

You can't bypass compression, but you can chain the decompress/edit/recompress together in an automated fashion:

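for f in /dir/*; do
  # Keep a backup copy, then rebuild the original from that backup.
  cp "$f" "$f~" &&
  # 2~4 (every 4th line, starting at line 2) is GNU sed syntax.
  gzip -cd "$f~" | sed '2~4s/^.\{6\}//' | gzip > "$f"
done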

If you're quite confident in the operation, you can remove the backup files by adding rm "$f~" to the end of the loop body.
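For a somewhat more defensive take on the same decompress/edit/recompress idea, here is a minimal sketch that writes to a temporary file and only replaces the original once the whole pipeline has succeeded and the new archive passes gzip -t. The function name trim_fastq_gz and its optional character-count argument are illustrative and not part of the answer above; it assumes bash and GNU sed (for the 2~4 address).

```shell
# Requires bash (local, pipefail) and GNU sed (the 2~4 address).
# trim_fastq_gz is a hypothetical helper name used only for this sketch.
# Usage: trim_fastq_gz FILE.gz [N]   (N defaults to 6)
trim_fastq_gz() {
  local f=$1 n=${2:-6} tmp
  # Create the temporary file next to the original so mv stays on one filesystem.
  tmp=$(mktemp "$f.XXXXXX") || return 1
  # pipefail is set in a subshell so it does not leak into the caller;
  # with it, a failing gzip -cd or sed makes the whole pipeline fail.
  if (set -o pipefail; gzip -cd < "$f" | sed "2~4s/^.\{$n\}//" | gzip > "$tmp") \
     && gzip -t < "$tmp"; then
    mv -- "$tmp" "$f"     # replace the original only after verification
  else
    rm -f -- "$tmp"       # leave the original untouched on any error
    return 1
  fi
}
```

It could then be called in a loop such as: for f in /dir/*.gz; do trim_fastq_gz "$f" || break; done. Lines shorter than the requested count are left unchanged, because the pattern then does not match. Note that mktemp creates the temporary file with restrictive permissions, so the replaced file may end up with a different mode than the original.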

answered Oct 14 '22 04:10

Mark Reed

I wrote a script called zawk which can do this natively. It's similar to glenn jackman's answer to a duplicate of this question, but it handles awk options and several different compression mechanisms and input methods while retaining FILENAME and FNR.

You'd use it like:

zawk 'awk logic goes here' log*.gz

This does not address sed's "in-place" flag (-i).
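If an in-place effect is wanted with awk, one option is to wrap plain awk in a small bash function that rewrites each file through a temporary copy. The sketch below is not zawk and does not reproduce its interface or features (for example, FILENAME is not preserved because awk reads from a pipe); the name gz_awk_inplace is hypothetical.

```shell
# Requires bash. gz_awk_inplace is a hypothetical name, not part of zawk.
# Usage: gz_awk_inplace 'awk program' FILE.gz...
gz_awk_inplace() {
  local prog=$1 f tmp
  shift
  for f in "$@"; do
    tmp=$(mktemp "$f.XXXXXX") || return 1
    # Run the pipeline in a subshell with pipefail so that a failure in
    # any stage (decompress, awk, recompress) is detected.
    if (set -o pipefail; gzip -cd < "$f" | awk "$prog" | gzip > "$tmp"); then
      mv -- "$tmp" "$f"
    else
      rm -f -- "$tmp"     # keep the original file on error
      return 1
    fi
  done
}
```

For the question's case, the call might look like: gz_awk_inplace 'NR % 4 == 2 { $0 = substr($0, 7) } 1' /dir/*.gz. Using substr avoids the regular expression entirely and works in any POSIX awk.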

answered Oct 14 '22 04:10

Adam Katz
