put all separate paragraphs of a file into a separate line

Question

I have a file that contains sequence data, where each new paragraph (separated by two blank lines) contains a new sequence:

#example

ASDHJDJJDMFFMF
AKAKJSJSJSL---
SMSM-....SKSKK
....SK


SKJHDDSNLDJSCC
AK..SJSJSL--HG
AHSM---..SKSKK
-.-GHH

and I want to end up with a file looking like:

ASDHJDJJDMFFMFAKAKJSJSJSL---SMSM-....SKSKK....SK
SKJHDDSNLDJSCCAK..SJSJSL--HGAHSM---..SKSKK-.-GHH

each sequence is the same length (if that helps).

I would also like to do this over multiple files stored in different directories.

I have just tried

sed -e '/./{H;$!d;}' -e 'x;/regex/!d' ./text.txt

however this just deleted the entire file :S

Any help would be appreciated. It doesn't have to be in sed; if you know how to do it in perl or something else then that's also great.

Thanks.

Ed Morton · Accepted Answer

All you're asking to do is convert a file of blank-line-separated records (RS) where each field is separated by newlines into a file of newline-separated records where each field is separated by nothing (OFS). Just set the appropriate awk variables and recompile the record:

$ awk '{$1=$1}1' RS= OFS= file
ASDHJDJJDMFFMFAKAKJSJSJSL---SMSM-....SKSKK....SK
SKJHDDSNLDJSCCAK..SJSJSL--HGAHSM---..SKSKK-.-GHH
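
To try this on a small input, here is a minimal sketch that wraps the same one-liner in a shell function and feeds it two short records. The function name join_paragraphs is made up for illustration, and the variables are set with -v rather than as trailing operands, which should behave the same here. One caveat: with the default FS, recompiling the record also drops any spaces or tabs inside a line. That doesn't matter for the sample data, but setting FS='\n' as well would restrict the joining to newlines.

```shell
# Hypothetical helper name; reads the named files, or stdin if none are given.
join_paragraphs() {
    # RS=  : paragraph mode, records are separated by one or more blank lines
    # OFS= : fields are rejoined with nothing between them
    # $1 = $1 forces awk to rebuild the record using OFS; the trailing 1 prints it
    awk -v RS= -v OFS= '{ $1 = $1 } 1' "$@"
}

# Two short records separated by two blank lines, as in the question.
printf 'ASDH\nAKAK\n\n\nSKJH\nAK..\n' | join_paragraphs
# prints:
# ASDHAKAK
# SKJHAK..
```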

glenn jackman · Answer

awk '
    /^[[:space:]]*$/ {if (line) print line; line=""; next}
    {line=line $0}
    END {if (line) print line}
'
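
A quick way to try the awk script above: the sketch below wraps it in a function (join_blocks is a made-up name) and pipes in a sample. Because the separator test is /^[[:space:]]*$/, a line holding only spaces or tabs also ends a sequence. As far as I know, gawk's paragraph mode (RS=) only treats completely empty lines as separators, so this variant may be the safer choice if the files contain stray whitespace.

```shell
# Hypothetical wrapper around the awk script above; reads files or stdin.
join_blocks() {
    awk '
        # A blank or whitespace-only line ends the current sequence.
        /^[[:space:]]*$/ { if (line) print line; line = ""; next }
        # Any other line is appended to the sequence being built.
        { line = line $0 }
        # Flush the last sequence if the file does not end with a blank line.
        END { if (line) print line }
    ' "$@"
}

# The separator line here holds a space and a tab rather than being empty.
printf 'AB\nCD\n \t\nEF\nGH\n' | join_blocks
# prints:
# ABCD
# EFGH
```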

perl -00 -pe 's/\n//g; $_.="\n"'

For multiple files:

# adjust your glob pattern to suit, 
# don't be shy to ask for assistance
for file in */*.txt; do
    newfile="/some/directory/$(basename "$file")"
    perl -00 -pe 's/\n//g; $_.="\n"' "$file" > "$newfile"
done
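
If the files sit at varying depths, so that a single glob doesn't cover them, a find-based variant is one option. This is only a sketch under some assumptions: the inputs end in .txt, the function name join_tree and the .joined output suffix are made up for illustration, and it uses the awk one-liner from the accepted answer instead of perl to keep the quoting simple. Each result is written next to its input rather than into one directory, which avoids name clashes between files with the same basename.

```shell
# Hypothetical helper: for every *.txt under $1, write NAME.joined beside it.
join_tree() {
    find "$1" -type f -name '*.txt' -exec sh -c '
        for f do
            # \$1 is escaped so the inner shell passes a literal $1 to awk.
            awk -v RS= -v OFS= "{ \$1 = \$1 } 1" "$f" > "${f%.txt}.joined"
        done
    ' sh {} +
}

# Usage (the path is a placeholder):
# join_tree /path/to/data
```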

Tags: sed, perl, text-manipulation

brucezepplin
