When I run the following code
from subprocess import call, check_output, Popen, PIPE
gr = Popen(["grep", "'^>'", myfile], stdout=PIPE)
sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout)
gr.stdout.close()
out = sd.communicate()[0]
print out
Where myfile looks like this:
>name len=345
sometexthere
>name2 len=4523
someothertexthere
...
...
I get
None
When the expected output is a list of numbers:
345
4523
...
...
The corresponding command I run in the terminal is
grep "^>" myfile | sed "s/.*len=//" > outfile
So far, I have tried playing around with escaping and quoting in different ways, such as escaping slashes in the sed or adding extra quotation marks for grep, but the combinatorial possibilities there are large.
I have also considered reading in the file and writing Python equivalents of grep and sed. However, the file is very large (although I could always read it line by line), it will always run on UNIX-based systems, and I am still curious about where I went wrong.
Could it be that
sd.communicate()[0]
returns some kind of object (instead of the list of integers) for which None is the type?
I know I can grab the output with check_output in simple cases:
sam = check_output(["samn", "stats", myfile])
but I am not sure how to make it work in more complicated situations where output is piped between processes.
What are some productive approaches to get the expected results with subprocess?
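As suggested, you need to pass stdout=PIPE to the second process and remove the single quotes from "'^>'". Without stdout=PIPE, communicate() has nothing to capture and returns None for stdout, and because no shell is involved, the single quotes are passed to grep as literal characters: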
gr = Popen(["grep", "^>", myfile], stdout=PIPE)
sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
......
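For reference, the corrected pipeline could be put together roughly as follows. This is a minimal sketch, assuming grep and sed are on the PATH; the function name header_lengths is hypothetical and not part of the original code.

```python
from subprocess import Popen, PIPE


def header_lengths(path):
    """Run the equivalent of: grep '^>' path | sed 's/.*len=//'."""
    # No shell is involved, so the pattern is passed without extra quotes.
    gr = Popen(["grep", "^>", path], stdout=PIPE)
    # stdout=PIPE is what lets communicate() return sed's output;
    # without it, sed writes to the terminal and communicate()[0] is None.
    sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
    gr.stdout.close()  # lets grep receive SIGPIPE if sed exits first
    out = sd.communicate()[0]
    gr.wait()  # reap the grep process
    # communicate() returns bytes; decode and split into one string per line.
    return out.decode().splitlines()
```

Note that communicate()[0] is the raw output of sed, not a list of integers; calling header_lengths(myfile) returns a list of strings, which can be converted with int() if numbers are needed.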
But this can be done simply using pure Python and re:
import re

# Raw string avoids the invalid "\>" escape; ">" needs no escaping in a regex.
r = re.compile(r"^>.*len=(.*)$")
with open("test.txt") as f:
    for line in f:
        m = r.search(line)
        if m:
            print(m.group(1))
Which would output:
345
4523
If the lines that start with > always have the number, and the number is always at the end after len=, then you don't actually need a regex either:
with open("test.txt") as f:
    for line in f:
        if line.startswith(">"):
            # rstrip() drops the trailing newline so print does not add a blank line
            print(line.rsplit("len=", 1)[1].rstrip())
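Since the question asks for a list of numbers and mentions that the file is very large, the same idea could be wrapped in a generator. This is only a sketch: the name read_lengths is hypothetical, and it assumes the text after len= is always a plain integer.

```python
def read_lengths(path):
    """Yield the integer after 'len=' for each line that starts with '>'."""
    with open(path) as f:
        for line in f:  # the file is read lazily, one line at a time
            if line.startswith(">") and "len=" in line:
                # int() ignores surrounding whitespace, including the newline
                yield int(line.rsplit("len=", 1)[1])
```

Calling list(read_lengths("test.txt")) collects everything at once, while iterating over read_lengths("test.txt") directly keeps memory use flat for large files.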