
How to save output from Python as a TSV file

I am using the Biopython package and would like to save the result as a TSV file. How can I write the output of this print call to a TSV file?

from Bio import SeqIO

for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
    print("%s %s %s" % (record.id, record.seq, record.format("qual")))

Thank you.

Vonton asked Apr 27 '15 12:04




2 Answers

My preferred solution is to use the csv module. It's part of the standard library, so:

  • Somebody else has already done all the heavy lifting.
  • It allows you to leverage all the functionality of the CSV module.
  • You can be fairly confident it will function as expected (not always the case when I write it myself).
  • You're not going to have to reinvent the wheel, either when you write the file or when you read it back in on the other end (I don't know your record format, but if one of your records contains a TAB, the csv module will quote that field for you so it still reads back correctly).
  • It will be easier to support when the next person has to go in to update the code 5 years after you've left the company.
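The point about embedded TABs can be checked with a small round trip. This is only a sketch: the rows are made-up values, and it writes to an in-memory buffer rather than a real file.

```python
import csv
import io

# Hypothetical rows; the second field of the first row contains a literal TAB.
rows = [["read_1", "ACGT\tNNNN"], ["read_2", "TTGA"]]

# Write to an in-memory buffer so the sketch leaves nothing on disk.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)

# With the default quoting, the field holding the TAB is wrapped in quotes.
tsv_text = buffer.getvalue()
print(tsv_text)

# Reading back with the same delimiter recovers the original fields.
round_trip = list(csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t"))
print(round_trip == rows)
```

A hand-rolled "\t".join(...) would have written that first row as three columns instead of two.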

The following code snippet should do the trick for you:

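#!/usr/bin/env python3
import csv

from Bio import SeqIO

# newline='' lets the csv module control line endings itself
with open('records.tsv', 'w', newline='') as tsvfile:
    # csv.writer takes lineterminator, not newline
    writer = csv.writer(tsvfile, delimiter='\t', lineterminator='\n')
    for record in SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq"):
        # str() turns the Seq object into an ordinary string for the row
        writer.writerow([record.id, str(record.seq), record.format("qual")])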

Note that this is for Python 3.x. If you're using 2.x, the open and writer = ... lines will be slightly different (for example, Python 2's csv module expects the file to be opened in binary mode, 'wb').
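One thing to watch: record.format("qual") returns a multi-line QUAL-style block (a header line plus the scores), so the third column ends up as a quoted cell with line breaks in it. If you would rather have one plain line per read, a possible variation is to take the scores from record.letter_annotations["phred_quality"], which Biopython fills in when parsing FASTQ. The sketch below assumes that layout; write_records_tsv, the column names, and the stand-in record are hypothetical, used so the example runs without a FASTQ file.

```python
import csv
import os
import tempfile
from types import SimpleNamespace


def write_records_tsv(records, path):
    """Write one row per record: id, sequence, space-separated quality scores."""
    with open(path, "w", newline="") as tsvfile:
        writer = csv.writer(tsvfile, delimiter="\t", lineterminator="\n")
        writer.writerow(["id", "sequence", "quality"])
        for record in records:
            # Per-base Phred scores as a list of ints.
            scores = record.letter_annotations["phred_quality"]
            writer.writerow([record.id, str(record.seq), " ".join(map(str, scores))])


# Stand-in for a Biopython SeqRecord (assumption, for illustration only).
fake = SimpleNamespace(
    id="read_1",
    seq="ACGT",
    letter_annotations={"phred_quality": [30, 31, 32, 33]},
)

path = os.path.join(tempfile.mkdtemp(), "records_demo.tsv")
write_records_tsv([fake], path)
```

With Biopython installed, you would pass SeqIO.parse("/home/fil/Desktop/420_2_03_074.fastq", "fastq") as the records argument instead of the stand-in list.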

Deacon answered Oct 10 '22 04:10


If you want to use the .tsv file to label your word embeddings in TensorBoard, use the following snippet. It uses the csv module (see Deacon's answer above).

#!/usr/bin/env python3
import csv

def save_vocabulary(word_count):
    # word_count: list of (word, count) tuples
    label_file = "word2context/labels.tsv"  # the word2context/ directory must already exist
    with open(label_file, 'w', encoding='utf8', newline='') as tsv_file:
        tsv_writer = csv.writer(tsv_file, delimiter='\t', lineterminator='\n')
        tsv_writer.writerow(["Word", "Count"])  # header row
        for word, count in word_count:
            tsv_writer.writerow([word, count])

word_count is a list of tuples like this:

[('the', 222594), ('to', 61479), ('in', 52540), ('of', 48064) ... ]
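If you don't already have such a list, one way to build it is collections.Counter from the standard library. This is a minimal sketch on a made-up toy corpus, not necessarily how the counts above were produced.

```python
from collections import Counter

# Hypothetical toy corpus; any iterable of tokens would work here.
tokens = "the cat sat on the mat and the dog sat too".split()

# most_common() returns (word, count) tuples sorted by descending count.
word_count = Counter(tokens).most_common()
print(word_count[:2])  # [('the', 3), ('sat', 2)]
```

The resulting list has the same shape as the one shown above, so it can be written out row by row in the same way.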
Domi W answered Oct 10 '22 04:10