How can I use readline() to begin from the second line?

Tags:

python

I'm writing a short program in Python that reads a FASTA file, which usually looks like this:

>gi|253795547|ref|NC_012960.1| Candidatus Hodgkinia cicadicola Dsem chromosome, 52 lines
GACGGCTTGTTTGCGTGCGACGAGTTTAGGATTGCTCTTTTGCTAAGCTTGGGGGTTGCGCCCAAAGTGA
TTAGATTTTCCGACAGCGTACGGCGCGCGCTGCTGAACGTGGCCACTGAGCTTACACCTCATTTCAGCGC
TCGCTTGCTGGCGAAGCTGGCAGCAGCTTGTTAATGCTAGTGTTGGGCTCGCCGAAAGCTGGCAGGTCGA

I've created another program that reads the first line (a.k.a. the header) of this FASTA file, and now I want this second program to start reading and printing from the sequence onward.

How would I do that?

So far I have this:

FASTA = open("test.txt", "r")

def readSeq(FASTA):
    """returns the DNA sequence of a FASTA file"""
    for line in FASTA:
        line = line.strip()
        print line


readSeq(FASTA)

Thanks guys

-Noob

asked Apr 22 '11 05:04

Francis

3 Answers

def readSeq(FASTA):
    """returns the DNA sequence of a FASTA file"""
    _unused = FASTA.next() # skip heading record
    for line in FASTA:
        line = line.strip()
        print line

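Read the docs on file.next() to see why you should be wary of mixing file.readline() with for line in file: in Python 2 the file iterator uses a hidden read-ahead buffer, so calling readline() after iteration has started can skip over data the iterator has already buffered.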
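For readers on Python 3, where the file.next() method no longer exists, the same idea can be written with the built-in next(). The following is only a minimal sketch: the name read_seq and the in-memory sample are made up for illustration, and io.StringIO stands in for open("test.txt") so the snippet runs on its own.

```python
import io

def read_seq(handle):
    """Return the sequence lines of a single-record FASTA handle, header skipped."""
    # Consume the header line through the iterator protocol; the default of
    # None avoids a StopIteration if the file is empty.
    next(handle, None)
    return [line.strip() for line in handle]

# Hypothetical sample data; io.StringIO replaces a real file for this sketch.
sample = io.StringIO(
    u">gi|253795547|ref|NC_012960.1| example header\n"
    u"GACGGCTTGT\n"
    u"TTAGATTTTC\n"
)
lines = read_seq(sample)
print("\n".join(lines))
```

Because the header is skipped with next() rather than readline(), the skip and the loop both go through the same iterator, which sidesteps the mixing problem.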

answered Oct 26 '22 06:10

John Machin

You should show your script. To read from the second line, something like this:

f=open("file")
f.readline()
for line in f:
    print line
f.close()
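If more than one leading line ever needs to be skipped, itertools.islice may be a tidier variation on the same approach. This is a sketch under assumptions: lines_from and its start parameter are hypothetical names, and the temporary file exists only so the example is self-contained.

```python
import itertools
import os
import tempfile

def lines_from(path, start=1):
    """Return the stripped lines of `path`, skipping the first `start` lines."""
    with open(path) as f:  # the with-block closes the file even if an error occurs
        return [line.strip() for line in itertools.islice(f, start, None)]

# Write a tiny throwaway FASTA-like file so the sketch runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(">header line\nGACGGC\nTTAGAT\n")
    path = tmp.name

body = lines_from(path)
os.remove(path)
print(body)
```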

answered Oct 26 '22 05:10

kurumi

You might be interested in checking Biopython's handling of FASTA files (source).

def FastaIterator(handle, alphabet = single_letter_alphabet, title2ids = None):
    """Generator function to iterate over Fasta records (as SeqRecord objects).

    handle - input file
    alphabet - optional alphabet
    title2ids - A function that, when given the title of the FASTA
    file (without the beginning >), will return the id, name and
    description (in that order) for the record as a tuple of strings.

    If this is not given, then the entire title line will be used
    as the description, and the first word as the id and name.

    Note that use of title2ids matches that of Bio.Fasta.SequenceParser
    but the defaults are slightly different.
    """
    #Skip any text before the first record (e.g. blank lines, comments)
    while True:
        line = handle.readline()
        if line == "" : return #Premature end of file, or just empty?
        if line[0] == ">":
            break

    while True:
        if line[0]!=">":
            raise ValueError("Records in Fasta files should start with '>' character")
        if title2ids:
            id, name, descr = title2ids(line[1:].rstrip())
        else:
            descr = line[1:].rstrip()
            id = descr.split()[0]
            name = id

        lines = []
        line = handle.readline()
        while True:
            if not line : break
            if line[0] == ">": break
            #Remove trailing whitespace, and any internal spaces
            #(and any embedded \r which are possible in mangled files
            #when not opened in universal read lines mode)
            lines.append(line.rstrip().replace(" ","").replace("\r",""))
            line = handle.readline()

        #Return the record and then continue...
        yield SeqRecord(Seq("".join(lines), alphabet),
                         id = id, name = name, description = descr)

        if not line : return #StopIteration
    assert False, "Should not reach this line"
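The core of that generator can be approximated with the standard library alone. The sketch below is not Biopython's code: iter_fasta is a hypothetical name, it yields plain (title, sequence) tuples instead of SeqRecord objects, and it assumes every record title starts with ">".

```python
import io

def iter_fasta(handle):
    """Yield (title, sequence) tuples from a FASTA handle."""
    title, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None and line:
            # Sequence line: drop internal spaces, as the code above does.
            chunks.append(line.replace(" ", ""))
        # Anything before the first header (comments, blank lines) is ignored.
    if title is not None:
        yield title, "".join(chunks)

# Hypothetical two-record sample; io.StringIO replaces a real file.
sample = io.StringIO(u"; comment\n>seq1 first\nGACG\nGCTT\n>seq2 second\nTTAG\n")
records = list(iter_fasta(sample))
print(records)
```

For a single-record file like the one in the question, the first tuple's second element is the whole sequence joined into one string.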

answered Oct 26 '22 04:10

dting
