
How to get the rank-specific taxonomic ids for kingdom, phylum, class, order, family, genus and species from a taxid?

I have a list of taxids that looks like this:

1204725
2162
1300163
420247

For each of the taxids above, I would like to get a file listing the taxonomic id at each of these ranks, in this order:

kingdom_id      phylum_id       class_id        order_id        family_id       genus_id        species_id   

I am using the "ete3" package. I use its ete-ncbiquery tool, which reports the lineage of each taxid. I run it on my Linux laptop with the command below:

ete3 ncbiquery --search 1204725 2162 13000163 420247 --info 

The result looks like this:

# Taxid Sci.Name    Rank    Named Lineage   Taxid Lineage
2162    Methanobacterium formicicum species root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobacterium,Methanobacterium formicicum   1,131567,2157,28890,183925,2158,2159,2160,2162
1204725 Methanobacterium formicicum DSM 3637    no rank root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobacterium,Methanobacterium formicicum,Methanobacterium formicicum DSM 3637  1,131567,2157,28890,183925,2158,2159,2160,2162,1204725
420247  Methanobrevibacter smithii ATCC 35061   no rank root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobrevibacter,Methanobrevibacter smithii,Methanobrevibacter smithii ATCC 35061   1,131567,2157,28890,183925,2158,2159,2172,2173,420247

I have no idea which of these IDs, if any, correspond to the ranks I am looking for.
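One way to see which ID belongs to which name is to pair the "Named Lineage" and "Taxid Lineage" columns position by position. This is only a minimal sketch: the helper name `pair_lineage` is made up for illustration, and it assumes the real output separates the five columns with tabs and that no scientific name contains a comma. It does not yet say which rank each ID has.

```python
# One row of the `ete3 ncbiquery --info` result, copied from the output above.
# Assumption: the five columns are tab-separated in the real output.
row = "\t".join([
    "2162",
    "Methanobacterium formicicum",
    "species",
    "root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,"
    "Methanobacteriales,Methanobacteriaceae,Methanobacterium,"
    "Methanobacterium formicicum",
    "1,131567,2157,28890,183925,2158,2159,2160,2162",
])


def pair_lineage(line):
    """Return (taxid, name) pairs for every entry in one row's lineage.

    Hypothetical helper: it relies on both lineage columns listing the
    same taxa in the same order.
    """
    _taxid, _name, _rank, named_lineage, taxid_lineage = line.rstrip("\n").split("\t")
    names = named_lineage.split(",")
    ids = [int(t) for t in taxid_lineage.split(",")]
    return list(zip(ids, names))


pairs = pair_lineage(row)
for tid, name in pairs:
    print(tid, name)
```

Each printed line then shows one ancestor's taxid next to its scientific name, from the root down to the queried taxon.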

asked Apr 08 '16 15:04 by aLbAc


1 Answer

The following code:

import csv
from ete3 import NCBITaxa

ncbi = NCBITaxa()

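def get_desired_ranks(taxid, desired_ranks):
    # Taxids of every ancestor of the query taxid, plus the taxid itself.
    lineage = ncbi.get_lineage(taxid)
    # Map each taxid in the lineage to its rank name, e.g. {2160: 'genus'}.
    lineage2ranks = ncbi.get_rank(lineage)
    # Invert the mapping so that a rank name looks up its taxid.
    ranks2lineage = dict((rank, taxid) for (taxid, rank) in lineage2ranks.items())
    # Ranks that do not occur in the lineage are reported as '<not present>'.
    return {'{}_id'.format(rank): ranks2lineage.get(rank, '<not present>') for rank in desired_ranks}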

def main(taxids, desired_ranks, path):
    with open(path, 'w') as csvfile:
        fieldnames = ['{}_id'.format(rank) for rank in desired_ranks]
        writer = csv.DictWriter(csvfile, delimiter='\t', fieldnames=fieldnames)
        writer.writeheader()
        for taxid in taxids:
            writer.writerow(get_desired_ranks(taxid, desired_ranks))

if __name__ == '__main__':
    taxids = [1204725, 2162,  1300163, 420247]
    desired_ranks = ['kingdom', 'phylum', 'class', 'order', 'family', 'genus', 'species']
    path = 'taxids.csv'
    main(taxids, desired_ranks, path)

Produces a file that looks like this:

kingdom_id  phylum_id   class_id    order_id    family_id   genus_id    species_id
<not present>   28890   183925  2158    2159    2160    2162
<not present>   28890   183925  2158    2159    2160    2162
<not present>   28890   183925  2158    2159    2160    2162
<not present>   28890   183925  2158    2159    2172    2173
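The kingdom_id column is empty because none of these archaeal lineages contains a taxon with the rank "kingdom". If a top-level ID is still wanted in that column, one option is to fall back to another rank name. The sketch below is not part of ete3: `RANK_FALLBACKS` and `pick_rank_ids` are hypothetical names, and the choice of "superkingdom" and "domain" as fallbacks is an assumption about how the installed NCBI taxonomy dump labels Archaea (taxid 2157). The input dictionary is a hand-written stand-in for what `ncbi.get_rank(lineage)` returns, so the example runs without the taxonomy database.

```python
# Rank names tried in turn for a requested column. The fallbacks for
# 'kingdom' are an assumption of this sketch, not ete3 behaviour.
RANK_FALLBACKS = {"kingdom": ["kingdom", "superkingdom", "domain"]}


def pick_rank_ids(lineage2ranks, desired_ranks, missing="<not present>"):
    """Map each desired rank to a taxid, trying fallback rank names if needed.

    lineage2ranks has the shape {taxid: rank_name}.
    """
    ranks2lineage = {rank: taxid for taxid, rank in lineage2ranks.items()}
    result = {}
    for rank in desired_ranks:
        candidates = RANK_FALLBACKS.get(rank, [rank])
        # Use the first candidate rank that occurs in this lineage.
        found = next(
            (ranks2lineage[r] for r in candidates if r in ranks2lineage),
            missing,
        )
        result["{}_id".format(rank)] = found
    return result


# Hand-written stand-in for the ranks of the lineage of taxid 2162.
# The labels for 1, 131567 and 2157 are assumed; they depend on the
# taxonomy release in use.
example = {
    1: "no rank", 131567: "no rank", 2157: "superkingdom",
    28890: "phylum", 183925: "class", 2158: "order",
    2159: "family", 2160: "genus", 2162: "species",
}
ranks = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]
row = pick_rank_ids(example, ranks)
print(row)
```

In the answer's script, the same idea could replace the body of get_desired_ranks by passing it the result of ncbi.get_rank(lineage).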
answered Sep 19 '22 17:09 by BioGeek