I'm trying to find the gene_info file with gene names and chromosomal locations. However, I can't seem to locate it on the NCBI FTP site. Can anyone give me a pointer?
The Gene database is a resource of the National Center for Biotechnology Information (NCBI) that centralizes gene-related information into individual records (1).
NCBI currently computes the position of genes and exons when an annotation is released. The results are available from the Genomes FTP site, ftp://ftp.ncbi.nlm.nih.gov/genomes/.
From the NCBI home page, click on the Search pull-down menu to select the Gene database, type the gene name in the text box and click Go. See Gene Help for tips on searching Gene. Locate the desired Gene record in the results and click the symbol to open the record.
A genetic database is one or more sets of genetic data (genes, gene products, variants, phenotypes) stored together with software to enable users to retrieve genetic data, add genetic data and extract information from the data.
See ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/README for details of which files at the NCBI FTP site contain what.
If you want to get the data from NCBI itself, you will need to combine multiple files: probably gene2accession (which also includes position information) and gene_info, which maps IDs to symbols, names, etc.
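As a rough illustration of combining the two files, here is a minimal Python sketch that joins them on GeneID. The column names are my assumption of what the header lines contain, so check them against the README and your downloaded files. The sample rows are invented purely so the snippet runs on its own.

```python
import csv
import io

# Column names below are assumptions; confirm them against the header line
# of your downloaded files and the README.
ACC_COL = "genomic_nucleotide_accession.version"
START_COL = "start_position_on_the_genomic_accession"
END_COL = "end_position_on_the_genomic_accession"


def read_tsv(handle):
    """Return a reader yielding one dict per row of a tab-delimited file."""
    return csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def join_gene_tables(gene_info_handle, gene2accession_handle):
    """Attach symbol and chromosome from gene_info to gene2accession positions."""
    info = {row["GeneID"]: row for row in read_tsv(gene_info_handle)}
    merged = []
    for row in read_tsv(gene2accession_handle):
        gene = info.get(row["GeneID"])
        # "-" marks a missing value; skip rows without a genomic position.
        if gene is None or row[START_COL] == "-":
            continue
        merged.append({
            "GeneID": row["GeneID"],
            "Symbol": gene["Symbol"],
            "chromosome": gene["chromosome"],
            "accession": row[ACC_COL],
            "start": int(row[START_COL]),
            "end": int(row[END_COL]),
        })
    return merged


# Invented two-gene sample, trimmed to the columns used above.
GENE_INFO = "\n".join([
    "\t".join(["#tax_id", "GeneID", "Symbol", "chromosome"]),
    "\t".join(["9606", "1001", "GENE_A", "7"]),
    "\t".join(["9606", "1002", "GENE_B", "X"]),
]) + "\n"

GENE2ACCESSION = "\n".join([
    "\t".join(["#tax_id", "GeneID", ACC_COL, START_COL, END_COL]),
    "\t".join(["9606", "1001", "ACC_0001.1", "100", "250"]),
    "\t".join(["9606", "1002", "-", "-", "-"]),
]) + "\n"

rows = join_gene_tables(io.StringIO(GENE_INFO), io.StringIO(GENE2ACCESSION))
for row in rows:
    print(row)
```

For the real files, which are gzip-compressed, you would pass handles from `gzip.open(path, "rt")` instead of the in-memory strings. The full files are large, so filtering gene_info by tax_id first keeps the lookup table small.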
It is probably more convenient to go to the UCSC site for this information. They also provide a public MySQL database if you want to explore what is available: http://workshops.arl.arizona.edu/sql1/sql_workshop/mysql/mysqlclient.html
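To give an idea of what exploring the UCSC database might look like, this small Python sketch only builds the `mysql` client command and does not run it. The assembly (`hg38`), the `refGene` table and the selected columns are choices I made for illustration. The host and user are the ones UCSC documents for its public server, but check their current documentation before relying on them.

```python
import shlex


def ucsc_gene_query_command(assembly="hg38", limit=10):
    """Build (but do not run) a mysql client command for the UCSC public server."""
    # refGene and its columns are an illustrative choice; other gene tables exist.
    sql = (
        "SELECT name2 AS symbol, chrom, txStart, txEnd, strand "
        f"FROM refGene LIMIT {int(limit)}"
    )
    return [
        "mysql",
        "--user=genome",
        "--host=genome-mysql.soe.ucsc.edu",
        "-A",          # skip auto-rehash so the client starts faster
        "-e", sql,     # run this statement and exit
        assembly,      # database name, one per genome assembly
    ]


cmd = ucsc_gene_query_command()
print(shlex.join(cmd))
```

Paste the printed command into a shell with the MySQL client installed, or pass `cmd` to `subprocess.run` if you prefer to stay in Python.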
If you just want human, mouse or rat data then the Rat Genome Database has already compiled the data you want (fresh from the NCBI and Ensembl sources): ftp://rgd.mcw.edu/pub/data_release
e.g. for human data look at: ftp://rgd.mcw.edu/pub/data_release/GENES_HUMAN.txt