
sqlalchemy: select specific columns from multiple join using aliases

This has had me stumped for more than a day now, and the examples I could find have not worked. I am new to SQLAlchemy and I do not find the documentation very enlightening.

The query (so far):

from sqlalchemy import alias, desc, or_

# tap and ensembl_genes are Table objects created from the database metadata
prey = alias(ensembl_genes, name='prey')
bait = alias(ensembl_genes, name='bait')
query = db.session.query(tap,prey,bait).\
    join(prey, tap.c.TAP_PREY_ENSEMBL_GENE_ID==prey.c.ENSEMBL_GENE_ID).\
    join(bait, tap.c.TAP_BAIT_ENSEMBL_GENE_ID==bait.c.ENSEMBL_GENE_ID).\
    filter(
      or_(
        tap.c.TAP_PREY_ENSEMBL_GENE_ID=='ENSG00000100360',
        tap.c.TAP_BAIT_ENSEMBL_GENE_ID=='ENSG00000100360'
      )
    ).\
    order_by(desc(tap.c.TAP_UNIQUE_PEPTIDE_COUNT))

tap refers to a table of interacting genes. One interactor is designated the 'bait' and the other the 'prey'. Prey and Bait are aliases for the same table that holds additional information on these genes. The objective is to select all interactions with a given gene 'ENSG00000100360' as either bait or prey.

The problem:

This query returns about 20 or so columns, but I need only six specific ones, two from each of the three tables in the query (I'd like to rename them as well). From examples found on the interwebz I thought I should add:

  options(
      Load(tap).load_only('TAP_UNIQUE_PEPTIDE_COUNT','TAP_SEQUENCE_COVERAGE'),
      Load(prey).load_only('ENSEMBL_GENE_SYMBOL','ENSEMBL_GENE_ID'),
      Load(bait).load_only('ENSEMBL_GENE_SYMBOL','ENSEMBL_GENE_ID')
    )

But this gives me the following error:

File "/Users/jvandam/Github/syscilia/tools/BDT/quest/blueprints/genereport.py", line 246, in createTAPMSView
    Load(tap).load_only('TAP_UNIQUE_PEPTIDE_COUNT','TAP_SEQUENCE_COVERAGE')
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/orm/strategy_options.py", line 82, in __init__
    self.path = insp._path_registry
AttributeError: 'Table' object has no attribute '_path_registry'

I have not been able to find anything on Google about how to deal with this. The SQLAlchemy table objects are created from the database table metadata.

What I am trying to emulate using SQLAlchemy ORM statements is:

SELECT
prey.ENSEMBL_GENE_SYMBOL AS PREY_ENSEMBL_GENE_SYMBOL,
prey.ENSEMBL_GENE_ID AS PREY_ENSEMBL_GENE_ID,
bait.ENSEMBL_GENE_SYMBOL AS BAIT_ENSEMBL_GENE_SYMBOL,
bait.ENSEMBL_GENE_ID AS BAIT_ENSEMBL_GENE_ID,
t.TAP_UNIQUE_PEPTIDE_COUNT AS UNIQUE_PEPTIDE_COUNT,
t.TAP_SEQUENCE_COVERAGE AS SEQUENCE_COVERAGE
FROM TAP AS t
INNER JOIN ENSEMBL_GENES AS prey
  ON t.TAP_PREY_ENSEMBL_GENE_ID=prey.ENSEMBL_GENE_ID
INNER JOIN ENSEMBL_GENES AS bait
  ON t.TAP_BAIT_ENSEMBL_GENE_ID=bait.ENSEMBL_GENE_ID
WHERE
  t.TAP_PREY_ENSEMBL_GENE_ID='ENSG00000100360' 
  OR t.TAP_BAIT_ENSEMBL_GENE_ID='ENSG00000100360'
ORDER BY t.TAP_UNIQUE_PEPTIDE_COUNT DESC

Can anyone help me fix my query? Thanks in advance! John

John van Dam asked Jul 08 '15 11:07


1 Answer

Just replace this part, db.session.query(tap,prey,bait).\, with the code below. Because tap, prey and bait are Table objects rather than mapped classes, their columns are accessed through .c:

# @note: select_from(tap) is needed so that FROM and JOINs are in the desired order
db.session.query(
    prey.c.ENSEMBL_GENE_SYMBOL.label("PREY_ENSEMBL_GENE_SYMBOL"),
    prey.c.ENSEMBL_GENE_ID.label("PREY_ENSEMBL_GENE_ID"),
    bait.c.ENSEMBL_GENE_SYMBOL.label("BAIT_ENSEMBL_GENE_SYMBOL"),
    bait.c.ENSEMBL_GENE_ID.label("BAIT_ENSEMBL_GENE_ID"),
    tap.c.TAP_UNIQUE_PEPTIDE_COUNT.label("UNIQUE_PEPTIDE_COUNT"),
    tap.c.TAP_SEQUENCE_COVERAGE.label("SEQUENCE_COVERAGE"),
).\
select_from(tap).\

This will select only the columns you need.
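
For reference, here is a minimal, self-contained sketch of the whole query with labelled columns, run against an in-memory SQLite database. The table definitions (including the TAP_ID key), the sample rows, the gene symbols and the two extra gene IDs are assumptions made up for illustration; only the column names and the gene 'ENSG00000100360' come from the question. As far as I can tell, load_only() fails in the question because Load() expects an ORM-mapped entity, whereas tap, prey and bait are plain Table objects, so listing the columns explicitly is the simpler route.

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, desc, or_)
from sqlalchemy.orm import Session

# Hypothetical schema: only the columns named in the question are modelled.
metadata = MetaData()
ensembl_genes = Table(
    "ENSEMBL_GENES", metadata,
    Column("ENSEMBL_GENE_ID", String, primary_key=True),
    Column("ENSEMBL_GENE_SYMBOL", String),
)
tap = Table(
    "TAP", metadata,
    Column("TAP_ID", Integer, primary_key=True),  # assumed surrogate key
    Column("TAP_PREY_ENSEMBL_GENE_ID", String),
    Column("TAP_BAIT_ENSEMBL_GENE_ID", String),
    Column("TAP_UNIQUE_PEPTIDE_COUNT", Integer),
    Column("TAP_SEQUENCE_COVERAGE", Integer),
)

engine = create_engine("sqlite://")  # in-memory database, illustration only
metadata.create_all(engine)
session = Session(bind=engine)

# Made-up rows so that the query has something to return.
session.execute(ensembl_genes.insert(), [
    {"ENSEMBL_GENE_ID": "ENSG00000100360", "ENSEMBL_GENE_SYMBOL": "GENE_A"},
    {"ENSEMBL_GENE_ID": "ENSG_FAKE_1", "ENSEMBL_GENE_SYMBOL": "GENE_B"},
    {"ENSEMBL_GENE_ID": "ENSG_FAKE_2", "ENSEMBL_GENE_SYMBOL": "GENE_C"},
])
session.execute(tap.insert(), [
    {"TAP_PREY_ENSEMBL_GENE_ID": "ENSG00000100360",
     "TAP_BAIT_ENSEMBL_GENE_ID": "ENSG_FAKE_1",
     "TAP_UNIQUE_PEPTIDE_COUNT": 5, "TAP_SEQUENCE_COVERAGE": 12},
    {"TAP_PREY_ENSEMBL_GENE_ID": "ENSG_FAKE_2",
     "TAP_BAIT_ENSEMBL_GENE_ID": "ENSG00000100360",
     "TAP_UNIQUE_PEPTIDE_COUNT": 9, "TAP_SEQUENCE_COVERAGE": 30},
    {"TAP_PREY_ENSEMBL_GENE_ID": "ENSG_FAKE_1",
     "TAP_BAIT_ENSEMBL_GENE_ID": "ENSG_FAKE_2",
     "TAP_UNIQUE_PEPTIDE_COUNT": 7, "TAP_SEQUENCE_COVERAGE": 20},
])

# Two aliases of the same gene table, one per role in the interaction.
prey = ensembl_genes.alias("prey")
bait = ensembl_genes.alias("bait")
gene_id = "ENSG00000100360"

query = (
    session.query(
        # Table columns live under .c; label() renames them in the result.
        prey.c.ENSEMBL_GENE_SYMBOL.label("PREY_ENSEMBL_GENE_SYMBOL"),
        prey.c.ENSEMBL_GENE_ID.label("PREY_ENSEMBL_GENE_ID"),
        bait.c.ENSEMBL_GENE_SYMBOL.label("BAIT_ENSEMBL_GENE_SYMBOL"),
        bait.c.ENSEMBL_GENE_ID.label("BAIT_ENSEMBL_GENE_ID"),
        tap.c.TAP_UNIQUE_PEPTIDE_COUNT.label("UNIQUE_PEPTIDE_COUNT"),
        tap.c.TAP_SEQUENCE_COVERAGE.label("SEQUENCE_COVERAGE"),
    )
    .select_from(tap)  # start FROM at TAP so the joins attach to it
    .join(prey, tap.c.TAP_PREY_ENSEMBL_GENE_ID == prey.c.ENSEMBL_GENE_ID)
    .join(bait, tap.c.TAP_BAIT_ENSEMBL_GENE_ID == bait.c.ENSEMBL_GENE_ID)
    .filter(or_(tap.c.TAP_PREY_ENSEMBL_GENE_ID == gene_id,
                tap.c.TAP_BAIT_ENSEMBL_GENE_ID == gene_id))
    .order_by(desc(tap.c.TAP_UNIQUE_PEPTIDE_COUNT))
)

rows = query.all()
for row in rows:
    # Each result row exposes the six labels as attributes.
    print(row.PREY_ENSEMBL_GENE_SYMBOL, row.BAIT_ENSEMBL_GENE_SYMBOL,
          row.UNIQUE_PEPTIDE_COUNT, row.SEQUENCE_COVERAGE)
```

Printing str(query) shows the generated SQL, which is a quick way to check that the aliases and labels match the hand-written SELECT in the question.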

van answered Oct 16 '22 17:10
