I am learning SQLAlchemy and I am stuck. I have a SQL table (table1) has two fields: 'name' and 'other_names'
I have an excel file with two columns:
first_name alias
paul patrick
john joe
simon simone
john joey
john jo
I want to read the excel file into my table1, so that it looks like this (i.e. all of the aliases for the same line are on one row):
paul patrick
john joe,joey,jo
simon simone
This is the idea that I was trying to do. The code (with comments) that I tried:
for line in open('file.txt', 'r'): #for each line in the excel file
line = line.strip().split('\t') #split each line with a name and alias
first_name = line[0] #first name is the name before the tab
alias = line[1] #alias is the name after the tab
instance =
Session.query(session,tbs['table1'].name).filter_by(name=first_name) #look through the database table, by name field, and see if the first name is there
list_instance = [x[0] for x in instance] #make a list of first names already in database table
if first_name not in list_instance: #if the excel first name is not in the database table
alias_list = [] #make an empty list
alias_list.append(alias) #append the alias
name_obj = lib.get_or_create( #small function to make db object
session,
tbs["table1"],
name = first_name, #add first name to the name field
other_names = alias_list # add alias list to the other_names field
)
elif first_name in list_instance: #elif first name already in db
alias_list.append(alias) #append the alias to the alias list made above
name_obj = lib.get_or_create(
session,
tbs["table1"],
name = first_name,
other_names = alias_list #create object as before, but use updated alias list
)
The problem is that I can get the above code to run with no errors, but also the output is not an appended list, it is simply a database table that looks like the excel file; i.e.
name alias
paul patrick
john joe
simon simone
john joey
john jo
Could someone point out where I am going wrong, specifically, how do i amend this code? Please let me know if the question is unclear, I've tried to make it a simple example. Specifically, how do I initialise and add to lists as a field entry in a SQLalchemy db table.
Update 1: I have updated my code according to kind suggestion below. However I still have the issue. This is the full aim, code and test file: The aim:
I have a table in the database (see below for test file going into table).The table has two fields, name (the latin name e.g. homo sapiens) and other names (the common names e.g. human, man). I want to update a field (other names) in the table, so instead of having:
Rana rugosa human
Rana rugosa man
Rana rugosa frog
Rana rugosa cow
I have:
Rana rugosa human,man,frog,cow
The test_data file looks like this:
origin_organism common_name tested_organism
Rana rugosa human -
Rana rugosa man -
Rana rugosa frog homo sapiens
Rana rugosa cow Rana rugosa
Rana rugosa frog Rana rugosa
Rana rugosa frog -
Rana rugosa frog -
Rana rugosa frog homo sapiens
- - -
- - homo sapiens
- - -
- - -
- - -
- - -
streptococcus pneumoniae - -
The code:
import sys
from sqlalchemy.orm import *
from sqlalchemy import *
from dbn.sqlalchemy_module import lib
import pd
engine = lib.get_engine(user="user", psw="pwd", db="db", db_host="111.111.111.11")
Base = lib.get_automapped_base(engine)
session = Session(engine)
tbs = lib.get_mapped_classes(Base)
session.rollback()
df = pd.read_excel('test_data.xlsx', sheet_name = 'test2')
for index, row in df.iterrows():
origin_latin_name = row['origin_organism'].strip().lower()
other_names_name = row['common_name'].strip().lower()
tested_species = row['tested_organism'].strip().lower()
if origin_latin_name not in [None, "None", "", "-"]:
instance = [x[0] for x in Session.query(session,tbs['species'].name).filter_by(name=origin_latin_name).all()]
if origin_latin_name not in instance:
origin_species = lib.get_or_create(
session,
tbs["species"],
name = origin_latin_name,
other_names = other_names_name
)
elif origin_latin_name in instance:
other_names_query = Session.query(session,tbs['species'].other_names).filter_by(name=origin_latin_name)
other_names_query_list = [x for x in other_names_query]
original_list2 = list(set([y for y in x[0].split(',') for x in other_names_query_list]))
if other_names_name not in original_list2:
original_list2.append(other_names_name)
new_list = ','.join(original_list2)
new_names = {'other_names':','.join(original_list2)}
origin_species = lib.get_or_create(
session,
tbs["species"],
name = origin_latin_name,
other_names = new_list
)
The part from the elif statement doesn't work. I've ran into two problems:
(1) The most recent error I got: NameError: name 'new_list' is not defined
(2) another error I got is that I have another table further on
map1 = lib.get_or_create(
session,
tbs["map1"],
age_id_id = age,
name_id_id = origin_species.id
)
...and it said that origin_species cannot be found, but I think this is linked to the elif statement, that somehow the origin_species object is not being updated properly.
If anyone could help I would appreciate it.
To save as String by json. dumps(my_list) and then while retrieving just do json. loads(my_column) . But it will require you to set the data in a key-value format and seems a bit in-efficient compared to the previous solution.
Update table elements in SQLAlchemy. Get the books to table from the Metadata object initialized while connecting to the database. Pass the update query to the execute() function and get all the results using fetchall() function. Use a for loop to iterate through the results.
Creating and Inserting Data into TablesBy passing the database which is not present, to the engine then sqlalchemy automatically creates a new database.
Simple mistake. You aren't giving it a list. I'm not sure why they end up in different rows, however, I would change the following because at the moment I don't see where you split the names into a list, all I see is you assigning a string onto a list using append.
alias_list = alias.split(',')
Which could also be:
alias_list = line[1].split(',')
Output:
alias_list: ['Name1','Name2','Name3']
Currently your code outputs:
alias_list = ['Name1,Name2,Name3']
Which, whilst it is technically a list by data type, it is a worthless list for the way you want to use it. This is because alias_list[0]
would return the entire string, as opposed to 'Name1'
WORD OF WARNING:
Your code is creating a list unnecessarily. You don't need a list in your database, you can easily achieve what you wabt by using the string that is evaluated when you read the excel file.
What you should do IMHO is to store the string of names as a whole string, then if you need to query the aliases of someone, then you can split the string on the other side, if that makes sense?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With