Is the only way to add a column in PyTables to create a new table and copy?

Tags: python, pytables

I am searching for a persistent data storage solution that can handle heterogeneous data stored on disk. PyTables seems like an obvious choice, but the only information I can find on how to append new columns is a tutorial example. The tutorial has the user create a new table with the added column, copy the old table into the new table, and finally delete the old table. This seems like a huge pain. Is this how it has to be done?

If so, what are better alternatives for storing mixed data on disk that can accommodate new columns with relative ease? I have looked at sqlite3 as well and the column options seem rather limited there, too.

419

asked Apr 03 '13 20:04

Zelazny7

2 Answers

Yes, you must create a new table and copy the original data. This is because Tables are a dense format. This gives them huge performance benefits, but one of the costs is that adding new columns is somewhat expensive.
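As a rough illustration of why a dense layout makes this expensive, here is a minimal sketch using only the standard-library `struct` module. The record formats `OLD_FMT` and `NEW_FMT` and the helper `add_bool_column` are hypothetical and are not PyTables' actual on-disk format; the point is only that widening fixed-width records means rewriting every row.

```python
import struct

# Hypothetical fixed-width record layouts (NOT PyTables' real storage format).
OLD_FMT = "<16sii"   # name (16 bytes), lati (int32), longi (int32)
NEW_FMT = "<16sii?"  # the same fields plus a one-byte boolean column


def add_bool_column(buf, default=False):
    """Return a new buffer in which every record gains a trailing bool field."""
    old = struct.Struct(OLD_FMT)
    new = struct.Struct(NEW_FMT)
    out = bytearray()
    # Records are packed back to back, so each one has to be unpacked and
    # re-packed at its new width; nothing can be widened in place.
    for fields in old.iter_unpack(buf):
        out += new.pack(*fields, default)
    return bytes(out)


rows = [(b"Atlantic", 10, 0), (b"Pacific", 11, -1)]
packed = b"".join(struct.pack(OLD_FMT, *r) for r in rows)
widened = add_bool_column(packed)
```

Every record moves to a new offset once the row width changes, which is the same reason the copy into a new table is needed.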

166

answered Sep 18 '22 23:09

Anthony Scopatz

Thanks to Anthony Scopatz for his answer.

I searched the web and GitHub and found that someone has shown how to add columns in PyTables: "Example showing how to add a column in PyTables".

The original version, "Example showing how to add a column in PyTables", was somewhat difficult to migrate.

The revised version, "Isolated the copying logic", uses some deprecated terms and has a minor error in adding new columns.

Based on their contributions, I updated the code for adding a new column in PyTables (Python 3.6, Windows).

# -*- coding: utf-8 -*-
"""
PyTables: append a column
"""
import tables as tb

pth = 'd:/download/'

# Describe a water class
class Water(tb.IsDescription):
    waterbody_name   = tb.StringCol(16, pos=1)   # 16-character String
    lati             = tb.Int32Col(pos=2)        # integer
    longi            = tb.Int32Col(pos=3)        # integer
    airpressure      = tb.Float32Col(pos=4)      # float  (single-precision)
    temperature      = tb.Float64Col(pos=5)      # double (double-precision)

# Open a file in "w"rite mode.
# If pth is omitted, the file is created in the current working directory.
fileh = tb.open_file(pth + "myadd-column.h5", mode="w")

# Create a table in the root directory and append data...
tableroot = fileh.create_table(fileh.root, 'root_table', Water,
                               "A table at root", tb.Filters(1))
tableroot.append([("Mediterranean", 10, 0, 10*10, 10**2),
                  ("Mediterranean", 11, -1, 11*11, 11**2),
                  ("Adriatic", 12, -2, 12*12, 12**2)])
print("\nContents of the table in root:\n",
      fileh.root.root_table[:])

# Create a new table in newgroup group and append several rows
group = fileh.create_group(fileh.root, "newgroup")
table = fileh.create_table(group, 'orginal_table', Water, "A table", tb.Filters(1))
table.append([("Atlantic", 10, 0, 10*10, 10**2),
              ("Pacific", 11, -1, 11*11, 11**2),
              ("Atlantic", 12, -2, 12*12, 12**2)])
print("\nContents of the original table in newgroup:\n",
      fileh.root.newgroup.orginal_table[:])
# Close the file
fileh.close()

#%% Open it again in append mode
fileh = tb.open_file(pth+"myadd-column.h5", "a")
group = fileh.root.newgroup
table = group.orginal_table

# Isolated the copying logic
def append_column(table, group, name, column):
    """Return a copy of `table` in `group` with `column` appended under `name`.

    The new column is filled with its default value.
    """
    # Get the table description as a dictionary and add the new column
    description = table.description._v_colobjects.copy()
    description[name] = column
    copy = tb.Table(group, table.name + "_copy", description)

    # Copy the user attributes
    table.attrs._f_copy(copy)

    # Fill the rows of the new table with default values
    for i in range(table.nrows):
        copy.row.append()
    # Flush the rows to disk
    copy.flush()

    # Copy the columns of the source table to the destination
    for col in table.colnames:
        getattr(copy.cols, col)[:] = getattr(table.cols, col)[:]

    # Optionally remove the original table
    # table.remove()

    return copy

# Append the original data plus the new column to table2
table2 = append_column(table, group, "hot", tb.BoolCol(dflt=False))
# Fill the new column
table2.cols.hot[:] = [row["temperature"] > 11**2 for row in table]
# Move table2 into place. You can reuse the original table's name
# once the original table has been removed.
table2.move('/newgroup', 'new_table')

# Print the new table
print("\nContents of the table with column added:\n",
      fileh.root.newgroup.new_table[:])
# Finally, close the file
fileh.close()

answered Sep 16 '22 23:09

Renke
