Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Apply function to column before filtering

I have a column in my database called coordinates, now the coordinates column contains information on the range of time an object takes up within my graph. I want to allow the user to filter by the date, but the problem is I use a function to determine the date normally. Take:

# query_result is the result of some filter operation
for obj in query_result:
    time_range, altitude_range = get_shape_range(obj.coordinates)
    # time range for example would be "2006-06-01 07:56:17 - ..."

Now if I wanted to filter by date, I would want to is a like:

query_result = query_result.filter(
    DatabaseShape.coordinates.like('%%%s%%' % date))

But the problem is I first need to apply get_shape_range to coordinates in order to receive a string. Is there any way to do ... I guess a transform_filter operation? Such that before the like happens, I apply some function to coordinates? In this case I would need to write a get_time_range function that returned only time, but the question remains the same.


EDIT: Here's my database class

class DatabasePolygon(dbBase):
    __tablename__ = 'objects'

    id = Column(Integer, primary_key=True)  # primary key
    tag = Column(String)  # shape tag
    color = Column(String)  # color of polygon
    time_ = Column(String)  # time object was exported
    hdf = Column(String)  # filename
    plot = Column(String)  # type of plot drawn on
    attributes = Column(String)  # list of object attributes
    coordinates = Column(String)  # plot coordinates for displaying to user
    notes = Column(String)  # shape notes
    lat = Column(String)

    @staticmethod
    def plot_string(i):
        return constants.PLOTS[i]

    def __repr__(self):
        """
        Represent the database class as a JSON object. Useful as our program
        already supports JSON reading, so simply parse out the database as
        separate JSON 'files'
        """
        data = {}
        for key in constants.plot_type_enum:
            data[key] = {}
        data[self.plot] = {self.tag: {
            'color': self.color,
            'attributes': self.attributes,
            'id': self.id,
            'coordinates': self.coordinates,
            'lat': self.lat,
            'notes': self.notes}}
        data['time'] = self.time_
        data['hdfFile'] = self.hdf
        logger.info('Converting unicode to ASCII')
        return byteify(json.dumps(data))

and I'm using sqlite 3.0. The reasoning why behind most things are strings is because most of my values that are to be stored in the database are sent as strings, so storing is trivial. I'm wondering if I should do all this parsing magic with the functions before, and just have more database entries? for stuff like decimal time_begin, time_end, latitude_begin instead of having a string containing the range of time that I parse to find time_begin and time_end when i'm filtering

like image 442
Syntactic Fructose Avatar asked Jan 06 '16 07:01

Syntactic Fructose


People also ask

How to apply formula only to cells visible after filtering?

How to apply formula only to the cells visible after filtering? For eg : My sheet has 3 columns. Column A contains alphabets, Column B contains Words and column C blank (to be filled with data using formula). I use filter to select rows with column A = alphabet A and column B with word 'excel'. Assuming the rows 2,3,5,10,11 are visible.

How do I filter out data from a specific column?

Click the down arrow next to a column title and uncheck the box next to the data you want to filter out. Once all items have been filtered out, click "OK".

What is the use of filter in Excel?

The FILTER function in Excel is used to filter a range of data based on the criteria that you specify. The function belongs to the category of Dynamic Arrays functions. The result is an array of values that automatically spills into a range of cells, starting from the cell where you enter a formula.

How do I filter out rows that are not displayed?

Click the down arrow next to a column title and uncheck the box next to the data you want to filter out. Once all items have been filtered out, click "OK". Filter some data out so you can see how the Sum function still counts all the rows, even those not being displayed.


2 Answers

I think you should definitely parse strings to columns before storing it in the databases. Let the database do the job it was designed for!

CREATE TABLE [coordinates]
(
  id                INTEGER  NOT NULL PRIMARY KEY,
  tag               VARCHAR2(32),
  color             VARCHAR2(32)    default 'green',
  time_begin        TIMESTAMP,
  time_end          TIMESTAMP,
  latitude_begin    INT
);

create index ix_coord_tag on coordinates(tag);
create index ix_coord_tm_beg on coordinates(time_begin);

insert into coordinates(tag, time_begin, time_end, latitude_begin)
values('tag1', '2006-06-01T07:56:17', '2006-06-01T07:56:19', 123);

insert into coordinates(tag, time_begin, time_end, latitude_begin)
values('tag1', '2016-01-01T11:35:01', '2016-01-01T12:00:00', 130);

insert into coordinates(tag, color, time_begin, time_end, latitude_begin)
values('tag2', 'blue', '2014-03-03T20:11:01', '2014-03-03T20:11:20', 2500);

insert into coordinates(tag, color, time_begin, time_end, latitude_begin)
values('tag2', 'blue', '2014-03-12T23:59:59', '2014-03-13T00:00:29', 2978);

insert into coordinates(tag, color, time_begin, time_end, latitude_begin)
values('tag3', 'red', '2016-01-01T11:35:01', '2016-01-01T12:00:00', 13000);

insert into coordinates(tag, color, time_begin, time_end, latitude_begin)
values('tag3', 'red', '2016-01-01T12:00:00', '2016-01-01T12:00:11', 13001);

.headers on
.mode column

select * from coordinates where tag='tag1' and '2006-06-01T07:56:18' between time_begin and time_end;

select * from coordinates where color='blue' and time_end between '2014-03-13T00:00:00' and '2014-03-13T00:10:00';

Output:

sqlite> select * from coordinates where tag='tag1' and '2006-06-01T07:56:18' between time_begin and time_end;
id          tag         color       time_begin           time_end             latitude_begin
----------  ----------  ----------  -------------------  -------------------  --------------
1           tag1        green       2006-06-01T07:56:17  2006-06-01T07:56:19  123
sqlite>
sqlite> select * from coordinates where color='blue' and time_end between '2014-03-13T00:00:00' and '2014-03-13T00:10:00';
id          tag         color       time_begin           time_end             latitude_begin
----------  ----------  ----------  -------------------  -------------------  --------------
4           tag2        blue        2014-03-12T23:59:59  2014-03-13T00:00:29  2978
like image 104
MaxU - stop WAR against UA Avatar answered Oct 20 '22 05:10

MaxU - stop WAR against UA


Given the title of the question, I assume that you aren't set on using the like() method.

SQLAlchemy's Query.filter() will accept any criteria that evaluates to a boolean.

Why not modify your filter criterion, so that instead of using the like() method on a string, you are testing on Python datetime.date objects?

I don't knowing what obj.coordinates looks like, but here's a rough outline that I hope make sense:

def check_range(coords, date):
    """ takes a date, and a "coordinates" value (representing a date 
        range), and a date as inputs, returns True if the 
        date is within that date range, else returns False
    """
    #some code here...


query_result = query_result.filter(
        check_range(DatabaseShape.coordinates, date)
    )
like image 36
mardlin Avatar answered Oct 20 '22 06:10

mardlin