
Merging sym files for splayed tables

Tags: kdb, q-lang

I have two directories that each contain a date-partitioned splayed table. Each directory has its own sym file as expected. The tables are exactly the same.

I want to consolidate these into one directory but am having issues doing so. Because of the large amount of data, I first tried soft-linking the partitions from one directory into the other. This didn't work, as the linked tables were then read against the wrong sym file.

Does anyone have an idea how best to do this? Do I have to regenerate a new sym file for both directories?

Thanks

Geoffrey Absalom asked Jun 24 '13 13:06


2 Answers

I'm not sure I understand exactly what your situation is, but I can think of a few possibilities.

  • The two databases are exactly the same. If you run a checksum on both directories, the hashes match.

In this case, why do you need the two copies? You can run multiple q processes off the same copy of the database. In fact, this is preferable because you benefit from the shared caching provided by the OS disk cache. Just delete one of the copies and point all q processes to the same directory.

  • The two databases contain data loaded from the same source, but are otherwise not the same. If I query each of the databases with the same query, I may get the same result but the checksums of the files do not match.

This can happen if the databases were created independently from the same source data. Unless you actually made a copy of the files, you can't assume the databases are the same. An obvious example: you loaded the same set of files into each database, but in a different order. In this case you cannot use the same sym file! The data will look OK at first glance, but the sym values will be wrong, because each enumerated value is an index into the sym file it was written against. If you do want to combine the two databases for some reason, you will need to take the data from one database and load it into the other. This is the only reliable way to be certain you don't corrupt your data.

  • You have two different databases that each contain the exact same table (in the checksum sense; maybe you copied the table files from one directory to another).

This probably will not work unless by some miracle the sym values all match, which they won't if the rest of the database is different. This is because the enumerated sym values are global: they depend on every sym value in the database, not just those in one table. If you want the table in both databases, you will need to re-enumerate its sym columns against the sym file of whichever database you copy to.
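To make the failure mode concrete, here is a toy model in Python rather than q. It is only a sketch under simplifying assumptions: a sym file is modelled as an ordered list of unique strings and an enumerated column as a list of integer indexes into that list. The helper names `enumerate_column` and `resolve` and the ticker values are made up for illustration and are not part of kdb+.

```python
def enumerate_column(values, sym):
    """Encode values as indexes into sym, appending unseen values (toy model)."""
    codes = []
    for v in values:
        if v not in sym:
            sym.append(v)
        codes.append(sym.index(v))
    return codes


def resolve(codes, sym):
    """Decode indexes back to strings using the given sym list."""
    return [sym[i] for i in codes]


# Two databases built from the same source values, loaded in a different order.
sym_a, sym_b = [], []
enumerate_column(["IBM", "MSFT", "AAPL"], sym_a)
enumerate_column(["AAPL", "IBM", "MSFT"], sym_b)

# A column written in database A, i.e. enumerated against sym_a.
codes_a = enumerate_column(["MSFT", "IBM", "MSFT"], sym_a)

right = resolve(codes_a, sym_a)  # read with the sym list it was written against
wrong = resolve(codes_a, sym_b)  # read with the other database's sym list
print(right)
print(wrong)
```

Both sym lists hold the same set of values, so the misread column still contains plausible-looking symbols; only the order differs, and that is enough to change what every index means.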

moepud answered Oct 08 '22 17:10


Read the data in day by day from one directory, de-enumerate all the enumerated sym columns (resolve them back to plain symbols, for example with `value`), and write each day to the other directory, enumerating against that directory's sym file.
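The shape of that procedure can be sketched with a toy model in Python rather than q. It makes the same simplifying assumptions as any such model: a sym file is an ordered list of unique strings, and a partition's sym column is a list of indexes into it. The function `reenumerate`, the partition dates, and the values are hypothetical; in a real database the decoding and encoding would be done with q's own tooling.

```python
def reenumerate(codes, src_sym, dst_sym):
    """Decode codes with src_sym, then encode against dst_sym.

    Values missing from dst_sym are appended, so existing indexes in the
    destination database keep their meaning.
    """
    position = {s: i for i, s in enumerate(dst_sym)}
    out = []
    for code in codes:
        s = src_sym[code]              # de-enumerate against the source sym list
        if s not in position:          # new value: extend the destination sym list
            position[s] = len(dst_sym)
            dst_sym.append(s)
        out.append(position[s])        # enumerate against the destination sym list
    return out


src_sym = ["IBM", "MSFT", "GOOG"]      # sym list of the database being read
dst_sym = ["AAPL", "IBM"]              # sym list of the database being written
partitions = {"2013.06.21": [0, 1, 1], "2013.06.24": [2, 0]}

# Process one date partition at a time, in date order.
migrated = {
    day: reenumerate(codes, src_sym, dst_sym)
    for day, codes in sorted(partitions.items())
}
print(migrated)
print(dst_sym)
```

The destination sym list only ever grows by appending, which is why the partitions already in the destination directory stay valid while the migrated ones are rewritten with new indexes.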

skeevey answered Oct 08 '22 19:10