
How to handle binary strings in R?

Tags: string, database, r

R cannot cope with nul bytes (\0) in character strings. Does anyone know how to handle this? More concretely, I want to store complex R objects in a database using an ODBC or JDBC connection. Since complex R objects cannot easily be mapped to data frames, I need a different way to store them. An object could be, for example:

library(kernlab)
data(iris)
model <- ksvm(Species ~ ., data=iris, type="C-bsvc", kernel="rbfdot", kpar="automatic", C=10) 

Because model cannot be stored directly in a database, I use the serialize() function to obtain a binary representation of the object (in order to store it in a BLOB column):

 serialModel <- serialize(model, NULL)

Now I would like to store this via ODBC/JDBC. To do so, I need a string representation of the object so that I can send a query to the database, e.g. an INSERT INTO statement. Since the result of serialize() is a raw vector, I need to convert it:

 stringModel <- rawToChar(serialModel)

And there is the problem:

Error in rawToChar(serialModel) : 
  embedded nul in string: 'X\n\0\0\0\002\0\002\v\0......

R cannot deal with \0 in strings. Does anyone have an idea how to bypass this restriction? Or is there perhaps a completely different approach to achieve this goal?

Thanks in advance

Thomas, asked May 10 '11 12:05



1 Answer

You need

stringModel <- as.character(serialModel)

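for a character representation of the raw bytes. The result is a character vector with one two-digit hexadecimal code per byte. rawToChar instead tries to interpret the bytes as the characters of a single string, which fails on nul bytes and is not what you want in this case.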
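To illustrate the difference, here is a minimal sketch using a small stand-in list instead of the ksvm model (the names obj, rawObj, failed and hexCodes are made up for this example):

```r
# Small stand-in object; the ksvm model from the question behaves the same way.
obj <- list(a = 1:3, b = "text")
rawObj <- serialize(obj, NULL)

# rawToChar() raises an error because the serialized bytes contain 0x00.
failed <- inherits(tryCatch(rawToChar(rawObj), error = function(e) e), "error")

# as.character() returns one two-digit hex code per byte instead.
hexCodes <- as.character(rawObj)
head(hexCodes)
```

Note that hexCodes has as many elements as rawObj has bytes, so it is not yet a single string.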

The resulting stringModel can later be converted back to the original model with:

newSerialModel <- as.raw(as.hexmode(stringModel))
newModel <- unserialize(newSerialModel)
all.equal(model,newModel)
[1] TRUE
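If a single string is needed rather than a character vector, the hex codes can be pasted together and split again later. This is only a sketch in base R with a small stand-in object; the variable names are illustrative, and strtoi() is used here as one possible way to parse the hex pairs:

```r
# Small stand-in object; replace with the real model as needed.
obj <- list(a = 1:3, b = "text")
rawObj <- serialize(obj, NULL)

# Collapse the per-byte hex codes into one string (two characters per byte).
hexString <- paste(as.character(rawObj), collapse = "")

# Restore: cut the string into two-character chunks and parse them as hex.
starts <- seq(1, nchar(hexString), by = 2)
hexPairs <- substring(hexString, starts, starts + 1)
restoredRaw <- as.raw(strtoi(hexPairs, base = 16L))

# Rebuild the object from the restored bytes.
restored <- unserialize(restoredRaw)
identical(obj, restored)
```

The hex string is twice as long as the raw vector, which is the price for having only the characters 0-9 and a-f in it.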

Regarding writing binary types to databases through RODBC: as of today, the RODBC vignette reads (p. 11):

Binary types can currently only be read as such, and they are returned as column of class "ODBC binary" which is a list of raw vectors.
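Given that limitation, one possible workaround is to store the hex representation in an ordinary text column. The sketch below only builds the query string; the table name models, its columns name and payload, and the open connection channel are all assumptions made for illustration:

```r
# Small stand-in object; replace with the real model as needed.
obj <- list(a = 1:3, b = "text")

# One hex string for the whole object; it contains only 0-9 and a-f,
# so it needs no escaping inside a quoted SQL literal.
hexString <- paste(as.character(serialize(obj, NULL)), collapse = "")

# Hypothetical table "models" with text columns "name" and "payload".
query <- sprintf(
  "INSERT INTO models (name, payload) VALUES ('%s', '%s')",
  "demo", hexString
)

# With an open RODBC connection (assumed to exist as `channel`),
# the statement could then be sent like this:
# RODBC::sqlQuery(channel, query)
```

Whether a text column is large enough for the payload depends on the database and the size of the serialized object.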

Joris Meys, answered Oct 16 '22 20:10