
Oracle binary data types

Is there a type, or some other way, to store data in Oracle at the binary level? I'm interested in both DML operations on tables and PL/SQL.

Currently all binary elements are stored as varchar2(1000)='11111...0000.1111'; however, both the cost of operations and the storage size are rather large, so I need some way to optimize this. If the data could be stored in binary format, it would require only 1000/8 = 125 bytes per value (I have more than 700 million records).

Maybe a solution would be using some kind of java+oracle combination for these operations.

Ideas and suggestions welcome.

Saulius S asked Dec 01 '22 22:12



2 Answers

Use the RAW datatype if you want to store binary data up to 2000 bytes (the limit for a RAW column; 32767 bytes in Oracle 12c and later when MAX_STRING_SIZE = EXTENDED). The data will be stored as a string of bytes without character set conversion.

Use the UTL_RAW package to perform operations on RAWs.
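
As a rough illustration of what UTL_RAW offers, the query below applies its bitwise functions to two one-byte RAW values built with HEXTORAW. The literal values are made up for this example and do not come from the question.

```sql
-- Bitwise operations on RAW values using the UTL_RAW package.
-- Example operands (assumed for illustration):
--   'F0' = 11110000, '3C' = 00111100
SELECT UTL_RAW.BIT_AND(HEXTORAW('F0'), HEXTORAW('3C')) AS and_result,  -- 30
       UTL_RAW.BIT_OR(HEXTORAW('F0'), HEXTORAW('3C'))  AS or_result,   -- FC
       UTL_RAW.BIT_XOR(HEXTORAW('F0'), HEXTORAW('3C')) AS xor_result,  -- CC
       UTL_RAW.BIT_COMPLEMENT(HEXTORAW('F0'))          AS not_result,  -- 0F
       UTL_RAW.LENGTH(HEXTORAW('F03C'))                AS len_bytes    -- 2
FROM   DUAL;
```

The same functions can be called from PL/SQL on RAW variables, so bit tests that currently need SUBSTR on a string of '0' and '1' characters can often be expressed as a BIT_AND against a mask.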

The LONG RAW datatype is deprecated; switch to BLOB when you need to manipulate values that exceed the RAW size limit.

Vincent Malgrat answered Jan 01 '23 16:01



See Vincent Malgrat's answer: if you want to store and process binary data in Oracle, then the RAW datatype is the way to go.

As alegen's answer alludes to, if your intent is to store and retrieve images, video, audio, or compressed data that doesn't need to be processed in the database, only stored and retrieved, then the BLOB datatype may be more appropriate.

NOTE: The RAW datatype is limited to 2000 bytes in a table column (unless extended data types are enabled in 12c and later); the BLOB datatype has no such limit. For performance reasons, I would prefer RAW for values that are much shorter (say, 200 bytes or less) and that I need to access regularly. For much longer values, where a lot of queries do not reference the binary data, I would tend to favor BLOB. It's all due to the differences in the way RAW and BLOB are stored internally: inline storage vs. separate blocks, split rows, number of rows that fit in a block, etc.

For the particular problem you describe, and from the information you've provided, RAW sounds like the way to go. You specify that you have sequences of 1000 bits, but it's not at all clear whether that's a constant, a maximum length, or the result of breaking up longer strings of binary data into more manageable chunks that fit into columns. (If you are really working with a single, huge chunk of binary data, you want to avoid chopping it up into a bunch of little pieces and storing each piece on a separate row. It's going to be much more efficient to store it all together as a single BLOB and work with it as a simple stream.)

All of that is really going to inform your decision on whether to use BLOB or RAW.


That aside, on to your question about converting from a VARCHAR2 representation of ones and zeros (e.g. '00101010', with each bit of real information stored as a separate character) into a more efficiently stored binary representation, with each 8 bits of real information requiring one byte of storage.

The Oracle RAW datatype stores 8 bits in a single byte. That is, a RAW(125) would store the equivalent of your VARCHAR2(1000), which would save you 875 bytes per row with a single-byte character set (SBCS), and more than twice that with a double-byte character set (DBCS). This will significantly reduce storage requirements, get you more rows in a block, and allow for the possibility of better performance.
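
A minimal sketch of what such a column could look like is shown below. The table name bit_data and its column names are hypothetical, and the inserted value is just the '2A' byte (binary 00101010) used as an example elsewhere in this answer.

```sql
-- Hypothetical table: names are assumptions made for this example.
CREATE TABLE bit_data
( id    NUMBER PRIMARY KEY
, bits  RAW(125)              -- 125 bytes * 8 = 1000 bits
);

-- Store one byte (00101010) as a RAW value.
INSERT INTO bit_data (id, bits) VALUES (1, HEXTORAW('2A'));

-- RAWTOHEX shows the value as hex; UTL_RAW.LENGTH gives its length in bytes.
SELECT id,
       RAWTOHEX(bits)       AS hex_value,
       UTL_RAW.LENGTH(bits) AS length_bytes
FROM   bit_data;
```

RAW is variable length, so a value shorter than the declared 125 bytes only occupies the bytes it actually contains.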

I don't know of any built-in function that converts a VARCHAR2 string of ones and zeros directly to RAW. But it's fairly straightforward to roll your own function that converts the binary string representation to a hex string representation. After that, you can use the built-in HEXTORAW function to convert to RAW.

Here's an example that can be used as a starting point.

(NOTE: this function is just an example; it doesn't handle cases where the length of the input string is not a multiple of 8 characters. Also, its behavior with string values containing characters other than '1' or '0' may not be appropriate: as written, it treats any character other than a '0' as if it were a '1'. But it's good enough as a starting point.)

create or replace function binstr_to_hexstr
( as_binstr in varchar2 ) return varchar2
is
  li_n      binary_integer default 0;                   -- value of the current 4-bit group
  ls_hexstr varchar2(16)   default '0123456789ABCDEF';  -- hex digit lookup string
  ls_return varchar2(2000) default '';                  -- accumulated hex string
begin
  if ( as_binstr is null ) then
    return null;
  end if;
  ls_return := '';
  li_n := 0;
  for i in 1 .. length(as_binstr) loop
    -- shift in the next bit: '0' adds 0; '1' (or any other character) adds 1
    li_n := li_n*2 + abs(instr('01',substr(as_binstr,i,1))-1);
    -- after every 4th bit, emit one hex digit and start a new group
    if mod(i,4) = 0 then
      ls_return := ls_return || substr(ls_hexstr,li_n+1,1);
      li_n := 0;
    end if;
  end loop;
  return ls_return;
end;
/

SELECT binstr_to_hexstr('00101010') AS hexstr FROM DUAL UNION ALL
SELECT binstr_to_hexstr('00x0 010') FROM DUAL;

HEXSTR                                                                             
------
2A
2A

NOTE: This function returns the expected result (a matching hex representation) ONLY when the length of the input string is an exact multiple of 8 (i.e. MOD(length(as_binstr),8) = 0). Otherwise, the function loses trailing bits and/or returns an odd number of hex digits. (The function can be modified to throw an exception when the length of the input argument is not a multiple of 8.)
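
One possible way to add that validation is sketched below. The function name binstr_to_hexstr_strict, the error numbers, and the error messages are assumptions chosen for this example; it also rejects characters other than '0' and '1' rather than treating them as '1'.

```sql
-- Sketch of a stricter variant (hypothetical name and error codes).
create or replace function binstr_to_hexstr_strict
( as_binstr in varchar2 ) return varchar2
is
  li_n      pls_integer := 0;                                -- current 4-bit group
  ls_hexstr constant varchar2(16) := '0123456789ABCDEF';     -- hex digit lookup
  ls_return varchar2(8192);                                  -- accumulated hex string
begin
  if as_binstr is null then
    return null;
  end if;
  -- Reject input whose length is not a multiple of 8 bits.
  if mod(length(as_binstr), 8) <> 0 then
    raise_application_error(-20001, 'input length must be a multiple of 8');
  end if;
  -- LTRIM strips leading '0'/'1' characters; anything left over is invalid.
  if ltrim(as_binstr, '01') is not null then
    raise_application_error(-20002, 'input may contain only 0 and 1');
  end if;
  for i in 1 .. length(as_binstr) loop
    -- instr returns 1 for '0' and 2 for '1', so subtracting 1 yields the bit
    li_n := li_n * 2 + (instr('01', substr(as_binstr, i, 1)) - 1);
    if mod(i, 4) = 0 then
      ls_return := ls_return || substr(ls_hexstr, li_n + 1, 1);
      li_n := 0;
    end if;
  end loop;
  return ls_return;
end;
/
```

With this variant, an input such as '00x0 010' raises ORA-20002 instead of silently returning 2A.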

The HEXTORAW and RAWTOHEX functions are useful when working with RAW data using a client application, such as TOAD, SQL Developer or SQL*Plus. (The HEXTORAW function is what we'd use to convert the output from the binstr_to_hexstr function to RAW.) As an example:

create or replace function binstr_to_raw
( as_binstr in varchar2 ) return raw
is
begin
  return hextoraw(binstr_to_hexstr(as_binstr));
end;
/
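
To show how binstr_to_raw might be used to migrate existing data, here is a hedged sketch. The table my_table and the columns bits_str and bits_raw are hypothetical names, and the sketch assumes every stored string has a length that is a multiple of 8.

```sql
-- Hypothetical migration: table and column names are assumptions.
-- Add a RAW column alongside the existing VARCHAR2(1000) column.
ALTER TABLE my_table ADD (bits_raw RAW(125));

-- Populate the new column from the string of ones and zeros.
UPDATE my_table
SET    bits_raw = binstr_to_raw(bits_str)
WHERE  bits_str IS NOT NULL;

-- Spot-check a few rows before dropping the old column.
SELECT bits_str,
       RAWTOHEX(bits_raw) AS hex_value
FROM   my_table
WHERE  ROWNUM <= 10;
```

On a table with hundreds of millions of rows, a single UPDATE like this is likely to be expensive; doing the conversion in batches, or building a new table with CREATE TABLE ... AS SELECT, may be worth considering.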

As Vincent Malgrat pointed out in his answer, Oracle provides a couple of packages (e.g. UTL_RAW and UTL_ENCODE) which are useful in working with RAW data.

http://docs.oracle.com/cd/E11882_01/appdev.112/e25788/u_raw.htm

spencer7593 answered Jan 01 '23 17:01
