
Parse description string to populate NULL fields

Informix 12.10

tblItems
(
Type        SMALLINT,       {Precious Metal = 1, Other = 2}
Description VARCHAR,
Quantity    SMALLINT,
Name        VARCHAR,
Weight      DECIMAL(5,1),
Purity      SMALLINT,
Brand       VARCHAR,
Model       VARCHAR,
SerialNum   VARCHAR
);

EDIT: The sample data below is stored in tblItems.Type and tblItems.Description. Note that the contents of the Description column are all uppercase and may include punctuation characters.

2|1LAPTOP APPLE 15.5" MODEL MACKBOOK PRO,S/N W80461WCAGX, WITH CHARGER||||||||
1|1RING 2.3PW 14K||||||||
2|DRILL RIOBY, MODEL D5521 S/N77720||||||||
2|TRIMMER TORO, MODEL 0242 S/N 66759||||||||
2|CELL SAMSUNG NOTE3, MODEL SM-N900T S/N RV8F90YLZ9W||||||||

I need to parse the sample item descriptions into the columns below, using the rules given in the comments:

Quantity,      {if description string does not start with a number, then Quantity = 1}
Name,          {Always the first element if description has no quantity, second element if quantity present}
Weight,        {Always before "PW" if Type = 1, Default to zero if Type = 2}
Purity,        {Always before "K" if Type = 1, Default to NULL if Type = 2} 
Brand,         {Always the second element in description, if present} 
Model,         {Always after "MODEL", with or without a space}
Serial Number  {Always after "S/N", with or without a space}

I would like to do this with an UPDATE statement, but if Informix has an import utility similar to SQL Server's SSIS, that could be a better option.

UPDATE, Expected Results:

Quantity   1               1       1        1         1
Name       LAPTOP          RING    DRILL    TRIMMER   CELL
Weight     0.0             2.3     0.0      0.0       0.0
Purity                     14
Brand      APPLE                   RIOBY    TORO      SAMSUNG
Model      MACKBOOK PRO            D5521    0242      SM-N900T
SerialNum  W80461WCAGX             77720    66759     RV8F90YLZ9W

Frank R. asked Dec 01 '18 05:12


1 Answer

Assuming you are using Informix 12.10.XC8 or later, you can try parsing the description string with regular expressions (see the Informix online documentation).

For the serial number, for example, you can do:

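UPDATE tblitems
SET
serialnum =
DECODE
(
    -- Does the description contain S/N? (regex_match returns a BOOLEAN)
    regex_match(description, '(.*)(S\/N)(.*)', 3)
    -- Yes: keep only group 4, the alphanumeric run after S/N and an optional blank
    , 't'::BOOLEAN, regex_replace(description, '(.*)(S\/N)([[:blank:]]?)([[:alnum:]]*)(.*)', '\4', 0, 3)
    -- No: store an empty string
    , 'f'::BOOLEAN, ''
);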

The example above tests whether the description contains the S/N string. If it does, regex_replace returns the value that follows it, which is the 4th capture group in the regular expression. I am not using regex_extract to get the value because it seems to return multiple values, and I get error -686.
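
The test-then-extract logic can also be tried outside the database before running the UPDATE. The following is a minimal Python sketch, not Informix code: it assumes Python's re syntax as a stand-in for the POSIX classes ([ \t] for [[:blank:]], [A-Za-z0-9] for [[:alnum:]]), and the name serial_number is hypothetical.

```python
import re

# Stand-in for the answer's pattern: S/N, an optional blank,
# then a run of alphanumeric characters captured as group 1.
SERIAL_RE = re.compile(r"S/N[ \t]?([A-Za-z0-9]*)")


def serial_number(description: str) -> str:
    """Return the token after S/N, or '' when S/N is absent.

    The empty-string fallback mirrors the 'f' branch of the DECODE.
    """
    match = SERIAL_RE.search(description)
    return match.group(1) if match else ""


if __name__ == "__main__":
    samples = [
        '1LAPTOP APPLE 15.5" MODEL MACKBOOK PRO,S/N W80461WCAGX, WITH CHARGER',
        "1RING 2.3PW 14K",
        "DRILL RIOBY, MODEL D5521 S/N77720",
    ]
    for text in samples:
        print(repr(serial_number(text)))
```

Checking for a match first matters because a replace-style call leaves the input unchanged when the pattern does not match, which would otherwise copy the whole description into the serial number column.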

You can extend this approach to the rest of the columns and see if regular expressions are enough to parse the description column.
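
One way to judge whether regular expressions are enough is to prototype the remaining rules against the sample rows first. The sketch below does that in Python rather than Informix SQL, so the patterns would still need translating and verifying on the server. It rests on assumptions not stated in the question: Brand is filled only for Type 2 (the expected results leave it empty for the ring), Model ends at the next comma or S/N, and a Type 1 row without a PW token keeps weight 0.0. The name parse_description is hypothetical.

```python
import re
from decimal import Decimal

WEIGHT_RE = re.compile(r"(\d+(?:\.\d+)?)\s?PW\b")     # number directly before PW
PURITY_RE = re.compile(r"(\d+)\s?K\b")                 # number directly before K
MODEL_RE = re.compile(r"MODEL\s?(.+?)\s*(?:,|S/N|$)")  # text after MODEL up to a comma, S/N, or end


def parse_description(item_type, description):
    """Split one description into the non-serial columns from the question."""
    # Leading digits are the quantity; everything after them is the item text.
    qty_text, rest = re.match(r"(\d*)\s*(.*)", description.strip()).groups()
    words = rest.split()
    fields = {
        "quantity": int(qty_text) if qty_text else 1,  # default to 1 when no leading number
        "name": words[0].strip(",") if words else None,
        "weight": Decimal("0.0"),
        "purity": None,
        "brand": None,
        "model": None,
    }
    if item_type == 1:
        # Precious metal: weight and purity come from the PW and K tokens.
        weight = WEIGHT_RE.search(rest)
        purity = PURITY_RE.search(rest)
        if weight:
            fields["weight"] = Decimal(weight.group(1))
        if purity:
            fields["purity"] = int(purity.group(1))
    elif len(words) > 1 and words[1] != "MODEL" and not words[1].startswith("S/N"):
        # Assumption: the second element is a brand only for Type 2 items.
        fields["brand"] = words[1].strip(",")
    model = MODEL_RE.search(rest)
    if model:
        fields["model"] = model.group(1)
    return fields


if __name__ == "__main__":
    print(parse_description(1, "1RING 2.3PW 14K"))
    print(parse_description(2, "CELL SAMSUNG NOTE3, MODEL SM-N900T S/N RV8F90YLZ9W"))
```

If a rule cannot be expressed cleanly here, it is unlikely to be simpler as a single regex_replace call, which is a hint that a stored procedure or an external load step may be the better route.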

Luís Marques answered Nov 23 '22 03:11