I have a few large files which I want to analyze using Google BigQuery.
The import works well except for float fields: I can only import them as strings, since the files use a comma as the decimal separator instead of a point.
How can I work around that?
In BigQuery SQL, quoted strings enclosed by single ( ' ) quotes can contain unescaped double ( " ) quotes, and vice versa. Backslashes ( \ ) introduce escape sequences.
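For example, all of the following literals are valid (a minimal sketch; the column aliases are arbitrary):

SELECT 'He said "hello"' AS double_inside_single,
       "It's fine" AS single_inside_double,
       'line one\nline two' AS escaped_newline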
To convert an ARRAY into a set of rows, also known as "flattening," use the UNNEST operator. UNNEST takes an ARRAY and returns a table with a single row for each element in the ARRAY. Because UNNEST destroys the order of the ARRAY elements, you may wish to restore order to the table; to do so, use the optional WITH OFFSET clause.
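A minimal sketch of flattening with the order restored via WITH OFFSET (the aliases element and pos are arbitrary):

SELECT element, pos
FROM UNNEST(['b', 'a', 'c']) AS element WITH OFFSET AS pos
ORDER BY pos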
If you want to remove specific characters from your string, you can use the trimming functions. Based on the position of the characters you wish to remove, there are three: TRIM(value1[, value2]) removes all leading and trailing characters that match value2 (whitespace when value2 is omitted), LTRIM removes only leading characters, and RTRIM removes only trailing characters.
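For instance (results shown as comments; all three are standard BigQuery string functions):

SELECT TRIM('  apple  ') AS trim_whitespace,   -- 'apple'
       TRIM('xxapplexx', 'x') AS trim_both,    -- 'apple'
       LTRIM('xxapple', 'x') AS trim_leading,  -- 'apple'
       RTRIM('applexx', 'x') AS trim_trailing  -- 'apple'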
SPLIT(value[, delimiter]) divides value using the delimiter argument. For STRING, the default delimiter is a comma. For BYTES, you must specify a delimiter. Splitting with an empty delimiter produces an array of UTF-8 characters for STRING values and an array of BYTES for BYTES values.
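A short illustration, using a comma-decimal value like the one from the question to show why the default delimiter bites here:

SELECT SPLIT('1,30001') AS default_comma,  -- ['1', '30001']
       SPLIT('a|b|c', '|') AS custom,      -- ['a', 'b', 'c']
       SPLIT('abc', '') AS characters      -- ['a', 'b', 'c']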
Importing them as strings is fine; an ETL step inside BigQuery afterwards should be fast enough (REGEXP_REPLACE plus a cast to FLOAT64):

SELECT 2 * CAST(REGEXP_REPLACE("1,30001", ",", ".") AS FLOAT64)
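Scaled up to a whole table, the same idea becomes a one-off statement. This is only a sketch: the dataset, table, and column names (mydataset.raw_data, amount_str) are placeholders, and plain REPLACE suffices here because the pattern is a literal comma rather than a regular expression:

CREATE OR REPLACE TABLE mydataset.clean_data AS
SELECT * EXCEPT (amount_str),
       -- comma-decimal string -> proper float column
       CAST(REPLACE(amount_str, ',', '.') AS FLOAT64) AS amount
FROM mydataset.raw_data

If the input may contain malformed values, SAFE_CAST returns NULL instead of failing the whole query, which is often the safer choice for dirty files.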