Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to check if a string field has only alphabets in BigQuery

How to check if a string field in a BigQuery table has only alphabets and certain special characters like Å, Ä, and Ö in it? Any value with special characters or numbers should be avoided from selection.

like image 551
rr_coder_111 Avatar asked Nov 17 '25 05:11

rr_coder_111


2 Answers

You can use the function REGEXP_CONTAINS to perform this kind of selection:

SELECT
  text,
  REGEXP_CONTAINS(text, r"^[A-Za-zÅÄÖ]+$") AS is_selected
FROM
  `dataset.yourtable`

Here with the regex "^[A-Za-zÅÄÖ]+$" strings composed of only alphabet, Å, Ä or Ö will return true.

For instance with the text column below:

enter image description here

like image 118
Cylldby Avatar answered Nov 20 '25 04:11

Cylldby


Below is generic solution to account not just for Å, Ä, and Ö but for all such chars like ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,ø,Ø,Å,Á,À,Â,Ä,È,É,Ê,Ë,Í,Î,Ï,Ì,Ò,Ó,Ô,Ö,Ú,Ù,Û,Ü,Ÿ,Ç,Æ,Œ,ñ etc.

select text
from `project.dataset.table`
where not regexp_contains(regexp_replace(normalize(text, NFD), r"\pM", ''), r'\W')            

if applied to sample data provided by @Cylldby in his answer - output is

enter image description here

like image 35
Mikhail Berlyant Avatar answered Nov 20 '25 02:11

Mikhail Berlyant