I have a function that is used throughout my code. The function expects that the passed parameter is a positive integer. Since PHP is loosely typed, the data type is unimportant. But it is important that it contain nothing but digits. Currently, I am using a regular expression to check the value before continuing.
Here is a simplified version of my code:
function do_something($company_id) {
    if (preg_match('/\D/', $company_id)) exit('Invalid parameter');
    // do several things that expect $company_id to be an integer
}
I come from a Perl background and tend to reach for regular expressions often. However, I know their usage is controversial.
I considered using intval() or (int) and forcing $company_id to be an integer. However, I could end up with some unexpected values, and I want it to fail fast.
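To make that concern concrete, here is a small sketch of what the cast does with malformed input. The sample values are made up for illustration; the point is only that intval() quietly returns some integer instead of failing.

```php
<?php
// Hypothetical inputs, chosen only to illustrate the coercion rules.
$samples = ['42', '42abc', 'abc', '', '4.9', ' 7'];

// intval() keeps a leading numeric part (after optional whitespace)
// and discards everything else, so bad input never raises an error.
$coerced = array_map('intval', $samples);

foreach ($samples as $i => $raw) {
    printf("%-8s => %d\n", var_export($raw, true), $coerced[$i]);
}
```

Every one of these inputs yields an integer, including 0 for 'abc' and the empty string, which is why a cast alone cannot serve as validation.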
The other option is:
if (!ctype_digit((string) $company_id)) exit('Invalid parameter');
Is this scenario a valid use of regular expressions? Is one way preferred over the other? If so, why? Are there any gotchas I haven't considered?
The original question is about validating a value of unknown data type and discarding all values except those that contain nothing but digits. There seem to be only two ways to achieve this result.
If the goal is to fail fast, one would want to check for invalid values and then fail, rather than checking for valid values and having to wrap all the code in an if block.
if (preg_match('/\D/', $company_id)) exit('Invalid parameter');
Using a regex to fail if any non-digit character matches. Con: the regex engine has overhead.
if (!ctype_digit((string) $company_id)) exit('Invalid parameter');
Using ctype_digit to fail if it returns FALSE. Con: the value must be cast to a string, which is a (small) extra step.
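You must cast the value to a string because ctype_digit expects a string and PHP will not cast the parameter for you. If you pass an integer between -128 and 255, ctype_digit treats it as the ASCII code of a single character rather than as a number, so you will get unexpected results. As of PHP 8.1, passing a non-string argument is also deprecated.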
This is documented behaviour. For example:
ctype_digit('42'); // true
ctype_digit(42); // false (ASCII 42 is the * character)
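The cast itself has a couple of side effects worth knowing about. The sketch below wraps the cast-then-check in a helper; the name is_all_digits is hypothetical and not part of PHP, and the is_scalar guard is my own addition to keep arrays and objects away from the string cast.

```php
<?php
// Hypothetical helper name; it wraps the cast-then-check shown above.
function is_all_digits($value): bool
{
    // Arrays, objects and NULL are rejected before the cast.
    if (!is_scalar($value)) {
        return false;
    }
    return ctype_digit((string) $value);
}

var_dump(is_all_digits('42'));  // string of digits
var_dump(is_all_digits(42));    // int, cast to '42' before the check
var_dump(is_all_digits(true));  // (string) true is '1'
var_dump(is_all_digits(1.0));   // (string) 1.0 is '1'
var_dump(is_all_digits(null));  // not scalar
var_dump(is_all_digits(''));    // empty string
```

The first four calls return true and the last two return false. Note that true and 1.0 pass because both cast to the string '1'; if that matters for your input, an extra is_bool or is_float check would be needed.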
Due to the overhead of the regex engine, option two is probably the best option. However, worrying about the difference between these two options may fall into the premature optimization category.
Note: There is also a functional difference between the two options above. The first option treats NULL and empty strings as valid, because they contain no non-digit characters; the second does not (ctype_digit has returned FALSE for an empty string since PHP 5.1.0). That may make one method more desirable than the other. To make the regex option behave the same as the ctype_digit version, use this instead:
if (!preg_match('/^\d+$/', $company_id)) exit('Invalid parameter');
Note: The 'start of string' ^ and 'end of string' $ anchors in the above regex are very important. Otherwise, abc123def would be considered valid.
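A short sketch of the effect of the anchors, using made-up inputs. It also covers one related gotcha that is my addition rather than part of the original answer: in PCRE, $ matches just before a trailing newline as well as at the very end, so "123\n" still passes. Using \z (or the D modifier) closes that gap.

```php
<?php
// Three patterns compared on the same hypothetical inputs.
$patterns = [
    'unanchored'  => '/\d+/',
    'dollar'      => '/^\d+$/',
    'backslash_z' => '/^\d+\z/',
];
$inputs = ['123', 'abc123def', '', "123\n"];

$results = [];
foreach ($patterns as $name => $pattern) {
    foreach ($inputs as $input) {
        // preg_match returns 1 on a match, 0 on no match.
        $results[$name][] = preg_match($pattern, $input) === 1;
    }
}

var_dump($results);
```

Only the \z version rejects everything except '123'. Whether the trailing-newline case matters depends on where the value comes from.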
There are other methods that have been suggested here and in other questions that will not achieve the stated goals, but I think it is important to mention them and explain why they won't work as it might help someone else.
is_numeric allows exponential parts and floats, and before PHP 7.0 it also accepted hex strings such as '0x1A'.
is_int checks the data type rather than the value, which is not useful for validation if '1' is to be considered valid. Form input is always a string, and if you aren't sure where a value comes from, you can't be sure of its data type.
filter_var with FILTER_VALIDATE_INT allows negative integers and values such as the float 1.0. It seems like the best function for validating an integer regardless of data type, but it doesn't work if you want only digits. Note: compare the result against FALSE with === rather than relying on a truthy/falsy check if 0 is to be considered a valid value.
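The differences are easier to see side by side. This is only a sketch: the classify function and the sample strings are hypothetical, and all inputs are strings, as they would be when coming from a form.

```php
<?php
// Hypothetical helper: reports how each function judges one string input.
function classify(string $raw): array
{
    return [
        'is_numeric'  => is_numeric($raw),
        'is_int'      => is_int($raw),
        // Compare against false with !==, since a valid '0' filters to int 0.
        'filter_int'  => filter_var($raw, FILTER_VALIDATE_INT) !== false,
        'ctype_digit' => ctype_digit($raw),
    ];
}

foreach (['42', '-5', '1e3', '4.2', 'abc', '0'] as $raw) {
    $row = array_map(fn ($ok) => $ok ? 'yes' : 'no', classify($raw));
    printf("%-5s %s\n", $raw, json_encode($row));
}
```

For string input, is_int is always false, is_numeric accepts '1e3' and '4.2', and FILTER_VALIDATE_INT accepts '-5'. Only ctype_digit limits the result to digit-only strings.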