
PHP filter_var() - FILTER_VALIDATE_URL

The FILTER_VALIDATE_URL filter seems to have some trouble validating non-ASCII URLs:

var_dump(filter_var('http://pt.wikipedia.org/wiki/', FILTER_VALIDATE_URL)); // http://pt.wikipedia.org/wiki/
var_dump(filter_var('http://pt.wikipedia.org/wiki/Guimarães', FILTER_VALIDATE_URL)); // false

Why isn't the last URL correctly validated? And what are the possible workarounds? Running PHP 5.3.0.

I'd also like to know where I can find the source code of the FILTER_VALIDATE_URL validation filter.

Alix Axel asked Jan 26 '10 01:01


2 Answers

Technically, that is not a valid URL according to section 5 of RFC 1738. Browsers automatically encode the ã character as %C3%A3 before sending the request to the server. The technically valid full URL here is:

http://pt.wikipedia.org/wiki/Guimar%C3%A3es

Pass that to the FILTER_VALIDATE_URL filter and it will work fine. The filter only validates according to the spec; it doesn't try to fix or encode characters for you.
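
As a minimal sketch of that point, the snippet below percent-encodes the non-ASCII bytes first and then validates the result. The helper name encode_non_ascii() is hypothetical (it is not part of PHP), and the sketch assumes the input string is UTF-8 encoded:

```php
<?php

// Hypothetical helper: percent-encode every non-ASCII byte, leave everything else untouched.
// Assumes $url is UTF-8, so "ã" is the two bytes 0xC3 0xA3.
function encode_non_ascii($url) {
    return preg_replace_callback(
        '/[\x80-\xFF]/',                       // byte-wise match (no /u modifier)
        function ($m) { return rawurlencode($m[0]); },
        $url
    );
}

// "\xC3\xA3" spells out the UTF-8 bytes of "ã" so the example does not depend on file encoding.
$raw     = "http://pt.wikipedia.org/wiki/Guimar\xC3\xA3es";
$encoded = encode_non_ascii($raw);

echo $encoded, "\n";                                   // http://pt.wikipedia.org/wiki/Guimar%C3%A3es
var_dump(filter_var($raw, FILTER_VALIDATE_URL));       // bool(false)
var_dump(filter_var($encoded, FILTER_VALIDATE_URL) !== false); // bool(true)
```

Only bytes outside the ASCII range are touched, so sequences that are already percent-encoded are left as they are.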

Rasmus answered Sep 18 '22 13:09


The following code uses filter_var() but percent-encodes any non-ASCII characters in the path before calling it. Hope this helps someone.

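<?php

// Percent-encode each path segment, then validate the resulting URL.
function validate_url($url) {
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        // No path to encode (or parse_url() failed): validate the URL as given.
        return filter_var($url, FILTER_VALIDATE_URL) !== false;
    }

    // rawurlencode() follows RFC 3986: a space becomes %20 rather than +.
    // Note: a path that is already percent-encoded is encoded again (% becomes %25).
    $encoded_path = implode('/', array_map('rawurlencode', explode('/', $path)));
    $url = str_replace($path, $encoded_path, $url);

    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

// example
if (!validate_url("http://somedomain.com/some/path/file1.jpg")) {
    echo "NOT A URL";
} else {
    echo "IS A URL";
}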

Huey Ly answered Sep 19 '22 13:09