
file_get_contents() and unicode characters in domain (like æøå)

Tags: php, unicode

Whenever I try to grab a page's content using file_get_contents() and the domain contains a Unicode character, I get this error:

file_get_contents(https://møller.dk/): failed to open stream: php_network_getaddresses: getaddrinfo failed: Name of service not known in >FILE LOCATION<

This only happens when the domain contains a Unicode character. Here's an example:

file_get_contents("http://møller.dk/");
MortenMoulder asked Nov 17 '16 19:11




2 Answers

You need to convert the domain to its ASCII (Punycode) form with the idn_to_ascii() function, which is provided by PHP's intl extension:

file_get_contents('http://' . idn_to_ascii('møller.dk'));

Reference:

  • http://php.net/manual/en/function.idn-to-ascii.php
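As a rough sketch of the same idea with basic error handling: the helper name hostToAscii() below is hypothetical (it is not part of PHP), and the code assumes the intl extension is loaded. The UTS #46 variant is passed explicitly because the default variant differs between PHP versions.

```php
<?php
// Hypothetical helper: converts an internationalized host name to its
// ASCII (Punycode) form. Assumes the intl extension is available.
function hostToAscii(string $host): string
{
    // Pass the UTS #46 variant explicitly; the default depends on the PHP version.
    $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

    // idn_to_ascii() returns false when the name cannot be converted.
    if ($ascii === false) {
        throw new InvalidArgumentException("Cannot convert host to ASCII: $host");
    }

    return $ascii;
}

// Build the URL from the converted host; the result starts with "http://xn--".
$url = 'http://' . hostToAscii('møller.dk') . '/';

// The actual request is left commented out so the sketch runs without network access:
// $html = file_get_contents($url);
```

Only the host part of a URL should go through idn_to_ascii(); the scheme, path, and query string are handled separately.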
cmorrissey answered Oct 13 '22 02:10


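You can use a Punycode library, which encodes and decodes IDNA names. The Punycode class below is not built into PHP and has to come from a third-party package: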

$Punycode = new Punycode();
$baseUrl = 'ærlig.no';
$url = 'http://'.$Punycode->encode($baseUrl);

echo file_get_contents($url);
Felippe Duarte answered Oct 13 '22 02:10