
php substr() function with utf-8 leaves � marks at the end

Tags: php, utf-8, substr

Here is some simple code:

<?php
$var = "Бензин Офиси А.С. также производит все типы жира и смазок и их побочных        продуктов в его смесительных установках нефти машинного масла в Деринце, Измите, Алиага и Измире. У Компании есть 3 885 станций технического обслуживания, включая сжиженный газ (ЛПГ) станции под фирменным знаком Петрогаз, приблизительно 5 000 дилеров, двух смазочных смесительных установок, 12 терминалов, и 26 единиц поставки аэропорта.";
$foo = substr($var, 0, 142);
echo $foo;
?>

It outputs something like this:

Бензин Офиси А.С. также производит все типы жира и смазок и их побочных продук�...

I tried mb_substr() with no luck. How do I do this the right way?

Nazar asked Jan 31 '12 21:01



2 Answers

The comments above are correct, so long as you have the mbstring extension enabled on your server. Pass the encoding explicitly as the fourth argument:

$var = "Бензин Офиси А.С. также производит все типы жира и смазок и их побочных        продуктов в его смесительных установках нефти машинного масла в Деринце, Измите, Алиага и Измире. У Компании есть 3 885 станций технического обслуживания, включая сжиженный газ (ЛПГ) станции под фирменным знаком Петрогаз, приблизительно 5 000 дилеров, двух смазочных смесительных установок, 12 терминалов, и 26 единиц поставки аэропорта.";
$foo = mb_substr($var, 0, 142, "utf-8");

Here are the PHP docs:

http://php.net/manual/en/book.mbstring.php
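To illustrate why the original call leaves a � at the end: substr() counts bytes, while each Cyrillic letter takes two bytes in UTF-8, so a byte offset can land in the middle of a character. The following is a minimal sketch, assuming the source file is saved as UTF-8 and mbstring is loaded; the variable names are made up for the example. It also shows mb_strcut(), which may be useful when the limit really is a byte count.

```php
<?php
// Sketch: substr() counts bytes, mb_substr() counts characters.
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
// All variable names here are illustrative only.

$text = "Бензин Офиси";   // 11 Cyrillic letters + 1 space

$byteLength = strlen($text);               // bytes: 11 * 2 + 1 = 23
$charLength = mb_strlen($text, "UTF-8");   // characters: 12

// Cutting after 5 bytes splits the 2-byte letter "н" in half.
$broken = substr($text, 0, 5);             // "Бе" plus a dangling lead byte
$isBrokenValid = mb_check_encoding($broken, "UTF-8");   // false

// Passing the encoding explicitly avoids depending on the server default.
$clean = mb_substr($text, 0, 5, "UTF-8");  // "Бензи" (5 characters)

// mb_strcut() takes a byte length but backs off to a character boundary.
$byteSafe = mb_strcut($text, 0, 5, "UTF-8");   // "Бе" (4 bytes)

echo $clean, "\n", $byteSafe, "\n";
```

If mb_substr() appeared not to work in the question, one possible cause (an assumption, since the failing call is not shown) is that the encoding argument was omitted on a server whose default internal encoding was not UTF-8.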

Kai Qing answered Oct 01 '22 07:10


A proper (logical) alternative for Unicode strings:

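<?php
function substr_unicode($str, $s, $l = null) {
    // Split into an array of UTF-8 characters, slice it, and join it back.
    return join("", array_slice(
        preg_split("//u", $str, -1, PREG_SPLIT_NO_EMPTY), $s, $l));
}

$str = "Büyük";
$s = 0; // start from "0" (nth) char
$l = 3; // get "3" chars
echo substr($str, $s, $l) ."\n";    // Bü (3 bytes; "ü" takes 2 bytes)
echo mb_substr($str, $s, $l) ."\n"; // Bü if mbstring's internal encoding is a single-byte one such as ISO-8859-1; Büy if it is UTF-8
echo substr_unicode($str, $s, $l);  // Büy
?>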

See also the PHP manual entry for mb_substr().
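When mb_substr() is called without an encoding argument, the result depends on mbstring's internal encoding. A short sketch of setting it once with mb_internal_encoding(), assuming mbstring is loaded and the source file is UTF-8 (variable names are illustrative):

```php
<?php
// Sketch: make UTF-8 the default for mb_* calls that omit the encoding.
// Assumes the mbstring extension is loaded and this file is saved as UTF-8.

mb_internal_encoding("UTF-8");

$word = "Büyük";

$firstThree = mb_substr($word, 0, 3);   // "Büy": 3 characters, 4 bytes
$lastTwo    = mb_substr($word, -2);     // "ük": a negative offset counts from the end

echo $firstThree, "\n", $lastTwo, "\n";
```

With the internal encoding set this way, mb_substr() should give the same result as the substr_unicode() helper for this input, without the regular-expression split.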

Botir Ziyatov answered Oct 01 '22 09:10