How can I change whether NumberFormatter::parseCurrency() accepts a regular space or a non-breaking space?

I'm trying to parse localized currency strings into a currency code and a float value.

Everything worked well for a while, but now we are experiencing some problems. It seems that NumberFormatter::parseCurrency expects an additional invisible character:

Test code:

<?php
$formatter = new NumberFormatter("de_DE", NumberFormatter::CURRENCY);
var_dump(array(
    $formatter->parseCurrency("88,22\xC2\xA0€", $curr), // NO-BREAK SPACE, taken from output of $formatter->format(88.22)
    $formatter->parseCurrency("88,22 €", $curr), // regular space (0x20), input with keyboard
    $formatter->parseCurrency("88,22 \xE2\x82\xAc", $curr), // just a test
    $formatter->format(88.22),
    "88,22 €" // keyboard input
));

Output:

array(5) {
  [0]=> float(88,22)
  [1]=> bool(false)
  [2]=> bool(false)
  [3]=> string(10) "88,22 €" // this works as input
  [4]=> string(9) "88,22 €" // this does not
}

As you can see, outputs 3 and 4 differ in string length (10 vs. 9 bytes).

I get the same results on PHP 5.3 (Ubuntu with mbstring enabled) and 5.4 (Zend Server on Mac OS X).

The main problem is that the input values from my form (ZF1 application) are identical to the output at index 4.

Any suggestions? Thanks in advance.

Edit1:

Hexdump of the working value:

00000000  38 38 2c 32 32 c2 a0 e2  82 ac 0a                 |88,22......|
0000000b

Hexdump of the non-working value:

00000000  38 38 2c 32 32 20 e2 82  ac 0a                    |88,22 ....|
0000000a

Edit2:

It seems to be a problem with the whitespace used. c2 a0 is NO-BREAK SPACE and is (maybe?) required by NumberFormatter::parseCurrency(), but 0x20 is the regular space, which is what gets entered in the input form. My current workaround is to replace each regular space with a NO-BREAK SPACE: $value = str_replace("\x20", "\xC2\xA0", $value);

Edit3:

On another system (Mac OS X with Zend Server 5.6, mbstring enabled, PHP 5.3.14) everything works as expected:

array(5) {
  [0]=> float(88,22)
  [1]=> float(88,22)
  [2]=> float(88,22)
  [3]=> string(9) "88,22 €"
  [4]=> string(9) "88,22 €"
}

Edit4:

The main difference between the configuration that accepts a regular space and the one that requires a no-break space is the ICU version:

Working version (accepts a regular space):

intl

Internationalization support => enabled
version => 1.1.0
ICU version => 3.8.1

Directive => Local Value => Master Value
intl.default_locale => no value => no value
intl.error_level => 0 => 0

Non-working version (requires a no-break space):

intl

Internationalization support => enabled
version => 1.1.0
ICU version => 4.8.1.1
ICU Data version => 4.8.1

Directive => Local Value => Master Value
intl.default_locale => no value => no value
intl.error_level => 0 => 0

asked May 08 '13 10:05 by nofreeusername


1 Answer

NumberFormatter::parseCurrency is a thin wrapper around the ICU library function unum_parseDoubleCurrency (see source).

The ICU library function is restrictive in that it will only parse strings that could result from its counterpart, unum_formatDoubleCurrency. The format is driven by the Unicode locale data, which for this locale specifies a non-breaking space between the numeric value and the currency symbol. Evidently the earlier version of the library accepted other whitespace characters.
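
One way to see what your own ICU build expects is to format a value and look at the bytes it produces. The following is only a minimal sketch, assuming the intl extension is loaded. hexBytes() is a hypothetical helper name, not part of intl, and the separator you see depends on the ICU and locale data installed on your system:

```php
<?php
// Hypothetical helper: render a string as space-separated hex bytes,
// so invisible characters such as NO-BREAK SPACE (c2 a0) become visible.
function hexBytes($value)
{
    return implode(' ', str_split(bin2hex($value), 2));
}

// Only runs when the intl extension is available.
if (class_exists('NumberFormatter')) {
    $formatter = new NumberFormatter('de_DE', NumberFormatter::CURRENCY);
    $formatted = $formatter->format(88.22);

    // Show the exact bytes the formatter emits for this locale.
    echo hexBytes($formatted), "\n";

    // Round trip: parse exactly what was just formatted.
    $currency = null;
    var_dump($formatter->parseCurrency($formatted, $currency), $currency);
}
```

Comparing that hex output with the bytes of your form input should show directly whether the separator is the part that differs.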

In short, you can't make NumberFormatter::parseCurrency accept spaces. However, Zend_Currency should also output non-breaking spaces by default:

$currency = new Zend_Currency(array(
     'currency' => 'EUR',
     'value'    => 88.22,
), 'de_DE');

var_dump(
    strval($currency),             // 88,22 €
    strpos($currency, "\x20"),     // false
    strpos($currency, "\xc2\xa0")  // 5
);

The question is which part of your application is outputting a regular space, and how you address it. You mention it comes from your form, so you could have the form return the currency and the value as separate fields, so that you don't have to parse a combined string at all. If users are typing the string "88,22 €" themselves, you could run into more problems than just the whitespace issue. Having said that, the workaround you mention (replacing \x20 with \xc2\xa0) is the only way to address it if you want to use NumberFormatter.
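
If you stay with the workaround, it may help to keep it in one place. This is a rough sketch rather than a definitive implementation: normalizeCurrencySpaces() and parseLocalizedCurrency() are hypothetical names, the input is assumed to be UTF-8, and replacing every regular space is an assumption that suits "88,22 €" but may not suit every locale or input:

```php
<?php
// Hypothetical helper: trim the input and turn each regular space (0x20)
// into a UTF-8 NO-BREAK SPACE (0xC2 0xA0), as in the workaround above.
function normalizeCurrencySpaces($value)
{
    return str_replace("\x20", "\xC2\xA0", trim($value));
}

// Hypothetical wrapper around NumberFormatter::parseCurrency().
// Returns array(amount, currencyCode) on success, or false on failure.
// Requires the intl extension when called.
function parseLocalizedCurrency($value, $locale)
{
    $formatter = new NumberFormatter($locale, NumberFormatter::CURRENCY);
    $currency = null;
    $amount = $formatter->parseCurrency(normalizeCurrencySpaces($value), $currency);

    return $amount === false ? false : array($amount, $currency);
}
```

In a ZF1 form you would call something like parseLocalizedCurrency($form->getValue('price'), 'de_DE'), where 'price' is a made-up field name, and treat a false return as a validation error.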

answered Nov 02 '22 05:11 by cmbuckley