Question

Here's a short script that tests a string against a regular expression that should match only ASCII characters:

<?php
$test = 'æhrzBGFX029!^%/\#,.';
if (preg_match('/^[[:ascii:]]*$/u', $test)) {
    echo 'ERR: this shouldn\'t have matched: \'' . $test . '\'';
} else {
    echo 'OK';
}

On Ubuntu this behaves correctly (OK is printed). On Mac OS X (Mavericks), however, the pattern matches and the error message is printed (ERR: this shouldn't have matched).

I can't figure out why this is. Any ideas?

EDIT: The OS X locale settings are:

LANG="en_US"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"

The locale settings on an Ubuntu box where it behaves correctly (prints OK) are:

LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Solution

It turns out that the [:ascii:] class is evaluated differently on the two installations (I'm not sure whether the difference comes from the operating system, PHP, brew, or something else).

Therefore, in this instance the issue can be solved by replacing the pattern /^[[:ascii:]]*$/u with /^[\x00-\x7F]*$/u, which spells out the ASCII range explicitly. The full code is then:

<?php
$test = 'æhrzBGFX029!^%/\#,.';
if (preg_match('/^[\x00-\x7F]*$/u', $test)) {
    echo 'ERR: this shouldn\'t have matched: \'' . $test . '\'';
} else {
    echo 'OK';
}
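
If the only goal is to reject any non-ASCII input, the same explicit range can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the name is_ascii_only is hypothetical, the scalar type declarations assume PHP 7 or later, and the pattern deliberately omits the /u modifier so the string is checked byte by byte.

```php
<?php
// Hypothetical helper (not from the original post): returns true when $s
// contains only bytes in the range 0x00-0x7F.
function is_ascii_only(string $s): bool
{
    // No /u modifier: the subject is scanned byte by byte. Every byte of a
    // multi-byte UTF-8 sequence is >= 0x80, so the negated range finds it.
    return preg_match('/[^\x00-\x7F]/', $s) === 0;
}

// "\xC3\xA6" is the UTF-8 encoding of æ, written as escapes so the example
// does not depend on the encoding of the source file.
var_dump(is_ascii_only("\xC3\xA6hrzBGFX029!^%/\\#,.")); // bool(false)
var_dump(is_ascii_only('hrzBGFX029!^%/\#,.'));          // bool(true)
```

Because this version does not rely on a named character class, it should not be affected by whatever makes [:ascii:] behave differently between the two machines.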

Other suggestions

This could be caused by differences between the locales of the two operating systems.

From O'Reilly's Programming PHP:

In particular, what constitutes a "letter" varies from language to language (think of à and ç), and there are character classes in POSIX regular expressions that take this into account.

...

POSIX defines a number of named sets of characters that you can use in character classes. [...] The actual letters vary from locale to locale.

http://docstore.mik.ua/orelly/webprog/php/ch04_09.htm
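
To compare the two machines, it may help to print the locale and library versions next to the result of each pattern. The following is only a diagnostic sketch under stated assumptions: the function name regex_environment_report is hypothetical, and it does not by itself determine whether the locale, PHP, or the PCRE library is responsible.

```php
<?php
// Hypothetical diagnostic helper (not from the original post): collects the
// values that are most likely to differ between two installations.
function regex_environment_report(string $subject): array
{
    return [
        'php_version'  => PHP_VERSION,
        // Version of the PCRE library that PHP was built against.
        'pcre_version' => PCRE_VERSION,
        // Passing "0" queries the current LC_CTYPE locale without changing it.
        'lc_ctype'     => setlocale(LC_CTYPE, '0'),
        // Result of the named class from the question.
        'posix_ascii_matches' => preg_match('/^[[:ascii:]]*$/u', $subject) === 1,
        // Result of the explicit range from the accepted solution.
        'hex_range_matches'   => preg_match('/^[\x00-\x7F]*$/u', $subject) === 1,
    ];
}

// "\xC3\xA6" is the UTF-8 encoding of æ.
print_r(regex_environment_report("\xC3\xA6hrzBGFX029!^%/\\#,."));
```

Running this on both systems and comparing the output line by line should show whether the locale or the PCRE version differs when the two patterns disagree.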

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow