Since I couldn't find a simple solution using built-in PHP functions, I wrote two functions: one checks whether the entered string can be a number at all, and the other converts it while checking that it is well-formed for the separators used.
I restricted the possible thousands separators to period (.), comma (,), space ( ) and apostrophe ('). The decimal point may only be one of the first two options. Both sets of separators can be edited to allow more or to restrict the ones in place.
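As a rough sketch of where those two sets live: the functions in this answer use inline array literals, so the variable names and the helper `is_allowed_separator` below are hypothetical and only show how the sets could be pulled out and edited in one place.

```php
<?php
// Hypothetical names; the functions in this answer use inline array literals.
$thousands_separators = array("'", ".", ",", " ");
$decimal_separators = array(".", ",");

// Hypothetical helper: true if $char is contained in the given set.
function is_allowed_separator($char, $allowed) {
    return in_array($char, $allowed, true);
}

var_dump(is_allowed_separator(" ", $thousands_separators)); // bool(true)
var_dump(is_allowed_separator(" ", $decimal_separators));   // bool(false)
```

Adding or removing a character in one of the arrays is all that is needed to widen or narrow what is accepted.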
What I am actually doing is looking for all number columns (runs of digits) and all separators with a couple of simple preg_match_all calls.
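To illustrate what those two calls collect, here is a minimal sketch using one of the sample inputs from this answer. The variable names `$input`, `$groups` and `$separators` are only for illustration.

```php
<?php
// Illustration only: split a sample input into digit groups and separators.
$input = "2'000'000,25";

// Every run of consecutive digits (the "number columns").
preg_match_all('/([0-9]+)/', $input, $num, PREG_PATTERN_ORDER);
// Every single non-digit character (the separators).
preg_match_all('/([^0-9]{1})/', $input, $sep, PREG_PATTERN_ORDER);

$groups = $num[0];     // array("2", "000", "000", "25")
$separators = $sep[0]; // array("'", "'", ",")

print_r($groups);
print_r($separators);
```

With PREG_PATTERN_ORDER, index 0 of the result holds the full matches, which is why both functions read from `$num[0]` and `$sep[0]`.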
The complete code reads as follows and should be self-explanatory, as I added a comment wherever a function returns false. I'm sure this can be simplified somehow, but it works right now and filters out many errors while still allowing some unusual combinations such as 2 000 000.25 or 2'000'000,25.
// Returns true if $number may be a number at all, false otherwise.
function check_number($number) {
    if ((int) substr($number,0,1) == 0) {
        return false; // not starting with a digit greater than 0
    }
    if ((string) substr($number,-1) != "0" && (int) substr($number,-1) == 0) {
        return false; // not ending with a digit
    }
    preg_match_all('/([^0-9]{2,})/', $number, $sep, PREG_PATTERN_ORDER);
    if (isset($sep[0][0])) {
        return false; // more than one consecutive non-digit character
    }
    preg_match_all('/([^0-9]{1})/', $number, $sep, PREG_PATTERN_ORDER);
    if (count($sep[0]) > 2 && count(array_unique($sep[0])) > 2) {
        return false; // more than 2 different separators
    }
    elseif (count($sep[0]) > 2) {
        $last_sep = array_pop($sep[0]);
        if (!in_array($last_sep,array(".",","))) {
            return false; // separator not allowed as last one
        }
        $sep_unique = array_unique($sep[0]);
        if (count($sep_unique) > 1) {
            return false; // not all separators (except last one) are identical
        }
        elseif (!in_array($sep_unique[0],array("'",".",","," "))) {
            return false; // separator not allowed
        }
    }
    return true;
}
// Converts a checked string to int or float; returns -1 if the thousands grouping is malformed.
function convert_number($number) {
    preg_match_all('/([0-9]+)/', $number, $num, PREG_PATTERN_ORDER);
    preg_match_all('/([^0-9]{1})/', $number, $sep, PREG_PATTERN_ORDER);
    if (count($sep[0]) == 0) {
        // no separator, integer
        return (int) $num[0][0];
    }
    elseif (count($sep[0]) == 1) {
        // one separator, look at the last number column
        if (strlen($num[0][1]) == 3) {
            if (strlen($num[0][0]) <= 3) {
                // treat as thousands separator
                return (int) ($num[0][0] * 1000 + $num[0][1]);
            }
            elseif (strlen($num[0][0]) > 3) {
                // must be decimal point
                return (float) ($num[0][0] + $num[0][1] / 1000);
            }
        }
        else {
            // must be decimal point
            return (float) ($num[0][0] + $num[0][1] / pow(10,strlen($num[0][1])));
        }
    }
    else {
        // multiple separators, compare first and last
        if ($sep[0][0] == end($sep[0])) {
            // same character, only thousands separators, check well-formed nums
            $value = 0;
            foreach($num[0] AS $p => $n) {
                if ($p == 0 && strlen($n) > 3) {
                    return -1; // malformed number, incorrect thousands grouping
                }
                elseif ($p > 0 && strlen($n) != 3) {
                    return -1; // malformed number, incorrect thousands grouping
                }
                $value += $n * pow(10, 3 * (count($num[0]) - 1 - $p));
            }
            return (int) $value;
        }
        else {
            // mixed characters, thousands separators and decimal point
            $decimal_part = array_pop($num[0]);
            $value = 0;
            foreach($num[0] AS $p => $n) {
                if ($p == 0 && strlen($n) > 3) {
                    return -1; // malformed number, incorrect thousands grouping
                }
                elseif ($p > 0 && strlen($n) != 3) {
                    return -1; // malformed number, incorrect thousands grouping
                }
                $value += $n * pow(10, 3 * (count($num[0]) - 1 - $p));
            }
            return (float) ($value + $decimal_part / pow(10,strlen($decimal_part)));
        }
    }
}
I am aware of one flaw in this set of functions: 1.234 or 1,234 will always be treated as the whole number 1234, because the function assumes a single separator is a thousands separator when it is preceded by fewer than four digits and followed by exactly three.
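The rule behind that ambiguity can be sketched in isolation. The helper `classify_single_separator` below is hypothetical and not part of the functions above; it only mirrors the single-separator branch of convert_number to show which inputs end up as whole numbers.

```php
<?php
// Hypothetical helper: reports how a single separator between two digit
// groups is interpreted, mirroring the single-separator branch above.
function classify_single_separator($before, $after) {
    if (strlen($after) == 3 && strlen($before) <= 3) {
        return "thousands"; // e.g. 1.234 or 1,234 is read as 1234
    }
    return "decimal"; // e.g. 1234.567 or 1.5
}

echo classify_single_separator("1", "234"), "\n";    // thousands
echo classify_single_separator("1234", "567"), "\n"; // decimal
echo classify_single_separator("1", "5"), "\n";      // decimal
```

Only the lengths of the two digit groups decide the outcome, so a value such as 1.234 that was meant as a decimal cannot be told apart from a grouped thousand without extra context such as a known locale.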