PHP: Convert Numeric Character Reference to UTF8 – part 2


First, read PHP: Convert Numeric Character Reference to UTF8 – part 1

Here are two more ways to convert numeric character references to UTF-8:

Method 04

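function ncr_utf8_4($input){
    // Rewrite hexadecimal references (&#x41;) as decimal ones (&#65;).
    // Closures are used because create_function() was removed in PHP 8.0.
    $input = preg_replace_callback("/&#x([a-fA-F0-9]+);/", function ($m) { return "&#".hexdec($m[1]).";"; }, $input);
    // Decode each decimal reference. Note: the "HTML-ENTITIES" encoding is deprecated as of PHP 8.2.
    $input = preg_replace_callback("/(&#[0-9]+;)/", function ($m) { return mb_convert_encoding($m[1], "UTF-8", "HTML-ENTITIES"); }, $input);
    return $input;
}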
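
Since the "HTML-ENTITIES" encoding is deprecated in recent PHP versions, here is a minimal sketch of the same two-pass idea using mb_chr() instead. The function name ncr_utf8_sketch and the digit limits in the patterns are my own assumptions, not part of the original method, and mb_chr() needs PHP 7.2 or later:

```php
<?php
// Hypothetical helper (not from the original post): same two-pass idea as
// Method 04, but decoding with mb_chr() instead of "HTML-ENTITIES".
function ncr_utf8_sketch(string $input): string
{
    // Pass 1: rewrite hexadecimal references (&#x41;) as decimal ones (&#65;).
    // The 1-6 digit limit is an assumption that keeps hexdec() within integer range.
    $input = preg_replace_callback(
        '/&#x([a-fA-F0-9]{1,6});/',
        function (array $m): string {
            return '&#' . hexdec($m[1]) . ';';
        },
        $input
    );

    // Pass 2: turn each decimal reference into its UTF-8 character.
    return preg_replace_callback(
        '/&#([0-9]{1,7});/',
        function (array $m): string {
            $char = mb_chr((int) $m[1], 'UTF-8');
            // mb_chr() returns false for invalid code points; keep the text as-is then.
            return $char === false ? $m[0] : $char;
        },
        $input
    );
}

echo ncr_utf8_sketch('&#65;&#x42;'); // prints "AB"
```

References that do not name a valid code point are left untouched rather than dropped, which seems the safer default for user-supplied text.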

Method 05

function ncr_utf8_5($input){
    // Legacy version: relies on the /e modifier and create_function(), so it only runs on PHP 5.x.
    $_utf8 = create_function('$entity',
        '$convmap = array(0x0, 0x10000, 0, 0xfffff);return mb_decode_numericentity($entity, $convmap, "UTF-8");'
    );
    $input = preg_replace('/&#\d{2,5};/e', "\$_utf8('$0')", $input );
    $input = preg_replace('/&#x([a-fA-F0-9]{2,8});/e', "\$_utf8('&#'.hexdec('$1').';')", $input );
    return $input;
}

Update 08/23/2018:
If you see this error:

Deprecated: preg_replace(): The /e modifier is deprecated, use preg_replace_callback instead in ...php on line 8

use this version instead (the /e modifier was removed entirely in PHP 7.0):

function ncr_utf8_5($input){
    // Closure instead of create_function(), which was removed in PHP 8.0.
    $_utf8 = function ($entity) {
        $convmap = array(0x0, 0x10000, 0, 0xfffff); // start, end, offset, mask
        return mb_decode_numericentity($entity, $convmap, "UTF-8");
    };
    $input = preg_replace_callback('/&#\d{2,5};/', function($ms) use($_utf8){return $_utf8($ms[0]);}, $input );
    $input = preg_replace_callback('/&#x([a-fA-F0-9]{2,8});/', function($ms) use($_utf8){return $_utf8('&#'.hexdec($ms[1]).';');}, $input );
    return $input;
}
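
One thing to keep in mind: the convmap in Method 05 stops at 0x10000 and the decimal pattern only matches 2 to 5 digits, so references outside that range (for example &#128512;, or a single-digit one like &#9;) are left undecoded. Below is a rough sketch of a wider variant, assuming you want the whole Unicode range. The name ncr_utf8_full_range and the convmap values are my own choices, not part of the original method:

```php
<?php
// Hypothetical variant (not from the original post): widen the convmap so that
// code points above 0x10000 are decoded too.
function ncr_utf8_full_range(string $input): string
{
    // Normalize hexadecimal references (&#x1F600;) to decimal (&#128512;) first.
    $input = preg_replace_callback(
        '/&#x([a-fA-F0-9]{1,6});/',
        function (array $m): string {
            return '&#' . hexdec($m[1]) . ';';
        },
        $input
    );

    // Each convmap entry is [start, end, offset, mask]:
    // decode code points 0x0..0x10FFFF, apply no offset, keep all 21 bits.
    $convmap = [0x0, 0x10FFFF, 0, 0x1FFFFF];

    // One call handles every decimal reference in the string, so no
    // per-match callback is needed for this pass.
    return mb_decode_numericentity($input, $convmap, 'UTF-8');
}

echo ncr_utf8_full_range('caf&#233; &#128512;'); // prints "café 😀"
```

Calling mb_decode_numericentity() once on the whole string also avoids running a callback for every match.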

