PHP: Convert JavaScript Escape Strings to UTF-8


Read first: PHP: Convert Numeric Character Reference to UTF8

Example 1: U+0621: ARABIC LETTER HAMZA

Java Escape: “\u0621”
Javascript Escape: “\u0621”

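$str = '\u0621';
$str = strtolower($str); // normalize the hex digits to lowercase
// Turn \uXXXX into the numeric character reference &#xXXXX;
$str = preg_replace('/^\\\\u([0-9a-f]{4})$/', '&#x$1;', $str);

// ncr_utf8_2() is not defined in this post; see the article listed under "Read first".
echo ncr_utf8_2($str);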

result:
ء
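
To make the last step less opaque, here is a minimal sketch of how a code point such as U+0621 becomes UTF-8 bytes. It is only a stand-in for illustration: codepoint_to_utf8() is a hypothetical helper name, not part of this post or of the ncr_utf8_2() article, and it assumes the input is a valid code point in the range 0 to 0x10FFFF.

```php
<?php
// Hypothetical helper, not part of the original post: encodes one
// Unicode code point (assumed to be in 0..0x10FFFF) as UTF-8 bytes.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {            // 1 byte:  0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {           // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// The four hex digits of \u0621 are the code point itself.
$utf8 = codepoint_to_utf8(hexdec('0621'));
echo bin2hex($utf8), "\n"; // prints "d8a1", the UTF-8 bytes of U+0621
```

Echoing $utf8 directly in a UTF-8 page or terminal should show the same ء as above.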

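We can wrap this in a function that accepts both the \uXXXX and the %uXXXX forms:

function je2utf8($jed)
{
    // preg_replace_callback() is used because the /e modifier was removed in PHP 7.
    // The character class accepts upper- and lowercase hex digits.
    return preg_replace_callback('/(?:\\\\|%)u([0-9A-Fa-f]{4})/', function ($m) {
        return ncr_utf8_2('&#x' . $m[1] . ';');
    }, $jed);
}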

Example 2:

$encoded = "%u0E2A%u0E32%u0E23%u0E32%u0E19%u0E38%u0E01%u0E23%u0E21%u0E40%u0E2A%u0E23%u0E35";
echo je2utf8($encoded);

result:
สารานุกรมเสรี

Alternative functions:

function je2utf8_2($jed)
{
    // Requires mbstring. The 'HTML-ENTITIES' encoding is deprecated as of PHP 8.2.
    return preg_replace_callback('/(?:\\\\|%)u([0-9A-Fa-f]{4})/', function ($m) {
        return mb_convert_encoding('&#' . hexdec($m[1]) . ';', 'UTF-8', 'HTML-ENTITIES');
    }, $jed);
}
function je2utf8_3($jed)
{
    // html_entity_decode() accepts the hexadecimal reference directly.
    return preg_replace_callback('/(?:\\\\|%)u([0-9A-Fa-f]{4})/', function ($m) {
        return html_entity_decode('&#x' . $m[1] . ';', ENT_QUOTES, 'UTF-8');
    }, $jed);
}
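
One case the functions above do not cover: JavaScript strings are UTF-16, so a character above U+FFFF is written as two escapes (a surrogate pair), and converting each escape on its own gives the wrong result. The following is a rough sketch of one way to handle that, assuming the mbstring extension is loaded. The name je2utf8_utf16() is made up for this example. It matches a whole run of escapes at once and decodes the hex digits as UTF-16BE.

```php
<?php
// Hypothetical variant, not from the original post. Assumes mbstring.
// Decodes runs of \uXXXX or %uXXXX escapes as UTF-16BE code units,
// so surrogate pairs are combined into a single character.
function je2utf8_utf16(string $jed): string
{
    return preg_replace_callback(
        '/(?:(?:\\\\|%)u[0-9A-Fa-f]{4})+/',
        function (array $m): string {
            // Drop the "\u" / "%u" prefixes, keeping only the hex digits.
            $hex = preg_replace('/(?:\\\\|%)u/', '', $m[0]);
            // pack('H*') turns the hex digits into raw big-endian bytes.
            return mb_convert_encoding(pack('H*', $hex), 'UTF-8', 'UTF-16BE');
        },
        $jed
    );
}

// U+1F600 is written in JavaScript as the surrogate pair \uD83D\uDE00.
echo bin2hex(je2utf8_utf16('\uD83D\uDE00')), "\n"; // prints "f09f9880"
```

For input that contains only characters up to U+FFFF, such as the Thai string in Example 2, this should return the same result as je2utf8().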
