_canonical_charset()
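Yunce documentation notes
Overview
_canonical_charset() returns the canonical form of a charset name, suitable for passing to PHP functions such as htmlspecialchars() or for use in charset HTML attributes. It normalizes only the UTF-8 and ISO-8859-1 spellings it recognizes; any other name is returned unchanged.
Key points
- The function accepts one string parameter, $charset, which is a charset name such as "UTF-8", "Windows-1252", or "SJIS".
- It returns the canonical charset string: 'UTF-8' for names recognized as UTF-8, 'ISO-8859-1' for "iso-8859-1" or "iso8859-1" in any letter case, and the input itself for everything else.
- Internally it calls is_utf8_charset() to detect UTF-8 names.
- It was introduced in WordPress 3.6.0 and is used by _wp_die_process_input() and _wp_specialchars().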
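The mapping described in the points above can be tried outside WordPress with a small sketch. The names demo_is_utf8_charset() and demo_canonical_charset() are hypothetical stand-ins, not WordPress APIs, and the sketch assumes that is_utf8_charset() matches "UTF-8" and "UTF8" case-insensitively.

```php
<?php
// Hypothetical stand-in for is_utf8_charset(). Assumption: the real function
// treats "UTF-8" and "UTF8" (any letter case) as UTF-8.
function demo_is_utf8_charset( $charset ) {
	if ( ! is_string( $charset ) ) {
		return false;
	}
	return 0 === strcasecmp( 'UTF-8', $charset ) || 0 === strcasecmp( 'UTF8', $charset );
}

// Mirrors the branching of _canonical_charset() as listed in this document.
function demo_canonical_charset( $charset ) {
	if ( demo_is_utf8_charset( $charset ) ) {
		return 'UTF-8';
	}
	if ( 0 === strcasecmp( 'iso-8859-1', $charset ) || 0 === strcasecmp( 'iso8859-1', $charset ) ) {
		return 'ISO-8859-1';
	}
	// Anything else, including "latin1", passes through untouched.
	return $charset;
}

// Print each sample name next to its canonical form.
foreach ( array( 'utf8', 'Utf-8', 'iso8859-1', 'latin1', 'Windows-1252' ) as $name ) {
	printf( "%-14s => %s\n", $name, demo_canonical_charset( $name ) );
}
```

Under these assumptions, 'utf8' and 'Utf-8' come out as UTF-8, 'iso8859-1' comes out as ISO-8859-1, and 'latin1' and 'Windows-1252' are printed unchanged.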
Code example
function _canonical_charset( $charset ) {
	if ( is_utf8_charset( $charset ) ) {
		return 'UTF-8';
	}

	/*
	 * Normalize the ISO-8859-1 family of languages.
	 *
	 * This is not required for htmlspecialchars(), as it properly recognizes all of
	 * the input character sets that here are transformed into "ISO-8859-1".
	 *
	 * @todo Should this entire check be removed since it's not required for the stated purpose?
	 * @todo Should WordPress transform other potential charset equivalents, such as "latin1"?
	 */
	if (
		( 0 === strcasecmp( 'iso-8859-1', $charset ) ) ||
		( 0 === strcasecmp( 'iso8859-1', $charset ) )
	) {
		return 'ISO-8859-1';
	}

	return $charset;
}
Notes
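- The function exists to normalize charset names for PHP functions, but its source comment states that the ISO-8859-1 normalization is not required for htmlspecialchars(). Two @todo comments ask whether that check should be removed and whether other equivalents such as "latin1" should be handled.
- The function does not validate its input. A name it does not recognize, valid or not, is returned as-is, so callers should pass a valid charset name.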
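Because unrecognized names pass straight through, a caller may want its own guard before handing the result to htmlspecialchars(). The following is a minimal sketch of one possible approach, not something WordPress does: demo_canonicalize() and demo_escape_html() are hypothetical helpers, the two-entry allow-list is an assumption chosen for illustration, and the alias table is a local stand-in used only when _canonical_charset() is not loaded.

```php
<?php
// Hypothetical helper: defers to WordPress when it is loaded, otherwise
// falls back to a local stand-in covering the same two charset families.
function demo_canonicalize( $charset ) {
	if ( function_exists( '_canonical_charset' ) ) {
		return _canonical_charset( $charset );
	}
	$aliases = array(
		'utf-8'      => 'UTF-8',
		'utf8'       => 'UTF-8',
		'iso-8859-1' => 'ISO-8859-1',
		'iso8859-1'  => 'ISO-8859-1',
	);
	$key = strtolower( (string) $charset );
	return isset( $aliases[ $key ] ) ? $aliases[ $key ] : $charset;
}

// Hypothetical helper: escapes text, using UTF-8 whenever the canonical
// name is not on the (assumed) allow-list.
function demo_escape_html( $text, $charset ) {
	$allowed   = array( 'UTF-8', 'ISO-8859-1' );
	$canonical = demo_canonicalize( $charset );
	if ( ! in_array( $canonical, $allowed, true ) ) {
		$canonical = 'UTF-8';
	}
	return htmlspecialchars( $text, ENT_QUOTES, $canonical );
}

echo demo_escape_html( '<a href="x">', 'iso8859-1' ), "\n";
echo demo_escape_html( '<b>', 'not-a-charset' ), "\n";
```

Falling back to UTF-8 is a design choice of this sketch. A stricter caller might reject the unknown name instead of substituting a default.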
Original content
Retrieves a canonical form of the provided charset appropriate for passing to PHP functions such as htmlspecialchars() and charset HTML attributes.
Parameters
$charset (string, required)
A charset name, e.g. “UTF-8”, “Windows-1252”, “SJIS”.
Source
function _canonical_charset( $charset ) {
	if ( is_utf8_charset( $charset ) ) {
		return 'UTF-8';
	}

	/*
	 * Normalize the ISO-8859-1 family of languages.
	 *
	 * This is not required for htmlspecialchars(), as it properly recognizes all of
	 * the input character sets that here are transformed into "ISO-8859-1".
	 *
	 * @todo Should this entire check be removed since it's not required for the stated purpose?
	 * @todo Should WordPress transform other potential charset equivalents, such as "latin1"?
	 */
	if (
		( 0 === strcasecmp( 'iso-8859-1', $charset ) ) ||
		( 0 === strcasecmp( 'iso8859-1', $charset ) )
	) {
		return 'ISO-8859-1';
	}

	return $charset;
}
Changelog
| Version | Description |
|---|---|
| 3.6.0 | Introduced. |