esc_xml()

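💡 云策 documentation notes

Overview

esc_xml() was introduced in WordPress 5.5.0 and escapes text for output in XML blocks. It checks the input for invalid UTF-8, separates CDATA sections from the surrounding text, and escapes only the non-CDATA text as XML entities, leaving CDATA sections unchanged.

Key points

esc_xml() takes a single string parameter, $text, and returns the escaped string.
It first calls wp_check_invalid_utf8() to check the input for invalid UTF-8 and clean it up.
A regular expression splits the input into CDATA and non-CDATA parts; the non-CDATA parts are escaped with _wp_specialchars() using the ENT_XML1 flag.
The 'esc_xml' filter lets developers modify the escaped string before it is returned.
It is used mainly by the WordPress sitemaps feature, in the WP_Sitemaps_Renderer and WP_Sitemaps_Stylesheet classes.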
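
The CDATA handling described in the key points can be tried outside WordPress with a minimal sketch. The function name demo_esc_xml() is hypothetical, and PHP's built-in htmlspecialchars() is used here as a stand-in for _wp_specialchars(), so details such as double-encoding of existing entities may differ from the real function. The UTF-8 check and the 'esc_xml' filter are omitted.

```php
<?php
// Hypothetical stand-in for esc_xml() that runs without WordPress.
// Assumption: htmlspecialchars() replaces _wp_specialchars().
function demo_esc_xml( string $text ): string {
    $cdata = '\<\!\[CDATA\[.*?\]\]\>';
    // Same shape as the core pattern: text followed by a CDATA section,
    // or (alternative) a trailing run of text with no CDATA after it.
    $regex = '/(?=.*?' . $cdata . ')'
        . '(?<non_cdata_followed_by_cdata>.*?)'
        . '(?<cdata>' . $cdata . ')'
        . '|(?<non_cdata>.*)/s';

    return (string) preg_replace_callback(
        $regex,
        static function ( array $matches ): string {
            if ( isset( $matches['non_cdata'] ) ) {
                // No CDATA section follows: escape everything.
                return htmlspecialchars( $matches['non_cdata'], ENT_XML1 | ENT_QUOTES, 'UTF-8' );
            }
            // Escape the text before the CDATA section, keep the section as-is.
            return htmlspecialchars( $matches['non_cdata_followed_by_cdata'], ENT_XML1 | ENT_QUOTES, 'UTF-8' )
                . $matches['cdata'];
        },
        $text
    );
}

$escaped = demo_esc_xml( 'a & b <![CDATA[<x> & y]]> c < d' );
// $escaped is 'a &amp; b <![CDATA[<x> & y]]> c &lt; d'
```

The text inside the CDATA section keeps its raw "<" and "&", while the same characters outside it become entities.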

Code example

function esc_xml( $text ) {
    $safe_text = wp_check_invalid_utf8( $text );

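    $cdata_regex = '\<\!\[CDATA\[.*?\]\]\>';
    $regex       = <<<EOF
/
    (?=.*?{$cdata_regex})                 # lookahead that will match anything followed by a CDATA Section
    (?<non_cdata_followed_by_cdata>(.*?)) # the "anything" matched by the lookahead
    (?<cdata>({$cdata_regex}))            # the CDATA Section matched by the lookahead

|                                         # alternative

    (?<non_cdata>(.*))                    # non-CDATA Section
/sx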
EOF;

    $safe_text = (string) preg_replace_callback(
        $regex,
        static function ( $matches ) {
            if ( ! isset( $matches[0] ) ) {
                return '';
            }

            if ( isset( $matches['non_cdata'] ) ) {
                // escape HTML entities in the non-CDATA Section.
                return _wp_specialchars( $matches['non_cdata'], ENT_XML1 );
            }

            // Return the CDATA Section unchanged, escape HTML entities in the rest.
            return _wp_specialchars( $matches['non_cdata_followed_by_cdata'], ENT_XML1 ) . $matches['cdata'];
        },
        $safe_text
    );

    return apply_filters( 'esc_xml', $safe_text, $text );
}

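Notes

This function is available from WordPress 5.5.0 onward; it does not exist in earlier versions.
CDATA sections pass through unchanged and only the text outside them is escaped, so the XML stays well-formed.
The 'esc_xml' filter can change the escaping result. A callback that returns unescaped text reintroduces the risks the function is meant to prevent, so use it with care.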
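
Because esc_xml() is missing before 5.5.0, code that must also run on older installs could guard the call. The following is a sketch only: demo_xml_escape() is a hypothetical wrapper name, and the fallback uses plain htmlspecialchars(), which does not preserve CDATA sections the way esc_xml() does.

```php
<?php
// Hypothetical wrapper: use esc_xml() when it exists, else a simpler fallback.
function demo_xml_escape( string $text ): string {
    if ( function_exists( 'esc_xml' ) ) {
        return esc_xml( $text );
    }
    // Assumption: acceptable fallback for input without CDATA sections.
    return htmlspecialchars( $text, ENT_XML1 | ENT_QUOTES, 'UTF-8' );
}

$title = demo_xml_escape( '<a & b>' );
```

Outside WordPress, or on a version older than 5.5.0, the fallback branch runs and every "<", ">" and "&" is converted, including any inside a CDATA section.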

📄 Original content

Escaping for XML blocks.

Parameters

$text (string, required): Text to escape.

Return

string Escaped text.

Source

function esc_xml( $text ) {
	$safe_text = wp_check_invalid_utf8( $text );

	$cdata_regex = '\<\!\[CDATA\[.*?\]\]\>';
	$regex       = <<<EOF
/
	(?=.*?{$cdata_regex})                 # lookahead that will match anything followed by a CDATA Section
	(?<non_cdata_followed_by_cdata>(.*?)) # the "anything" matched by the lookahead
	(?<cdata>({$cdata_regex}))            # the CDATA Section matched by the lookahead

|	                                      # alternative

	(?<non_cdata>(.*))                    # non-CDATA Section
/sx
EOF;

	$safe_text = (string) preg_replace_callback(
		$regex,
		static function ( $matches ) {
			if ( ! isset( $matches[0] ) ) {
				return '';
			}

			if ( isset( $matches['non_cdata'] ) ) {
				// escape HTML entities in the non-CDATA Section.
				return _wp_specialchars( $matches['non_cdata'], ENT_XML1 );
			}

			// Return the CDATA Section unchanged, escape HTML entities in the rest.
			return _wp_specialchars( $matches['non_cdata_followed_by_cdata'], ENT_XML1 ) . $matches['cdata'];
		},
		$safe_text
	);

	/**
	 * Filters a string cleaned and escaped for output in XML.
	 *
	 * Text passed to esc_xml() is stripped of invalid or special characters
	 * before output. HTML named character references are converted to their
	 * equivalent code points.
	 *
	 * @since 5.5.0
	 *
	 * @param string $safe_text The text after it has been escaped.
	 * @param string $text      The text prior to being escaped.
	 */
	return apply_filters( 'esc_xml', $safe_text, $text );
}


Hooks

apply_filters( 'esc_xml', string $safe_text, string $text ): Filters a string cleaned and escaped for output in XML.
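
A callback for this hook receives the escaped text and the original text. The sketch below is illustrative only: demo_strip_xml_control_chars() is a hypothetical callback, and the small add_filter() / apply_filters() stand-ins exist only so the snippet runs outside WordPress (they ignore priority). Inside WordPress the real functions are used and esc_xml() applies the filter itself.

```php
<?php
// Stand-ins for running outside WordPress; skipped when WordPress is loaded.
if ( ! function_exists( 'add_filter' ) ) {
    $GLOBALS['demo_filters'] = array();

    function add_filter( $hook, $callback, $priority = 10, $accepted_args = 1 ) {
        $GLOBALS['demo_filters'][ $hook ][] = array( $callback, $accepted_args );
        return true;
    }

    function apply_filters( $hook, $value, ...$args ) {
        foreach ( $GLOBALS['demo_filters'][ $hook ] ?? array() as list( $callback, $accepted_args ) ) {
            $all   = array_merge( array( $value ), $args );
            $value = $callback( ...array_slice( $all, 0, $accepted_args ) );
        }
        return $value;
    }
}

// Hypothetical callback: drop control characters that XML 1.0 does not allow.
function demo_strip_xml_control_chars( $safe_text, $text ) {
    return preg_replace( '/[\x00-\x08\x0B\x0C\x0E-\x1F]/', '', $safe_text );
}
add_filter( 'esc_xml', 'demo_strip_xml_control_chars', 10, 2 );

// Simulates the final line of esc_xml() with already-escaped text.
$filtered = apply_filters( 'esc_xml', "a\x07b &amp; c", "a\x07b & c" );
// $filtered is 'ab &amp; c'
```

The callback only removes characters from the already-escaped text, so it cannot undo the escaping done earlier.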

Uses	File	Description
wp_check_invalid_utf8()	wp-includes/formatting.php	Checks for invalid UTF8 in a string.
_wp_specialchars()	wp-includes/formatting.php	Converts a number of special characters into their HTML entities.
apply_filters()	wp-includes/plugin.php	Calls the callback functions that have been added to a filter hook.
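
For readers without a WordPress install, the kind of check that wp_check_invalid_utf8() performs can be approximated with PCRE's "u" modifier. This is a standalone approximation under that assumption, not the core implementation; demo_is_valid_utf8() is a hypothetical name.

```php
<?php
// Hypothetical helper: true when $text is valid UTF-8.
// With the "u" modifier, preg_match() returns false on malformed input.
function demo_is_valid_utf8( string $text ): bool {
    return '' === $text || 1 === preg_match( '//u', $text );
}

$ok  = demo_is_valid_utf8( "caf\xC3\xA9" ); // valid two-byte sequence for "é"
$bad = demo_is_valid_utf8( "caf\xE9" );     // lone Latin-1 byte, invalid UTF-8
```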


Used by	File	Description
WP_Sitemaps_Renderer::get_sitemap_index_xml()	wp-includes/sitemaps/class-wp-sitemaps-renderer.php	Gets XML for a sitemap index.
WP_Sitemaps_Renderer::get_sitemap_xml()	wp-includes/sitemaps/class-wp-sitemaps-renderer.php	Gets XML for a sitemap.
WP_Sitemaps_Renderer::check_for_simple_xml_availability()	wp-includes/sitemaps/class-wp-sitemaps-renderer.php	Checks for the availability of the SimpleXML extension and errors if missing.
WP_Sitemaps_Stylesheet::get_sitemap_stylesheet()	wp-includes/sitemaps/class-wp-sitemaps-stylesheet.php	Returns the escaped XSL for all sitemaps, except index.
WP_Sitemaps_Stylesheet::get_sitemap_index_stylesheet()	wp-includes/sitemaps/class-wp-sitemaps-stylesheet.php	Returns the escaped XSL for the index sitemaps.

Changelog

Version	Description
5.5.0	Introduced.

云策 WordPress Developer Community
