Function Documentation

get_html_split_regex()

💡 云策 Documentation Notes

Overview

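get_html_split_regex() returns the regular expression used to split text at HTML elements, comments, and CDATA sections. The pattern string is built once and cached in a static variable, so later calls in the same request return it without rebuilding it.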
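The static-variable caching described here can be sketched in isolation. This is a minimal illustration, not WordPress code: demo_cached_pattern() and $build_count are hypothetical names, and the pattern is a simplified stand-in for the real regex.

```php
<?php
// Hypothetical helper, not part of WordPress: shows the static-cache pattern only.
function demo_cached_pattern( &$build_count ) {
	static $pattern;

	if ( ! isset( $pattern ) ) {
		$build_count++;            // Counts how many times the pattern is assembled.
		$pattern = '/(<[^>]*>?)/'; // Simplified stand-in, not the real regex.
	}

	return $pattern;
}

$build_count = 0;
$first       = demo_cached_pattern( $build_count );
$second      = demo_cached_pattern( $build_count );
// $build_count is 1 here: the second call returned the cached string.
```

A static local variable keeps its value between calls within one PHP process, which is why the isset() check skips the assembly step from the second call onward.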

Key Points

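  • Returns a regular expression string that matches HTML elements, comments, and CDATA sections.
  • The pattern is cached in a static variable, so it is assembled only once per request.
  • wp_html_split() calls this function to separate HTML elements and comments from the surrounding text.
  • Introduced in WordPress 4.4.0.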
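The points above can be tried outside WordPress with a rough sketch. demo_html_split_regex() and demo_html_split() are hypothetical stand-ins for get_html_split_regex() and wp_html_split(); the pattern is a compacted copy of the one in the source listing, and the preg_split() call with PREG_SPLIT_DELIM_CAPTURE is assumed to mirror how the splitter consumes it.

```php
<?php
// Hypothetical stand-in for get_html_split_regex(), so the sketch runs without WordPress.
function demo_html_split_regex() {
	static $regex;

	if ( ! isset( $regex ) ) {
		$comments = '!(?:-(?!->)[^\-]*+)*+(?:-->)?';
		$cdata    = '!\[CDATA\[[^\]]*+(?:](?!]>)[^\]]*+)*+(?:]]>)?';
		$regex    = '/(<(?(?=!--|!\[CDATA\[)(?(?=!-)' . $comments . '|' . $cdata . ')|[^>]*>?))/';
	}

	return $regex;
}

// Hypothetical stand-in for wp_html_split(): keeps the matched markup in the result.
function demo_html_split( $input ) {
	return preg_split( demo_html_split_regex(), $input, -1, PREG_SPLIT_DELIM_CAPTURE );
}

$parts = demo_html_split( 'Hi <b>there</b> <!-- a > b --> end' );
// $parts: 'Hi ', '<b>', 'there', '</b>', ' ', '<!-- a > b -->', ' end'
```

Because the whole pattern is one capture group, plain text lands at even indexes and markup at odd indexes. The comment containing ">" stays in one piece, which a naive "<[^>]*>" pattern would cut short.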

Code Example

function get_html_split_regex() {
    static $regex;

    if ( ! isset( $regex ) ) {
        // phpcs:disable Squiz.Strings.ConcatenationSpacing.PaddingFound -- don't remove regex indentation
        $comments =
            '!'             // Start of comment, after the <.
            . '(?:'         // Unroll the loop: Consume everything until --> is found.
            .     '-(?!->)' // Dash not followed by end of comment.
            .     '[^\-]*+' // Consume non-dashes.
            . ')*+'         // Loop possessively.
            . '(?:-->)?';   // End of comment. If not found, match all input.

        $cdata =
            '!\[CDATA\['    // Start of CDATA section, after the <.
            . '[^\]]*+'     // Consume non-].
            . '(?:'         // Unroll the loop: Consume everything until ]]> is found.
            .     '](?!]>)' // One ] not followed by end of CDATA section.
            .     '[^\]]*+' // Consume non-].
            . ')*+'         // Loop possessively.
            . '(?:]]>)?';   // End of CDATA section. If not found, match all input.

        $escaped =
            '(?='             // Is the element escaped?
            .    '!--'
            . '|'
            .    '!\[CDATA\['
            . ')'
            . '(?(?=!-)'      // If yes, which type?
            .     $comments
            . '|'
            .     $cdata
            . ')';

        $regex =
            '/('                // Capture the entire match.
            .     '<'           // Find start of element.
            .     '(?'          // Conditional expression follows.
            .         $escaped  // Find end of escaped element.
            .     '|'           // ...else...
            .         '[^>]*>?' // Find end of normal element.
            .     ')'
            . ')/';
        // phpcs:enable
    }

    return $regex;
}

📄 Original Content

Retrieves the regular expression for an HTML element.

Return

string The regular expression.

Source

function get_html_split_regex() {
	static $regex;

	if ( ! isset( $regex ) ) {
		// phpcs:disable Squiz.Strings.ConcatenationSpacing.PaddingFound -- don't remove regex indentation
		$comments =
			'!'             // Start of comment, after the <.
			. '(?:'         // Unroll the loop: Consume everything until --> is found.
			.     '-(?!->)' // Dash not followed by end of comment.
			.     '[^\-]*+' // Consume non-dashes.
			. ')*+'         // Loop possessively.
			. '(?:-->)?';   // End of comment. If not found, match all input.

		$cdata =
			'!\[CDATA\['    // Start of comment, after the <.
			. '[^\]]*+'     // Consume non-].
			. '(?:'         // Unroll the loop: Consume everything until ]]> is found.
			.     '](?!]>)' // One ] not followed by end of comment.
			.     '[^\]]*+' // Consume non-].
			. ')*+'         // Loop possessively.
			. '(?:]]>)?';   // End of comment. If not found, match all input.

		$escaped =
			'(?='             // Is the element escaped?
			.    '!--'
			. '|'
			.    '!\[CDATA\['
			. ')'
			. '(?(?=!-)'      // If yes, which type?
			.     $comments
			. '|'
			.     $cdata
			. ')';

		$regex =
			'/('                // Capture the entire match.
			.     '<'           // Find start of element.
			.     '(?'          // Conditional expression follows.
			.         $escaped  // Find end of escaped element.
			.     '|'           // ...else...
			.         '[^>]*>?' // Find end of normal element.
			.     ')'
			. ')/';
		// phpcs:enable
	}

	return $regex;
}
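The "If not found, match all input" comments in the listing describe how unterminated comments and CDATA sections are handled. The sketch below exercises both branches with a compacted copy of the pattern; the variable names and sample strings are illustrative assumptions, not part of WordPress.

```php
<?php
// Compacted copy of the pattern from the listing, assembled without WordPress.
$comments = '!(?:-(?!->)[^\-]*+)*+(?:-->)?';
$cdata    = '!\[CDATA\[[^\]]*+(?:](?!]>)[^\]]*+)*+(?:]]>)?';
$pattern  = '/(<(?(?=!--|!\[CDATA\[)(?(?=!-)' . $comments . '|' . $cdata . ')|[^>]*>?))/';

// A comment that is never closed swallows the rest of the input.
$unterminated = preg_split( $pattern, 'a <!-- never closed <b>bold</b>', -1, PREG_SPLIT_DELIM_CAPTURE );
// $unterminated: 'a ', '<!-- never closed <b>bold</b>', ''

// A CDATA section is matched up to ]]>, even though it contains a ">".
$cdata_parts = preg_split( $pattern, '<![CDATA[x > y]]>z', -1, PREG_SPLIT_DELIM_CAPTURE );
// $cdata_parts: '', '<![CDATA[x > y]]>', 'z'
```

The empty strings at the start or end of each result come from preg_split() itself, which reports the (empty) text on either side of a match unless PREG_SPLIT_NO_EMPTY is passed.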

Changelog

Version Description
4.4.0 Introduced.