get_html_split_regex()
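Yunce documentation notes
Overview
get_html_split_regex() returns the regular expression used to split HTML text into elements, comments, and the text between them. The pattern is built once and cached in a static variable, so repeated calls reuse the same string instead of rebuilding it.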
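The caching mentioned above relies on a function-level static variable in PHP. A minimal sketch of that pattern follows; demo_cached_pattern() and its build counter are hypothetical names used only for illustration, and the pattern inside is a simplified stand-in, not the real one.

```php
<?php
// Hypothetical helper, not part of WordPress. It mirrors the
// "static $regex; if ( ! isset( $regex ) )" shape of get_html_split_regex().
function demo_cached_pattern() {
	static $pattern;    // Keeps its value between calls within one request.
	static $builds = 0; // Illustration only: counts how often the pattern is built.

	if ( ! isset( $pattern ) ) {
		$builds++;
		// Simplified stand-in pattern (assumption), assembled by concatenation.
		$pattern = '/(' . '<' . '[^>]*>?' . ')/';
	}

	return array( $pattern, $builds );
}

list( $first_pattern, $first_builds )   = demo_cached_pattern();
list( $second_pattern, $second_builds ) = demo_cached_pattern();

var_dump( $first_pattern === $second_pattern ); // Same cached string on both calls.
var_dump( $second_builds );                     // The build step ran only once.
```

The static variable lives for the duration of the PHP request, so the saving applies to repeated calls within one page load rather than across requests.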
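Key points
- The function returns a regular expression string that matches HTML elements, comments, and CDATA sections.
- The pattern is cached in a static variable, so it is built only once and reused on later calls.
- wp_html_split() calls this function to separate HTML elements and comments from the surrounding text.
- Introduced in WordPress 4.4.0.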
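As a rough illustration of the points above, the sketch below splits a string by passing the pattern to preg_split() with PREG_SPLIT_DELIM_CAPTURE, which is how wp_html_split() uses it. To keep the sketch runnable outside WordPress, the pattern is written out as a single literal in a hypothetical helper, demo_html_split_regex(); inside WordPress you would call get_html_split_regex() itself.

```php
<?php
// Hypothetical stand-in for get_html_split_regex(), so the sketch runs without WordPress.
// Assumption: the literal is the source pattern with its concatenated pieces joined.
function demo_html_split_regex() {
	return '/(<(?(?=!--|!\[CDATA\[)(?(?=!-)!(?:-(?!->)[^\-]*+)*+(?:-->)?|!\[CDATA\[[^\]]*+(?:](?!]>)[^\]]*+)*+(?:]]>)?)|[^>]*>?))/';
}

$input = '<p>Hello <!-- a > b --> world</p>';

// PREG_SPLIT_DELIM_CAPTURE keeps the matched elements and comments in the
// result, so text and markup alternate in the returned array.
$parts = preg_split( demo_html_split_regex(), $input, -1, PREG_SPLIT_DELIM_CAPTURE );

print_r( $parts );
```

Because the input starts and ends with an element, the array begins and ends with an empty string. The comment stays in one piece even though it contains a > character.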
Code example
function get_html_split_regex() {
	static $regex;

	if ( ! isset( $regex ) ) {
		// phpcs:disable Squiz.Strings.ConcatenationSpacing.PaddingFound -- don't remove regex indentation
		$comments =
			'!'             // Start of comment, after the <.
			. '(?:'         // Unroll the loop: Consume everything until --> is found.
			.     '-(?!->)' // Dash not followed by end of comment.
			.     '[^\-]*+' // Consume non-dashes.
			. ')*+'         // Loop possessively.
			. '(?:-->)?';   // End of comment. If not found, match all input.

		$cdata =
			'!\[CDATA\['    // Start of comment, after the <.
			. '[^\]]*+'     // Consume non-].
			. '(?:'         // Unroll the loop: Consume everything until ]]> is found.
			.     '](?!]>)' // One ] not followed by end of comment.
			.     '[^\]]*+' // Consume non-].
			. ')*+'         // Loop possessively.
			. '(?:]]>)?';   // End of comment. If not found, match all input.

		$escaped =
			'(?='             // Is the element escaped?
			.    '!--'
			. '|'
			.    '!\[CDATA\['
			. ')'
			. '(?(?=!-)'      // If yes, which type?
			.     $comments
			. '|'
			.     $cdata
			. ')';

		$regex =
			'/('                // Capture the entire match.
			.     '<'           // Find start of element.
			.     '(?'          // Conditional expression follows.
			.         $escaped  // Find end of escaped element.
			.     '|'           // ...else...
			.         '[^>]*>?' // Find end of normal element.
			.     ')'
			. ')/';
		// phpcs:enable
	}

	return $regex;
}
Original content
Retrieves the regular expression for an HTML element.
Source
function get_html_split_regex() {
static $regex;
if ( ! isset( $regex ) ) {
// phpcs:disable Squiz.Strings.ConcatenationSpacing.PaddingFound -- don't remove regex indentation
$comments =
'!' // Start of comment, after the <.
. '(?:' // Unroll the loop: Consume everything until --> is found.
. '-(?!->)' // Dash not followed by end of comment.
. '[^\-]*+' // Consume non-dashes.
. ')*+' // Loop possessively.
. '(?:-->)?'; // End of comment. If not found, match all input.
$cdata =
'!\[CDATA\[' // Start of comment, after the <.
. '[^\]]*+' // Consume non-].
. '(?:' // Unroll the loop: Consume everything until ]]> is found.
. '](?!]>)' // One ] not followed by end of comment.
. '[^\]]*+' // Consume non-].
. ')*+' // Loop possessively.
. '(?:]]>)?'; // End of comment. If not found, match all input.
$escaped =
'(?=' // Is the element escaped?
. '!--'
. '|'
. '!\[CDATA\['
. ')'
. '(?(?=!-)' // If yes, which type?
. $comments
. '|'
. $cdata
. ')';
$regex =
'/(' // Capture the entire match.
. '<' // Find start of element.
. '(?' // Conditional expression follows.
. $escaped // Find end of escaped element.
. '|' // ...else...
. '[^>]*>?' // Find end of normal element.
. ')'
. ')/';
// phpcs:enable
}
return $regex;
}
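The comments in the source say that an unterminated comment or CDATA section matches all remaining input. The sketch below probes that behavior on a few edge cases. It is only an illustration: the $regex literal is assumed to be the source pattern with its pieces joined, written out so the code runs without WordPress, and the sample strings are made up.

```php
<?php
// Assumption: the source pattern joined into one literal, so no WordPress is needed.
$regex = '/(<(?(?=!--|!\[CDATA\[)(?(?=!-)!(?:-(?!->)[^\-]*+)*+(?:-->)?|!\[CDATA\[[^\]]*+(?:](?!]>)[^\]]*+)*+(?:]]>)?)|[^>]*>?))/';

// Made-up inputs, one per edge case.
$cases = array(
	'a <!-- never closed > still comment', // Comment without a closing -->.
	'<![CDATA[ x > y ]]>tail',             // CDATA section containing a >.
	'x < y',                               // Bare < with no closing >.
);

$results = array();
foreach ( $cases as $case ) {
	// Same call shape as wp_html_split(): keep the delimiters in the output.
	$results[] = preg_split( $regex, $case, -1, PREG_SPLIT_DELIM_CAPTURE );
}

print_r( $results );
```

In the first case the unterminated comment swallows the rest of the string, and in the second the CDATA section is kept whole despite the > inside it. The third case shows a side effect of the `[^>]*>?` branch: a bare < is treated as the start of an element and also consumes everything after it.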
Changelog
| Version | Description |
|---|---|
| 4.4.0 | Introduced. |