Function Reference

force_balance_tags()

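💡 Yunce documentation annotations

Overview

force_balance_tags() is the WordPress function for balancing HTML tags. It uses a modified stack to close unmatched tags in a string and is typically used when generating excerpts, where unbalanced tags would otherwise break the page layout.

Key Points

  • The function processes HTML tags with a regular expression and a stack, making sure every tag is closed so that unmatched tags cannot break the page structure.
  • It is mainly used in the short post excerpt list, for example when the HTML after the more tag is cut off and would otherwise leave tags open.
  • The function ignores the 'use_balanceTags' option. Because of bugs and performance issues it is not used on all WP pages; use it only when the excerpt length can be controlled, to avoid memory problems.
  • It relies on hard-coded lists of single tags and nestable tags; if an element is missing from these lists or its nesting behavior has changed, the resulting markup may be incorrect.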

Code Example

function force_balance_tags( $text ) {
    $tagstack  = array();
    $stacksize = 0;
    $tagqueue  = '';
    $newtext   = '';
    // Known single-entity/self-closing tags.
    $single_tags = array( 'area', 'base', 'basefont', 'br', 'col', 'command', 'embed', 'frame', 'hr', 'img', 'input', 'isindex', 'link', 'meta', 'param', 'source', 'track', 'wbr' );
    // Tags that can be immediately nested within themselves.
    $nestable_tags = array( 'article', 'aside', 'blockquote', 'details', 'div', 'figure', 'object', 'q', 'section', 'span' );
    // ... regex matching and stack handling logic follows ...
}

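Notes

  • The function relies on regular expressions rather than an HTML parser, which can be expensive, so use it with care.
  • The hard-coded tag lists may not be in sync with the other such lists in WordPress; keep compatibility in mind when elements are added or changed.
  • Use it only on text of controlled length, to avoid memory exhaustion or hard-to-trace bugs.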

📄 Original content

Balances tags of string using a modified stack.

Description

{@internal Modified by Scott Reilly (coffee2code) 02 Aug 2004
1.1 Fixed handling of append/stack pop order of end text
    Added Cleaning Hooks
1.0 First Version}

Parameters

$text (string, required)
Text to be balanced.

Return

string Balanced text.

More Information

Usage:

This function is used in the short post excerpt list, to prevent unmatched elements. For example, it makes

<div><b>This is an excerpt. <!--more--> and this is more text... </b></div>

not break, when the html after the more tag is cut off.

<div><b>This is an excerpt.

should be changed to:

<div><b>This is an excerpt. </b></div>

by the force_balance_tags function.
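
As a rough illustration of the usage above, the following PHP sketch cuts a post at the more tag and balances what is left. The helper name example_teaser() and its $balancer parameter are assumptions made for this example and are not part of WordPress; inside WordPress the default callable resolves to force_balance_tags().

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): keep only the text before
 * the <!--more--> marker, then balance the tags that the cut left open.
 *
 * $balancer defaults to WordPress's force_balance_tags(). It is a parameter
 * only so that the sketch can also be exercised outside WordPress.
 */
function example_teaser( string $html, ?callable $balancer = null ): string {
	// Split once at the more tag and keep the part before it.
	$parts  = explode( '<!--more-->', $html, 2 );
	$teaser = $parts[0];

	// Assumes WordPress is loaded when no balancer is passed in.
	$balancer = $balancer ?? 'force_balance_tags';

	return $balancer( $teaser );
}
```

Inside WordPress, calling example_teaser() on the first markup sample above passes `<div><b>This is an excerpt. ` to force_balance_tags(), which is the truncated input described in this section.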

Notes:
  • Ignores the 'use_balanceTags' option.
  • This function is not used for all WP pages due to bugs and performance issues. This function doesn’t use an HTML parser but some potentially expensive regular expressions. This function shall only be used if the length of the excerpt can be controlled; otherwise memory issues or some obscure bugs can occur.
  • This function uses two hard-coded lists of elements for single tags and nestable tags. WordPress uses multiple such lists and the lists are not kept in sync. If an element isn’t part of these lists or has changed its nesting behavior, it may lead to incorrect markup.
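
One way to respect the length caveat in the notes is to balance only input below a chosen size and fall back to plain text otherwise. This is a sketch, not WordPress behavior: the helper name example_balance_bounded(), the 2000-byte default limit, and the strip_tags() fallback are all assumptions chosen for illustration.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): balance tags only when the
 * input is short enough, otherwise return the text with all markup removed.
 *
 * The 2000-byte default is an arbitrary illustrative limit.
 * $balancer defaults to WordPress's force_balance_tags().
 */
function example_balance_bounded( string $html, int $max_bytes = 2000, ?callable $balancer = null ): string {
	if ( strlen( $html ) > $max_bytes ) {
		// Oversized input: skip the regex loop and drop all markup instead,
		// so the result cannot contain unbalanced tags.
		return strip_tags( $html );
	}

	// Assumes WordPress is loaded when no balancer is passed in.
	$balancer = $balancer ?? 'force_balance_tags';

	return $balancer( $html );
}
```

Dropping the markup for oversized input is only one possible policy; truncating the text to the limit before balancing would be another.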

Source

function force_balance_tags( $text ) {
	$tagstack  = array();
	$stacksize = 0;
	$tagqueue  = '';
	$newtext   = '';
	// Known single-entity/self-closing tags.
	$single_tags = array( 'area', 'base', 'basefont', 'br', 'col', 'command', 'embed', 'frame', 'hr', 'img', 'input', 'isindex', 'link', 'meta', 'param', 'source', 'track', 'wbr' );
	// Tags that can be immediately nested within themselves.
	$nestable_tags = array( 'article', 'aside', 'blockquote', 'details', 'div', 'figure', 'object', 'q', 'section', 'span' );

	// WP bug fix for comments - in case you REALLY meant to type '< !--'.
	$text = str_replace( '< !--', '<    !--', $text );
	// WP bug fix for LOVE <3 (and other situations with '<' before a number).
	$text = preg_replace( '#<([0-9]{1})#', '&lt;$1', $text );

	/**
	 * Matches supported tags.
	 *
	 * To get the pattern as a string without the comments paste into a PHP
	 * REPL like `php -a`.
	 *
	 * @see https://html.spec.whatwg.org/#elements-2
	 * @see https://html.spec.whatwg.org/multipage/custom-elements.html#valid-custom-element-name
	 *
	 * @example
	 * ~# php -a
	 * php > $s = [paste copied contents of expression below including parentheses];
	 * php > echo $s;
	 */
	$tag_pattern = (
		'#<' . // Start with an opening bracket.
		'(/?)' . // Group 1 - If it's a closing tag it'll have a leading slash.
		'(' . // Group 2 - Tag name.
			// Custom element tags have more lenient rules than HTML tag names.
			'(?:[a-z](?:[a-z0-9._]*)-(?:[a-z0-9._-]+)+)' .
				'|' .
			// Traditional tag rules approximate HTML tag names.
			'(?:[\w:]+)' .
		')' .
		'(?:' .
			// We either immediately close the tag with its '>' and have nothing here.
			'\s*' .
			'(/?)' . // Group 3 - "attributes" for empty tag.
				'|' .
			// Or we must start with space characters to separate the tag name from the attributes (or whitespace).
			'(\s+)' . // Group 4 - Pre-attribute whitespace.
			'([^>]*)' . // Group 5 - Attributes.
		')' .
		'>#' // End with a closing bracket.
	);

	while ( preg_match( $tag_pattern, $text, $regex ) ) {
		$full_match        = $regex[0];
		$has_leading_slash = ! empty( $regex[1] );
		$tag_name          = $regex[2];
		$tag               = strtolower( $tag_name );
		$is_single_tag     = in_array( $tag, $single_tags, true );
		$pre_attribute_ws  = isset( $regex[4] ) ? $regex[4] : '';
		$attributes        = trim( isset( $regex[5] ) ? $regex[5] : $regex[3] );
		$has_self_closer   = str_ends_with( $attributes, '/' );

		$newtext .= $tagqueue;

		$i = strpos( $text, $full_match );
		$l = strlen( $full_match );

		// Clear the shifter.
		$tagqueue = '';
		if ( $has_leading_slash ) { // End tag.
			// If too many closing tags.
			if ( $stacksize <= 0 ) {
				$tag = '';
				// Or close to be safe $tag = '/' . $tag.

				// If stacktop value = tag close value, then pop.
			} elseif ( $tagstack[ $stacksize - 1 ] === $tag ) { // Found closing tag.
				$tag = '</' . $tag . '>'; // Close tag.
				array_pop( $tagstack );
				--$stacksize;
			} else { // Closing tag not at top, search for it.
				for ( $j = $stacksize - 1; $j >= 0; $j-- ) {
					if ( $tagstack[ $j ] === $tag ) {
						// Add tag to tagqueue.
						for ( $k = $stacksize - 1; $k >= $j; $k-- ) {
							$tagqueue .= '</' . array_pop( $tagstack ) . '>';
							--$stacksize;
						}
						break;
					}
				}
				$tag = '';
			}
		} else { // Begin tag.
			if ( $has_self_closer ) {
				/*
				 * If it presents itself as a self-closing tag, but it isn't a known single-entity self-closing tag,
				 * then don't let it be treated as such and immediately close it with a closing tag.
				 * The tag will encapsulate no text as a result.
				 */
				if ( ! $is_single_tag ) {
					$attributes = trim( substr( $attributes, 0, -1 ) ) . "></$tag";
				}
			} elseif ( $is_single_tag ) {
				// Else if it's a known single-entity tag but it doesn't close itself, do so.
				$pre_attribute_ws = ' ';
				$attributes      .= '/';
			} else {
				/*
				 * It's not a single-entity tag.
				 * If the top of the stack is the same as the tag we want to push, close previous tag.
				 */
				if ( $stacksize > 0 && ! in_array( $tag, $nestable_tags, true ) && $tagstack[ $stacksize - 1 ] === $tag ) {
					$tagqueue = '</' . array_pop( $tagstack ) . '>';
					--$stacksize;
				}
				$stacksize = array_push( $tagstack, $tag );
			}

			// Attributes.
			if ( $has_self_closer && $is_single_tag ) {
				// We need some space - avoid <br/> and prefer <br />.
				$pre_attribute_ws = ' ';
			}

			$tag = '<' . $tag . $pre_attribute_ws . $attributes . '>';
			// If already queuing a close tag, then put this tag on too.
			if ( ! empty( $tagqueue ) ) {
				$tagqueue .= $tag;
				$tag       = '';
			}
		}
		$newtext .= substr( $text, 0, $i ) . $tag;
		$text     = substr( $text, $i + $l );
	}

	// Clear tag queue.
	$newtext .= $tagqueue;

	// Add remaining text.
	$newtext .= $text;

	while ( $x = array_pop( $tagstack ) ) {
		$newtext .= '</' . $x . '>'; // Add remaining tags to close.
	}

	// WP fix for the bug with HTML comments.
	$newtext = str_replace( '< !--', '<!--', $newtext );
	$newtext = str_replace( '<    !--', '< !--', $newtext );

	return $newtext;
}
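
To see what the capture groups of the tag pattern hold, the pattern can be run on single tags outside WordPress. The sketch below builds the pattern with its \w and \s escapes and applies the same attribute-selection rule as the loop above; the function name example_describe_tag() and the shape of the returned array are assumptions for this example.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): match the first tag in
 * $html and report how the capture groups are interpreted.
 *
 * Returns null when the text contains nothing the pattern treats as a tag.
 */
function example_describe_tag( string $html ): ?array {
	$tag_pattern =
		'#<' .
		'(/?)' .                                        // Group 1 - leading slash of a closing tag.
		'(' .                                           // Group 2 - tag name.
			'(?:[a-z](?:[a-z0-9._]*)-(?:[a-z0-9._-]+)+)' . // Custom element name.
				'|' .
			'(?:[\w:]+)' .                                 // Traditional tag name.
		')' .
		'(?:' .
			'\s*' .
			'(/?)' .                                       // Group 3 - "attributes" for an empty tag.
				'|' .
			'(\s+)' .                                      // Group 4 - pre-attribute whitespace.
			'([^>]*)' .                                    // Group 5 - attributes.
		')' .
		'>#';

	if ( ! preg_match( $tag_pattern, $html, $m ) ) {
		return null;
	}

	// Group 5 is only present when the tag has whitespace after its name;
	// otherwise group 3 holds either '' or the self-closing slash.
	$attributes = trim( isset( $m[5] ) ? $m[5] : $m[3] );

	return array(
		'name'       => strtolower( $m[2] ),
		'is_closing' => '' !== $m[1],
		'attributes' => $attributes,
	);
}
```

For example, `<br/>` is reported with the name `br` and the attribute string `/`, which is what the self-closer check in the loop looks for, while `I <3 you` produces no match at all.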

Changelog

Version  Description
5.3.0    Improve accuracy and add support for custom element tags.
2.0.4    Introduced.
