Function documentation

wp_parse_url()

💡 Yunce documentation notes

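Overview

wp_parse_url() is a wrapper around PHP's parse_url() that smooths over return values that differ between PHP versions, in particular for schemeless URLs whose query contains a ":". It prepends placeholder values before parsing and strips them from the result afterwards, which keeps the output consistent across versions.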

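Key points

  • Wraps parse_url() to work around URL parsing results that differ between PHP versions.
  • Accepts an optional $component parameter that takes a PHP predefined constant (such as PHP_URL_SCHEME) to retrieve a single URL component.
  • The return value depends on the call: false on parse failure; an array of URL components on success; when a specific component is requested, null if it is absent, otherwise a string, or an integer for PHP_URL_PORT.
  • Handles schemeless URLs internally: a URL starting with "//" is given a placeholder scheme, a URL starting with "/" is given a placeholder scheme and host, and the placeholders are removed from the result after parsing.
  • Introduced in WordPress 4.4.0; version 4.7.0 added the $component parameter for parity with PHP's parse_url().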

Code example

// Parse the URL into an array of its components.
$url_parse = wp_parse_url( 'https://developer.wordpress.org/reference/functions/wp_parse_url/' );

// Output the URL's path: /reference/functions/wp_parse_url/
echo $url_parse['path'];

/*
Contents of $url_parse:
Array
(
    [scheme] => https
    [host] => developer.wordpress.org
    [path] => /reference/functions/wp_parse_url/
)
*/

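Notes

  • The function appears to work around a parse_url() bug in which a port-like string in the URL path (for example "//example.com/foo:1234") is wrongly reported as a port.
  • When a URL is given in the form scheme://hostname. (not a fully qualified domain name), the host member of the returned array may contain an invalid hostname such as "hostname.", so validate it before use.
  • WordPress core uses the function in many places, such as URL validation, internal link detection and HTTPS migration, where consistent results across PHP versions matter.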
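
The trailing-dot case in the notes above could be handled along these lines. This is only a sketch: host_without_trailing_dot() is a hypothetical helper, not part of WordPress, and it calls PHP's parse_url() directly so that it runs outside WordPress. Whether stripping the dot is the right normalization depends on how the host is used afterwards.

```php
<?php
// Hypothetical helper (not a WordPress function): returns the host of a URL
// with any trailing dots removed, or null when the URL has no host.
function host_without_trailing_dot( $url ) {
	// Inside WordPress, wp_parse_url( $url, PHP_URL_HOST ) could be used here.
	$host = parse_url( $url, PHP_URL_HOST );

	if ( ! is_string( $host ) ) {
		// No host component, or the URL could not be parsed.
		return null;
	}

	return rtrim( $host, '.' );
}

var_dump( host_without_trailing_dot( 'https://hostname./path' ) ); // string(8) "hostname"
var_dump( host_without_trailing_dot( '/relative/path' ) );         // NULL
```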

📄 Original content

A wrapper for PHP’s parse_url() function that handles consistency in the return values across PHP versions.

Description

Across various PHP versions, schemeless URLs containing a “:” in the query are being handled inconsistently. This function works around those differences.

Parameters

$url string required
The URL to parse.
$component int optional
The specific component to retrieve. Use one of the PHP predefined constants to specify which one.
Defaults to -1 (= return all parts as an array).

Default: -1
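
As a rough illustration of the $component argument, the sketch below requests each component on its own. It calls PHP's parse_url() directly so that it runs outside WordPress; the assumption is that, for a URL with a scheme like this one, wp_parse_url() called with the same constants gives the same values. The sample URL and the $constants and $found names are invented for the example.

```php
<?php
// Sample URL invented for illustration.
$url = 'https://user:pass@example.org:8080/path/to/file.min.js?param=value#anchor';

// Readable label for each PHP_URL_* constant accepted as $component.
$constants = array(
	'scheme'   => PHP_URL_SCHEME,
	'host'     => PHP_URL_HOST,
	'port'     => PHP_URL_PORT,
	'user'     => PHP_URL_USER,
	'pass'     => PHP_URL_PASS,
	'path'     => PHP_URL_PATH,
	'query'    => PHP_URL_QUERY,
	'fragment' => PHP_URL_FRAGMENT,
);

$found = array();
foreach ( $constants as $label => $constant ) {
	// Inside WordPress this call could be wp_parse_url( $url, $constant ).
	$found[ $label ] = parse_url( $url, $constant );
}

var_dump( $found['host'] ); // string(11) "example.org"
var_dump( $found['port'] ); // int(8080)
```

Requesting a single component avoids an isset() check on the returned array when only one part of the URL is needed.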

Return

mixed False on parse failure; Array of URL components on success; When a specific component has been requested: null if the component doesn’t exist in the given URL; a string or – in the case of PHP_URL_PORT – integer when it does. See parse_url()’s return values.
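
The return cases listed above can be told apart with strict comparisons. The sketch below is one way to do that; describe_parse_result() is a hypothetical helper, and parse_url() is called directly so the snippet runs without WordPress, on the assumption that wp_parse_url() follows the same return shape as the description states.

```php
<?php
// Hypothetical helper: names which of the documented return cases a value is.
function describe_parse_result( $result ) {
	if ( false === $result ) {
		return 'parse failure';
	}
	if ( null === $result ) {
		return 'component missing';
	}
	if ( is_array( $result ) ) {
		return 'all components';
	}
	if ( is_int( $result ) ) {
		return 'integer component';
	}
	return 'string component';
}

// A URL with an empty host is malformed, so parsing fails.
echo describe_parse_result( parse_url( 'http:///example.com' ) ), "\n";                    // parse failure
echo describe_parse_result( parse_url( 'https://example.org/' ) ), "\n";                   // all components
echo describe_parse_result( parse_url( 'https://example.org/', PHP_URL_PORT ) ), "\n";     // component missing
echo describe_parse_result( parse_url( 'https://example.org:8080/', PHP_URL_PORT ) ), "\n"; // integer component
echo describe_parse_result( parse_url( 'https://example.org/', PHP_URL_HOST ) ), "\n";     // string component
```

Strict comparison matters here because false, null and an empty string are all falsy, so a loose check such as `if ( ! $result )` cannot distinguish a parse failure from a missing component.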

Source

function wp_parse_url( $url, $component = -1 ) {
	$to_unset = array();
	$url      = (string) $url;

	if ( str_starts_with( $url, '//' ) ) {
		$to_unset[] = 'scheme';
		$url        = 'placeholder:' . $url;
	} elseif ( str_starts_with( $url, '/' ) ) {
		$to_unset[] = 'scheme';
		$to_unset[] = 'host';
		$url        = 'placeholder://placeholder' . $url;
	}

	$parts = parse_url( $url );

	if ( false === $parts ) {
		// Parsing failure.
		return $parts;
	}

	// Remove the placeholder values.
	foreach ( $to_unset as $key ) {
		unset( $parts[ $key ] );
	}

	return _get_component_from_parsed_url_array( $parts, $component );
}
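
To see the placeholder step in isolation, the following sketch repeats it in plain PHP. sketch_parse_schemeless() is a hypothetical stand-in, not the WordPress function: it leaves out the $component handling and uses strncmp() in place of str_starts_with() so that it also runs on PHP versions before 8.0. The sample URLs are made up.

```php
<?php
// Hypothetical stand-in that mirrors only the placeholder step of the source
// above; it always returns the full array (or false on parse failure).
function sketch_parse_schemeless( $url ) {
	$to_unset = array();
	$url      = (string) $url;

	if ( 0 === strncmp( $url, '//', 2 ) ) {
		// Protocol-relative URL: only the scheme is missing.
		$to_unset[] = 'scheme';
		$url        = 'placeholder:' . $url;
	} elseif ( 0 === strncmp( $url, '/', 1 ) ) {
		// Root-relative URL: both scheme and host are missing.
		$to_unset[] = 'scheme';
		$to_unset[] = 'host';
		$url        = 'placeholder://placeholder' . $url;
	}

	$parts = parse_url( $url );

	if ( false === $parts ) {
		return false;
	}

	// Drop the placeholder values so only real components remain.
	foreach ( $to_unset as $key ) {
		unset( $parts[ $key ] );
	}

	return $parts;
}

$protocol_relative = sketch_parse_schemeless( '//example.com/foo:1234' );
$root_relative     = sketch_parse_schemeless( '/wp-admin/?next=a:b' );

print_r( $protocol_relative ); // host => example.com, path => /foo:1234
print_r( $root_relative );     // path => /wp-admin/, query => next=a:b
```

Because every input handed to parse_url() now carries a scheme, a ":" later in the path or query has no chance of being read as a scheme or port separator.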

Changelog

Version Description
4.7.0 The $component parameter was added for parity with PHP’s parse_url().
4.4.0 Introduced.

User Contributed Notes

  1. Example return array

    // Get array of URL 'scheme', 'host', and 'path'.
    $url_parse = wp_parse_url( 'https://developer.wordpress.org/reference/functions/wp_parse_url/' );
    
    // Output URL's path.
    echo $url_parse['path'];
    
    /*
    Array
    (
        [scheme] => https
        [host] => developer.wordpress.org
        [path] => /reference/functions/wp_parse_url/
    )
    */

  2. For reference, here’s a sample that shows URL parts that are returned by parse_url:

    $url = 'https://user:pass@example.org:8080/path/to/file.min.js?param=value&int=1#anchor';
    print_r( wp_parse_url( $url ) );
    
    /*
    Array
    (
        [scheme] => https
        [host] => example.org
        [port] => 8080
        [user] => user
        [pass] => pass
        [path] => /path/to/file.min.js
        [query] => param=value&int=1
        [fragment] => anchor
    )
    */

    And here’s a list of valid $component values:

    * PHP_URL_SCHEME
    * PHP_URL_HOST
    * PHP_URL_PORT
    * PHP_URL_USER
    * PHP_URL_PASS
    * PHP_URL_PATH
    * PHP_URL_QUERY
    * PHP_URL_FRAGMENT

  3. wp_parse_url() seems to fix a bug in parse_url(), where a port-like string in the URL path is incorrectly parsed as a port:

    wp_parse_url( '//example.com/foo:1234' );
    
    /* Output:
    array(2) {
      "host" => string(11) "example.com"
      "path" => string(9) "/foo:1234"
    } */
    parse_url( '//example.com/foo:1234' );
    
    /* Output:
    array(3) {
      "host" => string(11) "example.com"
      "port" => int( 1234 )
      "path" => string(9) "/foo:1234"
    } */