Function Documentation

do_robots()

💡 云策 Documentation Annotations

Overview

The do_robots() function generates and outputs the default robots.txt content for a WordPress site. It sets the HTTP header, fires an action hook, and applies a filter hook, allowing developers to customize the robots.txt output.

Key Points

  • The function sets the Content-Type header to text/plain; charset=utf-8 and fires the do_robotstxt action hook.
  • The default output includes User-agent: *, a Disallow path (resolved from site_url()), and an Allow rule for specific files (such as admin-ajax.php).
  • The robots_txt filter hook can be used to modify the output; its parameters are $output and $public (derived from the blog_public option).
  • Since version 5.3.0, when search engine visibility is discouraged, Disallow: / is no longer output; that case is handled by the wp_robots_no_robots() filter callback instead.

Code Example

add_filter( 'robots_txt', function( $output, $public ) {
    if ( '0' == $public ) {
        $output = "User-agent: *\nDisallow: /\nDisallow: /*\nDisallow: /*?\n";
    } else {
        foreach ( array( 'jpeg', 'jpg', 'gif', 'png', 'mp4', 'webm', 'woff', 'woff2', 'ttf', 'eot' ) as $ext ) {
            $output .= "Disallow: /*.{$ext}$\n";
        }
        $site_url = parse_url( site_url() );
        $path = ( ! empty( $site_url['path'] ) ) ? $site_url['path'] : '';
        $output .= "Disallow: $path/wp-login.php\n";
        $robots = preg_replace( '/Allow: [^\s]*\/wp-admin\/admin-ajax\.php\n/', '', $output );
        if ( ! is_null( $robots ) ) {
            $output = $robots;
        }
        $output .= "Sitemap: {$site_url['scheme']}://{$site_url['host']}/sitemap_index.xml\n";
    }
    return $output;
}, 99, 2 );

📄 Original Content

Displays the default robots.txt file content.

Source

function do_robots() {
	header( 'Content-Type: text/plain; charset=utf-8' );

	/**
	 * Fires when displaying the robots.txt file.
	 *
	 * @since 2.1.0
	 */
	do_action( 'do_robotstxt' );

	$output = "User-agent: *\n";
	$public = (bool) get_option( 'blog_public' );

	$site_url = parse_url( site_url() );
	$path     = ( ! empty( $site_url['path'] ) ) ? $site_url['path'] : '';
	$output  .= "Disallow: $path/wp-admin/\n";
	$output  .= "Allow: $path/wp-admin/admin-ajax.php\n";

	/**
	 * Filters the robots.txt output.
	 *
	 * @since 3.0.0
	 *
	 * @param string $output The robots.txt output.
	 * @param bool   $public Whether the site is considered "public".
	 */
	echo apply_filters( 'robots_txt', $output, $public );
}
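Given the source above, a site installed at the domain root (so $path is empty) would serve the following default output, before any filters modify it:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```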

Hooks

do_action( ‘do_robotstxt’ )

Fires when displaying the robots.txt file.

apply_filters( ‘robots_txt’, string $output, bool $public )

Filters the robots.txt output.
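As a minimal sketch of using the two hooks above (the comment line and the /private/ path are illustrative; this assumes it runs inside WordPress, where add_action() and add_filter() are available):

```
// The do_robotstxt action fires after the Content-Type header is sent
// and before $output is built, so anything echoed here appears at the
// very top of the generated robots.txt.
add_action( 'do_robotstxt', function () {
	echo "# robots.txt served by WordPress\n";
} );

// The robots_txt filter receives the assembled output and the
// blog_public flag, and must return the (modified) string.
add_filter( 'robots_txt', function ( $output, $public ) {
	if ( $public ) {
		$output .= "Disallow: /private/\n";
	}
	return $output;
}, 10, 2 );
```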

Changelog

Version   Description
5.3.0     Removed the “Disallow: /” output if search engine visibility is discouraged, in favor of the robots meta HTML tag via the wp_robots_no_robots() filter callback.
2.1.0     Introduced.

User Contributed Notes

  1. Some examples of how robots.txt file content can be altered.

    /**
     * Add Disallow for some file types.
     * Add "Disallow: /wp-login.php\n".
     * Remove "Allow: /wp-admin/admin-ajax.php\n".
     * Calculate and add a "Sitemap:" link.
     */
    add_filter( 'robots_txt', function( $output, $public ) {
    	/**
    	 * If "Search engine visibility" is disabled,
    	 * strongly tell all robots to go away.
    	 */
    	if ( '0' == $public ) {
    		$output = "User-agent: *\nDisallow: /\nDisallow: /*\nDisallow: /*?\n";
    	} else {
    		/**
    		 * Disallow some file types
    		 */
    		foreach( array( 'jpeg','jpg','gif','png','mp4','webm','woff','woff2','ttf','eot' ) as $ext ) {
    			$output .= "Disallow: /*.{$ext}$\n";
    		}
    
    		/**
    		 * Get site path.
    		 */
    		$site_url = parse_url( site_url() );
    		$path	  = ( ! empty( $site_url['path'] ) ) ? $site_url['path'] : '';
    
    		/**
    		 * Add new disallow.
    		 */
    		$output .= "Disallow: $path/wp-login.php\n";
    
    		/**
    		 * Remove line that allows robots to access AJAX interface.
    		 */
    		$robots = preg_replace( '/Allow: [^\s]*\/wp-admin\/admin-ajax\.php\n/', '', $output );
    
    		/**
    		 * If no error occurred, replace $output with modified value.
    		 */
    		if ( ! is_null( $robots ) ) {
    			$output = $robots;
    		}
    		/**
    		 * Calculate and add a "Sitemap:" link.
    		 * Modify as needed.
    		 */
    		$output .= "Sitemap: {$site_url['scheme']}://{$site_url['host']}/sitemap_index.xml\n";
    	}
    
    	return $output;
    
    }, 99, 2 );  // Priority 99, Number of Arguments 2.