do_robots()
Yunce Docs Annotation
Overview
The do_robots() function generates and outputs the default robots.txt content for a WordPress site. It sets the HTTP Content-Type header, fires an action hook, and applies a filter hook, allowing developers to customize the robots.txt output.
Key Points
- Sets the Content-Type to text/plain; charset=utf-8 and fires the do_robotstxt action hook.
- The default output contains User-agent: *, a Disallow path (resolved from site_url()), and an Allow rule for specific files such as admin-ajax.php.
- The robots_txt filter hook can modify the output; its parameters are $output and $public (derived from the blog_public option).
- Since 5.3.0, Disallow: / is no longer output when search engine visibility is discouraged; that case is handled by the wp_robots_no_robots() filter callback instead.
Code Example
add_filter( 'robots_txt', function( $output, $public ) {
	if ( '0' == $public ) {
		$output = "User-agent: *\nDisallow: /\nDisallow: /*\nDisallow: /*?\n";
	} else {
		foreach ( array( 'jpeg', 'jpg', 'gif', 'png', 'mp4', 'webm', 'woff', 'woff2', 'ttf', 'eot' ) as $ext ) {
			$output .= "Disallow: /*.{$ext}$\n";
		}
		$site_url = parse_url( site_url() );
		$path     = ( ! empty( $site_url['path'] ) ) ? $site_url['path'] : '';
		$output  .= "Disallow: $path/wp-login.php\n";
		$robots = preg_replace( '/Allow: [^\s]*\/wp-admin\/admin-ajax\.php\n/', '', $output );
		if ( ! is_null( $robots ) ) {
			$output = $robots;
		}
		$output .= "Sitemap: {$site_url['scheme']}://{$site_url['host']}/sitemap_index.xml\n";
	}
	return $output;
}, 99, 2 );
Original Content
Displays the default robots.txt file content.
Source
function do_robots() {
header( 'Content-Type: text/plain; charset=utf-8' );
/**
* Fires when displaying the robots.txt file.
*
* @since 2.1.0
*/
do_action( 'do_robotstxt' );
$output = "User-agent: *\n";
$public = (bool) get_option( 'blog_public' );
$site_url = parse_url( site_url() );
$path = ( ! empty( $site_url['path'] ) ) ? $site_url['path'] : '';
$output .= "Disallow: $path/wp-admin/\n";
$output .= "Allow: $path/wp-admin/admin-ajax.php\n";
/**
* Filters the robots.txt output.
*
* @since 3.0.0
*
* @param string $output The robots.txt output.
* @param bool $public Whether the site is considered "public".
*/
echo apply_filters( 'robots_txt', $output, $public );
}
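For a single-site install at the domain root (so $path resolves to an empty string), the unfiltered output of the function above is:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```

When WordPress is installed in a subdirectory such as /blog, the Disallow and Allow lines carry that path prefix.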
Hooks
- do_action( 'do_robotstxt' )
  Fires when displaying the robots.txt file.
- apply_filters( 'robots_txt', string $output, bool $public )
  Filters the robots.txt output.
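The do_robotstxt action fires before the default rules are built, so anything echoed from a callback appears above them in the file. A minimal sketch (the comment text emitted here is illustrative, not part of WordPress):

```php
add_action( 'do_robotstxt', function() {
	// Anything echoed here is printed before the default rules.
	echo "# Custom header comment\n";
} );
```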
Changelog
| Version | Description |
|---|---|
| 5.3.0 | Remove the "Disallow: /" output if search engine visibility is discouraged in favor of robots meta HTML tag via wp_robots_no_robots() filter callback. |
| 2.1.0 | Introduced. |
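As the changelog notes, discouraging search engines is now signaled with a robots meta tag rather than a robots.txt rule. A hedged sketch of reusing the same documented callback yourself, assuming WordPress 5.7+ (where the wp_robots filter and wp_robots_no_robots() were introduced):

```php
// Force a "noindex" robots meta tag on every page.
add_filter( 'wp_robots', 'wp_robots_no_robots' );
```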
Note by wp_kc:
Some examples of how the robots.txt file content can be altered.
/**
 * Add Disallow for some file types.
 * Add "Disallow: /wp-login.php\n".
 * Remove "Allow: /wp-admin/admin-ajax.php\n".
 * Calculate and add a "Sitemap:" link.
 */
add_filter( 'robots_txt', function( $output, $public ) {
	/**
	 * If "Search engine visibility" is disabled,
	 * strongly tell all robots to go away.
	 */
	if ( '0' == $public ) {
		$output = "User-agent: *\nDisallow: /\nDisallow: /*\nDisallow: /*?\n";
	} else {
		// Disallow some file types.
		foreach ( array( 'jpeg', 'jpg', 'gif', 'png', 'mp4', 'webm', 'woff', 'woff2', 'ttf', 'eot' ) as $ext ) {
			$output .= "Disallow: /*.{$ext}$\n";
		}
		// Get site path.
		$site_url = parse_url( site_url() );
		$path     = ( ! empty( $site_url['path'] ) ) ? $site_url['path'] : '';
		// Add new disallow.
		$output .= "Disallow: $path/wp-login.php\n";
		// Remove line that allows robots to access the AJAX interface.
		$robots = preg_replace( '/Allow: [^\s]*\/wp-admin\/admin-ajax\.php\n/', '', $output );
		// If no error occurred, replace $output with the modified value.
		if ( ! is_null( $robots ) ) {
			$output = $robots;
		}
		// Calculate and add a "Sitemap:" link. Modify as needed.
		$output .= "Sitemap: {$site_url['scheme']}://{$site_url['host']}/sitemap_index.xml\n";
	}
	return $output;
}, 99, 2 ); // Priority 99, number of arguments 2.