Robots.txt for Drupal

Learn how to create a robots.txt file for Drupal, with an easy-to-follow example that tells search engine crawlers which parts of your site to stay out of.

Drupal Robots.txt Example

Robots.txt is a plain-text file that webmasters use to tell search engine robots which parts of a website they should not crawl. The file is placed in the root directory of the website, so it is accessible at a URL such as “www.example.com/robots.txt”. A robots.txt file keeps well-behaved crawlers away from administrative pages, internal scripts, and other files or pages that you do not want to appear in search results.
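Because the file always lives at the root of the site, its location can be derived from any page URL. The following is a minimal Python sketch of that rule; the helper name robots_txt_url is hypothetical and the URLs are the placeholder domain used in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Hypothetical helper: return the robots.txt URL for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.example.com/node/1?page=2"))
```

Each host name has its own file, so a subdomain would need a separate robots.txt.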

Below is an example of a robots.txt file for Drupal websites:


User-agent: *
Disallow: /admin
Disallow: /includes
Disallow: /misc
Disallow: /modules
Disallow: /profiles
Disallow: /scripts
Disallow: /themes
#Paths (clean URLs)
Disallow: /admin/
Disallow: /comment/reply/
Disallow: /filter/tips/
Disallow: /node/add/
Disallow: /search/
Disallow: /user/register/
Disallow: /user/password/
Disallow: /user/login/
#Paths (no clean URLs)
Disallow: /?q=admin/
Disallow: /?q=comment/reply/
Disallow: /?q=filter/tips/
Disallow: /?q=node/add/
Disallow: /?q=search/
Disallow: /?q=user/password/
Disallow: /?q=user/register/
Disallow: /?q=user/login/
#Sitemap
Sitemap: http://www.example.com/sitemap.xml

This robots.txt file applies to all search engine robots (User-agent: *) and tells them not to crawl anything under the /admin, /includes, /misc, /modules, /profiles, /scripts, and /themes directories, or any page whose path begins with “/admin/”, “/comment/reply/”, “/filter/tips/”, “/node/add/”, “/search/”, “/user/password/”, “/user/register/”, or “/user/login/”. The “/?q=” rules cover the same paths on sites that do not use clean URLs. The Sitemap line points to the website’s sitemap.xml file, which search engine bots can use to crawl your website more efficiently.
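One way to check what rules like these do is to parse them with Python's standard urllib.robotparser module and ask whether a given path may be crawled. This is only a sketch under a few assumptions: it uses a shortened excerpt of the example file, the helper is_allowed is hypothetical, and real search engines may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# A shortened excerpt of the example above; the full file parses the same way.
ROBOTS_TXT = """
User-agent: *
Disallow: /admin
Disallow: /themes
#Paths (clean URLs)
Disallow: /node/add/
Disallow: /user/login/
#Sitemap
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.strip().splitlines())


def is_allowed(path: str, agent: str = "*") -> bool:
    """Hypothetical helper: may `agent` crawl `path` on www.example.com?"""
    return parser.can_fetch(agent, "http://www.example.com" + path)


for path in ("/node/1", "/admin/config", "/themes/custom/style.css", "/user/login/"):
    print(path, "allowed" if is_allowed(path) else "blocked")

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

Ordinary content such as /node/1 stays crawlable, while anything that begins with a disallowed prefix is blocked.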

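It is important to remember that robots.txt is not a way to secure your website. Anyone can open the file and read the paths it lists, and only well-behaved crawlers follow its rules; nothing stops a person or a misbehaving bot from requesting a disallowed URL. Use it only to keep search engine bots away from pages that have no place in search results, such as administrative pages, and rely on Drupal’s permissions and login system to protect anything that must stay private.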
