What Is Robots.txt?
According to Google, a robots.txt file tells search engine crawlers which URLs they can access on your site. It is mainly used to avoid overloading your website with requests.
What Is a Robots.txt File Used For?
A robots.txt file is primarily used to manage crawler traffic to your site and control which parts of your website are accessible to search engines.
How to View a Robots.txt File
A robots.txt file is always located at the root of your site. For example, for the site www.lighttangent.com, the robots.txt file can be found at:
www.lighttangent.com/robots.txt
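If you would rather fetch the file programmatically than open it in a browser, here is a minimal Python sketch using only the standard library. The domain is just the example used above; substitute your own site.

```python
from urllib.request import urlopen

# Fetch and print the robots.txt file from the site root.
# The domain here is only the example used above; swap in your own site.
url = "https://www.lighttangent.com/robots.txt"

with urlopen(url, timeout=10) as response:
    # robots.txt files should be UTF-8 encoded, so decode them as such.
    print(response.read().decode("utf-8"))
```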
A robots.txt file consists of one or more rules. Each rule either blocks or allows access for all or specific crawlers to a specified file path on the domain or subdomain where the robots.txt file is hosted. Unless otherwise specified in the file, all files are implicitly allowed for crawling.
Examples of Robots.txt
Here are two example URLs—one using HTTP and the other HTTPS:
http://www.healthaluxury.com/robots.txt
https://www.lighttangent.com/robots.txt
What Does a Robots.txt File Look Like?
Here is the robots.txt file for healthaluxury:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: http://healthaluxury.com/wp-sitemap.xml
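To see how a crawler reads these rules, here is a small sketch using Python's standard urllib.robotparser module. Note that this built-in parser is a simplified implementation and does not reproduce Google's rule-precedence behavior exactly, so the sketch only checks the two straightforward cases: a path that is explicitly disallowed and a path that is implicitly allowed because no rule mentions it.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the healthaluxury example above, fed in directly,
# so no network request is needed.
rules = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: http://healthaluxury.com/wp-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

base = "http://healthaluxury.com"
# /wp-admin/ is explicitly disallowed for all crawlers.
print(parser.can_fetch("*", base + "/wp-admin/"))      # False
# Paths not mentioned in the file are implicitly allowed.
print(parser.can_fetch("*", base + "/some-article/"))  # True
```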
How to Create a Robots.txt File for Your Website (Step-by-Step)
You can use almost any text editor to create a robots.txt file, for example Notepad, TextEdit, vi, or emacs.
Please do not use a word processor, as it may save the file in a proprietary format or introduce unwanted characters (like curly quotes) that can cause issues for crawlers.
🔹 Save the file using UTF-8 encoding, if prompted.
Format and Location Rules:
- The file must be named robots.txt.
- Your site can have only one robots.txt file.
- It must be located at the root of the site’s domain.
  ✅ Example: https://www.example.com/robots.txt
  ❌ Not allowed: https://example.com/pages/robots.txt
- If you are unsure how to access your site root or need permissions, contact your web hosting provider.
- If you can’t access the site root, consider alternative methods like meta tags to block crawling.
- A robots.txt file can also be posted on:
  - Subdomains (e.g., https://site.example.com/robots.txt)
  - Non-standard ports (e.g., https://example.com:8181/robots.txt)
- The file applies only to paths within the protocol, host, and port where it is posted. For instance, rules in https://example.com/robots.txt apply only to https://example.com/, not to https://m.example.com/ or http://example.com/.
- The file must be UTF-8 encoded (including ASCII characters). Google may ignore characters outside the UTF-8 range, potentially rendering the rules invalid.
Source: Google Search Console
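To tie those format rules together, here is a minimal sketch that writes a simple robots.txt file with explicit UTF-8 encoding. The web-root path, the rules, and the sitemap URL are placeholders rather than values from this article; adjust them for your own site.

```python
from pathlib import Path

# Hypothetical document root of your site; replace with your actual web root.
web_root = Path("/var/www/example.com/public_html")

# Placeholder rules; adjust them to match what you want crawlers to skip.
rules = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://www.example.com/sitemap.xml
"""

# The file must be named exactly robots.txt, sit at the site root,
# and be saved with UTF-8 encoding.
(web_root / "robots.txt").write_text(rules, encoding="utf-8")
```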
How to Check if Your Robots.txt File Is Working
You can verify your robots.txt file using this link in Google Search Console:
Check robots.txt in Google Search Console
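If you also want a quick spot-check outside Search Console, Python's standard urllib.robotparser module can download the live file and answer allow/disallow questions for a given user agent. Keep in mind it is a simplified parser, so treat the output as a rough check rather than an exact reproduction of how Googlebot evaluates rules; the domain and paths below are just illustrative.

```python
from urllib.robotparser import RobotFileParser

# Point the parser at the live robots.txt and download it.
parser = RobotFileParser()
parser.set_url("https://www.lighttangent.com/robots.txt")
parser.read()

# Ask whether a given crawler may fetch specific paths on the site.
for path in ("/", "/wp-admin/"):
    url = "https://www.lighttangent.com" + path
    print(path, "->", parser.can_fetch("Googlebot", url))
```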