What Is Robots.txt

What Is Robots.txt?

According to Google, a robots.txt file tells search engine crawlers which URLs they can access on your site. It is mainly used to avoid overloading your website with requests.

What Is a Robots.txt File Used For?

A robots.txt file is primarily used to manage crawler traffic to your site and control which parts of your website are accessible to search engines.

How to View a Robots.txt File

A robots.txt file is always located at the root of your site.
For example, for the site www.lighttangent.com, the robots.txt file can be found at:

www.lighttangent.com/robots.txt
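Because the file always sits at the root, its address can be worked out from any page URL on the same site. Here is a minimal Python sketch of that idea. The helper name robots_txt_url and the sample page path are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (plus port, if any); replace the path and
    # drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page path on the example site.
print(robots_txt_url("https://www.lighttangent.com/blog/post?id=1"))
# https://www.lighttangent.com/robots.txt
```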

A robots.txt file consists of one or more rules. Each rule either blocks or allows access for all or specific crawlers to a specified file path on the domain or subdomain where the robots.txt file is hosted. Unless otherwise specified in the file, all files are implicitly allowed for crawling.
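To see a rule and the implicit allow in action, here is a small sketch using Python's standard urllib.robotparser module. The rule set, the /private/ path, the example.com host, and the crawler name ExampleBot are all assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: block one directory for every crawler.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_RULES.splitlines())

# A path under the disallowed directory is blocked.
private_ok = parser.can_fetch("ExampleBot", "https://www.example.com/private/report.html")
# A path that no rule mentions is allowed by default.
blog_ok = parser.can_fetch("ExampleBot", "https://www.example.com/blog/")

print(private_ok)  # False
print(blog_ok)     # True
```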

Examples of Robots.txt

Here are two example URLs—one using HTTP and the other HTTPS:

      • http://www.healthaluxury.com/robots.txt

      • https://www.lighttangent.com/robots.txt

    What Does a Robots.txt File Look Like?

    Here is the robots.txt file for healthaluxury:

    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php
    Sitemap: http://healthaluxury.com/wp-sitemap.xml
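The file above blocks /wp-admin/ but makes an exception for one file inside it. The Python sketch below shows one way such a pair of rules can be resolved. It assumes the precedence Google documents for its own crawlers: the matching rule with the longest path wins, and Allow wins a tie. The function names are hypothetical, wildcards (* and $) are not handled, and only groups addressed to all crawlers are read. Python's built-in urllib.robotparser applies the first matching rule in file order instead, which is why the logic is written out by hand here.

```python
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: http://healthaluxury.com/wp-sitemap.xml
"""


def rules_for_all_crawlers(text):
    """Return (directive, path) pairs from groups addressed to 'User-agent: *'."""
    rules = []
    applies = False         # does the current group target all crawlers?
    reading_agents = False  # are we inside a run of User-agent lines?
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                applies = False  # a new group starts here
            applies = applies or value == "*"
            reading_agents = True
        else:
            reading_agents = False
            if field in ("allow", "disallow") and applies and value:
                rules.append((field, value))
    return rules


def is_allowed(rules, path):
    """Longest matching rule wins; Allow wins ties; no match means allowed."""
    best = ("allow", "")  # implicit allow when nothing matches
    for directive, rule_path in rules:
        if not path.startswith(rule_path):
            continue
        longer = len(rule_path) > len(best[1])
        tie_for_allow = len(rule_path) == len(best[1]) and directive == "allow"
        if longer or tie_for_allow:
            best = (directive, rule_path)
    return best[0] == "allow"


rules = rules_for_all_crawlers(SAMPLE)
print(is_allowed(rules, "/wp-admin/options.php"))     # False
print(is_allowed(rules, "/wp-admin/admin-ajax.php"))  # True
print(is_allowed(rules, "/blog/hello-world/"))        # True
```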
    

    How to Create a Robots.txt File for Your Website (Step-by-Step)

    You can use almost any text editor to create a robots.txt file—for example, Notepad, TextEdit, vi, or emacs.
    Please do not use a word processor, as it may save the file in a proprietary format or introduce unwanted characters (like curly quotes) that can cause issues for crawlers.

    🔹 Save the file using UTF-8 encoding, if prompted.
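If you would rather generate the file from a script than from a text editor, the same advice applies: write plain text and set the encoding explicitly. The Python sketch below is one way to do that. The function name write_robots_txt and its arguments are hypothetical, and the target directory is whatever folder your host serves as the site root.

```python
from pathlib import Path


def write_robots_txt(directory, lines):
    """Write the given rule lines to <directory>/robots.txt as UTF-8 text."""
    path = Path(directory) / "robots.txt"  # the file name must be exactly robots.txt
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

For example, write_robots_txt("public_html", ["User-agent: *", "Disallow: /wp-admin/"]) would create public_html/robots.txt, assuming a folder named public_html already exists.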

    Format and Location Rules:

       

        • The file must be named robots.txt.

        • Your site can have only one robots.txt file.

        • It must be located at the root of the site’s domain.
          ✅ Example: https://www.example.com/robots.txt
          ❌ Not allowed: https://example.com/pages/robots.txt

        • If you are unsure how to access your site root or need permissions, contact your web hosting provider.

        • If you can’t access the site root, consider an alternative blocking method, such as robots meta tags.

        • A robots.txt file can also be posted on:

             

              • Subdomains (e.g., https://site.example.com/robots.txt)

              • Non-standard ports (e.g., https://example.com:8181/robots.txt)

          • The file applies only to paths within the protocol, host, and port where it is posted.

               

                • For instance, rules in https://example.com/robots.txt apply only to https://example.com/—not to https://m.example.com/ or http://example.com/.

            • The file must be a UTF-8 encoded text file (UTF-8 includes ASCII). Google may ignore characters that are not part of the UTF-8 range, which can render the rules invalid.

          Source: Google Search Central
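The scope rule above (same protocol, host, and port) can be expressed as a short check. This is a rough Python sketch, not how any particular crawler implements it. The helper names origin and robots_applies are invented for illustration, and default ports are assumed to be 80 for HTTP and 443 for HTTPS.

```python
from urllib.parse import urlsplit

# Assumed default ports when a URL does not state one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that a robots.txt file is tied to."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, (parts.hostname or "").lower(), port


def robots_applies(robots_url, page_url):
    """True when page_url shares protocol, host, and port with robots_url."""
    return origin(robots_url) == origin(page_url)


robots = "https://example.com/robots.txt"
print(robots_applies(robots, "https://example.com/pages/about"))  # True
print(robots_applies(robots, "https://m.example.com/"))           # False (different host)
print(robots_applies(robots, "http://example.com/"))              # False (different protocol)
print(robots_applies(robots, "https://example.com:8181/"))        # False (different port)
```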

          How to Check if Your Robots.txt File Is Working

          You can verify your robots.txt file using this link in Google Search Console:
          Check robots.txt in Google Search Console