Architecting Authority

Free SEO Tool

Free robots.txt Generator

Tell search engines exactly what to crawl and what to skip. Block the right paths, protect from AI training bots, and add your sitemap. Download in 60 seconds.

New to robots.txt? Here is what it does and why it matters.
A typical robots.txt looks like this
# Rules for all search engines
User-agent: *
Disallow: /wp-admin/
Disallow: /cart/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/sitemap.xml
What you need to know
Lives at yourdomain.com/robots.txt. Google checks it regularly before crawling your pages.
Controls which pages Google crawls — not which ones it shows.
A wrong setting can remove your whole site from Google. Use safe defaults.
Every website should have one. Many sites have none or a broken one.
This tool generates the correct file. You download and upload it once.
Loads the recommended default settings for your platform. You can adjust everything after.
These areas have no SEO value and waste your crawl budget. Safe to block on almost every site.
These pages often create duplicate content. Block them unless you specifically optimise them for SEO.
Training bots use your content to build AI models. Citation bots use it to answer questions and link back to you. Block training, allow citation.
Adding your sitemap URL helps Google find all your pages faster. Format: https://yourdomain.com/sitemap.xml
Seconds between bot requests. Google ignores this, but Bing and smaller bots respect it. Useful if your server is slow. Leave at 0 unless you have a reason to change it.
Enable crawl delay: adds a Crawl-delay line to your robots.txt
First time? Click Download robots.txt below. Then follow the upload instructions for your platform further down this page.
Use this prompt in ChatGPT or Claude
Copy this personalised prompt and paste it into ChatGPT or Claude for deeper robots.txt advice.
Where to upload your robots.txt file
WordPress
Install Yoast SEO or Rank Math. Go to SEO Settings and find the robots.txt editor. Paste your file content there and save.
Shopify
Go to Online Store > Themes > Edit Code. Find robots.txt.liquid in the Templates folder. Replace the content and save.
Squarespace
Go to Settings > Advanced > URL Mappings. You cannot directly edit robots.txt on free Squarespace plans. Upgrade to Business plan to access Developer Mode.
Webflow
Go to Project Settings > SEO > robots.txt. Paste your file content in the text box and publish your site.
Wix
Go to SEO Tools > SEO Settings > Robots.txt. Wix provides a robots.txt editor in the dashboard for Premium plans.
Custom / FTP
Connect to your server via FTP or cPanel File Manager. Navigate to the root directory (same level as index.html). Upload your robots.txt file there.
Understanding your settings

What each robots.txt rule does

Knowing the difference between these three types of rules prevents accidental ranking loss.

Disallow
Block a path from crawling

Tells Google not to crawl this URL or folder. Does not remove the page from search results if it is already indexed. Use this for admin areas, duplicate content, and private pages.
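For example, a minimal rule set that blocks two common low-value areas (the paths here are illustrative; use the folders that actually exist on your site):

```
# Keep crawlers out of the cart and internal search results
User-agent: *
Disallow: /cart/
Disallow: /search/
```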

Allow
Override a blocked path

When you block a folder, Allow lets you re-open specific paths within it. Example: block /wp-admin/ but allow /wp-admin/admin-ajax.php so dynamic features still work.
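In robots.txt form, that WordPress example reads:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```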

User-agent
Target a specific bot

User-agent: * applies rules to all bots. Naming a specific bot like GPTBot or Googlebot creates rules that only apply to that one. Useful for blocking AI training without affecting search rankings.
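For instance, to stop one named bot while leaving every other crawler unrestricted (an empty Disallow line means nothing is blocked):

```
# Applies only to OpenAI's GPTBot
User-agent: GPTBot
Disallow: /

# All other bots: no restrictions
User-agent: *
Disallow:
```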

Alokk's perspective
Alokk, Founder and Lead Growth Architect, Groew
In 2024 and 2025, the robots.txt file became important in a new way: AI companies began checking it to respect content ownership. The smart move is not to block everything out of fear, and not to leave everything open out of ignorance. Block AI training bots that consume your content without giving you traffic. Allow AI citation bots that can put your brand in front of millions of AI search users. One client added the right bot rules to their robots.txt, and within 90 days their brand started appearing in Perplexity and ChatGPT answers for their core keywords. The robots.txt file is no longer just a technical formality.
Common questions

robots.txt questions answered

What is a robots.txt file?
A robots.txt file is a plain text file at the root of your website that tells search engines which pages to crawl and which to skip. It lives at yourdomain.com/robots.txt and is the first file Google reads when it visits your site. Every website should have one.
Can a bad robots.txt hurt my rankings?
Yes. A badly configured robots.txt is one of the most common causes of pages disappearing from Google. If you block your CSS or JavaScript files, Google cannot render your pages properly. If you block your main content, it will not rank. Always use Google Search Console to test your file after any change.
Should I block AI bots?
It depends. Blocking AI training bots like GPTBot prevents your content being used to train new AI models but does not affect your Google rankings. Blocking citation bots like PerplexityBot means you will not be cited in AI search answers. The smart move is to block training-only bots and allow citation bots. This generator handles that distinction automatically.
Where does the robots.txt file go?
The robots.txt file must sit at the root of your website so it is accessible at yourdomain.com/robots.txt. On WordPress, use the Yoast SEO or Rank Math robots.txt editor. On Shopify, go to Online Store, Themes, Edit Code, then find robots.txt.liquid. On Webflow, it is in Project Settings under SEO. See the upload instructions further up this page for all major platforms.
Does blocking a page in robots.txt remove it from Google?
No. Blocking a URL in robots.txt stops Google from crawling it, but Google may still index the page if other pages link to it. To completely remove a page from search results you need a noindex meta tag on the page itself, or a removal request in Google Search Console. robots.txt controls crawling, not indexing.
How often should I update my robots.txt?
Review your robots.txt whenever you add new site sections that should not be crawled, change your sitemap URL, or want to add bot restrictions. For most small and medium sites, reviewing it every six months is enough. After any major site restructure, review it immediately.
From Groew's Search Authority Team

The complete guide to robots.txt for business owners

Most robots.txt guides are written for developers. This one is written for founders and marketers who need to get it right without becoming an SEO expert.

Why every website needs a robots.txt file

Google crawls billions of pages every day using a limited budget assigned to each domain. Crawl budget is the number of pages Google will visit on your site in a given period. If you have no robots.txt, Google crawls everything including your admin panel, thank-you pages, cart pages, and search result URLs. Every wasted crawl is a crawl your important pages did not get. A properly configured robots.txt file directs Google to spend its entire crawl budget on pages that matter. For large sites, this is a significant ranking factor. For small sites, it still signals to Google that your site is technically well-maintained. Both matter for organic search infrastructure.

Read the complete guide

The five most dangerous robots.txt mistakes

The first mistake is accidentally blocking the entire site with Disallow: /. This removes every page from Google's crawl queue. The second is blocking CSS and JavaScript folders. Google needs to render your pages and if it cannot load your styles or scripts, it sees a broken page. The third is blocking your sitemap. The fourth is using a robots.txt from a template without adjusting it for your platform. The fifth is forgetting to add the sitemap URL at the bottom. None of these will destroy your site overnight but each one silently reduces your search performance month by month. This generator uses safe defaults based on your platform to avoid all five.
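The first mistake is worth seeing side by side, because a single character separates the two rules:

```
# Blocks your entire site from crawling
User-agent: *
Disallow: /

# Blocks nothing: an empty Disallow allows everything
User-agent: *
Disallow:
```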

How to handle AI bots in 2025

In 2023 and 2024, every major AI company launched a web crawler. OpenAI launched GPTBot. Anthropic launched anthropic-ai. Google launched Google-Extended for Bard training. Common Crawl (used by many AI companies) continues to index the open web. The question for every website owner is: do you want your content used to train AI models? Many do not. The distinction to understand is that there are two different types of AI bots: training bots that consume your content to build AI models, and citation bots that read your content to answer user queries and link back to you. Blocking GPTBot prevents OpenAI from training on your content but does not affect whether ChatGPT cites you in search answers. Blocking PerplexityBot means Perplexity will not crawl your site and will not cite you in its AI search results. The Groew robots.txt generator defaults to blocking training bots while keeping citation pathways open, because AI citations are becoming as valuable as Google rankings for brand authority.
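As a sketch, the policy described above could look like the file below. The user-agent strings are the ones these companies have published, but they change over time, so verify the current names before deploying:

```
# Block AI training bots
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

# Citation bots such as PerplexityBot are not named here,
# so they fall under the default rules and stay allowed
User-agent: *
Disallow: /wp-admin/

Sitemap: https://yourdomain.com/sitemap.xml
```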

Testing your robots.txt after changes

After uploading your robots.txt file, verify it is accessible by visiting yourdomain.com/robots.txt in your browser. You should see the plain text content. Then go to Google Search Console, navigate to Settings, and use the robots.txt report to check for any crawl errors. Google also provides a URL inspection tool that shows whether a specific page is blocked by your robots.txt. For a thorough check, use the free SEO audit tool which flags crawlability issues alongside meta tags, headings, and schema. Set a reminder to review your robots.txt every six months or after any major site restructure.
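Beyond the browser check, you can test specific URLs against your rules locally with Python's built-in urllib.robotparser. This is a sketch: the domain and paths are placeholders, and the Allow line is listed before the broader Disallow because Python's parser applies rules in file order (Google instead uses longest-match, so order does not matter there):

```python
from urllib import robotparser

# Feed the rules directly; against a live site you would call
# rp.set_url("https://yourdomain.com/robots.txt") and rp.read()
rules = """User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Check whether a generic bot ("*") may crawl each URL
print(rp.can_fetch("*", "https://yourdomain.com/wp-admin/admin-ajax.php"))  # True
print(rp.can_fetch("*", "https://yourdomain.com/wp-admin/settings.php"))    # False
print(rp.can_fetch("*", "https://yourdomain.com/blog/"))                    # True
```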

Platform-specific robots.txt considerations

WordPress sites should always block /wp-admin/ but must allow /wp-admin/admin-ajax.php or dynamic features like forms and cart updates will break. Shopify automatically generates a robots.txt file but it can be overridden via the Liquid template system for advanced control. Squarespace has restricted robots.txt access on lower plans. Webflow provides a dedicated robots.txt editor in project settings. For custom-built sites, the file must be manually uploaded to the root directory via FTP or your hosting control panel. Always check that your file is accessible at the /robots.txt path immediately after uploading.

robots.txt as part of your Revenue Infrastructure

A robots.txt file is not exciting. But it is foundational. Every site that Groew audits as part of a B2B SEO system build gets a robots.txt review in the first week. We have seen sites where a single incorrect Disallow rule was responsible for 40 percent of their pages being uncrawled for over a year. Getting the technical foundation right is not optional. It is the first step in building organic authority that compounds. Every ranking you earn depends on Google being able to crawl and understand your pages in the first place. The Digital Landlord model starts with making sure the doors to your property are open to the right visitors.

Technical SEO is just the foundation. We build the system on top.

Groew audits your full technical setup, content architecture, and search authority as part of a free growth audit. We show you exactly what a 90-day infrastructure sprint would build in your market.
