WordPress Tip: Avoid Crawling Your Site Too Early

Here’s a tip for new WordPress installations: how to keep search engines from crawling the site too early.

I see new WordPress users make this mistake so often that I have to write about it. People install WordPress on a public web server and go merrily along creating their website. Days pass and lots of work gets done. What they don’t know is that search engines may already be crawling their site!

When you install WordPress onto your HTTP server for the first time, you must avoid this at all costs! You do not want your website crawled by Google while you are still working on it. If Google indexes pages before the site is complete, you will end up with unfinished or unwanted pages in the search results, and you will have one heck of a time getting them removed from Google’s search index.

Let’s avoid that entire issue and be smarter. By doing so, you will save yourself a lot of grief and wasted time.

Create An index.html Page

What you should do first is create an index.html page and place it at the root of your domain. Make it a very simple HTML5 document telling all visitors that your site is under construction. Run this page through the W3C validator so the markup on your home page is clean. I can’t promise that invalid HTML earns a search engine penalty, but you always want to start off on a good foot with Google!
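Here is a minimal sketch of what such a placeholder page might look like, wrapped in a small Python script that writes it to disk. The page title, the wording, and the `write_placeholder` helper are my own illustrative assumptions, not anything WordPress requires; adjust them to suit your site.

```python
from pathlib import Path

# A bare-bones HTML5 "under construction" page.
# The title and text are illustrative; note that it contains no links at all.
PLACEHOLDER_HTML = """<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Site Under Construction</title>
</head>
<body>
  <h1>Under Construction</h1>
  <p>This site is being built. Please check back soon.</p>
</body>
</html>
"""


def write_placeholder(root: Path) -> Path:
    """Write index.html into the given domain root folder (hypothetical helper)."""
    target = root / "index.html"
    target.write_text(PLACEHOLDER_HTML, encoding="utf-8")
    return target
```

You would call `write_placeholder` with the path to your domain’s root folder and then upload or validate the resulting file as usual.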

Create a robots.txt File

As an additional step, open up your text editor and put this in it:

User-agent: *
Disallow: /

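Save the file as robots.txt and put it in the root of your domain folder. This tells all search engines not to crawl anything on the domain, including your WordPress site. Well-behaved crawlers such as Googlebot honor it, but keep in mind that robots.txt is a request, not an access control.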
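If you want to double-check that these two lines really block everything, one option is Python’s standard `urllib.robotparser` module. The sketch below is only an illustration: the `is_crawl_allowed` helper is a name I made up, and the URL is built from the example domain.com folder used in this post.

```python
from urllib.robotparser import RobotFileParser

# The same two rules shown above.
ROBOTS_TXT = """User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # While the site is under construction this prints False.
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://domain.com/wordpress/"))
```

The same helper is handy again at launch: once the two lines are removed, the check should report that crawling is allowed.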

Install WordPress in its Own Folder

Next, install WordPress in its own folder. This is very important. Your domain’s directory on the web server should look something like this:

/public_html
  /domain.com
    robots.txt
    index.html
    /wordpress

If Google happens to come along and crawl your site, it is not going to know anything about your website because there are no anchor tag links from the index.html file to the WordPress folder. You also told the spiders to shoo away.

This way, you can develop your WordPress website behind the scenes and not get any of the pages of your site crawled.

When you install, don’t forget to tell WordPress to discourage search engines from indexing your site. The installer offers this as a Search Engine Visibility checkbox. If you forget, go to Settings | Reading | Search Engine Visibility and check “Discourage search engines from indexing this site”. Confirm that it is checked, just to be safe.
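As far as I know, WordPress acts on this setting by adding a robots meta tag containing `noindex` to each page it serves. A rough way to confirm the setting took effect is to look for that tag in the page source. The sketch below does this with Python’s standard `html.parser`. The class and function names are hypothetical, and the exact tag WordPress emits varies by version, so treat the sample markup in the comments as an assumption.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "robots":
            content = attr_map.get("content") or ""
            # e.g. content="noindex, nofollow" (assumed sample, varies by version)
            self.directives += [
                part.strip().lower() for part in content.split(",") if part.strip()
            ]


def discourages_indexing(html: str) -> bool:
    """Return True if the page source carries a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives
```

Paste the source of any page from your WordPress folder into `discourages_indexing` to run the check. If it returns False while you are still building the site, revisit the setting.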

Going Live for Your WordPress Website

Follow my instructions on what to do after installing WordPress.

To go live, follow the instructions in the section “Install in another directory”. This points the root of your domain at your wordpress folder, so the WordPress site becomes what visitors see.

Finally:

  1. Uncheck “Discourage search engines from indexing this site” under Settings | Reading | Search Engine Visibility
  2. Remove the two lines you entered in robots.txt
  3. Remove the index.html file

You are now published.