Internet Business Blog

 

Robots.txt Files: Fence off Sections of Your Web Site from Search Engines.


August 13, 2007

Do you ever forget to do something that's really simple? It's easy to overlook the simple things when you're worried about more complex issues such as Search Engine Optimization (SEO) or getting traffic to your website. Robots.txt files fall into that category. Do you even have a robots.txt file on your site? It's very simple to create, and it can help your site's standing in the search engines in a couple of ways.

What is a robots.txt file?
A robots.txt file is a small text file that you place in the root directory of your web site. In it, you list the directories that robots (search engine spiders) should not visit. You can also give different instructions to different robots (search engines) by targeting specific user-agents, but generally that's not necessary.

Here's a sample robots.txt file:
User-agent: *
Disallow: /cgi-bin/
Disallow: /print-friendly/
Disallow: /~john/

Some Rules:

  • Specify one path per Disallow line.
    • The sample file above would stop robots from crawling the cgi-bin, print-friendly, and ~john directories.

  • You can have only one robots.txt file per site, and it has to be in the root directory of your site.
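
To see how a rule-abiding robot would read the sample file, here is a minimal Python sketch using the standard library's urllib.robotparser module. The host www.example.com, the robot name ExampleBot, and the page paths are placeholders invented for illustration; only the robots.txt rules come from the sample above.

```python
from urllib.robotparser import RobotFileParser

# The sample robots.txt from this article, held in a string.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /print-friendly/
Disallow: /~john/
"""


def make_parser(robots_txt):
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser, path, user_agent="ExampleBot"):
    """Return True if a robot that obeys the rules may fetch the path."""
    # www.example.com is a placeholder host, not a site from the article.
    return parser.can_fetch(user_agent, "http://www.example.com" + path)


if __name__ == "__main__":
    parser = make_parser(SAMPLE_ROBOTS_TXT)
    # Hypothetical page paths, chosen only to exercise the rules.
    for path in ["/index.html", "/cgi-bin/form.pl", "/print-friendly/robots.html"]:
        status = "allowed" if is_allowed(parser, path) else "blocked"
        print(path, "->", status)
```

This only tells you what a well-behaved robot should do. It says nothing about robots that ignore the file.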

Other ways to do it:
There is also a META tag with a similar purpose.
Put this META tag in the head section of any page you don't want indexed by a search engine. NOINDEX asks robots not to index the page, and NOFOLLOW asks them not to follow the links on it.

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
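
If you want to confirm that a page really carries the tag, a small check along these lines may help. It is only a sketch built on Python's standard html.parser module; the class name RobotsMetaParser and the function robots_directives are names made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in robots META tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so an
        # upper-case <META NAME="ROBOTS" ...> tag matches as well.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of lower-cased robots directives in an HTML page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'
        "</head><body>Printer-friendly page</body></html>"
    )
    print(sorted(robots_directives(page)))
```

A page without the tag yields an empty set, which makes it easy to flag pages that were published from the wrong template.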

Important:
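Because not all robots support or respect the robots.txt file or the META tag, your best bet is to use both. Keep in mind that a robot that obeys robots.txt never fetches a disallowed page, so it never sees the META tag on that page. The tag matters for robots that crawl the page anyway.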

SEO:
You may be wondering exactly how this could affect SEO (Search Engine Optimization). The biggest way is with duplicate content. Search engines do not want to find duplicate content. If you have a printer-friendly version of your blog entries, then you want to stop those duplicate printer-layout versions from being crawled. I have my templates and publishing parameters set up to put all the printer-friendly pages in a specific folder, and I list this folder in my robots.txt file. I also configured the publishing template for the printer-friendly pages to use the META tag shown above. This stops most search engines from indexing the printer-friendly versions of the pages and therefore eliminates a possible problem caused by duplicate content.
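
If your publishing process is scripted, the robots.txt file can be generated from the list of folders you want fenced off. The sketch below is not my actual publishing setup. The function name build_robots_txt is made up for this example, and the folder names are taken from the sample file earlier in this post.

```python
def build_robots_txt(disallowed_dirs, user_agent="*"):
    """Return robots.txt text with one Disallow line per directory."""
    lines = ["User-agent: " + user_agent]
    for directory in disallowed_dirs:
        # Normalize "name", "/name", and "name/" to "/name/" so the
        # rule covers the whole folder.
        name = directory.strip("/")
        path = "/" + name + "/" if name else "/"
        lines.append("Disallow: " + path)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Folder names borrowed from the sample file in this post.
    print(build_robots_txt(["cgi-bin", "/print-friendly/"]), end="")
```

Save the output as robots.txt in the root directory of your site.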

The second way this helps is by keeping the printer-friendly pages out of the search engine result pages altogether. I want people who find my site through a search to land on the regular versions of my pages, not the printer-friendly versions. My printer-friendly template strips off the left and right columns and therefore removes most of the navigation. With only the regular pages listed in the search engines, a visitor to my site has a better experience.

Of course, there are other reasons to stop certain subdirectories from being crawled. You may have products such as eBooks, training videos, or scripts and test pages that you do not want showing up in search results. Because not every spider respects the robots.txt file, and because anyone can read the file to see which directories you listed, always keep sensitive data in subdirectories protected by a username and password.

The robots.txt file is just a small, easy to create text file, but small things like this can add up to make a big difference.

Learn more about robots.txt files here: www.robotstxt.org/wc/robots.html.

Fred Black


Posted by Fred on August 13, 2007


You may reprint or distribute this article as long as you leave the content intact, list me as the author, and include a link back to this web site.

Assigned Categories: Search Engines: SEO | Web Site Design, HTML, CSS


Comments and TrackBacks
  Comments:
  1. From: Dr. Robert Doebler

    Thank you for the information on SEO. It has really helped me.

    Posted by Dr. Robert Doebler on September 13, 2011 5:27 AM

 


About this Blog...

By: Fred W. Black

Contact Information

Powered by: Movable Type 3.34.

Copyright 2006-2011, PhaseQuest.Com.
All rights reserved.

Subscribe by EMail RSS 2.0 Feed for www.pqInternet.com.

Add to Google Toolbar
My Facebook Fan Page
Twitter
Add www.pqInternet.com, to Google. Add www.pqInternet.com, to My Yahoo! Add www.pqInternet.com, to My MSN. Subscribe to www.pqInternet.com, with Bloglines Add www.pqInternet.com, to Your Technorati Favorites! Add www.pqInternet.com, to Windows Live

rs

Some photos are by: Lee Hinshaw Photography
