Robots.txt Files: Fence off Sections of Your Web Site from Search Engines.

August 13, 2007

Do you ever forget to do something that's really simple? It's easy to overlook the simple things when you're worried about more complex issues such as Search Engine Optimization (SEO) or getting traffic to your website. Robots.txt files fall into that category. Do you even have a robots.txt file on your site? It's very simple, and it can help your site's ranking in the search engines in a couple of ways.

What is a robots.txt file?
A robots.txt file is a small text file that you place in the root directory of your web site. In it, you list the directories that robots (search engine spiders) should not visit. If you'd like, you can get specific and give different instructions to different robots (search engines) by targeting specific user-agents, but generally that's not necessary.
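To make the idea of targeting specific user-agents concrete, here is a minimal sketch that uses Python's standard urllib.robotparser module to read a robots.txt file with two groups of rules. The robot names "ExampleBot" and "OtherBot" and the /drafts/ directory are made up for illustration; they are assumptions, not part of my own site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group of rules for a robot that calls itself
# "ExampleBot", and a catch-all group ("*") for every other robot.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /drafts/

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if the parsed rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, path)


print(allowed("ExampleBot", "/drafts/post.html"))  # rule in ExampleBot's group
print(allowed("OtherBot", "/drafts/post.html"))    # falls back to the "*" group
print(allowed("OtherBot", "/cgi-bin/form.cgi"))    # rule in the "*" group
```

With this parser, a robot that has its own group follows that group instead of the catch-all one, which is one reason a single "User-agent: *" group is usually the simplest choice.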

Here's a sample robots.txt file:
User-agent: *
Disallow: /cgi-bin/
Disallow: /print-friendly/
Disallow: /~john/

Some Rules:

  • Specify one subdirectory per Disallow line.
    • The above example would stop robots from crawling the cgi-bin, print-friendly, and ~john directories.

  • You can have only one robots.txt file per site, and it has to be in the root directory of your site.
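If you want to check what the sample file above actually blocks, a short Python sketch along these lines can help. It feeds the sample rules to the standard urllib.robotparser module and reports which paths a well-behaved robot would skip. The function name blocked_paths, the robot name "AnyBot", and the test paths are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# The sample robots.txt file from this article.
SAMPLE = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /print-friendly/
Disallow: /~john/
"""


def blocked_paths(robots_txt, paths, agent="AnyBot"):
    """Return the paths that the rules in `robots_txt` forbid for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]


# Hypothetical paths, used only to exercise the rules.
paths = [
    "/index.html",
    "/cgi-bin/mail.cgi",
    "/print-friendly/post.html",
    "/~john/notes.txt",
]
print(blocked_paths(SAMPLE, paths))
```

Only /index.html should be missing from the printed list, because each Disallow line matches every path that starts with the listed directory.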

Other ways to do it:
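There is also a META tag that has just about the same meaning.
Use this meta tag in the head section of a page you don't want indexed by a search engine. NOINDEX asks the robot not to index the page, and NOFOLLOW asks it not to follow the links on the page.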

<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
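To confirm that a page really carries the tag, you could scan its HTML with something like the sketch below, which uses Python's standard html.parser module. The class name RobotsMetaFinder, the function robots_directives, and the tiny test page are hypothetical names chosen for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so the upper-case
        # tag shown in the article is matched as well.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives found in `html`."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


page = (
    "<html><head>"
    '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'
    "</head><body>Printer layout</body></html>"
)
print(sorted(robots_directives(page)))
```

A page without the tag gives back an empty set, so a quick check for "noindex" in the result tells you whether a template is emitting the tag.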

Important:
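Because not all robots support or respect the robots.txt file or the META tag, your best bet is to use both. Keep in mind that a robot that obeys robots.txt never fetches a disallowed page, so it never sees the META tag on that page; the tag matters for robots that ignore robots.txt but honor the tag.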

SEO:
You may be wondering exactly how this could affect SEO (Search Engine Optimization). The biggest way is with duplicate content. Search engines do not want to find duplicate content. If you have a printer-friendly version of your blog entries, then you want to stop the duplicate printer-layout versions of your entries from being crawled. I have my templates and publishing parameters set up to put all the printer-friendly pages in a specific folder, and I list this folder in my robots.txt file. I also configured the publishing template for the printer-friendly pages to use the META tag shown above. This stops most search engines from indexing the printer-friendly versions of the pages and therefore eliminates a possible problem caused by having duplicate content.

The second way this helps is to stop the printer-friendly pages from showing up in the search engine result pages at all. I want people who find my site by searching to land on the regular versions of my pages, not the printer-friendly versions. My printer-friendly template strips off the left and right columns and therefore removes most navigation. With only the regular pages listed in the search engines, visitors to my site get a better experience.

Of course, there are other reasons to stop certain subdirectories from being crawled. You may have products such as eBooks, training videos, or scripts and test pages that you do not want showing up in the results of a search. Because the robots.txt file is not respected by every spider crawling around out there, you should always keep sensitive data in subdirectories that are protected by a username and password.
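If you keep a list of the folders you want fenced off, a few lines of Python can write the robots.txt text for you. This is only a sketch: the function name build_robots_txt and the folder names are assumptions for illustration, and it does nothing to secure the folders themselves.

```python
def build_robots_txt(directories, agent="*"):
    """Return robots.txt text that disallows each directory for `agent`."""
    lines = [f"User-agent: {agent}"]
    for directory in directories:
        name = directory.strip("/")
        if not name:
            # An empty name would become "Disallow: /", which blocks the
            # whole site, so skip it rather than emit it by accident.
            continue
        # Normalise to the "/name/" form used in the sample file.
        lines.append(f"Disallow: /{name}/")
    return "\n".join(lines) + "\n"


# Hypothetical folder names for products and test pages.
print(build_robots_txt(["print-friendly", "/ebooks/", "test-pages"]))
```

Save the result as robots.txt in the root directory of your site. Remember that the file is public, so it only asks robots to stay away; the username and password are what actually protect the files.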

The robots.txt file is just a small, easy-to-create text file, but small things like this can add up to make a big difference.

Learn more about robots.txt files here: www.robotstxt.org/wc/robots.html.

Fred Black

About the Author

Fred Black is an experienced programmer, web site developer, online business operator, systems integrator, father, husband, musician, and songwriter. Visit his Internet Business Blog at: http://www.pqInternet.com.



Posted by Fred on August 13, 2007


Assigned Categories: Search Engines: SEO | Web Site Design, HTML, CSS



You may reprint or distribute this article as long as you leave the content and the About the Author resource box at the end intact.

 

 
Copyright 2006-2008, PhaseQuest.Com.
All rights reserved.


Some photos are by: Lee Hinshaw Photography
