
subject: Harnessing the Power of Robots.txt



Once we have a website up and running, we need to make sure that all visiting search engines can access all the pages we want them to look at.


Sometimes we may want search engines not to index certain parts of the site, or we may even want to ban certain search engines from the site altogether.

This is where a simple little two-line text file called robots.txt comes in.

Robots.txt resides in your website's main (root) directory (on many Linux hosts this is the /public_html/ directory), and looks something like the following:

User-agent: *
Disallow:

The first line names the "bot" the rule applies to (* matches every bot); the second line controls whether that bot is allowed in, or which parts of the site it is not allowed to visit. An empty Disallow: line blocks nothing.
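As a rough illustration of how those two lines are interpreted, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name allowed, the bot name "anybot" and the /private/ path are assumptions made up for this example; they do not come from the article.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The two-line file from the article: every bot, nothing disallowed.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Hypothetical variant: every bot, but one directory is off limits.
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"

print(allowed(ALLOW_ALL, "anybot", "/index.html"))              # True
print(allowed(BLOCK_PRIVATE, "anybot", "/private/notes.html"))  # False
print(allowed(BLOCK_PRIVATE, "anybot", "/index.html"))          # True
```

Keep in mind that this only shows how a well-behaved parser reads the rules; a robot that ignores robots.txt is not stopped by it.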

If you want to handle multiple "bots", then simply repeat the above lines for each one, with a blank line between the groups.

For example:

User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /


This will allow Google (user-agent name Googlebot) to visit every page and directory, while at the same time banning Ask Jeeves from the site completely.
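The two-bot example can be checked the same way. This is only a sketch with Python's standard urllib.robotparser; the www.example.com URL and the bot name "somebot" are placeholders invented for the test, and the user-agent names are taken as written in the article.

```python
from urllib.robotparser import RobotFileParser

# The example rules: one group per bot, separated by a blank line.
RULES = """\
User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

page = "http://www.example.com/any/page.html"  # placeholder URL
google_ok = parser.can_fetch("googlebot", page)
jeeves_ok = parser.can_fetch("askjeeves", page)
other_ok = parser.can_fetch("somebot", page)  # a bot with no group of its own

print(google_ok, jeeves_ok, other_ok)  # True False True
```

A bot that is not named in any group, and for which no * group exists, is treated as allowed.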

To find a "reasonably" up-to-date list of robot user-agent names, visit

Even if you want to allow every robot to index every page of your site, it is still advisable to put a robots.txt file on your site. Without one, every search engine request for robots.txt fails with a "not found" error, and those entries fill up your error logs.
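If you would rather script this than create the file by hand, something like the following sketch would do. The function name write_allow_all is hypothetical, and the directory you pass in is assumed to be your site's main directory; adjust it to wherever your host keeps the web root.

```python
from pathlib import Path

# The allow-everything rules from the start of the article.
ALLOW_ALL_RULES = "User-agent: *\nDisallow:\n"


def write_allow_all(site_root):
    """Create robots.txt in site_root unless one is already there."""
    target = Path(site_root) / "robots.txt"
    if not target.exists():
        # Never overwrite an existing file: it may hold real rules.
        target.write_text(ALLOW_ALL_RULES, encoding="ascii")
    return target
```

Calling write_allow_all("/path/to/site") returns the path of the file; an existing robots.txt is left untouched.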



