Why Pages Disallowed in Robots.txt Still Appear in Google
Meaning of Robots.txt
Robots.txt is a plain-text file placed in your website's root directory that tells search engine crawlers which parts of the site they may access. One of its most useful directives is "Disallow", which stops search engines from crawling private or irrelevant sections or pages of your website, e.g.:
Disallow: /temp/
Disallow: /mypage.html
You can even block search engines from crawling every page on your domain, e.g.:
User-agent: *
Disallow: /
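Rules like these can be checked locally before you deploy them. The sketch below uses Python's standard urllib.robotparser module to parse a rule set and ask whether a given URL may be fetched. The rule text, the is_allowed helper, and the abc.com paths are illustrative assumptions, and Google's own parser may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules modelled on the Disallow examples above.
RULES = """\
User-agent: *
Disallow: /temp/
Disallow: /mypage.html
"""


def is_allowed(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/index.html", "/temp/draft.html", "/mypage.html"):
        url = "http://www.abc.com" + path
        print(path, "allowed" if is_allowed(RULES, url) else "blocked")
```

Passing "User-agent: *" followed by "Disallow: /" as the rule text makes the helper report every path on the domain as blocked.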
How Can Blocked Pages Still Appear in Google?
It is worth taking a moment to understand how and why this happens. Assume you have a page at http://www.abc.com/mypage.html containing confidential information about your company's new "coupon codes" project. You may want to share that page with partners, but you don't want the information to become public knowledge just yet. Therefore, you block the page using a declaration in http://www.abc.com/robots.txt:
User-agent: *
Disallow: /mypage.html
A few weeks later, you search for "coupon codes" in Google and find http://www.abc.com/mypage.html on the first page of results. How could this happen? Google abides by your robots.txt instructions, doesn't it?
It does, and this is not a violation of the robots.txt rules. Robots.txt stops Google from crawling the page, not from listing its URL. It happens for a very simple reason: Google found the URL elsewhere. http://www.abc.com/mypage.html may be linked from an external website, and Google picked it up from there. The information shown in the search result also comes from that external link, not from your page content.
Several solutions will stop your pages from appearing in Google search results:
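Set a "noindex" meta tag: Google will not show your page or follow its links if it finds this tag. Google must be able to crawl the page to see the tag, so remove the page's Disallow rule from robots.txt when you use this method. Add this code to your HTML head section:
<meta name="robots" content="noindex, nofollow">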
Use the URL removal tool: Google offers a URL removal tool within its Webmaster Tools (now called Google Search Console).
Add authentication: Apache, IIS, and most other web servers offer basic authentication facilities. The visitor must enter a user ID and password before the page can be viewed. This may not stop Google showing the page URL in results, but it will stop unauthorized visitors reading the content.
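A noindex tag can only work if crawlers are allowed to fetch the page that carries it, so it helps to check both conditions together. The following is a minimal sketch using only Python's standard library. The RobotsMetaFinder class and the noindex_is_effective helper are hypothetical names introduced for illustration, and the check only approximates how a real crawler behaves.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def noindex_is_effective(html, robots_txt, url):
    """True if the page carries noindex AND crawlers may fetch it to see the tag."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return "noindex" in finder.directives and parser.can_fetch("*", url)


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    blocked = "User-agent: *\nDisallow: /mypage.html"
    url = "http://www.abc.com/mypage.html"
    print("with Disallow rule:", noindex_is_effective(page, blocked, url))
    print("without Disallow rule:", noindex_is_effective(page, "", url))
```

The helper returns False for the first call because the Disallow rule would keep a crawler from ever reading the tag.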
By: Bob Smith