
subject: The Web And Its Secrets


The World Wide Web contains some 800 million pages of information, but Internet search engines can reach only about half of them, according to a new study by a team of computer scientists.

Search engines are indexing a shrinking share of the Web, according to the researchers, who work at an institute owned by a computer and communications firm.

A similar study completed at the end of 1997 found that the top six search engines collectively covered about 60 percent of the Web, and that the best single engine reached about a third of all sites.

A report last February in a well-known journal found that 11 top search engines together located only 42 percent of all sites, and that no single program covered more than about 16 percent of the Web.

The Web promises to equalize access to information, but high-quality sites carrying valuable material are often left on the sidelines, because search engines tend to index the sites that have more links pointing to them.

The earlier study estimated that around 320 million pages were online, but 14 months later the count had more than doubled, a sign of how much more ground Internet information and content cover now.

The Web holds about 6 trillion bytes of information, compared with the roughly 20 trillion bytes held by the Library of Congress.

Based on a random surfing exercise covering 2,500 Web sites, the researchers estimated that about 3 million servers are publicly available, averaging roughly 289 pages per server.
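
As a rough check on the scale these sample figures imply, here is a minimal back-of-the-envelope sketch, assuming the 3 million servers and 289 pages per server quoted above; both numbers are approximations, so the product only roughly matches the 800-million-page estimate.

```python
# Back-of-the-envelope extrapolation from the sampled figures quoted above.
# Both inputs are approximations reported by the study, not exact counts.
servers = 3_000_000          # publicly available Web servers (approximate)
pages_per_server = 289       # average pages per server from the sample
estimated_pages = servers * pages_per_server
print(f"Estimated pages on the Web: ~{estimated_pages:,}")
# Prints roughly 867,000,000 -- in the same ballpark as the study's
# figure of some 800 million pages.
```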

Because a handful of sites may hold millions of pages each, the researchers said even more information may be available on the Net.

Their test of servers found that about 83 percent contained commercial content, such as company Web pages and catalogues, while just 6 percent carried scientific or educational material, just under 3 percent health information, just over 2 percent personal Web pages, and 2 percent pornographic content.

It is not sheer volume that makes so much of the Web hard to find, but the techniques the search engines use.

Search providers find pages in two main ways: through user registration and by following links to discover new pages.

The researchers write that because search engines follow links to find new pages, they build a biased sample of the Web, discovering and indexing the pages that already have many links pointing to them.
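
To illustrate how link-following produces that bias, here is a minimal sketch of a breadth-first, link-following crawl; it is not the researchers' or any engine's actual code, and the seed list and page limit are hypothetical. Pages that many already-indexed pages link to are discovered early, while pages nobody links to are never found at all.

```python
# Minimal illustration of link-following page discovery (assumed design,
# not a real search engine's crawler).
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects href targets from anchor tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=50):
    """Breadth-first crawl starting from the seed URLs.

    Pages reachable through many links are reached sooner; pages with no
    inbound links from the crawled region never enter the index.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    indexed = []
    while queue and len(indexed) < max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="replace")
        except Exception:
            continue  # unreachable pages simply drop out of the index
        indexed.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return indexed


if __name__ == "__main__":
    # Hypothetical seed; real engines also accept user-registered URLs.
    print(crawl(["https://example.com/"]))
```

Real crawlers of the period were far more elaborate, but the discovery order shown here matches the bias the researchers describe: the more links point at a page, the sooner and more reliably it enters the index.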

Taking this into consideration, it is not that the engines are unable to index more of the Web; it is that they choose to spend their resources on benefiting users in other ways, such as offering valuable free email.

A search engine expert noted that because most information requests are simple, most people never see what they are missing.

This imbalance in cataloguing is expected to continue for a few more years, because humans produce information content for new sites much more slowly than computer resources increase.

by: John Chambers



