File searching


ggmn
April 5th, 2006, 07:45 PM
Not posted in this topic before and it looks a bit quieter than others..... but if anyone knows and has the time to answer:

Why is it that a dialup connection can request a single web page from the other side of the world, out of the millions of sites there are, and have it downloaded in seconds, yet a simple file name search under Windows, even on a lightning fast PC, can take what seems like forever?

This question arose from trying to build a search facility into a simple program, but Windows' own find feature isn't exactly fast, and when turning to the web for answers, I thought, wait a minute............

yiannakop
April 6th, 2006, 01:33 AM
Well, search engines on the net are a whole academic field, investigating fast and accurate algorithms for parsing, indexing and searching web pages. Evidently such indexing schemes have not been implemented in Windows file searching. Here (http://www-db.stanford.edu/~backrub/google.html) is one of the first papers (maybe the first, I am not sure) by the people who created Google. It is quite old, but gives a general idea of how Google works.
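
To make the indexing idea concrete, here is a minimal sketch in Python of a file name index: the disk is walked once, and every later search is a dictionary lookup instead of a new scan. The function names `build_name_index` and `find_by_name` are made up for this illustration, and this is not how Windows or Google actually implement search.

```python
import os
from collections import defaultdict


def build_name_index(root):
    """Walk the tree once and map each lower-cased file name to its full paths."""
    index = defaultdict(list)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            index[name.lower()].append(os.path.join(folder, name))
    return index


def find_by_name(index, name):
    """Look a file name up in the prebuilt index; no disk scan is needed."""
    return index.get(name.lower(), [])
```

The slow part (the walk) happens once in `build_name_index`; the catch is that the index goes stale when files are added or removed, so a real program would have to rebuild or update it.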

:wave:

Ejaz
April 6th, 2006, 03:32 AM
Well, I don't think this comparison is justified. When we request a document on the internet, we provide not only the document name (say, index.html) but the complete URL, which identifies the machine (host name or IP) and often the folders/sub-folders, and finally the file that is required. So our criteria are already narrowed down. The web server only has to look in its shared folders and see whether the requested resource exists; if it does, there you go.
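
As a small illustration of how much a URL already pins down, this Python snippet splits one into the machine part and the folder/file part. The URL is a made-up example, not one from this thread.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only to show the pieces a request already carries.
parts = urlsplit("http://www.example.com/docs/manual/index.html")

host = parts.netloc  # which machine to ask
path = parts.path    # which folder and file on that machine

print(host, path)
```

Nothing has to be searched for here: the host says where to go, and the path says exactly what to fetch.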

Now, in the case of a local search, if you search for a file across the entire computer, then even with indexing it may not be that fast. But if you provide the leaf node (the folder that is supposed to contain the target file), you are doing the equivalent of what we do on the internet, and the local search will be much faster. :)
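
A rough Python sketch of the two cases, with hypothetical helper names `exists_at` and `search_tree`: the first is what you can do when the full path is known, the second is what a whole-disk name search has to do.

```python
import os


def exists_at(path):
    """Full path known: a single file system lookup, like requesting a URL."""
    return os.path.isfile(path)


def search_tree(root, name):
    """Only the name known: every folder under root has to be visited."""
    matches = []
    for folder, _subdirs, files in os.walk(root):
        if name in files:
            matches.append(os.path.join(folder, name))
    return matches
```

Passing a narrower `root` to `search_tree` (the folder you expect the file to be in) cuts the work down in the same way.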

yiannakop
April 6th, 2006, 04:42 AM
Sorry, I didn't understand the initial question correctly. As Ejaz says, ggmn is not asking about internet searching (like Google), but about downloading a page when the URL is given. In that case the difference is obvious, as Ejaz explains. But as I said in my previous message, even in the case of internet searching, the algorithms are much faster than local Windows file searching, even though the former runs over millions of web pages.
:wave: