Search Engine Working Process
1. Web Crawling
2. Caching
3. Indexing
4. Searching
1. Web Crawling- A search engine works by storing information about web pages. Crawlers (also called spiders or bots) visit pages and save a copy of each one so that the search engine can answer queries quickly. Site owners can use a robots.txt file to ask crawlers not to crawl certain pages. Different search engines use different crawling processes.
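As a rough illustration of how a crawler might honor robots.txt, the sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the example.com URLs, the "OtherBot" user agent, and the helper names build_parser and is_allowed are all assumptions made for this example. A real crawler would download the file from the site instead of using a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it
# from the site's /robots.txt URL before fetching any page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser, user_agent, url):
    """Return True if the given crawler may fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "Googlebot", "http://example.com/drafts/a.html"))
print(is_allowed(parser, "OtherBot", "http://example.com/private/x.html"))
print(is_allowed(parser, "OtherBot", "http://example.com/index.html"))
```

Note that robots.txt is only a request: well-behaved crawlers check it before fetching, but nothing in the file itself enforces the rules.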
Search Engine    Crawler
Google           Googlebot
Yahoo            Yahoo! Slurp
MSN              MSNBot
AltaVista        Scooter
Ask              Teoma
2. Caching- A web cache acts as an intermediary between the origin (main) web server and the client. It watches requests as they pass through and saves copies of the responses, such as images and HTML pages. If another request arrives for the same URL, the cache can answer with its saved copy instead of asking the origin server again.
Types of Cache:-
1. Browser Cache
2. Proxy Cache
3. Gateway Cache
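The caching behavior described above can be sketched as a small in-memory store keyed by URL. This is a minimal sketch, not how any particular browser, proxy, or gateway cache is implemented: the SimpleWebCache class, the fake_origin function, and the example URL are hypothetical, and real caches also handle expiry and validation, which are left out here.

```python
class SimpleWebCache:
    """Toy cache that sits between a client and an origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> response body
        self._store = {}                 # url -> saved response body
        self.hits = 0
        self.misses = 0

    def get(self, url):
        # Serve the saved copy if this URL was requested before.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Otherwise ask the origin server and keep a copy of the response.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


origin_calls = []


def fake_origin(url):
    """Stand-in for the main server; records every request it receives."""
    origin_calls.append(url)
    return "<html>content of " + url + "</html>"


cache = SimpleWebCache(fake_origin)
first = cache.get("http://example.com/index.html")   # goes to the origin
second = cache.get("http://example.com/index.html")  # served from the cache
print(len(origin_calls), cache.hits, cache.misses)
```

The second request never reaches the origin, which is the saving a web cache is meant to provide.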
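3. Indexing- Web indexing is the process by which a search engine stores data so that it can retrieve information quickly and accurately. For example, a query against an index of 1,000 documents can be answered in milliseconds, whereas sequentially scanning every word of every document takes far longer, and the gap grows with the size of the collection.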
Three Distinct Parts of Web Indexing-
1. The web crawler finds and fetches web documents.
2. The indexer records every word in every document and stores the resulting index of words in a large database.
3. The query processor compares your search query to the index and recommends the documents it considers most relevant.
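The indexer and query processor steps above can be illustrated with a tiny inverted index, which maps each word to the set of documents containing it. This is a simplified sketch under stated assumptions: the DOCUMENTS dictionary stands in for pages already fetched by a crawler, and the function names tokenize, build_index, and query are invented for this example. Real indexes also store positions, frequencies, and ranking signals.

```python
import re
from collections import defaultdict

# Hypothetical documents, standing in for pages a crawler has fetched.
DOCUMENTS = {
    1: "Search engines crawl web pages",
    2: "An index makes search fast",
    3: "Crawlers fetch pages for the index",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexer: map every word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def query(index, text):
    """Query processor: return ids of documents containing all query words."""
    words = tokenize(text)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(DOCUMENTS)
print(sorted(query(index, "pages index")))
```

Each lookup touches only the entries for the query words, so the documents themselves never have to be rescanned at query time.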
4. Searching- A web search is a query that a user enters into a search engine to find the most relevant information.
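To make "most relevant" concrete, the sketch below ranks pages by how often the query words occur in them. This is only one simple notion of relevance, chosen for illustration. The PAGES data and the search function are hypothetical, and production search engines combine many more signals and consult an index instead of scanning every page.

```python
import re
from collections import Counter

# Hypothetical page texts keyed by page name.
PAGES = {
    "a.html": "python tutorial for beginners python basics",
    "b.html": "cooking tutorial for beginners",
    "c.html": "advanced python tips",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(pages, user_query, limit=10):
    """Return page names ordered by how often they contain the query words."""
    terms = tokenize(user_query)
    scored = []
    for name, text in pages.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by page name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


results = search(PAGES, "python tutorial")
print(results)
```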