Thursday, December 30, 2004

Dear All,

On the last day of the year, here is the fifth quiz in our continuing endeavour to create world class literature on information and communication technologies (ICTs) in our mainstream CYBER QUIZ series. Enjoy it.

Check this Forum on New Year's Day - January 1, 2005, and find a specially designed quiz for the NEW YEAR - a tribute to the year which will become history in some 15 hours from now!

Meanwhile, here is wishing you all

A VERY HAPPY NEW YEAR.

And happy quizzing!

Dr D.C.Misra
December 31, 2004.
___________________________________________________________________
CYBER QUIZ –5: Have Data? Will Search by Dr D.C.Misra
___________________________________________________________________________

An engine is a clever device, a result of ingenuity. So is a search engine, a software program designed to search the vast storehouse of information that is the World Wide Web. Search engines, far more than portals, are the true traffic hubs of the Web. “Have data? Will search” appears to have been a slogan of the year 2004. There now appears to be a search engine for everything, everywhere. Let us check.
______________________________________________________________________

1. What is a search engine?
2. How does a search engine differ from (a) a directory, (b) a surf engine, and (c) a metacrawler?
3. (a) Which is the largest search engine and how much of the World Wide Web has been indexed by it, and (b) How many major search engines are there for searching the World Wide Web, and (c) Which was the first widely used search engine for searching the World Wide Web?
4. Search engine expert Chris Sherman calls it a “generonym,” a brand name used as a generic name for searching. Which search engine has, thus, become the default search engine of the Web surfers?
5. (a) Who founded the search engine Google, how many queries does it receive per day and after what is it named, and (b) What has BackRub got to do with search engine Google?
6. What are (a) IBM’s Clever Project, and (b) Microsoft’s Stuff I’ve Seen (SIS) Project?
7. Call it the Internet Gutenberg Revolution of the early 21st century. This revolutionary project, to be completed in six years, proposes to make the world’s entire collection of books freely accessible online. Who has launched the project?
8. If it was the most hyped technology craze of the year 2000, what is InfraSearch, also known as gonesilent.com (http://.gonesilent.com)?
9. Who won the gold, the silver, and the bronze medals at the Search Site Olympics 2002?
10. What is common between the following: (a) Highway 61, (b) Bigfoot, (c) Dogpile, (d) Colossus, and (e) Internet Sleuth?
11. What is common between the following: (a) Hot Bot, (b) Lycos, (c) Alta Vista, (d) Excite, and (e) Northern Light?
12. A number of search engines now exist for specialized searches. What do the following engines search: (a) Blinkx, (b) CiteSeer.IST, (c) EESE, (d) Technorati, and (e) Google Scholar?
13. Which are (a) five best Indian search engines, and (b) (i) most accurate, and (ii) most usable Indian search engines?
14. What is common between (a) Kenjin, (b) Web Check, (c) First Direct, (d) The Brain, and (e) Mohomine?
15. (a) What is the percentage of search result pages out of all page views, and (b) How much traffic to websites is generated by the search engines?
16. (a) What is the percentage of all Internet sessions that start with a visit to a search engine, (b) How many searches are made worldwide each day, (c) How many searches are made each day from UK computers, and (d) What is the percentage of searches that goes no further than the first page of results?
17. (a) What is common between Overture, Espotting, FindWhat, IQSeek.com and SPRINKS, and (b) What is buzz index, who invented it, and what are buzz movers and buzz leaders?
18. This is a special kind of google, which searches not text, but three-dimensional (3-D) shapes, and that too in industrial databases, and its prototype has been developed by an Indian. Name him.
19. The search market has become highly competitive as it holds the key to Internet. Giants like Google, Yahoo, Microsoft, and Amazon are in the fray. If so, what is A9, when was it launched and who launched it?
20. With desktop search suddenly becoming hot in the year 2004, when did the following release their desktop search tools: (a) Google, (b) Copernic, (c) Yahoo!, (d) Microsoft, and (e) Ask Jeeves?
______________________________________________________________________
ANSWERS TO CYBERQUIZ–5: Have Data? Will Search by Dr D.C.Misra
______________________________________________________________________

1. A search engine is a program that creates its listings automatically by crawling the World Wide Web. It has three major elements. First, the spider, which reads the pages on the Web. Second, the index, which is a collection of the pages found by the spider. Third, the software that sifts through the millions of pages recorded in the index to match a search and rank the matches in order of relevance. Also sometimes called a spider, a crawler, a worm, or a knowbot (knowledge robot), it searches the World Wide Web by looking for titles of documents, uniform resource locators (URLs), headers, or text.
Search engines are of two types: 1. General, and 2. Specialized. General search engines cover a wide variety of subjects (for example, http://www.google.com/), while specialized search engines cover particular subjects or topics, for example, news search engines (say, http://news.altavista.com/) and speciality search engines such as computer search engines (say, http://download.cnet.com/) and medical search engines (say, http://www.hon.ch/MedHunt/), etc.
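To make the three elements in answer 1 concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the page texts and example.org URLs are hypothetical and stand in for what a spider would fetch, and the ranking rule (counting how often the query words occur) is a deliberately crude stand-in for what real engines do.

```python
from collections import defaultdict

# Hypothetical miniature "Web": URL -> page text, standing in for
# the pages a spider would have read.
PAGES = {
    "http://example.org/a": "search engines crawl the web",
    "http://example.org/b": "a directory is compiled by human editors",
    "http://example.org/c": "web search engines rank web pages",
}

def build_index(pages):
    """Index step: map each word to the pages containing it, with a count."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Ranking step: score pages by how often the query words occur in them."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties broken by URL so the order is stable.
    return sorted(scores, key=lambda u: (-scores[u], u))

INDEX = build_index(PAGES)
print(search(INDEX, "web search"))
# prints ['http://example.org/c', 'http://example.org/a']
```

Page c comes first because it mentions "web" twice; page b never matches and so does not appear at all.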
2. (a) A directory, unlike a search engine, is based on listings prepared by human beings. A short description of a Web site is submitted to the directory, on the basis of which listings are prepared. A search then finds matches only in the descriptions submitted and not in the entire website. Examples of directories include Yahoo! and Lycos, which started as small university projects (Yahoo! at Stanford University and Lycos at Carnegie Mellon University), (b) A surf engine provides constantly updated information about the sites visited, their ownership, popularity, ratings and related sites. The term was invented by Jaquith. An example is Alexa (short for the Library of Alexandria, which attempted to collect all human knowledge in one place), founded in 1996 by Brewster Kahle, the inventor of Wide Area Information Server (WAIS, pronounced ways), a precursor to the Web (for a free download of Alexa, visit its Web site http://www.alexa.com/). It is based on uniform resource locators (URLs) and not on keywords, as is the case with search engines, and (c) A metacrawler, also called a meta search engine, is a search engine of search engines, that is, it searches other search engines and directories. Examples include All4one (four search engines) (www.all4one.com), Beaucoup (10 search engines) (www.beaucoup.com), MetaCrawler (www.metacrawler.com), Mamma (http://www.Mamma.com), Dogpile (http://www.Dogpile.com), Web Ferret (http://www.ferretsoft.com/), and Search (800 search engines) (http://www.search.com/).
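The "search engine of search engines" idea in answer 2(c) can be sketched in a few lines of Python. This is an assumption-laden illustration, not how any of the metacrawlers named above actually work: engine_one and engine_two are hypothetical stubs that return fixed result lists, and the merging rule (points of 1, 1/2, 1/3, ... by rank position) is simply one plausible choice.

```python
def engine_one(query):
    # Hypothetical stub standing in for one engine's ranked result list.
    return ["http://example.org/x", "http://example.org/y"]

def engine_two(query):
    # Hypothetical stub standing in for a second engine.
    return ["http://example.org/y", "http://example.org/z"]

def meta_search(query, engines):
    """Send the query to every engine and merge the ranked results."""
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query)):
            # A higher position in an engine's list earns more points.
            scores[url] = scores.get(url, 0.0) + 1.0 / (rank + 1)
    # Best combined score first; ties broken by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))

print(meta_search("anything", [engine_one, engine_two]))
# prints ['http://example.org/y', 'http://example.org/x', 'http://example.org/z']
```

The page returned by both engines rises to the top, which is the main benefit a metacrawler offers over any single engine.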
3. (a) Inktomi (purchased by Yahoo! in December 2002). It has indexed only about half of the Web (Source: Michael Specter, The New Yorker), (b) Only about two dozen, and (c) WebCrawler. This program became the first widely used search engine in 1994.
4. Google (http://www.google.com/). It was awarded the best brand name on the Internet. It went online on September 15, 1997. Google, Inc., founded in 1998, went public in August 2004. Google uses an advanced search technology, namely PageRank™ technology and hypertext-matching analysis, developed by its founders. The importance of Web pages is calculated by solving an equation of 500 million variables and more than 2 billion terms. All this is done in under half a second!
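Answer 4 says only that page importance comes from solving a very large equation. As a rough illustration, the Python sketch below uses the publicly described PageRank iteration on a three-page toy web. The damping factor of 0.85, the iteration count and the LINKS data are assumptions made for the example, and the sketch assumes every linked page also appears as a key in the dictionary; it is not Google's production method.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: page -> list of pages it links to. Returns page -> score."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page shares its score equally among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical toy web: A links to B and C, B links to C, C links to A.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
RANKS = pagerank(LINKS)
print(sorted(RANKS, key=RANKS.get, reverse=True))
# prints ['C', 'A', 'B']
```

C scores highest because both other pages link to it; A follows because it receives all of C's score.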
5. (a) Lawrence Page, 29 (son of a computer science professor) and Sergey Brin, 28 (a native of Moscow), Stanford University graduate students. Their company, Google (more than 1,000 employees, with more than 50 Ph.Ds), is based in Mountain View, California. The Google search engine receives more than 200 million queries each day. More than half of the search requests come from outside the United States. It searches more than 8 billion web pages (8,058,044,651 web pages, to be exact and up to date, as on December 25, 2004; http://www.google.com/). It is named after googol, which is a number: 10 raised to the power of 100, or the numeral one followed by a hundred zeros, and (b) A precursor to the search engine Google. By January 1996 Larry Page and Sergey Brin had begun collaboration on a search engine called BackRub. It was so named for its unique ability to analyze the “back links” pointing to a given website (Source: http://www.google.com/corporate/history.html).
6. (a) It is a search engine which is being used only at the IBM Almaden Research Center in San Jose, California (http://www.almaden.ibm.com/cs/k53/clever.html). It is an attempt to fine-tune searching on the Web by identifying ‘hub pages’. The ‘hub pages’ are identified by rating the links. The approach thus does not look only at keywords. The Clever project is a part of the Computer Science Principles and Methodologies Department at the IBM Almaden Research Center, and (b) Stuff I’ve Seen (SIS) is a “prototype tool that makes it easy for you to find information you've seen before, whether it came as email, attachments, files, web pages, appointments, tablet journal entries, etc.” (http://research.microsoft.com/adapt/sis/index.htm). Stuff I've Seen is developed by the Adaptive Systems and Interaction Team at Microsoft Research.
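The idea in answer 6(a) of finding ‘hub pages’ by rating links can be illustrated with the well-known hubs-and-authorities iteration. The Python sketch below is a simplified illustration on hypothetical data (the page names in LINKS are invented), not IBM's actual Clever code: a good hub links to good authorities, and a good authority is linked to by good hubs.

```python
import math

def hubs_and_authorities(links, iterations=20):
    """links: page -> list of pages it links to. Returns (hub, authority) scores."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub score: sum of the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both score sets so the numbers stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

# Hypothetical link data: a portal pointing at three sites, a blog at one.
LINKS = {"portal": ["siteA", "siteB", "siteC"], "blog": ["siteA"], "siteA": []}
HUB, AUTH = hubs_and_authorities(LINKS)
print(max(HUB, key=HUB.get), max(AUTH, key=AUTH.get))
# prints portal siteA
```

No keywords are consulted at all here, which matches the point made in the answer that the approach does not look only at keywords.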
7. Google (http://www.google.co.in/intl/en/press/pressrel/print_library.html). The company announced on December 14, 2004 that it had reached agreements with five of the most celebrated libraries in the world to digitise more than 15 million books and make them freely accessible on the Internet. Costing $10 per book, the project involves Oxford (Bodleian – up to 1.5 million out of 8 million books), Stanford (8 million books), the University of Michigan (7 million books), Harvard (40,000 out of 15 million books) and the New York Public Library (fragile works). (Source: Reid, Tim and Amy Hunter, Washington (2004): World's leading libraries agree to put books online, Times Online, December 15, http://entertainment.timesonline.co.uk/article/0,,2-1403621,00.html).
8. It is a search engine with super powers, based on Gnutella peer-to-peer (P2P) computing. Traditional search engines search a central index of Web content, while InfraSearch searches all the computers in the network, giving the latest information. InfraSearch was designed by Gene Kan, a 23-year-old programmer, and his friends. It was acquired by Sun Microsystems in February 2001 to become part of Sun's JXTA (Juxtapose) project. It has roots in the University of California, Berkeley's Experimental Computing Facility.
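The contrast in answer 8 between a central index and a search that asks every computer in the network can be sketched as follows. This is a toy Python model of Gnutella-style query flooding, not InfraSearch's own protocol: the PEERS table, the peer names and the hop limit (ttl) are all hypothetical.

```python
from collections import deque

# Hypothetical peers: name -> (neighbouring peers, documents held locally).
PEERS = {
    "p1": (["p2", "p3"], ["tide tables"]),
    "p2": (["p1", "p4"], ["stock quotes today"]),
    "p3": (["p1"], ["weather today"]),
    "p4": (["p2"], ["news today"]),
}

def flood_search(start, keyword, ttl=2):
    """Each peer answers from its own data and forwards the query
    to its neighbours until the hop limit (ttl) runs out."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits = []
    while queue:
        peer, hops_left = queue.popleft()
        neighbours, documents = PEERS[peer]
        hits.extend((peer, doc) for doc in documents if keyword in doc)
        if hops_left > 0:
            for neighbour in neighbours:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, hops_left - 1))
    return hits

print(flood_search("p1", "today", ttl=1))
# prints [('p2', 'stock quotes today'), ('p3', 'weather today')]
```

Because every peer answers from what it holds at that moment, the results are as fresh as the peers themselves; raising ttl to 2 in this toy network also reaches p4.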
9. Google got the gold medal (with an amazing 65.95 out of 72 points), Lycos edged out MSN (49.57 points as against MSN’s 49.08) to obtain the silver medal, and MSN obtained the bronze medal. There were five finalists for the Search Site Olympics 2002: 1. Alta Vista (46.40), 2. Excite (disqualified), 3. Google (65.95), 4. Lycos (49.57), and 5. MSN Search (49.08) (Figures in parentheses indicate the scores obtained out of 72 points). The Search Site Olympics were organized by Cnet (http://www.cnet.com/software). The search engines did not participate in the Olympics as such. Rather, they were evaluated against set criteria by a panel of judges by virtue of their existence on the World Wide Web as search engines.
10. (a) Highway 61 (http://www.highway61.com), (b) Bigfoot (http://www.bigfoot.com), (c) Dogpile (http://www.dogpile.com), (d) Colossus (http://www.searchenginecolossus.com), and (e) Internet Sleuth (http://www.isleuth.com) are all meta search engines.
11. (a) Hot Bot (http://www.hotbot.com), (b) Lycos (http://www.lycos.com), (c) Alta Vista (http://www.altavista.com), (d) Excite (http://www.excite.com), and (e) Northern Light (http://www.nlsearch.com) are all general search engines.
12. (a) An integrated search tool (http://www2.blinkx.com/overview.php), (b) An academic search engine and digital library hosted by Pennsylvania State University (Penn State)'s School of Information Sciences and Technology (http://citeseer.ist.psu.edu/citeseer.html), (c) An engineering electronic journal search engine from EEVL, "the Internet Guide to Engineering, Mathematics and Computing," based at Heriot Watt University in Edinburgh, United Kingdom (http://www.eevl.ac.uk/about.htm), (d) A real-time search engine for blogs (http://www.technorati.com/about.), and (e) Google’s academic search engine, launched on November 18, 2004 (beta version) (http://scholar.google.com/scholar/about.html#about).
13. (a) 1. 123 India (http://www.123india.com), 2. Digital HT (http://www.digitalht.com), 3. India Times (http://www.indiatimes.com), 4. Khoj (http://khoj.com), and 5. Locate India (http://www.locateindia.com) (Digital HT and India Times belong to the newspapers The Hindustan Times and The Times of India respectively), and (b) (i) Locate India (http://locateindia.com), and (ii) 123 India (http://123india.com) (Source: Computers@Home, January 2000, a New Delhi magazine which has since ceased publication).
14. (a) Kenjin (http://www.kenjin.com), (b) Web Check (http://www.webtop.com), (c) First Direct (http://www.firstdirect.co.uk), (d) The Brain (http://www.thebrain.com), and (e) Mohomine (http://www.mohomine.com) are all new search engines.
15. (a) 3.5 per cent or one in 28 pages (Source: Alexa Insider, June 1, 1999), and (b) 7 per cent (Source: SatMarket, December 19, 2000).
16. (a) 86, (b) Upwards of 400 million, (c) Upwards of 20 million, and (d) 48. (Source: Alasdair Reid, Campaign © Brand Republic / The Economic Times, New Delhi, April 23, 2003, Wednesday, Brand Equity, p-3).
17. (a) They are the top five paid-for (pay per click) search engines (SEs). For details, visit the website http://www.thewebseye.com/pay-per-click.htm, and (b) It is a daily index of the popularity of a subject as revealed by the search queries made for it. Invented by Yahoo.com (Yahoo, Inc.), which has been selling it to companies since May 2000, a subject’s buzz score is the percentage of users searching for that subject on a given day multiplied by a constant to make the number easier to read. Each point is equal to 0.001 per cent of users searching on Yahoo! on a given day. For example, a buzz score of 500 for “Pokemon” translates to 0.5 per cent of all users searching on Yahoo!
Buzz movers are the subjects with the greatest percentage increase in buzz score from one day to the next. The subjects with the greatest buzz score (most searched subjects) on a given day are called buzz leaders. The index is published Tuesday to Saturday and has a time lag of two days (needed for data processing and result verification). The Web site also maintains an archive, which goes back to September 2000. For details, visit the Web site http://buzz.yahoo.com/.
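The arithmetic in answer 17(b) can be checked with a short Python sketch. The constant of 1,000 follows from the statement that one buzz point equals 0.001 per cent of users; the function names and the subjects "alpha", "beta" and "gamma" are hypothetical, and how Yahoo! actually counts users is not known from this text.

```python
# One buzz point = 0.001 per cent of users, so 1 per cent = 1,000 points.
POINTS_PER_PERCENT = 1000

def buzz_score(users_searching_subject, total_users):
    """Percentage of users searching for a subject, scaled for readability."""
    percentage = 100.0 * users_searching_subject / total_users
    return percentage * POINTS_PER_PERCENT

def buzz_leaders(scores_today, top=3):
    """Subjects with the greatest buzz score on a given day."""
    return sorted(scores_today, key=lambda s: (-scores_today[s], s))[:top]

def buzz_movers(scores_yesterday, scores_today, top=3):
    """Subjects with the greatest percentage increase since the day before."""
    change = {}
    for subject, today in scores_today.items():
        yesterday = scores_yesterday.get(subject)
        if yesterday:  # skip subjects with no (or zero) score the day before
            change[subject] = 100.0 * (today - yesterday) / yesterday
    return sorted(change, key=lambda s: (-change[s], s))[:top]

# 5 users in every 1,000 is 0.5 per cent, i.e. a buzz score of 500.
print(buzz_score(5, 1000))  # prints 500.0

YESTERDAY = {"alpha": 100, "beta": 400}            # hypothetical scores
TODAY = {"alpha": 200, "beta": 500, "gamma": 50}   # hypothetical scores
print(buzz_leaders(TODAY))            # prints ['beta', 'alpha', 'gamma']
print(buzz_movers(YESTERDAY, TODAY))  # prints ['alpha', 'beta']
```

Note that the leader ("beta") and the top mover ("alpha") differ: a subject can double its score overnight and still trail the most searched subject.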
18. Karthik Ramani, a professor of mechanical engineering and director of the Purdue Research and Education Center for Information Systems in Engineering, or PRECISE. He is a 1985 product of the Indian Institute of Technology (IIT), Madras. The method has been detailed in a research paper written by Ramani, doctoral student Kuiyang Lou and Sunil Prabhakar, an assistant professor of computer science. In this method a 3-D model of a part is converted into a bunch of cubes called voxels, or volume elements, which are further converted into a “skeletal graph” based on “feature vectors.” (Source: http://www.innovations-report.com/html/reports/information_technology/report-27629.html and Hindustan Times, New Delhi, April 15, 2004, p-21, quoting Press Trust of India, London).
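The first step described in answer 18, turning a 3-D shape into voxels, can be pictured with a small Python sketch. It covers only that step, on hypothetical sample points with an assumed voxel size; the skeletal graph and feature vectors of the Purdue method are not attempted here.

```python
import math

def voxelize(points, voxel_size):
    """Map 3-D points to the set of integer grid cells (voxels) they fall in."""
    return {
        tuple(math.floor(coord / voxel_size) for coord in point)
        for point in points
    }

# Hypothetical points sampled from a part; the first two share a voxel.
POINTS = [(0.1, 0.2, 0.3), (0.4, 0.4, 0.4), (1.2, 0.1, 0.0), (-0.5, 0.0, 2.9)]
VOXELS = voxelize(POINTS, voxel_size=1.0)
print(sorted(VOXELS))
# prints [(-1, 0, 2), (0, 0, 0), (1, 0, 0)]
```

Two shapes can then be compared through the cells they occupy rather than through any text description, which is what sets this kind of search apart from a keyword search.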
19. A9 is a search engine, which was launched on April 14, 2004 (beta version) by A9.com, Inc., a separately branded and operated subsidiary of Amazon.com, Inc., opened in October 2003 at Palo Alto, CA. A9 is built on technology licensed from Google but, in addition to web search results, it gives book results from Amazon.com, including Search Inside the Book™, site info and a diary. For searching, however, an Amazon.com account is required. For details, visit the website http://www.a9.com/-/company/whatsCool.jsp. See also Gaither, Chris, Los Angeles (LAT-WP) (2004): Amazon.com enters online search market through the back door, The Indian Express, New Delhi, April 15, Friday, p-14.
20. (a) October 14, 2004. Check Google Desktop Search Beta at http://desktop.google.com/about.html, (b) October 18, 2004. Check Copernic Desktop Search (CDS) at http://www.copernic.com/en/products/desktop-search/, (c) December 11, 2004. Yahoo! has licensed the X1 search software for Windows from tech incubator Idealab (http://www.theregister.co.uk/2004/12/11/yahoo_licenses_x1_search/). The search tool is likely to be made available in early 2005, (d) December 13, 2004. Check it at MSN Toolbar Suite Beta (For Windows XP/2000 only) (http://toolbar.msn.com/desktop/results.aspx?FORM=PCHP&q=), and (e) December 16, 2004. Check Ask Jeeves Desktop Search (http://sp.ask.com/docs/desktop/).
______________________________________________________________________
*Independent eGov and IT Consultant based in New Delhi, India. Dr Misra moderates the Cyber Quiz group at http://in.groups.yahoo.com/group/cyberquiz/ and also maintains a blog on Cyber Quiz at http://cyberquiz.blogspot.com/. Email: dcmisra[at]gmail.com.
_________________________________________________________________
Acknowledgement: The author is grateful to Ms Beth Blakely for editorial advice.
__________________________________________________________________
Cyber Quiz Series: Dr Misra’s three earlier quizzes in the series are also available on Tech Republic for download:

1. Cyber Quiz 1: The Internet (December 3, 2004) at
http://techrepublic.com.com/5138-6249-5464809.html

2. Cyber Quiz 2: The World Wide Web (December 10, 2004) at
http://techrepublic.com.com/i/tr/downloads/support/fun_games/CyberQuiz2.WWW.doc

3. Cyber Quiz 3: Check your E-mail (December 17, 2004) at:
http://techrepublic.com.com/i/tr/downloads/support/fun_games/Cyberquiz3_Email.doc
__________________________________________________________________
Disclaimer: While reasonable care has been taken to compile the quiz, neither the author nor the publisher is responsible for the accuracy, inclusion, exclusion or interpretation of the contents. Readers are advised to consult authoritative sources before acting on the information contained here. The purpose of the quiz is education and the popularisation of information and communication technologies (ICT). No responsibility for the content is assumed.


Use of Content: The use of the content here for educational and non-commercial purposes is encouraged provided due credit is given to the author.
__________________________________________________________________
©Dinesh Chandra Misra 2004 (Beta Version – December 31, 2004)


Dr D.C.Misra
December 31, 2004



