EBB 013b
Yahoo Hacks
While Google hacks (tips, tricks, techniques, and scripts that make Google more
powerful and useful) are plentiful and fairly well documented, the same cannot be
said (yet) for Yahoo Hacks, despite the fact that O'Reilly published a Yahoo Hacks
book in late 2005. Part of the reason for this was the absence of Yahoo APIs, a
problem Yahoo recognized and rectified with its Developer site.
While many of the hacks, mostly employing some form of API, are geared toward
maps, Yahoo launched a webpage devoted exclusively to Yahoo and "mixed" API
applications.
I recommend you pay special attention to the following applications that use Yahoo
APIs, although you may find others even more useful to you:
• Link Harvester uses the Yahoo API so it complies with their TOS [terms of service]. 51
(Screenshot: Link Harvester results listing unique domains and their IP addresses,
including a separate group of unique government domains.)
For each unique domain, Link Harvester provides a row of icon links: Whois Source
data for the domain; Internet Archive data for the domain; Google's cache of the
actual webpage; Google's text-only cache of the actual webpage; Google's text-only
cache of the domain; Whois Source's information about the domain from the Open
Directory; and Yahoo's Directory listing of Whois Source data about the domain.
51
Link Harvester, Linkhounds, <http://www.linkhounds.com/link-harvester/> (14 November 2006).
(Screenshot: results of a search on "java" listing sites with at least two matching
backlinks, each shown with its site name and IP address.)
The following Yahoo Hacks generally mirror certain Google hacks, with the
exception of the originurlextension: syntax, which is unique to Yahoo and very
powerful.
➢ Disabling Word Stemming. Yahoo does not give users the option to turn off
word stemming, which can frustrate users trying to perform precise searches.
To run a precise search, enclose the term in double quotes, e.g., ["drink"] will
not find drinks (except in sponsored results).
➢ Searching by Filetype. Although Yahoo mysteriously disabled its filetype
syntax, you can use originurlextension: to search by file type, but this
workaround is imperfect.
To search for a specific type of file, use the syntax originurlextension: plus
any file extension, such as cgi, log, zip, etc. Because this workaround is not a
true filetype search, you can search on any file extension.
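The workaround above amounts to appending one extra token to the query. Here is a minimal Python sketch of that idea; the function name filetype_query is my own invention for illustration, and the only syntax taken from the text is originurlextension: itself.

```python
def filetype_query(keywords: str, extension: str) -> str:
    """Build a Yahoo query string restricted to urls ending in `extension`.

    Hypothetical helper: only the originurlextension: syntax comes from the text.
    """
    # Accept ".PDF", "pdf", " Pdf " and so on.
    ext = extension.strip().lstrip(".").lower()
    if not ext.isalnum():
        raise ValueError(f"not a plausible file extension: {extension!r}")
    return f"{keywords.strip()} originurlextension:{ext}"


print(filetype_query("budget report", ".PDF"))
```

Because the syntax matches on the extension in the url rather than on the true file type, the helper accepts any alphanumeric extension, not just document formats.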
Searchroller. Searchroller uses JavaScript to let you create a neat little search
query bookmarklet 53 for your future use. The bookmarklet comprises a set of
domains you like to search routinely but don't want to type in each time. For
example, perhaps you'd like to search simultaneously on a whole group of news
sites. Tara Calishain's script lets you input the urls for the news sites once, then
save them to your Favorites or Bookmarks. Each time you click on the bookmarklet,
a screen will appear asking you to enter a query term or terms, then the bookmarklet
will automatically go to Yahoo and run that query against all the urls you have
previously selected. It's a great timesaver when you consider how long a typical
Searchroller bookmarklet query can be.
Searchroller
http://www.researchbuzz.org/2004/10/new_yahoo_hack_search_roller_fo.shtml
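To give a feel for the kind of query such a bookmarklet saves you from typing, here is a rough Python sketch. It is not Tara Calishain's script. It assumes the saved sites are combined as site: restrictions joined by OR and that the query is passed to Yahoo in a p parameter; the function names and the site list are illustrative only.

```python
from urllib.parse import urlencode


def multi_site_query(terms: str, sites: list[str]) -> str:
    """Combine search terms with a saved list of sites (assumed OR-of-site: form)."""
    site_clause = " OR ".join(f"site:{s}" for s in sites)
    return f"{terms} ({site_clause})"


def search_url(query: str, base: str = "http://search.yahoo.com/search") -> str:
    """Encode the query into a search url; `base` and the `p` parameter are assumptions."""
    return f"{base}?{urlencode({'p': query})}"


news_sites = ["bbc.co.uk", "cnn.com", "reuters.com"]  # example list only
query = multi_site_query("iran", news_sites)
print(query)
print(search_url(query))
```

Even with three sites the query is already long, which is the timesaving the bookmarklet offers.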
52
In order to read RSS or XML feeds, you need a reader or aggregator to parse this type of data.
53
A bookmarklet is a tiny JavaScript application contained in a bookmark that can be saved and used
the same way you use normal bookmarks. Bookmarklets do not require users to download and install
software. For more on bookmarklets, visit <http://www.bookmarklets.com/>.
Artificial Proximity Search. Since Yahoo's APIs are so new and as yet not fully
exploited, clever folks like Tara Calishain have come up with ways to force Yahoo to
perform new types of searches. The proximity search lets you input one search term
and look for it from 1 to 5 "spaces" (really, words) from a second search term. For
example, I can search for henry within two words of thoreau and find many instances
of Henry David Thoreau. This tool is very good for finding names with the last name
listed first, e.g., Thoreau, Henry David.
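To make the idea of "within N words" concrete, here is a small Python sketch that applies the same test to a piece of text you already have. It does not reproduce how the Yahoo hack works internally; the function name and the word-splitting rule are my own assumptions.

```python
import re


def within_n_words(text: str, first: str, second: str, max_gap: int) -> bool:
    """Return True if the two terms occur at most `max_gap` word positions apart,
    in either order."""
    words = re.findall(r"[\w']+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(
        0 < abs(i - j) <= max_gap
        for i in first_positions
        for j in second_positions
    )


print(within_n_words("Henry David Thoreau wrote Walden", "henry", "thoreau", 2))
print(within_n_words("Thoreau, Henry David", "henry", "thoreau", 2))
```

Because the check ignores word order, it matches both the normal form of the name and the last-name-first form.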
MSN Search is no more. As of mid-September 2006, Windows Live Search was out
of beta and officially supplanted MSN Search. It came as a surprise to no one that
the new Live Search has the familiar clean, uncluttered look popularized by Google.
Live marks a clear change in Microsoft's overall direction from a multipurpose portal
to a search service: "Live.com is now first and foremost a search destination,"
according to Christopher Payne, Microsoft's corporate vice president. 54
The question on everyone's mind is whether or not Live Search is any better than
MSN Search or Google or Yahoo or any number of other search engines. Thus far,
Live is not noticeably superior to MSN Search, but it is one of the three largest
and most powerful US-based search engines.
➢ offers cached links with the date Microsoft estimates the page was last
updated (usually the date the Microsoft spider last crawled the page);
sometimes a date will appear next to the cached link on the results page if
that page has recently been updated.
➢ has a "Near Me" search option that only works in the US; it uses your IP
address to determine your location; users can override this location by
changing it on the Options page. Note that you cannot leave the default
location empty. If you do not enter a location, Live Search will default to
what it reads as your IP address's geolocation.
➢ offers web, news, image, local, Q&A, academic, feeds, video, products, and
new "build your own" searches.
The "Search Builder" query customization tool has been replaced by the "Advanced"
option; as with "Search Builder," the Advanced option opens a little window beneath
the search form.
54
Chris Sherman, "Microsoft Upgrades Live Search Offerings," SearchEngineWatch, 12 September
2006, <http://searchenginewatch.com/showPage.html?page=3623401> (5 October 2006).
➢ Display: display the site in a specific language (most major languages with
some notable exceptions, e.g., Arabic, Thai).
➢ Group results from the same site: show the first 1, 2, or 3 results.
➢ Location: set a default location; Microsoft detects your physical location from
your IP address, but you may enter a new geographical location in its place.
Remember: you cannot leave the default location empty. If you do not enter a
location, Live Search will default to what it reads as your IP address's
geolocation.
(Screenshot: Live Search results page, including a Related Searches box, annotated
with the labels A through E.)
➢ A Type of results (web, image, etc.): the number of resulting pages and
estimated total number of results.
➢ B The title of the webpage found, an excerpt from the webpage with the
search terms bolded, the url of the webpage.
➢ C Cached page: links to a copy of the page as saved by the Live Search
engine; Live Search shows the last date the page was examined by its spider;
search terms are not highlighted on the cached page. Important Note: the
cached copies of Microsoft file types are safe to view.
➢ D Additional results from the same site; clicking on "Show more results
from ... " will bring up the pages from that site that match the keyword(s).
➢ E Related Searches offer options either for similar terms or terms with
multiple meanings, e.g., "cardinal."
Live Search assumes as its default that multiple search terms are joined by the AND
operator, so that a search on the keywords [windows explorer] will find all the
webpages that contain both search terms.
Live Search will not return any results if there is no webpage containing all the
search terms. Try this query to see what I mean:
Unlike Google, Live Search does not limit the number of search terms to 10
keywords. Live will try to match all the keywords you enter.
Live Search does not offer any word stemming or truncation, i.e., searching for
variations of search terms. A search for [child] will not find [children].
Live Search automatically clusters search results. If you want to see more pages
from a specific site, simply select the link following the url of the result.
Live permits the use of nested boolean queries (those using parentheses) in simple
search. The operators must be capitalized.
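As an illustration of what a nested boolean query looks like, the following Python sketch assembles one from its parts. The helper names and the sample terms are mine; the only rules taken from the text are that the operators are capitalized and that nesting uses parentheses.

```python
def any_of(*terms: str) -> str:
    """Join terms with a capitalized OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with a capitalized AND inside parentheses."""
    return "(" + " AND ".join(terms) + ")"


# A nested query: either storm word, together with "damage".
query = all_of(any_of("hurricane", "typhoon"), "damage")
print(query)
```

Building the query from small pieces makes it harder to leave a parenthesis unbalanced or to type an operator in lowercase.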
Live Search will ignore stop words, i.e., commonplace words, if the query
contains non-stop words; the query [to be or not to be] will only search for the term
"not." However, you can search on any single letter or number by itself, e.g., [1]. You
can also force Live Search to look for stop words either by enclosing the query in
double quotes ["to be or not to be"] or by placing a plus sign in front of the stop word,
e.g., [+1 number] or [+to +be +or +not].
Otherwise, it is unnecessary to use the plus sign (+) with any terms because by
default Live Search searches for all keywords. However, many times searchers need
to exclude certain terms that are commonly associated with a keyword but irrelevant
to their search. That's where the minus sign (-) comes in. Using the minus sign in
front of a keyword ensures that Live Search excludes that term from the search. For
example, the results for the search ["pearl harbor" -movie] are very different from the
results for ["pearl harbor"]. You may use the boolean operator NOT instead of the
minus sign.
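The quoting, plus sign, and minus sign conventions described above can be combined mechanically. This is a minimal sketch assuming only the syntax given in the text; build_query and its parameter names are hypothetical.

```python
def build_query(phrases=(), forced=(), excluded=()) -> str:
    """Assemble a query from exact phrases, forced terms, and excluded terms."""
    parts = [f'"{p}"' for p in phrases]   # double quotes keep a phrase intact
    parts += [f"+{t}" for t in forced]    # plus sign forces a stop word to be searched
    parts += [f"-{t}" for t in excluded]  # minus sign excludes a term
    return " ".join(parts)


print(build_query(phrases=["pearl harbor"], excluded=["movie"]))
print(build_query(forced=["to", "be", "or", "not"]))
```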
Live Search interprets the ampersand [&] as a space, so these searches are
virtually identical: [at&t], [at & t], ["at t"]. Also, while Live Search will not actually
search on a plus sign, the search engine will search for [c++], although it does not
recognize [c+].
Live Search now offers as many languages in which users may search as Yahoo
and Google. Using either the language preference settings or the advanced search
window, users can select from nearly 40 languages in which to search and see
results. There are three ways to specify a search language:
1. in the Advanced search window, select Language, then pull down and click on
a specific language.
2. type your search terms into the search box, and then add language: followed
immediately by the two-character language code. For example, to search only
for sites in French: [language:fr keyword]
Live Search does not distinguish words using diacritical marks such as accents or
umlauts. Live Search finds terms matching those with and without the diacritic. The
term [façade] finds façade and facade, and vice versa.
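If you need the same diacritic-insensitive matching in your own scripts, one common approach is to strip combining marks before comparing. The sketch below uses Python's standard unicodedata module; it illustrates the behavior described above and makes no claim about how Live Search implements it.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Remove accents, cedillas, umlauts, and other combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_ignoring_diacritics(a: str, b: str) -> bool:
    """Compare two terms without regard to diacritics or letter case."""
    return fold_diacritics(a).casefold() == fold_diacritics(b).casefold()


print(fold_diacritics("façade"))
print(same_ignoring_diacritics("Façade", "facade"))
```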
Live Search offers several special search terms to restrict searches and make them
more effective.
[books -site:amazon.com] finds pages containing the keyword "books" that are
not at any amazon.com website.
[site:ir] finds all the pages from the Iranian (.ir) top-level domain indexed by Live
Search.
➢ url: unlike Google's url query, Live's url query checks to see if the domain or web
address is in the Live Search index. This query is not really intended to be used
with other search terms.
Examples of how to use the url: command:
➢ inurl: restricts results to pages that contain search terms within the url of a site.
Multiple terms can be used, but all must appear in the url (this query is similar to
Google's allinurl: query).
Examples of how to use the inurl: command:
[inurl:microsoft downloads] finds all pages containing both the terms "microsoft"
and "downloads" anywhere in the url.
[inbody:amazon -inurl:amazon] finds all pages containing the term "amazon"
anywhere in the body (text) of a webpage but which do not contain the term
"amazon" in the url of the page.
➢ intitle: restricts results to pages containing search term(s) in the webpage's title.
Can be used with or without other search terms.
Examples of how to use the intitle: command:
[intitle:amazon inbody:brazil] will find pages that contain "amazon" in the title of
the webpage and "brazil" in the body text of the webpage.
➢ contains: restricts results to pages that have links to the specified file type(s). Can
be used with or without other search terms.
Examples of how to use the contains: command:
[music contains:mp3] finds webpages that contain links to MP3 files and have the
keyword "music" in them.
["final report" contains:pdf] finds webpages that contain links to PDF files and
have the phrase "final report" in them.
➢ link: restricts results to pages containing links to a specific url. Can be used with
or without additional keywords.
Advanced Search > Links to returns results for pages that currently link to a
specific url.
[link:jpl.nasa.gov asteroid] finds all pages containing links to any page in the
jpl.nasa.gov domain and the keyword "asteroid" anywhere on the linking
webpage.
➢ linkdomain: restricts results to pages that link to any page within the specified
domain. This is a broader search than the link: query. You can use this option to
determine how many links there are to a specific page from sites indexed by Live
Search. Can be used with or without additional keywords.
Examples of how to use linkdomain:
➢ linkfromdomain: restricts results to pages that are linked from the specified
domain. This query only works with second-level domains, e.g., [domain.com].
You can use this option to determine how many links there are from a specific
page. Can be used with or without additional keywords.
Examples of how to use linkfromdomain:
[linkfromdomain:nasa.gov] finds all the pages the nasa.gov domain links to, i.e.,
links from nasa.gov to site x.
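All of the commands in this section share the same shape: a keyword, a colon, and a value with no space between them, optionally followed by ordinary search terms. The following Python sketch builds such queries; the function and the KNOWN_OPERATORS set are my own scaffolding, and the set lists only commands named in this guide.

```python
# Commands named in this guide; anything else is rejected as a likely typo.
KNOWN_OPERATORS = {
    "site", "url", "inurl", "inbody", "intitle", "contains", "link",
    "linkdomain", "linkfromdomain", "filetype", "ip", "feed", "hasfeed",
    "inanchor", "language",
}


def operator_query(keywords: str = "", **operators: str) -> str:
    """Build a query such as 'link:jpl.nasa.gov asteroid' from keyword arguments."""
    unknown = set(operators) - KNOWN_OPERATORS
    if unknown:
        raise ValueError(f"operators not described in the text: {sorted(unknown)}")
    # No space is allowed between the command, the colon, and its value.
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if keywords:
        parts.append(keywords)
    return " ".join(parts)


print(operator_query("asteroid", link="jpl.nasa.gov"))
print(operator_query(linkfromdomain="nasa.gov"))
print(operator_query(intitle="amazon", inbody="brazil"))
```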
➢ Results ranking: allows users to emphasize different factors to get a different
set of results for the same search.
1. Type your search terms into the search text box, and then click Advanced
Search.
2. Select Results ranking, and then move the equalizer slider(s) in the direction
you want.
"You can put emphasis on different factors to get a different set of results for the
same search. The sliders control:
• Very popular: To add emphasis to sites by the number of other sites that
link to them, move the middle slider up.
(Screenshot: the Advanced Search "Results ranking" panel, with sliders labeled
Updated recently/Static, Very popular/Less popular, and Approximate match/Exact
match, and the instruction "Slide the bars to weight these factors.")
➢ filetype: restricts results to a specific filetype. Can be used with or without
additional keywords. The file types Live Search will search for include the major
Microsoft file types and a few others:
Microsoft Excel (xls)
Text (txt)
[filetype:doc domain:nasa.gov] finds all Word files at the NASA domain.
[filetype:xls "financial data"] finds all Excel spreadsheets that contain the phrase
"financial data."
Live Search does offer safe previewing of non-HTML file types, and this is
especially useful for Microsoft file types, such as Word documents and
PowerPoint slides. In order to access the safe HTML versions, users must
select the "Cached page" on the results page:
➢ ip: restricts results to sites hosted at a specific IP address. Can be used with or
without additional keywords.
Examples of how to use the ip: command:
[ip:66.218.77.68 "computer security"] finds all the sites on this specific host
computer containing the phrase "computer security."
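A stray space or a mistyped octet will break an ip: query, so it can help to validate the address before building the query. This is a small sketch using Python's standard ipaddress module; the function name ip_query is hypothetical.

```python
import ipaddress


def ip_query(address: str, keywords: str = "") -> str:
    """Build an ip: query after checking that the address is well formed."""
    # ip_address() raises ValueError for input such as "66.218. 77.68".
    ip = ipaddress.ip_address(address.strip())
    return f"ip:{ip} {keywords}".strip()


print(ip_query("66.218.77.68", '"computer security"'))
```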
➢ feed: limits searches to text within a feed. Feeds are specially formatted brief
descriptions of
content with a link to the full version of that content. RSS (and the competing
Atom) feeds are in XML format. These feeds are usually used for syndicating
web content such as blogs and news. The feed: command only searches the text
of the feed, which is often a very condensed description of the full web content.
Example of how to use the feed: command:
[feed:"trojan horse"]
Each of the results represents an XML feed that includes the phrase "trojan
horse." There is no point in clicking on the link in a browser because that brings
up the XML page that most browsers are not designed to parse. The cached
copy shows the search terms as they appeared in the feed.
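To see why a feed search only reaches the condensed text, it helps to look at what a feed contains. The sketch below parses a tiny, made-up RSS 2.0 document with Python's standard library and looks for a phrase in each item's title and description; the sample feed and its example.com links are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample feed: each item carries only a title, a link, and a summary.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example security feed</title>
    <item>
      <title>New trojan horse spreads by email</title>
      <link>http://example.com/full-story-1</link>
      <description>A short summary of the story.</description>
    </item>
    <item>
      <title>Patch released</title>
      <link>http://example.com/full-story-2</link>
      <description>Vendors ship fixes this week.</description>
    </item>
  </channel>
</rss>"""


def items_matching(rss_text: str, phrase: str) -> list[tuple[str, str]]:
    """Return (title, link) for each item whose title or description has the phrase."""
    root = ET.fromstring(rss_text)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        summary = item.findtext("description", default="")
        if phrase.lower() in f"{title} {summary}".lower():
            matches.append((title, item.findtext("link", default="")))
    return matches


print(items_matching(SAMPLE_RSS, "trojan horse"))
```

Only the few words inside each item are searched; the full story behind the link is never examined.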
➢ hasfeed: shows the pages that offer feed links and, if you add a keyword
(something I'm pretty sure Live intended you to do), the pages with feed links
that also have that keyword somewhere on the page.
Examples of how to use the hasfeed: command:
The results are webpages that offer news feeds and contain the phrase "trojan
horses" on the webpage. This does not guarantee, however, that the news feed
will be about Trojan horses, but the chances are good that if you are looking for
sites with newsfeeds about this topic, you can find them using this query.
This query should find the pages at the Microsoft website with feeds about
encryption. What this query actually finds are pages at the Microsoft site that
contain both XML feeds and the word encryption in the text, so a little research
will reveal which of these Microsoft newsfeeds are the most appropriate to the
topic of encryption.
➢ inanchor: restricts results to pages containing search term(s) in the webpage
anchor.
Live Search Special Features
Spell Checker: Live Search has a very good spell check option. When you input a
query, Live checks to see if you are using the most common spelling of the keyword.
If not, just like Google, Live nicely asks, "Were you looking for x," where x is the
most common spelling. The Live Search dictionary also includes some proper
names.
Dictionary Definitions: as with Google and Yahoo, Live Search offers the define
option. To use it, type [define] then a word or brief phrase, e.g., [define king cobra].
Live's define option is more limited than some others because it only refers to
Encarta.
French president). Live Search can also directly answer certain specific questions,
such as [how tall is the empire state building]:
Live Search no longer has the Encarta option that used to exist on MSN Search. For
now, the easiest way I have found to invoke Encarta from Live Search and to take
advantage of the Encarta Free Pass is to limit your search to Encarta using the site:
syntax, e.g., [site:encarta.msn.com keyword]. This will give you two hours of free
Encarta research.
RSS Results: when added to the end of any search result, the &format=rss
parameter will provide users those search results via RSS. "When you subscribe to
this RSS feed from Live Search, you'll get the top ten search results for this query
delivered to your RSS Reader or personalized site. You can subscribe to any
number of RSS feeds of Live Search results and view them all in your RSS Reader
without re-running your search queries." To use this option, first search for your
terms, e.g., [tsunami relief]. On the results page, add &format=rss to the end of the
url in the address bar and hit return:
http://search.live.com/results.aspx?q=tsunami&mkt=en-US&form=QBRE&go.x=0&go.y=0&go=Search&format=rss
The resulting page will look something like this; from here, follow the instructions on
the webpage:
(Screenshot: the RSS version of the Live Search results for "tsunami," with
one-click subscription options for common feed readers.)
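Adding the &format=rss parameter described above can also be done in a script rather than by hand in the address bar. This sketch uses Python's standard urllib.parse; the function name as_rss is my own, and the only thing taken from the text is the format=rss parameter itself.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def as_rss(results_url: str) -> str:
    """Return the same results url with format=rss as its last parameter."""
    parts = urlparse(results_url)
    # Keep the existing parameters, dropping any earlier format= value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    params.append(("format", "rss"))
    return urlunparse(parts._replace(query=urlencode(params)))


print(as_rss("http://search.live.com/results.aspx?q=tsunami&mkt=en-US"))
```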
Number Search: Live Search offers many types of number searches, including:
➢ UPS tracking: enter the UPS tracking number [1Z9999X99999999] or [ups
1Z9999X99999999].
➢ USPS tracking: enter the tracking number or USPS plus the tracking number
with or without spaces [usps 9999999999999999999999].
➢ FedEx tracking: enter the tracking number or FEDEX plus the tracking
number [fedex 9999999999999999].
➢ DHL and Airborne Express tracking: enter DHL plus the tracking number
[DHL 9999999999]; a DHL tracking search must include DHL in the query.
Calculator: Live Search uses the Encarta Calculator and Equation Solver to perform
mathematical functions using "operators, exponents, and roots, factorials, modulo,
percentages, logarithms, trig functions, and mathematical constants." The Encarta
calculator appears to be the most sophisticated of all those offered by major search
engines because it will even solve complex algebraic equations, such as
4x^3-2x+.9=0
Add                              +
Subtract                         -
Multiply                         *
Divide                           /
Raise a number to an exponent    ^  (for example, 3^2 is 3 squared)
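If you want to check an answer the calculator gives for an equation such as the cubic mentioned above, a few lines of Python will do. The sketch below finds a real root by simple bisection; this is my own choice of method for illustration and says nothing about how the Encarta solver works.

```python
def f(x: float) -> float:
    """Left-hand side of the example equation 4x^3 - 2x + 0.9 = 0."""
    return 4 * x**3 - 2 * x + 0.9


def bisect(func, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Find a root of func between lo and hi, where func changes sign."""
    if func(lo) * func(hi) > 0:
        raise ValueError("func must have opposite signs at lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid   # the sign change is in the lower half
        else:
            lo = mid   # the sign change is in the upper half
    return (lo + hi) / 2


root = bisect(f, -1.0, 0.0)   # f(-1) is negative and f(0) is positive
print(root, f(root))
```

Substituting the printed root back into f should give a value very close to zero, which is an easy way to confirm any solver's answer.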
Also, when you search for a famous person using Live image search, look for the
"Related People" window to appear on the right side of the screen. This can be an
extremely useful tool in finding relationships between people in the news or historical
figures.
The Live image search respects some but not all of the web search syntax and
some of it is not really very useful for image search:
➢ site/domain: restricts results to images from a specific website or domain,
including a specific top-level domain (com, gov, dell.com, a country digraph,
etc.). May be used with or without keywords.
Examples of how to use the site: command in image search:
[site:amazon.com "twelfth night"] finds images of "twelfth night" that are from
amazon.com; note that the images from amazon.com may reside on another
website (amazon.com is in the image's url).
[site:ir] finds all the image pages from the Iranian (.ir) top-level domain indexed
by Live Search.
➢ inurl: restricts results to images that contain the term in the url of the image
itself. Can be used with or without other search terms.
Examples of how to use the inurl: command in image search:
[inurl:amazon "rain forest"] finds all pages containing "amazon" in the url of the
image and "rain forest" anywhere on the webpage.
➢ intitle: restricts results to images that appear on pages containing search term(s)
in the title of the webpage. Can be used with or without other search terms.
Examples of how to use the intitle: command in image search:
[intitle:amazon brazil] will find pages that contain "amazon" in the title of the
webpage and "brazil" anywhere on the webpage.
[intitle:amazon inurl:brazil] will find pages that contain "amazon" in the title of the
webpage and "brazil" in the image's url.
Video Search: Live video search is clearly trying to be competitive in the video
search market. In October, Microsoft announced a new partnership with Blinkx to
power its video search. This looks like a very good move for Microsoft. "Blinkx
already powers video search on sites ranging from AOL to ITN, Lycos and Times
Online. It also indexes video from the likes of BBC, Fox, MTV, Sky News, Reuters
and YouTube and makes videos on those sites searchable on Blinkx or
partner sites. To date, the company has indexed more than six million hours of
audio, video, and TV programming to make it searchable." 55 However, as of this
writing, the Live video search has not yet been updated to reflect this
partnership.
55
Eric Auchard, "Blinkx Signs Microsoft Pact," Reuters via Yahoo, 9 October 2006,
<http://news.yahoo.com/s/nm/20061009/wr_nm/media_blinkx_de_3> (17 October 2006).
As of now, the Live video search results include a thumbnail image from the video
with the title, source, length, and format. All videos are viewed at the originating site,
as shown below with the Newsweek On Air interview with Iranian President
Ahmadinejad.
(Screenshot: a Live video search result for the Newsweek On Air interview, shown
playing at the originating site.)
You can use some of the web search syntax for video search. Note the difference
between these two searches:
[site:reuters.com iran]
[reuters iran]
The first query returns only those videos on Iran from the Reuters website; the
second query returns videos from any site that include the keywords "reuters" and
"iran." We will have to wait and see how these query options change once the
results come from Blinkx.
News Search: as of now, the Live news search is only a list of stories listed by
relevance. MSN Newsbot <http://newsbot.msnbc.msn.com/> remains Microsoft's
premier news page. However, if you want to search for news stories, MSN Newsbot
takes you directly to the new Live news search. Most of the web search
commands work for news search. Especially useful is the site/domain: syntax,
which lets users limit a news query to a specific source:
Feed Search (Beta): This search is virtually identical to the feed: websearch. It
limits searches to text within a feed. Feeds are specially formatted brief descriptions
of content with a link to the full version of that content. RSS (and the competing
Atom) feeds are in XML format. These feeds are usually used for syndicating web
content such as blogs and news. The feed search only searches the text of the feed,
which is often a very condensed description of the full web content.
Example of how to use the feed: command:
[feed:"trojan horse"]
Each of the results represents an XML feed that includes the phrase "trojan horse."
There is no point in clicking on the link in a browser because that brings up the XML
page that most browsers are not designed to parse. The cached copy shows the
search terms as they appeared in the feed.
Live Book Search (beta): Microsoft added its own proprietary book search in late
2006. Details are in the Book Search section below.
Academic (Beta): Microsoft introduced Academic Search Beta for scholarly search
earlier this year, and it is now also a Live search option. Academic search still has a
separate website at the Windows Academic Live Beta Homepage. Clearly,
Academic search is intended to compete with Google Scholar and other scholarly
search sites. Unlike Google Scholar, Academic search focuses on computer
science, physics, medical, and electrical engineering publications. As with Amazon
and Google Scholar, Academic search has partnered with the Online Computer
Library Center (OCLC). "OCLC's involvement in Windows Live Academic is part of
the Open WorldCat Find in a Library program," 56 and also provides metadata from
WorldCat to Academic search to give researchers access to the resources in library
collections around the world.
As with almost anything, Academic search has good features and weaknesses. Here
is a snapshot of the first page of results on the search [neural network]. When you
execute a query, you will be presented with an interface that looks like this. One of
the first things you notice is the split screen, which I actually like.
56
"WorldCat live in Windows Live Academic search tool," OCLC Newsletter, Issue 2, 2006,
<http://www.oclc.org/nextspace/002/updates.htm> (17 October 2006).
(Screenshot: Academic search results for [neural network], with the results list on
the left and the preview pane on the right; numbered callouts mark the features
described below.)
On the left you see the results; on the right-hand side of the screen is more detailed
information that appears automatically as you click on different results. You have the
option to view the abstract or properly formatted citations:
1. Slider bar: This allows you to expand or contract the amount of
information contained in the search result
2. Preview pane: This pane allows you to obtain more information on the
result that you are hovering over with your mouse on the results pane
3. Abstract: one of the options in the preview pane- choosing this option
will allow you to see the abstract of the article that you are hovering
over with your mouse on the results pane
4. BibTeX/RefWorks/EndNote: citation options in the preview pane -
choosing one of these options will allow you to see the formatted
citation (BibTeX, RefWorks, or EndNote format) on the preview pane
for the search result that you are hovering over with your mouse on the
results pane. BibTeX, RefWorks, and EndNote are different formats
that allow users to create citations automatically. The EndNote RIS
~ the ability to extract citations (if you need to cite the information, this is a big
benefit).
Given the newness of Live Macros, there are not very many to choose from yet;
however, I expect to see this list grow, and there are already some useful macros,
such as the "reference" macros. Here is the reference macro added to the Live
Search main menu, with the results from querying the reference macro only.
[Screenshot: Live Search results from the reference macro for a query on uranium, showing encyclopedia entries on uranium isotopes]
I believe there are too many results from Wikipedia in the reference search, but you
can easily eliminate the Wikipedia results by adding [-site:wikipedia.org] to any query
(conversely, you could limit your search to Wikipedia by adding [site:wikipedia.org] to
your query). Live Search Macros are only the latest in a number of "create your own
search engine" options, all of which are variations on complex queries of already
existing search engines. For comparison, see the section on Custom Search
Engines below.
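The site: filter described above is just text appended to the query. Here is a minimal sketch of that idea in Python; the function names are my own illustrations and are not part of any Live Search tool or API.

```python
def exclude_site(query: str, domain: str) -> str:
    """Return the query with results from one domain filtered out."""
    return f"{query} -site:{domain}"


def restrict_to_site(query: str, domain: str) -> str:
    """Return the query limited to results from one domain."""
    return f"{query} site:{domain}"


if __name__ == "__main__":
    # Hypothetical query text; only the site: syntax comes from the text above.
    print(exclude_site("uranium isotopes", "wikipedia.org"))
    print(restrict_to_site("uranium isotopes", "wikipedia.org"))
```

The resulting strings are what you would type into the search box, without the square brackets used in this book to mark a query.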
QnA: Live Search's new QnA (question and answer) search is mostly fluff, at least
for now. You can look at the questions and responses to see what I mean (typical
questions: "How can i get my Space Cadet Pinball that was preinstalled in Windows
XP back in Windows Vista?" "Do you think the Internet is contributing to 'Intellectual
Laziness'?"). Lots of opinion, not a lot of fact. Let us hope this is not all that "Web
2.0" portends.
Live Platform: In September 2005 Microsoft announced it would begin offering APIs
for Live Search, Virtual Earth, Spaces (weblogs), Messenger, Gadgets, and the Expo
classified ads database. These offerings have begun to rival Google's in terms of
innovation and shared technology. To keep abreast of these changes, I recommend
the MSN Developer Center.
MSN Developer Center's Windows Live Platform and Services for Web Mashups
http://msdn.microsoft.com/live/default.aspx
Microsoft subsequently opened Windows Live Dev (Beta), a "one-stop shop for the
Windows Live Platform, including information on getting started with Windows Live
services, latest documentation and APIs, samples, access to community areas and
relevant blogs, and announcements of future releases and innovations." 57 Microsoft
is trying to make it easy for users to integrate their products with Live regardless of
platform, browser, or language. Certainly the first two are a departure for Microsoft,
which in the past usually required a Windows platform and the Internet Explorer
browser in order to "play ball" with the software giant. A further example of
Microsoft's reluctant openness is the fact that Microsoft's Internet Explorer 7+
browser will not default to Live Search, a default other search engines had
objected to.
Windows Live Dev (Beta) http://dev.live.com/
Microsoft is working very hard to improve and expand its search properties, so much
so that at times one feels as if we can see them working under the hood as we
watch. Clearly, there are many things that need improvement and many things that
are very good about Live.com. It will continue to be one of the top search sites on
the Internet. If you are interested in keeping up with news about and changes to Live
Search, there is a blog devoted to it; the blog offers RSS and Atom syndication.
Also, all the Windows Live Beta projects are accessible through one webpage if you
want to see what Microsoft is planning.
Windows Live Ideas Beta http://ideas.live.com/
Live Search Weblog http://blogs.msdn.com/livesearch/
57
Windows Live Dev, Live Dev News, 8 June 2006,
<http://dev.live.com/blogs/devlive/archive/2006/05/19/15.aspx> (17 October 2006).
Gigablast
The Gigablast search engine, which has been around since 2002, is still not quite in
the same league as powerhouses Google, Yahoo, and Live Search, but it is well on
its way to becoming one of the best search engines. That's something of a surprise
given Gigablast's humble origins and unique status among major search engines. In
case you're not familiar with Gigablast, it is different from its major competitors most
notably because it is still owned and largely run by the guy who first wrote its C++
code in 2000. Matt Wells is still the very hands-on proprietor of Gigablast. Its
database now indexes over 2 billion pages, up from 650 million in late 2004.
While this falls short of the size of the Google, Yahoo, and Live Search databases,
it's not bad, especially considering a lot of the "stuff" in those databases is dross and
the numbers are not verified independently.
How does Gigablast stack up to the big boys? Gigablast has some very nice
features, some of which are unique to it, such as the IP range search (something
AlltheWeb once offered).
Gigablast http://www.gigablast.com/
Strengths
~ simple interface
~ cached copies with date indexed [archived copies]
~ cached copies of webpages without images [stripped]
~ links to Internet Archives [older copies]
~ clusters results by default (can be turned off)
~ no limit on number of search terms
~ file types indexed include Microsoft Word, Excel, and PowerPoint, as well as
PDF, PostScript, HTML, and text; syntax is:
o type:pdf for Adobe Acrobat PDFs
o type:doc for Microsoft Word documents
o type:ppt for PowerPoint presentations
o type:xls for Excel spreadsheets
o type:ps for PostScript files
o type:text for ASCII text files
o type:html for HTML Web pages
This query finds all the sites in the Gigablast database that begin with the IP
address 66.218.77:
[Screenshot: Gigablast results for the IP range search, listing Yahoo! GeoCities pages with their indexed and modified dates]
This query finds all the sites in the Gigablast database residing on the specific
host whose IP address is 66.218.77.68:
[Screenshot: Gigablast result for the single-host IP search, a GeoCities-hosted cheerleading page for SMS in Manassas, Virginia]
~ other special syntax includes link:, site:, title:, and suburl:, which searches
for web pages that have the keyword anywhere in the URL
~ although Gigablast will ignore stop words in a long query, users can search
on any word or number by itself
~ default operator is AND; OR and AND NOT also work; nested queries (with
parentheses) are supported
~ Gigablast is the only search
engine that will display the metatags in the results list, but the syntax for this
query is very complex. Please see the Gigablast review at Search Engine
Showdown for details on this type of query:
"Meta Tag Searching and Display: Gigablast is the only search engine
indexing meta tags beyond just the meta description and meta keywords that
some others index. It is the only search engine that can also display meta
tags in the results list. Gigablast claims to be indexing all "generic" meta tags.
In addition, it can display the meta tags in the results list. Doing this requires
adding commands to the URL of the results list. At the end of the URL, add a
&dt= followed by the word(s) for the meta tags, followed by a colon, and then a
number to represent how many characters from each meta tag should be
displayed. So, for example, adding &dt=keywords+author+generator+description:30
will display the meta tag content for meta keywords, meta author, meta
generator, and meta description tags for any records retrieved. Use a +
between meta tag words. It seems that this "generic" meta tag approach
excludes more complex meta tags like Dublin Core, which use a syntax like
DC.Creator. The dot syntax will not work for the display command, although
Gigablast does index some of the content of these tags." 58
[Screenshot: Gigablast results for the query dublin core with a &dt= string added to the end of the results URL in the address bar; the DC-dot entry displays its keywords, generator, and description meta tags]
~ clearly displays date webpage was indexed and, in some cases, modified
~ search query spellchecker (Did you mean? option)
58
Greg R. Notess, "Review of Gigablast," Search Engine Showdown, 17 September 2006,
<http://www.searchengineshowdown.com/features/gigablast/review.html> (14 November 2006).
~ undocumented feature: will search in some specific languages, but I don't
know how many; use language:de to search for webpages in German, for
example.
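Two of the strengths listed above lend themselves to a short illustration: the type: file type syntax and the &dt= string that Notess describes for displaying meta tags. The following is a minimal sketch in Python, assuming only what the list and the quotation state; the function names and the sample results URL are my own and are not part of Gigablast.

```python
# File type abbreviations taken from the list of type: options above.
GIGABLAST_TYPES = {"pdf", "doc", "ppt", "xls", "ps", "text", "html"}


def with_file_type(query: str, file_type: str) -> str:
    """Append a type: restriction to a query."""
    if file_type not in GIGABLAST_TYPES:
        raise ValueError(f"file type not in the documented list: {file_type}")
    return f"{query} type:{file_type}"


def add_meta_display(results_url: str, tags: list, max_chars: int) -> str:
    """Append &dt=tag1+tag2:N to the end of a results-page URL."""
    if not tags:
        raise ValueError("at least one meta tag name is required")
    return f"{results_url}&dt={'+'.join(tags)}:{max_chars}"


if __name__ == "__main__":
    print(with_file_type("dublin core", "pdf"))
    # The base URL below is illustrative; in practice, copy the URL of an
    # actual results page from the browser's address bar.
    base = "http://www.gigablast.com/search?q=dublin+core"
    print(add_meta_display(base, ["keywords", "author", "generator", "description"], 30))
```

The second call reproduces the &dt=keywords+author+generator+description:30 string from the quotation.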
Weaknesses
~ most obviously, the Gigablast index is still smaller than those of Google,
Yahoo, or Live Search
~ no truncation
~ is not case sensitive
~ no wildcard
~ limited file type searches
~ limited language options
~ poor documentation
Directory: As with Google and Yahoo, Gigablast's web directory uses the Open
Directory Project's collection, but Gigablast uses a "hypertechnology for searching the
directory that allows its users to perform searches over websites, not just the actual
pages, under any topic in the directory, in effect, instantly creating over 500,000
vertical search engines. Additionally, all directory searches are enhanced by the
massive amount of link information from Gigablast's multi-billion page index." So a
Gigablast directory search returns not only DMOZ categories but "Giga Bits" and
website listings as well.
XML Search Feed: Gigablast also offers an XML Search Feed that will run up to
1000 queries per day with a maximum of ten results each. But remember, you must
have XML parsing software to read XML feeds, so this new feature isn't an option for
all users.
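For readers wondering what "XML parsing software" involves, here is a minimal sketch using Python's standard library. The sample feed and its element names (response, result, title, url) are hypothetical stand-ins, since the actual layout of Gigablast's feed is not described here; only the ten-result cap comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; the real feed's element names may differ.
SAMPLE_FEED = """<response>
  <result><title>Gigablast</title><url>http://www.gigablast.com/</url></result>
  <result><title>Example page</title><url>http://www.example.com/</url></result>
</response>"""

MAX_RESULTS_PER_QUERY = 10  # the per-query limit mentioned above


def parse_feed(xml_text: str) -> list:
    """Return a list of {'title', 'url'} dictionaries from an XML feed."""
    root = ET.fromstring(xml_text)
    results = []
    for node in root.findall("result")[:MAX_RESULTS_PER_QUERY]:
        results.append({
            "title": node.findtext("title", default=""),
            "url": node.findtext("url", default=""),
        })
    return results


if __name__ == "__main__":
    for hit in parse_feed(SAMPLE_FEED):
        print(hit["title"], hit["url"])
```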
Related Pages: Gigablast's Related Pages were introduced in March 2005. Related
Pages are "relevant search results which do not necessarily contain the searcher's
query terms." Related Pages are results that are contextually related to the query
terms without having a direct connection to them. The Related Pages appear in the
yellow box on the results page.
[Screenshot: Gigablast results 1 to 10 of about 2,640,799 for "artificial intelligence", with the Reference Pages box]
Gigablast still "runs on eight desktop machines, each with four 160-GB IDE hard
drives, two gigs of RAM, and one 2.6-GHz Intel processor. It can hold up to 320
million Web pages (on 5 TB), handle about 40 queries per second and spider about
eight million pages per day. Currently it serves half a million queries per day to
various clients, including some metasearch engines and some pay-per-click
engines." We are not talking about a huge "server farm" here. Interestingly, despite
keeping his search engine "small," Gigablast creator/proprietor Matt Wells says "I am
a firm believer that bigger is better," and toward that end he is hoping to get the
Gigablast index up to 5 billion pages. For more on Wells and Gigablast, read his
interview with his former boss at Infoseek in the April 2004 edition of ACM Queue:
"A Conversation with Matt Wells: Steve Kirsch of Propel Software Interviews
Gigablast Designer," ACM Queue, vol. 2, no. 2, April 2004,
http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=135
(15 November 2006).
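The hardware figures in the quotation above can be checked with some quick arithmetic. This is only a back-of-the-envelope sketch in Python: it assumes decimal units (1 TB = 1,000 GB) and an even spread of queries over a 24-hour day, neither of which the quotation states.

```python
# Figures taken from the quotation above.
machines = 8
drives_per_machine = 4
drive_gb = 160
pages = 320_000_000
queries_per_day = 500_000

# Raw disk capacity across all eight machines, in gigabytes.
total_gb = machines * drives_per_machine * drive_gb

# Average storage per page if 320 million pages fill 5 TB (decimal units assumed).
bytes_per_page = 5 * 10**12 / pages

# Average query rate if the daily load were spread evenly (an assumption).
avg_queries_per_second = queries_per_day / (24 * 60 * 60)

if __name__ == "__main__":
    print(total_gb, "GB of raw disk")
    print(bytes_per_page, "bytes per page")
    print(round(avg_queries_per_second, 1), "queries per second on average")
```

Under these assumptions the disks total 5,120 GB, which is consistent with the quoted 5 TB, and the average load is well under the quoted peak of about 40 queries per second.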
Exalead
The French search engine Exalead, which introduced a new look in 2006, has
features that make it worth special mention. Exalead offers both proximity searches
and truncation, two options no other major search engine offers anymore. In
addition, Exalead presents thumbnail images of websites in the results list (if you
want them) and related search terms, directory categories, website locations, and
filetypes. Exalead now claims to index more than eight billion pages. Although this is
far smaller than some major search engines, it is a respectable number and one that
is sure to increase.
While the new version of Exalead did away with one of its best features-the safe
page preview-Exalead offers a number of other unusual or unique features
designed to create a very powerful search tool:
[Screenshot: Exalead home page with the query box, Web and Images tabs, and the Advanced search link]
Notice the images below the query box. Exalead lets users put "shortcuts" here by
entering a title and URL for their favorite websites.
Exalead is in the process of updating its help pages; thus far, you can find various
types of help at these pages:
Exalead http://www.exalead.com/search
Exalead Refine Your Search http://www.exalead.com/search/?action=kourou&id=49
Exalead Advanced Search Help
http://www.exalead.com/search/?action=kourou&id=24
Exalead Search Syntax Help
http://www.exalead.com/search/C?definition=querySyntaxReference
7. Display view on results page: text only; text and thumbnail; text thumbnail
and extra
[Screenshot: annotated Exalead results page with directory categories; the letters in the list below refer to its callouts]
~ A Matching Documents: the best results for the query with the page title listed
first; Exalead clusters results, showing only the "best" page for each website.
~ C Page preview and thumbnail image: The biggest disappointment of the new
Exalead is that it no longer offers the safe page preview option for webpages.
Instead it has chosen to give a thumbnail image of the cached copy of the
webpage; users can click on "Preview" to see the cached copy, complete with
highlighted search terms and the date cached. Fortunately, Exalead does
offer safe previewing of non-HTML file types, and this is especially
EXALEAD
URL: http://www.exalead.com/
Key features:
• wildcards for stemming words
• pattern matching ("regular expressions")
• phonetic search
• approximate spelling search
• NEAR proximity operator
• full Boolean search
• thumbnails of pages displayed in results
• related terms and categories displayed on the results page
• user-specified shortcuts (Smart Bookmarks) to other search engines on the home page
Search options
Default search type: All of your words
Case sensitive? No
Wildcard/Truncation: Yes. Asterisk (*) at the end of words, for example pollut*.
Also pattern matching/regular expressions for internal wildcards, for example
/psych.*ist/ or /mpg(1|2|3)?/
Phrases and proximity: Phrases "......" For example "climate change". NEAR
operator to search for terms within sixteen words of one another. Specify
maximum number of words using NEAR/n, for example climate NEAR/5 change
Plus sign (+) before stop words such as "the", "of". The plus sign can also be
used to disable automatic stemming if set up by the user under preferences.
Minus sign (-) before the word, for example branson -balloon
inurl: for pages with the term in their URL, for example inurl:chocolate
intitle: for pages that contain the adjacent word in the title, for example
intitle:chocolate
link: for example link:bba.co.uk
~ D Directory link: opens the related categories folders from The Open
Directory Project, which are also listed to the right. You can completely alter
the results by selecting a different related category, e.g., in this example,
continental philosophy instead of phenomenology. Clicking on "More choices"
will greatly expand the related terms and related categories lists.
~ E Add to shortcuts: selecting this link will make the current site one of your
shortcuts that appear on the Exalead homepage.
~ F Related Terms: clicking on a related term runs a new search on that term
and displays a new results page with new and different related terms, related
categories, etc. Clicking on "More choices" will greatly expand the related
terms and related categories lists.
[Screenshot: Exalead result with directory categories and links to an audio file, a video file, and an unofficial RSS feed]
~ I Document Type: clicking on a specific file type will only return matching
documents in that specific file type, e.g., PDF, TXT, DOC, PPT, RTF, and
XLS (remember: do not open the Microsoft file types on the Internet; use the
page preview option in the thumbnail image to view these files).
~ J Image Search: Clicking on image search will automatically run the web
search against the image database.
Exalead assumes as its default that multiple search terms are joined by the AND
operator, so that a search on the keywords [windows explorer] will find all the
webpages that contain both search terms. However, unlike Google, Exalead does
not search first for phrases, then the terms anywhere on a webpage.
Exalead will not return any results if there is no webpage containing all the
search terms. Try this query to see what I mean:
[rollerskate handshake buckyball]
However, remember you can use the OPT (optional) operator to make a term
desirable but not required.
Unlike Google, Exalead does not limit the number of search terms to 32
keywords. Exalead will try to match all the keywords you enter.
Exalead is not case sensitive.
Exalead automatically clusters search results. If you want to see more pages
from a specific site, the only way I know to do so now is to run a site search. For
example, to see the pages at Amazon UK search for [site:amazon.co.uk].
Exalead permits the use of the OR operator in simple search. The OR must be
capitalized.
Exalead recognizes double-quotes as enclosing a phrase.
Exalead ignores certain stop words only when they are searched alone or with other
stop words. If you include a stop word such as a, an, the, in, or be in a search with
other keywords, Exalead searches for it. If you need to search for stop words by
themselves or with other stop words, you must either enclose them in double-quotes
or put the plus sign (+) in front of them. Compare [to be or not] to ["to be or not to be"]
and compare [fire and ice] to ["fire and ice"].
Using the minus sign (-) in front of a keyword ensures that Exalead excludes
that term from the search. For example, the results for the search
[phenomenology -philosophy] are very different from the results for
[phenomenology].
Advanced search
[Screenshot: Exalead Advanced search window. Under "What?" the options include exact phrase, words starting with, approximate spelling, adjacent words (NEAR), logical expression, and regular expression; "Where?" follows; under "When?" the options are modified after a given date and modified before a given date.]
Two features Exalead offers that have almost vanished from search elsewhere are
proximity searches and truncation/wildcards.
Exalead's proximity search uses NEAR. The default setting is for terms that are
within sixteen words of each other, but users can change the proximity by adding
a number, e.g., [empire NEAR/5 building]. With the NEAR operator, order is almost
irrelevant. A query using the name of an 18th-century
French foreign minister, Charles Jean-Baptiste Fleuriau, comte de Morville, shows
how the NEAR operator works: the query [comte de Morville NEAR Fleuriau NEAR
Charles NEAR Jean-Baptiste] finds any indexed page containing all these terms
within sixteen words of each other, regardless of the order in which they appear
either in the query or in the text.
[Screenshot: Exalead results for the NEAR query, topped by the entry for MORVILLE (Charles-Jean-Baptiste de Fleuriau d'Armenonville, comte de) from the Ministers of Foreign Affairs' Gallery, Archives diplomatiques, with related terms listed on the right]
Also, the presence or absence of parentheses does not appear to affect the NEAR
search. Proximity operators can be extremely useful in finding pages with
search terms that may not be in a precise order while excluding a lot of
irrelevant hits.
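To make the idea of an order-independent proximity match concrete, here is a minimal sketch in Python. It is not Exalead's implementation, which is not documented here; it simply assumes that "within sixteen words" means the first and last matched terms are at most sixteen word positions apart, and it handles single-word terms only. The function name is my own.

```python
import re


def within_window(text: str, terms: list, window: int = 16) -> bool:
    """True if every term occurs inside some span of at most `window` words."""
    words = re.findall(r"[\w-]+", text.lower())
    targets = {t.lower() for t in terms}
    # Positions of every word that is one of the search terms.
    hits = [(i, w) for i, w in enumerate(words) if w in targets]
    for start in range(len(hits)):
        seen = set()
        for position, word in hits[start:]:
            if position - hits[start][0] > window:
                break  # too far from the first term in this span
            seen.add(word)
            if seen == targets:
                return True
    return False


if __name__ == "__main__":
    sample = "Charles Jean-Baptiste de Fleuriau d'Armenonville, comte de Morville"
    print(within_window(sample, ["Morville", "Fleuriau", "Charles", "Jean-Baptiste"]))
    print(within_window(sample, ["Morville", "Fleuriau", "Charles", "Jean-Baptiste"], window=5))
```

Because the terms are collected into a set, the order in which they are listed makes no difference, which mirrors the behavior described above.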
Exalead supposedly offers both automatic truncation (word stemming) and the
wildcard, which are welcome features discarded by other search engines. As of
now, Exalead is the only major search engine to offer truncation or a wildcard. On a
search with two or more words, stemming is supposed to be automatic. However, I
find that the automatic truncation feature is so capricious as to be useless:
sometimes it works, usually it doesn't. In a search for [child play toy], Exalead does
not find children, plays/played/playing, or toys.
However, when I search on [child*], Exalead will return pages with children
highlighted as a search result. The wildcard also can be used inside a search term,
e.g., [kazak*stan]. However, this search will also find kazakh and kazak as well as
kazakstan and kazakhstan. The wildcard option is listed in the Advanced search
window as words starting with, but keep in mind the asterisk can be used inside
words as well.
Exalead has a number of other interesting features. For example, in the advanced
search window, users can choose among these search method options: exact
The phonetic search sounds great, but I am often frustrated by it: because so many
websites misspell so many words, Exalead finds those misspelled words
first (try [genealogy] to see what I mean). However, the phonetic search
successfully figured out that [criptografy] meant [cryptography], so it
does have genuine utility.
[Screenshot: Exalead phonetic search results for criptografy, returning cryptography pages with related terms such as public key cryptography and quantum cryptography]
What I like much, much better is Exalead's regular expression patterns option,
which amounts to a true wildcard search. Here's how it works:
Use a forward slash (/) at the beginning and end of the term; use a period (.) to
indicate one missing letter; if you are not sure how many letters are missing, use
the wildcard (*) after the period. For example, the query [/crypt.*c/] will find
cryptographic and cryptologic:
[Screenshot: Exalead results 1-100 of about 1,269,173 for /crypt.*c/, including the International Association for Cryptologic Research and Cryptologic Inc.]
Here are the basic rules for pattern matching (wildcard) searches:
The pipe ( | ) stands for OR, and parentheses are used to group
characters.
The last character is always a slash (/). This tells Exalead this
is the end of the query.
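These rules closely resemble ordinary regular expressions, so they can be tried out with Python's re module. The sketch below assumes that Exalead's patterns behave like Python's for the period, asterisk, pipe, and parentheses, and that a pattern must match a whole word; both are my assumptions, and the function names and the word lists are illustrative.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern":
    """Strip the enclosing slashes and compile what is left."""
    if len(pattern) < 2 or not (pattern.startswith("/") and pattern.endswith("/")):
        raise ValueError("pattern must begin and end with a slash")
    return re.compile(pattern[1:-1], re.IGNORECASE)


def matching_words(pattern: str, words: list) -> list:
    """Return the words that the pattern matches from start to end."""
    regex = pattern_to_regex(pattern)
    return [w for w in words if regex.fullmatch(w)]


if __name__ == "__main__":
    vocabulary = ["cryptographic", "cryptologic", "cryptography", "crypt"]
    print(matching_words("/crypt.*c/", vocabulary))
    # Pipe and parentheses: a hypothetical pattern for two spellings.
    print(matching_words("/gr(a|e)y/", ["gray", "grey", "groy"]))
```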
Exalead will handle complex boolean queries in the simple search screen or from
the Advanced search window. The boolean operators Exalead supports are AND,
OR, NOT or AND NOT (in caps). A typical boolean query would be:
[(baseball OR football) NOT cardinals]
In addition, there are two other operators that can be used in a boolean query:
NEAR and OPT. NEAR finds search terms within 16 words of each other and OPT
makes a query term preferable but does not require it. For example:
This is nice to know because most search engines use AND as their default, and will
not return results unless all terms are found. Check the difference between the
results for these two searches in Exalead: [buckyball skateboard OPT flyswatter]
and [buckyball skateboard flyswatter].
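The difference between a required term and an OPT term can be sketched in a few lines of Python. This toy model is not Exalead's ranking: it assumes that every required term must be present and that pages containing more of the optional terms are simply listed first. The function name and the sample "pages" are my own.

```python
def search(pages: list, required: list, optional: tuple = ()) -> list:
    """Keep pages containing all required terms; list those with more optional terms first."""
    required_terms = [t.lower() for t in required]
    optional_terms = [t.lower() for t in optional]
    scored = []
    for page in pages:
        words = set(page.lower().split())
        if all(term in words for term in required_terms):
            score = sum(term in words for term in optional_terms)
            scored.append((score, page))
    # The sort is stable, so pages with equal scores keep their original order.
    scored.sort(key=lambda pair: -pair[0])
    return [page for _, page in scored]


if __name__ == "__main__":
    pages = [
        "buckyball skateboard",
        "buckyball skateboard flyswatter",
        "buckyball only",
    ]
    # flyswatter is optional: both pages with the two required terms come back.
    print(search(pages, ["buckyball", "skateboard"], ("flyswatter",)))
    # flyswatter is required: only one page qualifies.
    print(search(pages, ["buckyball", "skateboard", "flyswatter"]))
```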
Exalead will search in all or one of most languages. Use either the syntax language:
followed by the language digraph or the pulldown menu in the Advanced search
window. Also, Exalead offers a country search option either from the Advanced
search window or using the syntax country: followed by the country digraph.
Exalead does not recognize diacritical marks at this time. This means that a
search on [façade] finds both façade and facade. However, Exalead will handle
some non-Latin character sets. Exalead officially supports Unicode (UTF),
Windows encodings, and miscellaneous encodings (Arabic, Chinese, Korean,
Japanese, and Russian).
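One common way a search tool ends up treating façade and facade as the same word is by stripping diacritical marks before comparing. I do not know whether Exalead does exactly this; the sketch below, using Python's standard unicodedata module, only illustrates the general technique, and the function name is my own.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks such as accents and cedillas from a string."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(strip_diacritics("façade"))
    print(strip_diacritics("façade") == strip_diacritics("facade"))
```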
Exalead offers limited field searching, i.e., special search terms to restrict
searches and make them more effective. These special operators can be used in
both simple search and in the Advanced search window.
[language:de welt] finds all the pages indexed by Exalead that are written in
German and contain the keyword "welt," which has a very different meaning
in German than in English.
~ country: restricts results to pages in a specific country. The country syntax
uses the two-letter ISO country codes. Must be used with additional
keywords.
[country:de wissenschaft] finds all the pages indexed by Exalead that are
purportedly in Germany and contain the term "wissenschaft." It will not limit
the search to the German TLD "de."
~ site: restricts results to a specific website or domain, excluding specific top-
level domains. You must search on a second-level domain for site to work.
May be used with or without keywords.
[site:ir] does not find the pages from the Iranian (.ir) top-level domain.
However, [site:gov.ir] does find all the pages from the Iranian government
domain indexed by Exalead.
~ filetype: restricts results to PDF, MS Word, and other filetypes. May be used
with or without keywords. Exalead converts these other types of files to
HTML, making them safe to view. Select [Preview] to see the HTML version.
To search by specific type of file, use the syntax filetype: plus one of these
abbreviations:
[filetype:xls] finds all pages indexed by Exalead that are in Excel spreadsheet
format.
[filetype:pdf "white paper"] finds all pages indexed by Exalead that are in PDF
format and contain the phrase "white paper" anywhere in the text, title, or URL.
[intitle:amazon] finds all pages that include the word amazon in their title.
["rain forest" intitle:amazon] finds all pages that include the word amazon in
their title and mention the phrase "rain forest" anywhere in the document (title
or text).
[inurl:amazon] finds all pages that include the word amazon anywhere in their
URL.
["cosmic ray" inurl:spacecraft] finds all pages that include the exact phrase
"cosmic ray" anywhere in the document (title or text) and include spacecraft
anywhere in the site's URL.
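The field operators above all follow the same name:value pattern, so a query combining them can be assembled mechanically. Here is a minimal sketch in Python; the function is my own illustration, not an Exalead tool, and it assumes that the position of the operators relative to the keywords does not matter. The check on country: reflects the note above that it must be used with additional keywords.

```python
# Field names taken from the operators described in this section.
EXALEAD_FIELDS = ("language", "country", "site", "filetype", "intitle", "inurl")


def build_query(keywords: str = "", **fields: str) -> str:
    """Combine field restrictions and keywords into one query string."""
    parts = []
    for name, value in fields.items():
        if name not in EXALEAD_FIELDS:
            raise ValueError(f"unsupported field: {name}")
        parts.append(f"{name}:{value}")
    if "country" in fields and not keywords:
        raise ValueError("country: must be used with additional keywords")
    if keywords:
        parts.append(keywords)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query('"white paper"', filetype="pdf"))
    print(build_query("wissenschaft", country="de"))
    print(build_query(site="gov.ir"))
```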
Advanced Search > Where? > on pages that contain a given link
Image Search: Exalead offers some nice options with its image search. You can look
for images of specific sizes (small, medium, large), computer wallpaper by
resolution, image color, layout, or filetype. Exalead's advanced search options work
in image search as well.
[Screenshot: Exalead image search results for cassini, with refinement options for color, layout (landscape or portrait), and file type]
Exalead is not in the Google and Yahoo class yet, but because it offers unique and
important features dealing with truncation, wildcards, proximity searching, etc., it is
one of the top-tier search services. In addition, Exalead offers the option to
preview non-html files (e.g., Microsoft file types) safely, which is extremely important
given the security dangers that plague Internet users. Exalead is a valuable addition
to the world of Internet search.
Ask
During 2006 Teoma and Ask Jeeves ceased to exist as separate search sites and
merged under the Ask.com umbrella. I had never been impressed with Ask Jeeves,
which was one of the few sites that continued to try to respond to users' questions,
though not very successfully. Teoma was always an "also ran" in the world of
search. However, when Barry Diller, former Chairman and CEO of Paramount
Pictures and Fox, Inc., and his IAC/InterActive Corp. acquired Ask Jeeves this
year, things changed dramatically. The name was shortened to Ask, the annoying
butler icon was gone, along with the ubiquitous ads and usually unfulfilled promise of
answers to natural language queries. Ask incorporated Teoma's search algorithm,
ExpertRank, and the Teoma site went away. Now, Ask.com has become a major
player.
One of the most striking differences is obvious as soon as you run a search. Instead
of a list of sponsored links, which Google, Live Search, and Yahoo all display, Ask
shows "zoom related search" links, designed to help users either narrow or expand a
search. Of course, Ask still serves up ads with its search results, but the search
company is putting the primary focus on free search results and not on sponsored
results.
6. Default Ask Site: You can choose one location from a list including the US,
France, Germany, Italy, Netherlands, Spain, and UK, or no default site.
Your results will vary depending on the default site.
The other setting option is similar to Yahoo's feature that lets users edit the search
tools. Here are the options Ask offers; you can select only the ones you want to
appear on your Ask main search page.
[Screenshot: Ask search tool options: Web, Images, Advanced Search, Bloglines, Toolbar, Unit Conversion, News, Currency Conversion, White Pages, Dictionary, Stocks.]
[Screenshot: Ask results page for the search [cardinal], including results for Cardinal Health and the Northern Cardinal.]
➢ A Smart Answers: Ask's best guess about what you want. Smart Answers
provides quick access to encyclopedias (Wikipedia, Houghton Mifflin, or
Columbia), weather, dictionary results, translations, conversions, etc. Note
that "other matches" will try to disambiguate a search term with multiple
meanings such as [cardinal]. This is an extremely useful way to find
information about commonplace topics, such as [Rwanda]:
[Screenshot: Smart Answers result for [Rwanda].]
➢ B Webpage Title & Description: the title and a brief summary of the website.
➢ C Binoculars Site Preview: Ask's Binoculars Site Previews are periodic screen
captures of the browser navigating a page. To view the site preview, users
should only move the mouse over the binoculars because clicking on the
binoculars takes you to the site. The mouseover is of a static image, so it is
safe to view, but I find it too small to be very useful beyond revealing the
general nature of a site.
➢ D Cached: a link to the version of the site stored by Ask with the date and
time the page was indexed.
➢ E Save: Ask offers this service for web and image searches. When users click
on a "save" link in a web search, Ask saves the title of the result, the url,
the description, the binoculars icon, and the query used to find that result.
For image searches, Ask saves the name and location of the picture, as well as
the query used to find the image. Everything saved is fully searchable, so all
saved content is easy to find again later. However, for the save feature to work
properly, users need to allow search history to be enabled (the default). If you
do not want Ask to save your search history, go to My Stuff | Settings and
uncheck "Record all my searches into my 'Search History.'"
➢ F Zoom Related Search: This is a popular feature retained from Teoma that
helps users either narrow or broaden a search "with possible alternative
search terms which appear on the right hand side of the Ask results page." 59
• Narrow Your Search: helps you to drill down into topics that are
specifically related to your search
• Expand Your Search: allows you to explore topics that are conceptually
related to your search
➢ G More Search Types: Selecting any of these other search options causes
Ask to search automatically for images, news stories, blog entries, etc., with
your search term(s).
59
Ask.com Site Features, "Zoom Related Search,"
<http://help.ask.com/en/docs/about/site features.shtml#relatedsearch> (14 November 2006).
Ask assumes as its default that multiple search terms are joined by the AND
operator, so that a search on the keywords [windows explorer] will find all the
webpages that contain both search terms.
Ask will not return any results if there is no webpage containing all the search
terms. Try this query to see what I mean:
Ask is not case sensitive. There does not appear to be anything you can do about
this.
Ask does not offer word stemming or truncation, i.e., searching for variations of
search terms. Ask searches for exactly the term as you enter it, e.g., a search for
[window] will not search for [windows].
Ask automatically clusters search results. Multiple hits from the same site are
indented and there is usually an option to see more results from a specific site.
Ask permits the use of the OR operator in simple search. The OR needs to be
capitalized.
Beyond the use of the OR operator in its simple search, Ask does not support
boolean search.
Searchers can delimit phrases using double-quotes. For example, if I search on:
[the last king of france]
without double-quotes, Ask will ignore the "the" and the "of" in its search. I noticed
that the results from this search are more relevant than the ones I received from
Google for the same query. If I enclose the same query in double-quotes, Ask will
search on exactly the phrase ["the last king of france"], and the first hit links to a site
that lists all the Kings of France, where Louis XVIII can be readily identified.
Enclosing searches in double-quotes is much more effective for finding precise
results than relying on automatic phrase searching.
Ask appears to ignore stop words outside double quotes only when other search
terms are used. These two searches will return identical results:
However, if I search for [the], Ask returns over 2 billion hits. If I add another search
term, e.g., [the france], that query is identical to searching for [france], so the stop
word is ignored. Nonetheless, it appears that if you search only for stop words, Ask
will find pages containing them all, e.g., [i a an the].
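The stop-word behavior described above can be modeled in a few lines of Python. This is only a sketch of what I observed, not Ask's actual implementation; the STOP_WORDS set and the effective_terms function are hypothetical names, and the word list is an assumed sample.

```python
# Hypothetical model of the observed behavior; not Ask's real code.
STOP_WORDS = {"a", "an", "i", "of", "the"}  # assumed sample list


def effective_terms(query):
    """Return the terms an unquoted query would actually search on."""
    terms = query.lower().split()
    keywords = [t for t in terms if t not in STOP_WORDS]
    # A query made up only of stop words is searched as entered.
    return keywords if keywords else terms


print(effective_terms("the france"))   # ['france']
print(effective_terms("i a an the"))   # ['i', 'a', 'an', 'the']
```

Under this model [the france] and [france] reduce to the same term list, while a query of nothing but stop words keeps all of them.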
Ask does not seem to like the plus sign (+) because it returns an error message
when I try to use it. By default Ask searches for all keywords except stop words.
However, there are many times when searchers need to exclude certain terms that
are commonly associated with a keyword but irrelevant to their search. That's where
the minus sign (-) comes in. Using the minus sign in front of a keyword ensures that
Ask excludes that term from the search. For example, the results for the search
["pearl harbor" -movie] are very different from the results for ["pearl harbor"].
Ask treats most punctuation marks the same way, as links in a search string. For
example, Ask handles a search for [c-span], [c.span], ["c span"], and [c?span]
basically the same way. However, a search for [cspan] with no space or mark is
treated differently.
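One way to picture this is to treat every punctuation mark as a break between terms. The Python sketch below is my own rough model of that behavior, not Ask's documented algorithm; the tokenize function is a hypothetical name.

```python
import re


def tokenize(query):
    """Split a query on anything that is not a letter or digit (rough model)."""
    return [t for t in re.split(r"[^0-9a-z]+", query.lower()) if t]


for q in ["c-span", "c.span", "c span", "c?span", "cspan"]:
    print(q, "->", tokenize(q))
```

The first four queries all reduce to the two linked terms c and span, while [cspan] stays a single term, which is consistent with the different results it returns.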
[shuttle site:www.nasa.gov] finds pages about the space shuttle at the NASA
website.
["bulletin officiel" site:fr] finds pages in the French top-level domain about official
bulletins.
["bulletin officiel" site:-fr] finds pages containing the phrase "bulletin officiel" that
are not in the French top-level domain. Note that the minus sign goes after the
site: syntax.
➢ title: or intitle: restricts the results to documents containing the keyword in the
title.
Advanced Web Search > Location of words or phrases > In page title
[title:amazon] finds all pages that include the word amazon in their title
[intitle:amazon jungle rainforest] finds all pages that include the words amazon,
jungle, and rainforest in their title. Using intitle: makes this search function the
same as Google's allintitle: query. Note: use a hyphen to search for phrases
using the intitle: syntax because the double-quotes do not work.
[-books title:amazon] finds all pages that contain amazon in the title and do not
contain the term books anywhere on the page. Note that you must put the
excluded term before the intitle: syntax.
[title:galileo site:-nasa.gov] finds all pages that contain the term galileo in the title
but are not at any nasa.gov website.
➢ inurl: restricts the results to documents containing the keyword in the url.
[inurl:nasa] finds all pages that include nasa anywhere in the url (address).
[inurl:nasa site:-gov] finds all pages that include nasa anywhere in the url of sites
that are not in the .gov top-level domain. Note that the minus sign goes after the
site: syntax.
[inurl:shuttle inurl:-nasa] finds all pages that include shuttle in the url but exclude
nasa from the url. Note that the minus sign goes after the inurl: syntax.
[inurl:nasa shuttle sts-90] finds all pages that include nasa, shuttle, and sts-90 in the
url of a site. Used this way, Ask's inurl: command functions the same as Google's
allinurl: command, that is, all terms must be in the url.
[-shuttle inurl:nasa] finds all pages with nasa in the url that do not include the term
shuttle anywhere on the page. Note that you must put the excluded term before
the inurl: syntax.
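The placement rules in the examples above (excluded keywords first, and the minus sign after the colon for site: and inurl:) are easy to get wrong by hand. Here is a minimal Python sketch that assembles query strings following those rules. The ask_query function and its parameter names are hypothetical; it only builds text to paste into the search box and does not contact Ask.

```python
def ask_query(keywords=(), exclude=(), title=None, inurl=None,
              not_inurl=None, site=None, not_site=None):
    """Assemble an Ask-style query string using the placement rules above."""
    parts = ["-" + term for term in exclude]   # excluded terms come first
    parts += list(keywords)
    if title:
        parts.append("title:" + title)
    if inurl:
        parts.append("inurl:" + inurl)
    if not_inurl:
        parts.append("inurl:-" + not_inurl)    # minus goes after the colon
    if site:
        parts.append("site:" + site)
    if not_site:
        parts.append("site:-" + not_site)      # minus goes after the colon
    return " ".join(parts)


print(ask_query(exclude=["books"], title="amazon"))
print(ask_query(title="galileo", not_site="nasa.gov"))
print(ask_query(inurl="shuttle", not_inurl="nasa"))
```

The three calls reproduce the queries [-books title:amazon], [title:galileo site:-nasa.gov], and [inurl:shuttle inurl:-nasa] from the examples.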
Blog Search: Ask is partnered with Bloglines, the most popular (and my favorite)
RSS feed reader, to create blog and RSS feed search. The blog search options are:
➢ sort by date, popularity, or relevance (which combines date and popularity).
➢ options to subscribe and/or post to a feed using several different applications.
[Screenshot: Ask blog search results showing a post about Google Maps mashups.]
➢ RSS Answers will display the three most recent entries at a blog. Obviously,
only a limited number of blogs work in RSS Answers, but it is a quick way to
see what is new at your favorite blog site. Here is an example of an RSS
Answers for John Battelle's Searchblog:
[Screenshot: Dictionary Search box with options to find the definition on ask.com, search multiple dictionaries on onelook.com, find more instances on bartleby.com, and browse by topic on allrefer.com.]
Maps: to map a US or Canadian location, search on the street address, city and
state or the word map and a location. Some international maps are now available.
See the section on maps for details.
News: links to news stories appear when a search term matches current news
stories. Sort news by date or relevance. A separate Ask News page is available at
http://news.ask.com/
Travel Shortcuts: To find arrival and departure information, flight delays, airport
status, and weather conditions at a US or Canadian airport, enter the airport's
three-letter code and the word airport. For example, to find information about
Baltimore-Washington International, enter [bwi airport].
White Page Search: search for phone numbers and addresses for people,
businesses, government offices, doctors, and schools in the US.
Web Answers: This option is the remnant of Ask Jeeves, that is, Ask's attempt to
provide direct answers to questions. Users may write a natural language question.
If an answer exists to a commonly asked question, such as the meaning of
'ontology' in this example, the Web Answers will appear under the definition.
[Screenshot: Ask results for [define ontology], with a Web Answers entry below the definition: "What is an Ontology? An ontology is a specification of a conceptualization...."]
Conversions: The Ask conversion tool will automatically convert world currency,
temperature, weight, length, area, and cooking/volume. Users can use the query
[convert amount x to amount y], e.g., [convert 200 iraqi dinars to pound sterling] or
try a natural language query such as [how many kilometers are in a nautical mile].
The conversion tool is very easy to use and impressive.
[Screenshot: Ask result for [convert 200 iraqi dinars to pound sterling]. Disclaimer: All data reflects mid-market rates updated every 30 minutes by XE.com.]
Image Search: the Ask image search uses "authoritativeness" to rank its results and
also accesses a proprietary image index. It is one of the best image search tools
available. The image search appears as one of the default search tools on the right-
hand side of the main search page. There is no advanced image search and no
special image search options. However, when you search for an image, zoom
related search terms to expand or narrow the search appear. If you select the "save"
option, this link will save the image to your personal "stuff," which can later be
accessed via http://mystuff.ask.com/. If you select "info" about an image, you will
then see detailed information about the image, including copyright information, and
its source homepage will appear in a frame in the bottom portion of the screen.
Number Search: Ask offers many types of number searches. The numbers Ask will
search for are:
➢ USPS tracking: enter USPS plus the tracking number with or without
spaces [usps 9999999999999999999999], or enter [usps tracking] to bring
up the USPS tracking query option.
➢ FEDEX tracking: enter FEDEX plus the tracking number [fedex
9999999999999999], or enter [fedex tracking] to bring up the FEDEX
tracking query option.
➢ DHL and Airborne Express tracking: enter DHL plus the tracking number
[DHL 9999999999], or enter [DHL tracking] to bring up the DHL tracking
query option.
➢ ZIP codes: enter a US ZIP code, either five or nine digits.
➢ VIN Information: to find information about a vehicle's history, search on its
17-character Vehicle Identification Number (VIN).
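Before pasting a number into the search box, it can help to check that it has the right shape. The Python sketch below tests the two fixed formats in the list above: a ZIP code of five or nine digits and a 17-character VIN. The patterns are my assumptions about those formats (including the modern VIN convention of never using the letters I, O, and Q), not Ask's own matching rules, and classify_number is a hypothetical name.

```python
import re

ZIP_RE = re.compile(r"^\d{5}(-?\d{4})?$")        # five or nine digits
VIN_RE = re.compile(r"^[A-HJ-NPR-Z0-9]{17}$")    # 17 characters, no I, O, or Q


def classify_number(text):
    """Guess which kind of number search a string is suited for."""
    s = text.strip().upper()
    if ZIP_RE.match(s):
        return "zip"
    if VIN_RE.match(s):
        return "vin"
    return "unknown"


print(classify_number("20755"))
print(classify_number("1HGCM82633A004352"))
```

The tracking numbers are left out on purpose because their lengths vary by carrier and service.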
For anyone who wants additional help in learning how to use the Internet more
effectively, many excellent resources are available for free via the Internet. Also,
there are more and more sites appearing to help new Internet users get started with
searching the web. Some help you choose the right search engine, others how to
formulate a query, and others are step-by-step tutorials.
The Internet Detective Tutorial is a free online tutorial that is part of the Intute: Virtual
Training Suite, a set of "free Internet tutorials to help you learn how to get the best
from the Web for your education and research ... [created by] a national team of
subject specialists based in universities and colleges across the UK." 60 Not familiar
with Intute? It is the newly evolved face of the Resource Discovery Network, a
carefully selected and evaluated set of academic research resources. The Internet
Detective tutorial focuses on how to evaluate Internet sources for quality and
authoritativeness, how to avoid wasting time on questionable websites and
searches, and how to avoid violating copyright laws and plagiarism. The tutorial
includes a set of practical exercises to try your Internet research skills. Although the
tutorial is aimed at university research, I highly recommend it for all readers. The
tutorial requires about an hour to complete, but it is designed so you can do it in
more than one sitting.
The following are tutorials, guides, and search-oriented sites available on the
Internet:
60
Intute: Virtual Training Suite, <http://www.vts.intute.ac.uk/> (12 September 2006).
This section, which first appeared in the 2006 edition, was born of the rapid growth
of both unconventional search techniques such as Google hacking and the wildfire
spreading of such tools as online maps. This year, I have added a new section on
Wikipedia and expanded the maps and mapping section.
"Google Hacking"
This topic has received a great deal of attention in the world of Internet search in the
past few years. While this activity is generically referred to as "Google hacking," 61
this is a double misnomer. First, to limit this practice to "Google" is a mistake
because many of these kinds of searches can be run using any search engine,
though they are clearly going to be most effective with a large, powerful search tool
that offers many search options, such as Google. Second, this is not hacking in the
sense that most people use the term, i.e., gaining access to a computer or data on a
computer illegally or without authorization. Nothing I am going to describe to you is
illegal, nor does it in any way involve accessing unauthorized data. "Google (or
search engine) hacking" involves using publicly available search engines to
access publicly available information that almost certainly was not intended
for public distribution. In short, it's using clever but legal techniques to find
information that doesn't belong on the public Internet.
To understand how this information has found its way into search engine databases,
we need a quick overview of how search engines work. Very simply, search engines
deploy "spiders" (aka crawlers or bots), which are actually software that "crawls"
websites looking for new sites, updating old ones, following links, and dumping all
that data into search engine databases where it is stored, sorted, and eventually
accessed by users. There is nothing illegal, immoral, or even fattening about search
61
Let's talk about the term hacking for a minute. A hacker is someone who is proficient at using or
programming a computer; in short, a computer expert. While there is no universal agreement on a
preferred term for someone engaged in illegal/illicit computer or network activity, I will call these
"black hat" hackers "malicious hackers" to distinguish them from "white hat" or neutral "hackers,"
meaning proficient or expert computer users.
engine spiders. Indeed, without them, we would have little or no idea what is "out
there" and available to us. The problem for webmasters is that it is their
responsibility to keep the search engine spiders out of any parts of their websites
they do not want to be accessed and indexed by a search engine. The spider is not
smart; it simply knows that if a "door" is open, it can-and will-go in and crawl
around. Webmasters must tell spiders "do not enter" (primarily) by the use of the
Robots Exclusion Protocol.
Robots Exclusion 62 comes in two basic flavors: either a metatag that can be inserted
into the HTML of a web page (usually used by an individual) or a Robots Exclusion
Protocol (robots.txt) file, a specially formatted file inserted by the website
administrator to tell the spider which parts of the website may and may not be
indexed by the spider. If a robots exclusion is missing or improperly configured, the
spider will index pages that the website owner may not have wished to have been
accessed.
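To make the robots.txt flavor concrete, here is a minimal Python sketch that uses the standard library's urllib.robotparser to show how a well-behaved spider reads an exclusion file. The file contents, the www.example.com address, and the ExampleBot name are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: keep every spider out of one directory and one file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /budget.xls
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for path in ["/index.html", "/private/staff.html", "/budget.xls"]:
    allowed = parser.can_fetch("ExampleBot", "http://www.example.com" + path)
    print(path, "may be crawled" if allowed else "is excluded")
```

The metatag flavor does the same job for a single page and is typically written as <meta name="robots" content="noindex, nofollow"> in the page's HTML head.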
That was then, this is now. You might think people would have learned, but judging
by the amount of "sensitive" information still available, many have not. Even though
search engines now routinely index many non-HTML file types, many individuals and
organizations still do not protect these files from the long reach of search engine
spiders. Furthermore, there are many ways for sensitive information to end up in
search engine databases. An improperly configured server, security holes, and
unpatched software can give search engine spiders unintended access. Quite
frankly, most of the problems boil down to one thing: human error, either through
ignorance or neglect.
What kinds of sensitive information can routinely be found using search engines?
The types of data most commonly discovered by Google hackers usually fall into
one of these categories:
62
For additional information, see: <http://www.robotstxt.org/wc/exclusion.html> (14 November 2006).
➢ search by file type 63, site type, and keyword: many organizations store
financial, inventory, personnel, etc., data in Excel spreadsheet format and
often mark the information "Confidential," so a Google hacker looking for
sensitive information about a company in South Africa might use a query such
as:
a similar but more specific search could involve use of a keyword such as
budget to search for Excel spreadsheets at Indian websites; for example:
[filetype:xls site:in budget]
➢ one of the most popular Google hacking techniques is to employ stock words
and phrases such as proprietary, confidential, not for distribution, do not
distribute, along with a search for specific file types, especially Excel
spreadsheets, Word documents, and PowerPoint briefings.
➢ search for files containing login, userid, and password information; note,
even at international sites, these terms usually appear in English. This type of
information is typically stored in spreadsheet format, so a typical search might
be:
[filetype:xls site:ru login]
63
It is critical that you handle all Microsoft file types on the Internet with extreme care. Never open a
Microsoft file type on the Internet. Instead, use one of the techniques described here.
➢ misconfigured web servers that list the content of directories not intended
to be on the web often offer a rich load of information to Google hackers; a
typical command to exploit this error is:
➢ numrange search: this is one of the least known and (formerly) one of the
scariest searches available through Google. Numrange uses two numbers
separated by two periods (dots) and no spaces. While "legitimate" numrange
users probably will want to indicate what the numbers mean, e.g., weight,
money, pixels, etc., Google does not require any special words or symbols to
run a successful numrange search; hence its power. Numrange can be used
with keywords and other Google search options, such as:
Now if you try these searches, you will see this message:
Not Found
The requested URL
/sorry/?continue=http://www.google.com/search%3Fnum%3D100%26hl%3Den%26lr%3D%26newwindow%3D1%26safe%3Doff%26q%3Dnumrange%25
was not found on this server.
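The query patterns in the list above are simple enough to assemble mechanically. The following Python sketch shows the two basic shapes: a file type plus site plus keyword query, and a numrange expression. The function names are hypothetical, and the code only builds strings; it does not send anything to a search engine.

```python
def filetype_query(filetype, site, *keywords):
    """Build a query of the form [filetype:xls site:in budget]."""
    return " ".join([f"filetype:{filetype}", f"site:{site}", *keywords])


def numrange(low, high):
    """Two numbers separated by two periods (dots) and no spaces."""
    return f"{low}..{high}"


print(filetype_query("xls", "in", "budget"))   # filetype:xls site:in budget
print(filetype_query("xls", "ru", "login"))    # filetype:xls site:ru login
print("pixels " + numrange(300, 500))          # pixels 300..500
```

The first two calls reproduce the example queries given above; the numrange line uses made-up numbers purely to show the syntax.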
Lest you think I am spilling the beans here, I assure you I am not revealing anything
that is not already widely known and used on the Internet both by legitimate and illicit
Google hackers. I am fully indebted to Johnny (johnny.ihackstuff) Long for many of
the "Google hacking" techniques 64 I have learned. Please use the information he
provides judiciously because many of the Google hacking techniques he discusses
are really designed for cracking, i.e., breaking into websites and servers. That is not
64
Johnny Long, Google Hacking for Penetration Testers, Syngress: Rockland, MA, 2004.
Also, a lot of the best information Johnny offers is for his site members only, and I do
not want to suggest you register there. Nonetheless, Johnny's briefing slides from
the 2004 Black Hat and Defcon 12 conferences are available at the official Black Hat
Briefings website and elsewhere (so much for registration). I have also found his
excellent white paper "The Google Hacker's Guide" at other sites that do not require
registration; there is another very good briefing on the dangers of Google by
Sebastian Wolfgarten.
There was a fair amount of sniping following Long's talks at Black Hat and Defcon,
mostly of the "big deal" variety, i.e., it is not "real" hacking and therefore not worthy
of presenting at Defcon. However, this is a very shortsighted point of view when one
considers the kinds of information that are so very easily available via Google, et al.
How would you like to see your Social Security Number, credit card number, and
that very handy little three digit number on the back of your credit card used for
"verification," bank routing information, mother's maiden name, etc., in the next
Google hacking briefing? Yes, all this kind of information is readily available (I
know... I've uncovered quite a bit of it myself). And this doesn't even take into
consideration all the other website weaknesses, such as multiple vulnerabilities with
IIS 6.0 Web-based administration, that can be exposed using Google.
Joe Barr, "Google Hacks are for Real," Newsforge.com, 6 August 2004
http://www.newsforge.com/article.pl?sid=04/08/05/1236234
Taken all together, the information Johnny Long has found using Google (he sticks
with this one search engine), combined with the techniques he details at his website,
provide an excellent tutorial on using Google to find stuff that really should not be on
the public Internet or easily accessible via a search query. Furthermore, the greatest
value of his efforts may not be in finding useful information but in demonstrating the
vulnerabilities of any given website and the necessity of taking strong measures to
ensure the information that gets into Google (as well as other search engine
databases and the Internet Archive) is only that which is intended.
Given the large amount of "sensitive" or private data readily available via Internet
search engines, people naturally wonder why companies and individuals do not
actively try to remove this information. Sometimes they do, but much still remains
accessible. Why? Getting private information "back" is harder than preventing
its disclosure in the first place. There are steps you can take to remove your data,
but as hacker Adrian Lamo says, "removing links after the fact isn't a very elegant
solution." Nor is it likely to be terribly effective. There are a number of reasons for
this, but what it boils down to is: it's very hard to put the genie back in the bottle.
First of all, you have to find out if your data is "out there" in order to ask search
engines to remove it and, clearly, many people and organizations are not playing
defense, that is, they are not routinely checking to see what is indexed from their
websites. Let's say you find something on Google that shouldn't be on the public
Internet. The first thing you have to do is to protect the sensitive pages on your site
or remove them entirely. However, even when you have removed those pages from
your website, this doesn't mean they can't be accessed. Once documents are
indexed in a search engine database, a publicly available copy of those documents
(usually referred to as the cache copy) may remain behind for days, weeks, even
months.
The next step is to ask Google to remove your sensitive pages from its database.
However, even when Google removes your data, there are literally hundreds of other
search engines around the world, and who knows what they have indexed from your
site. It will not be an easy task finding out. And I'll hazard a guess that not all of them
will be quite so accommodating as Google in removing pages.
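Playing defense means routinely running these kinds of searches against your own site before someone else does. As a minimal sketch, the Python below generates a checklist of queries for a single domain by pairing common office file types with the stock phrases mentioned earlier. The file type list, the phrase list, the www.example.com domain, and the audit_queries name are all my assumptions; adjust them to fit your own site.

```python
from itertools import product

FILE_TYPES = ["xls", "doc", "ppt", "pdf"]               # assumed list
STOCK_PHRASES = ["proprietary", "confidential",
                 '"not for distribution"', '"do not distribute"']


def audit_queries(domain):
    """Yield one query per file type and stock phrase for a single domain."""
    for filetype, phrase in product(FILE_TYPES, STOCK_PHRASES):
        yield f"site:{domain} filetype:{filetype} {phrase}"


queries = list(audit_queries("www.example.com"))
print(len(queries))   # 16
print(queries[0])     # site:www.example.com filetype:xls proprietary
```

Each query can then be run by hand in whichever search engines you care about; anything that turns up is a candidate for protection or removal.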
65
The Internet Archive is a non-profit organization that was founded to "build an 'Internet library,' with
the purpose of offering permanent access for researchers, historians, and scholars to historical
collections that exist in digital format. Based in San Francisco, the Internet Archive has been
harvesting the World Wide Web since 1996, to create one of the largest data collections in the world.
The Internet Archive's web archive contains over 100 terabytes of data, and the collection is growing
at a rate of 12 terabytes per month." <http://www.archive.org/> (14 November 2006).
Because of the vast amount of information available using public search engines, it's
relatively easy to find lots of interesting, amusing, shocking examples of sensitive
information. While this is all fine and good for entertaining yourself and impressing
your friends, what we are really after is useful, meaningful, and actionable
information. Put succinctly:
So how do you find "something" useful? While it isn't easy to do so, I can make
some suggestions that might help. The most valuable assets you have are your
subject matter knowledge and your creativity. Add these to a few search engine
strategies, and you can probably find many relevant and genuinely useful pieces of
information. The strategies I recommend for finding "something" rather than just
"anything" are:
You will have a lot more success searching for information within the Chinese
Ministry of Foreign Affairs [site:fmprc.gov.cn] than looking at all the sites indexed for
China [site:cn] or even for the government of China [site:gov.cn].
Add keywords
Here's where your subject matter knowledge and creativity really help. You are the
best source of information about what words are most likely to yield the best quality
and quantity of useful information. As a general rule, more uncommon words work
best (consider using unusual proper names).
Most of the best information found by Google hackers is not on webpages (HTML)
but in other types of files. Try all or most of the file types one at a time (these are not
the only searchable file types; check the particular search engine's documentation
(Help page) for others):
[Screenshot: search result naming Muhammad Aslam, .af ccTLD Manager, AFGNIC. Note: PowerPoint files are often also available in PDF, which is safer and easier to read.]
[site:www.companyname.com inurl:database] or
[site:www.companyname.com inurl:directory] or
[site:www.companyname.com inurl:index]
Then, look for keywords, such as companies, and move to the next level query:
[site:www.companyname.com inurl:companies]
You may be able to browse through the list of companies and get names,
addresses, phone numbers, etc.
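The two-step drill-down above can be generated rather than typed by hand. The following is a minimal sketch, assuming Python; the function name `inurl_queries` is hypothetical, `www.companyname.com` is the same placeholder host used in the text, and the output is only query text to paste into a search engine, not a call to any search API.

```python
def inurl_queries(host, keywords):
    """Return one 'site:HOST inurl:KEYWORD' query string per keyword."""
    return [f"site:{host} inurl:{kw}" for kw in keywords]


# First pass: look for likely listing pages on the site.
first_pass = inurl_queries("www.companyname.com",
                           ["database", "directory", "index"])

# Next level: a keyword spotted in the first-pass results.
next_level = inurl_queries("www.companyname.com", ["companies"])

for query in first_pass + next_level:
    print(query)
```

Keeping the keywords in a list makes it easy to repeat the same pass against a different host.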
I cannot emphasize strongly enough how important it is to use keyword search terms
that are in the native language of the entity you are researching. The Internet is
becoming much less dependent upon English, and sites written in languages that do
not use the Latin alphabet are growing by leaps and bounds. For example, a search
term written in the native language and encoding is far more likely to yield
interesting, useful results than the same word transliterated into English. Most good
quality search engines now correctly render non-Latin search terms regardless of
how the term is transliterated in English. A search on the name written in Arabic script returns very
different results than searching on [muhammad], [mohamet], [mohammed], etc.
Most search engine algorithms are now set up to "read" accented search terms
differently from those without accents. It's easy to test this by searching first for a
term without any diacritical marks and then the same word with the marks, e.g.,
resume vs. résumé.
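To run the with-and-without test systematically, both spellings of a term can be produced automatically. This is a minimal sketch, assuming Python and its standard `unicodedata` module; the function names are hypothetical. It only strips marks that Unicode treats as separate combining characters, so letters such as ø or ł, where the stroke is part of the character itself, are left untouched.

```python
import unicodedata


def strip_accents(term):
    """Return term with combining diacritical marks removed."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_variants(term):
    """Return the spellings worth searching: the term as given, plus the
    unaccented form when that differs."""
    stripped = strip_accents(term)
    return [term] if stripped == term else [term, stripped]


for variant in accent_variants("r\u00e9sum\u00e9"):
    print(variant)
```

Searching each variant in turn shows whether a given engine treats the accented and unaccented forms as the same word.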
1 Strictly taken, these are not diacritics but parts of the character.
66
I am constantly amazed by the frequency of misspelled words, URLs, file names, etc.,
I encounter on the Internet. By far, most appear to be simple mistakes, often made
by non-English speakers trying to cope with our confusing language. These
mistakes tend to propagate as users copy and paste them again and again, which is
what I believe happened here:
66
Fact Index, <http://www.fact-index.com/d/di/diacritic.html>
[Screenshot: Google results 1 - 10 of 10 from www.chinadaily.com.cn for "enlgish". Each hit is a Chinadaily.com.cn recruitment ("jobs") page whose URL contains the misspelled directory name "enlgish".]
Finally, the enormity of the task of finding meaningful and useful information on the
Internet is both daunting and comforting: daunting because we know we can only
scratch the surface of all the data and comforting because there is an almost
limitless pool of possibilities. I find it useful to keep the challenge in perspective by
recalling that a study published in 2000 showed "the sixty known, largest deep Web
sites contain data of about 750 terabytes (HTML-included basis) or roughly forty
times the size of the known surface Web." 67 In short, there is just so much data and
information available via the Internet that no institution, no government, no
computer, and certainly no individual can possibly grasp more than a small portion of
all there is.
67
Michael K. Bergman, "The Deep Web: Surfacing Hidden Value," BrightPlanet.com, July 2001,
<http://www.brightplanet.com/technology/deepweb.asp> (14 November 2006), Introduction.
This topic is new this year and expands upon the entries on Rollyo and Gigablast's
Custom Topic Search from last year's edition. During 2006 there was an explosion in
the number of custom search engines, including entries from Google, Yahoo, and
Live Search, so you know the powerhouses think this is worth a try. Whether this
trend catches on remains to be seen.
The phrase "custom search engine" is very misleading. None of these sites permits
users to create a new search engine. What each site does in its own way is to let
users customize an existing search engine to search specific sites in specific ways
and return results in a personalized fashion. Thus, a better name for these services
would be customizable searching, but that moniker is clearly unappealing. Just
remember that you are not creating a new search engine any more than customizing
a car is building a new automobile from the tires up.
Most of the custom search sites operate on a simple principle: they automate a long
"site" search, e.g., the search is equivalent to [keyword(s) AND (site 1 OR site 2 OR
site 3 ... OR site n)], where n stands for the maximum number of sites you are allowed
to search.
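The principle can be written out directly. This is a minimal sketch, assuming Python and Google-style `site:` syntax in which AND is implied by a space; the function name `multi_site_query` and the `max_sites` parameter are hypothetical stand-ins for whatever limit a particular service imposes.

```python
def multi_site_query(keywords, sites, max_sites=None):
    """Build 'keywords (site:a OR site:b OR ...)' from a list of sites.

    max_sites stands for the n in the text: the most sites a given
    service allows. None means no limit is checked.
    """
    if not sites:
        raise ValueError("at least one site is required")
    if max_sites is not None and len(sites) > max_sites:
        raise ValueError(f"{len(sites)} sites exceeds the limit of {max_sites}")
    site_clause = " OR ".join(f"site:{s}" for s in sites)
    return f"{keywords} ({site_clause})"


query = multi_site_query("election", ["cnn.com", "usatoday.com", "foxnews.com"])
print(query)
```

A custom search service in effect stores the parenthesized part once and lets you change only the keywords.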
Gigablast's Custom Topic Search was one of the first "create your own search
engines" to appear, although Gigablast's creator Matt Wells never claimed it was
anything other than a way to customize Gigablast. The beauty of the Gigablast CTS
is that it requires no software installation but is very, very simple HTML code, so
simple anyone can edit and understand it. No registration is required.
topic search is that you, and not some anonymous marketer, choose the sites you
want to search. This "tool" (for want of a better word) is amazingly easy to use and
powerful. As someone whose eyes glaze over at the mere sight of code, let me put
this in "user" language. If you are familiar with Google's site: syntax, imagine being
able to have a "canned" query that runs against up to 200 websites of your own
choosing and lets you run it whenever you like and use whatever keyword(s) you
want at any time. The query on Google would look something like this:
The problem with Google is that multiple site/domain searches are cumbersome at
best, and they quickly run up against Google's 32-word limit. Enter Matt Wells and
Gigablast. As the creator and sole proprietor of his own search engine, Matt has the
luxury of being able to add new options easily. I think CTS is his best innovation yet.
Even if you are as HTML-averse as I am, this code is so easy to edit that it's a piece
of cake. To make things even easier, I have done the basics for you. First, however,
I highly recommend you read through the Gigablast pages below on the concepts
behind CTS.
Now you're ready to take a look at, edit, and try the CTS. Copy and paste this HTML
code into an application such as Notepad.
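<html>
<head>
<title>Gigablast Custom Search</title>
</head>
<body>
Search News Websites
<!-- The form sends the query to Gigablast -->
<form method="post" action="http://www.gigablast.com/search">
<!-- q: the keyword(s) you type in -->
<input type="text" name="q" size="60">
<input type="submit" value="search" border="0">
<!-- sc: site clustering, 1 = ON, 0 = OFF -->
<input type="hidden" name="sc" value="1">
<!-- sites: the list of sites to search, separated by spaces -->
<input type="hidden" name="sites" value="cnn.com news.yahoo.com news.google.com usatoday.com foxnews.com">
</form>
</body>
</html>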
This is a bare bones version of the CTS code. Now you can play with the code and
make it into your own custom topic search page. I should mention that I set the "site
clustering" option to ON (<input type="hidden" name="sc" value="1">), but you can
reset it to OFF by changing 1 to 0. Once you save as an HTML file, all you have to
do to use it is to open the file in your browser, insert keyword(s), and go.
Obviously, you will want to add more sites to search (I only put in a few) and change
the topic to something of interest to you (I chose the rather bland News topic for
demonstration purposes). Also, you can enter sub-sites or more specific sites, such
as cnn.com/WORLD or dir.yahoo.com.
One thing to keep in mind is that you are searching Gigablast's database of
pages from these websites, not the sites themselves. The "work" that goes into
creating a CTS is mostly up front because once you create your list of sites, it is not
a complicated matter to add to or subtract from it. I can easily imagine creating a set
of these search forms on a variety of topics using existing bookmarks.
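For anyone who wants a set of these forms, the page can be generated from a list of sites instead of edited by hand. This is a minimal sketch, assuming Python; the function name `build_cts_page` is hypothetical, while the form fields (`q`, `sc`, `sites`), the Gigablast address, and the 200-site ceiling are taken from the text above.

```python
from html import escape

GIGABLAST_SEARCH_URL = "http://www.gigablast.com/search"  # from the form above
MAX_SITES = 200  # ceiling mentioned earlier in this section


def build_cts_page(topic, sites, clustering=True):
    """Return a complete HTML page holding one Gigablast CTS search form."""
    if len(sites) > MAX_SITES:
        raise ValueError(f"CTS accepts at most {MAX_SITES} sites")
    site_list = escape(" ".join(sites))   # sites are separated by spaces
    sc = "1" if clustering else "0"       # site clustering ON or OFF
    return "\n".join([
        "<html>",
        "<head>",
        f"<title>{escape(topic)}</title>",
        "</head>",
        "<body>",
        escape(topic),
        f'<form method="post" action="{GIGABLAST_SEARCH_URL}">',
        '<input type="text" name="q" size="60">',
        '<input type="submit" value="search">',
        f'<input type="hidden" name="sc" value="{sc}">',
        f'<input type="hidden" name="sites" value="{site_list}">',
        "</form>",
        "</body>",
        "</html>",
    ])


page = build_cts_page("Search News Websites",
                      ["cnn.com", "news.yahoo.com", "usatoday.com"])
print(page)
```

Saving the returned text under a name such as news.html (a hypothetical file name) and opening it in a browser gives the same kind of page as the hand-edited version; a list of bookmarked sites on another topic yields another form.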
Rollyo http://rollyo.com/
Rollyo stands for "Roll your own" search engine, meaning that you select the
sources you want to search. Rollyo is powered by Yahoo, so results will come from
Yahoo only. Rollyo lets users search up to 25 sites (not a huge number) and also try
out and use other people's "Searchrolls." In order to save, share, and use your
Searchrolls on other computers, you must register with an email address and a user-
created name and password.
Rollyo has some unusual features. For example, Rollyo permits users to upload their
bookmarks to create Searchrolls, edit someone else's Searchroll to make it your
own, keep your Searchrolls private or share them. Rollyo searches entire sites or
you can limit your search to a subdomain; however, you cannot limit your search to
directories within a site, e.g., in this case, everything after the slash is ignored:
security.news.com/library.
Rollyo has a nice little bookmarklet called Rollbar that "gives you access to all of
your Searchrolls wherever you are.
• Search any site you visit, from the same spot on your browser, without
having to dig around for every site's search page.
One of the most attractive features of Rollyo is the ability to share Searchrolls. Here
is an example of a Searchroll named "Muslim World Views." The sources searched
are on the left side:
[Screenshot: Rollyo search page for the "Muslim World Views" Searchroll, with the list of sources searched on the left.]
Rollyo has added blog and news searches (again, from Yahoo) to the results. Rollyo
makes it very easy to create, save, and edit custom searches.
"When we say we're letting people build a custom search engine, we mean the
whole thing: choosing which pages they want to include in their index, how the
content should be prioritized, whether others can contribute to the index, and
what the search results page will look like ... Here's how a Custom Search Engine
works: organizations or individuals simply go to www.google.com/coop/cse and
select the websites or pages they'd like to include in their search index. Users
can choose to restrict their search results to include only those pages and sites,
or they can give those pages and sites higher priority and ranking within the
larger Google index when people search their site. Users can then customize the
look, feel and functionality of their search engine." 68
68
Google Press Release, "The Power of Google Search is Now Customizable," 23 October 2006,
<http://www.google.com/press/annc/custom search.html> (17 January 2007).
After a telephone conference with Google's Marissa Mayer and the Google product
managers, search and Google expert John Battelle shared his comments, which I
think are excellent insights:
"While similar to Rollyo's innovative custom roll, the Google CSE adds the benefit
of allowing users to roll an unlimited number of sites together and display the
results on their own site, with personalized presentation. Someone on the call
described this as the fragmentation of search. The ability to build verticals will
allow experts to build specialized engines. But while the engines will be
individual, the collaborative element of tagging the domains encourages
communities of knowledge to create together. So while each will stand apart from
the amazing all-in-one answer box, the Custom Search will also allow a
thickening or deepening of intelligent tags in Co-op, which feeds the one box that
unites them all." <http://battellemedia.com/archives/003006.php>
Not surprisingly, you must have a Google account to use this service. Also,
Google Custom Search includes AdSense sponsored links alongside search results,
but government sites, non-profits, and educational institutions are exempt from the
advertising requirement. To see the Google Custom Search in action, take a look at
RealClimate.org's internal search: <http://www.realclimate.org/> Even better, check
out Customsearchguide, a directory of Google Custom Search Engines that others
have created but you can use. Here is an example of general science and
technology custom searches.
[Screenshot: Customsearchguide page "General Science And Technology Search Forms" (Home > Technology > General Science And Technology Search Forms), listing a Technology Search, which searches science and technology resources, and a Science and Engineering Search, which finds information on science and engineering.]
Customsearchguide http://www.customsearchguide.com/
According to its website, "The Alexa Web Search Platform provides public access to
the vast web crawl collected by Alexa Internet. Users can search and process
billions of documents and even create their own search engines using Alexa's
search and publication tools. Alexa provides compute and storage resources that
allow users to quickly process and store large amounts of web data. Users can view
the results of their processes interactively, transfer the results to their home
machine, or publish them as a new web service."
What exactly is Alexa offering to the user? In essence, Alexa gives the user, whether
an individual or organization, access to the same kind of powerful technology used
by Google, Yahoo, and Live Search. "Alexa spiders 4 billion to 5 billion pages a
month and archives 1 terabyte of data a day. The new platform will allow developers
to build their own search engines." The goal? To democratize web search by taking
it out of the hands of giants like Google and putting it into the hands of literally
anyone and everyone. The implications are enormous. And it appears it is a hit. In
fact, within a very short time of its initial opening, Alexa had to cut off new
applications temporarily because it was overloaded with customers wanting to sign
up for the new service, but the site soon reopened registration.
The Alexa Web Search Platform (AWSP) offers the user the capability to:
• define (search): AWSP has a much more robust set of search options, syntax,
and APIs than other search engines and also permits the use of stored
(canned) queries; the AWSP "data store" contains text, HTML, music, video,
images, and more types of files.
• process: users can search the entire Alexa data store and "are able to
process both the raw content and the metadata extracted by Alexa's internal
processes."
• publish: the output of the search can be anything from one result to an
entirely new vertical search engine, for example a new video search engine
or a new search engine for automotive parts. Quite literally, "by making use of
these utilities, a user might introduce a great new search service to the world
with nothing more than a home computer." 69
The costs are modest and are based on consumption (you pay for what you use and
not for a subscription or service contract):
Simply stated, Alexa/Amazon are "renting" their huge database ("data store") to any
and all takers for a remarkably reasonable price and, what is more, offering detailed
user support on how to maximize the effectiveness of this data to get the most out of
it. The customer is empowered to write his own program to run against the
Alexa/Amazon data, download the results (metadata), and even create his own
private search engine on their platform. Perhaps I am wrong, but this could be a
huge development, perhaps even a major change in the way we use the web.
69
Alexa Web Search Platform User Guide, Introduction: What Can I Do with the Platform?
<http://pages.alexa.com/awsp/docs/WebHelp/AWSP User Guide.htm> (17 January 2007).
70
There is one example of something similar, which came to my and some others' minds. If you are
familiar with IBM's WebFountain and its proprietary implementations for specific customers, you may
see some similarities. WebFountain also spidered the web and then let IBM's customers run queries
against that data set in more sophisticated ways than simple querying (something akin to
data mining). However, the problem with WebFountain and its progeny was that IBM had to write the
programs, and thereby hangs a tale of woe. For more, I recommend Jeff Dalton's blog entry on this
topic (I think he nails it). Jeff Dalton, "Alexa Web Search Platform: IBM WebFountain 2.0," Jeff's
Search Cafe, <http://searchcafe.blogspot.com/2005/12/alexa-web-search-platform-ibm.html>
Fagan Finder
The Fagan Finder site has been a boon to searchers for some time not so much
because of its basic interface, which is a good but unexceptional megasearch tool,
but because of the many other "useful tools" site creator Michael Fagan has made
available.
which of the search engines is capable of searching for that particular type of file.
Not every search engine on the list searches for every file type.
Also, keep in mind that the Fagan Finder file type search for XML is less precise
than going directly to Google or Yahoo and searching by filetype: in Google and by
originurlextension: in Yahoo. If you use one of these search engines, you can
specify that you only want to search for, say, those files that are .rss by entering the
query [filetype:rss] or [originurlextension:rss]. These queries will return only those
documents in RSS format, not those in XML or RDF. So I recommend using the
Fagan Finder search by file type for file types other than XML, RSS, or RDF.
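The two operators differ only in name, so a small lookup keeps them straight. This is a minimal sketch, assuming Python; the function and dictionary names are hypothetical, and only the two operators named in the text are included.

```python
# Operator each engine uses to restrict results to one file extension,
# as described in the text above.
EXTENSION_OPERATORS = {
    "google": "filetype",
    "yahoo": "originurlextension",
}


def extension_query(engine, extension, keywords=""):
    """Return a query limited to files with the given extension."""
    operator = EXTENSION_OPERATORS[engine.lower()]
    return f"{keywords} {operator}:{extension}".strip()


print(extension_query("Google", "rss"))
print(extension_query("Yahoo", "rss", "elections"))
```

An engine missing from the table raises a KeyError, which is a reminder to check that engine's help page for its own operator.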
URLinfo http://www.faganfinder.com/urlinfo/
The indefatigable Michael Fagan also introduced a beta version of a new tool,
URLinfo, in mid-2004. URLinfo fills a void created when AlltheWeb effectively shut
down and took with it the useful "url investigator." While Yahoo now offers Site
Explorer and Google a lame version [info:domain.com], Fagan's URLinfo provides
many more options for exploring a site. As with everything he does, Fagan has gone
all out with URLinfo, almost to the point of providing too many options! However, he
has done a smart thing in keeping the main URLinfo page simple, "hiding" the nearly
85 investigative tools in his toolkit behind a variety of tabs. I think URLinfo is
important and valuable enough to spend time looking at most of the options in some
detail.
Note the eleven tabs at the top, behind each of which is a range of investigatory
options. For help using URLinfo simply click on the dark blue [info] link on the far
right. The first step in using URLinfo is to enter a URL (address) in the search box at
the top of the page. Keep in mind that if you enter a URL in the search box and
simply hit return, you will be taken to that webpage, not to information about
it.
Entering a URL can prove to be more problematic than you might think because not
every URLinfo tool can handle the same format. For example, in the General tab, the
one most users are likely to use most frequently, you will get very different results
depending on the type of URL entered. For basic .com, .org, .net, .info, .biz, and .us
domains, Domain Tools is great. However, for any other top-level domain, you must
use Global Whois, and it will not search on anything but first-level domain names.
This means that neither Domain Tools nor Global Whois can look up
[www.duma.gov.ru]. Global Whois, however, will find first-level domains such as
[www.feb-web.ru]. This does not mean you cannot find information about
[www.duma.gov.ru].
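The routing rule just described can be captured in a few lines. This is a minimal sketch, assuming Python; the function name is hypothetical, and treating any name with more than one label in front of the top-level domain as unsupported is my reading of the paragraph above, not documented behavior of either tool.

```python
# Top-level domains the text says Domain Tools handles well.
DOMAIN_TOOLS_TLDS = {"com", "org", "net", "info", "biz", "us"}


def whois_tool_for(host):
    """Return the General-tab whois tool likely to handle host, or None."""
    labels = host.lower().rstrip(".").split(".")
    if labels and labels[0] == "www":
        labels = labels[1:]            # ignore a leading www
    if len(labels) != 2:
        return None                    # not a first-level name such as name.tld
    if labels[-1] in DOMAIN_TOOLS_TLDS:
        return "Domain Tools"
    return "Global Whois"


for host in ["www.feb-web.ru", "www.duma.gov.ru"]:
    print(host, "->", whois_tool_for(host))
```

A result of None means neither whois tool applies and one of the other General tab tools is the place to look.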
As you can see, you get lots of data about the Russian Duma website. Note that
there are many additional useful links from the Alexa page, including one to the
Internet Archive's Wayback machine.
[Screenshot: Alexa site information page for the Russian State Duma website, with Overview, Traffic Details, and Related Links sections and a "People who visit this page also visit" list.]
The Alexa database contains site statistics, contact information, similar pages, and
more.
Let's look at a different URL for the SurfWax results. What you are seeing are
"SurfWax SiteSnaps™, [which] count the number of links, images, words, and forms
on a page, shows the meta description tag, and extracts 'key points' and
'FocusWords.'" This is a very useful way to analyze a website without actually
visiting it, though the amount of information is considerably less for some sites than
others, cf. www.fateh.net.
[Screenshot: SurfWax SiteSnap for www.fateh.net, showing only a one-line description of the site as a Palestinian movement founded by Yasser Arafat.]
metaEUREKA
metaEUREKA shows information about the page (last modified date, page size),
meta information (description, keywords, author), web server information, and the
number of backlinks.
Furl
Furl is a collaborative bookmarking system. This tool allows you to see the
comments others have written about a webpage.
Del.icio.us
Del.icio.us is a collaborative bookmarking system. This tool allows you to see the
comments others have written about a webpage.
Gibeo
Gibeo allows anyone to annotate any part of a web page, and others can
comment on the annotation. Gibeo requires registration.
Semantic data extractor
The Semantic data extractor finds information about a page (metadata, page
outline) by looking at its HTML code.
The next tab is for Links. This is pretty straightforward. The first two links are to
Yahoo, the first for the link: command (links to a specific page) and the second for
the Yahoo Site Explorer or alternately linkdomain: command (links to a website). The
next is the Live Search (MSN) link: search, and then the Google link: command,
which no longer shows all links as it once did. Gigablast does not show all links to a
page, either.
The links from blogs section is a very useful service because it lets you check very quickly
whether a website is mentioned in a number of weblogs (I expect Technorati to
give the best results).
Blogpulse
Intelliseek's blog search (was not working when I tried it).
Bloglines
Backlinks from blogs known to Bloglines, an online RSS/Atom aggregator.
Blogdex is defunct.
Technorati
Backlinks from blogs known to the Technorati blog indexer. Each result is shown
with an extract containing the link.
Feedster
Backlinks from blogs known to the Feedster RSS/Atom search engine.
BlogDigger
Backlinks from blogs known to the BlogDigger RSS/Atom search engine.
Waypath
Backlinks from blogs known to the Waypath blog indexer. Each is listed with the
date that the link was first seen and an extract from the page. Unlike some other
backlinks tools, Waypath lists the permalinks rather than blog home pages.
Daypop
Backlinks from blogs and news websites known to the Daypop search engine.
BlogRolling
BlogRolling is a service for bloggers to include blogrolls (lists of blogs) on their
own blogs. This shows what users include the given site on their blogroll.
Popdex
Backlinks from blogs (as well as the date of linkage) known to the Popdex blog
indexer.
The Similar tab is not entirely self-explanatory. Alexa, UCmore, Furl, and Google all
try to show related or similar websites, though not in the same way. Alexa shows
'people who visit this page also visit...'; UCmore clusters related pages by topic;
Furl is a collaborative bookmarking tool, so it only shows pages bookmarked by the
same person (of dubious use); and Google's related pages is, in Fagan's and my
opinion, of poor quality. Google News will show related news articles, but only if the
original article has been indexed by Google News. The Waypath tool looks for blog
entries about a website, and Waypath is showing no links to http://www.google.com
and two hits on http://www.microsoft.com. There is obviously a problem with this
specific search.
The Cache tab is much more useful at this time. Fagan has done us all the great
service of bringing the search tools that cache webpages together so they can be
searched from one convenient interface. Also, URLinfo makes it possible to see
Google's cached pages without images, style sheets, or forms with Google
(plain). Openfind is an Asian search engine and does not yet have an English
version. I was unable to figure out how their caching works because of the language
barrier. For news and blogs Daypop caches each page it crawls. "Its cache is often
the most up-to-date copy of the page, and it shows the exact time that the copy was
made."
Here's the low-down on the other general cache tools at Fagan Finder:
Internet Archive
The Internet Archive has been crawling the web and caching pages since 1996.
The Wayback Machine allows you to view the copies made during any of those
crawls, and also to compare any two versions of the same page.
Google
When Google crawls the web, it stores a copy of each web page. This is the
most recent copy. This can also be used as a means of viewing some non-HTML
files converted to HTML.
Google (plain)
Google's stripped cache, with images, styles (style sheets), and forms removed.
Gigablast
Gigablast does not provide direct access to its cache. You must follow the link
labeled [archived copy]. Gigablast's cache shows the date on which the copy
was made.
Openfind
Openfind is an Asian search engine; their English version is under construction.
Spurl
Spurl is a collaborative online bookmarking tool. Whenever someone using Spurl
bookmarks a page, a cached copy is stored. So Spurl may contain many different
copies of the same page on different dates and times, which can be accessed
from a selection box at the top of any Spurl cached page.
IncyWincy
This is the cached version of a web page from when it was last crawled by
IncyWincy. That date is shown at the top of the page.
Scrub The Web
Cached version of the page from the Scrub The Web search engine.
Ay-Up
Cached version of the page from the Ay-Up search engine.
Objects Search
Cached version of the page from the Objects Search engine. Objects Search has
a small index, so don't expect every page to be cached. After using this tool,
follow the link below the page you want labeled 'cached.'
Search Spider
This is the cached version of a web page from when it was last crawled by
SearchSpider. Most pages appear to have been last cached during July 2003.
The Search section is pretty much self-explanatory, except that MSN searches Live
and Teoma searches Ask. Fagan explains the Blogs/Feeds tab very well for those
who are interested in searching weblogs and RSS or Atom news feeds. The
Translate tab simply sends your request to Fagan Finder's superb Translation
Wizard discussed in the online dictionary and translators' section. The Track and
Post tabs are in general not going to be useful for most of you in your work
environment. The Develop tab offers an excellent selection of web authoring
resources such as validation, editing, spelling, cacheability, and keyword analysis
tools. One tool users may not recognize and which could prove quite useful is
Traffic from Alexa. Here's Fagan's description:
users compare two sites and shows you "Where do people go" on the site. It's a gold
mine of data about the sites in Alexa's top 100,000; unfortunately, most of the sites I
wanted to research were not in that top group, so no statistics were available when a
site fell below the 100,000 threshold.
In case the Google PageRank tool confuses you, it normally requires users to
download and install the Google Toolbar. However, you can access the Google
PageRank option from URLinfo without the Google Toolbar. The results look rather
mysterious, but the PageRank is there. In the following example, AOL's home page
has a page rank of 9 (where 10 is the highest... and Google gets a 10 ranking, by the
way):
http://www.aol.com
PR Toolbar: 9
PR Actual: 9
Finally, under Misc you'll find the tools that didn't quite fit anywhere else. One word
of caution about BugMeNot: this is a service for sharing login information for
websites that require user registration and, as such, its ethics are questionable. I do
not recommend using it. It may also violate organizational Internet usage rules.
I think URLinfo will prove to be a very useful if not indispensable tool for researchers,
but I also think the key to using it effectively is not using every bell and whistle.
-------------------------------------------------------------
Wikipedia
Wikipedia http://en.wikipedia.org/
The 2007 edition is the first to include a separate section on and discussion of
Wikipedia and the entire "wiki" phenomenon. The extraordinary growth and success
of Wikipedia demand recognition and comment. Although the numbers change
constantly, in mid-2006, Wikipedia sites were the twelfth most visited Internet sites
among US properties, up over 300 percent from the previous year. 71 On March 1,
2006, Wikipedia reached one million articles, and "the site receives as many as
fourteen thousand hits per second." 72 Just what is the Wikipedia itself and the wiki
concept in general that have led to a level of success that is nothing short of
astounding? For an excellent overview, I turn to my colleague Diane White's article
from an internal publication many of you read, The WorthWhile Web. In the May
2006 edition, Diane wrote:
71
Safa Rashtchy, et al., "Silk Road: Solid Search Results Could Boost the Sector," PiperJaffray
Industry Note, 10 July 2006, available at John Battelle's Searchblog,
<http://battellemedia.com/archives/Rashtchy%20-%20Silk%20Road%200710.pdf> [PDF] (14
November 2006).
"In true Ouroborosian fashion, the Wikipedia defines itself as a 'multilingual Web-
based free-content encyclopedia ... written collaboratively by volunteers, allowing
most articles to be changed by anyone with access to a web browser and an Internet
connection.' It exists as a wiki, which again Wikipedia self-defines as 'a type of
website that allows anyone visiting the site to add, remove or otherwise edit all
content very quickly and easily, often without the need for registration.' Truly
collaboration to the extreme, wikis are the latest trend in open-ended community
involvement and public debate. But it also conjures fears of authority and validity run
amok, and general mischief and vandalism. Wikis are popping up everywhere; but
just what are they, and how did they become so ubiquitous? More to the point, can
they be trusted, or are they just the work of a few people with big egos and lots of
time? ... The term wiki is a shortened form of the Hawaiian language term wiki wiki,
which is commonly used as an adjective to denote something quick or fast. It is also
sometimes interpreted as the backronym for What I Know Is. The invention of the
wiki is credited to Ward Cunningham, author of the book, The Wiki Way (Addison-
Wesley Longman, March 2001, ISBN 0-201-71499-X). The first wiki, WikiWikiWeb,
was created in 1994 and installed on the web by Cunningham in 1995. 73
"Once begun, almost anyone can edit a wiki, often without actually registering to do
so. Wikis can be on any subject, on every subject, and in multiple languages. The
most famous wiki, Wikipedia, was begun in 2001, initially as part of a broader, peer-
reviewed project and later as a stand-alone, 'neutral point of view' product. Guided
from the beginning by Larry Sanger and Jimmy Wales, today it is available in over
100 languages, with over 1 million articles in the English edition alone ...
72
Stacy Schiff, "Can Wikipedia Conquer Expertise?" The New Yorker, 24 July 2006,
<http://www.newyorker.com/fact/content/articles/060731fa_fact> (14 November 2006).
73
"Wikipedia," Wikipedia: The Free Encyclopedia, <http://en.wikipedia.org/wiki/Wikipedia> (23
August 2006).
accuracy of its science entries.' 74 From there it has escalated, with refutations and
calls for retraction from Encyclopaedia Britannica and heated responses from Nature.
Wikipedia itself has steered clear of this particular fray; however, it does attempt to
respond to criticism and has a page on its site for common criticisms. It also
addresses issues such as copyright, vandalism, and authorship.
"So what's the bottom line? The same as it's always been. When performing
thorough research, be it Internet-based or otherwise, the onus is always on the
researcher to check sources, validity, and authority. The speed and relative ease at
which changes can be made to a wiki, while good for consensus correction and
corroboration, are not so good for measured and thoughtful debate. A number of
articles in Wikipedia are sourced, but many are not, and just because it's on the
Internet, does not mean it is true. In addition, merely because it's free does not
mean Wikipedia is more suspect and Britannica is more reliable. There is an
argument to be made for being so passionate about a topic that you feel the need to
share that passion with the world. But one man's passion is also another's conceit.
There is a counter to every argument, a rebuttal to every claim.
"Like it or not, wikis and wiki behaviors have entered the mainstream, just like blogs
and MySpace and the iPod. Love it or hate it, if you are involved in open source
research you need to know about wikis." 75
The Wikipedia Itself: The Good, the Bad, and the Dubious
As Diane White clearly indicates, there are many, many wikis now available on the
Internet, and their numbers continue to increase at present. I want to focus on
Wikipedia itself because it remains the center of the wiki universe and thus far
shows no signs of decline. Many Wikipedia critics mourn the decline of traditional
encyclopedias because they are thinking of an encyclopedia such as Britannica in its
current form, that is, "the most authoritative source of ... information and ideas," the
"definitive source of knowledge." 76 According to Tom Panelas, Britannica's Director
of Corporate Communications, "We can't cover as many things as they [Wikipedia]
do but we wouldn't even try to. What they do is very different from what we do. We
don't have an article on extreme ironing, and we shouldn't." 77
74
Jim Giles, "Internet Encyclopaedias Go Head to Head," Nature, 14 December 2005 (last updated
28 March 2006), <http://www.nature.com/news/2005/051212/full/438900a.html> (14 November
2006).
75
Diane White, "Wikis and the Wikipedia," The WorthWhile Web, May 2006,
<http://www.fggm.osis.gov/Worthwhile/archive/20060501.html>.
76
Paula Berinstein, "Wikipedia and Britannica: The Kids Are All Right (And So's the Old Man),"
Information Today, March 2006, <http://www.infotoday.com/searcher/mar06/berinstein.shtml> (11
September 2006).
77
Berinstein.
Wikipedia relies almost entirely upon individual users to create, edit, maintain, and
often argue about its entries. It is free and carries no advertising; it is a nonprofit and
has a tiny staff.
➢ Its content is "open," that is, almost any topic can be included; traditional
encyclopedias generally do not include "how-to" instructions ("How to draw a
diagram with Microsoft Word"), new or transient popular culture ("24: The TV
Series"), or breaking stories ("Jon Benet Ramsey").
78
In what must be one of the most profound examples of friendship since Damon and Pythias, Boston
actually traveled voluntarily with his wife to New South Wales to "keep Palmer company." Anyone
who has read about a sea voyage from England to Australia at that time knows the trip in and of itself
was a major sacrifice. Robert Hughes, The Fatal Shore (New York: Vintage Books, 1988), 180.
79
Hughes, 180.
➢ Wikipedias are available in 229 languages. These are not always just
translations of the English language Wikipedia but often contain their own
content.
➢ In 2006 comedian Stephen Colbert's amusing rant against "wikiality" and
"truthiness," i.e., that reality and truth are what the most people say they are,
and his charge to his viewers to change a Wikipedia article on African
elephants caused the entire site to go down temporarily. His point is well
taken: if enough Wikipedians agree that the earth is flat, then the Wikipedia
will reflect that "wikiality." While that is an absurd example, people vehemently
(and often violently) disagree over the most basic topics (try to think of
anything that isn't controversial).
➢ Wikipedia "does not favor the Ph.D. over the well-read fifteen year old." 80
While the democratization of knowledge and information has a certain appeal,
the fact that Wikipedia pages dealing with policies, rules, administration,
coordination, and other metadata now comprise thirty percent of Wikipedia
indicates that the free-for-all nature of Wikipedia is giving ground to the harsh
reality of the need for "crowd control." There is a fine line between democracy
and mob rule.
➢ There is no "weighting" of the relative significance of any topic: compare the
Wikipedia entries on the Beatles v. Boethius. Judged by sheer quantity,
articles on popular culture far exceed those on traditional scholarly topics.
Given its potentially limitless size, this may not be a drawback, but if
everything from "The Simpsons" to "The Nicomachean Ethics" is on an equal
footing, then aren't we back to the Colbert criticism that all objective
standards are obliterated?
➢ Some critics maintain that emergent enterprises such as Wikipedia reflect an
"online collectivism" that leads to a kind of groupthink and produces poor
quality results that both appeal to and are a product of the lowest common
80
Schiff, "Can Wikipedia Conquer Expertise?"
denominator. For more on this topic, read Jaron Lanier's now famous think
piece "Digital Maoism" and the many responses to it on Edge.org. 81
All this being said, nothing is going to stop people from using Wikipedia as a
reference and, in many cases, as their primary source of information. Some search
engines (Ask, for example) now proudly display Wikipedia responses at the top of
the results list. Most will return Wikipedia links near the top. The best advice I can
give you vis-a-vis Wikipedia and related community generated resources is as
follows:
➢ Use multiple sources: Do not as a rule rely on Wikipedia as your sole
reference or source of information. Any Wikipedia entry that is not well
sourced should raise a red flag.
➢ Trust but Verify: Look for verification of Wikipedia information from sources
such as traditional references that have been through editorial review:
encyclopedias, dictionaries, scholarly (peer-reviewed) publications, university
websites, books, etc.
➢ Follow those links: The best thing about Wikipedia in my opinion is the
external links from entries; with the virtual demise of web directories,
Wikipedia fills that void by supplying excellent links to what are often the best
websites on a topic.
➢ Be skeptical: The more controversial the topic, the more skepticism you need
to apply to the Wikipedia entry. For example, the article "Asteroid" is quite well
done, but there isn't quite the controversy about that topic that there is about,
say, Hezbollah, an article that was locked because of vandalism.
Wikipedia has an internal search option, but as any Wikipedia user knows, it is not
the best way to search Wikipedia. First, unlike virtually every search engine on the
web, its default is OR not AND, meaning it searches for ANY of the terms you enter.
To search Wikipedia content you are better off using a separate search engine,
either one of the major search engines or a specialty search tool designed to search
Wikipedia.
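To see why an OR default matters, consider the following minimal Python sketch. The three sample titles and the function names are invented for illustration; this is not how Wikipedia's internal search is implemented, only a picture of the difference between matching ANY term and matching ALL terms.

```python
# Hypothetical mini-index of article titles (illustrative data only).
TITLES = [
    "Leonids meteor shower",
    "Meteor",
    "Shower (bathing)",
]


def matches(title, terms, mode):
    """Return True if the title satisfies the query under OR or AND logic."""
    text = title.lower()
    hits = [term.lower() in text for term in terms]
    # OR: one matching term is enough. AND: every term must match.
    return any(hits) if mode == "OR" else all(hits)


def search(terms, mode):
    """Return every sample title that matches the terms under the given mode."""
    return [title for title in TITLES if matches(title, terms, mode)]


if __name__ == "__main__":
    query = ["meteor", "shower"]
    print("OR :", search(query, "OR"))   # any term is enough
    print("AND:", search(query, "AND"))  # every term is required
```

With an OR default, every title containing either word comes back, so a two-word query returns more results, not fewer. With AND, only titles containing both words survive, which is what most users expect from a search engine.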
81
Jaron Lanier, "Digital Maoism," Edge.org, June 2006,
<http://www.edge.org/3rd_culture/lanier06/lanier06_index.html> (14 November 2006).
82
"Britannica Rips Nature Magazine on Accuracy Study," Encyclopedia Britannica Corporate
Website, 24 March 2006, <http://corporate.britannica.com/press/releases/nature.html> (14 November
2006).
[Screenshot: Wikipedia external link search for a wildcard domain query, showing the
first 50 results; each result is an external URL followed by the Wikipedia page it is
linked from. The page notes: A wildcard may be used, at the start of the name only,
for example "*.wikipedia.org".]
There are literally hundreds of results for this query. However, you can limit your
search to a specific page [leonid.arc.nasa.gov/meteor.html], which in this case
returns one result:
[Screenshot: Wikipedia external link search for leonid.arc.nasa.gov/meteor.html,
showing one result: http://leonid.arc.nasa.gov/meteor.html, linked from Leonids.]
This is a very useful tool if you need to find out what pages in Wikipedia link to a
specific site. Be sure to follow these basic rules for using this feature:
1. a full domain name, e.g., [www.nasa.gov] (this will only find links to this
specific domain) OR
2. a partial domain name with a wildcard, e.g., [*.nasa.gov] (this will find links to
any site at nasa.gov, such as ase.arc.nasa.gov) OR
3. a specific page, e.g., [leonid.arc.nasa.gov/meteor.html] (this will find links to
that page only).
Some Wikipedias other than the English language version have a similar page. For
example, the German language Wikipedia link search page is:
<http://de.wikipedia.org/wiki/Spezial:Linksearch>.
If you use the English Wikipedia link below and substitute the appropriate language
digraph for the "en," you can find these non-English language link search pages.
See this page <http://meta.wikimedia.org/wiki/List_of_Wikipedias> for all the
Wikipedias and the appropriate digraph.
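For readers who want to build link search addresses directly, the following Python sketch may help. It rests on several assumptions not stated in the text: that the page name Special:LinkSearch is accepted on every language edition, that the query is passed in a parameter named target, and that the function names are mine. It also enforces the search page's own note that a wildcard may be used at the start of the name only.

```python
from urllib.parse import urlencode


def check_target(target):
    """Accept a wildcard only as a single leading '*.' (start of the name only)."""
    if "*" in target and not (target.startswith("*.") and target.count("*") == 1):
        raise ValueError("a wildcard may be used at the start of the name only")
    return target


def link_search_url(target, lang="en"):
    """Build a link search URL for one language edition.

    Assumptions: 'Special:LinkSearch' works on every edition and the
    query parameter is named 'target'. 'lang' is the language digraph.
    """
    base = "http://" + lang + ".wikipedia.org/wiki/Special:LinkSearch"
    query = urlencode({"target": check_target(target)})
    return base + "?" + query


if __name__ == "__main__":
    print(link_search_url("www.nasa.gov"))             # full domain name
    print(link_search_url("*.nasa.gov"))               # wildcard at the start
    print(link_search_url("*.nasa.gov", lang="de"))    # German edition
```

Note that the asterisk is percent-encoded in the resulting address; the browser and the site treat the encoded and unencoded forms the same way.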
Qwika http://www.qwika.com/
Qwika indexes English, German, French, Japanese, Italian, Dutch, Portuguese,
Spanish, Greek, Korean, Chinese and Russian wikis; the original content is
combined with machine translated content to/from English. However, when
searching for a non-Latin term, Qwika will only find that term in the international
Wikipedia, not the English language Wikipedia, even if it is there, e.g., Σμύρνη.
LuMriX http://wiki.lumrix.net/
LuMriX uses AJAX technology and searches English, German, Japanese,
French, Polish, Italian, Swedish, Dutch, Portuguese, Russian, Danish, Spanish,
Finnish, Norwegian, Hungarian, Turkish, and Chinese Wikipedias. However,
when searching for a non-Latin term, LuMriX will only find that term in the
international Wikipedia, not the English language Wikipedia, even if it is there, e.g.,
çeşme.
[Screenshot: clustered Wikipedia search results for the query "plutarch," with topic
clusters (e.g., Alexander, Cicero, Numa Pompilius) in the left column and article
excerpts for Plutarch and Plutarch of Eretria on the right.]
Wikiseek http://wikiseek.com/
Launched in early 2007, Wikiseek was created with the assistance of Wikipedia,
although it is not a part of Wikipedia. "The contents of Wikiseek are restricted to
Wikipedia pages and only those sites which are referenced within Wikipedia,
making it an authoritative source of information less subject to spam and SEO
schemes. Wikiseek utilizes Searchme's category refinement technology,
providing suggested search refinements based on user tagging and
categorization within Wikipedia, making results more relevant than conventional
search engines." <http://www.wikiseek.com/> Wikiseek uses AJAX technology to
create changing "tag clouds" of possible terms as you type.
This is a good way to find articles within English language Wikipedia and to
search sites referenced in Wikipedia, but it is by no means a substitute for a
general search engine. Results from Wikipedia are identified by the W icon. The
tag cloud that appears at the top of each successful search is designed to show
related categories to help users either narrow or broaden a search. Keep in mind
these are user-generated tags, so many of them, e.g., "Japanese terms," do not
correspond to Wikipedia categories.
[Screenshot: Wikiseek results for the query "tsunami," showing a tag cloud of related
categories above results from Wikipedia (e.g., "Tsunami warning system") and from
sites referenced in Wikipedia, with sponsored links at right.]
The drawbacks to Wikiseek are that it only searches the English language
version of Wikipedia and it cannot parse non-Latin languages. It touts itself as
an "authoritative source of information less subject to spam and SEO schemes,"
but a search for [viagra] will quickly prove it is no better (in fact worse) than the
major search engines in filtering spam. There are no preferences to change the
number of results, for example, or to limit the search only to Wikipedia or only to
links, but since Wikiseek is still in Beta, these features may appear later.
Wikiseek also offers a Firefox plug-in to add Wikiseek to the Wikipedia search
form on all Wikipedia pages.
WikiWax http://www.wikiwax.com/
WikiWax also uses "Look Ahead" AJAX technology to show very extensive lists
of dynamically generated related terms. However, WikiWax cannot parse non-Latin
search terms, e.g., Σμύρνη.
[Screenshot: WikiWax, "your quick index to Wikipedia," showing the look-ahead list for
the query "pluto," with suggested entries such as Pluto (god), Pluto (manga),
Pluto (mythology), Pluto (planet), Pluto Express, Pluto Kuiper Express, PLUTO reactor,
and Pluto, Project.]
You can further restrict the search to Wikipedia by clicking on "More from this site,"
which is an excellent way to search Wikipedia using Yahoo:
[Screenshot: Yahoo search results for the query "internet" restricted to
en.wikipedia.org, with the Wikipedia article "Internet" as the top result.]
You can also use the site: syntax to search just the Wikipedia (or Wikipedias, if you
like) in:
➢ Yahoo http://search.yahoo.com/
➢ Google http://www.google.com/
➢ Ask http://www.ask.com/
➢ A9 http://a9.com/
➢ Gigablast http://www.gigablast.com/
➢ Exalead http://www.exalead.com/search
➢ Clusty (site: and host: are interchangeable, but Clusty has a special Wikipedia
search option) http://clusty.com/
[Screenshot: search engine result "Çeşme - Vikipedi" from tr.wikipedia.org, the Turkish
language Wikipedia.]
Search Tip:
To search all Wikipedias:
[site:wikipedia.org]
To search a specific language Wikipedia:
site:DIGRAPH.wikipedia.org, e.g.,
[site:de.wikipedia.org nordafrika]
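The search tip can be captured in a few lines of Python for anyone who assembles queries programmatically. This is only a sketch: the function name and its parameters are my own, and it produces the query text to paste into a search engine that supports the site: syntax, nothing more.

```python
def wikipedia_site_query(terms="", lang=None):
    """Build a site:-restricted query for all Wikipedias or one language edition.

    'lang' is the language digraph (e.g., 'de'); None searches every edition.
    """
    host = lang + ".wikipedia.org" if lang else "wikipedia.org"
    # strip() drops the trailing space when no search terms are given
    return ("site:" + host + " " + terms).strip()


if __name__ == "__main__":
    print(wikipedia_site_query("nordafrika", lang="de"))  # one edition
    print(wikipedia_site_query("tsunami"))                # all Wikipedias
```

Leaving out the digraph searches every language edition at once, which is handy when you do not know in advance which Wikipedia covers a term.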