Internet in TRB EXAM
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite (TCP/IP) to serve
several billion users worldwide. It is a network of networks that consists of millions of private, public, academic, business, and
government networks, of local to global scope, that are linked by a broad array of electronic, wireless and optical networking
technologies. The Internet carries an extensive range of information resources and services, such as the interlinked hypertext documents of the World Wide Web (WWW), the infrastructure to support email, and peer-to-peer networks.
Most traditional communications media including telephone, music, film, and television are being reshaped or redefined by the
Internet, giving birth to new services such as voice over Internet Protocol (VoIP) and Internet Protocol television (IPTV). Newspaper,
book and other print publishing are adapting to website technology, or are reshaped into blogging and web feeds. The Internet has
enabled and accelerated new forms of human interactions through instant messaging, Internet forums, and social
networking. Online shopping has boomed both for major retail outlets and small artisans and traders. Business-to-
business and financial services on the Internet affect supply chains across entire industries.
The origins of the Internet reach back to research commissioned by the United States government in the 1960s to build robust, fault-
tolerant communication via computer networks. While this work, together with work in the United Kingdom and France, led to
important precursor networks, they were not the Internet. There is no consensus on the exact date when the modern Internet came into being, but sometime in the early to mid-1980s is considered reasonable.
The funding of a new U.S. backbone by the National Science Foundation in the 1980s, as well as private funding for other
commercial backbones, led to worldwide participation in the development of new networking technologies, and the merger of many
networks. Though the Internet has been widely used by academia since the 1980s, the commercialization of what was by the 1990s
an international network resulted in its popularization and incorporation into virtually every aspect of modern human life. As of June
2012, more than 2.4 billion people—over a third of the world's human population—have used the services of the Internet.
The Internet has no centralized governance in either technological implementation or policies for access and usage; each
constituent network sets its own policies. Only the overarching definitions of the two principal name spaces in the Internet,
the Internet Protocol address space and the Domain Name System, are directed by a maintainer organization, the Internet
Corporation for Assigned Names and Numbers (ICANN). The technical underpinning and standardization of the core protocols
(IPv4 and IPv6) is an activity of the Internet Engineering Task Force (IETF), a non-profit organization of loosely affiliated
international participants that anyone may associate with by contributing technical expertise.
Contents
1 Terminology
2 History
3 Technology
o 3.1 Protocols
o 3.2 Routing
o 3.3 General structure
4 Governance
5 Modern uses
6 Services
o 6.1 World Wide Web
o 6.2 Communication
o 6.3 Data transfer
7 Access
8 Users
9 Social impact
o 9.1 Electronic business
o 9.2 Telecommuting
o 9.3 Crowdsourcing
o 9.4 Politics and political revolutions
o 9.5 Philanthropy
o 9.6 Censorship
10 See also
11 References
12 External links
o 12.1 Organizations
Terminology
The Internet Messenger by Buky Schwartz in Holon
The Internet, referring to the specific global system of interconnected IP networks, is a proper noun and written with an initial capital
letter. In the media and common use it is often not capitalized, viz. the internet. Some guides specify that the word should be
capitalized when used as a noun, but not capitalized when used as a verb or an adjective.[3] The Internet is also often referred to
as the Net.
Historically the word internet was used, uncapitalized, as early as 1883 as a verb and adjective to refer to interconnected motions.
Starting in the early 1970s the term internet was used as a shorthand form of the technical term internetwork, the result of
interconnecting computer networks with special gateways or routers. It was also used as a verb meaning to connect together, especially for networks.
The terms Internet and World Wide Web are often used interchangeably in everyday speech; it is common to speak of "going on the
Internet" when invoking a web browser to view web pages. However, the Internet is a particular global computer network connecting
millions of computing devices; the World Wide Web is just one of many services running on the Internet. The Web is a collection of
interconnected documents (web pages) and other web resources, linked by hyperlinks and URLs.[6] In addition to the Web, a
multitude of other services are implemented over the Internet, including e-mail, file transfer, remote computer control, newsgroups,
and online games. All of these services can be implemented on any intranet, accessible to network users.
The term Interweb is a portmanteau of Internet and World Wide Web typically used sarcastically to parody a technically unsavvy
user.[7]
History
Professor Leonard Kleinrock with the first ARPANET Interface Message Processor at UCLA
Main articles: History of the Internet and History of the World Wide Web
Research into packet switching started in the early 1960s, and packet switched networks such as the Mark I at NPL in the UK,[8] ARPANET, CYCLADES,[9][10] Merit Network,[11] Tymnet, and Telenet were developed in the late 1960s and early 1970s using a variety of protocols. The ARPANET in particular led to the development of protocols for internetworking, in which multiple separate networks could be joined into a network of networks.
The first two nodes of what would become the ARPANET were interconnected between Leonard Kleinrock's Network Measurement
Center at UCLA's School of Engineering and Applied Science and Douglas Engelbart's NLS system at SRI International (SRI)
in Menlo Park, California, on 29 October 1969.[12] The third site on the ARPANET was the Culler-Fried Interactive Mathematics
center at the University of California at Santa Barbara, and the fourth was the University of Utah Graphics Department. In an early
sign of future growth, there were already fifteen sites connected to the young ARPANET by the end of 1971.[13][14] These early years
were documented in the 1972 film Computer Networks: The Heralds of Resource Sharing.
Early international collaborations on ARPANET were sparse. For various political reasons, European developers were concerned
with developing the X.25 networks.[15] Notable exceptions were the Norwegian Seismic Array (NORSAR) in June 1973,[16] followed in
1973 by Sweden with satellite links to the Tanum Earth Station and Peter T. Kirstein's research group in the UK, initially at
the Institute of Computer Science, University of London and later at University College London.
In December 1974, RFC 675 – Specification of Internet Transmission Control Program, by Vinton Cerf, Yogen Dalal, and Carl
Sunshine, used the term internet as a shorthand for internetworking, and later RFCs repeat this use.[17] Access to the ARPANET was expanded in 1981 when the National Science Foundation (NSF) developed the Computer Science Network (CSNET). In 1982, the Internet Protocol Suite (TCP/IP) was standardized, and the concept of a world-wide network of fully interconnected TCP/IP networks called the Internet was introduced.
TCP/IP network access expanded again in 1986 when the National Science Foundation Network (NSFNET) provided access
to supercomputer sites in the United States from research and education organizations, first at 56 kbit/s and later at 1.5 Mbit/s and
45 Mbit/s.[18] Commercial Internet service providers (ISPs) began to emerge in the late 1980s and early 1990s. The ARPANET was
decommissioned in 1990. The Internet was commercialized in 1995 when NSFNET was decommissioned, removing the last
restrictions on the use of the Internet to carry commercial traffic.[19] The Internet started a rapid expansion to Europe and Australia in
the mid to late 1980s[20][21] and to Asia in the late 1980s and early 1990s.[22]
Since the mid-1990s the Internet has had a tremendous impact on culture and commerce, including the rise of near instant
communication by email, instant messaging, Voice over Internet Protocol (VoIP) "phone calls", two-way interactive video calls, and
the World Wide Web[23] with its discussion forums, blogs, social networking, and online shopping sites. Increasing amounts of data are transmitted at higher and higher speeds over fiber optic networks operating at 1 Gbit/s, 10 Gbit/s, or more.
The Internet continues to grow, driven by ever greater amounts of online information and knowledge, commerce, entertainment
and social networking.[26] During the late 1990s, it was estimated that traffic on the public Internet grew by 100 percent per year,
while the mean annual growth in the number of Internet users was thought to be between 20% and 50%.[27] This growth is often
attributed to the lack of central administration, which allows organic growth of the network, as well as the non-proprietary open
nature of the Internet protocols, which encourages vendor interoperability and prevents any one company from exerting too much
control over the network.[28] As of 31 March 2011, the estimated total number of Internet users was 2.095 billion (30.2% of world
population).[29] It is estimated that in 1993 the Internet carried only 1% of the information flowing through two-way
telecommunication, by 2000 this figure had grown to 51%, and by 2007 more than 97% of all telecommunicated information was carried over the Internet.
Technology
Protocols
Main article: Internet protocol suite
As the user data is processed down through the protocol stack, each layer adds an encapsulation at the
sending host. Data is transmitted "over the wire" at the link level, left to right. The encapsulation stack
procedure is reversed by the receiving host. Intermediate relays remove and add a new link
encapsulation for retransmission, and inspect the IP layer for routing purposes.
The Internet protocol suite, by layer:
Application layer: DHCP, DHCPv6, DNS, FTP, HTTP, IMAP, IRC, LDAP, MGCP, NNTP, BGP, NTP, POP, RPC, RTP, RTSP, RIP, SIP, SMTP, SNMP, SOCKS, SSH, Telnet, TLS/SSL, XMPP, and others
Transport layer: TCP, UDP, DCCP, SCTP, RSVP, and others
Internet layer: IP (IPv4, IPv6), ICMP, ICMPv6, ECN, IGMP, IPsec, and others
Link layer: ARP/InARP, NDP, OSPF, tunnels (e.g., L2TP), PPP, Ethernet, DSL, ISDN, FDDI, DOCSIS, and others
The communications infrastructure of the Internet consists of its hardware components and a system of software layers that control
various aspects of the architecture. While the hardware can often be used to support other software systems, it is the design and the
rigorous standardization process of the software architecture that characterizes the Internet and provides the foundation for its
scalability and success. The responsibility for the architectural design of the Internet software systems has been delegated to
the Internet Engineering Task Force (IETF).[31] The IETF conducts standard-setting work groups, open to any individual, about the
various aspects of Internet architecture. Resulting discussions and final standards are published in a series of publications, each
called a Request for Comments (RFC), freely available on the IETF web site.
The principal methods of networking that enable the Internet are contained in specially designated RFCs that constitute the Internet Standards. Other less rigorous documents are simply informative, experimental, or historical, or document the best current practices (BCP) when implementing Internet technologies. The Internet Standards describe a framework known as the Internet protocol suite, a model architecture that divides methods into a layered system of protocols (RFC 1122, RFC 1123). The layers correspond to the environment or scope in which their
services operate. At the top is the application layer, the space for the application-specific networking methods used in software
applications, e.g., a web browser program uses the client-server application model and many file-sharing systems use a peer-to-
peer paradigm. Below this top layer, the transport layer connects applications on different hosts via the network with appropriate
data exchange methods. Underlying these layers are the core networking technologies, consisting of two layers.
The internet layer enables computers to identify and locate each other via Internet Protocol (IP) addresses, and allows them to
connect to one another via intermediate (transit) networks. Last, at the bottom of the architecture, is a software layer, the link layer,
that provides connectivity between hosts on the same local network link, such as a local area network (LAN) or a dial-up connection.
The model, also known as TCP/IP, is designed to be independent of the underlying hardware, which the model therefore does not
concern itself with in any detail. Other models have been developed, such as the Open Systems Interconnection (OSI) model, but
they are not compatible in the details of description or implementation; many similarities exist, and the TCP/IP protocols are usually included in the discussion of OSI networking.
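The layering is easiest to see in code. The sketch below is a minimal, illustrative client-server exchange using Python's standard library: the few lines of "application" logic ride on TCP (the transport layer), which in turn rides on IP (the internet layer) and whatever link layer the host provides. The loopback address and the echo behaviour are arbitrary choices for the example, not part of any standard.

    import socket
    import threading

    # Server socket: bound and listening before the client tries to connect.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))            # port 0 lets the OS pick a free port
    srv.listen(1)
    host, port = srv.getsockname()

    def serve_once() -> None:
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo the application data back unchanged
        srv.close()

    threading.Thread(target=serve_once, daemon=True).start()

    # Client side: the "application layer" here is just a short request/response.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((host, port))
        cli.sendall(b"hello over TCP/IP")
        print(cli.recv(1024))              # b'hello over TCP/IP'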
The most prominent component of the Internet model is the Internet Protocol (IP), which provides addressing systems (IP
addresses) for computers on the Internet. IP enables internetworking and in essence establishes the Internet itself. IP Version 4
(IPv4) is the initial version used on the first generation of today's Internet and is still in dominant use. It was designed to address up
to ~4.3 billion (10⁹) Internet hosts. However, the explosive growth of the Internet has led to IPv4 address exhaustion, which entered
its final stage in 2011,[32] when the global address allocation pool was exhausted. A new protocol version, IPv6, was developed in the
mid-1990s, which provides vastly larger addressing capabilities and more efficient routing of Internet traffic. IPv6 is currently in
growing deployment around the world, since Internet address registries (RIRs) began to urge all resource managers to plan rapid adoption and conversion.
IPv6 is not interoperable with IPv4. In essence, it establishes a parallel version of the Internet not directly accessible with IPv4
software. This means software upgrades or translator facilities are necessary for networking devices that need to communicate on
both networks. Most modern computer operating systems already support both versions of the Internet Protocol. Network
infrastructures, however, are still lagging in this development. Aside from the complex array of physical connections that make up its
infrastructure, the Internet is facilitated by bi- or multi-lateral commercial contracts (e.g., peering agreements), and by technical
specifications or protocols that describe how to exchange data over the network. Indeed, the Internet is defined by its interconnections and routing policies.
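As a rough illustration of the two address spaces discussed above, the short sketch below uses Python's standard ipaddress module to compare the size of the IPv4 and IPv6 spaces and to show a dual-stack pair of addresses; the specific addresses are documentation examples, not real hosts.

    import ipaddress

    ipv4_space = ipaddress.ip_network("0.0.0.0/0")   # the entire IPv4 address space
    ipv6_space = ipaddress.ip_network("::/0")        # the entire IPv6 address space

    print(ipv4_space.num_addresses)   # 4294967296, i.e. 2**32, about 4.3 billion
    print(ipv6_space.num_addresses)   # 2**128, vastly larger

    # A dual-stack host holds one address from each family:
    print(ipaddress.ip_address("192.0.2.1").version)      # 4
    print(ipaddress.ip_address("2001:db8::1").version)    # 6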
Routing
Internet packet routing is accomplished among various tiers of Internet service providers.
Internet service providers connect customers, which represent the bottom of the routing hierarchy, to customers of other ISPs via
other higher or same-tier networks. At the top of the routing hierarchy are the Tier 1 networks, large telecommunication companies
which exchange traffic directly with all other Tier 1 networks via peering agreements. Tier 2 networks buy Internet transit from other
providers to reach at least some parties on the global Internet, though they may also engage in peering. An ISP may use a single
upstream provider for connectivity, or implement multihoming to achieve redundancy. Internet exchange points are major traffic exchanges with physical connections to multiple ISPs.
Computers and routers use routing tables to direct IP packets to the next-hop router or destination. Routing tables are maintained by
manual configuration or by routing protocols. End-nodes typically use a default route that points toward an ISP providing transit,
while ISP routers use the Border Gateway Protocol to establish the most efficient routing across the complex connections of the
global Internet.
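A routing table lookup can be sketched in a few lines: among all prefixes that contain the destination address, the most specific (longest) prefix wins, and 0.0.0.0/0 acts as the default route toward an upstream provider. The prefixes and next-hop labels below are made up purely for illustration; real routers also weigh BGP attributes, metrics, and policy.

    import ipaddress

    # Toy routing table: (prefix, next hop). Entirely hypothetical values.
    routing_table = [
        (ipaddress.ip_network("0.0.0.0/0"),       "upstream ISP (default route)"),
        (ipaddress.ip_network("203.0.113.0/24"),  "peering link A"),
        (ipaddress.ip_network("203.0.113.64/26"), "customer router B"),
    ]

    def next_hop(destination: str) -> str:
        addr = ipaddress.ip_address(destination)
        matches = [(net, hop) for net, hop in routing_table if addr in net]
        # Longest-prefix match: the largest prefix length is the most specific route.
        return max(matches, key=lambda entry: entry[0].prefixlen)[1]

    print(next_hop("203.0.113.70"))   # customer router B (the /26 is most specific)
    print(next_hop("198.51.100.9"))   # upstream ISP (default route)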
Large organizations, such as academic institutions, large enterprises, and governments, may perform the same function as ISPs,
engaging in peering and purchasing transit on behalf of their internal networks. Research networks tend to interconnect into large
subnetworks such as GEANT, GLORIAD, Internet2, and the UK's national research and education network, JANET.
General structure
The Internet structure and its usage characteristics have been studied extensively. It has been determined that both the Internet IP
routing structure and hypertext links of the World Wide Web are examples of scale-free networks.[34]
Many computer scientists describe the Internet as a "prime example of a large-scale, highly engineered, yet highly complex system".[35] The Internet is heterogeneous; for instance, data transfer rates and physical characteristics of connections vary widely. The
Internet exhibits "emergent phenomena" that depend on its large-scale organization. For example, data transfer rates exhibit
temporal self-similarity. The principles of the routing and addressing methods for traffic in the Internet reach back to their origins in
the 1960s when the eventual scale and popularity of the network could not be anticipated.[36] Thus, the possibility of developing
alternative structures is investigated.[37] The Internet structure was found to be highly robust[38] to random failures and very vulnerable to intentional, targeted attacks.
Governance
The Internet is a globally distributed network comprising many voluntarily interconnected autonomous networks. It operates without a central governing body.
The technical underpinning and standardization of the Internet's core protocols (IPv4 and IPv6) is an activity of the Internet
Engineering Task Force (IETF), a non-profit organization of loosely affiliated international participants that anyone may associate with by contributing technical expertise.
To maintain interoperability, the principal name spaces of the Internet are administered by the Internet Corporation for Assigned
Names and Numbers (ICANN), headquartered in Marina del Rey, California. ICANN is the authority that coordinates the assignment
of unique identifiers for use on the Internet, including domain names, Internet Protocol (IP) addresses, application port numbers in
the transport protocols, and many other parameters. Globally unified name spaces, in which names and numbers are uniquely
assigned, are essential for maintaining the global reach of the Internet. ICANN is governed by an international board of directors
drawn from across the Internet technical, business, academic, and other non-commercial communities. ICANN's role in coordinating
the assignment of unique identifiers distinguishes it as perhaps the only central coordinating body for the global Internet.[40]
Regional Internet registries (RIRs) allocate IP address blocks within their regions:
African Network Information Centre (AfriNIC) for Africa
American Registry for Internet Numbers (ARIN) for North America
Asia-Pacific Network Information Centre (APNIC) for Asia and the Pacific region
Latin American and Caribbean Internet Addresses Registry (LACNIC) for Latin America and the Caribbean region
Réseaux IP Européens Network Coordination Centre (RIPE NCC) for Europe, the Middle East, and parts of Central Asia
The National Telecommunications and Information Administration, an agency of the United States Department of Commerce,
continues to have final approval over changes to the DNS root zone.[41][42][43]
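The two name spaces come together whenever a host name is resolved. As a small illustration, the sketch below asks the local resolver (via Python's standard socket module) for the addresses behind a domain name; example.com is used as a stand-in, and the addresses returned depend on the resolver being queried.

    import socket

    # Resolve a domain name (DNS name space) to IP addresses (IP address space).
    for family, _, _, _, sockaddr in socket.getaddrinfo("example.com", 80, proto=socket.IPPROTO_TCP):
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        print(label, sockaddr[0])    # one line per address the resolver returned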
The Internet Society (ISOC) was founded in 1992, with a mission to "assure the open development, evolution and use of the
Internet for the benefit of all people throughout the world".[44] Its members include individuals (anyone may join) as well as
corporations, organizations, governments, and universities. Among other activities ISOC provides an administrative home for a
number of less formally organized groups that are involved in developing and managing the Internet, including: the Internet
Engineering Task Force (IETF), Internet Architecture Board (IAB), Internet Engineering Steering Group (IESG), Internet Research Task Force (IRTF), and Internet Research Steering Group (IRSG).
On 16 November 2005, the United Nations-sponsored World Summit on the Information Society, held in Tunis, established the Internet Governance Forum (IGF) to discuss Internet-related issues.
Modern uses
The Internet allows greater flexibility in working hours and location, especially with the spread of unmetered high-speed connections.
The Internet can be accessed almost anywhere by numerous means, including through mobile Internet devices. Mobile
phones, datacards, handheld game consoles and cellular routers allow users to connect to the Internet wirelessly. Within the
limitations imposed by small screens and other limited facilities of such pocket-sized devices, the services of the Internet, including
email and the web, may be available. Service providers may restrict the services offered, and mobile data charges may be significantly higher than other access methods.
Educational material at all levels from pre-school to post-doctoral is available from websites. Examples range from CBeebies,
through school and high-school revision guides and virtual universities, to access to top-end scholarly literature through the likes
of Google Scholar. For distance education, help with homework and other assignments, self-guided learning, whiling away spare
time, or just looking up more detail on an interesting fact, it has never been easier for people to access educational information at
any level from anywhere. The Internet in general and the World Wide Web in particular are important enablers of both formal and informal education.
The low cost and nearly instantaneous sharing of ideas, knowledge, and skills has made collaborative work dramatically easier, with
the help of collaborative software. Not only can a group cheaply communicate and share ideas but the wide reach of the Internet
allows such groups more easily to form. An example of this is the free software movement, which has produced, among other
things, Linux, Mozilla Firefox, and OpenOffice.org. Internet chat, whether using an IRC chat room, an instant messaging system, or
a social networking website, allows colleagues to stay in touch in a very convenient way while working at their computers during the
day. Messages can be exchanged even more quickly and conveniently than via email. These systems may allow files to be
exchanged, drawings and images to be shared, or voice and video contact between team members.
Content management systems allow collaborating teams to work on shared sets of documents simultaneously without accidentally
destroying each other's work. Business and project teams can share calendars as well as documents and other information. Such
collaboration occurs in a wide variety of areas including scientific research, software development, conference planning, political
activism and creative writing. Social and political collaboration is also becoming more widespread as both Internet access and computer literacy spread.
The Internet allows computer users to remotely access other computers and information stores easily, wherever they may be. They
may do this with or without computer security, i.e. authentication and encryption technologies, depending on the requirements. This
is encouraging new ways of working from home, collaboration and information sharing in many industries. An accountant sitting at
home can audit the books of a company based in another country, on a server situated in a third country that is remotely maintained
by IT specialists in a fourth. These accounts could have been created by home-working bookkeepers, in other remote locations,
based on information emailed to them from offices all over the world. Some of these things were possible before the widespread use
of the Internet, but the cost of private leased lines would have made many of them infeasible in practice. An office worker away from
their desk, perhaps on the other side of the world on a business trip or a holiday, can access their emails, access their data
using cloud computing, or open a remote desktop session into their office PC using a secure Virtual Private Network (VPN)
connection on the Internet. This can give the worker complete access to all of their normal files and data, including email and other
applications, while away from the office. It has been referred to among system administrators as the Virtual Private Nightmare,[45] because it extends the secure perimeter of a corporate network into remote locations and its employees' homes.
Services
World Wide Web
This NeXT Computer was used by Tim Berners-Lee at CERN and became the world's first Web server.
Many people use the terms Internet and World Wide Web, or just the Web, interchangeably, but the two terms are not synonymous.
The World Wide Web is only one of hundreds of services used on the Internet. The Web is a global set of documents, images and
other resources, logically interrelated by hyperlinks and referenced with Uniform Resource Identifiers (URIs). URIs symbolically
identify services, servers, and other databases, and the documents and resources that they can provide. Hypertext Transfer
Protocol (HTTP) is the main access protocol of the World Wide Web. Web services also use HTTP to allow software systems to communicate in order to share and exchange business logic and data. World Wide Web browser software, such as Microsoft's Internet Explorer, Mozilla Firefox, Opera, Apple's Safari, and Google Chrome, lets users navigate from one web page to another via hyperlinks embedded in the documents. These documents may also
contain any combination of computer data, including graphics, sounds, text, video, multimedia and interactive content that runs
while the user is interacting with the page. Client-side software can include animations, games, office applications and scientific
demonstrations. Through keyword-driven Internet research using search engines like Yahoo! and Google, users worldwide have
easy, instant access to a vast and diverse amount of online information. Compared to printed media, books, encyclopedias and
traditional libraries, the World Wide Web has enabled the decentralization of information on a large scale.
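As a concrete illustration of HTTP as the Web's access protocol, the short sketch below performs the same kind of GET request a browser issues when a page is opened, using only Python's standard library; http://example.com/ is simply a well-known placeholder address.

    from urllib.request import urlopen

    # One HTTP GET request, the basic operation behind loading a web page.
    with urlopen("http://example.com/") as response:
        print(response.status)                               # 200 means the page was served
        html = response.read().decode("utf-8", errors="replace")

    print(html[:200])   # the start of the HTML document a browser would render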
The Web has also enabled individuals and organizations to publish ideas and information to a potentially large audience online at
greatly reduced expense and time delay. Publishing a web page, a blog, or building a website involves little initial cost and many
cost-free services are available. Publishing and maintaining large, professional web sites with attractive, diverse and up-to-date
information is still a difficult and expensive proposition, however. Many individuals and some companies and groups use web logs or
blogs, which are largely used as easily updatable online diaries. Some commercial organizations encourage staff to communicate
advice in their areas of specialization in the hope that visitors will be impressed by the expert knowledge and free information, and be attracted to the corporation as a result.
One example of this practice is Microsoft, whose product developers publish their personal blogs in order to pique the public's
interest in their work. Collections of personal web pages published by large service providers remain popular, and have become
increasingly sophisticated. Whereas operations such as Angelfire and GeoCities have existed since the early days of the Web,
newer offerings from, for example, Facebook and Twitter currently have large followings. These operations often brand themselves as social network services rather than simply as web page hosts.
Advertising on popular web pages can be lucrative, and e-commerce or the sale of products and services directly via the Web
continues to grow.
When the Web began in the 1990s, a typical web page was stored in completed form on a web server, formatted in HTML, ready to
be sent to a user's browser in response to a request. Over time, the process of creating and serving web pages has become more
automated and more dynamic. Websites are often created using content management or wiki software with, initially, very little
content. Contributors to these systems, who may be paid staff, members of a club or other organization or members of the public, fill
underlying databases with content using editing pages designed for that purpose, while casual visitors view and read this content in
its final HTML form. There may or may not be editorial, approval and security systems built into the process of taking newly entered content and making it available to the target visitors.
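The "completed form" model described above, a fixed HTML document sent as-is in response to each request, can be sketched with the standard library alone. Everything below (the page text, the loopback address, port 8080) is an arbitrary choice for the example, not a recommended production setup.

    from http.server import BaseHTTPRequestHandler, HTTPServer

    PAGE = b"<html><body><h1>Hello from a static web page</h1></body></html>"

    class StaticPage(BaseHTTPRequestHandler):
        def do_GET(self):
            # The page is stored in completed form and sent unchanged to every visitor.
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)

    if __name__ == "__main__":
        HTTPServer(("127.0.0.1", 8080), StaticPage).serve_forever()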
Communication
Email is an important communications service available on the Internet. The concept of sending electronic text messages between parties in a way analogous to mailing letters or memos predates the creation of the Internet. Pictures, documents and other files are sent as email attachments, and messages can be copied to multiple recipients.
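Sending a message programmatically shows how little is involved beyond the protocol itself. The sketch below uses Python's standard smtplib and email modules; the server name, port, and addresses are placeholders, and a real provider would normally also require login credentials.

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "alice@example.com"          # placeholder sender
    msg["To"] = "bob@example.com"              # placeholder recipient
    msg["Subject"] = "Hello over the Internet"
    msg.set_content("Email predates the Web but still travels over TCP/IP.")

    # Port 587 with STARTTLS is a common submission setup; details vary by provider.
    with smtplib.SMTP("smtp.example.com", 587) as server:
        server.starttls()
        # server.login("username", "password")  # credentials would normally go here
        server.send_message(msg)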
Internet telephony is another common communications service made possible by the Internet. VoIP stands for Voice-over-Internet Protocol, referring to the protocol that underlies all Internet communication. The idea began in the early 1990s
with walkie-talkie-like voice applications for personal computers. In recent years many VoIP systems have become as easy to use
and as convenient as a normal telephone. The benefit is that, as the Internet carries the voice traffic, VoIP can be free or cost much
less than a traditional telephone call, especially over long distances and especially for those with always-on Internet connections
such as cable or ADSL. VoIP is maturing into a competitive alternative to traditional telephone service. Interoperability between
different providers has improved and the ability to call or receive a call from a traditional telephone is available. Simple, inexpensive
VoIP network adapters are available that eliminate the need for a personal computer.
Voice quality can still vary from call to call, but is often equal to and can even exceed that of traditional calls. Remaining problems
for VoIP include emergency telephone number dialing and reliability. Currently, a few VoIP providers provide an emergency service,
but it is not universally available. Older traditional phones with no "extra features" may be line-powered only and operate during a
power failure; VoIP can never do so without a backup power source for the phone equipment and the Internet access devices. VoIP
has also become increasingly popular for gaming applications, as a form of communication between players. Popular VoIP clients
for gaming include Ventrilo and Teamspeak. Modern video game consoles also offer VoIP chat features.
Data transfer
File sharing is an example of transferring large amounts of data across the Internet. A computer file can be emailed to customers,
colleagues and friends as an attachment. It can be uploaded to a website or FTP server for easy download by others. It can be put
into a "shared location" or onto a file server for instant use by colleagues. The load of bulk downloads to many users can be eased
by the use of "mirror" servers or peer-to-peer networks. In any of these cases, access to the file may be controlled by
user authentication, the transit of the file over the Internet may be obscured by encryption, and money may change hands for
access to the file. The price can be paid by the remote charging of funds from, for example, a credit card whose details are also
passed – usually fully encrypted – across the Internet. The origin and authenticity of the file received may be checked by digital
signatures or by MD5 or other message digests. These simple features of the Internet, on a worldwide basis, are changing the
production, sale, and distribution of anything that can be reduced to a computer file for transmission. This includes all manner of
print publications, software products, news, music, film, video, photography, graphics and the other arts. This in turn has caused
seismic shifts in each of the existing industries that previously controlled the production and distribution of these products.
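Checking a file's digest, as mentioned above, takes only a few lines. The sketch below computes a SHA-256 digest (MD5 works the same way through hashlib, though it is no longer considered collision-resistant); the file path and the expected digest are placeholders for whatever the distributor publishes.

    import hashlib

    def file_digest(path: str, algorithm: str = "sha256") -> str:
        # Read the file in chunks so even very large downloads fit in memory.
        digest = hashlib.new(algorithm)
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Compare against the value published alongside the download (placeholder here):
    # print(file_digest("download.iso") == "expected-hex-digest")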
Streaming media is the real-time delivery of digital media for the immediate consumption or enjoyment by end users. Many radio
and television broadcasters provide Internet feeds of their live audio and video productions. They may also allow time-shift viewing
or listening such as Preview, Classic Clips and Listen Again features. These providers have been joined by a range of pure Internet
"broadcasters" who never had on-air licenses. This means that an Internet-connected device, such as a computer or something
more specific, can be used to access on-line media in much the same way as was previously possible only with a television or radio
receiver. The range of available types of content is much wider, from specialized technical webcasts to on-demand popular
multimedia services. Podcasting is a variation on this theme, where – usually audio – material is downloaded and played back on a
computer or shifted to a portable media player to be listened to on the move. These techniques using simple equipment allow
anybody, with little censorship or licensing control, to broadcast audio-visual material worldwide.
Digital media streaming increases the demand for network bandwidth. For example, standard image quality needs 1 Mbit/s link
speed for SD 480p, HD 720p quality requires 2.5 Mbit/s, and the top-of-the-line HDX quality needs 4.5 Mbit/s for 1080p.[46]
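Those bitrates translate directly into data volumes. A back-of-the-envelope sketch (1 byte = 8 bits, ignoring protocol overhead) of how much data an hour of continuous streaming moves at each tier:

    # Approximate data transferred per hour of streaming at the rates quoted above.
    rates_mbit_s = {"SD 480p": 1.0, "HD 720p": 2.5, "HDX 1080p": 4.5}

    for quality, mbit_s in rates_mbit_s.items():
        gigabytes_per_hour = mbit_s * 3600 / 8 / 1000    # Mbit/s -> GB per hour
        print(f"{quality}: about {gigabytes_per_hour:.2f} GB per hour")
    # SD 480p: about 0.45 GB per hour
    # HD 720p: about 1.12 GB per hour
    # HDX 1080p: about 2.02 GB per hour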
Webcams are a low-cost extension of this phenomenon. While some webcams can give full-frame-rate video, the picture either is
usually small or updates slowly. Internet users can watch animals around an African waterhole, ships in the Panama Canal, traffic at
a local roundabout or monitor their own premises, live and in real time. Video chat rooms and video conferencing are also popular
with many uses being found for personal webcams, with and without two-way sound. YouTube was founded on 15 February 2005
and is now the leading website for free streaming video with a vast number of users. It uses a flash-based web player to stream and
show video files. Registered users may upload an unlimited amount of video and build their own personal profile. YouTube claims
that its users watch hundreds of millions, and upload hundreds of thousands of videos daily.[47]
Access
Common methods of Internet access in homes include dial-up, landline broadband (over coaxial cable, fiber optic or copper
wires), Wi-Fi, satellite and 3G/4G technology cell phones. Public places to use the Internet include libraries and Internet cafes,
where computers with Internet connections are available. There are also Internet access points in many public places such as
airport halls and coffee shops, in some cases just for brief use while standing. Various terms are used, such as "public Internet
kiosk", "public access terminal", and "Web payphone". Many hotels now also have public terminals, though these are usually fee-
based. These terminals are widely used for purposes such as ticket booking, bank deposits, and online payments. Wi-Fi provides
wireless access to computer networks, and therefore can do so to the Internet itself. Hotspots providing such access include Wi-Fi
cafes, where would-be users need to bring their own wireless-enabled devices such as a laptop or PDA. These services may be
free to all, free to customers only, or fee-based. A hotspot need not be limited to a confined location; a whole campus or park, or even an entire city, can be enabled.
Grassroots efforts have led to wireless community networks. Commercial Wi-Fi services covering large city areas are in place in
London, Vienna, Toronto, San Francisco, Philadelphia, Chicago and Pittsburgh. The Internet can then be accessed from such
places as a park bench.[48] Apart from Wi-Fi, there have been experiments with proprietary mobile wireless networks like Ricochet,
various high-speed data services over cellular phone networks, and fixed wireless services. High-end mobile phones such
as smartphones in general come with Internet access through the phone network. Web browsers such as Opera are available on
these advanced handsets, which can also run a wide variety of other Internet software. More mobile phones have Internet access
than PCs, though this is not as widely used.[49] An Internet access provider and protocol matrix differentiates the methods used to
get online.
An Internet blackout or outage can be caused by local signaling interruptions. Disruptions of submarine communications cables may
cause blackouts or slowdowns to large areas, such as in the 2008 submarine cable disruption. Less-developed countries are more
vulnerable due to a small number of high-capacity links. Land cables are also vulnerable, as in 2011 when a woman digging for
scrap metal severed most connectivity for the nation of Armenia.[50] Internet blackouts affecting almost entire countries can be
achieved by governments as a form of Internet censorship, as in the blockage of the Internet in Egypt, whereby approximately
93%[51] of networks were without access in 2011 in an attempt to stop mobilization for anti-government protests.[52]
Users
See also: Global Internet usage, English on the Internet, and Unicode
Overall Internet usage has seen tremendous growth. From 2000 to 2009, the number of Internet users globally rose from 394 million
to 1.858 billion.[57] By 2010, 22 percent of the world's population had access to computers, with 1 billion Google searches every day,
300 million Internet users reading blogs, and 2 billion videos viewed daily on YouTube.[58]
The prevalent language for communication on the Internet has been English. This may be a result of the origin of the Internet, as
well as the language's role as a lingua franca. Early computer systems were limited to the characters of the American Standard Code for Information Interchange (ASCII), a subset of the Latin alphabet.
After English (27%), the most requested languages on the World Wide Web are Chinese (23%), Spanish (8%), Japanese (5%),
Portuguese and German (4% each), Arabic, French and Russian (3% each), and Korean (2%). [59] By region, 42% of the
world's Internet users are based in Asia, 24% in Europe, 14% in North America, 10% in Latin America and the Caribbean taken together, 6% in Africa, 3% in the Middle East and 1% in Australia/Oceania.[60] The Internet's technologies have developed enough in
recent years, especially in the use of Unicode, that good facilities are available for development and communication in the world's
widely used languages. However, some glitches such as mojibake (incorrect display of some languages' characters) still remain.
In an American study in 2005, the percentage of men using the Internet was very slightly ahead of the percentage of women,
although this difference reversed in those under 30. Men logged on more often, spent more time online, and were more likely to be
broadband users, whereas women tended to make more use of opportunities to communicate (such as email). Men were more likely
to use the Internet to pay bills, participate in auctions, and for recreation such as downloading music and videos. Men and women
were equally likely to use the Internet for shopping and banking.[61] More recent studies indicate that in 2008, women significantly
outnumbered men on most social networking sites, such as Facebook and Myspace, although the ratios varied with age. [62] In
addition, women watched more streaming content, whereas men downloaded more. [63] In terms of blogs, men were more likely to
blog in the first place; among those who blog, men were more likely to have a professional blog, whereas women were more likely to have a personal blog.
According to Euromonitor, by 2020 43.7% of the world's population will be users of the Internet. By country, in 2011 Iceland, Norway and the Netherlands had the highest Internet penetration by number of users, with more than 90% of the population having access.
Social impact
The Internet has enabled entirely new forms of social interaction, activities, and organizing, thanks to its basic features such as
widespread usability and access. In the first decade of the 21st century, the first generation was raised with widespread availability of
Internet connectivity, bringing consequences and concerns in areas such as personal privacy and identity, and distribution of
copyrighted materials. These "digital natives" face a variety of challenges that were not present for prior generations.
Many people use the World Wide Web to access news, weather and sports reports, to plan and book vacations and to find out more
about their interests. People use chat, messaging and email to make and stay in touch with friends worldwide, sometimes in the
same way as some previously had pen pals. The Internet has seen a growing number of Web desktops, where users can access their files and settings via the Internet.
Social networking websites such as Facebook, Twitter, and MySpace have created new ways to socialize and interact. Users of
these sites are able to add a wide variety of information to pages, to pursue common interests, and to connect with others. It is also
possible to find existing acquaintances, to allow communication among existing groups of people. Sites like LinkedIn foster
commercial and business connections. YouTube and Flickr specialize in users' videos and photographs.
The Internet has been a major outlet for leisure activity since its inception, with entertaining social experiments such
as MUDs and MOOs being conducted on university servers, and humor-related Usenet groups receiving much traffic. Today, many Internet forums have sections devoted to games and funny videos; short cartoons in the form of Flash movies are also
popular. Over 6 million people use blogs or message boards as a means of communication and for the sharing of ideas.
The Internet pornography and online gambling industries have taken advantage of the World Wide Web, and often provide a
significant source of advertising revenue for other websites.[65] Although many governments have attempted to restrict both
industries' use of the Internet, in general this has failed to stop their widespread popularity. [66]
Another area of leisure activity on the Internet is multiplayer gaming.[67] This form of recreation creates communities, where people
of all ages and origins enjoy the fast-paced world of multiplayer games. These range from MMORPG to first-person shooters,
from role-playing video games to online gambling. While online gaming has been around since the 1970s, modern modes of online
gaming began with subscription services such as GameSpy and MPlayer.[68] Non-subscribers were limited to certain types of game
play or certain games. Many people use the Internet to access and download music, movies and other works for their enjoyment
and relaxation. Free and fee-based services exist for all of these activities, using centralized servers and distributed peer-to-peer
technologies. Some of these sources exercise more care with respect to the original artists' copyrights than others.
Internet usage has been correlated to users' loneliness.[69] Lonely people tend to use the Internet as an outlet for their feelings and to
share their stories with others, such as in the "I am lonely will anyone speak to me" thread.
Cybersectarianism is a new organizational form which involves: "highly dispersed small groups of practitioners that may remain
largely anonymous within the larger social context and operate in relative secrecy, while still linked remotely to a larger network of
believers who share a set of practices and texts, and often a common devotion to a particular leader. Overseas supporters provide
funding and support; domestic practitioners distribute tracts, participate in acts of resistance, and share information on the internal
situation with outsiders. Collectively, members and practitioners of such sects construct viable virtual communities of faith,
exchanging personal testimonies and engaging in collective study via email, on-line chat rooms and web-based message boards." [70]
Cyberslacking can become a drain on corporate resources; the average UK employee spent 57 minutes a day surfing the Web while
at work, according to a 2003 study by Peninsula Business Services.[71] Internet addiction disorder is excessive computer use that
interferes with daily life. Writer Nicholas Carr argues that Internet use has other effects on individuals, for instance improving
skills of scan-reading and interfering with the deep thinking that leads to true creativity.[72]
Electronic business
Main article: Electronic business
Electronic business (E-business) involves business processes spanning the entire value chain: electronic purchasing and supply
chain management, processing orders electronically, handling customer service, and cooperating with business partners. E-
commerce seeks to add revenue streams using the Internet to build and enhance relationships with clients and partners.
According to research firm IDC, the size of total worldwide e-commerce, when global business-to-business and -consumer
transactions are added together, will equate to $16 trillion in 2013. IDate, another research firm, estimates the global market for
digital products and services at $4.4 trillion in 2013. A report by Oxford Economics adds those two together to estimate the total size
of the digital economy at $20.4 trillion, equivalent to roughly 13.8% of global sales.[73]
While much has been written of the economic advantages of Internet-enabled commerce, there is also evidence that some aspects of the Internet such as maps and location-aware services may serve to reinforce economic inequality and the digital divide.[74] Electronic commerce may be responsible for consolidation and the decline of mom-and-pop, brick-and-mortar businesses, resulting in increases in income inequality.
Telecommuting
Main article: Telecommuting
Remote work is facilitated by tools such as groupware, virtual private networks, conference calling, videoconferencing, and Voice
over IP (VOIP). It can be efficient and useful for companies as it allows workers to communicate over long distances, saving
significant amounts of travel time and cost. As broadband Internet connections become more commonplace, more and more
workers have adequate bandwidth at home to use these tools to link their home to their corporate intranet and internal phone
networks.
Crowdsourcing
Main article: Crowdsourcing
The Internet provides a particularly good venue for crowdsourcing (outsourcing tasks to a distributed group of people) since individuals
tend to be more open in web-based projects where they are not being physically judged or scrutinized and thus can feel more
comfortable sharing.
Crowdsourcing systems are used to accomplish a variety of tasks. For example, the crowd may be invited to develop a new
technology, carry out a design task, refine or carry out the steps of an algorithm (see human-based computation), or help capture, systematize, or analyze large amounts of data.
Wikis have also been used in the academic community for sharing and dissemination of information across institutional and
international boundaries.[78] In those settings, they have been found useful for collaboration on grant writing, strategic planning,
departmental documentation, and committee work.[79] The United States Patent and Trademark Office uses a wiki to allow the public
to collaborate on finding prior art relevant to examination of pending patent applications. Queens, New York has used a wiki to allow citizens to collaborate on the design and planning of a local park.
The English Wikipedia has the largest user base among wikis on the World Wide Web[81] and ranks in the top 10 among all Web sites in terms of traffic.
Politics and political revolutions
The Internet has achieved new relevance as a political tool. The presidential campaign of Howard Dean in 2004 in the United States was notable for its success in soliciting donations via the Internet. Many political groups use the Internet to achieve a new method of
organizing in order to carry out their mission, having given rise to Internet activism, most notably practiced by rebels in the Arab
Spring.[83][84]
The New York Times suggested that social media websites, such as Facebook and Twitter, helped people organize the political
revolutions in Egypt where it helped certain classes of protesters organize protests, communicate grievances, and disseminate
information.[85]
The potential of the Internet as a civic tool of communicative power was thoroughly explored by Simon R. B. Berdal in his thesis of
2004:
As the globally evolving Internet provides ever new access points to virtual discourse forums, it also promotes new civic relations
and associations within which communicative power may flow and accumulate. Thus, traditionally ... national-embedded peripheries
get entangled into greater, international peripheries, with stronger combined powers... The Internet, as a consequence, changes the
topology of the "centre-periphery" model, by stimulating conventional peripheries to interlink into "super-periphery" structures, which enclose and "besiege" several centres at once.
Berdal, therefore, extends the Habermasian notion of the Public sphere to the Internet, and underlines the inherent global and civic
nature that interwoven Internet technologies provide. To limit the growing civic potential of the Internet, Berdal also notes how "self-protective measures" are put in place by those threatened by it:
If we consider China’s attempts to filter "unsuitable material" from the Internet, most of us would agree that this resembles a self-
protective measure by the system against the growing civic potentials of the Internet. Nevertheless, both types represent limitations
to "peripheral capacities". Thus, the Chinese government tries to prevent communicative power to build up and unleash (as
the 1989 Tiananmen Square uprising suggests, the government may find it wise to install "upstream measures"). Even though
limited, the Internet is proving to be an empowering tool also to the Chinese periphery: Analysts believe that Internet petitions have
influenced policy implementation in favour of the public’s online-articulated will ... [86]
Philanthropy
The spread of low-cost Internet access in developing countries has opened up new possibilities for peer-to-peer charities, which
allow individuals to contribute small amounts to charitable projects for other individuals. Websites, such
as DonorsChoose and GlobalGiving, allow small-scale donors to direct funds to individual projects of their choice.
A popular twist on Internet-based philanthropy is the use of peer-to-peer lending for charitable purposes. Kiva pioneered this
concept in 2005, offering the first web-based service to publish individual loan profiles for funding. Kiva raises funds for local
intermediary microfinance organizations which post stories and updates on behalf of the borrowers. Lenders can contribute as little
as $25 to loans of their choice, and receive their money back as borrowers repay. Kiva falls short of being a pure peer-to-peer
charity, in that loans are disbursed before being funded by lenders and borrowers do not communicate with lenders themselves. [87][88]
However, the recent spread of low cost Internet access in developing countries has made genuine international person-to-person
philanthropy increasingly feasible. In 2009 the US-based nonprofit Zidisha tapped into this trend to offer the first person-to-person
microfinance platform to link lenders and borrowers across international borders without intermediaries. Members can fund loans for
as little as a dollar, which the borrowers then use to develop business activities that improve their families' incomes while repaying
loans to the members with interest. Borrowers access the Internet via public cybercafes, donated laptops in village schools, and
even smart phones, then create their own profile pages through which they share photos and information about themselves and
their businesses. As they repay their loans, borrowers continue to share updates and dialogue with lenders via their profile pages.
This direct web-based connection allows members themselves to take on many of the communication and recording tasks
traditionally performed by local organizations, bypassing geographic barriers and dramatically reducing the cost of microfinance services.
Censorship
Internet censorship by country[90][91][92]
Some governments, such as those of Burma, Iran, North Korea, mainland China, Saudi Arabia, and the United Arab
Emirates restrict what people in their countries can access on the Internet, especially political and religious content. This is
accomplished through software that filters domains and content so that they may not be easily accessed or obtained without
elaborate circumvention.[93]
In Norway, Denmark, Finland, and Sweden, major Internet service providers have voluntarily, possibly to avoid such an arrangement
being turned into law, agreed to restrict access to sites listed by authorities. While this list of forbidden URLs is supposed to contain
addresses of only known child pornography sites, the content of the list is secret.[94] Many countries, including the United States,
have enacted laws against the possession or distribution of certain material, such as child pornography, via the Internet, but do not
mandate filtering software. There are many free and commercially available software programs, called content-control software, with
which a user can choose to block offensive websites on individual computers or networks, in order to limit a child's access to pornographic material or depictions of violence.
World Wide Web
The World Wide Web (abbreviated as WWW or W3,[3] commonly known as the web) is a system of
interlinked hypertext documents accessed via the Internet. With a web browser, one can view web
pages that may contain text, images, videos, and other multimedia, and navigate between them via hyperlinks.
Tim Berners-Lee, a British computer scientist and at that time employee of CERN, a European research
organisation near Geneva,[4] wrote a proposal in March 1989 for what would eventually become the World
Wide Web.[1] The 1989 proposal was meant for a more effective CERN communication system but
Berners-Lee eventually realised the concept could be implemented throughout the world. [5] Berners-Lee
and Flemish computer scientist Robert Cailliau proposed in 1990 to use hypertext "to link and access
information of various kinds as a web of nodes in which the user can browse at will", [6] and Berners-Lee
finished the first website in December that year. [7] Berners-Lee posted the project on the alt.hypertext
newsgroup on 7 August 1991.[8]
Contents
1 History
2 Function
o 2.1 Linking
3 Web servers
4 Privacy
5 Intellectual property
6 Security
7 Standards
8 Accessibility
9 Internationalization
10 Statistics
11 Speed issues
12 Caching
13 See also
14 References
15 Further reading
16 External links
History
Main article: History of the World Wide Web
The NeXT Computer used by Berners-Lee. The handwritten label declares, "This machine is a server. DO NOT POWER IT
DOWN!!"
In the May 1970 issue of Popular Science magazine, Arthur C. Clarke predicted that satellites would
someday "bring the accumulated knowledge of the world to your fingertips" using a console that would
combine the functionality of the photocopier, telephone, television and a small computer, allowing data
transfer and video conferencing around the globe.[9]
In March 1989, Tim Berners-Lee wrote a proposal that referenced ENQUIRE, a database and software
project he had built in 1980, and described a more elaborate information management system. [10]
With help from Robert Cailliau, he published a more formal proposal (on 12 November 1990) to build a
"Hypertext project" called "WorldWideWeb" (one word, also "W3") as a "web" of "hypertext documents" to
be viewed by "browsers" using a client–server architecture.[6] This proposal estimated that a read-only
web would be developed within three months and that it would take six months to achieve "the creation of
new links and new material by readers, [so that] authorship becomes universal" as well as "the automatic
notification of a reader when new material of interest to him/her has become available." While the read-
only goal was met, accessible authorship of web content took longer to mature, with the wiki concept,
blogs, Web 2.0 and RSS/Atom.[11]
The proposal was modeled after the SGML reader Dynatext by Electronic Book Technology, a spin-off
from the Institute for Research in Information and Scholarship at Brown University. The Dynatext system,
licensed by CERN, was a key player in the extension of SGML ISO 8879:1986 to Hypermedia
within HyTime, but it was considered too expensive and had an inappropriate licensing policy for use in
the general high energy physics community, namely a fee for each document and each document
alteration.
A NeXT Computer was used by Berners-Lee as the world's first web server and also to write the first web
browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built all the tools necessary for a
working Web:[12] the first web browser (which was a web editor as well); the first web server; and the first
web pages,[13] which described the project itself.
The first web page may be lost, but Paul Jones, a computer technologist at UNC-Chapel Hill in North Carolina, revealed in May 2013 that he has a copy of a page given to him by Berners-Lee during a visit to UNC in 1991, which is the oldest known web page. Jones stored it on a magneto-optical drive and on his
NeXT computer.[14]
On 6 August 1991, Berners-Lee posted a short summary of the World Wide Web project on
the alt.hypertext newsgroup.[15] This date also marked the debut of the Web as a publicly available
service on the Internet, although new users could only access it after 23 August. For this reason, this is considered the internaut's day. Many news media have reported that the first photo on the web was
uploaded by Berners-Lee in 1992, an image of the CERN house band Les Horribles Cernettes taken by
Silvano de Gennaro; Gennaro has disclaimed this story, writing that media were "totally distorting our
words for the sake of cheap sensationalism." [16]
The first server outside Europe was set up at the Stanford Linear Accelerator Center (SLAC) in Palo Alto,
California, to host the SPIRES-HEP database. Accounts differ substantially as to the date of this event.
The World Wide Web Consortium says December 1992,[17] whereas SLAC itself claims 1991.[18][19] This is
supported by a W3C document titled A Little History of the World Wide Web.[20]
The crucial underlying concept of hypertext originated with older projects from the 1960s, such as the
Hypertext Editing System (HES) at Brown University, Ted Nelson's Project Xanadu, and Douglas
Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar
Bush's microfilm-based "memex", which was described in the 1945 essay "As We May Think".[21]
Berners-Lee's breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he
explains that he had repeatedly suggested that a marriage between the two technologies was possible to
members of both technical communities, but when no one took up his invitation, he finally assumed the
project himself. In the process, he developed three essential technologies: a system of globally unique
identifiers for resources on the Web (the URI/URL), the publishing language HTML, and the Hypertext
Transfer Protocol (HTTP).
The World Wide Web had a number of differences from other hypertext systems available at the time. The
web required only unidirectional links rather than bidirectional ones, making it possible for someone to link
to another resource without action by the owner of that resource. It also significantly reduced the difficulty
of implementing web servers and browsers (in comparison to earlier systems), but in turn presented the
chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide Web was non-
proprietary, making it possible to develop servers and clients independently and to add extensions without
licensing restrictions. On 30 April 1993, CERN announced that the World Wide Web would be free to
anyone, with no fees due.[23] Coming two months after the announcement that the server implementation
of the Gopher protocol was no longer free to use, this produced a rapid shift away from Gopher and
towards the Web. An early popular web browser was ViolaWWW for Unix and the X Window System.
Robert Cailliau, Jean-François Abramatic of IBM, and Tim Berners-Lee at the 10th anniversary of the World Wide Web
Consortium.
Scholars generally agree that a turning point for the World Wide Web began with the introduction [24] of
the Mosaic web browser[25] in 1993, a graphical browser developed by a team at the National Center for
Supercomputing Applications at the University of Illinois at Urbana-Champaign (NCSA-UIUC), led
by Marc Andreessen. Funding for Mosaic came from the U.S. High-Performance Computing and
Communications Initiative and the High Performance Computing and Communication Act of 1991, one
of several computing developments initiated by U.S. Senator Al Gore.[26] Prior to the release of Mosaic,
graphics were not commonly mixed with text in web pages and the web's popularity was less than older
protocols in use over the Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic's
graphical user interface allowed the Web to become, by far, the most popular Internet protocol.
The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the European
Organization for Nuclear Research (CERN) in October 1994. It was founded at the Massachusetts
Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense
Advanced Research Projects Agency (DARPA), which had pioneered the Internet; a year later, a second
site was founded at INRIA (a French national computer research lab) with support from the European
Commission DG InfSo; and in 1996, a third continental site was created in Japan at Keio University. By
the end of 1994, while the total number of websites was still minute compared to present standards, quite
a number of notable websites were already active, many of which are the precursors or inspiration for
today's most popular services.
Connected by the existing Internet, other websites were created around the world, adding international
standards for domain names and HTML. Since then, Berners-Lee has played an active role in guiding the
development of web standards (such as the markup languages in which web pages are composed), and
has advocated his vision of a Semantic Web. The World Wide Web enabled the spread of information over
the Internet through an easy-to-use and flexible format. It thus played an important role in popularizing
use of the Internet.[27] Although the two terms are sometimes conflated in popular use, World Wide Web is
not synonymous with the Internet.[28] The web is a collection of documents and other resources, accessed by
client and server software using Internet protocols such as TCP/IP and HTTP.
Tim Berners-Lee was knighted in 2004 by Queen Elizabeth II for his contribution to the World Wide Web.
Function[edit]
The terms Internet and World Wide Web are often used in everyday speech without much distinction.
However, the Internet and the World Wide Web are not the same. The Internet is a global system of
interconnected computer networks. In contrast, the web is one of the services that runs on the Internet. It
is a collection of text documents and other resources, linked by hyperlinks and URLs, usually accessed
by web browsers from web servers. In short, the web can be thought of as an application "running" on the
Internet.[29]
Viewing a web page on the World Wide Web normally begins either by typing the URL of the page into
a web browser or by following a hyperlink to that page or resource. The web browser then initiates a
series of communication messages, behind the scenes, in order to fetch and display it. In the 1990s,
using a browser to view web pages—and to move from one web page to another through hyperlinks—
came to be known as 'browsing,' 'web surfing,' or 'navigating the web'. Early studies of this new behavior
investigated user patterns in using web browsers. One study, for example, found five user patterns:
exploratory surfing, window surfing, evolved surfing, bounded navigation and targeted navigation. [30]
The following example demonstrates how a web browser works. Consider accessing a page with the
URL http://example.org/wiki/World_Wide_Web.
First, the browser resolves the server-name portion of the URL (example.org) into an Internet Protocol
address using the globally distributed database known as the Domain Name System (DNS); this lookup
returns an IP address such as 208.80.152.2. The browser then requests the resource by sending
an HTTP request across the Internet to the computer at that particular address. It makes the request to a
particular application port in the underlying Internet Protocol Suite so that the computer receiving the
request can distinguish an HTTP request from other network protocols it may be servicing such as e-mail
delivery; the HTTP protocol normally uses port 80. The content of the HTTP request can be as simple as
the following two lines of text:
GET /wiki/World_Wide_Web HTTP/1.1
Host: example.org
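The sequence can be illustrated with a short sketch. This is not the browser's own code; it is a minimal Node.js
script, assuming only the built-in dns and http modules, that performs the same lookup-then-request steps
described above.
const dns = require('dns');
const http = require('http');

// Step 1: resolve the server-name portion of the URL to an IP address via DNS.
dns.lookup('example.org', function (err, address) {
  if (err) throw err;
  console.log('example.org resolved to ' + address);

  // Step 2: send an HTTP request to port 80 of that host, equivalent to:
  //   GET /wiki/World_Wide_Web HTTP/1.1
  //   Host: example.org
  var request = http.request(
    { host: 'example.org', port: 80, path: '/wiki/World_Wide_Web' },
    function (response) {
      console.log('Status: ' + response.statusCode);
      response.resume(); // drain the body; only the status and headers matter here
    }
  );
  request.end();
});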
The computer receiving the HTTP request delivers it to web server software listening for requests on port
80. If the web server can fulfill the request it sends an HTTP response back to the browser indicating
success, which can be as simple as:
HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8
followed by the content of the requested page. The Hypertext Markup Language for a basic web page looks like:
<html>
  <head>
    <title>Example.org – The World Wide Web</title>
  </head>
  <body>
    <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
  </body>
</html>
The web browser parses the HTML, interpreting the markup (<title>, <p> for paragraph, and such) that
surrounds the words in order to draw the text on the screen.
Many web pages use HTML to reference the URLs of other resources such as images, other embedded
media, scripts that affect page behavior, and Cascading Style Sheets that affect page layout. The browser
will make additional HTTP requests to the web server for these other Internet media types. As it receives
their content from the web server, the browser progressively renders the page onto the screen as specified
by its HTML and these additional resources.
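For instance, a page along the following lines (the file names are purely illustrative) would cause the browser
to issue three further requests, one for the stylesheet, one for the script and one for the image, and to render
the result as each response arrives.
<html>
  <head>
    <link rel="stylesheet" href="layout.css">
    <script src="behaviour.js"></script>
  </head>
  <body>
    <p>Some text and a picture:</p>
    <img src="picture.png" alt="An example image">
  </body>
</html>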
Linking[edit]
Most web pages contain hyperlinks to other related pages and perhaps to downloadable files, source
documents, definitions and other web resources. In the underlying HTML, a hyperlink looks like:
<a href="http://example.org/wiki/Main_Page">Example.org, a free encyclopedia</a>
Such a collection of useful, related resources, interconnected via hypertext links is dubbed a web of
information. Publication on the Internet created what Tim Berners-Lee first called the WorldWideWeb (in
its original CamelCase, which was subsequently discarded) in November 1990.[6]
The hyperlink structure of the WWW is described by the webgraph: the nodes of
the webgraph correspond to the web pages (or URLs), and the directed edges between them to the hyperlinks.
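As a rough sketch, the webgraph of three hypothetical pages can be written down as an adjacency list in
JavaScript, with each page mapped to the pages it links to:
// Hypothetical pages; each array holds the targets of that page's hyperlinks
// (the directed edges of the webgraph).
var webgraph = {
  'http://example.org/a': ['http://example.org/b', 'http://example.org/c'],
  'http://example.org/b': ['http://example.org/c'],
  'http://example.org/c': []
};
console.log(webgraph['http://example.org/a'].length); // page /a has two outgoing links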
Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced with
different content. This makes hyperlinks obsolete, a phenomenon referred to in some circles as link
rot, and the hyperlinks affected by it are often called dead links. The ephemeral nature of the Web has
prompted many efforts to archive web sites. The Internet Archive, active since 1996, is the best known of
such efforts.
Dynamic updates of web pages[edit]
Main article: Ajax (programming)
JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then of Netscape,
for use within web pages.[31] The standardised version is ECMAScript.[31] To make web pages more
interactive, some web applications also use JavaScript techniques such as Ajax (asynchronous JavaScript
and XML). Client-side script delivered with the page can make additional HTTP requests to the
server, either in response to user actions such as mouse movements or clicks, or based on elapsed time.
The server's responses are used to modify the current page rather than creating a new page with each
response, so the server needs only to provide limited, incremental information. Multiple Ajax requests can
be handled at the same time, and users can interact with the page while data is being retrieved. Web
pages may also regularly poll the server to check whether new information is available.[32]
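A minimal sketch of the technique, assuming a hypothetical /latest-news resource on the same server and an
element with id "news" in the page, looks like this:
function refreshNews() {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/latest-news', true);   // asynchronous request
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
      // Update part of the current page with the incremental response,
      // instead of loading a whole new page.
      document.getElementById('news').innerHTML = xhr.responseText;
    }
  };
  xhr.send();
}
// Poll the server for new information every 30 seconds.
setInterval(refreshNews, 30000);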
WWW prefix[edit]
Many hostnames used for the World Wide Web begin with www because of the long-standing practice of
naming Internet hosts (servers) according to the services they provide. The hostname for a web server is
often www, in the same way that it may be ftp for an FTP server, and news or nntp for a USENET news
server. These host names appear as Domain Name System (DNS) subdomain names, as
in www.example.com. The use of 'www' as a subdomain name is not required by any technical or policy
standard and many web sites do not use it; indeed, the first ever web server was
called nxoc01.cern.ch.[33] According to Paolo Palazzi,[34] who worked at CERN along with Tim Berners-
Lee, the popular use of the 'www' subdomain was accidental; the World Wide Web project page was intended
to be published at www.cern.ch while info.cern.ch was intended to be the CERN home page. However, the
DNS records were never switched, and the practice of prepending 'www' to an institution's website domain
name was subsequently copied. Many established websites still use 'www', or they invent other
subdomain names such as 'www2', 'secure', etc.[citation needed]. Many such web servers are set up so that
both the domain root (e.g., example.com) and the www subdomain (e.g., www.example.com) refer to the
same site; others require one form or the other, or they may map to different web sites.
The use of a subdomain name is useful for load balancing incoming web traffic by creating a CNAME
record that points to a cluster of web servers. Since, currently, only a subdomain can be used in a
CNAME, the same result cannot be achieved by using the bare domain root. [citation needed]
When a user submits an incomplete domain name to a web browser in its address bar input field, some
web browsers automatically try adding the prefix "www" to the beginning of it and possibly ".com", ".org"
and ".net" at the end, depending on what might be missing. For example, entering 'microsoft' may be
transformed to http://www.microsoft.com/ and 'openoffice' to http://www.openoffice.org. This feature started
appearing in early versions of Mozilla Firefox, when it still had the working title 'Firebird' in early 2003,
from an earlier practice in browsers such as Lynx.[35] It is reported that Microsoft was granted a US patent
for the same idea in 2008, but only for mobile devices. [36]
In English, www is usually read as double-u double-u double-u.[citation needed] Some users pronounce it dub-
dub-dub, particularly in New Zealand. Stephen Fry, in his "Podgrammes" series of podcasts, pronounces
it wuh wuh wuh.[citation needed] The English writer Douglas Adams once quipped in The Independent on
Sunday (1999): "The World Wide Web is the only thing I know of whose shortened form takes three times
longer to say than what it's short for".[citation needed] In Mandarin Chinese, World Wide Web is commonly
translated via a phono-semantic matching to wàn wéi wǎng (万维网), which satisfies www and literally
means "myriad dimensional net",[37] a translation that very appropriately reflects the design concept and
proliferation of the World Wide Web. Tim Berners-Lee's web-space states that World Wide Web is
officially spelled as three separate words, each capitalised, with no intervening hyphens. [38]
Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain names and
make them easily pronounceable.[39] As the mobile web grows in popularity, services
like Gmail.com, MySpace.com, Facebook.com and Twitter.com are most often discussed without adding
www to the domain (or, indeed, the .com).
Web servers[edit]
Main article: Web server
The primary function of a web server is to deliver web pages to clients on request. This means
delivery of HTML documents and any additional content that may be included by a document, such as
images, style sheets and scripts.
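As an illustration only, a very small web server can be sketched in a few lines of Node.js; it answers every
request with the same short HTML document, whereas a real server would map the requested path to files
on disk or to generated content.
const http = require('http');

const server = http.createServer(function (request, response) {
  response.writeHead(200, { 'Content-Type': 'text/html; charset=UTF-8' });
  response.end('<html><body><p>Hello from a very small web server.</p></body></html>');
});

// Port 8080 is used here so the script can run without administrator rights;
// production web servers normally listen on port 80 (HTTP) or 443 (HTTPS).
server.listen(8080);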
Privacy[edit]
Main article: Internet privacy
Every time a web page is requested from a web server the server can identify, and usually it logs, the IP
address from which the request arrived. Equally, unless set not to do so, most web browsers record the
web pages that have been requested and viewed in a history feature, and usually cache much of the
content locally. Unless HTTPS encryption is used, web requests and responses travel in plain text across
the internet and they can be viewed, recorded and cached by intermediate systems.
When a web page asks for, and the user supplies, personally identifiable information such as their real
name, address, e-mail address, etc., then a connection can be made between the current web traffic and
that individual. If the website uses HTTP cookies, username and password authentication, or other
tracking techniques, then it will be able to relate other web visits, before and after, to the identifiable
information provided. In this way it is possible for a web-based organisation to develop and build a profile
of the individual people who use its site or sites. It may be able to build a record for an individual that
includes information about their leisure activities, their shopping interests, their profession, and other
aspects of their demographic profile. These profiles are obviously of potential interest to marketeers,
advertisers and others. Depending on the website's terms and conditions and the local laws that apply,
information from these profiles may be sold, shared, or passed to other organisations without the user
being informed. For many ordinary people, this means little more than some unexpected e-mails in their
in-box, or some uncannily relevant advertising on a future web page. For others, it can mean that time
spent indulging an unusual interest can result in a deluge of further targeted marketing that may be
unwelcome. Law enforcement, counter terrorism and espionage agencies can also identify, target and
track individuals based on what appear to be their interests or proclivities on the web.
Social networking sites make a point of trying to get the user to truthfully expose their real names,
interests and locations. This makes the social networking experience more realistic and therefore
engaging for all their users. On the other hand, uploaded photographs and unguarded statements can be
identified with the individual, who may come to regret having published them. Employers,
schools, parents and other relatives may be influenced by aspects of social networking profiles that the
posting individual did not intend for these audiences. On-line bullies may make use of personal
information to harass or stalk users. Modern social networking websites allow fine grained control of the
privacy settings for each individual posting, but these can be complex and not easy to find or use,
especially for beginners.[40]
Photographs and videos posted onto websites have caused particular problems, as they can add a
person's face to an on-line profile. With modern and potential facial recognition technology, it may then be
possible to relate that face with other, previously anonymous, images, events and scenarios that have
been imaged elsewhere. Because of image caching, mirroring and copying, it is difficult to remove an
image from the World Wide Web.
Intellectual property[edit]
Main article: Intellectual property
The intellectual property rights for any creative work initially rests with its creator. Web users who want to
publish their work onto the World Wide Web, however, need to be aware of the details of the way they do
it. If artwork, photographs, writings, poems, or technical innovations are published by their creator onto a
privately owned web server, then they may choose the copyright and other conditions freely themselves.
This is unusual though; more commonly work is uploaded to websites and servers that are owned by
other organizations. It depends upon the terms and conditions of the site or service provider to what
extent the original owner automatically signs over rights to their work by the choice of destination and by
the act of uploading.[citation needed]
Some users of the web erroneously assume that everything they may find online is freely available to
them as if it were in the public domain, which is not always the case. Content owners who are aware of this
widespread belief may expect that their published content will probably be used in some capacity
somewhere without their permission. Some content publishers therefore embed digital watermarks in their
media files, sometimes charging users to receive unmarked copies for legitimate use. Digital rights
management includes forms of access control technology that further limit the use of digital content even
after it has been bought or downloaded.[citation needed]
Security[edit]
The web has become criminals' preferred pathway for spreading malware. Cybercrime carried out on the
web can include identity theft, fraud, espionage and intelligence gathering.[41] Web-based
vulnerabilities now outnumber traditional computer security concerns,[42][43] and as measured
by Google, about one in ten web pages may contain malicious code.[44] Most web-based attacks take
place on legitimate websites, and most, as measured by Sophos, are hosted in the United States, China
and Russia.[45] The most common of all malware threats are SQL injection attacks against websites.[46]
Through HTML and URIs the web was vulnerable to attacks like cross-site scripting (XSS) that came
with the introduction of JavaScript[47] and were exacerbated to some degree by Web 2.0 and Ajax web
design that favors the use of scripts.[48] Today by one estimate, 70% of all websites are open to XSS
attacks on their users.[49]
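The root of the problem can be sketched in a couple of lines: if a page inserts untrusted text into its own
markup, any script smuggled inside that text runs with the page's privileges. The element id and the
attacker string below are purely illustrative.
var comment = '<img src=x onerror="alert(\'XSS\')">';  // hypothetical untrusted input

// Dangerous: the string is parsed as HTML, so the onerror handler executes.
// document.getElementById('comments').innerHTML = comment;

// Safer: textContent treats the input as plain text, so nothing executes.
document.getElementById('comments').textContent = comment;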
Proposed solutions vary to extremes. Large security vendors like McAfee already design governance and
compliance suites to meet post-9/11 regulations,[50] and some, like Finjan, have recommended active real-
time inspection of code and all content regardless of its source.[41] Some have argued that for enterprises
to see security as a business opportunity rather than a cost center,[51] "ubiquitous, always-on digital rights
management" enforced in the infrastructure by a handful of organizations must replace the hundreds of
companies that today secure data and networks.[52] Jonathan Zittrain has said users sharing responsibility
for computing safety is far preferable to locking down the Internet. [53]
Standards[edit]
Main article: Web standards
Many formal standards and other technical specifications and software define the operation of different
aspects of the World Wide Web, the Internet, and computer information exchange. Many of the
documents are the work of the World Wide Web Consortium (W3C), headed by Berners-Lee, but some
are produced by the Internet Engineering Task Force (IETF) and other organizations.
Usually, when web standards are discussed, the following publications are seen as foundational: the W3C
recommendations for markup languages (especially HTML and XHTML) and for stylesheets (especially CSS),
the Ecma International standard for ECMAScript (usually in the form of JavaScript), and the W3C
recommendations for the Document Object Model.
Additional publications provide definitions of other essential technologies for the World Wide Web,
including, but not limited to, the IETF specifications for the Uniform Resource Identifier (URI), a universal
system for referencing resources on the Internet, and for the Hypertext Transfer Protocol (HTTP).
Accessibility[edit]
Main article: Web accessibility
There are methods available for accessing the web in alternative mediums and formats, so as to enable
use by individuals with disabilities. These disabilities may be visual, auditory, physical, speech related,
cognitive, neurological, or some combination thereof. Accessibility features also help people with temporary
disabilities, such as a broken arm, and ageing users as their abilities change.[54] The Web is used for
receiving information as well as providing information and interacting with society. The World Wide Web
Consortium considers it essential that the Web be accessible in order to provide equal access and equal
opportunity to people with disabilities.[55] Tim Berners-Lee once noted, "The power of the Web is in its
universality. Access by everyone regardless of disability is an essential aspect." [54] Many countries
regulate web accessibility as a requirement for websites.[56] International cooperation in the W3C Web
Accessibility Initiative led to simple guidelines that web content authors as well as software developers
can use to make the Web accessible to persons who may or may not be using assistive technology.[54][57]
Internationalization[edit]
The W3C Internationalization Activity assures that web technology will work in all languages, scripts, and
cultures.[58] Beginning in 2004 or 2005, Unicode gained ground and eventually, in December 2007, it
surpassed both ASCII and Western European encodings as the Web's most frequently used character
encoding.[59]
Originally RFC 3986 allowed resources to be identified by URI in a subset of US-ASCII. RFC
3987 allows more characters—any character in the Universal Character Set—and now a resource can be
identified by IRI in any language.[60]
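The mapping from an IRI to the ASCII-only URI syntax works by percent-encoding the UTF-8 bytes of each
non-ASCII character; a one-line sketch with a hypothetical page name:
var iri = 'http://example.org/wiki/万维网';   // hypothetical page named in Chinese
console.log(encodeURI(iri));
// -> http://example.org/wiki/%E4%B8%87%E7%BB%B4%E7%BD%91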
Statistics[edit]
Between 2005 and 2010, the number of web users doubled, and was expected to surpass two billion in
2010.[61] Early studies in 1998 and 1999 estimating the size of the web using capture/recapture methods
showed that much of the web was not indexed by search engines and the web was much larger than
expected.[62][63] According to a 2001 study, there were over 550 billion documents
on the Web, mostly in the invisible Web, or Deep Web.[64] A 2002 survey of 2,024 million web
pages[65] determined that by far the most web content was in the English language: 56.4%; next were
pages in German (7.7%), French (5.6%), and Japanese (4.9%). A more recent study, which used web
searches in 75 different languages to sample the web, determined that there were over 11.5 billion web
pages in the publicly indexable web as of the end of January 2005.[66] As of March 2009, the indexable
web contains at least 25.21 billion pages.[67] On 25 July 2008, Google software engineers Jesse Alpert
and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs.[68] As of May
2009, over 109.5 million domains operated.[69][not in citation given] Of these, 74% were commercial or other
domains operating in the .com generic top-level domain.[69]
Statistics measuring a website's popularity are usually based either on the number of page views or on
associated server 'hits' (file requests) that it receives.
Speed issues[edit]
Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow
browsing has led to a pejorative name for the World Wide Web: the World Wide Wait.[70] Speeding up the
Internet is an ongoing discussion over the use of peering and QoS technologies. Other solutions to
reduce the congestion can be found at W3C.[71] Guidelines for web response times are:[72]
0.1 second (one tenth of a second): ideal response time; the user does not sense any interruption.
1 second: highest acceptable response time; download times above 1 second interrupt the user experience.
Caching[edit]
Main article: Web cache
If a user revisits a web page after only a short interval, the page data may not need to be re-obtained from
the source web server. Almost all web browsers cache recently obtained data, usually on the local hard
drive. HTTP requests sent by a browser will usually ask only for data that has changed since the last
download. If the locally cached data are still current, they will be reused. Caching helps reduce the
amount of web traffic on the Internet. The decision about expiration is made independently for each
downloaded file, whether image, stylesheet, JavaScript, HTML, or other web resource. Thus even on
sites with highly dynamic content, many of the basic resources need to be refreshed only occasionally.
Web site designers find it worthwhile to collate resources such as CSS data and JavaScript into a few
site-wide files so that they can be cached efficiently. This helps reduce page download times and lowers
demands on the Web server.
There are other components of the Internet that can cache web content. Corporate and
academic firewalls often cache Web resources requested by one user for the benefit of all. (See
also caching proxy server.) Some search engines also store cached content from websites. Apart from the
facilities built into web servers that can determine when files have been updated and so need to be re-
sent, designers of dynamically generated web pages can control the HTTP headers sent back to
requesting users, so that transient or sensitive pages are not cached. Internet banking and news sites
frequently use this facility. Data requested with an HTTP 'GET' is likely to be cached if other conditions
are met; data obtained in response to a 'POST' is assumed to depend on the data that was POSTed and
so is not cached.
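For example, a dynamically generated page can be marked as uncacheable by sending the appropriate
headers; a minimal Node.js sketch (the page content is illustrative) might look like this:
const http = require('http');

http.createServer(function (request, response) {
  response.writeHead(200, {
    'Content-Type': 'text/html; charset=UTF-8',
    'Cache-Control': 'no-store'   // browsers and proxies must not keep a copy
  });
  response.end('<html><body><p>Generated at ' + new Date().toISOString() +
               '</p></body></html>');
}).listen(8080);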
"Search engine" redirects here. For a tutorial on using search engines for research, see WP:Search engine test. For other uses,
A web search engine is a software system that is designed to search for information on the World Wide Web. The search results
are generally presented in a line of results often referred to as search engine results pages (SERPs). The information may be a
mix of web pages, images, and other types of files. Some search engines also mine data available
in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also
maintain real-time information by running an algorithm on a web crawler.
History[edit]
Search engines that have launched since the early days of the Web include the following. Still active: Lycos, Daum, Excite, SAPO,
Yandex, Naver, Teoma, Exalead, Gigablast, Sogou, AOL Search (launched 2005), Ask.com, GoodSearch, Quaero, ChaCha,
DuckDuckGo, Goby and NATE. No longer active: Aliweb, JumpStation, Infoseek, Magellan, Vivisimo, Scroogle, A9.com, SearchMe,
Sproose, Picollator, Viewzi, Boogami, LeapFish, Yebol and Cuil.
During early development of the web, there was a list of webservers edited by Tim Berners-Lee and hosted on
the CERN webserver. One historical snapshot of the list in 1992 remains,[1] but as more and more webservers went online the
central list could no longer keep up. On the NCSA site, new servers were announced under the title "What's New!"[2]
The very first tool used for searching on the Internet was Archie.[3] The name stands for "archive" without the "v". It was created in
1990 by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal. The program
downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a
searchable database of file names; however, Archie did not index the contents of these sites since the amount of data was so limited
it could be readily searched manually.
The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search
programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica
(Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in
the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu
information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic
book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.
In the summer of 1993, no search engine existed for the web, though numerous specialized catalogues were maintained by
hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically mirrored these pages and rewrote
them into a standard format. This formed the basis for W3Catalog, the web's first primitive search engine, released on September 2,
1993.[4]
In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web
Wanderer, and used it to generate an index called 'Wandex'. The purpose of the Wanderer was to measure the size of the World
Wide Web, which it did until late 1995. The web's second search engine Aliweb appeared in November 1993. Aliweb did not use
a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a
particular format.
JumpStation (created in December 1993[5] by Jonathon Fletcher) used a web robot to find web pages and to build its index, and
used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three
essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources
available on the platform it ran on, its indexing and hence searching were limited to the titles and headings found in the web pages
the crawler encountered.
One of the first "all text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it
allowed users to search for any word in any webpage, which has become the standard for all major search engines since. It was
also the first one widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and
became a major commercial endeavor.
Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern
Light, andAltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function
operated on its web directory, rather than its full-text copies of web pages. Information seekers could also browse the directory
instead of doing a keyword-based search. Google adopted the idea of selling search terms in 1998, from a small search engine
company named goto.com. This move had a significant effect on the search engine business, which went from struggling to one of
the most profitable businesses in the internet.[6]
In 1996, Netscape was looking to give a single search engine an exclusive deal as the featured search engine on Netscape's web
browser. There was so much interest that instead Netscape struck deals with five of the major search engines: for $5 million a year,
each search engine would be in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos,
Infoseek, and Excite.
Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s.
[9]
Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken
down their public search engine, and are marketing enterprise-only editions, such as Northern Light. Many search engine
companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.
Around 2000, Google's search engine rose to prominence.[10] The company achieved better results for many searches with an
innovation called PageRank. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and
pages that link there, on the premise that good or desirable pages are linked to more than others. Google also maintained a
minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal. In fact,
Google search engine became so popular that spoof engines emerged such as Mystery Seeker.
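The idea behind PageRank can be conveyed with a textbook-style sketch (this is not Google's actual implementation): each page's
score is repeatedly redistributed along its outgoing links, with a damping factor, until the scores settle.
var links = { a: ['b', 'c'], b: ['c'], c: ['a'] };   // a tiny hypothetical webgraph
var pages = Object.keys(links);
var d = 0.85;                                        // conventional damping factor
var rank = {};
pages.forEach(function (p) { rank[p] = 1 / pages.length; });

for (var iter = 0; iter < 20; iter++) {
  var next = {};
  pages.forEach(function (p) { next[p] = (1 - d) / pages.length; });
  pages.forEach(function (p) {
    links[p].forEach(function (target) {
      next[target] += d * rank[p] / links[p].length;
    });
  });
  rank = next;
}
console.log(rank);  // pages linked to by many well-ranked pages end up with the highest scores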
By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002,
and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! switched to Google's search engine until 2004, when it
launched its own search engine based on the combined technologies of its acquisitions.
Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display
listings from Looksmart, blended with results from Inktomi. For a short time in 1999, MSN Search used results from AltaVista
instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).
Microsoft's rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal
in which Yahoo! Search would be powered by Microsoft Bing technology.
In 2012, following the April 24 release of Google Drive, Google released the Beta version of Open Drive (available as a Chrome
app) to enable the search of files in the cloud. Open Drive has now been rebranded as Cloud Kite. Cloud Kite is advertised as a
"collective encyclopedia project based on Google Drive public files and on the crowd sharing, crowd sourcing and crowd-solving
principles". Cloud Kite also returns search results from other cloud storage content services, including Dropbox and SkyDrive.
How web search engines work[edit]
A search engine maintains the following processes in near real time:
1. Web crawling
2. Indexing
3. Searching[12]
Web search engines work by storing information about many web pages, which they retrieve from the HTML markup of the pages.
These pages are retrieved by a Web crawler (sometimes also known as a spider), an automated program that follows every
link on the site. The site owner can exclude specific pages by using robots.txt.
The search engine then analyzes the contents of each page to determine how it should be indexed (for example, words can be
extracted from the titles, page content, headings, or special fields called meta tags). Data about web pages are stored in an index
database for use in later queries. A query from a user can be a single word. The index helps find information relating to the query as
quickly as possible.[12] Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as
information about the web pages, whereas others, such as AltaVista, store every word of every page they find.[citation needed] This
cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the
content of the current page has been updated and the search terms are no longer in it.[12] This problem might be considered a mild
form of link rot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the
returned webpage. This satisfies the principle of least astonishment, since the user normally expects that the search terms will be on
the returned pages. Increased search relevance makes these cached pages very useful, as they may contain data that may no
longer be available elsewhere.
When a user enters a query into a search engine (typically by using keywords), the engine examines its index and provides a listing
of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes
parts of the text. The index is built from the information stored with the data and the method by which the information is indexed.
[12]
From 2007 the Google.com search engine has allowed one to search by date by clicking 'Show search tools' in the leftmost
column of the initial search results page, and then selecting the desired date range.[citation needed] Most search engines support the use
of the boolean operators AND, OR and NOT to further specify the search query. Boolean operators are for literal searches that allow
the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search
engines provide an advanced feature called proximity search, which allows users to define the distance between keywords.[12] There
is also concept-based searching, where the research involves using statistical analysis on pages containing the words or phrases
you search for. As well, natural language queries allow the user to type a question in the same form one would ask it to a human. A
site like this would be ask.com.
The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web
pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most
search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are
the best matches, and what order the results should be shown in, varies widely from one engine to another. [12] The methods also
change over time as Internet usage changes and new techniques evolve. There are two main types of search engine that have
evolved: one is a system of predefined and hierarchically ordered keywords that humans have programmed extensively. The other
is a system that generates an "inverted index" by analyzing texts it locates. This second form relies much more heavily on the
computer itself to do the bulk of the work.
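A toy version of such an inverted index, over two hypothetical documents, can be built in a few lines; each word maps to the list
of documents that contain it:
var documents = {
  page1: 'the world wide web',
  page2: 'the web of documents'
};

var invertedIndex = {};
Object.keys(documents).forEach(function (id) {
  documents[id].split(/\s+/).forEach(function (word) {
    if (!invertedIndex[word]) { invertedIndex[word] = []; }
    if (invertedIndex[word].indexOf(id) === -1) { invertedIndex[word].push(id); }
  });
});

console.log(invertedIndex['web']);  // -> ['page1', 'page2']: both documents match a query for "web"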
Most Web search engines are commercial ventures supported by advertising revenue and thus some of them allow advertisers
to have their listings ranked higher in search results for a fee. Search engines that do not accept money for their search results
make money by running search related ads alongside the regular search engine results. The search engines make money every
time someone clicks on one of these ads.
Market share[edit]
Google's worldwide market share peaked at 86.3% in April 2010.[15] Yahoo!, Bing and other search engines are more popular in the
US than in Europe.
According to Hitwise, market share in the USA for October 2011 was Google 65.38%, Bing-powered (Bing and Yahoo!) 28.62%, and
the remaining 66 search engines 6%. However, an Experian Hitwise report released in August 2011 gave the "success rate" of
searches sampled in July. Over 80 percent of Yahoo! and Bing searches resulted in the users visiting a web site, while Google's rate
was just under 68 percent.
In the People's Republic of China, Baidu held a 61.6% market share for web search in July 2009.[18] In the Russian
Federation, Yandex holds around 60% of the market share as of April 2012.[19] As of July 2013, Google held an 84% global and an
88% US market share for web search.[20] In South Korea, Naver (Hangul: 네이버) is a popular search portal that holds a dominant
share of the domestic search market.
Although search engines are programmed to rank websites based on some combination of their popularity and relevancy, empirical
studies indicate various political, economic, and social biases in the information they provide.[23][24] These biases can be a direct
result of economic and commercial processes (e.g., companies that advertise with a search engine can become also more popular
in its organic search results), and political processes (e.g., the removal of search results to comply with local laws).[25]
Biases can also be a result of social processes, as search engine algorithms are frequently designed to exclude non-normative
viewpoints in favor of more "popular" results.[26] Indexing algorithms of major search engines skew towards coverage of U.S.-based
sites, rather than websites from non-U.S. countries.[24] Major search engines' search algorithms also privilege misinformation and
pornographic portrayals of women, people of color, and members of the LGBT community.[27][28]
Google Bombing is one example of an attempt to manipulate search results for political, social or commercial reasons.
Many search engines such as Google and Bing provide customized results based on the user's activity history. This leads to an
effect that has been called a filter bubble. The term describes a phenomenon in which websites use algorithms to selectively guess
what information a user would like to see, based on information about the user (such as location, past click behaviour and search
history). As a result, websites tend to show only information that agrees with the user's past viewpoint, effectively isolating the user
in a bubble that tends to exclude contrary information. Prime examples are Google's personalized search results and Facebook's
personalized news stream. According to Eli Pariser, who coined the term, users get less exposure to conflicting viewpoints and are
isolated intellectually in their own informational bubble. Pariser related an example in which one user searched Google for "BP" and
got investment news about British Petroleum while another searcher got information about the Deepwater Horizon oil spill and that
the two search results pages were "strikingly different."[29][30][31] The bubble effect may have negative implications for civic discourse,
according to Pariser.[32]
Since this problem has been identified, competing search engines have emerged that seek to avoid it by not
tracking or "bubbling" users, such as DuckDuckGo.
This Euler diagram shows that a uniform resource identifier (URI) is either a uniform resource locator (URL), or a uniform
resource name (URN), or both.
A uniform resource locator, abbreviated URL, also known as web address, is a specific character
string that constitutes a reference to a resource. In most web browsers, the URL of a web page is
displayed on top inside an address bar. An example of a typical URL would
be "http://en.example.org/wiki/Main_Page". A URL is technically a type of uniform resource identifier (URI),
but in many technical documents and verbal discussions, URL is often used as a synonym for URI, and
this is not considered a problem.[1]
History[edit]
The Uniform Resource Locator was standardized in 1994 [2] by Tim Berners-Lee and the URI working
group of the Internet Engineering Task Force (IETF) as an outcome of collaboration started at the IETF
Living Documents "Birds of a Feather" session in 1992.[3][4] The format combines the pre-existing system
of domain names (created in 1985) with file path syntax, where slashes are used to
separate directory and file names. Conventions already existed where server names could be prepended
to complete file paths, preceded by a double-slash (//). [5]
Berners-Lee later regretted the use of dots to separate the parts of the domain name within URIs, wishing
he had used slashes throughout.[5] For example, http://www.example.com/path/to/name would
have been written http:com/example/www/path/to/name. Berners-Lee has also said that, given the
colon following the URI scheme, the two slashes before the domain name were also unnecessary. [6]
Syntax[edit]
Main article: URI scheme#Generic syntax
The scheme says how to connect, the host specifies where to connect, and the remainder
specifies what to ask for.
For programs such as Common Gateway Interface (CGI) scripts, this is followed by a query string,[7][8] and
an optional fragment identifier.[9]
The scheme name defines the namespace, purpose, and the syntax of
the remaining part of the URL. Software will try to process a URL
according to its scheme and context. For example, aweb browser will
usually dereference the URL http://example.org:80 by
performing an HTTP request to the host at example.org, using port
number 80. The URL mailto:[email protected] may start an e-
mail composer with the address [email protected] in the To field.
Other examples of scheme names include https, gopher, wais, ftp. URLs with https as a scheme (such
as https://example.com/) require that requests and responses will be made over a secure
connection to the website. Some schemes that require authentication allow a username, and perhaps a
password too, to be embedded in the URL, for example ftp://[email protected]. Passwords
embedded in this way are not conducive to security, but the full possible syntax is
scheme://username:password@domain:port/path?query_string#fragment_id
The port number, given in decimal, is optional; if omitted, the default for
the scheme is used.
The path is used to specify and perhaps find the resource requested. It
is case-sensitive,[10] though it may be treated as case-insensitive by
some servers, especially those based on Microsoft Windows.
The query string contains data to be passed to software running on the server. It may contain name/value
pairs separated by ampersands, for example ?first_name=John&last_name=Doe.
The following characters are reserved in URLs, having special meanings in some contexts; a percent-encoded
form is used when they are meant literally:
! * ' ( ) ; : @ & = + $ , / ? % # [ ]
Further details can be found, for example, in RFC 3986 and at http://www.w3.org/Addressing/URL/uri-
spec.html.
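Modern browsers and Node.js expose this decomposition directly through the standard URL constructor; a
short sketch with a made-up address:
var url = new URL('http://example.org:8080/wiki/Main_Page?first_name=John&last_name=Doe#history');
console.log(url.protocol);   // 'http:'  (the scheme)
console.log(url.hostname);   // 'example.org'
console.log(url.port);       // '8080'
console.log(url.pathname);   // '/wiki/Main_Page'
console.log(url.search);     // '?first_name=John&last_name=Doe'
console.log(url.hash);       // '#history'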
Relationship to URI[edit]
See also: URIs, Relationship to URL and URN
A URL is a URI that, in addition to identifying a web resource, provides a means of locating the resource
by describing its "primary access mechanism (e.g., its network location)". [11]
Internet hostnames[edit]
Main article: Hostname
On the Internet, a hostname is a domain name assigned to a host computer. This is usually a combination
of the host's local name with its parent domain's name. For example, en.example.org consists of a local
hostname (en) and the domain name example.org. The hostname is translated into an IP address via the
local hosts file, or the domain name system (DNS) resolver. It is possible for a single host computer to
have several hostnames; but generally the operating system of the host prefers to have one hostname
that the host uses for itself.
Any domain name can also be a hostname, as long as the restrictions on valid host names are followed. For
example, both "en.example.org" and "example.org" can be hostnames if they both have IP
addresses assigned to them. The domain name "xyz.example.org" may not be a hostname if it does not
have an IP address, but "aa.xyz.example.org" may still be a hostname. All hostnames are domain names,
but not all domain names are hostnames.
Modern usage[edit]
Major computer manufacturers such as Apple have begun to deprecate APIs that take local paths as
parameters, in favour of using URLs.[12] This is because remote and local resources (via the file:// scheme)
may both be represented using a URL, but may additionally provide a protocol (particularly useful for
remote items) and credentials.
Dynamic HTML
Dynamic HTML, or DHTML, is an umbrella term for a collection of technologies used together to create
interactive and animated web sites[1] by using a combination of a static markup language (such as HTML),
a client-side scripting language (such as JavaScript), a presentation definition language (such as CSS),
and the Document Object Model.[2]
DHTML allows scripting languages to change variables in a web page's definition language, which in turn
affects the look and function of otherwise "static" HTML page content, after the page has been fully
loaded and during the viewing process. Thus the dynamic characteristic of DHTML is the way it functions
while a page is viewed, not in its ability to generate a unique page with each page load.
By contrast, a dynamic web page is a broader concept, covering any web page generated differently for
each user, load occurrence, or specific variable values. This includes pages created by client-side
scripting, and ones created by server-side scripting (such as PHP, Perl, JSP or ASP.NET) where the web
server generates content before sending it to the client.
DHTML is differentiated from Ajax by the fact that a DHTML page is still request/reload-based. With
DHTML, there may not be any interaction between the client and server after the page is loaded; all
processing happens in JavaScript on the client side. By contrast, an Ajax page uses features of DHTML
to initiate a request (or 'subrequest') to the server to perform actions such as loading more content.
Uses[edit]
DHTML allows authors to add effects to their pages that are otherwise difficult to achieve. In short, a
scripting language changes the DOM and the styling of the page. For example, DHTML allows the page author to:
Use a form to capture user input, and then process, verify and respond
to that data without having to send data back to the server.
A less common use is to create browser-based action games. Although a number of games were created
using DHTML during the late 1990s and early 2000s,[citation needed], differences between browsers made this
difficult: many techniques had to be implemented in code to enable the games to work on multiple
platforms. Recently, browsers have been converging towards web standards, which has made the
design of DHTML games more viable. Those games can be played on all major browsers and they can
also be ported to Plasma for KDE, Widgets for Mac OS X and Gadgets for Windows Vista, which are
based on DHTML code.
The term "DHTML" has fallen out of use in recent years as it was associated with practices and
conventions that tended to not work well between various web browsers. DHTML may now be referred to
as unobtrusive JavaScript coding (DOM Scripting), in an effort to place an emphasis on agreed-upon best
practices while allowing similar effects in an accessible, standards-compliant way.
DHTML support with extensive DOM access was introduced with Internet Explorer 4.0. Although there
was a basic dynamic system with Netscape Navigator 4.0, not all HTML elements were represented in the
DOM. When DHTML-style techniques became widespread, varying degrees of support among web
browsers for the technologies involved made them difficult to develop and debug. Development became
easier when Internet Explorer 5.0+, Mozilla Firefox 2.0+, and Opera 7.0+ adopted a
shared DOM inherited from ECMAScript.
More recently, JavaScript libraries such as jQuery have abstracted away much of the day-to-day
difficulties in cross-browser DOM manipulation.
A basic example registers a function to run once the page has finished loading:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>DHTML example</title>
</head>
<body>
<div id="navigation"></div>
<script>
var init = function () {
    var myObj = document.getElementById("navigation");
    // ... manipulate myObj
};
window.onload = init;
</script>
<!--
Often the code is stored in an external file; this is done
by linking the file that contains the JavaScript.
This is helpful when several pages use the same script:
-->
<script src="myjavascript.js"></script>
</body>
</html>
Example: Displaying an additional block of text[edit]
The following code illustrates an often-used function. An additional part of a web page will only be
displayed if the user requests it.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Using a DOM function</title>
<style>
a {background-color:#eee;}
a:hover {background:#ff0;}
#toggleMe {background:#cfc; display:none; margin:30px 0;
padding:1em;}
</style>
</head>
<body>
<h1>Using a DOM function</h1>
<!-- Elements referenced by the script below -->
<h2><a id="showhide" href="#">Show paragraph</a></h2>
<p id="toggleMe">This is the paragraph that is only displayed on request.</p>
<p>The general flow of the document continues.</p>
<script>
changeDisplayState = function (id) {
var d = document.getElementById('showhide'),
e = document.getElementById(id);
if (e.style.display === 'none' || e.style.display === '') {
e.style.display = 'block';
d.innerHTML = 'Hide paragraph';
} else {
e.style.display = 'none';
d.innerHTML = 'Show paragraph';
}
};
document.getElementById('showhide').onclick = function () {
changeDisplayState('toggleMe');
return false;
};
</script>
</body>
</html>
Document Object Model[edit]
DHTML is not a technology in and of itself; rather, it is the product of three related and complementary
technologies: HTML, Cascading Style Sheets (CSS), and JavaScript. To allow scripts and components to
access features of HTML and CSS, the contents of the document are represented as objects in a
programming model known as the Document Object Model (DOM).
The DOM API is the foundation of DHTML, providing a structured interface that allows access and
manipulation of virtually anything in the document. The HTML elements in the document are available as
a hierarchical tree of individual objects, meaning you can examine and modify an element and its
attributes by reading and setting properties and by calling methods. The text between elements is also
available through DOM properties and methods.
The DOM also provides access to user actions such as pressing a key and clicking the mouse. You can
intercept and process these and other events by creating event handler functions and routines. The event
handler receives control each time a given event occurs and can carry out any appropriate action,
including using the DOM to change the document.
Dynamic styles[edit]
Dynamic styles are a key feature of DHTML. By using CSS, you can quickly change the appearance and
formatting of elements in a document without adding or removing elements. This helps keep your
documents small and the scripts that manipulate the document fast.
The object model provides programmatic access to styles. This means you can change inline styles on
individual elements and change style rules using simple JavaScript programming.
Inline styles are CSS style assignments that have been applied to an element using the style attribute.
You can examine and set these styles by retrieving the style object for an individual element. For
example, to highlight the text in a heading and reveal a hidden list when the user moves the mouse pointer
over the heading, you can use the style object to change the heading's color and the list's display property,
as shown in the following simple example.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Dynamic Styles</title>
<style>
ul {display:none;}
</style>
</head>
<body>
<h1>Welcome to Dynamic HTML</h1>
<ul>
<li>Change the color, size, and typeface of text</li>
<li>Show and hide text</li>
<li>And much, much more</li>
</ul>
<script>
var showMe = function () {
    document.getElementsByTagName("h1")[0].style.color = "#990000";
    document.getElementsByTagName("ul")[0].style.display = "block";
};
// Run the style changes when the pointer moves over the heading.
document.getElementsByTagName("h1")[0].onmouseover = showMe;
</script>
</body>
</html>
Data binding[edit]
Another practical use of data binding is to bind one or more elements in the document to specific fields of
a given record. When the page is viewed, the elements are filled with text and data from the fields in that
record, sometimes called the "current" record. An example is a form letter in which the name, e-mail
address, and other details about an individual are filled from a database. To adapt the letter for a given
individual, you specify which record should be the current record. No other changes to the letter are
needed.
Yet another practical use is to bind the fields in a form to fields in a record. Not only can the user view the
content of the record, but the user can also change that content by changing the settings and values of
the form. The user can then submit these changes so that the new data is uploaded to the source—for
example, to the HTTP server or database.
To provide data binding in your documents, you must add a data source object (DSO) to your document.
This invisible object is an ActiveX control or Java applet that knows how to communicate with the data
source. The following example shows how easy it is to bind a table to a DSO. When viewed, this example
displays the first three fields from all the comma-delimited records of the file "sampdata.csv" in a clear,
easy-to-read table.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Data Binding Example</title>
<style>
td, th {border:1px solid;}
</style>
</head>
<body>
<h1>Data Binding Example</h1>
<object classid="clsid:333C7BC4-460F-11D0-BC04-0080C7055A83"
id="sampdata">
<param name="DataURL" value="sampdata.csv">
<param name="UseHeader" value="True">
</object>
<table datasrc="#sampdata">
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>C</th>
</tr>
</thead>
<!-- Fields will not display without the accompanying CSV file.
-->
<tbody>
<tr>
<td><span datafld="a"></span></td>
<td><span datafld="b"></span></td>
<td><span datafld="c"></span></td>
</tr>
</tbody>
</table>
</body>
</html>
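The data source object in the example above is an ActiveX control, so the technique only works in older versions of Internet Explorer. For comparison, here is a rough sketch of the same effect using only standard DOM scripting; it assumes the same comma-delimited sampdata.csv file with a header line naming the fields a, b and c:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Data Binding Without a DSO</title>
<style>
td, th {border:1px solid;}
</style>
</head>
<body>
<h1>Data Binding Without a DSO</h1>
<table id="sampdata">
<thead>
<tr><th>A</th><th>B</th><th>C</th></tr>
</thead>
<tbody></tbody>
</table>
<script>
// Fetch the CSV, split it into rows and fields, and build one table row per record.
fetch("sampdata.csv")
.then(function (response) { return response.text(); })
.then(function (text) {
var tbody = document.querySelector("#sampdata tbody");
var rows = text.trim().split("\n").slice(1); // skip the header line
rows.forEach(function (line) {
var tr = document.createElement("tr");
line.split(",").slice(0, 3).forEach(function (field) {
var td = document.createElement("td");
td.appendChild(document.createTextNode(field));
tr.appendChild(td);
});
tbody.appendChild(tr);
});
});
</script>
</body>
</html>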
History of Wikipedia
From Wikipedia, the free encyclopedia
The English edition of Wikipedia has grown to 4,395,990 articles, equivalent to over 1,900 print volumes of the Encyclopaedia
Britannica. Including all language editions, Wikipedia has over 30.2 million articles,[1] equivalent to over 13,000 print
volumes.
Wikipedia was formally launched on 15 January 2001 by Jimmy Wales and Larry Sanger, but its technological and conceptual
underpinnings predate this. The earliest known proposal for an online encyclopedia was made by Rick Gates in 1993,[2] but the
concept of a free-as-in-freedom online encyclopedia (as distinct from mere open source or freemium)[3] was proposed by Richard
Stallman in 1999.[4] Crucially, Stallman's concept specifically included the idea that no central organization should control editing. This latter "massively
multiplayer" characteristic was in stark contrast to contemporary digital encyclopedias such as Microsoft Encarta, Encyclopedia
Britannica and even Bomis's Nupedia, which was Wikipedia's direct predecessor. In 2001, the license for Nupedia was changed
to GFDL, and Wales and Sanger launched Wikipedia using the concept and technology of a wiki pioneered in 1995 by Ward
Cunningham.[5] Initially, Wikipedia was intended to complement Nupedia, an online encyclopedia project edited solely by experts, by
providing additional draft articles and ideas for it. In practice, Wikipedia quickly overtook Nupedia, becoming a global project in
multiple languages and inspiring a wide range of other online reference projects.
As of December 2013, Wikipedia includes over 30.3 million freely usable articles in 287 languages[1] that have been written by over
43 million registered users and numerous anonymous contributors worldwide.[6][7][8] According to Alexa Internet, Wikipedia is now the
fifth-most-popular website as of December 2012. According to AllThingsD,[1] Wikipedia receives over 85 million monthly unique visitors.
Contents
1 Historical overview
o 1.1 Background
o 1.6 Organization
2 Timeline
o 2.1 2000
o 2.2 2001
o 2.3 2002
o 2.4 2003
o 2.5 2004
o 2.6 2005
o 2.7 2006
o 2.8 2007
o 2.9 2008
o 2.10 2009
o 2.11 2010
o 2.12 2011
o 2.13 2012
o 2.14 2013
o 3.6 Fundraising
o 3.9 Controversies
o 3.12 Lawsuits
4 See also
5 References
6 External links
Historical overview[edit]
Background[edit]
The concept of the world's knowledge in a single location dates to the ancient Libraries of Alexandria and Pergamum, but the
modern concept of a general-purpose, widely distributed, printed encyclopedia originated with Denis Diderot and the 18th-century
French encyclopedists. The idea of using automated machinery beyond the printing press to build a more useful encyclopedia can
be traced to Paul Otlet's book Traité de documentation (1934; Otlet also founded the Mundaneum institution in 1910), H. G. Wells'
book of essays World Brain (1938) and Vannevar Bush's future vision of the microfilm based Memex in As We May Think (1945).
[10]
Another milestone was Ted Nelson's hypertext design Project Xanadu, begun in 1960.[10]
While previous encyclopedias, notably the Encyclopædia Britannica, were book-based, Microsoft's Encarta, published in 1993, was
available on CD-ROM and hyperlinked. With the development of the web, many people attempted to develop Internet encyclopedia
projects. An early proposal was Interpedia in 1993 by Rick Gates;[2] but this project died before generating any encyclopedic
content. Free software proponent Richard Stallman described the usefulness of a "Free Universal Encyclopedia and Learning
Resource" in 1999.[4] His published document "aims to lay out what the free encyclopedia needs to do, what sort of freedoms it
needs to give the public, and how we can get started on developing it." On 17 January 2001, two days after the start of Wikipedia,
the Free Software Foundation's (FSF) GNUPedia project went online, competing with Nupedia,[11] but today the FSF encourages
people to contribute to Wikipedia instead. Wikipedia itself grew out of Nupedia, whose server and bandwidth were
volunteered by Bomis, a web-advertising firm owned by Jimmy Wales, Tim Shell and Michael E. Davis.[13][14][15] Nupedia was founded
upon the use of highly qualified volunteer contributors and an elaborate multi-step peer review process. Despite its mailing list of
interested editors, and the presence of a full-time editor-in-chief, Larry Sanger, a graduate philosophy student hired by Wales,[16] the
writing of content for Nupedia was extremely slow, with only 12 articles written during the first year.[15]
Wales and Sanger discussed various ways to create content more rapidly.[14] The idea of a wiki-based complement originated from a
conversation between Larry Sanger and Ben Kovitz.[17][18][19] Ben Kovitz was a computer programmer and regular on Ward
Cunningham's revolutionary wiki "the WikiWikiWeb". He explained to Sanger what wikis were, at that time a difficult concept to
understand, over a dinner on 2 January 2001.[17][18][19][20] Wales first stated, in October 2001, that "Larry had the idea to use Wiki
software",[21] though he later stated in December 2005 that Jeremy Rosenfeld, a Bomis employee, introduced him to the concept. [22]
[23][24][25]
Sanger thought a wiki would be a good platform to use, and proposed on the Nupedia mailing list that a wiki based
upon UseModWiki (then v. 0.90) be set up as a "feeder" project for Nupedia. Under the subject "Let's make a wiki", he wrote:
No, this is not an indecent proposal. It's an idea to add a little feature to Nupedia. Jimmy Wales thinks that many people might find
the idea objectionable, but I think not. (…) As to Nupedia's use of a wiki, this is the ULTIMATE "open" and simple format for
developing content. We have occasionally bandied about ideas for simpler, more open projects to either replace or supplement
Nupedia. It seems to me wikis can be implemented practically instantly, need very little maintenance, and in general are very low-
risk. They're also a potentially great source for content. So there's little downside, as far as I can determine.
Founding of Wikipedia[edit]
There was considerable resistance on the part of Nupedia's editors and reviewers to the idea of associating Nupedia with a wiki-
style website. Sanger suggested giving the new project its own name, Wikipedia, and Wikipedia was soon launched on its own
domain, wikipedia.com, on 15 January 2001. The bandwidth and server (located in San Diego) used for these initial projects
were donated by Bomis. Many former Bomis employees later contributed content to the encyclopedia: notably Tim Shell, co-founder and later CEO of Bomis.
In December 2008, Wales stated that he made Wikipedia's first edit, a test edit with the text "Hello, World!". [27] The oldest article still
preserved is the article UuU, created on 16 January 2001, at 21:08 UTC.[28][29] The existence of the project was formally announced
and an appeal for volunteers to engage in content creation was made to the Nupedia mailing list on 17 January. [30]
The UuU edit, the first edit that is still preserved on Wikipedia to this day, as it appears using the Nostalgia skin.
The project received many new participants after being mentioned on the Slashdot website in July 2001,[31] with two minor mentions
in March 2001.[32][33] It then received a prominent pointer to a story on the community-edited technologies and culture
website Kuro5hin on 25 July.[34] Between these relatively rapid influxes of traffic, there had been a steady stream of traffic from other
sources, especially Google, which alone sent hundreds of new visitors to the site every day. Its first major mainstream
The project gained its 1,000th article around 12 February 2001, and reached 10,000 articles around 7 September. In the first year of
its existence, over 20,000 encyclopedia entries were created – a rate of over 1,500 articles per month. On 30 August 2002, the article count reached 40,000.
Wikipedia's earliest edits were long believed lost, since the original UseModWiki software deleted old data after about a month. On
the eve of Wikipedia's 10th anniversary, 14 December 2010, developer Tim Starling found backups on SourceForge containing
every change made to Wikipedia from its creation in January 2001 to 17 August 2001.[36]
of usernames. The first subdomain created for a non-English Wikipedia was deutsche.wikipedia.com (created on 16 March 2001,
01:38 UTC),[37] followed after a few hours by Catalan.wikipedia.com (at 13:07 UTC).[38] The Japanese Wikipedia, started
as nihongo.wikipedia.com, was created around that period,[39][40] and initially used only Romanized Japanese. For about two months
Catalan was the one with the most articles in a non-English language,[41][42] although statistics of that early period are imprecise.
[43]
The French Wikipedia was created on or around 11 May 2001,[44] in a wave of new language versions that also
included Chinese, Dutch, Esperanto, Hebrew, Italian, Portuguese, Russian, Spanish, and Swedish.[45] These languages were soon
joined by Arabic[46] and Hungarian.[47][48] In September 2001, an announcement pledged commitment to the multilingual provision of
Wikipedia,[49] notifying users of an upcoming roll-out of Wikipedias for all major languages, the establishment of core standards, and
a push for the translation of core pages for the new wikis. At the end of that year, international statistics for the project first began to be compiled.
In January 2002, 90% of all Wikipedia articles were in English. By January 2004, fewer than 50% were English, and this
internationalization has continued to increase as the encyclopedia grows. As of 2013, around 85% of all Wikipedia articles are in non-English editions.
Development of Wikipedia[edit]
In March 2002, following the withdrawal of funding by Bomis during the dot-com bust, Larry Sanger left both Nupedia and Wikipedia.
[51]
By 2002, Sanger and Wales differed in their views on how best to manage open encyclopedias. Both still supported the open-
collaboration concept, but the two disagreed on how to handle disruptive editors, specific roles for experts, and the best way to guide the project toward its goals.
Wales, a believer in "hands off" executive management,[citation needed] went on to establish self-governance and bottom-up self-direction
by editors on Wikipedia. He made it clear that he would not be involved in the community's day-to-day management, but would
encourage it to learn to self-manage and find its own best approaches. As of 2007, Wales mostly restricts his own role to occasional
input on serious matters, executive activity, advocacy of knowledge, and encouragement of similar reference projects.
Sanger says he is an "inclusionist" and is open to almost anything.[52] He proposed that experts still have a place in the Web
2.0 world. He returned briefly to academia, then joined the Digital Universe Foundation. In 2006, Sanger founded Citizendium, an
open encyclopedia that used real names for contributors in an effort to reduce disruptive editing, and hoped to facilitate "gentle
expert guidance" to increase the accuracy of its content. Decisions about article content were to be up to the community, but the site
was to include a statement about "family-friendly content".[53] He stated early on that he intended to leave Citizendium in a few years,
by which time the project and its management would presumably be established.[54]
Organization[edit]
The Wikipedia project has grown rapidly in the course of its life, at several levels. Content has grown organically through the
addition of new articles, new wikis have been added in English and non-English languages, and entire new projects replicating these
growth methods in other related areas (news, quotations, reference books and so on) have been founded as well. Wikipedia itself
has grown, with the creation of the Wikimedia Foundation to act as an umbrella body and the growth of software and policies to
address the needs of the editorial community. These are documented below:
Evolution of logo[edit]
6 December 2001 – 12 October 2003
Timeline[edit]
Articles summarizing each year are held within the Wikipedia project namespace
and are linked to below. Additional resources for research are available within the
Wikipedia records and archives, and are listed at the end of this article.
2000[edit]
In March 2000, the Nupedia project was started. Its intention was to publish articles
written by experts which would be licensed as free content. Nupedia was founded by
Jimmy Wales, with Larry Sanger as editor-in-chief, and funded by the web-advertising
company Bomis.[55]
2001[edit]
In January 2001, Wikipedia began as a side-project of Nupedia, to allow collaboration on
articles prior to entering the peer-review process. The wikipedia.com and wikipedia.org
domain names were registered on 12 January 2001[57] and 13 January 2001,[58] respectively, with wikipedia.org being brought online on
the same day.[59] The project formally opened on 15 January ("Wikipedia Day"), with the
first international Wikipedias – the French, German, Catalan, Swedish, and Italian
editions – being created between March and May. The "neutral point of view" (NPOV)
policy was officially formulated at this time, and Wikipedia's first slashdotter wave arrived
on 26 July.[31] The first media report about Wikipedia appeared in August 2001 in the
newspaper Wales on Sunday.[60] The September 11 attacks later that year spurred the appearance of
breaking news stories on the homepage, as well as information boxes linking related
articles.[61]
2002[edit]
2002 saw the end of funding for Wikipedia from Bomis and the departure of Larry
Sanger. The forking of the Spanish Wikipedia also took place with the establishment of
the Enciclopedia Libre. The first portable MediaWiki software went live on 25 January.
[dubious – discuss]
Bots were introduced, Jimmy Wales confirmed that Wikipedia would never
run commercial advertising, and the first sister project (Wiktionary) and first
formal Manual of Style were launched. A separate board of directors to supervise the
2003[edit]
The English Wikipedia passed 100,000 articles in 2003, while the next largest edition, the
German Wikipedia, passed 10,000. The Wikimedia Foundation was established, and
Wikipedia adopted its jigsaw world logo. Mathematical formulae using TeX were
reintroduced to the website. The first Wikipedian social meeting took place in Munich in October 2003.
2004[edit]
The worldwide Wikipedia article pool continued to grow rapidly in 2004, doubling in size
in 12 months, from under 500,000 articles in late 2003 to over 1 million in over 100
languages by the end of 2004. The English Wikipedia accounted for just under half of
this total. Wikipedia's server operations were moved
from California to Florida, Categories and CSS style configuration sheets were
introduced, and the first attempt to block Wikipedia occurred, with the website being
blocked in China for two weeks in June. The formal election of a board and Arbitration
Committee began. The first formal projects were proposed to deliberately balance
content and seek out systemic bias arising from Wikipedia's community structure.
Bourgeois v. Peters,[62] (11th Cir. 2004), a court case decided by the United States Court
of Appeals for the Eleventh Circuit was one of the earliest court opinions to cite and quote
Wikipedia.[citation needed] It stated: "We also reject the notion that the Department of
Homeland Security's threat advisory level somehow justifies these searches. Although
the threat level was "elevated" at the time of the protest, "to date, the threat level has
stood at yellow (elevated) for the majority of its time in existence. It has been raised to
2005[edit]
In 2005, Wikipedia became the most popular reference website on the Internet, according
to Hitwise, with the English Wikipedia alone exceeding 750,000 articles. Wikipedia's first
multilingual and subject portals were established in 2005. A formal fundraiser held in the
first quarter of the year raised almost US$100,000 for system upgrades to handle the site's growing traffic.
The first major Wikipedia scandal occurred in 2005, when a well-known figure was found
to have a vandalized biography which had gone unnoticed for months. In the wake of this
and other concerns,[63] the first policy and system changes specifically designed to
counter this form of abuse were established. These included a new Checkuser privilege
policy update to assist in sock puppetry investigations, a new feature called semi-
protection, a stricter policy on biographies of living people and the tagging of such
articles for closer review. A restriction of new article creation to registered users only was also put in place.
2006[edit]
The English Wikipedia gained its one-millionth article, Jordanhill railway station, on 1
March 2006. The first approved Wikipedia article selection was made freely available.
During the year, congressional staffers and a campaign manager were caught trying to covertly alter Wikipedia
biographies of politicians. Nonetheless, Wikipedia was rated as one of the top 2006 global
brands.[65]
Jimmy Wales indicated at Wikimania 2006 that Wikipedia had achieved sufficient volume
and called for an emphasis on quality, perhaps best expressed in the call for 100,000
feature-quality articles. Semi-protection was used more often than expected, with over 1,000 pages being semi-protected at any given time in 2006.
2007[edit]
Wikipedia continued to grow rapidly in 2007, possessing over 5 million registered editor
accounts and a combined total of 7.5 million articles, totalling 1.74 billion words in approximately 250 languages, by
13 August.[67] The English Wikipedia gained articles at a steady rate of 1,700 a day, [68] with
the wikipedia.org domain name ranked the 10th-busiest in the world. Wikipedia continued
to garner visibility in the press – the Essjay controversy broke when a prominent member
of Wikipedia was found to have lied about his credentials. Citizendium, a competing
online encyclopedia, launched publicly. A new trend developed in Wikipedia, with the
project covering people notable for only one event by adding a redirect from their name to the larger story, rather than creating a
distinct biographical article.[69] On 9 September 2007, the English Wikipedia gained its
two-millionth article, El Hormiguero.[70] There was some controversy in late 2007 when
the Volapük Wikipedia jumped from 797 to over 112,000 articles, briefly becoming the
15th-largest Wikipedia edition, due to automated stub generation by an enthusiast for the constructed language.
According to the MIT Technology Review, the number of regularly active editors on the
English-language Wikipedia peaked in 2007 at more than 51,000, and has since been
declining.[73]
2008[edit]
Various WikiProjects in many areas continued to expand and refine article contents within
their scope. In April 2008, the 10-millionth Wikipedia article was created, and the
combined article count continued to grow by millions of articles each year.[6]
2009[edit]
The three-millionth article on the English Wikipedia, Beate Eriksen, was created
on 17 August 2009. Later that year, the German Wikipedia exceeded one million articles, becoming the second edition after the English
Wikipedia to do so. A TIME article listed Wikipedia among 2009's best websites.[75]
The Arbitration Committee of the English Wikipedia decided in May 2009 to restrict
access to its site from Church of Scientology IP addresses, to prevent self-serving edits
by the church. The committee described the long-running dispute as one of
"battlefield tactics", with articles on living persons being the "worst casualties".
[77]
Wikipedia content became licensed under Creative Commons in 2009.
2010[edit]
On 24 March, the European Wikipedia servers went offline due to an overheating
problem. Failover to servers in Florida turned out to be broken, causing DNS resolution
for Wikipedia to fail across the world. The problem was resolved quickly, but due to DNS
caching effects, some areas were slower to regain access to Wikipedia than others.[79][80]
On 13 May, the site released a new interface. New features included an updated logo,
new navigation tools, and a link wizard.[81] However, the classic interface remained
available for those who wished to use it. On 12 December, the English Wikipedia passed
the 3.5-million-article mark, while the French Wikipedia's millionth article was created on
2011[edit]
One of many cakes made to celebrate Wikipedia's 10th anniversary [83] in 2011.
Wikipedia and its users held hundreds of celebrations worldwide to commemorate the
site's 10th anniversary on 15 January.[84] The site began efforts to expand its growth in
India, holding its first Indian conference in Mumbai in November 2011.[85][86] The English
Wikipedia passed the 3.6-million-article mark on 2 April, and reached 3.8 million articles
later in the year. In December, the German Wikipedia exceeded 100 million
page edits, becoming the second language edition to do so after the English edition,
which attained 500 million page edits on 24 November 2011. On 17 December 2011, the Dutch
Wikipedia became the fourth edition to exceed one million articles.
Between 4 and 6 October 2011, the Italian Wikipedia became intentionally inaccessible in
protest against an Italian wiretapping bill, the so-called DDL intercettazioni, which, if
approved, would allow any person to force websites to remove information that is perceived as untrue or offensive, without the need to provide evidence.
Also in October 2011, Wikimedia announced the launch of Wikipedia Zero, an initiative to
enable free mobile access to Wikipedia in developing countries through partnerships with
mobile operators.[88][89]
2012[edit]
On 16 January, Wikipedia co-founder Jimmy Wales announced that the English
Wikipedia would shut down for 24 hours on 18 January as part of a protest meant to call
public attention to the proposed Stop Online Piracy Act and PROTECT IP Act, two anti-
piracy laws under debate in the United States Congress. Calling the blackout a
"community decision", Wales and other opponents of the laws believed that they would
endanger free speech and online innovation.[90] A similar blackout was staged on 10 July
by the Russian Wikipedia, in protest against a proposed Russian internet regulation law.
[91]
In late March 2012, the Wikimedia Foundation announced Wikidata, a universal platform
for sharing data between all Wikipedia language editions.[92] The US$1.7-million Wikidata
project was partly funded by Google, the Gordon and Betty Moore Foundation, and the
Allen Institute for Artificial Intelligence.[93] Wikimedia Deutschland was responsible
for the first phase of Wikidata, and initially planned to make the platform available to
editors by December 2012. Wikidata's first phase became fully operational in March
2013.[94][95]
In April 2012, Justin Knapp from Indianapolis, Indiana, became the first single contributor
to make over one million edits to Wikipedia.[96][97] The founder of Wikipedia, Jimmy Wales,
congratulated Knapp for his work and presented him with the site's Special
Barnstar medal and the Golden Wiki award for his achievement.[98] Wales also declared
On 13 July 2012, the English Wikipedia gained its 4-millionth article, Izbat al-Burj.[100] In
October 2012, historian and Wikipedia editor Richard Jensen opined that the English
Wikipedia was "nearing completion", noting that the number of regularly active editors
had fallen significantly since 2007, despite Wikipedia's rapid growth in article count and
readership.[101]
2013[edit]
As of December 2013, Wikipedia is the world's sixth-most-popular website according to Alexa Internet, with a
combined total of over 30.3 million mainspace articles across all 287 language editions.
[1]
It is estimated that Wikipedia receives more than 10 billion global pageviews every
month,[103] and attracts over 85 million unique monthly visitors from the United States;
the English Wikipedia alone receives approximately 8 million global pageviews every day.
[104]
On 22 January 2013, the Italian Wikipedia became the fifth language edition of Wikipedia
to exceed 1 million articles, while the Russian and Spanish Wikipedias gained their
millionth articles in May. The Swedish and the Polish Wikipedias gained their millionth
articles a few months later, becoming the eighth and ninth Wikipedia editions to do so.
The Wikidata database, automatically providing interlanguage links and other data,
became available for all language editions in March 2013.[95] In April 2013, the French
intelligence agency DCRI threatened a Wikipedia
volunteer with arrest unless "classified information" about a military radio station was
deleted.[106] In July, the VisualEditor editing system was launched, forming the first stage
of an effort to let contributors edit articles through a visual interface instead of
using wikimarkup.[107] An editor specifically designed for mobile devices was also launched.
In January 2001, Wikipedia ran on UseModWiki, written in Perl by Clifford
Adams. The server has run on Linux to this day, although the original text was
stored in flat files rather than in a database.
In January 2002, "Phase II" of the wiki software powering Wikipedia was
introduced, replacing the older UseModWiki. Written specifically for the project
by Magnus Manske, it used a PHP engine backed by a MySQL database.
In July 2002, a major rewrite of the software powering Wikipedia went live;
dubbed "Phase III", it replaced the older "Phase II" version, and was later
renamed MediaWiki.
In October 2002, Derek Ramsey started to use a "bot", or program, to add a
large number of articles about United States towns; these articles were
generated automatically from U.S. census data, a technique that had
been used before for other topics. These articles were generally well received,
but some users criticized them for their initial uniformity and writing style.
In January 2003, support for mathematical formulas in TeX was added. The
code was contributed by Tomasz Wegrzanowski.
9 June 2003 – ISBNs in articles now link to Special:Booksources, which
draws its list of book sources from a user-editable Wikipedia page.
Before this, ISBN link targets were coded into the software and new ones
were suggested on the Wikipedia:ISBN page. See the edit that changed this.
After 6 December 2003, various system messages shown to Wikipedia users
could be edited by administrators rather than being hard-coded, making it easier to tailor the interface for
users.
On 12 February 2004, server operations were moved from San Diego,
California, to Tampa, Florida.
On 29 May 2004, all the various websites were updated to a new version of
the MediaWiki software.
On 30 May 2004, the first instances of "categorization" entries appeared.
Category schemes, like Recent Changes and Edit This Page, had existed from
the founding of Wikipedia. However, Larry Sanger had viewed the schemes as
encyclopedia.[109]
After 3 June 2004, administrators could edit the style of the interface by editing the site-wide CSS pages.
Also on 30 May 2004, with MediaWiki 1.3, the Template namespace was introduced.
On 7 June 2005 at 3:00 a.m. Eastern Standard Time, the bulk of the
Wikimedia servers were moved to a new facility across the street. All Wikimedia sites were offline during the move.
In March 2013, the first phase of the Wikidata interwiki database became
available for all language editions.[95]
In July 2013, the VisualEditor editing interface was inaugurated, allowing users
to edit pages through a visual interface instead of wikimarkup.[107]
presented to users.
On 4 April 2002, BrilliantProse, since renamed to Featured Articles,
[111]
was moved to the Wikipedia namespace from the article namespace.
Around 15 October 2003, a new Wikipedia logo was installed. The logo was created by David Friedland, based on a design by Paul
Stansifer.
On 22 February 2004, Did You Know (DYK) made its first Main Page
appearance.
On 23 February 2004, a coordinated new look for the Main Page
Article, Anniversaries, In the News, and Did You Know rounded out the
new look.
On 10 January 2005, the multilingual portal at www.wikipedia.org was set
On 5 February 2005, Portal:Biology was created, becoming the first
On 16 July 2005, the English Wikipedia began the practice of including
On 19 March 2006, following a vote, the Main Page of the English-
On 13 May 2010, the site released a new interface. New features
included an updated logo, new navigation tools, and a link wizard. [81] The
Internal structures[edit]
Landmarks in the Wikipedia community, and the development of its
April 2001, Wales formally defines the "neutral point of view",
[115]
Wikipedia's core non-negotiable editorial policy,[116] a
in WikiProjects is introduced.[119]
In February 2002, concerns over the risk of future censorship and
combined with a lack of guarantee this would not happen, led most
Also in 2002, policy and style issues were clarified with the creation
guidelines.[122]
November 2002 – new mailing lists for WikiEN and Announce are
In July 2003, the rule against editing one's autobiography is
introduced.[124]
On 28 October 2003, the first "real" meeting of Wikipedians
From 10 July to 30 August 2004
During September to December 2005 following the Seigenthaler
The policy for "Checkuser" (a MediaWiki extension to assist detection of abuse
function had previously existed, but was viewed more as a system tool at the
time, so there had been no need for a policy covering use on a more routine
basis.[127]
Creation of new pages on the English Wikipedia was restricted to editors who
The introduction and rapid adoption of the policy Wikipedia:Biographies of
living people, giving a far tighter quality control and fact-check system to
The "semi-protection" function and policy,[129] allowing pages to be protected so
In May 2006, a new "oversight" feature was introduced on the
On 1 January 2007, the subcommunity
most of them are now inactive. When the group was founded
In April 2007 the results of 4 months policy review by a
A one-day closure of Wikipedia was called by Jimmy Wales on
In August 2002, shortly after Jimmy Wales announced
founded.
Communications committee was formed in January 2006
Angela Beesley and Florence Nibart-Devouard were
On 10 January 2006, Wikipedia became a registered
In July 2006, Angela Beesley resigned from the board of
In June 2006, Brad Patrick was hired to be the first
(June 2007).
In October 2006, Florence Nibart-Devouard became
Sister projects and milestones related to articles, user base, and other statistics.
On 15 January 2001, the first recorded edit of
On 22 January 2003, the English Wikipedia was
On 20 June 2003, the same day that the Wikimedia
year.
In January 2004, Wikipedia reached the 200,000-
On 20 April 2004, the article count of the English
On 7 July 2004, the article count of the English
On 20 September 2004, Wikipedia reached one
On 20 November 2004, the article count of the
On 18 March 2005, Wikipedia passed the 500,000-
In May 2005, Wikipedia became the most popular
On 29 September 2005, the English Wikipedia
On 1 March 2006, the English Wikipedia passed
milestone article.[138]
On 8 June 2006, the English Wikipedia passed
peoples.[139]
On 15 August 2006, the Wikimedia Foundation
launched Wikiversity.[140]
On 24 November 2006, the English Wikipedia
On 4 April 2007, the first Wikipedia CD selection in
On 9 September 2007, the English Wikipedia
2,000,000th article.
On 17 August 2009, the English Wikipedia passed
milestone article.
On 12 December 2010, the English Wikipedia
On 24 November 2011, the English Wikipedia
On 17 December 2011, the Dutch
On 13 July 2012, the English Wikipedia
On 22 January 2013, the Italian
On 11 May 2013, the Russian
On 16 May 2013, the Spanish
On 15 June 2013, the Swedish
On 25 September 2013, the Polish
On 21 October 2013, Wikipedia exceeded 30
Fundraising[edit]
Every year, Wikipedia runs a fundraising campaign to
expected.[142]
On 6 January 2006, the Q4 2005 fundraiser
The 2007 fundraising campaign raised US$1.5
The 2008 fundraising campaign gained Wikipedia
The 2010 campaign was launched on 13
million.[148]
The 2011 campaign raised US$20 million from
The 2012 campaign raised US$25 million from
External impact[edit]
In 2007, Wikipedia was deemed fit to be used as a
event."[153]
On 21 February 2007, Noam Cohen of the New
On 27 February 2007, an article in The Harvard
In July 2013, a large-scale study by four major
Wikipedia biography.
November 2005: The Seigenthaler
1963.
December 2006: German comedian Atze
himself.[157]
16 February 2007: Turkish historian Taner
November 2008: The German Left
December 2008: Wikimedia Nederland, the Dutch
Dutch Wikipedia.[161]
February 2009: When Karl Theodor Maria Nikolaus
May 2009: An article about the German journalist
October 2009: In 1990, the German actor Walter
Wikipedia.[164][165]
leader in its first year, and did most of the early work in
funding for his role, which was not viable part-time, and
[Wales] had had the idea for Nupedia since at least last
fall".[176]
manner.[166][172][177][178]
Controversies[edit]
Main articles: Criticism of Wikipedia, List of litigation
exceptions were:
1. strong prohibition against *any* sort of centralized control ("[must not be] written
under the direction of a single organization, which made all decisions about the
content, and... published in a centralized fashion. ...we dare not allow any
particular, deletionists were not allowed; editing an article would require forking it,
making a change, and then saving the result as a 'new' article on the same topic.
the Tiananmen Square protests of 1989, with some to-be-determined mechanism for
3. given the structure above, where every topic (especially controversial ones) might
have a thousand articles purporting to be *the* GNUpedia article about Sarah Palin,
Stallman explicitly rejected the idea of a centralized website that would specify which
article of those thousand was worth reading. Instead of an official catalogue, the plan
was to rely on search engines at first (the reader would begin by googling "gnupedia
sarah palin"), and then eventually if necessary construct catalogues according to the
(category-lists and lists-of-lists), but as of 2013 search engines still provide about
The goals which led to GNUpedia were published at least as early as 18 December
2000,[180][181] and these exact goals were finalized on the 12th [179] and 13th [182] of
January 2001, albeit with a copyright of 1999, from when Stallman had first started
considering the problem. The only sentence added between 18 December and the
unveiling of GNUpedia the week of 12–16 January was this: "The GNU Free
GNUpedia was 'formally' announced on slashdot [183] the same day that their mailing-
list first went online with a test-message, 16 January. Jimmy Wales posted to the list
on the 17th, the first full day of messages, explaining the discussions with Stallman
source encyclopedia to the free encyclopedia,[187] both Nupedia and Wikipedia had
adopted the GFDL, and the merger [188] of GNUpedia into Wikipedia was effectively
accomplished.
November 2001: it was
announced by Jimmy
Wikipedia, starting in
in 2002 Chief
in keeping Wikipedia
functioning. By
September 2002,
[190]
Wales had publicly
advertising on
Wikipedia." By June
Foundation was
formally incorporated;
[191]
the Foundation is
advertising,[192] although
it does 'internally'
advertise Wikimedia
Foundation fund-raising
events on Wikipedia.
[193]
As of 2013, the by-
Foundation do not
adopting a broader
advertising policy, if
necessary.[citation
needed]
Such by-laws are
subject to vote.[citation
needed]
All of 2003: Zero controversies of any notability occurred.
All of 2004: Zero controversies of any notability occurred.
January 2005: The fake
charity QuakeAID, in
earthquake, attempted
Wikipedia page.
October 2005: Alan
a Wikipedia page.
November 2005: The Seigenthaler controversy caused widespread media
coverage; the author of the hoax was eventually ascertained by Daniel
Brandt of Wikipedia Watch.
December 2005: the scientific journal Nature decided to test articles in
Wikipedia against their equivalents in Encyclopædia Britannica, and found the two
comparable in terms of accuracy.[194][195] Britannica rejected their
conclusion.[196] Nature refused to make any apologies, asserting that it stood
by its study in the face of the criticisms.[197]
Early-to-mid-2006:
The congressional
aides biography
attention, in which
biographies of several
politicians to remove
undesirable information
(including pejorative
statements quoted, or
broken campaign
promises), add
favorable information or
"glowing" tributes, or
authored biographies.
politicians were
implicated: Marty
Meehan, Norm
Coleman, Conrad
Gutknecht.[198] In a
negative information to
political opponents.
[199]
Following media
August 2006.
July 2006: Joshua
as a fake Duke of
Cleveland with a
Wikipedia page.
January 2007: English-
language Wikipedians
following a spate of
vandalism, by an
is routed through a
single IP address.
On 23 January 2007,
a Microsoft employee
Wikipedia articles
regarding an open-
source document
rival to a Microsoft
format.[201]
In February 2007, The
correction that a
prominent English
administrator known as
credentials.[202][203] The
became
a Wikia employee in
Daniel Brandt of
communicated to the
(See: Essjay
controversy)
February 2007: Fuzzy
firm because
defamatory information
network.
16 February 2007:
a Canadian airport
because of false
information on his
biography indicating
In June 2007, an
by coincidence,
Benoit murder-suicide,
were found by
investigators. The
attracted widespread
In October 2007, in their
obituaries of recently
deceased TV theme
composer Ronnie
Hazlehurst, many
British media
organisations reported
that he had co-written
"Reach". In fact, he
edit to Hazlehurst's
Wikipedia article.[204]
In February 2007,
Barbara Bauer, a
Wikimedia for
Literary Agency.
[205]
In Bauer v. Glatzer,
information on
abilities as a literary
Foundation defended
Wikipedia[206] and
the Wikimedia
Foundation was
dismissed on 1 July
2008.[208]
On 14 July 2009, the
breach of copyright,
against a Wikipedia
placed them
on Wikimedia
Commons.[209][210][211][212]
[213]
See National Portrait
Foundation copyright
In April and May 2010,
display of sexual
drawing and
pornographic images
including images of
children on Wikipedia.
[214][215][216]
It led to the
mass removal of
pornographic content
from Wikimedia
Foundation sites.[217][218]
In November 2012, Lord
journalists Andreas
for The
Independent newspaper
replaced Matthew
Symonds (a genuine
character).[219] The
Economist said of
"Parts of it are a
scissors-and-paste job
culled from
Wikipedia."[220]
Notable forks
and
derivatives[edit]
See this page for a partial list
by Wikipedia.
Specialized foreign language
erman), WikiZnanie
Baike(Chinese). Some of
by Wikipedia, leading to
Wikipedias.
In 2006, Larry
things'.[54][222] (see
also Nupedia).
Publication on
other media[edit]
The German Wikipedia was
on CD in November
publisher Zenodot
Verlagsgesellschaft mbH, a
sister company of
accompanied by a 7.5 GB
In September
Wikipedia articles.
to Wikimedia Deutschland.[226]
containing a selection of
Wikipedia CD Selection.[227] In
was published by
the Wikimedia
project's website.
iPods.[230]
Lawsuits[edit]
In limited ways, the
Wikimedia Foundation is
protected by Section 230 of
the Communications
caused a lawsuit to be
being illegal.[233]
See also[edit]
TWiki
From Wikipedia, the free encyclopedia
For the robot character, see Twiki.
TWiki
Written in Perl
Type Wiki
License GPL
Website http://twiki.org/
The TWiki project was founded by Peter Thoeny in 1998 as an open source wiki-based application
platform. In October 2008, the company TWiki.net, created by Thoeny, assumed full control over the
TWiki project[2] while much of the developer community[3][4] forked off to join the Foswiki project.[5]
Contents
1 Major features
2 TWiki deployment
3 Realization
4 TWiki release history
5 Forks of TWiki
6 Gallery
7 See also
8 References
9 External links
Major features[edit]
Revision control - complete audit trail, also for meta data such as
attachments and access control settings
Fine-grained access control - restrict read/write/rename on site level,
web level, page level based on user groups
Built-in database - users can create wiki applications using the TWiki
Markup Language
TWiki extensions[edit]
TWiki has a plugin API that has spawned over 300 extensions [6] to link into databases, create charts, tags,
sort tables, write spreadsheets, create image galleries and slideshows, make drawings, write blogs,
plot graphs, interface to many different authentication schemes, track Extreme Programming projects and
so on.
Wiki applications are also called situational applications because they are created ad hoc by the users for
very specific needs. Users have built TWiki applications[9] that include call center status boards, to-do
lists, inventory systems, employee handbooks, bug trackers, blog applications, discussion forums, status
reports with rollups and more.
User interface[edit]
The interface of TWiki is completely skinnable in templates, themes and (per user) CSS. It includes
support for internationalization ('I18N'), with support for multiple character sets, UTF-8 URLs, and the user
interface has been translated into Chinese, Czech, Danish, Dutch, French, German, Italian, Japanese,
Polish, Portuguese, Russian, Spanish and Swedish.[10]
TWiki deployment[edit]
TWiki is primarily used at the workplace as a corporate wiki[11] to coordinate team activities, track projects,
implement workflows[12] and as an Intranet Wiki. The TWiki community estimates 40,000 corporate wiki
sites as of March 2007, and 20,000 public TWiki sites.[13]
TWiki customers include Fortune 500 companies such as Disney, Google, Motorola, Nokia, Oracle
Corporation and Yahoo!, as well as small and medium enterprises,[14] such as ARM Holdings[15]
and DHL.[16] TWiki has also been used to create collaborative internet sites, such as the City of
Melbourne's FutureMelbourne wiki, where citizens can collaborate on the city's future plan.[17]
Realization[edit]
TWiki is implemented in Perl. Wiki pages are stored in plain text files. Everything, including metadata such as
access control settings, is version controlled using RCS. RCS is optional, since an all-Perl version
control system is provided.
TWiki scales reasonably well even though it uses plain text files and no relational database to store page
data. Many corporate TWiki installations have several hundred thousand pages and tens of thousands of
users. Load balancing and caching can be used to improve performance on high traffic sites. [18]
TWiki has database features built into the engine. A TWiki Form [7] is attached to a page as meta data. This
represents a database record. A set of pages that share the same type of form build a database table. A
formatted search[19] with a SQL-like query[20] can be embedded into a page to construct dynamic
presentation of data from multiple pages. This allows for building wiki applications and constitutes the
TWiki's notion of a structured wiki.
TWiki release history[edit]
1998-07-23: Initial version, based on JosWiki, an application created by
Markus Peter and Dave Harris[21][22]
2000-05-01: TWiki Release 1 May 2000
2000-12-01: TWiki Release 1 December 2000
2001-09-01: TWiki Release 1 September 2001
2001-12-01: TWiki Release 1 December 2001 ("Athens")
2003-02-01: TWiki Release 1 February 2003 ("Beijing")
2004-09-01: TWiki Release 1 September 2004 ("Cairo")
2006-02-01: TWiki Release 4.0.0 ("Dakar")
2007-01-16: TWiki Release 4.1.0 ("Edinburgh")
2008-01-22: TWiki Release 4.2.0 ("Freetown")
2009-09-02: TWiki Release 4.3.2 ("Georgetown")
2010-06-10: TWiki Release 5.0 ("Helsinki")
2011-08-20: TWiki Release 5.1 ("Istanbul")
Forks of TWiki[edit]
Forks of TWiki include:
2001: Spinner Wiki (abandoned)
2003: O'Wiki fork (abandoned)
2008: Foswiki, launched in October 2008 when a dispute about the
future guidance of the project could not be settled, [23][24] resulting in the
departure of much of the TWiki community including the core developer
team[4]
RSS
From Wikipedia, the free encyclopedia
For other uses, see RSS (disambiguation).
RSS (Rich Site Summary, originally RDF Site Summary, often dubbed Really Simple Syndication)
uses a family of standard web feed formats[2] to publish frequently updated information: blog entries, news
headlines, audio, video. An RSS document (called a "feed", "web feed",[3] or "channel") includes full or
summarized text, and metadata, like publishing date and author's name.
RSS feeds enable publishers to syndicate data automatically. A standard XML file format ensures
compatibility with many different machines/programs. RSS feeds also benefit users who want to receive
timely updates from favourite websites or to aggregate data from many sites.
Once users subscribe to a website's feed, RSS removes the need for them to manually check it. Instead, their
browser constantly monitors the site and informs the user of any updates. The browser can also be
commanded to automatically download the new data for the user.
Software termed an "RSS reader", "aggregator", or "feed reader", which can be web-based, desktop-based,
or mobile-device-based, presents RSS feed data to users. Users subscribe to feeds either by entering a
feed's URI into the reader or by clicking on the browser's feed icon. The RSS reader checks the user's
feeds regularly for new information and can automatically download it, if that function is enabled. The
reader also provides a user interface.
Contents
1 History
2 Example
3 Variants
4 Modules
5 Interoperability
8 See also
9 References
10 External links
History[edit]
Main article: History of web syndication technology
The RSS formats were preceded by several attempts at web syndication that did not achieve widespread
popularity. The basic idea of restructuring information about websites goes back to as early as 1995,
when Ramanathan V. Guha and others in Apple Computer's Advanced Technology Group developed
the Meta Content Framework.[4]
RDF Site Summary, the first version of RSS, was created by Dan Libby and Ramanathan V.
Guha at Netscape. It was released in March 1999 for use on the My.Netscape.Com portal. This version
became known as RSS 0.9.[5] In July 1999, Dan Libby of Netscape produced a new version, RSS 0.91,
[2]
which simplified the format by removing RDF elements and incorporating elements from Dave Winer's
news syndication format.[6] Libby also renamed the format from RDF Site Summary to Rich Site Summary and
outlined further development of the format in a "futures document". [7]
This would be Netscape's last participation in RSS development for eight years. As RSS was being
embraced by web publishers who wanted their feeds to be used on My.Netscape.Com and other early
RSS portals, Netscape dropped RSS support from My.Netscape.Com in April 2001 during new
owner AOL's restructuring of the company, also removing documentation and tools that supported the
format.[8]
Two entities emerged to fill the void, with neither Netscape's help nor approval: The RSS-DEV Working
Group and Dave Winer, whose UserLand Software had published some of the first publishing tools
outside of Netscape that could read and write RSS.
Winer published a modified version of the RSS 0.91 specification on the UserLand website, covering how
it was being used in his company's products, and claimed copyright to the document. [9] A few months
later, UserLand filed a U.S. trademark registration for RSS, but failed to respond to a USPTO trademark
examiner's request and the request was rejected in December 2001. [10]
The RSS-DEV Working Group, a project whose members included Guha and representatives of O'Reilly
Media and Moreover, produced RSS 1.0 in December 2000.[11] This new version, which reclaimed the
name RDF Site Summary from RSS 0.9, reintroduced support for RDF and added XML
namespaces support, adopting elements from standard metadata vocabularies such as Dublin Core.
In December 2000, Winer released RSS 0.92,[12] a minor set of changes aside from the introduction of the
enclosure element, which permitted audio files to be carried in RSS feeds and helped spark podcasting.
He also released drafts of RSS 0.93 and RSS 0.94 that were subsequently withdrawn. [13]
In September 2002, Winer released a major new version of the format, RSS 2.0, that redubbed its initials
Really Simple Syndication. RSS 2.0 removed the type attribute added in the RSS 0.94 draft and added
support for namespaces. To preserve backward compatibility with RSS 0.92, namespace support applies
only to other content included within an RSS 2.0 feed, not the RSS 2.0 elements themselves. [14] (Although
other standards such as Atom attempt to correct this limitation, RSS feeds are not aggregated with other
content often enough to shift the popularity from RSS to other formats having full namespace support.)
Because neither Winer nor the RSS-DEV Working Group had Netscape's involvement, they could not
make an official claim on the RSS name or format. This has fueled ongoing controversy in the syndication
development community as to which entity was the proper publisher of RSS.
One product of that contentious debate was the creation of an alternative syndication format, Atom, that
began in June 2003.[15] The Atom syndication format, whose creation was in part motivated by a desire to
get a clean start free of the issues surrounding RSS, has been adopted as IETF Proposed Standard RFC
4287.
In July 2003, Winer and UserLand Software assigned the copyright of the RSS 2.0 specification to
Harvard's Berkman Center for Internet & Society, where he had just begun a term as a visiting fellow. [16] At
the same time, Winer launched the RSS Advisory Board with Brent Simmons and Jon Udell, a group
whose purpose was to maintain and publish the specification and answer questions about the format. [17]
In September 2004, Stephen Horlander created the now ubiquitous RSS icon for use in the Mozilla
Firefox browser.[18]
In December 2005, the Microsoft Internet Explorer team[19] and Microsoft Outlook team[20] announced on
their blogs that they were adopting Firefox's RSS icon. In February 2006, Opera Software followed suit.
[21]
This effectively made the orange square with white radio waves the industry standard for RSS and
Atom feeds, replacing the large variety of icons and text that had been used previously to identify
syndication data.
In January 2006, Rogers Cadenhead relaunched the RSS Advisory Board without Dave Winer's
participation, with a stated desire to continue the development of the RSS format and resolve
ambiguities. In June 2007, the board revised their version of the specification to confirm that
namespaces may extend core elements with namespace attributes, as Microsoft has done in Internet
Explorer 7. According to their view, a difference of interpretation left publishers unsure of whether this
was permitted or forbidden.
Example[edit]
RSS files are essentially XML-formatted plain text. The RSS file itself is relatively easy to read by
automated processes and by humans alike. An example file could have contents such as the following.
This could be placed on any appropriate communication protocol for file retrieval, such as http or ftp, and
reading software would use the information to present a neat display to the end users.
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>RSS Title</title>
<description>This is an example of an RSS feed</description>
<link>http://www.someexamplerssdomain.com/main.html</link>
<lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000 </lastBuildDate>
<pubDate>Mon, 06 Sep 2009 16:20:00 +0000 </pubDate>
<ttl>1800</ttl>
<item>
<title>Example entry</title>
<description>Here is some text containing an interesting
description.</description>
<link>http://www.wikipedia.org/</link>
<guid>unique string per item</guid>
<pubDate>Mon, 06 Sep 2009 16:20:00 +0000 </pubDate>
</item>
</channel>
</rss>
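As a sketch of how reading software might consume such a document with standard browser APIs (the embedded string below is a trimmed copy of the feed above and stands in for XML retrieved over HTTP), DOMParser turns the feed into a DOM tree whose items can then be walked and displayed:
// A trimmed copy of the feed above, embedded as a string for the sketch.
var rssText =
'<?xml version="1.0" encoding="UTF-8" ?>' +
'<rss version="2.0"><channel><title>RSS Title</title>' +
'<item><title>Example entry</title>' +
'<link>http://www.wikipedia.org/</link>' +
'<pubDate>Mon, 06 Sep 2009 16:20:00 +0000</pubDate></item>' +
'</channel></rss>';
var feed = new DOMParser().parseFromString(rssText, "application/xml");
var channelTitle = feed.querySelector("channel > title").textContent;
// Walk the <item> elements the way an aggregator would when building its display.
Array.prototype.forEach.call(feed.querySelectorAll("item"), function (item) {
console.log(channelTitle + ": " +
item.querySelector("title").textContent + " -> " +
item.querySelector("link").textContent);
});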
Variants[edit]
There are several different versions of RSS, falling into two major branches (RDF and 2.*).
The RDF (or RSS 1.*) branch includes the following versions:
RSS 0.90 was the original Netscape RSS version. This RSS was
called RDF Site Summary, but was based on an early working draft of
the RDF standard, and was not compatible with the final RDF
Recommendation.
RSS 1.0 is an open format by the RSS-DEV Working Group, again
standing for RDF Site Summary. RSS 1.0 is an RDF format like RSS
0.90, but not fully compatible with it, since 1.0 is based on the final RDF
1.0 Recommendation.
RSS 1.1 is also an open format and is intended to update and replace
RSS 1.0. The specification is an independent draft not supported or
endorsed in any way by the RSS-Dev Working Group or any other
organization.
The RSS 2.* branch (initially UserLand, now Harvard) includes the following versions:
RSS 0.91 is the simplified RSS version released by Netscape, and also
the version number of the simplified version originally championed
by Dave Winer from Userland Software. The Netscape version was
now called Rich Site Summary; this was no longer an RDF format, but
was relatively easy to use.
RSS 0.92 through 0.94 are expansions of the RSS 0.91 format, which
are mostly compatible with each other and with Winer's version of RSS
0.91, but are not compatible with RSS 0.90.
RSS 2.0.1 has the internal version number 2.0. RSS 2.0.1 was
proclaimed to be "frozen", but still updated shortly after release without
changing the version number. RSS now stood for Really Simple
Syndication. The major change in this version is an explicit extension
mechanism using XML namespaces.[22]
Later versions in each branch are backward-compatible with earlier versions (aside from non-conformant
RDF syntax in 0.90), and both versions include properly documented extension mechanisms using XML
Namespaces, either directly (in the 2.* branch) or through RDF (in the 1.* branch). Most syndication
software supports both branches. "The Myth of RSS Compatibility", an article written in 2004 by RSS critic
and Atom advocate Mark Pilgrim, discusses RSS version compatibility issues in more detail.
The extension mechanisms make it possible for each branch to track innovations in the other. For
example, the RSS 2.* branch was the first to support enclosures, making it the current leading choice for
podcasting, and as of 2005 is the format supported for that use by iTunes and other podcasting software;
however, an enclosure extension is now available for the RSS 1.* branch, mod_enclosure. Likewise, the
RSS 2.* core specification does not support providing full-text in addition to a synopsis, but the RSS 1.*
markup can be (and often is) used as an extension. There are also several common outside extension
packages available, including a new proposal from Microsoft for use in Internet Explorer 7.
The most serious compatibility problem is with HTML markup. Userland's RSS reader—generally
considered the reference implementation—did not originally filter out HTML markup from feeds. As a
result, publishers began placing HTML markup into the titles and descriptions of items in their RSS feeds.
This behavior has become expected of readers, to the point of becoming a de facto standard,[citation needed]
though there is still some inconsistency in how software handles this markup, particularly in titles.
The RSS 2.0 specification was later updated to include examples of entity-encoded HTML; however, all
prior plain text usages remain valid.
As of January 2007, tracking data from www.syndic8.com indicates that the three main versions of RSS in
current use are 0.91, 1.0, and 2.0, constituting 13%, 17%, and 67% of worldwide RSS usage,
respectively.[23] These figures, however, do not include usage of the rival web feed format Atom. As of
August 2008, the syndic8.com website is indexing 546,069 total feeds, of which 86,496 were some dialect
of Atom and 438,102 were some dialect of RSS.[24]
Modules[edit]
The primary objective of all RSS modules is to extend the basic XML schema established for more robust
syndication of content. This inherently allows for more diverse, yet standardized, transactions without
modifying the core RSS specification.
To accomplish this extension, a tightly controlled vocabulary (in the RSS world, "module"; in the XML
world, "schema") is declared through an XML namespace to give names to concepts and relationships
between those concepts.
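A small illustration of the mechanism (the feed fragment is invented; dc:creator belongs to the Dublin Core vocabulary mentioned earlier): the module element is declared under its own namespace and read back with the namespace-aware DOM methods, while the core RSS elements are left untouched.
var itemXml =
'<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">' +
'<channel><item>' +
'<title>Example entry</title>' +
'<dc:creator>A. Author</dc:creator>' + // module element, outside the core RSS vocabulary
'</item></channel></rss>';
var feed = new DOMParser().parseFromString(itemXml, "application/xml");
// Core elements are addressed by name; module elements by namespace URI plus local name.
var creator = feed.getElementsByTagNameNS("http://purl.org/dc/elements/1.1/", "creator")[0];
console.log(creator.textContent); // "A. Author"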
Interoperability[edit]
Although the number of items in an RSS channel is theoretically unlimited, some news aggregators do
not support RSS files larger than 150 KB (if all elements are provided on a new line, this size corresponds
to approx. 2,800 lines).[25] For example, applications that rely on the Common Feed List of Windows might
handle such files as if they were corrupt, and not open them. Interoperability can be maximized by keeping
the file size under this limit.
The following table maps RSS 2.0 elements to their Atom 1.0 counterparts:
RSS 2.0       Atom 1.0
author        author*
category      category
channel       feed
copyright     rights
description   subtitle
generator     generator
guid          id*
image         logo
item          entry
link*         link*
title*        title*
ttl           -
See also[edit]