Internet in TRB EXAM


Internet

From Wikipedia, the free encyclopedia

This article is about the public worldwide computer network system. For other uses, see Internet (disambiguation).

"Computer culture" redirects here. For other uses, see Cyberculture.

A visualization of routing paths through a portion of the Internet.

Computer network types by spatial scope: Near-field communication (NFC), Body (BAN), Personal (PAN), Car/Electronics (CAN), Near-me (NAN), Local (LAN), Home (HAN), Storage (SAN), Campus (CAN), Backbone, Metropolitan (MAN), Wide (WAN), Cloud (IAN), Internet, Interplanetary Internet, Intergalactic Computer Network.

The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite (TCP/IP) to serve

several billion users worldwide. It is a network of networks that consists of millions of private, public, academic, business, and

government networks, of local to global scope, that are linked by a broad array of electronic, wireless and optical networking

technologies. The Internet carries an extensive range of information resources and services, such as the inter-

linked hypertext documents of the World Wide Web (WWW), the infrastructure to support email, and peer-to-peer networks.

Most traditional communications media including telephone, music, film, and television are being reshaped or redefined by the

Internet, giving birth to new services such as voice over Internet Protocol (VoIP) and Internet Protocol television (IPTV). Newspaper,

book and other print publishing are adapting to website technology, or are reshaped into blogging and web feeds. The Internet has

enabled and accelerated new forms of human interactions through instant messaging, Internet forums, and social
networking. Online shopping has boomed both for major retail outlets and small artisans and traders. Business-to-

business and financial services on the Internet affect supply chains across entire industries.

The origins of the Internet reach back to research commissioned by the United States government in the 1960s to build robust, fault-

tolerant communication via computer networks. While this work, together with work in the United Kingdom and France, led to

important precursor networks, they were not the Internet. There is no consensus on the exact date when the modern Internet came

into being, but sometime in the early to mid-1980s is considered reasonable.

The funding of a new U.S. backbone by the National Science Foundation in the 1980s, as well as private funding for other

commercial backbones, led to worldwide participation in the development of new networking technologies, and the merger of many

networks. Though the Internet has been widely used by academia since the 1980s, the commercialization of what was by the 1990s

an international network resulted in its popularization and incorporation into virtually every aspect of modern human life. As of June

2012, more than 2.4 billion people—over a third of the world's human population—have used the services of the Internet;

approximately 100 times more people than were using it in 1995.[1][2]

The Internet has no centralized governance in either technological implementation or policies for access and usage; each

constituent network sets its own policies. Only the overarching definitions of the two principal name spaces in the Internet,

the Internet Protocol address space and the Domain Name System, are directed by a maintainer organization, the Internet

Corporation for Assigned Names and Numbers (ICANN). The technical underpinning and standardization of the core protocols

(IPv4 and IPv6) is an activity of the Internet Engineering Task Force (IETF), a non-profit organization of loosely affiliated

international participants that anyone may associate with by contributing technical expertise.

Contents

1 Terminology
2 History
3 Technology
   3.1 Protocols
   3.2 Routing
   3.3 General structure
4 Governance
5 Modern uses
6 Services
   6.1 World Wide Web
   6.2 Communication
   6.3 Data transfer
7 Access
8 Users
9 Social impact
   9.1 Social networking and entertainment
   9.2 Electronic business
   9.3 Telecommuting
   9.4 Crowdsourcing
   9.5 Politics and political revolutions
   9.6 Philanthropy
   9.7 Censorship
10 See also
11 References
12 External links
   12.1 Organizations
   12.2 Articles, books, and journals

Terminology
The Internet Messenger by Buky Schwartz in Holon

See also: Internet capitalization conventions

The Internet, referring to the specific global system of interconnected IP networks, is a proper noun and written with an initial capital

letter. In the media and common use it is often not capitalized, viz. the internet. Some guides specify that the word should be

capitalized when used as a noun, but not capitalized when used as a verb or an adjective.[3] The Internet is also often referred to

as the Net.

Historically the word internet was used, uncapitalized, as early as 1883 as a verb and adjective to refer to interconnected motions.

Starting in the early 1970s the term internet was used as a shorthand form of the technical term internetwork, the result of

interconnecting computer networks with special gateways or routers. It was also used as a verb meaning to connect together,

especially for networks.[4][5]

The terms Internet and World Wide Web are often used interchangeably in everyday speech; it is common to speak of "going on the

Internet" when invoking a web browser to view web pages. However, the Internet is a particular global computer network connecting

millions of computing devices; the World Wide Web is just one of many services running on the Internet. The Web is a collection of

interconnected documents (web pages) and other web resources, linked by hyperlinks and URLs.[6] In addition to the Web, a

multitude of other services are implemented over the Internet, including e-mail, file transfer, remote computer control, newsgroups,

and online games. All of these services can be implemented on any intranet, accessible to network users.

The term Interweb is a portmanteau of Internet and World Wide Web typically used sarcastically to parody a technically unsavvy

user.[7]

History
Professor Leonard Kleinrock with the first ARPANET Interface Message Processors at UCLA

Main articles: History of the Internet and History of the World Wide Web

Research into packet switching started in the early 1960s, and packet switched networks such as Mark I at NPL in the UK,[8] ARPANET, CYCLADES,[9][10] Merit Network,[11] Tymnet, and Telenet were developed in the late 1960s and early 1970s using a variety of protocols. The ARPANET in particular led to the development of protocols for internetworking, where multiple separate networks could be joined together into a network of networks.[citation needed]

The first two nodes of what would become the ARPANET were interconnected between Leonard Kleinrock's Network Measurement

Center at UCLA's School of Engineering and Applied Science and Douglas Engelbart's NLS system at SRI International (SRI)

in Menlo Park, California, on 29 October 1969.[12] The third site on the ARPANET was the Culler-Fried Interactive Mathematics

center at the University of California at Santa Barbara, and the fourth was the University of Utah Graphics Department. In an early

sign of future growth, there were already fifteen sites connected to the young ARPANET by the end of 1971.[13][14] These early years

were documented in the 1972 film Computer Networks: The Heralds of Resource Sharing.

Early international collaborations on ARPANET were sparse. For various political reasons, European developers were concerned

with developing the X.25 networks.[15] Notable exceptions were the Norwegian Seismic Array (NORSAR) in June 1973,[16] followed in

1973 by Sweden with satellite links to the Tanum Earth Station and Peter T. Kirstein's research group in the UK, initially at

the Institute of Computer Science, University of London and later at University College London.[citation needed]

In December 1974, RFC 675 – Specification of Internet Transmission Control Program, by Vinton Cerf, Yogen Dalal, and Carl

Sunshine, used the term internet as a shorthand for internetworking and later RFCs repeat this use.[17] Access to the ARPANET was

expanded in 1981 when the National Science Foundation (NSF) developed the Computer Science Network (CSNET). In 1982,

the Internet Protocol Suite (TCP/IP) was standardized and the concept of a world-wide network of fully interconnected TCP/IP

networks called the Internet was introduced.


T3 NSFNET Backbone, c. 1992

TCP/IP network access expanded again in 1986 when the National Science Foundation Network (NSFNET) provided access

to supercomputer sites in the United States from research and education organizations, first at 56 kbit/s and later at 1.5 Mbit/s and

45 Mbit/s.[18] Commercial Internet service providers (ISPs) began to emerge in the late 1980s and early 1990s. The ARPANET was

decommissioned in 1990. The Internet was commercialized in 1995 when NSFNET was decommissioned, removing the last

restrictions on the use of the Internet to carry commercial traffic.[19] The Internet started a rapid expansion to Europe and Australia in

the mid to late 1980s[20][21] and to Asia in the late 1980s and early 1990s.[22]

Since the mid-1990s the Internet has had a tremendous impact on culture and commerce, including the rise of near instant

communication by email, instant messaging, Voice over Internet Protocol (VoIP) "phone calls", two-way interactive video calls, and

the World Wide Web[23] with its discussion forums, blogs, social networking, and online shopping sites. Increasing amounts of data are transmitted at higher and higher speeds over fiber optic networks operating at 1 Gbit/s, 10 Gbit/s, or more.

Worldwide Internet users

                                2005         2010         2013a
World population[24]            6.5 billion  6.9 billion  7.1 billion
Not using the Internet          84%          70%          61%
Using the Internet              16%          30%          39%
Users in the developing world   8%           21%          31%
Users in the developed world    51%          67%          77%

a Estimate.
Source: International Telecommunication Union.[25]

The Internet continues to grow, driven by ever greater amounts of online information and knowledge, commerce, entertainment

and social networking.[26] During the late 1990s, it was estimated that traffic on the public Internet grew by 100 percent per year,

while the mean annual growth in the number of Internet users was thought to be between 20% and 50%.[27] This growth is often

attributed to the lack of central administration, which allows organic growth of the network, as well as the non-proprietary open

nature of the Internet protocols, which encourages vendor interoperability and prevents any one company from exerting too much

control over the network.[28] As of 31 March 2011, the estimated total number of Internet users was 2.095 billion (30.2% of world

population).[29] It is estimated that in 1993 the Internet carried only 1% of the information flowing through two-way

telecommunication, by 2000 this figure had grown to 51%, and by 2007 more than 97% of all telecommunicated information was

carried over the Internet.[30]

Technology
Protocols
Main article: Internet protocol suite

As the user data is processed down through the protocol stack, each layer adds an encapsulation at the
sending host. Data is transmitted "over the wire" at the link level, left to right. The encapsulation stack
procedure is reversed by the receiving host. Intermediate relays remove and add a new link
encapsulation for retransmission, and inspect the IP layer for routing purposes.

Internet protocol suite

Application layer: DHCP, DHCPv6, DNS, FTP, HTTP, IMAP, IRC, LDAP, MGCP, NNTP, BGP, NTP, POP, RPC, RTP, RTSP, RIP, SIP, SMTP, SNMP, SOCKS, SSH, Telnet, TLS/SSL, XMPP, and more.

Transport layer: TCP, UDP, DCCP, SCTP, RSVP, and more.

Internet layer: IP (IPv4, IPv6), ICMP, ICMPv6, ECN, IGMP, IPsec, and more.

Link layer: ARP/InARP, NDP, OSPF, tunnels (L2TP), PPP, media access control (Ethernet, DSL, ISDN, FDDI, DOCSIS), and more.

The communications infrastructure of the Internet consists of its hardware components and a system of software layers that control

various aspects of the architecture. While the hardware can often be used to support other software systems, it is the design and the

rigorous standardization process of the software architecture that characterizes the Internet and provides the foundation for its

scalability and success. The responsibility for the architectural design of the Internet software systems has been delegated to

the Internet Engineering Task Force (IETF).[31] The IETF conducts standard-setting work groups, open to any individual, about the

various aspects of Internet architecture. Resulting discussions and final standards are published in a series of publications, each

called a Request for Comments (RFC), freely available on the IETF web site.

The principal methods of networking that enable the Internet are contained in specially designated RFCs that constitute the Internet

Standards. Other less rigorous documents are simply informative, experimental, or historical, or document the best current practices

(BCP) when implementing Internet technologies.


The Internet standards describe a framework known as the Internet protocol suite. This is a model architecture that divides methods

into a layered system of protocols (RFC 1122, RFC 1123). The layers correspond to the environment or scope in which their

services operate. At the top is the application layer, the space for the application-specific networking methods used in software

applications, e.g., a web browser program uses the client-server application model and many file-sharing systems use a peer-to-

peer paradigm. Below this top layer, the transport layer connects applications on different hosts via the network with appropriate

data exchange methods. Underlying these layers are the core networking technologies, consisting of two layers.

The internet layer enables computers to identify and locate each other via Internet Protocol (IP) addresses, and allows them to

connect to one another via intermediate (transit) networks. Last, at the bottom of the architecture, is a software layer, the link layer,

that provides connectivity between hosts on the same local network link, such as a local area network (LAN) or a dial-up connection.

The model, also known as TCP/IP, is designed to be independent of the underlying hardware, which the model therefore does not

concern itself with in any detail. Other models have been developed, such as the Open Systems Interconnection (OSI) model, but

they are not compatible in the details of description or implementation; many similarities exist and the TCP/IP protocols are usually

included in the discussion of OSI networking.
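
To make the layering concrete, the following minimal Python sketch mimics the encapsulation process described above: each layer wraps the data from the layer above with its own header at the sending host, and the receiving host unwraps them in reverse order. The header strings here are purely symbolic placeholders, not real TCP, IP, or Ethernet formats.

    # Illustrative sketch of layered encapsulation (requires Python 3.9+ for removeprefix).
    # The "headers" are symbolic placeholders, not real protocol formats.

    def encapsulate(payload: bytes) -> bytes:
        """Wrap application data with mock transport, internet, and link headers."""
        segment = b"TCP|" + payload      # transport layer adds its header
        packet = b"IP|" + segment        # internet layer adds addressing
        frame = b"ETH|" + packet         # link layer frames it for the local medium
        return frame

    def decapsulate(frame: bytes) -> bytes:
        """Reverse the process at the receiving host."""
        packet = frame.removeprefix(b"ETH|")
        segment = packet.removeprefix(b"IP|")
        return segment.removeprefix(b"TCP|")

    data = b"GET / HTTP/1.1"             # application-layer data
    on_the_wire = encapsulate(data)
    assert decapsulate(on_the_wire) == data
    print(on_the_wire)                   # b'ETH|IP|TCP|GET / HTTP/1.1'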

The most prominent component of the Internet model is the Internet Protocol (IP), which provides addressing systems (IP

addresses) for computers on the Internet. IP enables internetworking and in essence establishes the Internet itself. IP Version 4

(IPv4) is the initial version used on the first generation of today's Internet and is still in dominant use. It was designed to address up

to ~4.3 billion (10⁹) Internet hosts. However, the explosive growth of the Internet has led to IPv4 address exhaustion, which entered

its final stage in 2011,[32] when the global address allocation pool was exhausted. A new protocol version, IPv6, was developed in the

mid-1990s, which provides vastly larger addressing capabilities and more efficient routing of Internet traffic. IPv6 is currently in

growing deployment around the world, since Internet address registries (RIRs) began to urge all resource managers to plan rapid

adoption and conversion.[33]

IPv6 is not interoperable with IPv4. In essence, it establishes a parallel version of the Internet not directly accessible with IPv4

software. This means software upgrades or translator facilities are necessary for networking devices that need to communicate on

both networks. Most modern computer operating systems already support both versions of the Internet Protocol. Network

infrastructures, however, are still lagging in this development. Aside from the complex array of physical connections that make up its

infrastructure, the Internet is facilitated by bi- or multi-lateral commercial contracts (e.g., peering agreements), and by technical

specifications or protocols that describe how to exchange data over the network. Indeed, the Internet is defined by its

interconnections and routing policies.
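
The difference in scale between the two address spaces can be illustrated with Python's standard ipaddress module; the addresses below are drawn from the ranges reserved for documentation and serve only as illustrative placeholders.

    import ipaddress

    # Example addresses from the documentation ranges (placeholders).
    v4 = ipaddress.ip_address("203.0.113.7")
    v6 = ipaddress.ip_address("2001:db8::1")

    print(v4.version, v6.version)    # 4 6
    print(2 ** 32)                   # IPv4 space: about 4.3 billion addresses
    print(2 ** 128)                  # IPv6 space: about 3.4 x 10^38 addresses

    # A prefix (network) groups addresses for allocation and routing.
    net = ipaddress.ip_network("203.0.113.0/24")
    print(v4 in net)                 # True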

Routing
Internet packet routing is accomplished among various tiers of Internet service providers.

Internet service providers connect customers, which represent the bottom of the routing hierarchy, to customers of other ISPs via

other higher or same-tier networks. At the top of the routing hierarchy are the Tier 1 networks, large telecommunication companies

which exchange traffic directly with all other Tier 1 networks via peering agreements. Tier 2 networks buy Internet transit from other

providers to reach at least some parties on the global Internet, though they may also engage in peering. An ISP may use a single

upstream provider for connectivity, or implement multihoming to achieve redundancy. Internet exchange points are major traffic

exchanges with physical connections to multiple ISPs.

Computers and routers use routing tables to direct IP packets to the next-hop router or destination. Routing tables are maintained by

manual configuration or by routing protocols. End-nodes typically use a default route that points toward an ISP providing transit,

while ISP routers use the Border Gateway Protocol to establish the most efficient routing across the complex connections of the

global Internet.
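
The table lookup itself follows a longest-prefix-match rule: among all prefixes that contain the destination address, the most specific one wins, and a default route catches everything else. The sketch below uses hypothetical prefixes and next-hop names purely to illustrate that rule; real routers build their tables with protocols such as BGP and use far more efficient data structures.

    import ipaddress

    # Hypothetical, simplified routing table: prefix -> next hop name.
    routes = {
        ipaddress.ip_network("0.0.0.0/0"): "isp-gateway",     # default route
        ipaddress.ip_network("10.0.0.0/8"): "corp-core",
        ipaddress.ip_network("10.1.2.0/24"): "branch-router",
    }

    def next_hop(destination: str) -> str:
        addr = ipaddress.ip_address(destination)
        matches = [net for net in routes if addr in net]
        best = max(matches, key=lambda net: net.prefixlen)    # most specific prefix wins
        return routes[best]

    print(next_hop("10.1.2.9"))    # branch-router
    print(next_hop("10.9.9.9"))    # corp-core
    print(next_hop("8.8.8.8"))     # isp-gateway (default route)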

Large organizations, such as academic institutions, large enterprises, and governments, may perform the same function as ISPs,

engaging in peering and purchasing transit on behalf of their internal networks. Research networks tend to interconnect into large

subnetworks such as GEANT, GLORIAD, Internet2, and the UK's national research and education network, JANET.

General structure
The Internet structure and its usage characteristics have been studied extensively. It has been determined that both the Internet IP

routing structure and hypertext links of the World Wide Web are examples of scale-free networks.[34]

Many computer scientists describe the Internet as a "prime example of a large-scale, highly engineered, yet highly complex system".[35] The Internet is heterogeneous; for instance, data transfer rates and physical characteristics of connections vary widely. The Internet exhibits "emergent phenomena" that depend on its large-scale organization. For example, data transfer rates exhibit temporal self-similarity. The principles of the routing and addressing methods for Internet traffic reach back to their origins in the 1960s, when the eventual scale and popularity of the network could not be anticipated.[36] Thus, the possibility of developing alternative structures is being investigated.[37] The Internet structure has been found to be highly robust to random failures[38] yet very vulnerable to targeted attacks on high-degree nodes.[39]

Governance

Main article: Internet governance

ICANN headquarters in Marina Del Rey, California, United States

The Internet is a globally distributed network comprising many voluntarily interconnected autonomous networks. It operates without

a central governing body.

The technical underpinning and standardization of the Internet's core protocols (IPv4 and IPv6) is an activity of the Internet

Engineering Task Force(IETF), a non-profit organization of loosely affiliated international participants that anyone may associate

with by contributing technical expertise.

To maintain interoperability, the principal name spaces of the Internet are administered by the Internet Corporation for Assigned

Names and Numbers(ICANN), headquartered in Marina del Rey, California. ICANN is the authority that coordinates the assignment

of unique identifiers for use on the Internet, including domain names, Internet Protocol (IP) addresses, application port numbers in

the transport protocols, and many other parameters. Globally unified name spaces, in which names and numbers are uniquely

assigned, are essential for maintaining the global reach of the Internet. ICANN is governed by an international board of directors

drawn from across the Internet technical, business, academic, and other non-commercial communities. ICANN's role in coordinating

the assignment of unique identifiers distinguishes it as perhaps the only central coordinating body for the global Internet.[40]

Allocation of IP addresses is delegated to Regional Internet Registries (RIRs):

- African Network Information Center (AfriNIC) for Africa
- American Registry for Internet Numbers (ARIN) for North America
- Asia-Pacific Network Information Centre (APNIC) for Asia and the Pacific region
- Latin American and Caribbean Internet Addresses Registry (LACNIC) for Latin America and the Caribbean region
- Réseaux IP Européens Network Coordination Centre (RIPE NCC) for Europe, the Middle East, and Central Asia

The National Telecommunications and Information Administration, an agency of the United States Department of Commerce,

continues to have final approval over changes to the DNS root zone.[41][42][43]
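
Applications rely on this globally unified name space whenever they resolve a domain name into IP addresses. The small sketch below uses Python's standard library; example.com is a domain reserved for documentation and stands in here for any registered name.

    import socket

    # Resolve a DNS name into the IPv4/IPv6 addresses it maps to.
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            "example.com", 80, proto=socket.IPPROTO_TCP):
        label = "IPv4" if family == socket.AF_INET else "IPv6"
        print(label, sockaddr[0])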

The Internet Society (ISOC) was founded in 1992, with a mission to "assure the open development, evolution and use of the

Internet for the benefit of all people throughout the world".[44] Its members include individuals (anyone may join) as well as

corporations, organizations, governments, and universities. Among other activities ISOC provides an administrative home for a

number of less formally organized groups that are involved in developing and managing the Internet, including: the Internet

Engineering Task Force (IETF), Internet Architecture Board (IAB), Internet Engineering Steering Group (IESG), Internet Research

Task Force (IRTF), and Internet Research Steering Group (IRSG).

On 16 November 2005, the United Nations-sponsored World Summit on the Information Society, held in Tunis, established

the Internet Governance Forum (IGF) to discuss Internet-related issues.

Modern uses

The Internet allows greater flexibility in working hours and location, especially with the spread of unmetered high-speed connections.

The Internet can be accessed almost anywhere by numerous means, including through mobile Internet devices. Mobile

phones, datacards, handheld game consoles and cellular routers allow users to connect to the Internet wirelessly. Within the

limitations imposed by small screens and other limited facilities of such pocket-sized devices, the services of the Internet, including

email and the web, may be available. Service providers may restrict the services offered and mobile data charges may be

significantly higher than other access methods.

Educational material at all levels from pre-school to post-doctoral is available from websites. Examples range from CBeebies,

through school and high-school revision guides and virtual universities, to access to top-end scholarly literature through the likes

of Google Scholar. For distance education, help with homework and other assignments, self-guided learning, whiling away spare

time, or just looking up more detail on an interesting fact, it has never been easier for people to access educational information at

any level from anywhere. The Internet in general and the World Wide Web in particular are important enablers of

both formal and informal education.

The low cost and nearly instantaneous sharing of ideas, knowledge, and skills has made collaborative work dramatically easier, with

the help of collaborative software. Not only can a group cheaply communicate and share ideas but the wide reach of the Internet

allows such groups more easily to form. An example of this is the free software movement, which has produced, among other

things, Linux, Mozilla Firefox, and OpenOffice.org. Internet chat, whether using an IRC chat room, an instant messaging system, or

a social networking website, allows colleagues to stay in touch in a very convenient way while working at their computers during the

day. Messages can be exchanged even more quickly and conveniently than via email. These systems may allow files to be

exchanged, drawings and images to be shared, or voice and video contact between team members.
Content management systems allow collaborating teams to work on shared sets of documents simultaneously without accidentally

destroying each other's work. Business and project teams can share calendars as well as documents and other information. Such

collaboration occurs in a wide variety of areas including scientific research, software development, conference planning, political

activism and creative writing. Social and political collaboration is also becoming more widespread as both Internet access

and computer literacy spread.

The Internet allows computer users to remotely access other computers and information stores easily, wherever they may be. They

may do this with or without computer security, i.e. authentication and encryption technologies, depending on the requirements. This

is encouraging new ways of working from home, collaboration and information sharing in many industries. An accountant sitting at

home can audit the books of a company based in another country, on a server situated in a third country that is remotely maintained

by IT specialists in a fourth. These accounts could have been created by home-working bookkeepers, in other remote locations,

based on information emailed to them from offices all over the world. Some of these things were possible before the widespread use

of the Internet, but the cost of private leased lines would have made many of them infeasible in practice. An office worker away from

their desk, perhaps on the other side of the world on a business trip or a holiday, can access their emails, access their data

using cloud computing, or open a remote desktop session into their office PC using a secure Virtual Private Network (VPN)

connection on the Internet. This can give the worker complete access to all of their normal files and data, including email and other

applications, while away from the office. It has been referred to among system administrators as the Virtual Private Nightmare,[45] because it extends the secure perimeter of a corporate network into remote locations and its employees' homes.

Services
World Wide Web

This NeXT Computer was used by Tim Berners-Lee at CERN and became the world's first Web server.

Many people use the terms Internet and World Wide Web, or just the Web, interchangeably, but the two terms are not synonymous.

The World Wide Web is only one of hundreds of services used on the Internet. The Web is a global set of documents, images and

other resources, logically interrelated by hyperlinks and referenced with Uniform Resource Identifiers (URIs). URIs symbolically

identify services, servers, and other databases, and the documents and resources that they can provide. Hypertext Transfer

Protocol (HTTP) is the main access protocol of the World Wide Web. Web services also use HTTP to allow software systems to

communicate in order to share and exchange business logic and data.
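
Because HTTP runs directly on top of the transport layer, a web request can be issued over a plain TCP connection. The sketch below sends a minimal HTTP/1.1 GET request by hand; example.com is the reserved documentation domain and stands in for any web server, and real browsers add many more headers and normally use TLS (HTTPS).

    import socket

    # A minimal HTTP/1.1 GET over a raw TCP connection.
    host = "example.com"
    request = (
        f"GET / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Connection: close\r\n"
        f"\r\n"
    ).encode("ascii")

    with socket.create_connection((host, 80)) as conn:
        conn.sendall(request)
        response = b""
        while chunk := conn.recv(4096):
            response += chunk

    print(response.split(b"\r\n", 1)[0].decode())   # e.g. HTTP/1.1 200 OK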


World Wide Web browser software, such as Microsoft's Internet Explorer, Mozilla Firefox, Opera, Apple's Safari, and Google

Chrome, lets users navigate from one web page to another via hyperlinks embedded in the documents. These documents may also

contain any combination of computer data, including graphics, sounds, text, video, multimedia and interactive content that runs

while the user is interacting with the page. Client-side software can include animations, games, office applications and scientific

demonstrations. Through keyword-driven Internet research using search engines like Yahoo! and Google, users worldwide have

easy, instant access to a vast and diverse amount of online information. Compared to printed media, books, encyclopedias and

traditional libraries, the World Wide Web has enabled the decentralization of information on a large scale.

The Web has also enabled individuals and organizations to publish ideas and information to a potentially large audience online at

greatly reduced expense and time delay. Publishing a web page, a blog, or building a website involves little initial cost and many

cost-free services are available. Publishing and maintaining large, professional web sites with attractive, diverse and up-to-date

information is still a difficult and expensive proposition, however. Many individuals and some companies and groups use web logs or

blogs, which are largely used as easily updatable online diaries. Some commercial organizations encourage staff to communicate

advice in their areas of specialization in the hope that visitors will be impressed by the expert knowledge and free information, and

be attracted to the corporation as a result.

One example of this practice is Microsoft, whose product developers publish their personal blogs in order to pique the public's

interest in their work. Collections of personal web pages published by large service providers remain popular, and have become

increasingly sophisticated. Whereas operations such as Angelfire and GeoCities have existed since the early days of the Web,

newer offerings from, for example, Facebook and Twitter currently have large followings. These operations often brand themselves

as social network services rather than simply as web page hosts.

Advertising on popular web pages can be lucrative, and e-commerce or the sale of products and services directly via the Web

continues to grow.

When the Web began in the 1990s, a typical web page was stored in completed form on a web server, formatted in HTML, ready to

be sent to a user's browser in response to a request. Over time, the process of creating and serving web pages has become more

automated and more dynamic. Websites are often created using content management or wiki software with, initially, very little

content. Contributors to these systems, who may be paid staff, members of a club or other organization or members of the public, fill

underlying databases with content using editing pages designed for that purpose, while casual visitors view and read this content in

its final HTML form. There may or may not be editorial, approval and security systems built into the process of taking newly entered

content and making it available to the target visitors.

Communication
Email is an important communications service available on the Internet. The concept of sending electronic text messages between

parties in a way analogous to mailing letters or memos predates the creation of the Internet. Pictures, documents and other files are

sent as email attachments. Emails can be cc-ed to multiple email addresses.
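
As an illustration, the sketch below composes a message with several CC recipients using Python's standard email and smtplib modules. The addresses and the SMTP host mail.example.com are hypothetical placeholders; a real deployment would also need authentication and, typically, TLS.

    import smtplib
    from email.message import EmailMessage

    # Compose a message copied (CC) to multiple addresses; all values are placeholders.
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Cc"] = "carol@example.com, dave@example.com"
    msg["Subject"] = "Meeting notes"
    msg.set_content("Notes to follow as an attachment.")

    with smtplib.SMTP("mail.example.com") as server:   # assumes a reachable SMTP relay
        server.send_message(msg)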


Internet telephony is another common communications service made possible by the creation of the Internet. VoIP stands for Voice-

over-Internet Protocol, referring to the protocol that underlies all Internet communication. The idea began in the early 1990s

with walkie-talkie-like voice applications for personal computers. In recent years many VoIP systems have become as easy to use

and as convenient as a normal telephone. The benefit is that, as the Internet carries the voice traffic, VoIP can be free or cost much

less than a traditional telephone call, especially over long distances and especially for those with always-on Internet connections

such as cable or ADSL. VoIP is maturing into a competitive alternative to traditional telephone service. Interoperability between

different providers has improved and the ability to call or receive a call from a traditional telephone is available. Simple, inexpensive

VoIP network adapters are available that eliminate the need for a personal computer.

Voice quality can still vary from call to call, but is often equal to and can even exceed that of traditional calls. Remaining problems

for VoIP include emergency telephone number dialing and reliability. Currently, a few VoIP providers provide an emergency service,

but it is not universally available. Older traditional phones with no "extra features" may be line-powered only and operate during a

power failure; VoIP can never do so without a backup power source for the phone equipment and the Internet access devices. VoIP

has also become increasingly popular for gaming applications, as a form of communication between players. Popular VoIP clients

for gaming include Ventrilo and Teamspeak. Modern video game consoles also offer VoIP chat features.

Data transfer
File sharing is an example of transferring large amounts of data across the Internet. A computer file can be emailed to customers,

colleagues and friends as an attachment. It can be uploaded to a website or FTP server for easy download by others. It can be put

into a "shared location" or onto a file server for instant use by colleagues. The load of bulk downloads to many users can be eased

by the use of "mirror" servers or peer-to-peer networks. In any of these cases, access to the file may be controlled by

user authentication, the transit of the file over the Internet may be obscured by encryption, and money may change hands for

access to the file. The price can be paid by the remote charging of funds from, for example, a credit card whose details are also

passed – usually fully encrypted – across the Internet. The origin and authenticity of the file received may be checked by digital

signatures or by MD5 or other message digests. These simple features of the Internet, over a worldwide basis, are changing the

production, sale, and distribution of anything that can be reduced to a computer file for transmission. This includes all manner of

print publications, software products, news, music, film, video, photography, graphics and the other arts. This in turn has caused

seismic shifts in each of the existing industries that previously controlled the production and distribution of these products.
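
As a concrete example of checking a received file against a published message digest, the sketch below uses Python's hashlib. The filename and the expected digest are hypothetical placeholders; SHA-256 is shown, but an MD5 digest is computed the same way with hashlib.md5().

    import hashlib

    # Verify a downloaded file against a published digest (placeholder values).
    expected = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

    digest = hashlib.sha256()
    with open("download.iso", "rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):   # read in 1 MiB blocks
            digest.update(block)

    print("match" if digest.hexdigest() == expected else "mismatch")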

Streaming media is the real-time delivery of digital media for the immediate consumption or enjoyment by end users. Many radio

and television broadcasters provide Internet feeds of their live audio and video productions. They may also allow time-shift viewing

or listening such as Preview, Classic Clips and Listen Again features. These providers have been joined by a range of pure Internet

"broadcasters" who never had on-air licenses. This means that an Internet-connected device, such as a computer or something

more specific, can be used to access on-line media in much the same way as was previously possible only with a television or radio

receiver. The range of available types of content is much wider, from specialized technical webcasts to on-demand popular

multimedia services. Podcasting is a variation on this theme, where – usually audio – material is downloaded and played back on a
computer or shifted to a portable media player to be listened to on the move. These techniques using simple equipment allow

anybody, with little censorship or licensing control, to broadcast audio-visual material worldwide.

Digital media streaming increases the demand for network bandwidth. For example, standard image quality needs 1 Mbit/s link

speed for SD 480p, HD 720p quality requires 2.5 Mbit/s, and the top-of-the-line HDX quality needs 4.5 Mbit/s for 1080p.[46]
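
Those bitrates translate directly into data volumes. The short calculation below is a rough estimate that ignores protocol overhead and variable-bitrate encoding; it shows approximately how many gigabytes an hour of continuous streaming consumes at each quoted rate.

    # Gigabytes per hour of continuous streaming at the quoted bitrates (rough estimate).
    rates_mbit = {"SD 480p": 1.0, "HD 720p": 2.5, "HDX 1080p": 4.5}

    for name, mbit_per_s in rates_mbit.items():
        gb_per_hour = mbit_per_s * 1e6 / 8 * 3600 / 1e9
        print(f"{name}: about {gb_per_hour:.2f} GB per hour")
    # SD 480p: about 0.45 GB per hour
    # HD 720p: about 1.12 GB per hour
    # HDX 1080p: about 2.02 GB per hour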

Webcams are a low-cost extension of this phenomenon. While some webcams can give full-frame-rate video, the picture either is

usually small or updates slowly. Internet users can watch animals around an African waterhole, ships in the Panama Canal, traffic at

a local roundabout or monitor their own premises, live and in real time. Video chat rooms and video conferencing are also popular

with many uses being found for personal webcams, with and without two-way sound. YouTube was founded on 15 February 2005

and is now the leading website for free streaming video with a vast number of users. It uses a flash-based web player to stream and

show video files. Registered users may upload an unlimited amount of video and build their own personal profile. YouTube claims

that its users watch hundreds of millions, and upload hundreds of thousands of videos daily.[47]

Access

Main article: Internet access

Common methods of Internet access in homes include dial-up, landline broadband (over coaxial cable, fiber optic or copper

wires), Wi-Fi, satellite and 3G/4G technology cell phones. Public places to use the Internet include libraries and Internet cafes,

where computers with Internet connections are available. There are also Internet access points in many public places such as

airport halls and coffee shops, in some cases just for brief use while standing. Various terms are used, such as "public Internet

kiosk", "public access terminal", and "Web payphone". Many hotels now also have public terminals, though these are usually fee-

based. These terminals are widely used for purposes such as ticket booking, bank deposits, and online payments. Wi-Fi provides

wireless access to computer networks, and therefore can do so to the Internet itself. Hotspots providing such access include Wi-Fi

cafes, where would-be users need to bring their own wireless-enabled devices such as a laptop or PDA. These services may be

free to all, free to customers only, or fee-based. A hotspot need not be limited to a confined location. A whole campus or park, or

even an entire city can be enabled.

Grassroots efforts have led to wireless community networks. Commercial Wi-Fi services covering large city areas are in place in

London, Vienna, Toronto, San Francisco, Philadelphia, Chicago and Pittsburgh. The Internet can then be accessed from such

places as a park bench.[48] Apart from Wi-Fi, there have been experiments with proprietary mobile wireless networks like Ricochet,

various high-speed data services over cellular phone networks, and fixed wireless services. High-end mobile phones such

as smartphones in general come with Internet access through the phone network. Web browsers such as Opera are available on

these advanced handsets, which can also run a wide variety of other Internet software. More mobile phones have Internet access

than PCs, though this is not as widely used.[49] An Internet access provider and protocol matrix differentiates the methods used to

get online.

An Internet blackout or outage can be caused by local signaling interruptions. Disruptions of submarine communications cables may

cause blackouts or slowdowns to large areas, such as in the 2008 submarine cable disruption. Less-developed countries are more
vulnerable due to a small number of high-capacity links. Land cables are also vulnerable, as in 2011 when a woman digging for

scrap metal severed most connectivity for the nation of Armenia.[50] Internet blackouts affecting almost entire countries can be

achieved by governments as a form of Internet censorship, as in the blockage of the Internet in Egypt, whereby approximately

93%[51] of networks were without access in 2011 in an attempt to stop mobilization for anti-government protests.[52]

Users

See also: Global Internet usage, English on the Internet, and Unicode

Internet users per 100 inhabitants. Source: International Telecommunication Union.[53][54]

Internet users by language[55]

Website content languages[56]

Overall Internet usage has seen tremendous growth. From 2000 to 2009, the number of Internet users globally rose from 394 million

to 1.858 billion.[57] By 2010, 22 percent of the world's population had access to computers, with 1 billion Google searches every day,

300 million Internet users reading blogs, and 2 billion videos viewed daily on YouTube.[58]

The prevalent language for communication on the Internet has been English. This may be a result of the origin of the Internet, as

well as the language's role as a lingua franca. Early computer systems were limited to the characters in the American Standard

Code for Information Interchange (ASCII), a subset of the Latin alphabet.

After English (27%), the most requested languages on the World Wide Web are Chinese (23%), Spanish (8%), Japanese (5%),

Portuguese and German (4% each), Arabic, French and Russian (3% each), and Korean (2%).[59] By region, 42% of the world's Internet users are based in Asia, 24% in Europe, 14% in North America, 10% in Latin America and the Caribbean taken together, 6% in Africa, 3% in the Middle East and 1% in Australia/Oceania.[60] The Internet's technologies have developed enough in

recent years, especially in the use of Unicode, that good facilities are available for development and communication in the world's

widely used languages. However, some glitches such as mojibake (incorrect display of some languages' characters) still remain.

In an American study in 2005, the percentage of men using the Internet was very slightly ahead of the percentage of women,

although this difference reversed in those under 30. Men logged on more often, spent more time online, and were more likely to be

broadband users, whereas women tended to make more use of opportunities to communicate (such as email). Men were more likely

to use the Internet to pay bills, participate in auctions, and for recreation such as downloading music and videos. Men and women

were equally likely to use the Internet for shopping and banking.[61] More recent studies indicate that in 2008, women significantly

outnumbered men on most social networking sites, such as Facebook and Myspace, although the ratios varied with age.[62] In

addition, women watched more streaming content, whereas men downloaded more.[63] In terms of blogs, men were more likely to
blog in the first place; among those who blog, men were more likely to have a professional blog, whereas women were more likely to

have a personal blog.[64]

According to Euromonitor, by 2020 43.7% of the world's population will be users of the Internet. Splitting by country, in 2011 Iceland,

Norway and the Netherlands had the highest Internet penetration by the number of users, with more than 90% of the population with

access.

Social impact

Main article: Sociology of the Internet

The Internet has enabled entirely new forms of social interaction, activities, and organizing, thanks to its basic features such as

widespread usability and access. In the first decade of the 21st century, the first generation was raised with widespread availability of

Internet connectivity, bringing consequences and concerns in areas such as personal privacy and identity, and distribution of

copyrighted materials. These "digital natives" face a variety of challenges that were not present for prior generations.

Social networking and entertainment


See also: Social networking service#Social impact

Many people use the World Wide Web to access news, weather and sports reports, to plan and book vacations and to find out more

about their interests. People use chat, messaging and email to make and stay in touch with friends worldwide, sometimes in the

same way as some previously had pen pals. The Internet has seen a growing number of Web desktops, where users can access

their files and settings via the Internet.

Social networking websites such as Facebook, Twitter, and MySpace have created new ways to socialize and interact. Users of

these sites are able to add a wide variety of information to pages, to pursue common interests, and to connect with others. It is also

possible to find existing acquaintances, to allow communication among existing groups of people. Sites like LinkedIn foster

commercial and business connections. YouTube and Flickr specialize in users' videos and photographs.

The Internet has been a major outlet for leisure activity since its inception, with entertaining social experiments such

as MUDs and MOOs being conducted on university servers, and humor-related Usenet groups receiving much traffic. Today,

many Internet forums have sections devoted to games and funny videos; short cartoons in the form of Flash movies are also

popular. Over 6 million people use blogs or message boards as a means of communication and for the sharing of ideas.

The Internet pornography and online gambling industries have taken advantage of the World Wide Web, and often provide a

significant source of advertising revenue for other websites.[65] Although many governments have attempted to restrict both

industries' use of the Internet, in general this has failed to stop their widespread popularity.[66]

Another area of leisure activity on the Internet is multiplayer gaming.[67] This form of recreation creates communities, where people

of all ages and origins enjoy the fast-paced world of multiplayer games. These range from MMORPG to first-person shooters,

from role-playing video games to online gambling. While online gaming has been around since the 1970s, modern modes of online

gaming began with subscription services such as GameSpy and MPlayer.[68] Non-subscribers were limited to certain types of game
play or certain games. Many people use the Internet to access and download music, movies and other works for their enjoyment

and relaxation. Free and fee-based services exist for all of these activities, using centralized servers and distributed peer-to-peer

technologies. Some of these sources exercise more care with respect to the original artists' copyrights than others.

Internet usage has been correlated to users' loneliness.[69] Lonely people tend to use the Internet as an outlet for their feelings and to

share their stories with others, such as in the "I am lonely will anyone speak to me" thread.

Cybersectarianism is a new organizational form which involves: "highly dispersed small groups of practitioners that may remain

largely anonymous within the larger social context and operate in relative secrecy, while still linked remotely to a larger network of

believers who share a set of practices and texts, and often a common devotion to a particular leader. Overseas supporters provide

funding and support; domestic practitioners distribute tracts, participate in acts of resistance, and share information on the internal

situation with outsiders. Collectively, members and practitioners of such sects construct viable virtual communities of faith,

exchanging personal testimonies and engaging in collective study via email, on-line chat rooms and web-based message boards."[70]

Cyberslacking can become a drain on corporate resources; the average UK employee spent 57 minutes a day surfing the Web while

at work, according to a 2003 study by Peninsula Business Services.[71] Internet addiction disorder is excessive computer use that

interferes with daily life. Writer Nicholas Carr believes that Internet use has other effects on individuals, for instance improving skills of scan-reading and interfering with the deep thinking that leads to true creativity.[72]

Electronic business
Main article: Electronic business

Electronic business (E-business) involves business processes spanning the entire value chain: electronic purchasing and supply

chain management, processing orders electronically, handling customer service, and cooperating with business partners. E-

commerce seeks to add revenue streams using the Internet to build and enhance relationships with clients and partners.

According to research firm IDC, the size of total worldwide e-commerce, when global business-to-business and -consumer

transactions are added together, will equate to $16 trillion in 2013. IDATE, another research firm, estimates the global market for

digital products and services at $4.4 trillion in 2013. A report by Oxford Economics adds those two together to estimate the total size

of the digital economy at $20.4 trillion, equivalent to roughly 13.8% of global sales.[73]

While much has been written of the economic advantages of Internet-enabled commerce, there is also evidence that some aspects

of the Internet, such as maps and location-aware services, may serve to reinforce economic inequality and the digital divide.[74] Electronic commerce may be responsible for consolidation and the decline of mom-and-pop, brick-and-mortar businesses, resulting in increases in income inequality.[75][76][77]

Telecommuting
Main article: Telecommuting

Remote work is facilitated by tools such as groupware, virtual private networks, conference calling, videoconferencing, and Voice

over IP (VOIP). It can be efficient and useful for companies as it allows workers to communicate over long distances, saving
significant amounts of travel time and cost. As broadband Internet connections become more commonplace, more and more

workers have adequate bandwidth at home to use these tools to link their home to their corporate intranet and internal phone

networks.

Crowdsourcing
Main article: Crowdsourcing

The Internet provides a particularly good venue for crowdsourcing (outsourcing tasks to a distributed group of people) since individuals

tend to be more open in web-based projects where they are not being physically judged or scrutinized and thus can feel more

comfortable sharing.

Crowdsourcing systems are used to accomplish a variety of tasks. For example, the crowd may be invited to develop a new

technology, carry out a design task, refine or carry out the steps of an algorithm (see human-based computation), or help capture,

systematize, or analyze large amounts of data (see also citizen science).

Wikis have also been used in the academic community for sharing and dissemination of information across institutional and

international boundaries.[78] In those settings, they have been found useful for collaboration on grant writing, strategic planning,

departmental documentation, and committee work.[79] The United States Patent and Trademark Office uses a wiki to allow the public

to collaborate on finding prior art relevant to examination of pending patent applications. Queens, New York has used a wiki to allow

citizens to collaborate on the design and planning of a local park.[80]

The English Wikipedia has the largest user base among wikis on the World Wide Web[81] and ranks in the top 10 among all Web

sites in terms of traffic.[82]

Politics and political revolutions


The Internet has achieved new relevance as a political tool. The presidential campaign of Howard Dean in 2004 in the United States

was notable for its success in soliciting donations via the Internet. Many political groups use the Internet to achieve a new method of organizing to carry out their mission, giving rise to Internet activism, most notably practiced by rebels in the Arab

Spring.[83][84]

The New York Times suggested that social media websites, such as Facebook and Twitter, helped people organize the political

revolutions in Egypt, where they enabled certain classes of protesters to organize protests, communicate grievances, and disseminate

information.[85]

The potential of the Internet as a civic tool of communicative power was thoroughly explored by Simon R. B. Berdal in his thesis of

2004:

As the globally evolving Internet provides ever new access points to virtual discourse forums, it also promotes new civic relations

and associations within which communicative power may flow and accumulate. Thus, traditionally ... national-embedded peripheries

get entangled into greater, international peripheries, with stronger combined powers... The Internet, as a consequence, changes the
topology of the "centre-periphery" model, by stimulating conventional peripheries to interlink into "super-periphery" structures, which

enclose and "besiege" several centres at once.[86]

Berdal, therefore, extends the Habermasian notion of the Public sphere to the Internet, and underlines the inherent global and civic

nature that interwoven Internet technologies provide. To limit the growing civic potential of the Internet, Berdal also notes how "self-

protective measures" are put in place by those threatened by it:

If we consider China’s attempts to filter "unsuitable material" from the Internet, most of us would agree that this resembles a self-

protective measure by the system against the growing civic potentials of the Internet. Nevertheless, both types represent limitations

to "peripheral capacities". Thus, the Chinese government tries to prevent communicative power to build up and unleash (as

the 1989 Tiananmen Square uprising suggests, the government may find it wise to install "upstream measures"). Even though

limited, the Internet is proving to be an empowering tool also to the Chinese periphery: Analysts believe that Internet petitions have

influenced policy implementation in favour of the public’s online-articulated will ... [86]

Philanthropy
The spread of low-cost Internet access in developing countries has opened up new possibilities for peer-to-peer charities, which

allow individuals to contribute small amounts to charitable projects for other individuals. Websites, such

as DonorsChoose and GlobalGiving, allow small-scale donors to direct funds to individual projects of their choice.

A popular twist on Internet-based philanthropy is the use of peer-to-peer lending for charitable purposes. Kiva pioneered this

concept in 2005, offering the first web-based service to publish individual loan profiles for funding. Kiva raises funds for local

intermediary microfinance organizations which post stories and updates on behalf of the borrowers. Lenders can contribute as little

as $25 to loans of their choice, and receive their money back as borrowers repay. Kiva falls short of being a pure peer-to-peer

charity, in that loans are disbursed before being funded by lenders and borrowers do not communicate with lenders themselves.[87][88]

However, the recent spread of low cost Internet access in developing countries has made genuine international person-to-person

philanthropy increasingly feasible. In 2009 the US-based nonprofit Zidisha tapped into this trend to offer the first person-to-person

microfinance platform to link lenders and borrowers across international borders without intermediaries. Members can fund loans for

as little as a dollar, which the borrowers then use to develop business activities that improve their families' incomes while repaying

loans to the members with interest. Borrowers access the Internet via public cybercafes, donated laptops in village schools, and

even smart phones, then create their own profile pages through which they share photos and information about themselves and

their businesses. As they repay their loans, borrowers continue to share updates and dialogue with lenders via their profile pages.

This direct web-based connection allows members themselves to take on many of the communication and recording tasks

traditionally performed by local organizations, bypassing geographic barriers and dramatically reducing the cost of microfinance

services to the entrepreneurs.[89]

Censorship
Internet censorship by country:[90][91][92] pervasive censorship; substantial censorship; selective censorship; changing situation; little or no censorship; not classified / no data.

Main articles: Internet censorship and Internet freedom

Some governments, such as those of Burma, Iran, North Korea, mainland China, Saudi Arabia, and the United Arab

Emirates restrict what people in their countries can access on the Internet, especially political and religious content. This is

accomplished through software that filters domains and content so that they may not be easily accessed or obtained without

elaborate circumvention.[93]

In Norway, Denmark, Finland, and Sweden, major Internet service providers have voluntarily, possibly to avoid such an arrangement

being turned into law, agreed to restrict access to sites listed by authorities. While this list of forbidden URLs is supposed to contain

addresses of only known child pornography sites, the content of the list is secret.[94] Many countries, including the United States,

have enacted laws against the possession or distribution of certain material, such as child pornography, via the Internet, but do not

mandate filtering software. There are many free and commercially available software programs, called content-control software, with

which a user can choose to block offensive websites on individual computers or networks, in order to limit a child's access to

pornographic materials or depiction of violence.

World Wide Web


From Wikipedia, the free encyclopedia
"WWW" and "The web" redirect here. For other uses of WWW, see WWW (disambiguation). For other
uses of web, see Web (disambiguation).

Not to be confused with the Internet.


World Wide Web

The web's logo designed by Robert Cailliau

Invented by Tim Berners-Lee[1][2]


Company CERN

Availability Worldwide

The World Wide Web (abbreviated as WWW or W3,[3] commonly known as the web) is a system of
interlinked hypertext documents accessed via the Internet. With a web browser, one can view web
pages that may contain text, images, videos, and other multimedia and navigate between them
via hyperlinks.

Tim Berners-Lee, a British computer scientist and at that time employee of CERN, a European research
organisation near Geneva,[4] wrote a proposal in March 1989 for what would eventually become the World
Wide Web.[1] The 1989 proposal was meant for a more effective CERN communication system but
Berners-Lee eventually realised the concept could be implemented throughout the world. [5] Berners-Lee
and Flemish computer scientist Robert Cailliau proposed in 1990 to use hypertext "to link and access
information of various kinds as a web of nodes in which the user can browse at will", [6] and Berners-Lee
finished the first website in December that year. [7] Berners-Lee posted the project on the alt.hypertext
newsgroup on 7 August 1991.[8]

Contents


 1 History

 2 Function

o 2.1 Linking

o 2.2 Dynamic updates of web pages

o 2.3 WWW prefix

o 2.4 Scheme specifiers: http and https

 3 Web servers

 4 Privacy

 5 Intellectual property
 6 Security

 7 Standards

 8 Accessibility

 9 Internationalization

 10 Statistics

 11 Speed issues

 12 Caching

 13 See also

 14 References

 15 Further reading

 16 External links

History
Main article: History of the World Wide Web

The NeXT Computer used by Berners-Lee. The handwritten label declares, "This machine is a server. DO NOT POWER IT
DOWN!!"

In the May 1970 issue of Popular Science magazine, Arthur C. Clarke predicted that satellites would
someday "bring the accumulated knowledge of the world to your fingertips" using a console that would
combine the functionality of the photocopier, telephone, television and a small computer, allowing data
transfer and video conferencing around the globe. [9]
In March 1989, Tim Berners-Lee wrote a proposal that referenced ENQUIRE, a database and software
project he had built in 1980, and described a more elaborate information management system. [10]

With help from Robert Cailliau, he published a more formal proposal (on 12 November 1990) to build a
"Hypertext project" called "WorldWideWeb" (one word, also "W3") as a "web" of "hypertext documents" to
be viewed by "browsers" using a client–server architecture.[6] This proposal estimated that a read-only
web would be developed within three months and that it would take six months to achieve "the creation of
new links and new material by readers, [so that] authorship becomes universal" as well as "the automatic
notification of a reader when new material of interest to him/her has become available." While the read-
only goal was met, accessible authorship of web content took longer to mature, with the wiki concept,
blogs, Web 2.0 and RSS/Atom.[11]

The proposal was modeled after the SGML reader Dynatext by Electronic Book Technology, a spin-off
from the Institute for Research in Information and Scholarship at Brown University. The Dynatext system,
licensed by CERN, was a key player in the extension of SGML ISO 8879:1986 to Hypermedia
within HyTime, but it was considered too expensive and had an inappropriate licensing policy for use in
the general high energy physics community, namely a fee for each document and each document
alteration.

The CERN datacenter in 2010 housing some WWW servers

A NeXT Computer was used by Berners-Lee as the world's first web server and also to write the first web
browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built all the tools necessary for a
working Web:[12] the first web browser (which was a web editor as well); the first web server; and the first
web pages,[13] which described the project itself.

The first web page may be lost, but Paul Jones, a computer technologist at UNC-Chapel Hill in North
Carolina, revealed in May 2013 that he has a copy of a page, given to him by Berners-Lee during a 1991 visit to
UNC, which is the oldest known web page. Jones stored it on a magneto-optical drive and on his
NeXT computer.[14]
On 6 August 1991, Berners-Lee posted a short summary of the World Wide Web project on
the alt.hypertext newsgroup.[15] This date also marked the debut of the Web as a publicly available
service on the Internet, although new users could only access it after 23 August. For this reason, 23 August is
considered Internaut Day. Many news media have reported that the first photo on the web was
uploaded by Berners-Lee in 1992, an image of the CERN house band Les Horribles Cernettes taken by
Silvano de Gennaro; Gennaro has disclaimed this story, writing that media were "totally distorting our
words for the sake of cheap sensationalism." [16]

The first server outside Europe was set up at the Stanford Linear Accelerator Center (SLAC) in Palo Alto,
California, to host the SPIRES-HEP database. Accounts differ substantially as to the date of this event.
The World Wide Web Consortium says December 1992,[17] whereas SLAC itself claims 1991.[18][19] This is
supported by a W3C document titled A Little History of the World Wide Web.[20]

The crucial underlying concept of hypertext originated with older projects from the 1960s, such as the
Hypertext Editing System (HES) at Brown University, Ted Nelson's Project Xanadu, and Douglas
Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar
Bush's microfilm-based "memex", which was described in the 1945 essay "As We May Think".[21]

Berners-Lee's breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he
explains that he had repeatedly suggested that a marriage between the two technologies was possible to
members of both technical communities, but when no one took up his invitation, he finally assumed the
project himself. In the process, he developed three essential technologies:

1. a system of globally unique identifiers for resources on the Web


and elsewhere, the universal document identifier (UDI), later
known as uniform resource locator (URL) and uniform resource
identifier (URI);
2. the publishing language HyperText Markup Language (HTML);

3. the Hypertext Transfer Protocol (HTTP).[22]

The World Wide Web had a number of differences from other hypertext systems available at the time. The
web required only unidirectional links rather than bidirectional ones, making it possible for someone to link
to another resource without action by the owner of that resource. It also significantly reduced the difficulty
of implementing web servers and browsers (in comparison to earlier systems), but in turn presented the
chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide Web was non-
proprietary, making it possible to develop servers and clients independently and to add extensions without
licensing restrictions. On 30 April 1993, CERN announced that the World Wide Web would be free to
anyone, with no fees due.[23] Coming two months after the announcement that the server implementation
of the Gopher protocol was no longer free to use, this produced a rapid shift away from Gopher and
towards the Web. An early popular web browser was ViolaWWW for Unix and the X Window System.

Robert Cailliau, Jean-François Abramatic of IBM, and Tim Berners-Lee at the 10th anniversary of the World Wide Web
Consortium.

Scholars generally agree that a turning point for the World Wide Web began with the introduction [24] of
the Mosaic web browser[25] in 1993, a graphical browser developed by a team at the National Center for
Supercomputing Applications at the University of Illinois at Urbana-Champaign (NCSA-UIUC), led
by Marc Andreessen. Funding for Mosaic came from the U.S. High-Performance Computing and
Communications Initiative and the High Performance Computing and Communication Act of 1991, one
of several computing developments initiated by U.S. Senator Al Gore.[26] Prior to the release of Mosaic,
graphics were not commonly mixed with text in web pages and the web's popularity was less than older
protocols in use over the Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic's
graphical user interface allowed the Web to become, by far, the most popular Internet protocol.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the European
Organization for Nuclear Research (CERN) in October 1994. It was founded at the Massachusetts
Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense
Advanced Research Projects Agency (DARPA), which had pioneered the Internet; a year later, a second
site was founded at INRIA (a French national computer research lab) with support from the European
Commission DG InfSo; and in 1996, a third continental site was created in Japan at Keio University. By
the end of 1994, while the total number of websites was still minute compared to present standards, quite
a number of notable websites were already active, many of which are the precursors or inspiration for
today's most popular services.

Connected by the existing Internet, other websites were created around the world, adding international
standards for domain names and HTML. Since then, Berners-Lee has played an active role in guiding the
development of web standards (such as the markup languages in which web pages are composed), and
has advocated his vision of a Semantic Web. The World Wide Web enabled the spread of information over
the Internet through an easy-to-use and flexible format. It thus played an important role in popularizing
use of the Internet.[27] Although the two terms are sometimes conflated in popular use, World Wide Web is
not synonymous with Internet.[28] The web is a collection of documents and both client and server software
using Internet protocols such as TCP/IP and HTTP.

Tim Berners-Lee was knighted in 2004 by Queen Elizabeth II for his contribution to the World Wide Web.

Function
The terms Internet and World Wide Web are often used in everyday speech without much distinction.
However, the Internet and the World Wide Web are not the same. The Internet is a global system of
interconnected computer networks. In contrast, the web is one of the services that runs on the Internet. It
is a collection of text documents and other resources, linked by hyperlinks and URLs, usually accessed
by web browsers from web servers. In short, the web can be thought of as an application "running" on the
Internet.[29]

Viewing a web page on the World Wide Web normally begins either by typing the URL of the page into
a web browser or by following a hyperlink to that page or resource. The web browser then initiates a
series of communication messages, behind the scenes, in order to fetch and display it. In the 1990s,
using a browser to view web pages—and to move from one web page to another through hyperlinks—
came to be known as 'browsing,' 'web surfing,' or 'navigating the web'. Early studies of this new behavior
investigated user patterns in using web browsers. One study, for example, found five user patterns:
exploratory surfing, window surfing, evolved surfing, bounded navigation and targeted navigation. [30]

The following example demonstrates how a web browser works. Consider accessing a page with the
URL http://example.org/wiki/World_Wide_Web.

First, the browser resolves the server-name portion of the URL (example.org) into an Internet Protocol
address using the globally distributed database known as the Domain Name System (DNS); this lookup
returns an IP address such as 208.80.152.2. The browser then requests the resource by sending
an HTTP request across the Internet to the computer at that particular address. It makes the request to a
particular application port in the underlying Internet Protocol Suite so that the computer receiving the
request can distinguish an HTTP request from other network protocols it may be servicing such as e-mail
delivery; the HTTP protocol normally uses port 80. The content of the HTTP request can be as simple as
the two lines of text:

    GET /wiki/World_Wide_Web HTTP/1.1
    Host: example.org

The computer receiving the HTTP request delivers it to web server software listening for requests on port
80. If the web server can fulfill the request it sends an HTTP response back to the browser indicating
success, which can be as simple as:

    HTTP/1.0 200 OK
    Content-Type: text/html; charset=UTF-8

followed by the content of the requested page. The Hypertext Markup Language for a basic web page looks like:

    <html>
      <head>
        <title>Example.org – The World Wide Web</title>
      </head>
      <body>
        <p>The World Wide Web, abbreviated as WWW and commonly known ...</p>
      </body>
    </html>

The web browser parses the HTML, interpreting the markup (<title>, <p> for paragraph, and such) that
surrounds the words in order to draw the text on the screen.

Many web pages use HTML to reference the URLs of other resources such as images, other embedded
media, scripts that affect page behavior, and Cascading Style Sheets that affect page layout. The browser
will make additional HTTP requests to the web server for these other Internet media types. As it receives
their content from the web server, the browser progressively renders the page onto the screen as specified
by its HTML and these additional resources.
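As an illustration of the fetch sequence just described, the following minimal Python sketch resolves the server name via DNS and sends the raw HTTP request from the example above over a socket; the host and path are the hypothetical ones used in this section, so the response is simply whatever example.org actually returns.

```python
import socket

host, path = "example.org", "/wiki/World_Wide_Web"  # hypothetical URL from the text

# Step 1: resolve the server name to an IP address using DNS.
ip_address = socket.gethostbyname(host)

# Step 2: send the HTTP request to port 80 at that address.
request = (
    f"GET {path} HTTP/1.1\r\n"
    f"Host: {host}\r\n"
    "Connection: close\r\n"
    "\r\n"
)
with socket.create_connection((ip_address, 80)) as sock:
    sock.sendall(request.encode("ascii"))
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

# Step 3: the response begins with a status line such as "HTTP/1.1 200 OK",
# followed by headers and then the body of the page.
print(response.split(b"\r\n", 1)[0].decode())
```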

Linking
Most web pages contain hyperlinks to other related pages and perhaps to downloadable files, source
documents, definitions and other web resources. In the underlying HTML, a hyperlink looks like:

    <a href="http://example.org/wiki/Main_Page">Example.org, a free encyclopedia</a>

Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks

Such a collection of useful, related resources, interconnected via hypertext links is dubbed a web of
information. Publication on the Internet created what Tim Berners-Lee first called the WorldWideWeb (in
its original CamelCase, which was subsequently discarded) in November 1990.[6]

The hyperlink structure of the WWW is described by the webgraph: the nodes of
the webgraph correspond to the web pages (or URLs), and the directed edges between them to the hyperlinks.

Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced with
different content. This makes hyperlinks obsolete, a phenomenon referred to in some circles as link
rot; the hyperlinks affected by it are often called dead links. The ephemeral nature of the Web has
prompted many efforts to archive web sites. The Internet Archive, active since 1996, is the best known of
such efforts.
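A minimal sketch of the webgraph idea in Python, with made-up URLs: pages are nodes, hyperlinks are directed edges, and a hyperlink whose target is no longer among the known pages is a dead link.

```python
# Webgraph as an adjacency list: node (URL) -> list of outgoing hyperlinks (directed edges).
# The URLs are invented purely for illustration.
webgraph = {
    "http://example.org/a": ["http://example.org/b", "http://example.org/old-page"],
    "http://example.org/b": ["http://example.org/a", "http://example.org/c"],
    "http://example.org/c": [],
}

def dead_links(graph):
    """Return (source, target) pairs whose target page is missing from the graph (link rot)."""
    known = set(graph)
    return {(src, dst) for src, targets in graph.items() for dst in targets if dst not in known}

print(dead_links(webgraph))  # {('http://example.org/a', 'http://example.org/old-page')}
```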
Dynamic updates of web pages
Main article: Ajax (programming)

JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then of Netscape,
for use within web pages.[31] The standardised version is ECMAScript.[31] To make web pages more
interactive, some web applications also use JavaScript techniques such as Ajax (asynchronous JavaScript
and XML). Client-side script delivered with the page can make additional HTTP requests to the
server, either in response to user actions such as mouse movements or clicks, or based on lapsed time.
The server's responses are used to modify the current page rather than creating a new page with each
response, so the server needs only to provide limited, incremental information. Multiple Ajax requests can
be handled at the same time, and users can interact with the page while data is being retrieved. Web
pages may also regularly poll the server to check whether new information is available.[32]

WWW prefix
Many hostnames used for the World Wide Web begin with www because of the long-standing practice of
naming Internet hosts (servers) according to the services they provide. The hostname for a web server is
often www, in the same way that it may be ftp for an FTP server, and news or nntp for a USENET news
server. These host names appear as Domain Name System (DNS) subdomain names, as
in www.example.com. The use of 'www' as a subdomain name is not required by any technical or policy
standard and many web sites do not use it; indeed, the first ever web server was
called nxoc01.cern.ch.[33] According to Paolo Palazzi,[34] who worked at CERN along with Tim Berners-
Lee, the popular use of the 'www' subdomain was accidental; the World Wide Web project page was intended
to be published at www.cern.ch while info.cern.ch was intended to be the CERN home page; however, the
DNS records were never switched, and the practice of prepending 'www' to an institution's website domain
name was subsequently copied. Many established websites still use 'www', or they invent other
subdomain names such as 'www2', 'secure', etc.[citation needed]. Many such web servers are set up so that
both the domain root (e.g., example.com) and the www subdomain (e.g., www.example.com) refer to the
same site; others require one form or the other, or they may map to different web sites.

The use of a subdomain name is useful for load balancing incoming web traffic by creating a CNAME
record that points to a cluster of web servers. Since, currently, only a subdomain can be used in a
CNAME, the same result cannot be achieved by using the bare domain root. [citation needed]
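The effect of such DNS records can be observed from a script; the following Python sketch (the hostname is only an illustrative placeholder, and the lookup requires network access) prints the canonical name, any aliases, and the IP addresses behind a 'www' hostname.

```python
import socket

# gethostbyname_ex returns (canonical_name, alias_list, ip_address_list).
# If the queried name is an alias (for example, a CNAME pointing at a server
# cluster), the canonical name returned differs from the name looked up.
canonical, aliases, addresses = socket.gethostbyname_ex("www.example.com")
print("canonical:", canonical)
print("aliases:  ", aliases)
print("addresses:", addresses)
```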

When a user submits an incomplete domain name to a web browser in its address bar input field, some
web browsers automatically try adding the prefix "www" to the beginning of it and possibly ".com", ".org"
and ".net" at the end, depending on what might be missing. For example, entering 'microsoft' may be
transformed to http://www.microsoft.com/ and 'openoffice' to http://www.openoffice.org. This feature started
appearing in early versions of Mozilla Firefox, when it still had the working title 'Firebird' in early 2003,
from an earlier practice in browsers such as Lynx.[35] It is reported that Microsoft was granted a US patent
for the same idea in 2008, but only for mobile devices. [36]

In English, www is usually read as double-u double-u double-u.[citation needed] Some users pronounce it dub-
dub-dub, particularly in New Zealand. Stephen Fry, in his "Podgrammes" series of podcasts, pronounces
it wuh wuh wuh.[citation needed] The English writer Douglas Adams once quipped in The Independent on
Sunday (1999): "The World Wide Web is the only thing I know of whose shortened form takes three times
longer to say than what it's short for".[citation needed] In Mandarin Chinese, World Wide Web is commonly
translated via a phono-semantic matching to wàn wéi wǎng (万维网), which satisfies www and literally
means "myriad dimensional net",[37] a translation that very appropriately reflects the design concept and
proliferation of the World Wide Web. Tim Berners-Lee's web-space states that World Wide Web is
officially spelled as three separate words, each capitalised, with no intervening hyphens. [38]

Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain names and
make them easily pronounceable.[39] As the mobile web grows in popularity, services
like Gmail.com, MySpace.com, Facebook.com and Twitter.com are most often discussed without adding
www to the domain (or, indeed, the .com).

Scheme specifiers: http and https


The scheme specifier http:// or https:// at the start of a web URI refers to Hypertext Transfer
Protocol or HTTP Secure respectively. Unlike www, which has no specific purpose, these specify the
communication protocol to be used for the request and response. The HTTP protocol is fundamental to
the operation of the World Wide Web and the added encryption layer in HTTPS is essential when
confidential information such as passwords or banking information is to be exchanged over the public
Internet. Web browsers usually prepend http:// to addresses if the scheme is omitted.
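A rough sketch of that convenience behaviour, using Python's standard urllib.parse module; the normalize helper is hypothetical and not part of any browser API.

```python
from urllib.parse import urlparse

def normalize(address: str) -> str:
    """Prepend 'http://' when the address has no scheme, mimicking browser behaviour."""
    if not urlparse(address).scheme:
        return "http://" + address
    return address

print(normalize("example.org/wiki/URL"))  # http://example.org/wiki/URL
print(normalize("https://example.org/"))  # unchanged: https://example.org/
```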

Web servers
Main article: Web server

The primary function of a web server is to deliver web pages to clients on request. This means
delivery of HTML documents and any additional content that may be included by a document, such as
images, style sheets and scripts.
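A minimal sketch of that role using Python's standard http.server module: it listens on a port and answers each GET request with a small HTML document. The port number and page content are arbitrary choices for the example.

```python
from http.server import HTTPServer, BaseHTTPRequestHandler

PAGE = b"<html><body><p>Hello from a tiny web server.</p></body></html>"

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Deliver an HTML document in response to every GET request.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

if __name__ == "__main__":
    HTTPServer(("", 8000), Handler).serve_forever()  # serves http://localhost:8000/
```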

Privacy
Main article: Internet privacy

Every time a web page is requested from a web server, the server can identify, and usually logs, the IP
address from which the request arrived. Equally, unless set not to do so, most web browsers record the
web pages that have been requested and viewed in a history feature, and usually cache much of the
content locally. Unless HTTPS encryption is used, web requests and responses travel in plain text across
the Internet, and they can be viewed, recorded and cached by intermediate systems.

When a web page asks for, and the user supplies, personally identifiable information such as their real
name, address, e-mail address, etc., then a connection can be made between the current web traffic and
that individual. If the website uses HTTP cookies, username and password authentication, or other
tracking techniques, then it will be able to relate other web visits, before and after, to the identifiable
information provided. In this way it is possible for a web-based organisation to develop and build a profile
of the individual people who use its site or sites. It may be able to build a record for an individual that
includes information about their leisure activities, their shopping interests, their profession, and other
aspects of their demographic profile. These profiles are obviously of potential interest to marketeers,
advertisers and others. Depending on the website's terms and conditions and the local laws that apply
information from these profiles may be sold, shared, or passed to other organisations without the user
being informed. For many ordinary people, this means little more than some unexpected e-mails in their
in-box, or some uncannily relevant advertising on a future web page. For others, it can mean that time
spent indulging an unusual interest can result in a deluge of further targeted marketing that may be
unwelcome. Law enforcement, counter terrorism and espionage agencies can also identify, target and
track individuals based on what appear to be their interests or proclivities on the web.

Social networking sites make a point of trying to get the user to truthfully expose their real names,
interests and locations. This makes the social networking experience more realistic and therefore
engaging for all their users. On the other hand, photographs uploaded and unguarded statements made
will be identified to the individual, who may regret some decisions to publish these data. Employers,
schools, parents and other relatives may be influenced by aspects of social networking profiles that the
posting individual did not intend for these audiences. On-line bullies may make use of personal
information to harass or stalk users. Modern social networking websites allow fine grained control of the
privacy settings for each individual posting, but these can be complex and not easy to find or use,
especially for beginners.[40]

Photographs and videos posted onto websites have caused particular problems, as they can add a
person's face to an on-line profile. With modern and potential facial recognition technology, it may then be
possible to relate that face with other, previously anonymous, images, events and scenarios that have
been imaged elsewhere. Because of image caching, mirroring and copying, it is difficult to remove an
image from the World Wide Web.

Intellectual property
Main article: Intellectual property

The intellectual property rights for any creative work initially rest with its creator. Web users who want to
publish their work onto the World Wide Web, however, need to be aware of the details of the way they do
it. If artwork, photographs, writings, poems, or technical innovations are published by their creator onto a
privately owned web server, then they may choose the copyright and other conditions freely themselves.
This is unusual though; more commonly work is uploaded to websites and servers that are owned by
other organizations. The terms and conditions of the site or service provider determine to what
extent the original owner automatically signs over rights to their work by the choice of destination and by
the act of uploading.[citation needed]

Some users of the web erroneously assume that everything they may find online is freely available to
them as if it were in the public domain, which is not always the case. Content owners who are aware of this
widespread belief may expect that their published content will probably be used in some capacity
somewhere without their permission. Some content publishers therefore embed digital watermarks in their
media files, sometimes charging users to receive unmarked copies for legitimate use. Digital rights
management includes forms of access control technology that further limit the use of digital content even
after it has been bought or downloaded.[citation needed]

Security
The web has become criminals' preferred pathway for spreading malware. Cybercrime carried out on the
web can include identity theft, fraud, espionage and intelligence gathering.[41] Web-based
vulnerabilities now outnumber traditional computer security concerns,[42][43] and, as measured
by Google, about one in ten web pages may contain malicious code.[44] Most web-based attacks take
place on legitimate websites, and most, as measured by Sophos, are hosted in the United States, China
and Russia.[45] The most common of all malware threats are SQL injection attacks against websites.[46]
Through HTML and URIs the web was vulnerable to attacks like cross-site scripting (XSS) that came
with the introduction of JavaScript[47] and were exacerbated to some degree by Web 2.0 and Ajax web
design that favors the use of scripts.[48] Today by one estimate, 70% of all websites are open to XSS
attacks on their users.[49]
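Two of the defences implied above can be sketched in a few lines of Python: escaping untrusted input before writing it into HTML (against XSS), and using parameterised queries instead of string concatenation (against SQL injection). The table and input values are invented for the example.

```python
import html
import sqlite3

untrusted = '<script>alert("xss")</script>'

# XSS mitigation: escape user-supplied text before embedding it in a page.
safe_fragment = "<p>" + html.escape(untrusted) + "</p>"
print(safe_fragment)  # <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>

# SQL injection mitigation: pass values as parameters, never by string formatting.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
rows = conn.execute("SELECT name FROM users WHERE name = ?", (untrusted,)).fetchall()
print(rows)  # [] -- the malicious string is treated as data, not as SQL
```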

Proposed solutions vary to extremes. Large security vendors like McAfee already design governance and
compliance suites to meet post-9/11 regulations,[50] and some, like Finjan, have recommended active real-
time inspection of code and all content regardless of its source.[41] Some have argued that for enterprises
to see security as a business opportunity rather than a cost center,[51] "ubiquitous, always-on digital rights
management" enforced in the infrastructure by a handful of organizations must replace the hundreds of
companies that today secure data and networks.[52] Jonathan Zittrain has said users sharing responsibility
for computing safety is far preferable to locking down the Internet. [53]
Standards
Main article: Web standards

Many formal standards and other technical specifications and software define the operation of different
aspects of the World Wide Web, the Internet, and computer information exchange. Many of the
documents are the work of the World Wide Web Consortium (W3C), headed by Berners-Lee, but some
are produced by the Internet Engineering Task Force (IETF) and other organizations.

Usually, when web standards are discussed, the following publications are seen as foundational:

 Recommendations for markup languages,


especially HTML and XHTML, from the W3C. These define the
structure and interpretation of hypertext documents.
 Recommendations for stylesheets, especially CSS, from the W3C.

 Standards for ECMAScript (usually in the form of JavaScript),


from Ecma International.

 Recommendations for the Document Object Model, from W3C.

Additional publications provide definitions of other essential technologies for the World Wide Web,
including, but not limited to, the following:

 Uniform Resource Identifier (URI), which is a universal system for


referencing resources on the Internet, such as hypertext documents
and images. URIs, often called URLs, are defined by the IETF's RFC
3986 / STD 66: Uniform Resource Identifier (URI): Generic Syntax, as
well as its predecessors and numerous URI scheme-defining RFCs;
 HyperText Transfer Protocol (HTTP), especially as defined by RFC
2616: HTTP/1.1 and RFC 2617: HTTP Authentication, which specify
how the browser and server authenticate each other.

Accessibility
Main article: Web accessibility

There are methods available for accessing the web in alternative mediums and formats, so as to enable
use by individuals with disabilities. These disabilities may be visual, auditory, physical, speech related,
cognitive, neurological, or some combination thereof. Accessibility features also help people with temporary
disabilities, like a broken arm, and the aging population as their abilities change.[54] The Web is used for
receiving information as well as providing information and interacting with society. The World Wide Web
Consortium claims it is essential that the Web be accessible in order to provide equal access and equal
opportunity to people with disabilities.[55] Tim Berners-Lee once noted, "The power of the Web is in its
universality. Access by everyone regardless of disability is an essential aspect." [54] Many countries
regulate web accessibility as a requirement for websites.[56] International cooperation in the W3C Web
Accessibility Initiative led to simple guidelines that web content authors as well as software developers
can use to make the Web accessible to persons who may or may not be using assistive technology.[54][57]

Internationalization
The W3C Internationalization Activity works to ensure that web technology functions in all languages, scripts, and
cultures.[58] Beginning in 2004 or 2005, Unicode gained ground and eventually, in December 2007,
surpassed both ASCII and Western European encodings as the Web's most frequently used character encoding.[59]
Originally, RFC 3986 allowed resources to be identified by URI in a subset of US-ASCII. RFC
3987 allows more characters—any character in the Universal Character Set—and now a resource can be
identified by IRI in any language.[60]
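A small Python sketch of the difference: a path or hostname containing non-ASCII characters (the IRI form) can be converted to the ASCII-only URI form with percent-encoding and IDNA. The names used are invented examples.

```python
from urllib.parse import quote

# Percent-encode a non-ASCII path (the UTF-8 bytes are encoded by default).
iri_path = "/wiki/Überblick"
print(quote(iri_path))                  # /wiki/%C3%9Cberblick

# Internationalised hostnames use IDNA ("punycode") rather than percent-encoding.
print("bücher.example".encode("idna"))  # b'xn--bcher-kva.example'
```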

Statistics
Between 2005 and 2010, the number of web users doubled, and was expected to surpass two billion in
2010.[61] Early studies in 1998 and 1999 estimating the size of the web using capture/recapture methods
showed that much of the web was not indexed by search engines and the web was much larger than
expected.[62][63] According to a 2001 study, there was a massive number of documents on the Web, over 550 billion,
mostly in the invisible Web, or Deep Web.[64] A 2002 survey of 2,024 million web
pages[65] determined that by far the most web content was in the English language: 56.4%; next were
pages in German (7.7%), French (5.6%), and Japanese (4.9%). A more recent study, which used web
searches in 75 different languages to sample the web, determined that there were over 11.5 billion web
pages in the publicly indexable web as of the end of January 2005.[66] As of March 2009, the indexable
web contains at least 25.21 billion pages.[67] On 25 July 2008, Google software engineers Jesse Alpert
and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs.[68] As of May
2009, over 109.5 million domains operated.[69][not in citation given] Of these 74% were commercial or other
domains operating in the .com generic top-level domain.[69]

Statistics measuring a website's popularity are usually based either on the number of page views or on
associated server 'hits' (file requests) that it receives.

Speed issues
Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow
browsing has led to a pejorative name for the World Wide Web: the World Wide Wait.[70] Speeding up the
Internet is an ongoing discussion over the use of peering and QoS technologies. Other solutions to
reduce the congestion can be found at W3C.[71] Guidelines for web response times are:[72]

 0.1 second (one tenth of a second). Ideal response time. The user does
not sense any interruption.
 1 second. Highest acceptable response time. Download times above 1
second interrupt the user experience.

 10 seconds. Unacceptable response time. The user experience is


interrupted and the user is likely to leave the site or system.
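Those thresholds can be checked empirically; a small Python sketch (the URL is arbitrary and network access is required) times a full download and classifies it against the guideline values above.

```python
import time
import urllib.request

def timed_fetch(url: str) -> float:
    """Return the elapsed seconds for downloading the given URL."""
    start = time.perf_counter()
    urllib.request.urlopen(url, timeout=30).read()
    return time.perf_counter() - start

elapsed = timed_fetch("http://example.org/")
if elapsed <= 0.1:
    verdict = "feels instantaneous"
elif elapsed <= 1.0:
    verdict = "noticeable, but the flow of thought is not interrupted"
else:
    verdict = "likely to interrupt the user experience"
print(f"{elapsed:.3f} s: {verdict}")
```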

Caching
Main article: Web cache

If a user revisits a web page after only a short interval, the page data may not need to be re-obtained from
the source web server. Almost all web browsers cache recently obtained data, usually on the local hard
drive. HTTP requests sent by a browser will usually ask only for data that has changed since the last
download. If the locally cached data are still current, they will be reused. Caching helps reduce the
amount of web traffic on the Internet. The decision about expiration is made independently for each
downloaded file, whether image, stylesheet, JavaScript, HTML, or other web resource. Thus even on
sites with highly dynamic content, many of the basic resources need to be refreshed only occasionally.
Web site designers find it worthwhile to collate resources such as CSS data and JavaScript into a few
site-wide files so that they can be cached efficiently. This helps reduce page download times and lowers
demands on the Web server.

There are other components of the Internet that can cache web content. Corporate and
academic firewalls often cache Web resources requested by one user for the benefit of all. (See
also caching proxy server.) Some search engines also store cached content from websites. Apart from the
facilities built into web servers that can determine when files have been updated and so need to be re-
sent, designers of dynamically generated web pages can control the HTTP headers sent back to
requesting users, so that transient or sensitive pages are not cached. Internet banking and news sites
frequently use this facility. Data requested with an HTTP 'GET' is likely to be cached if other conditions
are met; data obtained in response to a 'POST' is assumed to depend on the data that was POSTed and
so is not cached.
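The conditional request described above can be sketched with Python's standard http.client module: the first response supplies a Last-Modified date, and a second request sends it back in If-Modified-Since; a 304 status means the cached copy is still current. The host is an arbitrary example, and real servers may omit the Last-Modified header.

```python
import http.client
from email.utils import formatdate

host, path = "example.org", "/"

# First request: download the page and note when the server says it last changed.
conn = http.client.HTTPConnection(host)
conn.request("GET", path)
first = conn.getresponse()
cached_body = first.read()
last_modified = first.getheader("Last-Modified") or formatdate(usegmt=True)
conn.close()

# Second request: ask only for data that has changed since that date.
conn = http.client.HTTPConnection(host)
conn.request("GET", path, headers={"If-Modified-Since": last_modified})
second = conn.getresponse()
second.read()
conn.close()

print(second.status)  # 304 means "Not Modified": reuse the locally cached copy
```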

Web search engine


From Wikipedia, the free encyclopedia

"Search engine" redirects here. For a tutorial on using search engines for research, see WP:Search engine test. For other uses,

see Search engine (disambiguation).

A web search engine is a software system that is designed to search for information on the World Wide Web. The search results

are generally presented in a line of results often referred to as search engine results pages (SERPs). The information may be a

mix of web pages, images, and other types of files. Some search engines also mine data available

in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also

maintain real-time information by running an algorithm on a web crawler.

Contents


 1 History

 2 How web search engines work

 3 Market share

 4 Search engine bias

 5 Customized results and filter bubbles

 6 See also

 7 References

 8 Further reading

 9 External links

History

Timeline (full list)

Year Engine Current status

1993 W3Catalog Inactive

Aliweb Inactive
JumpStation Inactive

1994 WebCrawler Active, Aggregator

Go.com Active, Yahoo Search

Lycos Active

Infoseek Inactive

1995 AltaVista Inactive, redirected to Yahoo!

Daum Active

Magellan Inactive

Excite Active

SAPO Active

Yahoo! Active, Launched as a directory

1996 Dogpile Active, Aggregator

Inktomi Inactive, acquired by Yahoo!

HotBot Active (lycos.com)

Ask Jeeves Active (rebranded ask.com)

1997 Northern Light Inactive

Yandex Active

1998 Goto Inactive


Google Active

MSN Search Active as Bing

empas Inactive (merged with NATE)

1999 AlltheWeb Inactive (URL redirected to Yahoo!)

GenieKnows Active, rebranded Yellowee.com

Naver Active

Teoma Active

Vivisimo Inactive

2000 Baidu Active

Exalead Active

Gigablast Active

2002 Inktomi Acquired by Yahoo!

2003 Info.com Active

Scroogle Inactive

2004 Yahoo! Search Active, Launched own web search

(see Yahoo! Directory, 1995)

A9.com Inactive

Sogou Active
2005 AOL Search Active

Ask.com Active

GoodSearch Active

SearchMe Inactive

2006 wikiseek Inactive

Quaero Active

Ask.com Active

Live Search Active as Bing, Launched as

rebranded MSN Search

ChaCha Active

Guruji.com Active as BeeMP3.com

2007 wikiseek Inactive

Sproose Inactive

Wikia Search Inactive

Blackle.com Active, Google Search

2008 Powerset Inactive (redirects to Bing)

Picollator Inactive

Viewzi Inactive
Boogami Inactive

LeapFish Inactive

Forestle Inactive (redirects to Ecosia)

DuckDuckGo Active

2009 Bing Active, Launched as

rebranded Live Search

Yebol Inactive

Mugurdy Inactive due to a lack of funding

Goby Active

NATE Active

2010 Blekko Active

Cuil Inactive

Yandex Active, Launched global

(English) search

2011 YaCy Active, P2P web search engine

2012 Volunia Active

Cloud Kite Active,

formerly Open Drive cloud search

During early development of the web, there was a list of webservers edited by Tim Berners-Lee and hosted on

the CERN webserver. One historical snapshot of the list in 1992 remains,[1] but as more and more webservers went online the

central list could no longer keep up. On the NCSA site, new servers were announced under the title "What's New!"[2]
The very first tool used for searching on the Internet was Archie.[3] The name stands for "archive" without the "v". It was created in

1990 by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal. The program

downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a

searchable database of file names; however, Archie did not index the contents of these sites since the amount of data was so limited

it could be readily searched manually.

The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search

programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica

(Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in

the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu

information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic

book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.

In the summer of 1993, no search engine existed for the web, though numerous specialized catalogues were maintained by

hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that periodically mirrored these pages and rewrote

them into a standard format. This formed the basis for W3Catalog, the web's first primitive search engine, released on September 2,

1993.[4]

In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web

Wanderer, and used it to generate an index called 'Wandex'. The purpose of the Wanderer was to measure the size of the World

Wide Web, which it did until late 1995. The web's second search engine Aliweb appeared in November 1993. Aliweb did not use

a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a

particular format.

JumpStation (created in December 1993[5] by Jonathon Fletcher) used a web robot to find web pages and to build its index, and

used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three

essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources

available on the platform it ran on, its indexing and hence searching were limited to the titles and headings found in the web pages

the crawler encountered.

One of the first "all text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it

allowed users to search for any word in any webpage, which has become the standard for all major search engines since. It was

also the first one widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and

became a major commercial endeavor.

Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern

Light, and AltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function

operated on its web directory, rather than its full-text copies of web pages. Information seekers could also browse the directory

instead of doing a keyword-based search.


Google adopted the idea of selling search terms in 1998, from a small search engine company named goto.com. This move had a

significant effect on the search engine business, which went from struggling to being one of the most profitable businesses on the Internet.[6]

In 1996, Netscape was looking to give a single search engine an exclusive deal as the featured search engine on Netscape's web

browser. There was so much interest that instead Netscape struck deals with five of the major search engines: for $5 million a year,

each search engine would be in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos,

Infoseek, and Excite.[7][8]

Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s.[9]

Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken

down their public search engine, and are marketing enterprise-only editions, such as Northern Light. Many search engine

companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.

Around 2000, Google's search engine rose to prominence.[10] The company achieved better results for many searches with an

innovation called PageRank. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and

pages that link there, on the premise that good or desirable pages are linked to more than others. Google also maintained a

minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal. In fact,

the Google search engine became so popular that spoof engines emerged, such as Mystery Seeker.
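The idea behind PageRank can be sketched as a short power iteration in Python over a toy link graph; the pages and damping factor here are illustrative, not Google's production algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]} -> {page: rank score}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank

toy_graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
print(pagerank(toy_graph))  # C is linked to by both A and B, so it ranks highest
```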

By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002,

and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! switched to Google's search engine until 2004, when it

launched its own search engine based on the combined technologies of its acquisitions.

Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display

listings from Looksmart, blended with results from Inktomi. For a short time in 1999, MSN Search used results from AltaVista

instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).

Microsoft's rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal

in which Yahoo! Search would be powered by Microsoft Bing technology.

In 2012, following the April 24 release of Google Drive, Google released the Beta version of Open Drive (available as a Chrome

app) to enable the search of files in the cloud. Open Drive has now been rebranded as Cloud Kite. Cloud Kite is advertised as a

"collective encyclopedia project based on Google Drive public files and on the crowd sharing, crowd sourcing and crowd-solving

principles". Cloud Kite will also return search results from other cloud storage content services including Dropbox, SkyDrive,

Evernote and Box.[11]

How web search engines work

A search engine operates in the following order:

1. Web crawling

2. Indexing

3. Searching[12]

Web search engines work by storing information about many web pages, which they retrieve from the HTML markup of the pages.

These pages are retrieved by a Web crawler (sometimes also known as a spider), an automated Web browser which follows every

link on the site. The site owner can exclude specific pages by using robots.txt.
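A minimal sketch of such a crawler in Python, using only the standard library: it checks robots.txt before fetching, follows the links it finds, and stops after a handful of pages. The seed URL is a placeholder and error handling is kept to a bare minimum.

```python
from html.parser import HTMLParser
from urllib import request, robotparser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collect the href targets of <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed, limit=10):
    robots = robotparser.RobotFileParser()
    robots.set_url(urljoin(seed, "/robots.txt"))
    robots.read()

    pages, queue, seen = {}, [seed], set()
    while queue and len(pages) < limit:
        url = queue.pop(0)
        if url in seen or not robots.can_fetch("*", url):
            continue  # skip pages the site owner has excluded via robots.txt
        seen.add(url)
        try:
            text = request.urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except OSError:
            continue
        pages[url] = text
        parser = LinkParser()
        parser.feed(text)
        queue.extend(urljoin(url, link) for link in parser.links)
    return pages  # {url: html}, ready to be handed to the indexer

# pages = crawl("http://example.org/")  # placeholder seed URL
```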

The search engine then analyzes the contents of each page to determine how it should be indexed (for example, words can be

extracted from the titles, page content, headings, or special fields called meta tags). Data about web pages are stored in an index

database for use in later queries. A query from a user can be a single word. The index helps find information relating to the query as

quickly as possible.[12] Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as

information about the web pages, whereas others, such as AltaVista, store every word of every page they find.[citation needed] This

cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the

content of the current page has been updated and the search terms are no longer in it. [12] This problem might be considered a mild

form of linkrot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the

returned webpage. This satisfies the principle of least astonishment, since the user normally expects that the search terms will be on

the returned pages. Increased search relevance makes these cached pages very useful as they may contain data that may no

longer be available elsewhere.[citation needed]

High-level architecture of a standard Web crawler

When a user enters a query into a search engine (typically by using keywords), the engine examines its index and provides a listing

of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes

parts of the text. The index is built from the information stored with the data and the method by which the information is indexed.[12]

From 2007 the Google.com search engine has allowed one to search by date by clicking 'Show search tools' in the leftmost

column of the initial search results page, and then selecting the desired date range.[citation needed] Most search engines support the use

of the boolean operators AND, OR and NOT to further specify the search query. Boolean operators are for literal searches that allow

the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search

engines provide an advanced feature called proximity search, which allows users to define the distance between keywords.[12]There

is also concept-based searching where the research involves using statistical analysis on pages containing the words or phrases

you search for. As well, natural language queries allow the user to type a question in the same form one would ask it to a human. A

site like this would be ask.com.[citation needed]

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web

pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most

search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are

the best matches, and what order the results should be shown in, varies widely from one engine to another. [12] The methods also

change over time as Internet usage changes and new techniques evolve. There are two main types of search engine that have

evolved: one is a system of predefined and hierarchically ordered keywords that humans have programmed extensively. The other

is a system that generates an "inverted index" by analyzing texts it locates. This second form relies much more heavily on the computer

itself to do the bulk of the work.
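A toy version of that second approach in Python: build an inverted index from crawled pages (word to set of URLs) and answer a query as the intersection of the postings for its terms, i.e. an implicit boolean AND. The documents are invented examples.

```python
import re
from collections import defaultdict

def build_inverted_index(pages):
    """pages: {url: text} -> {word: set of urls containing that word}."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs containing every term of the query (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

docs = {"u1": "the world wide web", "u2": "web search engine", "u3": "wide area network"}
idx = build_inverted_index(docs)
print(search(idx, "web"))       # {'u1', 'u2'}
print(search(idx, "wide web"))  # {'u1'}
```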

Most Web search engines are commercial ventures supported by advertising revenue and thus some of them allow advertisers

to have their listings ranked higher in search results for a fee. Search engines that do not accept money for their search results

make money by running search related ads alongside the regular search engine results. The search engines make money every

time someone clicks on one of these ads.[13]

Market share

Search engine Market share in May 2011 Market share in December 2010[14]

Google 82.80% 84.65%

Yahoo! 6.42% 6.69%



Baidu 4.89% 3.39%

Bing 3.91% 3.29%

Yandex 1.7% 1.3%

Ask 0.52% 0.56%

AOL 0.3% 0.42%

Google's worldwide market share peaked at 86.3% in April 2010.[15] Yahoo!, Bing and other search engines are more popular in the

US than in Europe.

According to Hitwise, market share in the USA for October 2011 was Google 65.38%, Bing-powered (Bing and Yahoo!) 28.62%, and

the remaining 66 search engines 6%. However, an Experian Hitwise report released in August 2011 gave the "success rate" of

searches sampled in July. Over 80 percent of Yahoo! and Bing searches resulted in the users visiting a web site, while Google's rate

was just under 68 percent.[16][17]

In the People's Republic of China, Baidu held a 61.6% market share for web search in July 2009.[18] In the Russian

Federation, Yandex holds around 60% of the market share as of April 2012.[19] In July 2013, Google controlled 84% of the global and 88% of the US

market share for web search.[20] In South Korea, Naver (Hangul: 네이버) is a popular search portal, which holds a market share of

over 70% at least since 2011,[21] continuing to 2013.[22]

Search engine bias

Although search engines are programmed to rank websites based on some combination of their popularity and relevancy, empirical
[23] [24]
studies indicate various political, economic, and social biases in the information they provide. These biases can be a direct

result of economic and commercial processes (e.g., companies that advertise with a search engine can become also more popular
[25]
in its organic search results), and political processes (e.g., the removal of search results to comply with local laws).

Biases can also be a result of social processes, as search engine algorithms are frequently designed to exclude non-normative

viewpoints in favor of more "popular" results.[26] Indexing algorithms of major search engines skew towards coverage of U.S.-based

sites, rather than websites from non-U.S. countries.[24] Major search engines' search algorithms also privilege misinformation and

pornographic portrayals of women, people of color, and members of the LGBT community.[27][28]
Google Bombing is one example of an attempt to manipulate search results for political, social or commercial reasons.

Customized results and filter bubbles

Many search engines such as Google and Bing provide customized results based on the user's activity history. This leads to an

effect that has been called a filter bubble. The term describes a phenomenon in which websites use algorithms to selectively guess

what information a user would like to see, based on information about the user (such as location, past click behaviour and search

history). As a result, websites tend to show only information that agrees with the user's past viewpoint, effectively isolating the user

in a bubble that tends to exclude contrary information. Prime examples are Google's personalized search results and Facebook's

personalized news stream. According to Eli Pariser, who coined the term, users get less exposure to conflicting viewpoints and are

isolated intellectually in their own informational bubble. Pariser related an example in which one user searched Google for "BP" and

got investment news about British Petroleum while another searcher got information about the Deepwater Horizon oil spill, and noted that

the two search results pages were "strikingly different."[29][30][31] The bubble effect may have negative implications for civic discourse,

according to Pariser.[32]

Since this problem has been identified, competing search engines have emerged that seek to avoid this problem by not

tracking[33] or "bubbling"[34] users.


Uniform resource locator


From Wikipedia, the free encyclopedia
"URL" redirects here. For other uses, see URL (disambiguation).

This Euler diagram shows that a uniform resource identifier (URI) is either a uniform resource locator (URL), or a uniform
resource name (URN), or both.

A uniform resource locator, abbreviated URL, also known as web address, is a specific character
string that constitutes a reference to a resource. In most web browsers, the URL of a web page is
displayed on top inside an address bar. An example of a typical URL would
be"http://en.example.org/wiki/Main_Page". A URL is technically a type of uniform resource identifier (URI),
but in many technical documents and verbal discussions, URL is often used as a synonym for URI, and
this is not considered a problem.[1]
Contents


 1 History

 2 Syntax

 3 List of allowed URL characters

 4 Relationship to URI

 5 Internet hostnames

 6 Modern usage

 7 See also

 8 Notes

 9 References

 10 External links

History
The Uniform Resource Locator was standardized in 1994 [2] by Tim Berners-Lee and the URI working
group of the Internet Engineering Task Force (IETF) as an outcome of collaboration started at the IETF
Living Documents "Birds of a Feather" session in 1992.[3][4] The format combines the pre-existing system
of domain names (created in 1985) with file path syntax, where slashes are used to
separate directory and file names. Conventions already existed where server names could be prepended
to complete file paths, preceded by a double-slash (//). [5]

Berners-Lee later regretted the use of dots to separate the parts of the domain name within URIs, wishing
he had used slashes throughout.[5] For example, http://www.example.com/path/to/name would
have been written http:com/example/www/path/to/name. Berners-Lee has also said that, given the
colon following the URI scheme, the two slashes before the domain name were also unnecessary. [6]

Syntax
Main article: URI scheme#Generic syntax

Every URL consists of the following:


 the scheme name (commonly called protocol), then
 a colon, two slashes,[note 1] then

 a host, normally given as a domain name[note 2] but sometimes as a


literal IP address, then

 optionally a port number, then

 the full path of the resource

The scheme says how to connect, the host specifies where to connect, and the remainder
specifies what to ask for.

For programs such as Common Gateway Interface (CGI) scripts, this is followed by a query string,[7][8] and
an optional fragment identifier.[9]

The syntax is:


scheme://domain:port/path?query_string#fragment_id

 The scheme name defines the namespace, purpose, and the syntax of
the remaining part of the URL. Software will try to process a URL
according to its scheme and context. For example, a web browser will
usually dereference the URL http://example.org:80 by
performing an HTTP request to the host at example.org, using port
number 80. The URL mailto:[email protected] may start an e-
mail composer with the address [email protected] in the To field.

Other examples of scheme names include https, gopher, wais, and ftp. URLs with https as a scheme (such
as https://example.com/) require that requests and responses be made over a secure
connection to the website. Some schemes that require authentication allow a username, and perhaps a
password too, to be embedded in the URL, for example ftp://[email protected]. Passwords
embedded in this way are not conducive to security, but the full possible syntax is
scheme://username:password@domain:port/path?query_string#fragment_id

 The domain name or literal numeric IP address gives the destination


location for the URL. A literal numeric IPv6 address may be given, but
must be enclosed in [ ], e.g. [db8:0cec::99:123a].

The domain google.com, or its numeric IP


address 72.14.207.99, is the address of Google's website.
 The domain name portion of a URL is not case sensitive
since DNS ignores case:

http://en.example.org/ and HTTP://EN.EXAMPLE.ORG/ both open the same page.

 The port number, given in decimal, is optional; if omitted, the default for
the scheme is used.

For example, http://vnc.example.com:5800 connects to port


5800 of vnc.example.com, which may be appropriate for
a VNC remote control session. If the port number is omitted for an
http: URL, the browser will connect on port 80, the default HTTP
port. The default port for an https: request is 443.

 The path is used to specify and perhaps find the resource requested. It
is case-sensitive,[10] though it may be treated as case-insensitive by
some servers, especially those based on Microsoft Windows.

If the server is case sensitive


and http://en.example.org/wiki/URL is correct,
then http://en.example.org/WIKI/URL or http://en.example.org/wiki/url will display an HTTP 404 error page, unless
these URLs point to valid resources themselves.

 The query string contains data to be passed to software running on the


server. It may contain name/value pairs separated by ampersands, for
example

?first_name=John&last_name=Doe.

 The fragment identifier, if present, specifies a part or a position within


the overall resource or document.

When used with HTML, it usually specifies a section or location within the page; used in combination
with anchor tags, it causes the browser to scroll to that part of the page. (A short sketch showing how
these components can be extracted programmatically follows this list.)
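
As a concrete illustration of the parts listed above, the short sketch below (not part of the original article; it assumes JavaScript's built-in URL object and a made-up address) splits a URL into its scheme, host, port, path, query string and fragment:

// Splitting a made-up address into the components described above.
var url = new URL('http://example.org:8080/wiki/Main_Page?action=view&lang=en#History');

console.log(url.protocol);  // "http:"                 -> scheme name plus the colon
console.log(url.hostname);  // "example.org"           -> host (domain name or IP address)
console.log(url.port);      // "8080"                  -> port number (empty string if the default)
console.log(url.pathname);  // "/wiki/Main_Page"       -> full path of the resource
console.log(url.search);    // "?action=view&lang=en"  -> query string
console.log(url.hash);      // "#History"              -> fragment identifier
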
List of allowed URL characters[edit]
Unreserved (may be percent-encoded, but encoding is not necessary):

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 - _ . ~

Reserved (must sometimes be percent-encoded):

! * ' ( ) ; : @ & = + $ , / ? % # [ ]

Further details can be found, for example, in RFC 3986 and http://www.w3.org/Addressing/URL/uri-spec.html.
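
As a brief illustration (using JavaScript's built-in encodeURIComponent and decodeURIComponent functions, which this article does not itself discuss), characters in the unreserved set above pass through unchanged, while reserved or otherwise unsafe characters are percent-encoded:

// Unreserved characters are left alone; reserved or unsafe ones become %XX escapes.
console.log(encodeURIComponent('AZaz09-_.~'));         // "AZaz09-_.~" (unchanged)
console.log(encodeURIComponent('name=John Doe&x=1'));  // "name%3DJohn%20Doe%26x%3D1"
console.log(decodeURIComponent('name%3DJohn%20Doe'));  // "name=John Doe"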

Relationship to URI[edit]
See also: URIs, Relationship to URL and URN

A URL is a URI that, in addition to identifying a web resource, provides a means of locating the resource
by describing its "primary access mechanism (e.g., its network location)". [11]

Internet hostnames[edit]
Main article: Hostname

On the Internet, a hostname is a domain name assigned to a host computer. This is usually a combination
of the host's local name with its parent domain's name. For example, en.example.org consists of a local
hostname (en) and the domain name example.org. The hostname is translated into an IP address via the
local hosts file, or the domain name system (DNS) resolver. It is possible for a single host computer to
have several hostnames; but generally the operating system of the host prefers to have one hostname
that the host uses for itself.

Any domain name can also be a hostname, as long as the restrictions mentioned below are followed. For
example, both "en.example.org" and "example.org" can be hostnames if they both have IP
addresses assigned to them. The domain name "xyz.example.org" may not be a hostname if it does not
have an IP address, but "aa.xyz.example.org" may still be a hostname. All hostnames are domain names,
but not all domain names are hostnames.
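
As a small sketch of the hostname-to-address translation described above (assuming a Node.js environment and its dns module; the hostname and the printed address are purely illustrative):

// dns.lookup uses the operating system's resolver, which consults the local
// hosts file and/or DNS, mirroring the resolution process described above.
var dns = require('dns');

dns.lookup('example.org', function (err, address, family) {
    if (err) throw err;
    console.log(address + ' (IPv' + family + ')');  // e.g. "203.0.113.7 (IPv4)"
});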

Modern usage[edit]
Major computer manufacturers such as Apple have begun to deprecate APIs that take local paths as
parameters, in favour of using URLs.[12] This is because remote and local resources (the latter via the
file:// scheme) may both be represented as URLs, which can additionally carry a protocol (particularly
useful for remote items) and credentials.
Dynamic HTML
From Wikipedia, the free encyclopedia

HTML

 HTML and HTML5; HTML editor

 Dynamic HTML

 XHTML

 XHTML Basic (Mobile)

 XHTML Mobile Profile and C-HTML

 HTML element

 Span and div

 HTML attribute

 Character encodings; Unicode

 Language code

 Document Object Model

 Browser Object Model

 Style sheets and CSS

 Font family and Web colors

 HTML scripting and JavaScript

 W3C, WHATWG, and validator

 Quirks mode

 HTML Frames
 HTML5 Canvas, WebGL, and WebCL

 HTML5 Audio and HTML5 video

 Web storage

 Web browser (layout) engine

 Comparison of

 document markup languages

 web browsers

 layout engine support for

 HTML; Non-standard HTML

 XHTML (1.1)

 HTML5; HTML5 canvas,

 HTML5 media (Audio, Video)

 V

 T

 E

Dynamic HTML, or DHTML, is an umbrella term for a collection of technologies used together to create
interactive and animated web sites[1] by using a combination of a static markup language (such as HTML),
a client-side scripting language (such as JavaScript), a presentation definition language (such as CSS),
and the Document Object Model.[2]

DHTML allows scripting languages to change variables in a web page's definition language, which in turn
affects the look and function of otherwise "static" HTML page content, after the page has been fully
loaded and during the viewing process. Thus the dynamic characteristic of DHTML is the way it functions
while a page is viewed, not in its ability to generate a unique page with each page load.

By contrast, a dynamic web page is a broader concept, covering any web page generated differently for
each user, load occurrence, or specific variable values. This includes pages created by client-side
scripting, and ones created by server-side scripting (such as PHP, Perl, JSP or ASP.NET) where the web
server generates content before sending it to the client.
DHTML is differentiated from Ajax by the fact that a DHTML page is still request/reload-based. With
DHTML, there may not be any interaction between the client and server after the page is loaded; all
processing happens in JavaScript on the client side. By contrast, an Ajax page uses features of DHTML
to initiate a request (or 'subrequest') to the server to perform actions such as loading more content.

Contents


 1 Uses

 2 Structure of a web page

 3 Example: Displaying an additional block of text

 4 Document Object Model

 5 Dynamic styles

 6 Data binding

 7 References

 8 External links

Uses[edit]
DHTML allows authors to add effects to their pages that are otherwise difficult to achieve. In short, a
scripting language changes the DOM and the styling. For example, DHTML allows the page author to:

 Animate text and images in their document, independently moving each


element from any starting point to any ending point, following a
predetermined path or one chosen by the user.
 Embed a ticker that automatically refreshes its content with the latest
news, stock quotes, or other data.

 Use a form to capture user input, and then process, verify and respond
to that data without having to send data back to the server.

 Include rollover buttons or drop-down menus.

A less common use is to create browser-based action games. Although a number of games were created
using DHTML during the late 1990s and early 2000s,[citation needed] differences between browsers made this
difficult: many techniques had to be implemented in code to enable the games to work on multiple
platforms. Recently, browsers have been converging towards web standards, which has made the
design of DHTML games more viable. Those games can be played on all major browsers, and they can
also be ported to Plasma for KDE, Widgets for Mac OS X and Gadgets for Windows Vista, which are
based on DHTML code.

The term "DHTML" has fallen out of use in recent years as it was associated with practices and
conventions that tended to not work well between various web browsers. DHTML may now be referred to
as unobtrusive JavaScript coding (DOM Scripting), in an effort to place an emphasis on agreed-upon best
practices while allowing similar effects in an accessible, standards-compliant way.

DHTML support with extensive DOM access was introduced with Internet Explorer 4.0. Although there
was a basic dynamic system with Netscape Navigator 4.0, not all HTML elements were represented in the
DOM. When DHTML-style techniques became widespread, varying degrees of support among web
browsers for the technologies involved made them difficult to develop and debug. Development became
easier when Internet Explorer 5.0+, Mozilla Firefox 2.0+, and Opera 7.0+ adopted a
shared DOM inherited from ECMAScript.

More recently, JavaScript libraries such as jQuery have abstracted away much of the day-to-day
difficulties in cross-browser DOM manipulation.

Structure of a web page[edit]


See also: DOM events

Typically a web page using DHTML is set up in the following way:


<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>DHTML example</title>
</head>
<body>
<div id="navigation"></div>

<script>
var init = function () {
var myObj = document.getElementById("navigation");
// ... manipulate myObj
};
window.onload = init;
</script>
<!--
Often the code is stored in an external file; this is done
by linking the file that contains the JavaScript.
This is helpful when several pages use the same script:
-->
<script src="myjavascript.js"></script>
</body>
</html>
Example: Displaying an additional block of text[edit]
The following code illustrates an often-used function. An additional part of a web page will only be
displayed if the user requests it.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Using a DOM function</title>
<style>
a {background-color:#eee;}
a:hover {background:#ff0;}
#toggleMe {background:#cfc; display:none; margin:30px 0;
padding:1em;}
</style>
</head>
<body>
<h1>Using a DOM function</h1>

<h2><a id="showhide" href="#">Show paragraph</a></h2>

<p id="toggleMe">This is the paragraph that is only displayed on


request.</p>

<p>The general flow of the document continues.</p>

<script>
var changeDisplayState = function (id) {
var d = document.getElementById('showhide'),
e = document.getElementById(id);
if (e.style.display === 'none' || e.style.display === '') {
e.style.display = 'block';
d.innerHTML = 'Hide paragraph';
} else {
e.style.display = 'none';
d.innerHTML = 'Show paragraph';
}
};
document.getElementById('showhide').onclick = function () {
changeDisplayState('toggleMe');
return false;
};
</script>
</body>
</html>
Document Object Model[edit]
DHTML is not a technology in and of itself; rather, it is the product of three related and complementary
technologies: HTML, Cascading Style Sheets (CSS), and JavaScript. To allow scripts and components to
access features of HTML and CSS, the contents of the document are represented as objects in a
programming model known as the Document Object Model (DOM).

The DOM API is the foundation of DHTML, providing a structured interface that allows access and
manipulation of virtually anything in the document. The HTML elements in the document are available as
a hierarchical tree of individual objects, meaning you can examine and modify an element and its
attributes by reading and setting properties and by calling methods. The text between elements is also
available through DOM properties and methods.

The DOM also provides access to user actions such as pressing a key and clicking the mouse. You can
intercept and process these and other events by creating event handler functions and routines. The event
handler receives control each time a given event occurs and can carry out any appropriate action,
including using the DOM to change the document.
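
A minimal sketch of these ideas follows (the element ids and text are assumptions made for illustration, not taken from this article): the script locates an element in the DOM tree, reads and changes its text and attributes, and registers an event handler for a mouse click.

// Assumes a page containing <p id="greeting">hello</p> and <button id="shout">Shout</button>.
document.getElementById('shout').onclick = function () {
    var p = document.getElementById('greeting');         // locate the element object in the DOM tree
    p.textContent = p.textContent.toUpperCase();          // read and modify the text between its tags
    p.setAttribute('title', 'changed by a DOM script');   // set an attribute through a DOM method
};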

Dynamic styles[edit]
Dynamic styles are a key feature of DHTML. By using CSS, you can quickly change the appearance and
formatting of elements in a document without adding or removing elements. This helps keep your
documents small and the scripts that manipulate the document fast.

The object model provides programmatic access to styles. This means you can change inline styles on
individual elements and change style rules using simple JavaScript programming.

Inline styles are CSS style assignments that have been applied to an element using the style attribute.
You can examine and set these styles by retrieving the style object for an individual element. For
example, to change the colour of a heading and reveal a hidden list when the user activates a link, you can
use the style object to update those elements' style properties, as shown in the following simple example.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Dynamic Styles</title>
<style>
ul {display:none;}
</style>
</head>

<body>
<h1>Welcome to Dynamic HTML</h1>

<p><a href="#">Dynamic styles are a key feature of DHTML.</a></p>

<ul>
<li>Change the color, size, and typeface of text</li>
<li>Show and hide text</li>
<li>And much, much more</li>
</ul>

<p>We've only just begun!</p>

<script>
var showMe = function () {
document.getElementsByTagName("h1")[0].style.color = "#990000";
document.getElementsByTagName("ul")[0].style.display = "block";
};

document.getElementsByTagName("a")[0].onclick = function (e) {
e.preventDefault();
showMe();
};
</script>
</body>
</html>
Data binding[edit]
Data binding is a DHTML feature that lets you easily bind individual elements in your document to data
from another source, such as a database or comma-delimited text file. When the document is loaded, the
data is automatically retrieved from the source and formatted and displayed within the element.
One practical way to use data binding is to automatically and dynamically generate tables in your
document. You can do this by binding a table element to a data source. When the document is viewed, a
new row is created in the table for each record retrieved from the source, and the cells of each row are
filled with text and data from the fields of the record. Because this generation is dynamic, the user can
view the page while new rows are created in the table. Additionally, once all the table data is present, you
can manipulate (sort or filter) the data without requiring the server to send additional data. The table is
regenerated, using the previously retrieved data to fill the new rows and cells of the table.
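
A rough sketch of the same idea without a data source object follows (plain JavaScript rather than the ActiveX/Java mechanism this section describes; the file name "sampdata.csv" and the table id are assumptions): the script fetches a comma-delimited file and adds one table row per record.

// Fills an existing <table id="out"> with one row per record and one cell per field.
fetch('sampdata.csv')
    .then(function (response) { return response.text(); })
    .then(function (text) {
        var table = document.getElementById('out');
        text.trim().split('\n').forEach(function (line) {
            var row = table.insertRow();                // a new <tr> for each record
            line.split(',').forEach(function (field) {
                row.insertCell().textContent = field;   // a new <td> for each field
            });
        });
    });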

Another practical use of data binding is to bind one or more elements in the document to specific fields of
a given record. When the page is viewed, the elements are filled with text and data from the fields in that
record, sometimes called the "current" record. An example is a form letter in which the name, e-mail
address, and other details about an individual are filled from a database. To adapt the letter for a given
individual, you specify which record should be the current record. No other changes to the letter are
needed.

Yet another practical use is to bind the fields in a form to fields in a record. Not only can the user view the
content of the record, but the user can also change that content by changing the settings and values of
the form. The user can then submit these changes so that the new data is uploaded to the source—for
example, to the HTTP server or database.

To provide data binding in your documents, you must add a data source object (DSO) to your document.
This invisible object is an ActiveX control or Java applet that knows how to communicate with the data
source. The following example shows how easy it is to bind a table to a DSO. When viewed, this example
displays the first three fields from all the comma-delimited records of the file "sampdata.csv" in a clear,
easy-to-read table.
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Data Binding Example</title>
<style>
td, th {border:1px solid;}
</style>
</head>

<body>
<h1>Data Binding Example</h1>

<object classid="clsid:333C7BC4-460F-11D0-BC04-0080C7055A83"
id="sampdata">
<param name="DataURL" value="sampdata.csv">
<param name="UseHeader" value="True">
</object>

<table datasrc="#sampdata">
<thead>
<tr>
<th>A</th>
<th>B</th>
<th>C</th>
</tr>
</thead>

<!-- Fields will not display without the accompanying CSV file.
-->
<tbody>
<tr>
<td><span datafld="a"></span></td>
<td><span datafld="b"></span></td>
<td><span datafld="c"></span></td>
</tr>
</tbody>
</table>
</body>
</html>
History of Wikipedia
From Wikipedia, the free encyclopedia

The English edition of Wikipedia has grown to 4,395,990 articles, equivalent to over 1,900 print volumes of the Encyclopaedia
Britannica. Including all language editions, Wikipedia has over 30.2 million articles,[1] equivalent to over 13,000 print
volumes.

Wikipedia was formally launched on 15 January 2001 by Jimmy Wales and Larry Sanger, but its technological and conceptual

underpinnings predate this. The earliest known proposal for an online encyclopedia was made by Rick Gates in 1993,[2] but the
concept of a free-as-in-freedom online encyclopedia (as distinct from mere open source or freemium)[3] was proposed by Richard

Stallman in December 2000.[4]

Crucially, Stallman's concept specifically included the idea that no central organization should control editing. This latter "massively

multiplayer" characteristic was in stark contrast to contemporary digital encyclopedias such as Microsoft Encarta, Encyclopedia

Britannica and even Bomis's Nupedia, which was Wikipedia's direct predecessor. In 2001, the license for Nupedia was changed

to GFDL, and Wales and Sanger launched Wikipedia using the concept and technology of a wiki pioneered in 1995 by Ward

Cunningham.[5] Initially, Wikipedia was intended to complement Nupedia, an online encyclopedia project edited solely by experts, by

providing additional draft articles and ideas for it. In practice, Wikipedia quickly overtook Nupedia, becoming a global project in

multiple languages and inspiring a wide range of other online reference projects.

As of December 2013, Wikipedia includes over 30.3 million freely usable articles in 287 languages[1] that have been written by over

43 million registered users and numerous anonymous contributors worldwide.[6][7][8] According to Alexa Internet, Wikipedia is now the

fifth-most-popular website as of December 2012. According to AllThingsD,[1] Wikipedia receives over 85 million monthly unique

visitors from the United States alone.[9]

Contents


 1 Historical overview

o 1.1 Background

o 1.2 Formulation of the concept

o 1.3 Founding of Wikipedia

o 1.4 Namespaces, subdomains, and internationalization

o 1.5 Development of Wikipedia

o 1.6 Organization

o 1.7 Evolution of logo

 2 Timeline

o 2.1 2000

o 2.2 2001

o 2.3 2002
o 2.4 2003

o 2.5 2004

o 2.6 2005

o 2.7 2006

o 2.8 2007

o 2.9 2008

o 2.10 2009

o 2.11 2010

o 2.12 2011

o 2.13 2012

o 2.14 2013

 3 History by subject area

o 3.1 Hardware and software

o 3.2 Look and feel

o 3.3 Internal structures

o 3.4 The Wikimedia Foundation and legal structures

o 3.5 Projects and milestones

o 3.6 Fundraising

o 3.7 External impact

 3.7.1 Effect of biographical articles

o 3.8 Early roles of Wales and Sanger

o 3.9 Controversies

o 3.10 Notable forks and derivatives


o 3.11 Publication on other media

o 3.12 Lawsuits

 4 See also

 5 References

 6 External links

o 6.1 Wikipedia records and archives

o 6.2 Third party

Historical overview[edit]
Background[edit]
The concept of the world's knowledge in a single location dates to the ancient Libraries of Alexandria and Pergamum, but the

modern concept of a general-purpose, widely distributed, printed encyclopedia originated with Denis Diderot and the 18th-century

French encyclopedists. The idea of using automated machinery beyond the printing press to build a more useful encyclopedia can

be traced to Paul Otlet's book Traité de documentation (1934; Otlet also founded the Mundaneum institution in 1910), H. G. Wells'

book of essays World Brain (1938) and Vannevar Bush's future vision of the microfilm based Memex in As We May Think (1945).
[10]
Another milestone was Ted Nelson's hypertext design Project Xanadu, begun in 1960.[10]

While previous encyclopedias, notably the Encyclopædia Britannica, were book-based, Microsoft's Encarta, published in 1993, was

available on CD-ROM and hyperlinked. With the development of the web, many people attempted to develop Internet encyclopedia

projects. An early proposal was Interpedia in 1993 by Rick Gates;[2] but this project died before generating any encyclopedic

content. Free software proponent Richard Stallman described the usefulness of a "Free Universal Encyclopedia and Learning

Resource" in 1999.[4] His published document "aims to lay out what the free encyclopedia needs to do, what sort of freedoms it

needs to give the public, and how we can get started on developing it." On 17 January 2001, two days after the start of Wikipedia,

the Free Software Foundation's (FSF) GNUPedia project went online, competing with Nupedia,[11] but today the FSF encourages

people "to visit and contribute to [Wikipedia]".[12]

Formulation of the concept[edit]


Wikipedia was initially conceived as a feeder project for Nupedia, an earlier project to produce a free online encyclopedia,

volunteered by Bomis, a web-advertising firm owned by Jimmy Wales, Tim Shell and Michael E. Davis.[13][14][15] Nupedia was founded

upon the use of highly qualified volunteer contributors and an elaborate multi-step peer review process. Despite its mailing list of

interested editors, and the presence of a full-time editor-in-chief, Larry Sanger, a graduate philosophy student hired by Wales,[16] the

writing of content for Nupedia was extremely slow, with only 12 articles written during the first year. [15]
Wales and Sanger discussed various ways to create content more rapidly.[14] The idea of a wiki-based complement originated from a

conversation between Larry Sanger and Ben Kovitz.[17][18][19] Ben Kovitz was a computer programmer and regular on Ward

Cunningham's revolutionary wiki "the WikiWikiWeb". He explained to Sanger what wikis were, at that time a difficult concept to

understand, over a dinner on 2 January 2001.[17][18][19][20] Wales first stated, in October 2001, that "Larry had the idea to use Wiki

software",[21] though he later stated in December 2005 that Jeremy Rosenfeld, a Bomis employee, introduced him to the concept. [22]
[23][24][25]
Sanger thought a wiki would be a good platform to use, and proposed on the Nupedia mailing list that a wiki based

upon UseModWiki (then v. 0.90) be set up as a "feeder" project for Nupedia. Under the subject "Let's make a wiki", he wrote:

No, this is not an indecent proposal. It's an idea to add a little feature to Nupedia. Jimmy Wales thinks that many people might find

the idea objectionable, but I think not. (…) As to Nupedia's use of a wiki, this is the ULTIMATE "open" and simple format for

developing content. We have occasionally bandied about ideas for simpler, more open projects to either replace or supplement

Nupedia. It seems to me wikis can be implemented practically instantly, need very little maintenance, and in general are very low-

risk. They're also a potentially great source for content. So there's little downside, as far as I can determine.

Wales set one up and put it online on 10 January 2001.[26]

Founding of Wikipedia[edit]
There was considerable resistance on the part of Nupedia's editors and reviewers to the idea of associating Nupedia with a wiki-

style website. Sanger suggested giving the new project its own name, Wikipedia, and Wikipedia was soon launched on its own

domain, wikipedia.com, on 15 January 2001. The bandwidth and server (located in San Diego) used for these initial projects

were donated by Bomis. Many former Bomis employees later contributed content to the encyclopedia: notably Tim Shell, co-founder

and later CEO of Bomis, and programmer Jason Richey.

In December 2008, Wales stated that he made Wikipedia's first edit, a test edit with the text "Hello, World!". [27] The oldest article still

preserved is the article UuU, created on 16 January 2001, at 21:08 UTC.[28][29] The existence of the project was formally announced

and an appeal for volunteers to engage in content creation was made to the Nupedia mailing list on 17 January. [30]

The UuU edit, the first edit that is still preserved on Wikipedia to this day, as it appears using the Nostalgia skin.

The project received many new participants after being mentioned on the Slashdot website in July 2001,[31] with two minor mentions

in March 2001.[32][33] It then received a prominent pointer to a story on the community-edited technologies and culture

website Kuro5hin on 25 July.[34] Between these relatively rapid influxes of traffic, there had been a steady stream of traffic from other
sources, especially Google, which alone sent hundreds of new visitors to the site every day. Its first major mainstream

media coverage was in the New York Times on 20 September 2001.[35]

The project gained its 1,000th article around 12 February 2001, and reached 10,000 articles around 7 September. In the first year of

its existence, over 20,000 encyclopedia entries were created – a rate of over 1,500 articles per month. On 30 August 2002, the

article count reached 40,000.

Wikipedia's earliest edits were long believed lost, since the original UseModWiki software deleted old data after about a month. On

the eve of Wikipedia's 10th anniversary, 14 December 2010, developer Tim Starling found backups on SourceForge containing

every change made to Wikipedia from its creation in January 2001 to 17 August 2001.[36]

Namespaces, subdomains, and internationalization[edit]


Early in Wikipedia's development, it began to expand internationally, with the creation of new namespaces, each with a distinct set

of usernames. The first subdomain created for a non-English Wikipedia was deutsche.wikipedia.com (created on 16 March 2001,

01:38 UTC),[37] followed after a few hours by Catalan.wikipedia.com (at 13:07 UTC).[38] The Japanese Wikipedia, started

as nihongo.wikipedia.com, was created around that period,[39][40] and initially used only Romanized Japanese. For about two months

Catalan was the one with the most articles in a non-English language,[41][42] although statistics of that early period are imprecise.
[43]
The French Wikipedia was created on or around 11 May 2001,[44] in a wave of new language versions that also

included Chinese, Dutch, Esperanto, Hebrew, Italian, Portuguese, Russian, Spanish, and Swedish.[45] These languages were soon

joined by Arabic[46] and Hungarian.[47][48] In September 2001, an announcement pledged commitment to the multilingual provision of

Wikipedia,[49] notifying users of an upcoming roll-out of Wikipedias for all major languages, the establishment of core standards, and

a push for the translation of core pages for the new wikis. At the end of that year, when international statistics first began to be

logged, Afrikaans, Norwegian, and Serbian versions were announced.[50]

In January 2002, 90% of all Wikipedia articles were in English. By January 2004, fewer than 50% were English, and this

internationalization has continued to increase as the encyclopedia grows. As of 2013, around 85% of all Wikipedia articles are

contained within non-English Wikipedia versions.[1]

Development of Wikipedia[edit]

A screenshot of Wikipedia's main page on 28 September 2002.

In March 2002, following the withdrawal of funding by Bomis during the dot-com bust, Larry Sanger left both Nupedia and Wikipedia.
[51]
By 2002, Sanger and Wales differed in their views on how best to manage open encyclopedias. Both still supported the open-
collaboration concept, but the two disagreed on how to handle disruptive editors, specific roles for experts, and the best way to

guide the project to success.

Wales, a believer in "hands off" executive management,[citation needed] went on to establish self-governance and bottom-up self-direction

by editors on Wikipedia. He made it clear that he would not be involved in the community's day-to-day management, but would

encourage it to learn to self-manage and find its own best approaches. As of 2007, Wales mostly restricts his own role to occasional

input on serious matters, executive activity, advocacy of knowledge, and encouragement of similar reference projects.

Sanger says he is an "inclusionist" and is open to almost anything.[52] He proposed that experts still have a place in the Web

2.0 world. He returned briefly to academia, then joined the Digital Universe Foundation. In 2006, Sanger founded Citizendium, an

open encyclopedia that used real names for contributors in an effort to reduce disruptive editing, and hoped to facilitate "gentle

expert guidance" to increase the accuracy of its content. Decisions about article content were to be up to the community, but the site

was to include a statement about "family-friendly content".[53] He stated early on that he intended to leave Citizendium in a few years,

by which time the project and its management would presumably be established.[54]

Organization[edit]
The Wikipedia project has grown rapidly in the course of its life, at several levels. Content has grown organically through the

addition of new articles, new wikis have been added in English and non-English languages, and entire new projects replicating these

growth methods in other related areas (news, quotations, reference books and so on) have been founded as well. Wikipedia itself

has grown, with the creation of the Wikimedia Foundation to act as an umbrella body and the growth of software and policies to

address the needs of the editorial community. These are documented below:

Evolution of logo[edit]

Foundation – 6 December 2001


6 December 2001 – 12 October 2003

13 October 2003 – 13 May 2010

13 May 2010 – present

Timeline[edit]

Articles summarizing each year are held within the Wikipedia project namespace

and are linked to below. Additional resources for research are available within the

Wikipedia records and archives, and are listed at the end of this article.

2000[edit]

The Bomis staff in the summer of 2000.

In March 2000, the Nupedia project was started. Its intention was to publish articles

written by experts which would be licensed as free content. Nupedia was founded by
Jimmy Wales, with Larry Sanger as editor-in-chief, and funded by the web-advertising

company Bomis.[55]

2001[edit]
In January 2001, Wikipedia began as a side-project of Nupedia, to allow collaboration on

articles prior to entering the peer-review process.


[56]
The wikipedia.com and wikipedia.org domain names were registered on 12 January

2001[57] and 13 January 2001,[58] respectively, with wikipedia.org being brought online on

the same day.[59] The project formally opened on 15 January ("Wikipedia Day"), with the

first international Wikipedias – the French, German, Catalan, Swedish, and Italian

editions – being created between March and May. The "neutral point of view" (NPOV)

policy was officially formulated at this time, and Wikipedia's first slashdotter wave arrived

on 26 July.[31] The first media report about Wikipedia appeared in August 2001 in the

newspaper Wales on Sunday.[60] The September 11 attacks spurred the appearance of

breaking news stories on the homepage, as well as information boxes linking related

articles.[61]

2002[edit]
2002 saw the end of funding for Wikipedia from Bomis and the departure of Larry

Sanger. The forking of the Spanish Wikipedia also took place with the establishment of

the Enciclopedia Libre. The first portable MediaWiki software went live on 25 January.
[dubious – discuss]
Bots were introduced, Jimmy Wales confirmed that Wikipedia would never

run commercial advertising, and the first sister project (Wiktionary) and first

formal Manual of Style were launched. A separate board of directors to supervise the

project was proposed and initially discussed at Meta-Wikipedia.

2003[edit]
The English Wikipedia passed 100,000 articles in 2003, while the next largest edition, the

German Wikipedia, passed 10,000. The Wikimedia Foundation was established, and

Wikipedia adopted its jigsaw world logo. Mathematical formulae using TeX were

reintroduced to the website. The first Wikipedian social meeting took place in Munich,

Germany, in October. The basic principles of Wikipedia's Arbitration system and

committee (known colloquially as "ArbCom") were developed, mostly by Florence

Devouard, Fred Bauder and other early Wikipedians.

2004[edit]
The worldwide Wikipedia article pool continued to grow rapidly in 2004, doubling in size

in 12 months, from under 500,000 articles in late 2003 to over 1 million in over 100

languages by the end of 2004. The English Wikipedia accounted for just under half of

these articles. The website's server farms were moved

from California to Florida, Categories and CSS style configuration sheets were

introduced, and the first attempt to block Wikipedia occurred, with the website being

blocked in China for two weeks in June. The formal election of a board and Arbitration

Committee began. The first formal projects were proposed to deliberately balance

content and seek out systemic bias arising from Wikipedia's community structure.

Bourgeois v. Peters,[62] (11th Cir. 2004), a court case decided by the United States Court

of Appeals for the Eleventh Circuit was one of the earliest court opinions to cite and quote

Wikipedia.[citation needed] It stated: "We also reject the notion that the Department of

Homeland Security's threat advisory level somehow justifies these searches. Although

the threat level was "elevated" at the time of the protest, "to date, the threat level has

stood at yellow (elevated) for the majority of its time in existence. It has been raised to

orange (high) six times."[62]

2005[edit]
In 2005, Wikipedia became the most popular reference website on the Internet, according

to Hitwise, with the English Wikipedia alone exceeding 750,000 articles. Wikipedia's first

multilingual and subject portals were established in 2005. A formal fundraiser held in the

first quarter of the year raised almost US$100,000 for system upgrades to handle

growing demand. China again blocked Wikipedia in October 2005.

The first major Wikipedia scandal occurred in 2005, when a well-known figure was found

to have a vandalized biography which had gone unnoticed for months. In the wake of this

and other concerns,[63] the first policy and system changes specifically designed to

counter this form of abuse were established. These included a new Checkuser privilege

policy update to assist in sock puppetry investigations, a new feature called semi-

protection, a more strict policy on biographies of living people and the tagging of such

articles for stricter review. A restriction of new article creation to registered users only was

put in place in December 2005.[64]

2006[edit]
The English Wikipedia gained its one-millionth article, Jordanhill railway station, on 1

March 2006. The first approved Wikipedia article selection was made freely available to

download, and "Wikipedia" became registered as a trademark of the Wikimedia


Foundation. The congressional aides biography scandals – multiple incidents in which

congressional staffers and a campaign manager were caught trying to covertly alter

Wikipedia biographies – came to public attention, leading to the resignation of the

campaign manager. Nonetheless, Wikipedia was rated as one of the top 2006 global

brands.[65]

Jimmy Wales indicated at Wikimania 2006 that Wikipedia had achieved sufficient volume

and calls for an emphasis on quality, perhaps best expressed in the call for 100,000

feature-quality articles. A new privilege, "oversight", was created, allowing specific

versions of archived pages with unacceptable content to be marked as non-viewable.

Semi-protection against anonymous vandalism, introduced in 2005, proved more popular

than expected, with over 1,000 pages being semi-protected at any given time in 2006.

2007[edit]
Wikipedia continued to grow rapidly in 2007, possessing over 5 million registered editor

accounts by 13 August.[66] The 250 language editions of Wikipedia contained a combined

total of 7.5 million articles, totalling 1.74 billion words in approximately 250 languages, by

13 August.[67] The English Wikipedia gained articles at a steady rate of 1,700 a day, [68] with

the wikipedia.org domain name ranked the 10th-busiest in the world. Wikipedia continued

to garner visibility in the press – the Essjay controversy broke when a prominent member

of Wikipedia was found to have lied about his credentials. Citizendium, a competing

online encyclopedia, launched publicly. A new trend developed in Wikipedia, with the

encyclopedia addressing people whose notability stemmed from being a participant in a

news story by adding a redirect from their name to the larger story, rather than creating a

distinct biographical article.[69] On 9 September 2007, the English Wikipedia gained its

two-millionth article, El Hormiguero.[70] There was some controversy in late 2007 when

the Volapük Wikipedia jumped from 797 to over 112,000 articles, briefly becoming the

15th-largest Wikipedia edition, due to automated stub generation by an enthusiast for the

Volapük constructed language.[71][72]

According to the MIT Technology Review, the number of regularly active editors on the

English-language Wikipedia peaked in 2007 at more than 51,000, and has since been

declining.[73]

2008[edit]
Various WikiProjects in many areas continued to expand and refine article contents within

their scope. In April 2008, the 10-millionth Wikipedia article was created, and by the end

of the year the English Wikipedia exceeded 2.5 million articles.


2009[edit]
By late August 2009, the number of articles in all Wikipedia editions had exceeded 14

million.[6] The three-millionth article on the English Wikipedia, Beate Eriksen, was created

on 17 August 2009 at 04:05 UTC.[74] On 27 December 2009, the German

Wikipedia exceeded one million articles, becoming the second edition after the English

Wikipedia to do so. A TIME article listed Wikipedia among 2009's best websites.[75]

The Arbitration Committee of the English Wikipedia decided in May 2009 to restrict

access to its site from Church of Scientology IP addresses, to prevent self-serving edits

by Scientologists.[76][77][78] A "host of anti-Scientologist editors" were also topic-banned.[77]


[78]
The committee concluded that both sides had "gamed policy" and resorted to

"battlefield tactics", with articles on living persons being the "worst casualties".
[77]
Wikipedia content became licensed under Creative Commons in 2009.

2010[edit]
On 24 March, the European Wikipedia servers went offline due to an overheating

problem. Failover to servers in Florida turned out to be broken, causing DNS resolution

for Wikipedia to fail across the world. The problem was resolved quickly, but due to DNS

caching effects, some areas were slower to regain access to Wikipedia than others.[79][80]

On 13 May, the site released a new interface. New features included an updated logo,

new navigation tools, and a link wizard.[81] However, the classic interface remained

available for those who wished to use it. On 12 December, the English Wikipedia passed

the 3.5-million-article mark, while the French Wikipedia's millionth article was created on

21 September. The 1-billionth Wikimedia project edit was performed on 16 April.[82]

2011[edit]

One of many cakes made to celebrate Wikipedia's 10th anniversary [83] in 2011.
Wikipedia and its users held hundreds of celebrations worldwide to commemorate the

site's 10th anniversary on 15 January.[84] The site began efforts to expand its growth in

India, holding its first Indian conference in Mumbai in November 2011.[85][86] The English

Wikipedia passed the 3.6-million-article mark on 2 April, and reached 3.8 million articles

on 18 November. On 7 November 2011, the German Wikipedia exceeded 100 million

page edits, becoming the second language edition to do so after the English edition,

which attained 500 million page edits on 24 November 2011. The Dutch

Wikipedia exceeded 1 million articles on 17 December 2011, becoming the fourth

Wikipedia edition to do so.

Between 4 and 6 October 2011, the Italian Wikipedia became intentionally inaccessible in

protest against the Italian Parliament's proposed DDL intercettazioni law, which, if

approved, would allow any person to force websites to remove information that is

perceived as untrue or offensive, without the need to provide evidence. [87]

Also in October 2011, Wikimedia announced the launch of Wikipedia Zero, an initiative to

enable free mobile access to Wikipedia in developing countries through partnerships with

mobile operators.[88][89]

2012[edit]
On 16 January, Wikipedia co-founder Jimmy Wales announced that the English

Wikipedia would shut down for 24 hours on 18 January as part of a protest meant to call

public attention to the proposed Stop Online Piracy Act and PROTECT IP Act, two anti-

piracy laws under debate in the United States Congress. Calling the blackout a

"community decision", Wales and other opponents of the laws believed that they would

endanger free speech and online innovation.[90] A similar blackout was staged on 10 July

by the Russian Wikipedia, in protest against a proposed Russian internet regulation law.
[91]

In late March 2012, the Wikimedia Foundation announced Wikidata, a universal platform

for sharing data between all Wikipedia language editions.[92] The US$1.7-million Wikidata

project was partly funded by Google, the Gordon and Betty Moore Foundation, and the

Allen Institute for Artificial Intelligence.[93] Wikimedia Deutschland assumed responsibility

for the first phase of Wikidata, and initially planned to make the platform available to

editors by December 2012. Wikidata's first phase became fully operational in March

2013.[94][95]

In April 2012, Justin Knapp from Indianapolis, Indiana, became the first single contributor

to make over one million edits to Wikipedia.[96][97] The founder of Wikipedia, Jimmy Wales,
congratulated Knapp for his work and presented him with the site's Special

Barnstar medal and the Golden Wiki award for his achievement.[98] Wales also declared

that 20 April would be "Justin Knapp Day".[99]

On 13 July 2012, the English Wikipedia gained its 4-millionth article, Izbat al-Burj.[100] In

October 2012, historian and Wikipedia editor Richard Jensen opined that the English

Wikipedia was "nearing completion", noting that the number of regularly active editors

had fallen significantly since 2007, despite Wikipedia's rapid growth in article count and

readership.[101]

2013[edit]
As of December 2013, Wikipedia is the world's sixth-most-popular website according

to Alexa Internet,[102] and is the largest general-knowledge encyclopedia online, with a

combined total of over 30.3 million mainspace articles across all 287 language editions.
[1]
It is estimated that Wikipedia receives more than 10 billion global pageviews every

month,[103] and attracts over 85 million unique monthly visitors from the United States

alone,[9] where it is the eighth-most-popular site.[102] On average, the Main Page of

the English Wikipedia alone receives approximately 8 million global pageviews every day.
[104]

On 22 January 2013, the Italian Wikipedia became the fifth language edition of Wikipedia

to exceed 1 million articles, while the Russian and Spanish Wikipedias gained their

millionth articles in May. The Swedish and the Polish Wikipedias gained their millionth

articles a few months later, becoming the eighth and ninth Wikipedia editions to do so. On

27 January, the main-belt asteroid 274301 was officially renamed "Wikipedia" by

the Committee for Small Body Nomenclature.[105] The first phase of

the Wikidata database, automatically providing interlanguage links and other data,

became available for all language editions in March 2013.[95] In April 2013, the French

secret service was accused of attempting to censor Wikipedia by threatening a Wikipedia

volunteer with arrest unless "classified information" about a military radio station was

deleted.[106] In July, the VisualEditor editing system was launched, forming the first stage

of an effort to allow articles to be edited with a word processor-like interface instead of

using wikimarkup.[107] An editor specifically designed for mobile was also launched.

History by subject area[edit]


Hardware and software[edit]
Main article: MediaWiki
The software that runs Wikipedia, and the computer hardware, server farms and

other systems upon which Wikipedia relies.


In January 2001, Wikipedia ran on UseModWiki, written in Perl by Clifford

Adams. The server has run on Linux to this day, although the original text was

stored in files rather than in a database. Articles were named with

the CamelCase convention.



In January 2002, "Phase II" of the wiki software powering Wikipedia was

introduced, replacing the older UseModWiki. Written specifically for the project

by Magnus Manske, it included a PHPwiki engine.


In July 2002, a major rewrite of the software powering Wikipedia went live;

dubbed "Phase III", it replaced the older "Phase II" version, and

became MediaWiki. It was written by Lee Daniel Crocker in response to the

increasing demands of the growing project.


In October 2002, Derek Ramsey started to use a "bot", or program, to add a

large number of articles about United States towns; these articles were

automatically generated from U.S. census data. Occasionally, similar bots had

been used before for other topics. These articles were generally well received,

but some users criticized them for their initial uniformity and writing style (for

example, see this version of an original bot-generated town article, and

compare to current version).


In January 2003, support for mathematical formulas in TeX was added. The

code was contributed by Tomasz Wegrzanowski.


9 June 2003 – ISBNs in articles now link to Special:Booksources, which

fetches its contents from the user-editable page Wikipedia:Book sources.

Before this, ISBN link targets were coded into the software and new ones

were suggested on the Wikipedia:ISBN page. See the edit that changed this.


After 6 December 2003, various system messages shown to Wikipedia users

were no longer hard coded, allowing Wikipedia administrators to modify

certain parts of MediaWiki's interface, such as the message shown to blocked

users.

On 12 February 2004, server operations were moved from San

Diego, California to Tampa, Florida.[108]


On 29 May 2004, all the various websites were updated to a new version of

the MediaWiki software.


On 30 May 2004, the first instances of "categorization" entries appeared.

Category schemes, like Recent Changes and Edit This Page, had existed from

the founding of Wikipedia. However, Larry Sanger had viewed the schemes as

lists, and even hand-entered articles, whereas the categorization effort

centered on individual categorization entries in each article of the

encyclopedia, as part of a larger automatic categorization of the articles of the

encyclopedia.[109]


After 3 June 2004, administrators could edit the style of the interface by

changing the CSS in the monobook stylesheet at MediaWiki:Monobook.css.


Also on 30 May 2004, with MediaWiki 1.3, the Template namespace was

created, allowing transclusion of standard texts.[110]


On 7 June 2005 at 3:00 a.m. Eastern Standard Time, the bulk of the

Wikimedia servers were moved to a new facility across the street. All

Wikimedia projects were down during this time.


In March 2013, the first phase of the Wikidata interwiki database became

available across Wikipedia's language editions.[95]


In July 2013, the VisualEditor editing interface was inaugurated, allowing users

to edit Wikipedia using a WYSIWYG text editor (similar to a word processor)

instead of wikimarkup.[107]

Look and feel[edit]


The external face of Wikipedia, its look and feel, and the Wikipedia branding, as

presented to users.


On 4 April 2002, BrilliantProse, since renamed to Featured Articles,
[111]
was moved to the Wikipedia namespace from the article namespace.

Around 15 October 2003, a new Wikipedia logo was installed. The logo

concept was selected by a voting process,[112] which was followed by a


revision process to select the best variant. The final selection was

created by David Friedland (who edits Wikipedia under the

username "nohat") based on a logo design and concept created by Paul

Stansifer.


On 22 February 2004, Did You Know (DYK) made its first Main Page

appearance.


On 23 February 2004, a coordinated new look for the Main Page

appeared at 19:46 UTC. Hand-chosen entries for the Daily Featured

Article, Anniversaries, In the News, and Did You Know rounded out the

new look.


On 10 January 2005, the multilingual portal at www.wikipedia.org was set

up, replacing a redirect to the English-language Wikipedia.


On 5 February 2005, Portal:Biology was created, becoming the first

thematic "portal" on the English Wikipedia.[113] However, the concept was

pioneered on the German Wikipedia, where Portal:Recht (law studies)

was set up in October 2003.[114]


On 16 July 2005, the English Wikipedia began the practice of including

the day's "featured pictures" on the Main Page.


On 19 March 2006, following a vote, the Main Page of the English-

language Wikipedia featured its first redesign in nearly two years.


On 13 May 2010, the site released a new interface. New features

included an updated logo, new navigation tools, and a link wizard. [81] The

"classic" Wikipedia interface remained available as an option.

Internal structures[edit]
Landmarks in the Wikipedia community, and the development of its

organization, internal structures, and policies.


April 2001, Wales formally defines the "neutral point of view",
[115]
Wikipedia's core non-negotiable editorial policy,[116] a

reformulation of the "Lack of Bias" policy outlined by Sanger for


Nupedia[117] in spring or summer 2000, which covered many of the

same core principles.[118]



In September 2001, collaboration by subject matter

in WikiProjects is introduced.[119]


In February 2002, concerns over the risk of future censorship and

commercialization by Bomis Inc (Wikipedia's original host)

combined with a lack of guarantee this would not happen, led most

participants of the Spanish Wikipedia to break away and establish it

independently as the Enciclopedia Libre.[120] Following clarification

of Wikipedia's status and non-commercial nature later that year, re-

merger talks between Enciclopedia Libre and the re-founded

Spanish Wikipedia occasionally took place in 2002 and 2003, but

no conclusion was reached. As of October 2009, the two continue

to coexist as substantial Spanish language reference sources, with

around 43,000 articles (EL) and 520,000 articles (Sp.W)


[121]
respectively.


Also in 2002, policy and style issues were clarified with the creation

of the Manual of Style, along with a number of other policies and

guidelines.[122]


November 2002 – new mailing lists for WikiEN and Announce are

set up, as well as other language mailing lists (e.g. Polish), to

reduce the volume of traffic on mailing lists.[123]


In July 2003, the rule against editing one's autobiography is

introduced.[124]


On 28 October 2003, the first "real" meeting of Wikipedians

happened in Munich. Many cities followed suit, and soon a number

of regular Wikipedian get-togethers were established around the

world. Several Internet communities, including one on the

popular blog website LiveJournal, have also sprung up since.


From 10 July to 30 August 2004

the Wikipedia:Browse and Wikipedia:Browse by overview formerly

on the Main Page were replaced by links to overviews. On 27


August 2004 the Community Portal was started,[125] to serve as a

focus for community efforts. These were previously accomplished

on an informal basis, by individual queries of the Recent Changes,

in wiki style, as ad-hoc collaborations between like-minded editors.


During September to December 2005 following the Seigenthaler

controversy and other similar concerns,[63] several anti-abuse

features and policies were added to Wikipedia. These were:


The policy for "Checkuser" (a MediaWiki extension to assist detection of abuse

via internet sock-puppetry) was established in November 2005.[126] Checkuser

function had previously existed, but was viewed more as a system tool at the

time, so there had been no need for a policy covering use on a more routine

basis.[127]


Creation of new pages on the English Wikipedia was restricted to editors who

had created a user account.[128]


The introduction and rapid adoption of the policy Wikipedia:Biographies of

living people, giving a far tighter quality control and fact-check system to

biographical articles related to living people.


The "semi-protection" function and policy,[129] allowing pages to be protected so

that only those with an account could edit.


In May 2006, a new "oversight" feature was introduced on the

English Wikipedia, allowing a handful of highly trusted users

to permanently erase page revisions containing copyright

infringements or libelous or personal information from a

page's history. Previous to this, page version deletion was

laborious, and also deleted versions remained visible to other

administrators and could be un-deleted by them.


On 1 January 2007, the subcommunity

named Esperanza was disbanded by communal consent.

Esperanza had begun as an effort to promote "wikilove" and a

social support network, but had developed its own subculture

and private structures.[130][131] Its disbanding was described as


the painful but necessary remedy for a project that had

allowed editors to "see themselves as Esperanzans first and

foremost".[131] A number of Esperanza's subprojects were

integrated back into Wikipedia as free-standing projects, but

most of them are now inactive. When the group was founded

in September 2005, there had been concerns expressed that

it would eventually be condemned as such.[132]


In April 2007, the results of a four-month policy review by a working group of several hundred editors,
which sought to merge the core Wikipedia policies into one core policy (see Wikipedia:Attribution),
were polled for community support. The proposal did not gain consensus; a significant view emerged
that the existing structure of three strongly focused policies, each covering its own area, was more
helpful to quality control than a single, more general merged policy.


A one-day closure of Wikipedia was called by Jimmy Wales on

18 January 2012, in conjunction with Google and over 7,000

other websites, to protest the Stop Online Piracy Act then

under consideration by the United States Congress.

The Wikimedia Foundation and legal structures[edit]
Legal and organizational structure of the Wikimedia Foundation, its executive, and

its activities as a foundation.


In August 2002, shortly after Jimmy Wales announced

that he would never run commercial advertisements on

Wikipedia, the URL of Wikipedia was changed

from wikipedia.com to wikipedia.org (see: .com and .org).



On 20 June 2003, the Wikimedia Foundation was

founded.


A Communications Committee was formed in January 2006 to handle media inquiries and emails received for the Foundation and Wikipedia via the newly implemented OTRS (a ticket-handling system).


Angela Beesley and Florence Nibart-Devouard were

elected to the Board of Trustees of the Wikimedia

Foundation. During this time, Angela was active in

editing content and setting policy, such as privacy policy,

within the Foundation.[133]


On 10 January 2006, Wikipedia became a registered

trademark of Wikimedia Foundation.[134]


In July 2006, Angela Beesley resigned from the board of

the Wikimedia Foundation.[135]


In June 2006, Brad Patrick was hired to be the first

executive director of the Foundation. He resigned in

January 2007, and was later replaced by Sue Gardner

(June 2007).


In October 2006, Florence Nibart-Devouard became

chair of the board of Wikimedia Foundation.

Projects and milestones[edit]


Main pages: Wikipedia:Statistics and List of Wikipedias

Sister projects and milestones related to articles, user base, and other statistics.


On 15 January 2001, the first recorded edit of

Wikipedia was performed.



In December 2002, the first sister

project, Wiktionary, was created; aiming to produce

a dictionary and thesaurus of the words in all

languages. It uses the same software as Wikipedia.


On 22 January 2003, the English Wikipedia was

again slashdotted after having reached

the 100,000 article milestone with the Hastings,

New Zealand article. Two days later, the German-


language Wikipedia, the largest non-English

language version, passed the 10,000 article mark.


On 20 June 2003, the same day that the Wikimedia

Foundation was founded, "Wikiquote" was created.

A month later, "Wikibooks" was launched.

"Wikisource" was set up towards the end of the

year.


In January 2004, Wikipedia reached the 200,000-

article milestone in English with the article on Neil

Warnock, and reached 450,000 articles for both

English and non-English Wikipedias. The next

month, the combined article count of the English

and non-English reached 500,000.


On 20 April 2004, the article count of the English

Wikipedia reached 250,000.


On 7 July 2004, the article count of the English

Wikipedia reached 300,000.


On 20 September 2004, Wikipedia reached one

million articles in over 105 languages, and received

a flurry of related attention in the press.[136] The one-millionth article was published in the Hebrew Wikipedia and discussed the flag of Kazakhstan.


On 20 November 2004, the article count of the

English Wikipedia reached 400,000.


On 18 March 2005, Wikipedia passed the 500,000-

article milestone in English, with Involuntary

settlements in the Soviet Union being announced in

a press release as the landmark article.[137]


In May 2005, Wikipedia became the most popular

reference website on the Internet according to


traffic monitoring company Hitwise,

relegating Dictionary.com to second place.


On 29 September 2005, the English Wikipedia

passed the 750,000-article mark.


On 1 March 2006, the English Wikipedia passed

the 1,000,000-article mark, with Jordanhill railway

station being announced on the Main Page as the

milestone article.[138]


On 8 June 2006, the English Wikipedia passed

the 1,000-featured-article mark, with Iranian

peoples.[139]


On 15 August 2006, the Wikimedia Foundation

launched Wikiversity.[140]


On 24 November 2006, the English Wikipedia

passed the 1,500,000-article mark, with Kanab

ambersnail being announced on the Main Page as

the milestone article.[138]


On 4 April 2007, the first Wikipedia CD selection in

English was published as a free download.[141]


On 9 September 2007, the English Wikipedia

passed the 2,000,000-article mark. El

Hormiguero was accepted by consensus as the

2,000,000th article.


On 17 August 2009, the English Wikipedia passed

the 3,000,000-article mark, with Beate

Eriksen being announced on the Main Page as the

milestone article.


On 12 December 2010, the English Wikipedia

passed the 3,500,000-article mark.



On 7 November 2011, the German Wikipedia

exceeded 100 million page edits.


On 24 November 2011, the English Wikipedia

exceeded 500 million page edits.


On 17 December 2011, the Dutch

Wikipedia exceeded 1,000,000 articles, becoming

the fourth Wikipedia language edition to do so.


On 13 July 2012, the English Wikipedia

exceeded 4,000,000 articles, with Izbat al-Burj.[100]


On 22 January 2013, the Italian

Wikipedia exceeded 1,000,000 articles, becoming

the fifth Wikipedia language edition to do so.


On 11 May 2013, the Russian

Wikipedia exceeded 1,000,000 articles, becoming

the sixth Wikipedia language edition to do so.


On 16 May 2013, the Spanish

Wikipedia exceeded 1,000,000 articles, becoming

the seventh Wikipedia language edition to do so.


On 15 June 2013, the Swedish

Wikipedia exceeded 1,000,000 articles, becoming

the eighth Wikipedia language edition to do so.


On 25 September 2013, the Polish

Wikipedia exceeded 1,000,000 articles, becoming

the ninth Wikipedia language edition to do so.


On 21 October 2013, Wikipedia exceeded 30

million articles across all 287 language editions.

Fundraising[edit]
Every year, Wikipedia runs a fundraising campaign to

support its operations.



One of the first fundraisers was held from 18

February 2005 to 1 March 2005, raising

US$94,000, which was US$21,000 more than

expected.[142]

On 6 January 2006, the Q4 2005 fundraiser

concluded, raising a total of just over US$390,000.[143]


The 2007 fundraising campaign raised US$1.5

million from 44,188 donations.[144]


The 2008 fundraising campaign gained Wikipedia

more than US$6 million.[145][146]


The 2010 campaign was launched on 13

November 2010.[147] The campaign raised US$16

million.[148]


The 2011 campaign raised US$20 million from

more than one million donors.[149]


The 2012 campaign raised US$25 million from

around 1.2 million donors.[150]

External impact[edit]


In 2007, Wikipedia was deemed fit to be used as a

major source by the UK Intellectual Property

Office in a Formula One trademark case ruling.[151]



Over time, Wikipedia gained recognition amongst

more traditional media as a "key source" for major

news events, such as the 2004 Indian Ocean

earthquake and related tsunami, the 2008

American Presidential election,[152] and the

2007 Virginia Tech massacre. The latter article was

accessed 750,000 times in two days, with

newspapers published local to the shootings

adding that "Wikipedia has emerged as the


clearinghouse for detailed information on the

event."[153]


On 21 February 2007, Noam Cohen of the New

York Times reported that some academics were

banning the use of Wikipedia as a research source.[154]


On 27 February 2007, an article in The Harvard

Crimson newspaper reported that some professors

at Harvard University included Wikipedia in

their syllabi, but that there was a split in their

perception of using Wikipedia.[155]


In July 2013, a large-scale study by four major

universities identified the most contested articles

on Wikipedia, finding that Israel, Adolf

Hitler and God were more fiercely debated than

any other subjects.[156]

Effect of biographical articles[edit]

Because Wikipedia biographies are often updated as

soon as new information comes to light, they are often

used as a reference source on the lives of notable

people. This has led to attempts to manipulate and

falsify Wikipedia articles for promotional or defamatory

purposes (see Controversies). It has also led to novel

uses of the biographical material provided. Some

notable people's lives are being affected by their

Wikipedia biography.


November 2005: The Seigenthaler

controversy occurred when a hoaxer asserted on

Wikipedia that journalist John Seigenthaler had

been involved in the Kennedy assassination of

1963.

December 2006: German comedian Atze

Schröder sued Arne Klempert, secretary


of Wikimedia Deutschland, because he did not

want his real name published on Wikipedia.

Schröder later withdrew his complaint, but wanted

his attorney's costs to be paid by Klempert. A court

decided that the artist had to cover those costs by

himself.[157]


16 February 2007: Turkish historian Taner

Akçam was briefly detained upon arrival

at Montréal-Pierre Elliott Trudeau International

Airport because of false information on his

Wikipedia biography claiming he was a terrorist.[158]


[159]


November 2008: The German Left

Party politician Lutz Heilmann claimed that some

remarks in his Wikipedia article caused damage to

his reputation. He succeeded in getting a court

order to make Wikimedia Deutschland remove a

key search portal. The result was a national

outpouring of support for Wikipedia, more

donations to Wikimedia Deutschland, and a rise in

daily pageviews of Lutz Heilmann's article from a

few dozen to half a million. Shortly after, Heilmann

asked the court to withdraw the court order.[160]


December 2008: Wikimedia Nederland, the Dutch

chapter, won a preliminary injunction after an

entrepreneur was linked in "his" article with the

criminal Willem Holleeder and wanted the article

deleted. The judge in Utrecht believed Wikimedia's

assertion that it has no influence on the content of

Dutch Wikipedia.[161]


February 2009: When Karl Theodor Maria Nikolaus

Johann Jakob Philipp Franz Joseph Sylvester

Freiherr von und zu Guttenberg became federal


minister on 10 February 2009, an unregistered user

added an eleventh given name in the article on

German Wikipedia: Wilhelm. Numerous newspapers copied it. When wary Wikipedians tried to remove the false "Wilhelm", the removal was reverted, with those newspapers cited as sources. This case about Wikipedia's reliability, and about journalists copying from Wikipedia, became known as Falscher Wilhelm ("wrong Wilhelm").[162]


May 2009: An article about the German journalist

Richard Herzinger in the German Wikipedia was

vandalized. The IP user added that Herzinger, who

wrote for Die Welt, was Jewish; the sighter marked

this as "sighted" (meaning that there is no

vandalism in the article). Herzinger complained

about that to Wikipedians who immediately deleted

the assertion. According to Herzinger, who wrote

about the incident in a newspaper article,[163] he is

regularly called a Jew by right-wing extremists due

to his perceived pro-Israel stance.


October 2009: In 1990, the German actor Walter

Sedlmayr was murdered. Years later, when the two

murderers were released from prison, German law

prohibited the media from mentioning their names.

The men's lawyer also sent the Wikimedia

Foundation a cease and desist letter requesting the

men's names be removed from the English

Wikipedia.[164][165]

Early roles of Wales and


Sanger[edit]
Both Wales and Sanger played important roles in the

early stages of Wikipedia. Sanger initially brought the

wiki concept to Wales and suggested it be applied


to Nupedia and then, after some initial skepticism, Wales

agreed to try it.[18] To Wales is ascribed the broader idea

of an encyclopedia to which non-experts could

contribute, i.e. Wikipedia; Sanger wrote, "To be clear, the

idea of an open source, collaborative encyclopedia,

open to contribution by ordinary people,

was entirely Jimmy's, not mine" (emphasis in original

text). He also wrote, "Jimmy, of course, deserves

enormous credit for investing in and guiding

Wikipedia."[15] Wales stated in October 2001 that "Larry

had the idea to use Wiki software."[21] Sanger coined the

portmanteau "Wikipedia" as the project name.[15] In

review, Larry Sanger conceived of a wiki-based

encyclopedia as a strategic solution to Nupedia's

inefficiency problems.[166] In terms of project roles,

Sanger spearheaded and pursued the project as its

leader in its first year, and did most of the early work in

formulating policies (including "Ignore all rules")[167] and

"Neutral point of view"[51] and building up the community.


[166]
Upon departure in March 2002, Sanger emphasized

the main issue was purely the cessation of Bomis'

funding for his role, which was not viable part-time, and

his changing personal priorities;[16] however, by 2004, the

two had drifted apart and Sanger became more critical.

Two weeks after the launch of Citizendium, Sanger

criticized Wikipedia, describing the latter as "broken

beyond repair."[168] In 2002 Sanger parted ways with

Wikipedia; by 2005 Wales began to dispute Sanger's

role in the project, three years after Sanger left.[169][170][171]

In 2005, Wales described himself simply as the founder

of Wikipedia;[169] however, according to Brian

Bergstein of the Associated Press, "Sanger has long

been cited as a co-founder."[166] There is evidence that

Sanger was called co-founder, along with Wales, as

early as 2001, and he is referred to as such in early


Wikipedia press releases and Wikipedia articles and in a

September 2001 New York Times article for which both

were interviewed.[172] In 2006, Wales said, "He used to

work for me [...] I don't agree with calling him a co-

founder, but he likes the title";[173]nonetheless, before

January 2004, Wales did not dispute Sanger's status as

co-founder[174] and, indeed, identified himself as "co-

founder" as late as August 2002.[175] By contrast, Sanger

originally thought of himself as an employee; in his

introductory message to the Nupedia mailing list, he said

that "Jimmy Wales[,] contacted me and asked me to

apply as editor-in-chief of Nupedia. Apparently, Bomis,

Inc. (which owns Nupedia)... who could manage this sort

of long-term project, he thought I would be perfect for the

job. This is indeed my dream job..."[176] Sanger did not

claim to be a founder or co-founder, instead saying "He

[Wales] had had the idea for Nupedia since at least last

fall".[176]

As of March 2007: Wales emphasized this employer–

employee relationship and his ultimate authority, terming

himself Wikipedia's sole founder; and Sanger

emphasized their statuses as co-founders, referencing

earlier versions of Wikipedia pages (2004, 2006), press

releases (2002–2004), and media coverage from the

time of his involvement routinely terming them in this

manner.[166][172][177][178]

Controversies[edit]
Main articles: Criticism of Wikipedia, List of litigation

involving Wikipedia, and Reliability of Wikipedia

Wikinews has related news: U.K. National Portrait Gallery threatens U.S. citizen with legal action over Wikimedia images
 January 2001: Licensing and structure. After partial

breakdown of discussions with Bomis, Richard

Stallman announced GNUpedia as a competing

project.[179] Besides having a nearly identical name,

it was very similar functionally to

Nupedia/Wikipedia (the former of which launched in

March 2000 but had as yet published very few

articles—the latter of which was intended to be a

source of seed-articles for the former). The goals

and methods of GNUpedia were nearly identical to

Wikipedia: anyone can contribute, small

contributions welcome, plan on taking years,

narrow focus on encyclopedic content as the

primary goal, anyone can read articles, anyone can

mirror articles, anyone can translate articles, use

libre-licensed code to run the site, encourage peer

review, and rely primarily on volunteers. GNUpedia

was roughly intended to be a combination of Wikipedia and Wikibooks. The main

exceptions were:

1. strong prohibition against *any* sort of centralized control ("[must not be] written

under the direction of a single organization, which made all decisions about the

content, and... published in a centralized fashion. ...we dare not allow any

organization to decide what counts as part of [our encyclopedia]"). In

particular, deletionists were not allowed; editing an article would require forking it,

making a change, and then saving the result as a 'new' article on the same topic.

2. assuming attribution for articles (rather than anonymous by default), requiring

attribution for quotations, and allowing original authors to control straightforward

translations. In particular, the idea was to have a set of N articles covering

the Tiananmen Square protests of 1989, with some to-be-determined mechanism for

readers to endorse/rank/like/plus/star the version of the article they found best.

3. given the structure above, where every topic (especially controversial ones) might

have a thousand articles purporting to be *the* GNUpedia article about Sarah Palin,

Stallman explicitly rejected the idea of a centralized website that would specify which

article of those thousand was worth reading. Instead of an official catalogue, the plan
was to rely on search engines at first (the reader would begin by googling "gnupedia

sarah palin"), and then eventually if necessary construct catalogues according to the

same principles as articles were constructed. In Wikipedia, there is an official central

website for each language (en.wikipedia.org), and an official catalogue of sorts

(category-lists and lists-of-lists), but as of 2013 search engines still provide about

60% of the inbound traffic.

The goals which led to GNUpedia were published at least as early as 18 December

2000,[180][181] and these exact goals were finalized on the 12th [179] and 13th [182] of

January 2001, albeit with a copyright of 1999, from when Stallman had first started

considering the problem. The only sentence added between 18 December and the

unveiling of GNUpedia the week of 12–16 January was this: "The GNU Free

Documentation License would be a good license to use for courses."

GNUpedia was 'formally' announced on Slashdot[183] the same day that their mailing-

list first went online with a test-message, 16 January. Jimmy Wales posted to the list

on the 17th, the first full day of messages, explaining the discussions with Stallman

concerning the change in Nupedia content-licensing, and suggesting cooperation.[184][185] Stallman himself first posted on the 19th, and in his second post, on the 22nd, mentioned that discussions about merging Wikipedia and GNUpedia were ongoing.[186] Within a couple of months, Wales had changed his email signature from "the open source encyclopedia" to "the free encyclopedia",[187] both Nupedia and Wikipedia had adopted the GFDL, and the merger[188] of GNUpedia into Wikipedia was effectively accomplished.


November 2001: Jimmy Wales announced that advertising would soon begin on Wikipedia, starting in the summer or fall of 2002.[189] Instead, early in 2002, Chief Editor Larry Sanger was fired, since his salary was the largest[citation needed] expense involved in keeping Wikipedia functioning. By September 2002,[190] Wales had publicly stated, "There are currently no plans for advertising on Wikipedia." By June 2003, the Wikimedia Foundation was formally incorporated;[191] the Foundation is explicitly against paid advertising,[192] although it does 'internally' advertise Wikimedia Foundation fund-raising events on Wikipedia.[193] As of 2013, the by-laws of the Wikimedia Foundation do not explicitly prohibit it from adopting a broader advertising policy if necessary;[citation needed] such by-laws are subject to vote.[citation needed]


All of 2003: Zero controversies of any notability occurred.

All of 2004: Zero controversies of any notability occurred.

January 2005: The fake

charity QuakeAID, in

the month following

the 2004 Indian Ocean

earthquake, attempted

to promote itself on its

Wikipedia page.


October 2005: Alan

Mcilwraith was exposed

as a fake war hero with

a Wikipedia page.


November 2005:

The Seigenthaler

controversy caused

Brian Chase to resign

from his employment,

after his identity was

ascertained by Daniel

Brandt of Wikipedia

Watch. Following this,

the scientific journal Nature undertook a peer-reviewed study to test articles in Wikipedia against their equivalents in Encyclopædia Britannica, and concluded they are comparable in terms of accuracy.[194][195] Britannica rejected the study's methodology and its conclusion;[196] Nature refused to apologize, asserting instead the reliability of its study and rejecting the criticisms.[197]


Early-to-mid-2006:

The congressional

aides biography

scandals came to public

attention, in which

several political aides

were caught trying to

influence the Wikipedia

biographies of several

politicians to remove

undesirable information

(including pejorative

statements quoted, or

broken campaign

promises), add

favorable information or

"glowing" tributes, or

replace the article in

part or whole by staff

authored biographies.

The staff of at least five

politicians were

implicated: Marty

Meehan, Norm

Coleman, Conrad

Burns, Joe Biden, Gil

Gutknecht.[198] In a

separate but similar

incident, the campaign

manager for Cathy Cox,


Morton Brilliant,

resigned after being

found to have added

negative information to

the Wikipedia entries of political opponents.[199] Following media publicity, the incidents tapered off around August 2006.


July 2006: Joshua

Gardner was exposed

as a fake Duke of

Cleveland with a

Wikipedia page.


January 2007: English-

language Wikipedians

in Qatar were briefly

blocked from editing,

following a spate of

vandalism, by an

administrator who did

not realize that the

country's internet traffic

is routed through a

single IP address.

Multiple media sources

promptly declared that

Wikipedia was banning

Qatar from the site.[200]


On 23 January 2007,

a Microsoft employee

offered to pay Rick

Jelliffe to review and


change certain

Wikipedia articles

regarding an open-source document standard that was a rival to a Microsoft format.[201]


In February 2007, The

New Yorker magazine

issued a rare editorial

correction that a

prominent English

Wikipedia editor and

administrator known as

"Essjay", had invented a

persona using fictitious

credentials.[202][203] The

editor, Ryan Jordan,

became

a Wikia employee in

January 2007 and

divulged his real name;

this was noticed by

Daniel Brandt of

Wikipedia Watch, and

communicated to the

original article author.

(See: Essjay

controversy)


February 2007: Fuzzy

Zoeller sued a Miami

firm because

defamatory information

was added to his


Wikipedia biography in

an anonymous edit that

came from their

network.


16 February 2007:

Turkish historian Taner

Akçam was briefly

detained upon arrival at

a Canadian airport

because of false

information on his

biography indicating

that he was a terrorist.


In June 2007, an

anonymous user posted

hoax information that,

by coincidence,

foreshadowed the Chris

Benoit murder-suicide,

hours before the bodies

were found by

investigators. The

discovery of the edit

attracted widespread

media attention and

was first covered in

sister site Wikinews.


In October 2007, in their

obituaries of recently

deceased TV theme

composer Ronnie

Hazlehurst, many

British media

organisations reported
that he had co-written

the S Club 7 song

"Reach". In fact, he

hadn't, and it was

discovered that this

information had been

sourced from a hoax

edit to Hazlehurst's

Wikipedia article.[204]


In February 2007,

Barbara Bauer, a

literary agent, sued

Wikimedia for

defamation and causing

harm to her business,

the Barbara Bauer Literary Agency.[205] In Bauer v. Glatzer, Bauer claimed that information on Wikipedia critical of her abilities as a literary agent caused this harm. The Electronic Frontier Foundation defended Wikipedia[206] and moved to dismiss the case on 1 May 2008.[207] The case against the Wikimedia Foundation was dismissed on 1 July 2008.[208]

On 14 July 2009, the

National Portrait Gallery

issued a cease and

desist letter for alleged

breach of copyright,

against a Wikipedia

editor who downloaded

more than 3,000 high-

resolution images from

the NPG website, and

placed them

on Wikimedia Commons.[209][210][211][212][213] See National Portrait Gallery and Wikimedia Foundation copyright dispute for more.


In April and May 2010,

there was controversy

over the hosting and

display of sexual drawings and pornographic images, including images of children, on Wikipedia.[214][215][216] It led to the mass removal of pornographic content from Wikimedia Foundation sites.[217][218]


In November 2012, Lord

Justice Leveson wrote

in his report on British

press standards, "The


Independent was

founded in 1986 by the

journalists Andreas

Whittam Smith, Stephen

Glover and Brett

Straub..." He had used

the Wikipedia article

for The

Independent newspaper

as his source, but an

act of vandalism had

replaced Matthew

Symonds (a genuine

co-founder) with Brett

Straub (an unknown

character).[219] The

Economist said of

the Leveson report,

"Parts of it are a

scissors-and-paste job

culled from

Wikipedia."[220]

Notable forks
and
derivatives[edit]
See this page for a partial list

of Wikipedia mirrors and

forks. No list of sites utilizing

the software is maintained. A

significant number of sites

use the MediaWiki software

and concept, popularized

by Wikipedia.
Specialized foreign-language forks using the Wikipedia concept include Enciclopedia Libre (Spanish), Wikiweise (German), WikiZnanie (Russian), Susning.nu (Swedish), and Baidu Baike (Chinese). Some of these (such as Enciclopedia Libre) use the GFDL or compatible licenses as used by Wikipedia, leading to exchange of material with their respective language Wikipedias.

In 2006, Larry

Sanger founded Citizendium,

based upon a modified

version of MediaWiki.[221] The

site stated its aims were 'to improve on the Wikipedia model with "gentle expert oversight"', among other things.[54][222] (see also Nupedia).

Publication on
other media[edit]
The German Wikipedia was the first to be partly published in other media as well (rather than only online), including releases

on CD in November

2004[223] and more extended

versions on CDs or DVD in


April 2005 and December

2006. In December 2005, the

publisher Zenodot

Verlagsgesellschaft mbH, a

sister company of

Directmedia, published a 139-page book explaining

Wikipedia, its history and

policies, which was

accompanied by a 7.5 GB

DVD containing 300,000

articles and 100,000 images

from the German Wikipedia.[224] Originally, Directmedia

also announced plans to print

the German Wikipedia in its

entirety, in 100 volumes of

800 pages each. Publication

was due to begin in October

2006, and finish in 2010. In

March 2006, however, this

project was called off.[225]

In September 2008, Bertelsmann published a 1,000-page volume with a selection of popular German Wikipedia articles. Bertelsmann voluntarily paid 1 euro per copy sold to Wikimedia Deutschland.[226]

The first CD version

containing a selection of

articles from the English

Wikipedia was published in

April 2006 by SOS


Children as the 2006

Wikipedia CD Selection.[227] In

April 2007, "Wikipedia

Version 0.5", a CD containing

around 2000 articles selected

from the online encyclopedia

was published by

the Wikimedia

Foundation and Linterweb.

The selection of articles was based on both the quality of the online version and the importance of the topic. This

CD version was created as a

test-case in preparation for a

DVD version including far

more articles.[228][229] The CD

version can be purchased

online, downloaded as a DVD

image file or Torrent file, or

accessed online at the

project's website.

A free software project has

also been launched to make

a static version of Wikipedia

available for use on iPods.

The "Encyclopodia" project

was started around March

2006 and can currently be

used on 1st to 4th generation

iPods.[230]

Lawsuits[edit]
In limited ways, the

Wikimedia Foundation is
protected by Section 230 of

the Communications

Decency Act. In the

defamation action Bauer et

al. v. Glatzer et al., it was

held that Wikimedia had no

case to answer due to the

provisions of this section.[231] A similar law in France caused a lawsuit to be dismissed in October 2007.[232] In 2013, a German

appeals court (the Higher

Regional Court of Stuttgart)

ruled that Wikipedia is a

"service provider" not a

"content provider", and as

such is immune from liability

as long as it takes down

content that is accused of

being illegal.[233]

See also[edit]

TWiki
From Wikipedia, the free encyclopedia
For the robot character, see Twiki.

TWiki

Developer(s) Peter Thoeny with TWiki contributors

Initial release 23 July 1998


Stable release 5.1.4 (2013-02-16)

Preview release None

Written in Perl

Operating system Cross-platform

Type Wiki

License GPL

Website http://twiki.org/

TWiki is a Perl-based structured wiki application,[1] typically used to run a collaboration


platform, knowledge or document management system, a knowledge base, or a team portal. Users can
create wiki applications using the TWiki Markup Language, and developers can extend its functionality
with plugins.

The TWiki project was founded by Peter Thoeny in 1998 as an open source wiki-based application
platform. In October 2008, the company TWiki.net, created by Thoeny, assumed full control over the
TWiki project[2] while much of the developer community[3][4] forked off to join the Foswiki project.[5]

Contents


 1 Major features

o 1.1 TWiki extensions

o 1.2 TWiki application platform

o 1.3 User interface

 2 TWiki deployment

 3 Realization
 4 TWiki release history

 5 Forks of TWiki

 6 Gallery

 7 See also

 8 References

 9 External links

Major features[edit]

 Revision control - complete audit trail, also for meta data such as
attachments and access control settings
 Fine-grained access control - restrict read/write/rename on site level,
web level, page level based on user groups

 Extensible TWiki markup language

 TinyMCE based WYSIWYG editor

 Dynamic content generation with TWiki variables

 Forms and reporting - capture structured content, report on it with


searches embedded in pages

 Built-in database - users can create wiki applications using the TWiki
Markup Language

 Skinnable user interface

 RSS/Atom feeds and e-mail notification

 Over 400 extensions and 200 plugins

TWiki extensions[edit]
TWiki has a plugin API that has spawned over 300 extensions [6] to link into databases, create charts, tags,
sort tables, write spreadsheets, create image galleries and slideshows, make drawings, write blogs,
plot graphs, interface to many different authentication schemes, track Extreme Programming projects and
so on.

TWiki application platform[edit]


TWiki as a structured wiki provides database-like manipulation of fields stored on pages, [7] and offers a
SQL-like query language to embed reports in wiki pages.[8]

Wiki applications are also called situational applications because they are created ad hoc by the users for
very specific needs. Users have built TWiki applications[9] that include call center status boards, to-do
lists, inventory systems, employee handbooks, bug trackers, blog applications, discussion forums, status
reports with rollups and more.

User interface[edit]
The interface of TWiki is completely skinnable via templates, themes and (per-user) CSS. It includes support for internationalization ('I18N'), with multiple character sets and UTF-8 URLs, and the user interface has been translated into Chinese, Czech, Danish, Dutch, French, German, Italian, Japanese, Polish, Portuguese, Russian, Spanish and Swedish.[10]

TWiki deployment[edit]
TWiki is primarily used at the workplace as a corporate wiki[11] to coordinate team activities, track projects,
implement workflows[12] and as an Intranet Wiki. The TWiki community estimates 40,000 corporate wiki
sites as of March 2007, and 20,000 public TWiki sites.[13]

TWiki customers include Fortune 500 companies such as Disney, Google, Motorola, Nokia, Oracle Corporation and Yahoo!, as well as small and medium enterprises[14] such as ARM Holdings[15] and DHL.[16] TWiki has also been used to create collaborative internet sites, such as the City of Melbourne's FutureMelbourne wiki, where citizens can collaborate on the city's future plan.[17]

Realization[edit]
TWiki is implemented in Perl. Wiki pages are stored in plain text files. Everything, including metadata such as access control settings, is version controlled using RCS. RCS is optional, since an all-Perl version control system is also provided.

TWiki scales reasonably well even though it uses plain text files and no relational database to store page
data. Many corporate TWiki installations have several hundred thousand pages and tens of thousands of
users. Load balancing and caching can be used to improve performance on high-traffic sites.[18]

TWiki has database features built into the engine. A TWiki Form[7] is attached to a page as metadata. This
represents a database record. A set of pages that share the same type of form build a database table. A
formatted search[19] with a SQL-like query[20] can be embedded into a page to construct dynamic
presentation of data from multiple pages. This allows for building wiki applications and constitutes TWiki's notion of a structured wiki.
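As a rough conceptual sketch only (not TWiki code, and not the project's actual API), the Python fragment below models the idea described above: a form attached to a page acts like a database record, pages sharing the same form type behave like rows of a table, and an embedded query selects and reports on matching pages. All names here (BugReport, find_pages, the field names) are hypothetical.

# Conceptual model of a structured wiki: each page carries free text plus a
# form (named fields); pages sharing a form type can be queried like a table.
pages = [
    {"name": "Bug001", "form": "BugReport",
     "fields": {"Status": "Open", "Priority": "High"}},
    {"name": "Bug002", "form": "BugReport",
     "fields": {"Status": "Closed", "Priority": "Low"}},
]

def find_pages(form, **criteria):
    """Return pages of the given form type whose fields match all criteria."""
    return [p for p in pages
            if p["form"] == form
            and all(p["fields"].get(k) == v for k, v in criteria.items())]

# Roughly what an embedded, SQL-like report over the "BugReport table" yields:
for page in find_pages("BugReport", Status="Open"):
    print(page["name"], page["fields"]["Priority"])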

TWiki release history[edit]


1998-07-23: Initial version, based on JosWiki, an application created by
Markus Peter and Dave Harris[21][22]

2000-05-01: TWiki Release 1 May 2000


2000-12-01: TWiki Release 1 December 2000


2001-09-01: TWiki Release 1 September 2001


2001-12-01: TWiki Release 1 December 2001 ("Athens")


2003-02-01: TWiki Release 1 February 2003 ("Beijing")


2004-09-01: TWiki Release 1 September 2004 ("Cairo")


2006-02-01: TWiki Release 4.0.0 ("Dakar")


2007-01-16: TWiki Release 4.1.0 ("Edinburgh")


2008-01-22: TWiki Release 4.2.0 ("Freetown")


2009-09-02: TWiki Release 4.3.2 ("Georgetown")


2010-06-10: TWiki Release 5.0 ("Helsinki")


2011-08-20: TWiki Release 5.1 ("Istanbul")

Forks of TWiki[edit]
Forks of TWiki include:


2001: Spinner Wiki (abandoned)

2003: O'Wiki fork (abandoned)


2008: Foswiki, launched in October 2008 when a dispute about the
future guidance of the project could not be settled, [23][24] resulting in the
departure of much of the TWiki community including the core developer
team[4]

RSS
From Wikipedia, the free encyclopedia
For other uses, see RSS (disambiguation).

For RSS feeds from Wikipedia, see Wikipedia:Syndication.

RSS - Rich Site Summary

Filename extension: .rss, .xml

Internet media type: application/rss+xml (registration not finished)[1]

Type of format: Web syndication

Extended from: XML

RSS (Rich Site Summary, originally RDF Site Summary, often dubbed Really Simple Syndication) uses a family of standard web feed formats[2] to publish frequently updated information: blog entries, news headlines, audio, video. An RSS document (called a "feed", "web feed",[3] or "channel") includes full or summarized text and metadata, such as the publishing date and author's name.

RSS feeds enable publishers to syndicate data automatically. A standard XML file format ensures
compatibility with many different machines/programs. RSS feeds also benefit users who want to receive
timely updates from favourite websites or to aggregate data from many sites.
Once users subscribe to a website's feed, RSS removes the need for them to check it manually. Instead, their browser or feed reader monitors the site regularly and informs the user of any updates; it can also be set to download the new data automatically.

Software termed "RSS reader", "aggregator", or "feed reader", which can be web-based, desktop-based,
or mobile-device-based, present RSS feed data to users. Users subscribe to feeds either by entering a
feed's URI into the reader or by clicking on the browser's feed icon. The RSS reader checks the user's
feeds regularly for new information and can automatically download it, if that function is enabled. The
reader also provides a user interface.

Contents


 1 History

 2 Example

 3 Variants

 4 Modules

 5 Interoperability

 6 BitTorrent and RSS

 7 RSS Compared to Atom

 8 See also

 9 References

 10 External links

History[edit]
Main article: History of web syndication technology

The RSS formats were preceded by several attempts at web syndication that did not achieve widespread
popularity. The basic idea of restructuring information about websites goes back to as early as 1995,
when Ramanathan V. Guha and others in Apple Computer's Advanced Technology Group developed
the Meta Content Framework.[4]
RDF Site Summary, the first version of RSS, was created by Dan Libby and Ramanathan V. Guha at Netscape. It was released in March 1999 for use on the My.Netscape.Com portal. This version became known as RSS 0.9.[5] In July 1999, Dan Libby of Netscape produced a new version, RSS 0.91,[2] which simplified the format by removing RDF elements and incorporating elements from Dave Winer's news syndication format.[6] Libby also renamed the format from RDF Site Summary to Rich Site Summary and outlined further development of the format in a "futures document".[7]

This would be Netscape's last participation in RSS development for eight years. As RSS was being
embraced by web publishers who wanted their feeds to be used on My.Netscape.Com and other early
RSS portals, Netscape dropped RSS support from My.Netscape.Com in April 2001 during new
owner AOL's restructuring of the company, also removing documentation and tools that supported the
format.[8]

Two entities emerged to fill the void, with neither Netscape's help nor approval: The RSS-DEV Working
Group and Dave Winer, whose UserLand Software had published some of the first publishing tools
outside of Netscape that could read and write RSS.

Winer published a modified version of the RSS 0.91 specification on the UserLand website, covering how
it was being used in his company's products, and claimed copyright to the document. [9] A few months
later, UserLand filed a U.S. trademark registration for RSS, but failed to respond to a USPTO trademark
examiner's request, and the application was rejected in December 2001.[10]

The RSS-DEV Working Group, a project whose members included Guha and representatives of O'Reilly
Media and Moreover, produced RSS 1.0 in December 2000.[11] This new version, which reclaimed the
name RDF Site Summary from RSS 0.9, reintroduced support for RDF and added XML
namespaces support, adopting elements from standard metadata vocabularies such as Dublin Core.

In December 2000, Winer released RSS 0.92,[12] a minor set of changes aside from the introduction of the
enclosure element, which permitted audio files to be carried in RSS feeds and helped spark podcasting.
He also released drafts of RSS 0.93 and RSS 0.94 that were subsequently withdrawn. [13]

In September 2002, Winer released a major new version of the format, RSS 2.0, that redubbed its initials
Really Simple Syndication. RSS 2.0 removed the type attribute added in the RSS 0.94 draft and added
support for namespaces. To preserve backward compatibility with RSS 0.92, namespace support applies
only to other content included within an RSS 2.0 feed, not the RSS 2.0 elements themselves. [14] (Although
other standards such as Atom attempt to correct this limitation, RSS feeds are not aggregated with other
content often enough to shift the popularity from RSS to other formats having full namespace support.)
Because neither Winer nor the RSS-DEV Working Group had Netscape's involvement, they could not
make an official claim on the RSS name or format. This has fueled ongoing controversy in the syndication
development community as to which entity was the proper publisher of RSS.

One product of that contentious debate was the creation of an alternative syndication format, Atom, that
began in June 2003.[15] The Atom syndication format, whose creation was in part motivated by a desire to
get a clean start free of the issues surrounding RSS, has been adopted as IETF Proposed Standard RFC
4287.

In July 2003, Winer and UserLand Software assigned the copyright of the RSS 2.0 specification to
Harvard's Berkman Center for Internet & Society, where he had just begun a term as a visiting fellow. [16] At
the same time, Winer launched the RSS Advisory Board with Brent Simmons and Jon Udell, a group
whose purpose was to maintain and publish the specification and answer questions about the format. [17]

In September 2004, Stephen Horlander created the now-ubiquitous RSS feed icon for use in the Mozilla
Firefox browser.[18]

In December 2005, the Microsoft Internet Explorer team[19] and Microsoft Outlook team[20] announced on
their blogs that they were adopting Firefox's RSS icon. In February 2006, Opera Software followed suit.[21] This effectively made the orange square with white radio waves the industry standard for RSS and
Atom feeds, replacing the large variety of icons and text that had been used previously to identify
syndication data.

In January 2006, Rogers Cadenhead relaunched the RSS Advisory Board without Dave Winer's
participation, with a stated desire to continue the development of the RSS format and resolve
ambiguities. In June 2007, the board revised their version of the specification to confirm that
namespaces may extend core elements with namespace attributes, as Microsoft has done in Internet
Explorer 7. According to their view, a difference of interpretation left publishers unsure of whether this
was permitted or forbidden.


Example[edit]
RSS files are essentially XML-formatted plain text. The RSS file itself is relatively easy to read, by automated processes and humans alike. An example file could have contents such as the following. This could be served over any appropriate communication protocol for file retrieval, such as HTTP or FTP, and reading software would use the information to present a neat display to the end users.
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>RSS Title</title>
<description>This is an example of an RSS feed</description>
<link>http://www.someexamplerssdomain.com/main.html</link>
<lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000 </lastBuildDate>
<pubDate>Mon, 06 Sep 2009 16:20:00 +0000 </pubDate>
<ttl>1800</ttl>

<item>
<title>Example entry</title>
<description>Here is some text containing an interesting
description.</description>
<link>http://www.wikipedia.org/</link>
<guid>unique string per item</guid>
<pubDate>Mon, 06 Sep 2009 16:20:00 +0000 </pubDate>
</item>

</channel>
</rss>
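To make the structure concrete, the following is a minimal sketch, in Python using the standard-library xml.etree.ElementTree module, of how reading software might pull the channel title and items out of a feed like the one above. The file name example_feed.xml is hypothetical and assumed to contain the XML shown.

import xml.etree.ElementTree as ET

# Assumes example_feed.xml holds the RSS 2.0 document shown above.
with open("example_feed.xml", encoding="utf-8") as f:
    root = ET.fromstring(f.read())      # the <rss version="2.0"> element

channel = root.find("channel")
print("Feed:", channel.findtext("title"))

for item in channel.findall("item"):
    # Each <item> carries its own title, link, description, guid and pubDate.
    print("-", item.findtext("title"))
    print("  link:", item.findtext("link"))
    print("  date:", item.findtext("pubDate"))

A real reader would also remember each item's <guid> between runs so that previously seen entries are not presented as new.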
Variants[edit]
There are several different versions of RSS, falling into two major branches (RDF and 2.*).

The RDF (or RSS 1.*) branch includes the following versions:

 RSS 0.90 was the original Netscape RSS version. This RSS was
called RDF Site Summary, but was based on an early working draft of
the RDF standard, and was not compatible with the final RDF
Recommendation.
 RSS 1.0 is an open format by the RSS-DEV Working Group, again
standing for RDF Site Summary. RSS 1.0 is an RDF format like RSS
0.90, but not fully compatible with it, since 1.0 is based on the final RDF
1.0 Recommendation.

 RSS 1.1 is also an open format and is intended to update and replace
RSS 1.0. The specification is an independent draft not supported or
endorsed in any way by the RSS-Dev Working Group or any other
organization.

The RSS 2.* branch (initially UserLand, now Harvard) includes the following versions:

 RSS 0.91 is the simplified RSS version released by Netscape, and also
the version number of the simplified version originally championed
by Dave Winer from Userland Software. The Netscape version was
now called Rich Site Summary; this was no longer an RDF format, but
was relatively easy to use.

 RSS 0.92 through 0.94 are expansions of the RSS 0.91 format, which
are mostly compatible with each other and with Winer's version of RSS
0.91, but are not compatible with RSS 0.90.


 RSS 2.0.1 has the internal version number 2.0. RSS 2.0.1 was
proclaimed to be "frozen", but still updated shortly after release without
changing the version number. RSS now stood for Really Simple
Syndication. The major change in this version is an explicit extension
mechanism using XML namespaces.[22]

Later versions in each branch are backward-compatible with earlier versions (aside from non-conformant
RDF syntax in 0.90), and both versions include properly documented extension mechanisms using XML
Namespaces, either directly (in the 2.* branch) or through RDF (in the 1.* branch). Most syndication
software supports both branches. "The Myth of RSS Compatibility", an article written in 2004 by RSS critic
and Atom advocate Mark Pilgrim, discusses RSS version compatibility issues in more detail.

The extension mechanisms make it possible for each branch to track innovations in the other. For
example, the RSS 2.* branch was the first to support enclosures, making it the current leading choice for
podcasting, and as of 2005 is the format supported for that use by iTunes and other podcasting software;
however, an enclosure extension, mod_enclosure, is now available for the RSS 1.* branch. Likewise, the
RSS 2.* core specification does not support providing full-text in addition to a synopsis, but the RSS 1.*
markup can be (and often is) used as an extension. There are also several common outside extension
packages available, including a new proposal from Microsoft for use in Internet Explorer 7.

The most serious compatibility problem is with HTML markup. Userland's RSS reader—generally
considered as the reference implementation—did not originally filter out HTML markup from feeds. As a
result, publishers began placing HTML markup into the titles and descriptions of items in their RSS feeds.
This behavior has become expected of readers, to the point of becoming a de facto standard,[citation needed] though there is still some inconsistency in how software handles this markup, particularly in titles.
The RSS 2.0 specification was later updated to include examples of entity-encoded HTML; however, all
prior plain text usages remain valid.
As of January 2007, tracking data from www.syndic8.com indicates that the three main versions of RSS in
current use are 0.91, 1.0, and 2.0, constituting 13%, 17%, and 67% of worldwide RSS usage,
respectively.[23] These figures, however, do not include usage of the rival web feed format Atom. As of
August 2008, the syndic8.com website is indexing 546,069 total feeds, of which 86,496 were some dialect
of Atom and 438,102 were some dialect of RSS.[24]

Modules[edit]
The primary objective of all RSS modules is to extend the basic XML schema established for more robust
syndication of content. This inherently allows for more diverse, yet standardized, transactions without
modifying the core RSS specification.

To accomplish this extension, a tightly controlled vocabulary (in the RSS world, "module"; in the XML
world, "schema") is declared through an XML namespace to give names to concepts and relationships
between those concepts.

Some RSS 2.0 modules with established namespaces are:

 Media RSS 2.0 Module


 OpenSearch RSS 2.0 Module
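As an illustration of the module mechanism, the Python sketch below reads an extension element sitting in its own namespace inside an otherwise ordinary RSS 2.0 item; a core-only reader would simply ignore it. The Media RSS namespace URI and the media:content element are used here as an assumed example, and the feed itself is invented.

import xml.etree.ElementTree as ET

# Assumed example: an RSS 2.0 item extended with a Media RSS element.
feed_xml = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example entry</title>
      <media:content url="http://example.com/clip.mp4" type="video/mp4"/>
    </item>
  </channel>
</rss>"""

MEDIA_NS = "{http://search.yahoo.com/mrss/}"
root = ET.fromstring(feed_xml)
for item in root.iter("item"):
    media = item.find(MEDIA_NS + "content")   # namespaced extension element
    if media is not None:
        print(item.findtext("title"), "->", media.get("url"))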

Interoperability[edit]
Although the number of items in an RSS channel is theoretically unlimited, some news aggregators do not support RSS files larger than 150 KB (if each element is placed on its own line, this size corresponds to approximately 2,800 lines).[25] For example, applications that rely on the Common Feed List of Windows might handle such files as if they were corrupt and not open them. Interoperability can be maximized by keeping the file size under this limit.
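A publisher could guard against this with a check as simple as the Python sketch below; the 150 KB figure is the limit mentioned above, and feed.rss is a hypothetical local copy of the feed.

import os

MAX_FEED_BYTES = 150 * 1024   # limit some aggregators are reported to enforce

size = os.path.getsize("feed.rss")   # hypothetical local feed file
if size > MAX_FEED_BYTES:
    print("Warning: feed is", size, "bytes; some readers may treat it as corrupt.")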

BitTorrent and RSS[edit]


Some BitTorrent clients support RSS. RSS feeds which provide links to .torrent files allow users
to subscribe and automatically download content as soon as it is published.
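A minimal sketch of that idea in Python, assuming a hypothetical feed URL and that each item's <link> points directly at a .torrent file:

import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "http://example.com/releases.rss"   # hypothetical torrent feed

with urllib.request.urlopen(FEED_URL) as response:
    root = ET.fromstring(response.read())

for item in root.iter("item"):
    link = item.findtext("link", default="")
    if link.endswith(".torrent"):
        # A real client would hand this URL to its BitTorrent engine;
        # here we only list the torrents the feed announces.
        print(item.findtext("title"), link)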

RSS Compared to Atom[edit]


Both RSS and Atom are widely supported and are compatible with all major consumer feed readers. RSS
gained wider use because of early feed reader support. But, technically, Atom is more advanced[26] and has several advantages: less restrictive licensing, an IANA-registered MIME type, XML namespace, URI support, and RELAX NG support.[27]
The following table shows RSS elements alongside Atom elements where they are equivalent. Note: "*"
indicates that an element must be provided except for Atom elements "author" and "link" which are only
required under certain conditions.

RSS 2.0                      Atom 1.0

author                       author*
category                     category
channel                      feed
copyright                    rights
description                  subtitle
description*                 summary and/or content
generator                    generator
guid                         id*
image                        logo
item                         entry
lastBuildDate (in channel)   updated*
link*                        link*
managingEditor               author or contributor
pubDate                      published (subelement of entry)
title*                       title*
ttl                          -

See also[edit]

 Comparison of feed aggregators


 DataPortability

 FeedSync (previously Simple Sharing Extensions)

 Mashup (web application hybrid)
