Learning Node
by Shelley Powers
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions
are also available for most titles (http://my.safaribooksonline.com). For more information, contact our
corporate/institutional sales department: 800-998-9938 or [email protected].
Editor: Simon St. Laurent
Production Editor: Rachel Steely
Copyeditor: Rachel Monaghan
Proofreader: Kiel Van Horn
Indexer: Aaron Hazelton, BIM Publishing Services
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrators: Robert Romano and Rebecca Demarest
Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of
O’Reilly Media, Inc. Learning Node, the image of a hamster rat, and related trade dress are trademarks
of O’Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as
trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a
trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and authors assume
no responsibility for errors or omissions, or for damages resulting from the use of the information
contained herein.
ISBN: 978-1-449-32307-3
Table of Contents

Preface

Servers, Streams, and Sockets
    TCP Sockets and Servers
    HTTP
    UDP/Datagram Socket
    Streams, Pipes, and Readline
    Child Processes
        child_process.spawn
        child_process.exec and child_process.execFile
        child_process.fork
        Running a Child Process Application in Windows
    Domain Resolution and URL Processing
    The Utilities Module and Object Inheritance
    Events and EventEmitter

7. The Express Framework
    Express: Up and Running
    The app.js File in More Detail
    Error Handling
    A Closer Look at the Express/Connect Partnership
    Routing
        Routing Path
        Routing and HTTP Verbs
    Cue the MVC
    Testing the Express Application with cURL

Refactoring the Widget Factory
Adding the MongoDB Backend

Acceptance Testing
    Selenium Testing with Soda
    Emulating a Browser with Tobi and Zombie
Performance Testing: Benchmarks and Load Tests
    Benchmark Testing with ApacheBench
    Load Testing with Nodeload
Refreshing Code with Nodemon

Index
Preface
Why Node?
If you explore the source code for Node, you’ll find the source code for Google’s V8,
the JavaScript (technically, ECMAScript) engine that’s also at the core of Google’s
Chrome browser. One advantage to Node.js, then, is that you can develop Node applications
for just one implementation of JavaScript—not half a dozen different browsers
and browser versions.
Node is designed to be used for applications that are heavy on input/output (I/O), but
light on computation. More importantly, it provides this functionality directly out of
the box. You don’t have to worry about the application blocking any further processing
while waiting for a file to finish loading or a database to finish updating, because most
of the functionality is asynchronous I/O by default. And you don’t have to worry about
working with threads, because Node is implemented on a single thread.
Most importantly, Node is written in a language that many traditional web developers
are familiar with: JavaScript. You may be learning how to use new technologies, such
as working with WebSockets or developing to a framework like Express, but at least
you won’t have to learn a new language along with the concepts. This language
familiarity makes it a lot easier to just focus on the new material.
If you’re not sure you’re familiar enough with JavaScript, you might
want to check out my introductory text on JavaScript, Learning JavaScript,
Second Edition (O’Reilly).
How to Best Use This Book
You don’t have to read this book’s chapters in order, but there are paths through the
book that are dependent on what you’re after and how much experience you have with
Node.
If you’ve never worked with Node, then you’re going to want to start with Chapter 1
and read through at least Chapter 5. These chapters cover getting both Node and the
package manager (npm) installed, how to use them, creating your first applications,
and utilizing modules. Chapter 5 also covers some of the style issues associated with
Node, including how to deal with Node’s unique approach to asynchronous
development.
If you have had some exposure to Node, have worked with both the built-in Node
modules and a few external ones, and have also used REPL (read-eval-print loop—the
interactive console), you could comfortably skip Chapter 1–Chapter 4, but I still
recommend starting no later than Chapter 5.
I incorporate the use of the Express framework, which also utilizes the Connect
middleware, throughout the book. If you’ve not worked with Express, you’re going to want
to go through Chapter 6–Chapter 8, which cover the concepts of routing, proxies, web
servers, and middleware, and introduce Express. In particular, if you’re curious about
using Express in a Model-View-Controller (MVC) framework, definitely read
Chapter 7 and Chapter 8.
After these foundation chapters, you can skip around a bit. For instance, if you’re
primarily working with key/value pairs, you’ll want to read the Redis discussion in
Chapter 9; if you’re interested in document-centric data, check out Chapter 10, which
introduces how to use MongoDB with Node. Of course, if you’re going to work only
with a relational database, you can go directly to Chapter 11 and skip the Redis and
MongoDB chapters, though do check them out sometime—they might provide a new
viewpoint to working with data.
After those three data chapters, we get into specialized application use. Chapter 12
focuses purely on graphics and media access, including how to provide media for the
new HTML5 video element, as well as working with PDF documents and Canvas.
Chapter 13 covers the very popular Socket.IO module, especially for working with the
new WebSockets functionality.
After the split into two different specialized uses of Node in Chapter 12 and
Chapter 13, we come back together again at the end of the book. After you’ve had some time
to work with the examples in the other chapters, you’re going to want to spend some
time in Chapter 14, learning in-depth practices for Node debugging and testing.
Chapter 15 is probably one of the tougher chapters, and also one of the more important.
It covers issues of security and authority. I don’t recommend that it be one of the first
chapters you read, but it is essential that you spend time in this chapter before you roll
a Node application out for general use.
Chapter 16 is the final chapter, and you can safely leave it for last, regardless of your
interest and experience. It focuses on how to prepare your application for production
use, including how to deploy your Node application not only on your own system, but
also in one of the cloud servers that are popping up to host Node applications. I’ll also
cover how to deploy a Node application to your server, including how to ensure it plays
well with another web server such as Apache, and how to ensure your application
survives a crash and restarts when the system is rebooted.
Node is heavily connected with the Git source control system, and most (if not all)
Node modules are hosted on GitHub. The Appendix provides a Git/GitHub survival
guide for those who haven’t worked with either.
I mentioned earlier that you don’t have to follow the chapters in order, but I recommend
that you do. Many of the chapters work off effort in previous chapters, and you may
miss out on important points if you skip around. In addition, though there are
numerous standalone examples all throughout the book, I do use one relatively simple Express
application called Widget Factory that begins life in Chapter 7 and is touched on, here
and there, in most of the rest of the chapters. I believe you’ll have a better time with
the book if you start at the beginning and then lightly skim the sections that you know,
rather than skip a chapter altogether.
As the King says in Alice in Wonderland, “Begin at the beginning and go on till you come
to the end: then stop.”
The Technology
The examples in this book were created in various releases of Node 0.6.x. Most were
tested in a Linux environment, but should work, as is, in any Node environment.
Node 0.8.x released just as this book went to production. The examples in the chapters
do work with Node 0.8.x for the most part; I have indicated the instances where you’ll
need to make a code change to ensure that the application works with the newest Node
release.
The Examples
You can find the examples as a compressed file at the O’Reilly web page for this book
(http://oreil.ly/Learning_node). Once you’ve downloaded and uncompressed it, and
you have Node installed, you can install all the dependency libraries for the examples
by changing to the examples directory and typing:
npm install -d
I’ll cover more on using the Node package manager (npm) in Chapter 4.
Conventions Used in This Book
The following typographical conventions are used in this book:
Plain text
Indicates menu titles, menu options, menu buttons, and keyboard accelerators
(such as Alt and Ctrl).
Italic
Indicates new terms, URLs, email addresses, filenames, file extensions, pathnames,
directories, and Unix utilities.
Constant width
Indicates commands, options, switches, variables, attributes, keys, functions,
types, classes, namespaces, methods, modules, properties, parameters, values,
objects, events, event handlers, XML tags, HTML tags, macros, the contents of files,
or the output from commands.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values.
If you feel your use of code examples falls outside fair use or the permission given above,
feel free to contact us at [email protected].
How to Contact Us
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata, examples, and any additional
information. You can access this page at http://oreil.ly/Learning_node.
To comment or ask technical questions about this book, please send email to
[email protected].
For more information about our books, courses, conferences, and news, see our website
at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments
Thanks, as always, to friends and family who help keep me sane when I work on a
book. Special thanks to my editor, Simon St. Laurent, who listened to me vent more
than once.
My thanks also to the production crew who helped take this book from an idea to the
work you’re now holding: Rachel Steely, Rachel Monaghan, Kiel Van Horn, Aaron
Hazelton, and Rebecca Demarest.
When you work with Node, you’re the recipient of a great deal of generosity, starting
with the creator of Node.js, Ryan Dahl, and including the creator of npm, Isaac
Schlueter, who is also now the Node.js gatekeeper.
Others who provided extremely useful code and modules in this book are Bert Belder,
TJ Holowaychuk, Jeremy Ashkenas, Mikeal Rogers, Guillermo Rauch, Jared Hanson,
Felix Geisendörfer, Steve Sanderson, Matt Ranney, Caolan McMahon, Remy Sharp,
Chris O’Hara, Mariano Iglesias, Marco Aurélio, Damián Suárez,
Nathan Rajlich, Christian Amor Kvalheim, and Gianni Chiappetta. My apologies for
any module developers I have inadvertently omitted.
And what would Node be without the good people who provide tutorials, how-tos,
and helpful guides? Thanks to Tim Caswell, Felix Geisendörfer, Mikato Takada, Geo
Paul, Manuel Kiessling, Scott Hanselman, Peter Krumins, Tom Hughes-Croucher, Ben
Nadel, and the entire crew of Nodejitsu and Joyent.
CHAPTER 1
Node.js: Up and Running
Setting Up a Node Development Environment
There is more than one way to install Node in most environments. Which approach
you use is dependent on your existing development environment, your comfort level
working with source code, or how you plan to use Node in your existing applications.
Package installers are provided for both Windows and Mac OS, but you can install
Node by grabbing a copy of the source and compiling the application. You can also use
Git to clone (check out) the Node repo (repository) in all three environments.
In this section I’m going to demonstrate how to get Node working in a Linux system
(an Ubuntu 10.04 VPS, or virtual private server), by retrieving and compiling the source
directly. I’ll also demonstrate how to install Node so that you can use it with Microsoft’s
WebMatrix on a Windows 7 PC.
Download source and basic package installers for Node from
http://nodejs.org/#download. There’s a wiki page providing some basic instruction
for installing Node in various environments at https://github.com/
joyent/node/wiki/Installing-Node-via-package-manager. I also encourage
you to search for the newest tutorials for installing Node in your
environment, as Node is very dynamic.
This book assumes only that you have previous experience with JavaScript
and traditional web development. Given that, I’m erring on the
side of caution and being verbose in descriptions of what you need to
do to install Node.
For both Ubuntu and Debian, you’ll also need to install other libraries. Using the
Advanced Packaging Tool (APT) available in most Debian GNU/Linux systems, you can
ensure the libraries you need are installed with the following commands:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install build-essential openssl libssl-dev pkg-config
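Next, download and unpack the Node source. For the v0.6.18 release referenced below, that step would look something like the following (the URL follows the nodejs.org dist pattern of the time):

wget http://nodejs.org/dist/v0.6.18/node-v0.6.18.tar.gz
tar -zxf node-v0.6.18.tar.gz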
You now have a directory labeled node-v0.6.18. Change into the directory and issue
the following commands to compile and install Node:
./configure
make
sudo make install
If you’ve not used the make utility in Unix before, these three commands set up the
makefile based on your system environment and installation, run a preliminary make to
check for dependencies, and then perform a final make with installation. After
processing these commands, Node should now be installed and accessible globally via the
command line.
The fun challenge of programming is that no two systems are alike. This
sequence of actions should be successful in most Linux environments,
but the operative word here is should.
Notice in the last command that you had to use sudo to install Node. You need root
privileges to install Node this way (see the upcoming note). However, you can install
Node locally by using the following, which installs Node in a given local subdirectory:
mkdir ~/working
./configure --prefix=~/working
make
make install
echo 'export PATH=~/working/bin:${PATH}' >> ~/.bashrc
. ~/.bashrc
So, as you can see here, setting the prefix configuration option to a specified path in
your home directory installs Node locally. You’ll need to remember to update your
PATH environmental variable accordingly.
Although you can install Node locally, if you’re thinking of using this approach to use
Node in your shared hosting environment, think again. Installing Node is just one part
of using Node in an environment. You also need privileges to compile an application,
as well as run applications off of certain ports (such as port 80). Most shared hosting
environments will not allow you to install your own version of Node.
Unless there’s a compelling reason, I recommend installing Node using sudo.
At one time there was a security concern about running the Node
package manager (npm), covered in Chapter 4, with root privilege. However,
those security issues have since been addressed.
You can find the Windows Azure SDK for Node and installation
instructions at https://www.windowsazure.com/en-us/develop/nodejs/.
The other approach to using Node on Windows—in this case, Windows 7—is by
integrating Node into Microsoft’s WebMatrix, a tool for web developers integrating open
source technologies. Here are the steps we’ll need to take to get Node up and running
with WebMatrix in Windows 7:
Once the WebMatrix installation is finished, install the latest version of Node using
the installer provided at the primary Node site (http://nodejs.org/#download).
Installation is one-click, and once you’re finished you can open a Command window and type
node to check for yourself that the application is operational, as shown in Figure 1-2.
For Node to work with IIS in Windows, install iisnode, a native IIS 7.x module created
and maintained by Tomasz Janczuk. As with Node, installation is a snap using the
prebuilt installation package, available at https://github.com/tjanczuk/iisnode. There are
x86 and x64 installations, but for x64, you’ll need to install both.
During the iisnode installation, a window may pop up telling you that you’re missing
the Microsoft Visual C++ 2010 Redistributable Package, as shown in Figure 1-3. If so,
you’ll need to install this package, making sure you get the one that matches the version
of iisnode you’re installing—either the x86 package (available at
http://www.microsoft.com/download/en/details.aspx?id=5555) or the x64 package (available at
http://www.microsoft.com/download/en/details.aspx?id=14632), or both. Once you’ve installed the
requisite package, run the iisnode installation again.
Figure 1-3. Message warning us that we need to install the C++ redistributable package
If you want to install the iisnode samples, open a Command window with administrator
privileges, go to the directory where iisnode is installed—either Program Files for
64-bit, or Program Files (x86)—and run the setupsamples.bat file.
Figure 1-5 shows WebMatrix once the site has been generated. Click the Run button,
located in the top left of the page, and a browser page should open with the ubiquitous
“Hello, world!” message displayed.
If you’re running the Windows Firewall, the first time you run a Node application, you
may get a warning like that shown in Figure 1-6. You need to let the Firewall know this
application is acceptable by checking the “Private networks” option and then the
“Allow access” button. You want to restrict communication to just your private network
on your development machine.
Figure 1-6. Warning that the Windows Firewall blocked Node application, and the option to bypass
If you look at the generated files for your new WebMatrix Node project, you’ll see one
named app.js. This is the Node file, and it contains the following code:
}).listen(process.env.PORT || 8080);
What this all means, I’ll get into in the second part of this chapter. The important item
to take away from this code right now is that we can run this same application in any
operating system where Node is installed and get the exact same functionality: a service
that returns a simple message to the user.
Updating Node
Node stable releases are even numbered, such as the current 0.8.x, while the
development releases are odd numbered (currently 0.9.x). I recommend sticking with stable
releases only—at least until you have some experience with Node.
Updating your Node installation isn’t complicated. If you used the package installer,
using it for the new version should just override the old installation. If you’re working
directly with the source, you can always uninstall the old source and install the new if
you’re concerned about potential clutter or file corruption. In the Node source
directory, just issue the uninstall make option:
make uninstall
Download the new source, compile it, and install it, and you’re ready to go again.
The challenge with updating Node is determining whether a specific environment,
module, or other application works with the new version. In most cases, you shouldn’t
have version troubles. However, if you do, there is an application you can use to
“switch” Node versions. The application is the Node Version Manager (Nvm).
You can download Nvm from GitHub, at https://github.com/creationix/nvm. Like
Node, Nvm must be compiled and installed on your system.
To install a specific version of Node, install it with Nvm:
nvm install v0.4.1
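To switch to that version once it’s installed, use the nvm use command:

nvm use v0.4.1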
Node: Jumping In
Now that you have Node installed, it’s time to jump into your first application.
// content header
res.writeHead(200, {'content-type': 'text/plain'});
The code is saved in a file named helloworld.js. As server-side functionality goes, this
Node application is neither too verbose, nor too cryptic; one can intuit what’s
happening, even without knowing Node. Best of all, it’s familiar since it’s written in a
language we know well: JavaScript.
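Putting the fragments together, a minimal helloworld.js consistent with this description would look something like the following sketch (the exact message string is an assumption):

var http = require('http');

http.createServer(function (req, res) {

   // content header
   res.writeHead(200, {'content-type': 'text/plain'});

   // write the message and signal that the response is complete
   res.end('Hello, World!\n');
}).listen(8124);

console.log('Server running at 8124');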
To run the application, from the command line in Linux, the Terminal window in Mac
OS, or the Command window in Windows, type:
node helloworld.js
The following is printed to the command line once the program has successfully started:
Server running at 8124
Now, access the site using any browser. If the application is running on your local
machine, you’ll use localhost:8124. If it’s running remotely, use the URL of the remote
site, with the 8124 port. A web page with the words “Hello, World!” is displayed.
You’ve now created your first complete and working Node application.
Since we didn’t use an ampersand (&) following the node command—telling the
application to run in the background—the application starts and doesn’t return you to
the command line. You can continue accessing the application, and the same words
get displayed. The application continues until you type Ctrl-C to cancel it, or otherwise
kill the process.
If you want to run the application in the background within a Linux system, use the
following:
node helloworld.js &
However, you’ll then have to find the process identifier using ps -ef, and manually kill
the right process—in this case, the one with the process identifier 3747—using kill:
ps -ef | grep node
kill 3747
You won’t be able to start another Node application listening at the same port: you can
run only one Node application against one port at a time. If you’re running Apache at
port 80, you won’t be able to run the Node application at this port, either. You must
use a different port for each application.
You can also add helloworld.js as a new file to the existing WebMatrix website you
created earlier, if you’re using WebMatrix. Just open the site, choose the “New File...”
option from the menu bar, and add the text shown in Example 1-1 to the file. Then
click the Run button.
WebMatrix overrides the port in the application. When you run the
application, you’ll access the application from the port defined for the
project, not the port specified in the http.Server.listen method.
var http = require('http');
Most Node functionality is provided through external applications and libraries called
modules. This line of JavaScript loads the HTTP module, assigning it to a local variable.
The HTTP module provides basic HTTP functionality, enabling network access of the
application.
The next line of code is:
http.createServer(function (req, res) { ...
In this line of code, a new server is created with createServer, and an anonymous
function is passed as the parameter to the function call. This anonymous function is
the requestListener function, and has two parameters: a server request
(http.ServerRequest) and a server response (http.ServerResponse).
Within the anonymous function, we have the following line:
res.writeHead(200, {'content-type': 'text/plain'});
The http.ServerResponse object has a method, writeHead, that sends a response header
with the response status code (200), as well as provides the content-type of the
response. You can also include other response header information within the headers
object, such as content-length or connection:
{ 'content-length': '123',
'content-type': 'text/plain',
'connection': 'keep-alive',
'accept': '*/*' }
and then:
res.end();
The anonymous function and the createServer function are both finished on the next
line in the code:
}).listen(8124);
The http.Server.listen method chained at the end of the createServer method listens
for incoming connections on a given port—in this case, port 8124. Optional parameters
are a hostname and a callback function. If a hostname isn’t provided, the server accepts
connections to web addresses, such as http://oreilly.com or
http://examples.burningbird.net.
The listen method is asynchronous, which means the application doesn’t block
program execution while waiting for the connection to be established. Whatever code
follows the listen call is processed, and the listen callback function is invoked when the
listening event is fired—when the port connection is established.
The last line of code is:
console.log('Server running on 8124/');
The console object is one of the objects from the browser world that is incorporated
into Node. It’s a familiar construct for most JavaScript developers, and provides a way
to output text to the command line (or development environment), rather than to the
client.
A new module, File System (fs), is used in this example. The File System module wraps
standard POSIX file functionality, including opening up and accessing the contents
from a file. The method used is readFile. In Example 1-2, it’s passed the name of the
file to open, the encoding, and an anonymous function.
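A sketch of what that pattern looks like in practice (the filename here is illustrative):

var http = require('http'),
    fs = require('fs');

http.createServer(function (req, res) {
   // open and read the file asynchronously; respond in the callback
   fs.readFile('helloworld.js', 'utf8', function (err, data) {
      res.writeHead(200, {'content-type': 'text/plain'});
      if (err)
         res.write('Could not find or open file for reading\n');
      else
         res.write(data);
      res.end();
   });
}).listen(8124, function () {
   // the listen callback, invoked when the listening event fires
   console.log('server bound to port 8124');
});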
The two instances of asynchronous behavior I want to point out in Example 1-2 are
the callback function that’s attached to the readFile method, and the callback function
attached to the listen method.
As discussed earlier, the listen method tells the HTTP server object to begin listening
for connections on the given port. Node doesn’t block, waiting for the connection to
be established, so if we need to do something once the connection is established, we
provide a callback function, as shown in Example 1-2.
When the connection is established, a listening event is emitted, which then invokes
the callback function, outputting a message to the console.
The second, more important callback instance is the one attached to readFile.
Accessing a file is a time-consuming operation, relatively speaking, and a single-threaded
application accessed by multiple clients that blocked on file access would soon bog
down and be unusable.
Instead, the file is opened and the contents are read asynchronously. Only when the
contents have been read into the data buffer—or an error occurs during the process—
is the callback function passed to the readFile method called. It’s passed the error (if
any), and the data if no error occurs.
var counter = 0;
// content header
res.writeHead(200, {'Content-Type': 'text/plain'});
The loop to print out the numbers is used to delay the application, similar to what
could happen if you performed a computationally intensive process and then blocked
until the process was finished. The setTimeout function is another asynchronous
function, which in turn invokes a second asynchronous function: readFile. The application
combines both asynchronous and synchronous processes.
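A sketch reconstructing the shape of Example 1-3 from its fragments and described behavior (names other than counter are assumptions):

var http = require('http'),
    fs = require('fs');

http.createServer(function (req, res) {

   // extract the file parameter from the query string
   var query = require('url').parse(req.url, true).query;
   var app = query.file + '.txt';

   // content header
   res.writeHead(200, {'Content-Type': 'text/plain'});

   // synchronous loop: nothing else is processed until it finishes
   var counter = 0;   // removing var here makes counter global across requests
   for (var i = 0; i < 100; i++) {
      counter++;
      res.write(counter + '\n');
   }

   // asynchronous delay, which in turn invokes the asynchronous readFile
   setTimeout(function () {
      console.log('opening ' + app);
      fs.readFile(app, 'utf8', function (err, data) {
         if (err)
            res.write('Could not find or open file for reading\n');
         else
            res.write(data);
         res.end();
      });
   }, 2000);
}).listen(8124);

console.log('Server running at 8124/');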
Create a text file named main.txt, containing any text you want. Running the applica-
tion and accessing the page from Chrome with a query string of file=main generates
the following console output:
Server running at 8124/
opening main.txt
opening undefined.txt
The first two lines are expected. The first is the result of running console.log at the end
of the application, and the second is a printout of the file being opened. But what’s
undefined.txt in the third line?
When processing a web request from a browser, be aware that browsers may send more
than one request. For instance, a browser may also send a second request, looking for
a favicon.ico. Because of this, when you’re processing the query string, you must check
to see if the data you need is being provided, and ignore requests without the data.
So far, all we’ve done is test our Node applications from a browser. This isn’t really
putting much stress on the asynchronous nature of the Node application.
Example 1-4. Simple application to call the new Node application 2,000 times
var http = require('http');
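// (the rest of Example 1-4, reconstructed as a sketch from the surrounding
// description; the option values and callback name are assumptions)
var options = {
   host: 'localhost',
   port: 8124,
   path: '/?file=secondary',
   method: 'GET'
};

var requestComplete = function (response) {
   console.log('finished request');
};

// fire off 2,000 requests, ending each one to close its connection
for (var i = 0; i < 2000; i++) {
   http.request(options, requestComplete).end();
}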
Create the second text file, named secondary.txt. Put whatever you wish in it, but make
the contents obviously different from main.txt.
After making sure the Node application is running, start the test application:
node test.js
As the test application is running, access the application using your browser. If you
look at the console messages being output by the application, you’ll see it process both
your manual and the test application’s automated requests. Yet the results are
consistent with what we would expect, a web page with:
• The numbers 1 through 100 printed out
• The contents of the text file—in this case, main.txt
Now, let’s mix things up a bit. In Example 1-3, make the counter global rather than
local to the loop function, and start the application again. Then run the test program
and access the page in the browser.
This example demonstrates how absolutely critical the use of var is with
Node.
Benefits of Node
By now you have a working Node installation—possibly even more than one.
You’ve also had a chance to create a couple of Node applications and test out the
differences between synchronous and asynchronous code (and what happens if you
accidentally forget the var keyword).
Node isn’t all asynchronous function calls. Some objects may provide both synchro-
nous and asynchronous versions of the same function. However, Node works best
when you use asynchronous coding as much as possible.
The Node event loop and callback functions have two major benefits.
First, the application can easily scale, since a single thread of execution doesn’t have
an enormous amount of overhead. If we were to create a PHP application similar to the
Node application in Example 1-3, the user would see the same page—but your system
would definitely notice the difference. If you ran the PHP application in Apache with
the default prefork MPM, each time the application was requested, it would have to be
handled in a separate child process. Chances are, unless you have a significantly loaded
system, you’ll only be able to run—at most—a couple of hundred child processes in
parallel. More than that number of requests means that a client needs to wait for a
response.
A second benefit to Node is that you minimize resource usage, but without having to
resort to multithreaded development. In other words, you don’t have to create a
thread-safe application. If you’ve ever developed a thread-safe application previously, you’re
probably feeling profoundly glad at this statement.
However, as was demonstrated in the last example application, you aren’t developing
JavaScript applications for single users to run in the browser, either. When you work
with asynchronous applications, you need to make sure that you don’t build in
dependencies on one asynchronous function call finishing ahead of another, because there
are no guarantees—not unless you call the second function call within the code of the
first. In addition, global variables are extremely hazardous in Node, as is forgetting the
var keyword.
Still, these are issues we can work with—especially considering the benefits of Node’s
low resource requirements and not having to worry about threads.
A final reason for liking Node? You can code in JavaScript without
having to worry about IE6.
While you’re exploring the use of Node and figuring out the code for your custom
module or Node application, you don’t have to type JavaScript into a file and run it
with Node to test your code. Node also comes with an interactive component known
as REPL, or read-eval-print loop, which is the subject of this chapter.
REPL (pronounced “repple”) supports a simplified Emacs style of line editing and a
small set of basic commands. Whatever you type into REPL is processed no differently
than if you had typed the JavaScript into a file and run the file using Node. You can
actually use REPL to code your entire application—literally testing the application on
the fly.
In this chapter, I’ll also cover some interesting quirks of REPL, along with some ways
you can work around them. These workarounds include replacing the underlying
mechanism that persists commands, as well as using some command-line editing.
Lastly, if the built-in REPL doesn’t provide exactly what you need for an interactive
environment, there’s also an API to create your own custom REPL, which I’ll
demonstrate in the latter part of the chapter.
REPL then provides a command-line prompt—an angle bracket (>)—by default. Any-
thing you type from this point on is processed by the underlying V8 JavaScript engine.
REPL is very simple to use. Just start typing in your JavaScript, like you’d add it to a file:
> a = 2;
2
The tool prints out the result of whatever expression you just typed. In this session
excerpt, the value of the expression is 2. In the following, the expression result is an
array with three elements:
> b = ['a','b','c'];
[ 'a', 'b', 'c' ]
To access the last expression, use the underscore/underline character (_). In the
following, a is set to 2, and the resulting expression is incremented by 1, and then 1 again:
> a = 2;
2
> _ ++;
3
> _ ++;
4
You can even access properties or call methods on the underscored expression:
> ['apple','orange','lime']
[ 'apple', 'orange', 'lime' ]
> _.length
3
> 3 + 4
7
> _.toString();
'7'
You can use the var keyword with REPL in order to access an expression or value at a
later time, but you might get an unexpected result. For instance, the following line in
REPL:
var a = 2;
doesn’t return the value 2; it returns a value of undefined. The reason is that the result
of the expression is undefined, since variable assignment doesn’t return a result when
evaluated.
Consider the following instead, which is what’s happening, more or less, under the
hood in REPL:
console.log(eval('a = 2'));
console.log(eval('var a = 2'));
Typing the preceding lines into a file and running that file using Node returns:
2
undefined
There is no result from the second call to eval, and hence the value returned is
undefined. Remember, REPL is a read-eval-print loop, with emphasis on the eval.
Still, you can use the variable in REPL, just as you would in a Node application:
The latter two command lines do have results, which are printed out by REPL.
To end the REPL session, either press Ctrl-C twice, or Ctrl-D once. We’ll cover other
ways to end the session later, in “REPL Commands” on page 27.
Consider the expression 3 > 2 > 1, which REPL evaluates to false. This is a good example
of how REPL can be useful. At first glance, we
might expect the expression we typed to evaluate to true, since 3 is greater than 2,
which is greater than 1. However, in JavaScript, expressions are evaluated left to right,
and each expression’s result is returned for the next evaluation.
A better way of looking at what’s happening with the preceding code snippet is this
REPL session:
> 3 > 2 > 1;
false
> 3 > 2;
true
> true > 1;
false
Now the result makes more sense. What’s happening is that the expression 3 > 2 is
evaluated, returning true. But then the value of true is compared to the numeric 1.
JavaScript provides automatic data type conversion, after which true and 1 are equiv-
alent values. Hence, true is not greater than 1, and the result is false.
REPL’s helpfulness is in enabling us to discover these little interesting quirks in Java-
Script. Hopefully, after testing our code in REPL, we don’t have unexpected side effects
in our applications (such as expecting a result of true but getting a result of false).
Since you didn’t use the var keyword, the expression result is printed out—in this
instance, the interface for the querystring object. How’s that for a bonus? Not only are
you getting access to the object, but you’re also learning more about the object’s in-
terface while you’re at it. However, if you want to forgo the potentially lengthy output
of text, use the var keyword:
> var qs = require('querystring');
You’ll be able to access the querystring object with the qs variable with either approach.
In addition to being able to incorporate external modules, REPL gracefully handles
multiline expressions, providing a textual indicator of code that’s nested following an
opening curly brace ({):
> var test = function (x, y) {
... var val = x * y;
... return val;
... };
undefined
> test(3,4);
12
REPL provides repeating dots to indicate that everything that’s being typed follows an
open curly brace and hence the command isn’t finished yet. It does the same for an
open parenthesis, too:
> test(4,
... 5);
20
Increasing levels of nesting generates more dots; this is necessary in an interactive
environment, where you might lose track of where you are, as you type:
> var test = function (x, y) {
... var test2 = function (x, y) {
..... return x * y;
You can type in, or copy and paste in, an entire Node application and run it from REPL:
> var http = require('http');
undefined
> http.createServer(function (req, res) {
...
... // content header
... res.writeHead(200, {'Content-Type': 'text/plain'});
...
... res.end("Hello person\n");
... }).listen(8124);
{ connections: 0,
allowHalfOpen: true,
_handle:
{ writeQueueSize: 0,
onconnection: [Function: onconnection],
socket: [Circular] },
_events:
{ request: [Function],
connection: [Function: connectionListener] },
httpAllowHalfOpen: false }
>
undefined
> console.log('Server running at http://127.0.0.1:8124/');
Server running at http://127.0.0.1:8124/
undefined
You can access this application from a browser no differently than if you had typed the
text into a file and run it using Node. And again, the responses back from REPL can
provide an interesting look at the code, as shown in the boldfaced text.
In fact, my favorite use of REPL is to get a quick look at objects. For instance, the Node
core object global is sparsely documented at the Node.js website. To get a better look,
I opened up a REPL session and passed the object to the console.log method like so:
> console.log(global)
I could have done the following, which has the same result:
> gl = global;
I’m not replicating what was displayed in REPL; I’ll leave that for you to try on your
own installation, since the interface for global is so large. The important point to take
away from this exercise is that we can, at any time, quickly and easily get a look
at an object’s interface. It’s a handy way of remembering what a method is called, or
what properties are available.
You can use the up and down arrow keys to traverse through the commands you’ve
typed into REPL. This can be a handy way of reviewing what you’ve done, as well as a
way of editing what you’ve done, though in a somewhat limited capacity.
Consider the following session in REPL:
> var myFruit = function(fruitArray,pickOne) {
... return fruitArray[pickOne - 1];
... }
undefined
> fruit = ['apples','oranges','limes','cherries'];
[ 'apples',
'oranges',
'limes',
'cherries' ]
> myFruit(fruit,2);
'oranges'
> myFruit(fruit,0);
undefined
> var myFruit = function(fruitArray,pickOne) {
... if (pickOne <= 0) return 'invalid number';
... return fruitArray[pickOne - 1];
... };
undefined
> myFruit(fruit,0);
'invalid number'
> myFruit(fruit,1);
'apples'
Though it’s not demonstrated in this printout, when I modified the function to check
the input value, I actually arrowed up through the content to the beginning function
declaration, and then hit Enter to restart the function. I added the new line, and then
again used the arrow keys to repeat previously typed entries until the function was
finished. I also used the up arrow key to repeat the function call that resulted in an
undefined result.
It seems like a lot of work just to avoid retyping something so simple, but consider
working with regular expressions, such as the following:
> var ssRe = /^\d{3}-\d{2}-\d{4}$/;
undefined
> ssRe.test('555-55-5555');
true
> var decRe = /^\s*(\+|-)?((\d+(\.\d+)?)|(\.\d+))\s*$/;
undefined
> decRe.test(56.5);
true
If you’re concerned about spending a lot of time coding in REPL with nothing to show
for it when you’re done, no worries: you can save the results of the current context with
the .save command. It and the other REPL commands are covered in the next section.
REPL Commands
REPL has a simple interface with a small set of useful commands. In the preceding
section, I mentioned .save. The .save command saves your inputs in the current object
context into a file. Unless you specifically created a new object context or used
the .clear command, the context should comprise all of the input in the current REPL
session:
> .save ./dir/session/save.js
Only your inputs are saved, as if you had typed them directly into a file using a text
editor.
Here is the complete list of REPL commands and their purposes:
.break
If you get lost during a multiline entry, typing .break will start you over again.
You’ll lose the multiline content, though.
Mac users should use the appropriate installer for these applications. Windows users
have to use a Unix environmental emulator, such as Cygwin.
Here’s a quick and visual demonstration of using REPL with rlwrap to change the REPL
prompt to purple:
env NODE_NO_READLINE=1 rlwrap -ppurple node
If I always want my REPL prompt to be purple, I can add an alias to my bashrc file:
alias node="env NODE_NO_READLINE=1 rlwrap -ppurple node"
To change both the prompt and the color, I’d add the -S option, as in the session
below, which displays the prompt ::> in purple.
The especially useful component of rlwrap is its ability to persist history across REPL
sessions. By default, we have access to command-line history only within a REPL
session. By using rlwrap, the next time we access REPL, not only will we have access to a
history of commands within the current session, but also a history of commands in past
sessions (and other command-line entries). In the following session output, the
commands shown were not typed in, but were instead pulled from history with the up arrow
key:
# env NODE_NO_READLINE=1 rlwrap -ppurple -S "::>" node
::>e = ['a','b'];
[ 'a', 'b' ]
::>3 > 2 > 1;
false
As helpful as rlwrap is, we still end up with undefined every time we type in an
expression that doesn’t return a value. However, we can adjust this, and other functionality,
just by creating our own custom REPL, discussed next.
Custom REPL
Node provides us access to creating our own custom REPL. To do so, first we need to
include the REPL module (repl):
var repl = require("repl");
To create a new REPL, we call the start method on the repl object. The syntax for this
method is:
repl.start([prompt], [stream], [eval], [useGlobal], [ignoreUndefined]);
All of the parameters are optional. If not provided, default values will be used for each
as follows:
prompt
Default is >.
stream
Default is process.stdin.
eval
Default is the async wrapper for eval.
useGlobal
Default is false to start a new context rather than use the global object.
ignoreUndefined
Default is false; don’t ignore the undefined responses.
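A custom REPL that changes the prompt and suppresses the undefined responses needs only a couple of lines. A sketch consistent with the session that follows (the prompt string matches it; the null arguments request the defaults):

var repl = require('repl');

repl.start("node via stdin> ", null, null, null, true);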
Then I used the custom REPL just like I use the built-in version, except now I have a
different prompt and no longer get the annoying undefined after the first variable
assignment. I do still get the other responses that aren’t undefined:
node via stdin> var ct = 0;
node via stdin> ct++;
0
node via stdin> console.log(ct);
1
node via stdin> ++ct;
2
node via stdin> console.log(ct);
2
In my code, I wanted the defaults for all but prompt and ignoreUndefined. Setting the
other parameters to null triggers Node to use the default values for each.
You can also replace the eval function with your own custom function. The only requirement is
that it has a specific format:
function eval(cmd, callback) {
callback(null, result);
}
The stream option is interesting. You can run multiple versions of REPL, taking input
from both the standard input (the default), as well as sockets. The documentation for
REPL at the Node.js site provides an example of a REPL listening in on a TCP socket,
using code similar to the following:
var repl = require("repl"),
net = require("net");
net.createServer(function (socket) {
repl.start("node via TCP socket> ", socket);
}).listen(8124);
Figure 2-1. PuTTY and REPL via TCP don’t exactly like each other
I also tried with the Windows 7 Telnet client, and the response was even worse.
However, using my Linux Telnet client worked without a hitch.
The problem here, as you might expect, is Telnet client settings. However, I didn’t
pursue it further, because running REPL from an exposed Telnet socket is not
something I plan to implement, and not something I would recommend, either—at least,
not without heavy security. It’s like using eval() in your client-side code, and not
scrubbing the text your users send you to run—but worse.
You could keep a running REPL and communicate via a Unix socket with something
like the GNU Netcat utility:
nc -U /tmp/node-repl-sock
You can type in commands no differently than typing them in using stdin. Be aware,
though, if you’re using either a TCP or Unix socket, that any console.log commands
are printed out to the server console, not to the client:
console.log(someVariable); // actually printed out to server
var repl = require('repl');
// start a custom REPL (the prompt matches the session below) and grab its context
var context = repl.start(">>", null).context;
// preload in modules
context.http = require('http');
context.util = require('util');
context.os = require('os');
Running the application with Node brings up the REPL prompt, where we can then
access the modules:
>>os.hostname();
'einstein'
>>util.log('message');
5 Feb 11:33:15 - message
>>
If you want to run the REPL application like an executable in Linux, add the following
line as the first line in the application:
#!/usr/local/bin/node
If you’ll be spending a lot of time developing in REPL, even with the use of rlwrap to
persist history, you’re going to want to frequently save your work. Working in REPL
is no different than working in other editing environments, so I’ll repeat: stuff
happens—save often.
REPL has had a major facelift in Node 0.8. For instance, just typing the
built-in module name, such as fs, loads the module now. Other
improvements are noted in the new REPL documentation at the primary
Node.js website.
Chapter 1 provided a first look at a Node application with the traditional (and always
entertaining) Hello, World application. The examples in the chapter made use of a
couple of modules from what is known as the Node core: the API providing much of
the functionality necessary for building Node applications.
In this chapter, I’m going to provide more detail on the Node core system. It’s not an
exhaustive overview, since the API is quite large and dynamic in nature. Instead, we’ll
focus on key elements of the API, and take a closer look at those that we’ll use in later
chapters and/or are complex enough to need a more in-depth review.
Topics covered in this chapter include:
• Node global objects, such as global, process, and Buffer
• The timer methods, such as setTimeout
• A quick overview of socket and stream modules and functionality
• The Utilities object, especially the part it plays in Node inheritance
• The EventEmitter object and events
Node provides a number of global objects to all applications; most aren’t
necessarily anything we’d access or need to know about directly. Some, though, are important enough
important enough for us to take a closer look at, because they help define key aspects
of how Node works.
In particular, we’re going to explore:
• The global object—that is, the global namespace
• The process object, which provides essential functionality, such as wrappers for
the three STDIO (Standard IO) streams, and functionality to transform a
synchronous function into an asynchronous callback
• The Buffer class, a global object that provides raw data storage and manipulation
• Child processes
• Modules useful for domain resolution and URL processing
global
global is the global namespace object. In some ways, it’s similar to window in a browser
environment, in that it provides access to global properties and methods and doesn’t
have to be explicitly referenced by name.
From REPL, you can print out the global object to the console like so:
> console.log(global)
What prints out is the interface for all of the other global objects, as well as a good deal
of information about the system in which you’re running.
I mentioned that global is like the window object in a browser, but there are key
differences—and not just the methods and properties available. The window object in a
browser is truly global in nature. If you define a global variable in client-side JavaScript,
it’s accessible by the web page and by every single library. However, if you create a
variable at the top-level scope in a Node module (a variable outside a function), it only
becomes global to the module, not to all of the modules.
You can actually see what happens to the global object when you define a module/
global variable in REPL. First, define the top-level variable:
> var test = "This really isn't global, as we know global";
You should see your variable, as a new property of global, at the bottom. For another
interesting perspective, assign global to a variable, but don’t use the var keyword:
gl = global;
The global object interface is printed out to the console, and at the bottom you’ll see
the local variable assigned as a circular reference:
Any other global object or method, including require, is part of the global object’s
interface.
When Node developers discuss context, they’re really referring to the global object. In
Example 2-1 in Chapter 2, the code accessed the context object when creating a custom
REPL object. The context object is a global object. When an application creates a
custom REPL, it exists within a new context, which in this case means it has its own
global object. The way to override this and use the existing global object is to create a
custom REPL and set the useGlobal flag to true, rather than the default of false.
Modules exist in their own global namespace, which means that if you define a top-
level variable in one module, it is not available in other modules. More importantly, it
means that only what is explicitly exported from the module becomes part of whatever
application includes the module. In fact, you can’t access a top-level module variable
in an application or other module, even if you deliberately try.
To demonstrate, the following code contains a very simple module that has a top-level
variable named globalValue, and functions to set and return the value. In the function
that returns the value, the global object is printed out using a console.log method call.
var globalValue;
exports.setGlobal = function(val) {
globalValue = val;
};
exports.returnGlobal = function() {
console.log(global);
return globalValue;
};
We might expect that in the printout of the global object we’ll see globalValue, as we
do when we set a variable in our applications. This doesn’t happen, though.
Start a REPL session and issue a require call to include the new module:
> var mod1 = require('./mod1.js');
Set the value and then ask for the value back:
> mod1.setGlobal(34);
> var val = mod1.returnGlobal();
The console.log method prints out the global object before returning its globally
defined value. We can see at the bottom the new variable holding a reference to the
imported module, but val is undefined because the variable hasn’t yet been set. In
addition, the output includes no reference to that module’s own top-level globalValue:
If we ran the command again, then the outer application variable would be set, but we
still wouldn’t see globalValue:
mod1: { setGlobal: [Function], returnGlobal: [Function] },
_: undefined,
val: 34 }
The only access we have to the module data is by whatever means the module provides.
For JavaScript developers, this means no more unexpected and harmful data collisions
because of accidental or intentional global variables in libraries.
process
Each Node application is an instance of a Node process object, and as such, comes
with certain built-in functionality.
Many of the process object’s methods and properties provide identification or
information about the application and its environment. The process.execPath method
returns the execution path for the Node application; process.version provides the Node
version; and process.platform identifies the server platform:
console.log(process.execPath);
console.log(process.version);
console.log(process.platform);
This code returns the following in my system (at the time of this writing):
/usr/local/bin/node
v0.6.9
linux
The process object also wraps the STDIO streams stdin, stdout, and stderr. Both
stdin and stdout are asynchronous, and are readable and writable, respectively.
stderr, however, is a synchronous, blocking stream.
To demonstrate how to read and write data from stdin and stdout, in Example 3-1 the
Node application listens for data in stdin, and repeats the data to stdout. The stdin
stream is paused by default, so we have to issue a resume call before sending data.
Example 3-1. Reading and writing data to stdin and stdout, respectively
process.stdin.resume();
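// (the remainder of Example 3-1, reconstructed as a sketch from the
// description in the surrounding text)
process.stdin.setEncoding('utf8');

process.stdin.on('data', function (chunk) {
   // echo each chunk of input back out
   process.stdout.write('data: ' + chunk);
});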
Run the application using Node, and then start typing into the terminal. Every time
you type something and press Enter, what you typed is reflected back to you.
The heapTotal and heapUsed properties refer to the V8 engine’s memory usage.
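Both are properties of the object returned by process.memoryUsage:

console.log(process.memoryUsage());
// the output includes rss, heapTotal, and heapUsed properties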
The last process method I’m going to cover is process.nextTick. This method attaches
a callback function that’s fired during the next tick (loop) in the Node event loop.
You would use process.nextTick if you wanted to delay a function for some reason,
but you wanted to delay it asynchronously. A good example would be if you’re creating
a new function that has a callback function as a parameter and you want to ensure that
the callback is truly asynchronous. The following code is a demonstration:
var asynchFunction = function (data, callback) {
   process.nextTick(function() {
      callback(data);   // deferred to the next tick, so always asynchronous
   });
};
If we just called the callback function, then the action would be synchronous. Now,
the callback function won’t be called until the next tick in the event loop, rather than
right away.
You could use setTimeout with a zero (0) millisecond delay instead of process.nextTick:
setTimeout(function() {
   callback(data);
}, 0);
Buffer
The Buffer class, also a global object, is a way of handling binary data in Node. In the
section “Servers, Streams, and Sockets” on page 41 later in the chapter, I’ll cover the
fact that streams are often binary data rather than strings. To convert the binary data
to a string, the data encoding for the stream socket is changed using setEncoding.
As a demonstration, you can create a new buffer with the following:
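var buf = new Buffer(string);   // string stands in for the content to store
                                // (the Node 0.6/0.8-era Buffer constructor)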
If the buffer holds a string, you can pass in an optional second parameter with the
encoding. Possible encodings are:
ascii
Seven-bit ASCII
utf8
Multibyte encoded Unicode characters
ucs2
Two bytes, little-endian-encoded Unicode characters
base64
Base64 encoding
hex
Encodes each byte as two hexadecimal characters
You can also write a string to an existing buffer, providing an optional offset,
length, and encoding:
buf.write(string);  // offset defaults to 0, length defaults to
                    // buffer.length - offset, encoding is utf8
Data sent between sockets is transmitted as a buffer (in binary format) by default. To
send a string instead, you either need to call setEncoding directly on the socket, or
specify the encoding in the function that writes to the socket. By default, the TCP
(Transmission Control Protocol) socket.write method sets the second parameter to
utf8, but the socket returned in the connectionListener callback to the TCP
createServer function sends the data as a buffer, not a string.
Node also supports the familiar timer functions. In one use of setTimeout, the
callback function on_OpenAndReadFile opens and reads a file to the HTTP response
when the function is called after approximately 2,000 milliseconds have passed.
The function clearTimeout clears a preset setTimeout. If you need to have a repeating
timer, you can use setInterval to call a function every n milliseconds—n being the
second parameter passed to the function. Clear the interval with clearInterval.
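For instance, a minimal sketch (the interval length and messages are illustrative):

// print a tick every second
var timer = setInterval(function() {
  console.log('tick');
}, 1000);

// stop the repeating timer after ten seconds
setTimeout(function() {
  clearInterval(timer);
}, 10000);

TCP Sockets and Servers
The Net module provides the TCP functionality. You create a TCP server with net.createServer, passing in a connection listener. The excerpt that follows closes out such a server; here is a sketch of its elided top, consistent with the description and output below:

var net = require('net');

var server = net.createServer(function(conn) {
  console.log('connected');

  // echo the data back, and log the client's address and port
  conn.on('data', function (data) {
    console.log(data + ' from ' + conn.remoteAddress + ' ' + conn.remotePort);
    conn.write('Repeating: ' + data);
  });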
conn.on('close', function() {
console.log('client closed connection');
});
}).listen(8124);
Node objects that inherit from a special object, the EventEmitter, expose
the on method for event handling, as discussed later in this chapter.
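The matching client application is also excerpted; a sketch of its elided top creates the socket and sets the encoding:

var net = require('net');

var client = new net.Socket();
client.setEncoding('utf8');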
// connect to server
client.connect('8124', 'localhost', function () {
console.log('connected to server');
client.write('Who needs a browser to communicate?');
});
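The rest of the client (a sketch, reconstructed from the description that follows) forwards terminal input to the server, prints what comes back, and reports when the connection closes:

// prepare for input from the terminal
process.stdin.resume();
process.stdin.on('data', function (data) {
  client.write(data);
});

// when data comes back from the server, print it out
client.on('data', function(data) {
  console.log(data);
});

// when the server closes the connection
client.on('close', function() {
  console.log('connection is closed');
});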
The data being transmitted between the two sockets is typed in at the terminal, and
transmitted when you press Enter. The client application first sends the string you just
typed, which the TCP server writes out to the console. The server repeats the message
back to the client, which in turn writes the message out to the console. The server also
prints out the IP address and port for the client using the socket’s remoteAddress and
remotePort properties. Following is the console output for the server after several strings
were sent from the client (with the IP address edited out for security):
Hey, hey, hey, hey-now.
from #ipaddress 57251
Don't be mean, we don't have to be mean.
from #ipaddress 57251
Cuz remember, no matter where you go,
from #ipaddress 57251
The connection between the client and server is maintained until you kill one or the
other using Ctrl-C. Whichever socket is still open receives a close event that’s printed
out to the console. The server can also serve more than one connection from more than
one client, since all the relevant functions are asynchronous.
As I mentioned earlier, TCP is the underlying transport mechanism for much of the
functionality we use on the Internet today, including HTTP, which we’ll cover next.
HTTP
You had a chance to work with the HTTP module in Chapter 1. We created servers
using the createServer method, passing in the function that will act as the
requestListener. Requests are processed as they come, asynchronously.
In a network, TCP is the transport layer and HTTP is the application layer. If you
scratch around in the modules included with Node, you’ll see that when you create an
HTTP server, you’re inheriting functionality from the TCP-based net.Server.
For the HTTP server, the underlying connection is a socket, while the http.ServerRequest
object handed to the requestListener is a readable stream, and the http.ServerResponse is a writable stream. HTTP
adds another level of complexity because of the chunked transfer encoding it supports.
The chunked transfer encoding allows transfer of data when the exact size of the
response isn't known until it's fully processed. Instead, a zero-sized chunk is sent to
indicate the end of the response. This type of encoding is useful when you're processing a
request such as a large database query output to an HTML table: writing the data can
begin before the rest of the query data has been received.
The TCP examples earlier in this chapter, and the HTTP examples in Chapter 1, were
both coded to work with network sockets. However, all of the server/socket modules
can also connect to a Unix socket, rather than a specific network port. Unlike a network
socket, a Unix or IPC (interprocess communication) socket enables communication
between processes within the same system.
To demonstrate Unix socket communication, I duplicated Example 1-3’s code, but
instead of binding to a port, the new server binds to a Unix socket, as shown in
Example 3-4. The application also makes use of readFileSync, the synchronous version
of the function to open a file and read its contents.
// content header
res.writeHead(200, {'Content-Type': 'text/plain'});
The client is based on a code sample provided in the Node core documentation for the
http.request object at the Node.js site. The http.request object, by default, makes use
of http.globalAgent, which supports pooled sockets. The size of this pool is five sockets
by default, but you can adjust it by changing the agent.maxSockets value.
The client accepts the chunked data returned from the server, printing it out to the
console. It also triggers a response on the server with a couple of minor writes, as shown
in Example 3-5.
Example 3-5. Connecting to the Unix socket and printing out received data
var http = require('http');
var options = {
method: 'GET',
socketPath: '/tmp/node-server-sock',
path: "/?file=main.txt"
};
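The remainder of the example (a sketch patterned on the Node documentation sample the text mentions) makes the request, prints each chunk, and performs the two small writes that trigger the server's response:

var req = http.request(options, function(res) {
  res.setEncoding('utf8');
  res.on('data', function (chunk) {
    console.log('chunk of data: ' + chunk);
  });
});

req.on('error', function(e) {
  console.log('problem with request: ' + e.message);
});

// write data to the request body
req.write('data\n');
req.write('data\n');
req.end();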
I didn’t use the asynchronous file read function with the http.request object because
the connection is already closed when the asynchronous function is called and no file
contents are returned.
Before leaving this section on the HTTP module, be aware that much of the behavior
you’ve come to expect with Apache or other web servers isn’t built into a Node HTTP
server. For instance, if you password-protect your website, Apache will pop up a win-
dow asking for your username and password; a Node HTTP server will not. If you want
this functionality, you’re going to have to code for it.
Chapter 15 covers the SSL version of HTTP, HTTPS, along with Crypto
and TLS/SSL.
UDP/Datagram Socket
TCP requires a dedicated connection between the two endpoints of the communica-
tion. UDP is a connectionless protocol, which means there’s no guarantee of a con-
nection between the two endpoints. For this reason, UDP is less reliable and robust
than TCP. On the other hand, UDP is generally faster than TCP, which makes it more
popular for real-time uses, as well as technologies such as VoIP (Voice over Internet
Protocol), where the TCP connection requirements could adversely impact the quality
of the signal.
Node core supports both types of sockets. In the last couple of sections, I demonstrated
the TCP functionality. Now, it’s UDP’s turn.
The UDP module identifier is dgram:
require('dgram');
To create a UDP socket, use the createSocket method, passing in the type of socket—
either udp4 or udp6. You can also pass in a callback function to listen for events. Unlike
messages sent with TCP, messages sent using UDP must be sent as buffers, not strings.
Example 3-6 contains the code for a demonstration UDP client. In it, data is accessed
via process.stdin, and then sent, as is, via the UDP socket. Note that we don’t have to
set the encoding for the string, since the UDP socket accepts only a buffer, and the
process.stdin data is a buffer. We do, however, have to convert the buffer to a string
with toString if we want to echo the input to the console.
Example 3-6. A datagram client that sends messages typed into the terminal
var dgram = require('dgram');
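// sketch of the elided remainder: create the socket and send each typed line;
// the target host here is an assumption
var client = dgram.createSocket('udp4');

process.stdin.resume();
process.stdin.on('data', function (data) {
  // echo locally, converting the buffer to a string
  console.log(data.toString('utf8'));

  // send the buffer as-is
  client.send(data, 0, data.length, 8124, 'localhost', function (err, bytes) {
    if (err) console.error(err);
  });
});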
The UDP server, shown in Example 3-7, is even simpler than the client. All the server
application does is create the socket, bind it to a specific port (8124), and listen for the
message event. When a message arrives, the application prints it out using
console.log, along with the IP address and port of the sender. Note especially that no
encoding is necessary to print out the message—it’s automatically converted from a
buffer to a string.
We didn’t have to bind the socket to a port. However, without the binding, the socket
would attempt to listen in on every port.
Example 3-7. A UDP socket server, bound to port 8124, listening for messages
var dgram = require('dgram');
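// sketch of the elided middle: create the socket and listen for messages
var server = dgram.createSocket('udp4');

server.on('message', function (msg, rinfo) {
  console.log('Message: ' + msg + ' from ' + rinfo.address + ':' + rinfo.port);
});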
server.bind(8124);
I didn’t call the close method on either the client or the server after sending/receiving
the message. However, no connection is being maintained between the client and server
—just the sockets capable of sending a message and receiving communication.
Streams, Pipes, and Readline
A pipe connects a readable stream directly to a writable stream. For example, you can pipe stdin straight to stdout:
process.stdin.resume();
process.stdin.pipe(process.stdout);
...and then enjoy the fact that everything you type from that point on is echoed back
to you.
If you want to keep the output stream open for continued data, pass an option, { end:
false }, to the output stream:
process.stdin.pipe(process.stdout, { end : false });
There is one additional object that provides a specific functionality to readable streams:
readline. You include the Readline module with code like the following:
var readline = require('readline');
The Readline module allows line-by-line reading of a stream. Be aware, though, that
once you include this module, the Node program doesn’t terminate until you close the
interface and the stdin stream. The Node site documentation contains an example of
how to begin and terminate a Readline interface, which I adapted in Example 3-8. The
application asks a question as soon as you run it, and then outputs the answer. It also
listens for any “command,” which is really any line that terminates with \n. If the com-
mand is .leave, it leaves the application; otherwise, it just repeats the command and
prompts the user for more. A Ctrl-C or Ctrl-D key combination also causes the appli-
cation to terminate.
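Example 3-8's setup and command handling are elided in this excerpt; here is a sketch consistent with the description above:

var readline = require('readline');

// create an interface on stdin/stdout
var interface = readline.createInterface(process.stdin, process.stdout);

// close the interface and the stdin stream
function closeInterface() {
  interface.close();
  process.stdin.destroy();
  console.log('interface closed...');
}

// a "command" is any line terminating with \n
interface.on('line', function(cmd) {
  if (cmd.trim() == '.leave') {
    closeInterface();
  } else {
    console.log('repeating command: ' + cmd);
    interface.setPrompt('>>');
    interface.prompt();
  }
});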
// ask question
interface.question(">>What is the meaning of life? ", function(answer) {
console.log("About the meaning of life, you said " + answer);
interface.setPrompt(">>");
interface.prompt();
});
interface.on('close', function() {
closeInterface();
});
This should look familiar. Remember from Chapter 2 that we can use rlwrap to override
the command-line functionality for REPL. We use the following to trigger its use:
env NODE_NO_READLINE=1 rlwrap node
Child Processes
Operating systems provide access to a great deal of functionality, but much of it is only
accessible via the command line. It would be nice to be able to access this functionality
from a Node application. That’s where child processes come in.
Node enables us to run a system command within a new child process, and listen in
on its input/output. This includes being able to pass arguments to the command, and
even pipe the results of one command to another. The next several sections explore
this functionality in more detail.
All but the last example demonstrated in this section use Unix commands.
They work on a Linux system, and should work on a Mac. They
won't, however, work in a Windows Command window.
child_process.spawn
There are four different techniques you can use to create a child process. The most
common one is using the spawn method. This launches a command in a new process,
passing in any arguments. In the following, we create a child process to call the Unix
pwd command to print the current directory. The command takes no arguments:
var spawn = require('child_process').spawn,
pwd = spawn('pwd');
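The event handlers themselves are elided above; the usual pattern is:

pwd.stdout.on('data', function (data) {
  console.log('stdout: ' + data);
});

pwd.stderr.on('data', function (data) {
  console.log('stderr: ' + data);
});

pwd.on('exit', function (code) {
  console.log('child process exited with code ' + code);
});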
Notice the events that are captured on the child process’s stdout and stderr. If no error
occurs, any output from the command is transmitted to the child process’s stdout,
triggering a data event on the process. If an error occurs, such as in the following where
we’re passing an invalid option to the command:
var spawn = require('child_process').spawn,
pwd = spawn('pwd', ['-g']);
The process exited with a code of 1, which signifies that an error occurred. The exit
code varies depending on the operating system and error. When no error occurs, the
child process exits with a code of 0.
The earlier code demonstrated sending output to the child process’s stdout and
stderr, but what about stdin? The Node documentation for child processes includes
an example of directing data to stdin. It’s used to emulate a Unix pipe (|) whereby the
result of one command is immediately directed as input to another command. I adapted
the example in order to demonstrate one of my favorite uses of the Unix pipe—being
able to look through all subdirectories, starting in the local directory, for a file with a
specific word (in this case, test) in its name:
find . -ls | grep test
Example 3-9 implements this functionality as child processes. Note that the first com-
mand, which performs the find, takes two arguments, while the second one takes just
one: a term passed in via user input from stdin. Also note that, unlike the example in
the Node documentation, the grep child process’s stdout encoding is changed via
setEncoding. Otherwise, when the data is printed out, it would be printed out as a
buffer.
Example 3-9. Using child processes to find files in subdirectories with a given search term, “test”
var spawn = require('child_process').spawn,
find = spawn('find',['.','-ls']),
grep = spawn('grep',['test']);
grep.stdout.setEncoding('utf8');
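The piping between the two child processes is elided in this excerpt; a sketch:

// direct the results of find to grep's input
find.stdout.on('data', function(data) {
  grep.stdin.write(data);
});

// print out grep's results
grep.stdout.on('data', function (data) {
  console.log(data);
});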
// and exit handling for both
find.on('exit', function (code) {
  if (code !== 0) {
    console.log('find process exited with code ' + code);
  }
  // end grep's input once find is done
  grep.stdin.end();
});
When you run the application, you’ll get a listing of all files in the current directory
and any subdirectories that contain test in their filename.
All of the example applications up to this point work the same in Node 0.8 as in Node
0.6. Example 3-9 is an exception because of a change in the underlying API.
In Node 0.6, the exit event would not be emitted until the child process exits and all
STDIO pipes are closed. In Node 0.8, the event is emitted as soon as the child process
finishes. This causes the application to crash, because the grep child process’s STDIO
pipe is closed when it tries to process its data. For the application to work in Node 0.8,
the application needs to listen for the close event on the find child process, rather than
the exit event:
// and close handling for both
find.on('close', function (code) {
  if (code !== 0) {
    console.log('find process exited with code ' + code);
  }
  grep.stdin.end();
});
In Node 0.8, the close event is emitted when the child process exits and all STDIO
pipes are closed.
child_process.fork
The last child process method is child_process.fork. This variation of spawn is for
spawning Node processes.
What sets the child_process.fork process apart from the others is that there’s an actual
communication channel established to the child process. Note, though, that each pro-
cess requires a whole new instance of V8, which takes both time and memory.
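A minimal sketch of that channel (the child.js filename and the messages are illustrative):

var cp = require('child_process');

// spawn a new Node process running child.js
var child = cp.fork(__dirname + '/child.js');

child.on('message', function(msg) {
  console.log('parent got:', msg);
});
child.send({ hello: 'world' });

In child.js, the other end of the channel is the process object itself:

process.on('message', function(msg) {
  process.send({ reply: 'world right back' });
});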
Example 3-10. Running a child process application in Windows
var cmd = require('child_process').spawn('cmd', ['/c', 'dir\n']);
The /c flag passed as the first argument to cmd.exe instructs it to carry out the command
and then terminate. The application doesn't work without this flag. You especially
don't want to pass in the /K flag, which tells cmd.exe to carry out the command and
then remain open, because then your application won't terminate.
Domain Resolution and URL Processing
Node's DNS module, accessed with require('dns'), supports domain resolution. The dns.reverse method returns an array of domain names for a given IP address:
dns.reverse('173.255.206.103', function(err,domains) {
domains.forEach(function(domain) {
console.log(domain);
});
});
The dns.resolve method returns an array of record types by a given type, such as A, MX,
NS, and so on. In the following code, I’m looking for the name server domains for my
domain name, burningbird.net:
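// look up the name server (NS) records for the domain
dns.resolve('burningbird.net', 'NS', function(err, domains) {
  if (err) console.error(err);
  domains.forEach(function(domain) {
    console.log(domain);
  });
});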
This returns:
ns1.linode.com
ns3.linode.com
ns5.linode.com
ns4.linode.com
We used the URL module in Example 1-3 in Chapter 1. This simple module provides
a way of parsing a URL and returning an object with all of the URL components. Passing
in the following URL:
var url = require('url');
var urlObj = url.parse('http://examples.burningbird.net:8124/?file=main');
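Parsing it returns an object along these lines (a sketch of the legacy url.parse output; the exact properties vary by Node version):

{ protocol: 'http:',
  host: 'examples.burningbird.net:8124',
  port: '8124',
  hostname: 'examples.burningbird.net',
  href: 'http://examples.burningbird.net:8124/?file=main',
  search: '?file=main',
  query: 'file=main',
  pathname: '/' }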
The URL module is often used with the Query String module. The latter module is a
simple utility module that provides functionality to parse a received query string, or
prepare a string for use as a query string.
To chunk out the key/value pairs in the query string, use the querystring.parse
method. The following:
var querystring = require('querystring');
var vals = querystring.parse('file=main&file=secondary&type=html');
results in a JavaScript object that allows for easy access of the individual query string
values:
{ file: [ 'main', 'secondary' ], type: 'html' }
Since file is given twice in the query string, both values are grouped into an array, each
of which can be accessed individually:
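console.log(vals.file[0]); // main
console.log(vals.file[1]); // secondary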
You can also convert an object into a query string, using querystring.stringify:
var qryString = querystring.stringify(vals)
The Utilities Module and Object Inheritance
The Utilities module, accessed with require('util'), provides a grab bag of helpful functions. You can use Utilities to test if an object is an array (util.isArray) or regular expression
(util.isRegExp), and to format a string (util.format). A new experimental addition to
the module provides functionality to pump data from a readable stream to a writable
stream (util.pump):
util.pump(process.stdin, process.stdout);
However, I wouldn’t type this into REPL, as anything you type from that point on is
echoed as soon as you type it—making the session a little awkward.
I make extensive use of util.inspect to get a string representation of an object. I find
it’s a great way to discover more about an object. The first required argument is the
object; the second optional argument is whether to display the nonenumerable prop-
erties; the third optional argument is the number of times the object is recursed (depth);
and the fourth, also optional, is whether to style the output in ANSI colors. If you assign
a value of null to the depth, it recurses indefinitely (the default is two times)—as much
as necessary to exhaustively inspect the object. From experience, I’d caution you to be
careful using null for the depth because you’re going to get a large output.
You can use util.inspect in REPL, but I recommend a simple application, such as the
following:
var util = require('util');
var jsdom = require('jsdom');
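To capture the result for study, inspect the object and write the string out to a file (a sketch; the filename is illustrative):

var fs = require('fs');
fs.writeFile('jsdom.txt', util.inspect(jsdom, true, null, false));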
Now you can inspect and reinspect the object interface at your leisure. Again, if you
use null for depth, expect a large output file.
The Utilities module provides several other methods, but the one you’re most likely to
use is util.inherits. The util.inherits function takes two parameters, constructor
and superConstructor. The result is that the constructor will inherit the functionality
from the superconstructor.
Example 3-11 demonstrates all the nuances associated with using util.inherits. The
explanation of the code follows.
first.prototype.output = function() {
console.log(this.name);
}
function third(func) {
this.name = 'third';
this.callMethod = func;
}
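The first and second constructors are elided in this excerpt; here is a sketch consistent with the discussion that follows:

var util = require('util');

function first() {
  var self = this;            // preserve the object context
  this.name = 'first';
  this.test = function() {
    console.log(self.name);
  };
}

function second() {
  second.super_.call(this);   // invoke the superconstructor
  this.name = 'second';
}
util.inherits(second, first);

var two = new second();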
The application creates three objects named first, second, and third, respectively.
The first object has two methods: test and output. The test method is defined directly
in the object, while the output method is added later via the prototype object. The
reason I used both techniques for defining a method on the object is to demonstrate
an important aspect of inheritance with util.inherits (well, of JavaScript, but enabled
by util.inherits).
The second object contains the following line:
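second.super_.call(this);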
If we eliminate this line from the second object constructor, any call to output on the
second object would succeed, but a call to test would generate an error and force the
Node application to terminate with a message about test being undefined.
The call method chains the constructors between the two objects, ensuring that the
superconstructor is invoked as well as the constructor. The superconstructor is the
constructor for the inherited object.
We need to invoke the superconstructor since the test method doesn’t exist until
first is created. However, we didn’t need the call method for the output method,
because it’s defined directly on the first object’s prototype object. When the second
object inherits properties from the first, it also inherits the newly added method.
If we look under the hood of util.inherits, we see where super_ is defined:
exports.inherits = function(ctor, superCtor) {
ctor.super_ = superCtor;
ctor.prototype = Object.create(superCtor.prototype, {
constructor: {
value: ctor,
enumerable: false,
writable: true,
configurable: true
}
});
};
The third object in the application, third, also has a name property. It doesn’t inherit
from either first or second, but does expect a function passed to it when it’s instanti-
ated. This function is assigned to its own callMethod property. When the code creates
an instance of this object, the two object instance’s test method is passed to the
constructor:
var three = new third(two.test);
When three.callMethod is called, “second” is output, not “third” as you might expect
at first glance. And that’s where the self reference in the first object comes in.
In JavaScript, this is the object context, which can change as a method is passed
around, or passed to an event handler. The only way you can preserve data for an
object’s method is to assign this to an object variable—in this case, self—and then
use the variable within any functions in the object.
Running this application results in the following output:
second
second
second
Events and EventEmitter
The Events module provides the EventEmitter object; create an instance to work with:
var events = require('events');
var em = new events.EventEmitter();
Use the newly created EventEmitter instance to do two essential tasks: attach an event
handler to an event, and emit the actual event. The on event handler is triggered when
a specific event is emitted. The first parameter to the method is the name of the event,
the second a function to process the event:
em.on('someevent', function(data) { ... });
The event is emitted on the object, based on some criteria, via the emit method:
if (somecriteria) {
  em.emit('someevent', data);
}
In Example 3-12, we create an EventEmitter instance that emits an event, timed, every
three seconds. In the event handler function for this event, a message with a counter is
output to the console.
Example 3-12. Very basic test of the EventEmitter functionality
var eventEmitter = require('events').EventEmitter;
var counter = 0;
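// elided here (sketch): create the emitter and fire 'timed' every three seconds
var em = new eventEmitter();

setInterval(function() {
  em.emit('timed', counter++);
}, 3000);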
em.on('timed', function(data) {
console.log('timed ' + data);
});
By using util.inherits with the object, you can call the emit method within the object’s
methods, and code event handlers on the object instances:
someobj.prototype.somemethod = function() { this.emit('event'); };
...
someobjinstance.on('event', function() { });
Rather than attempt to decipher how EventEmitter works in the abstract sense, let’s
move on to Example 3-13, which shows a working example of an object inheriting
EventEmitter’s functionality. In the application, a new object, inputChecker, is created.
The constructor takes two values, a person’s name and a filename. It assigns the per-
son’s name to an object variable, and also creates a reference to a writable stream using
the File System module’s createWriteStream method (for more on the File System
module, see the sidebar “Readable and Writable Stream” on page 60).
The object also has a method, check, that checks incoming data for specific commands.
One command (wr:) triggers a write event, another (en:) an end event. If no command
is present, then an echo event is triggered. The object instance provides event handlers
for all three events. It writes to the output file for the write event, it echoes the input
for the commandless input, and it terminates the application with an end event, using
the process.exit method.
All input comes from standard input (process.stdin).
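The excerpt omits the example's setup; the check method appears in full in Example 4-1 in the next chapter. A minimal sketch of the elided pieces:

var util = require('util');
var eventEmitter = require('events').EventEmitter;
var fs = require('fs');

function inputChecker(name, file) {
  this.name = name;
  // append incoming text to an output file; the filename scheme is illustrative
  this.writeStream = fs.createWriteStream('./' + file + '.txt');
}

// after the util.inherits call below, the example creates the instance:
//    var ic = new inputChecker('Shelley', 'output');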
util.inherits(inputChecker,eventEmitter);
ic.on('write', function(data) {
this.writeStream.write(data, 'utf8');
});
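ic.on('echo', function(data) {
  // the commandless case described above (sketch of the elided handler)
  console.log(this.name + ' wrote ' + data);
});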
ic.on('end', function() {
process.exit();
});
process.stdin.resume();
process.stdin.setEncoding('utf8');
process.stdin.on('data', function(input) {
ic.check(input);
});
Note that the EventEmitter functionality in the example also includes the
process.stdin.on event handler method, since process.stdin is one
of the many Node objects that inherit from EventEmitter.
We don’t have to chain the constructors from the new object to EventEmitter, as
demonstrated in the earlier example covering util.inherits, because the functionality
we need—on and emit—consists of prototype methods, not object instance properties.
When you exceed 10 listeners for an event, you'll get a warning by default. Use
setMaxListeners, passing in a number, to change the number of listeners. Use a value
of zero (0) for an unlimited number of listeners.
Many of the core Node objects, as well as third-party modules, make use of
EventEmitter. In Chapter 4, I'll demonstrate how to convert the code in Example 3-13
into a module.
Node's module system lets you pull a module's functionality into an application with the require statement. You can also just include a specific object, rather than all objects, from a module:
var spawn = require('child_process').spawn;
You can load core modules—i.e., those native to Node—or modules from the
node_modules folder just by providing the module identifier, such as http for the HTTP
module. Modules not part of core, or not included in the node_modules folder, should
include forward slashes to indicate the path. As an example, Node expects to find the
module named mymodule.js in the same directory as the Node application in the fol-
lowing require statement:
require('./mymodule');
Module files can have either a .js, .node, or .json file extension. The .node extension
assumes that the file is a compiled binary, not a text file containing JavaScript.
Node core modules have higher priority than external modules. If you’re trying to load
a custom module named http, Node loads the core version of the HTTP module. You’ll
have to provide either a different module identifier, or you’ll need to provide the full
path.
Earlier I mentioned the node_modules folder. If you specify the module identifier without
providing a path, and the module isn't a core module, Node first looks for a node_modules
folder local to the application, and searches for the module in this folder. If it
doesn't find the module, Node then looks for a node_modules folder in the parent
directory, and so on up the directory tree.
If the module is named mymodule, and the application is located in a subdirectory with
the following path:
/home/myname/myprojects/myapp
then Node looks for the module using the following searches, in turn:
• /home/myname/myprojects/myapp/node_modules/mymodule.js
• /home/myname/myprojects/node_modules/mymodule.js
• /home/myname/node_modules/mymodule.js
• /node_modules/mymodule.js
Node can optimize the search depending on where the file issuing the require statement
resides. For instance, if the file making the require statement is itself a module in a
subdirectory of the node_modules folder, Node begins the search for the required mod-
ule in the topmost node_modules folder.
There are two additional variations of require: require.resolve and require.cache.
The require.resolve method performs the lookup for the given module but, rather
than load the module, just returns the resolved filename. The require.cache object
contains a cached version of all loaded modules. When you try to load the module
again in the same context, it’s loaded from the cache. If you want to force a new load,
delete the item from the cache.
For instance, if the module was loaded with require('./circle.js'), delete its cache entry with:
delete require.cache[require.resolve('./circle.js')];
This code forces a reload of the module the next time a require is called on it.
Modern installations include npm, but you can double-check for its existence by typing
npm at the command line in the same environment that you use to access Node.
Modules can be installed globally or locally. The local installation is the best approach
if you’re working on a project and not everyone sharing the system needs access to this
module. A local installation, which is the default, installs the module in a
node_modules subdirectory in the current location.
$ npm install modulename
For example, the following installs the Connect framework:
$ npm install connect
npm not only installs Connect, it also discovers its module dependencies and installs
them, too, as shown in Figure 4-1.
Once it’s installed, you can find the module in your local directory’s node_modules
directory. Any dependencies are installed in that module’s node_modules directory.
If you want to install the package globally, use the -g or --global option:
$ npm -g install connect
These examples install packages that are registered at the npm site. You can also install
a module that’s in a folder on the filesystem, or a tarball that’s either local or fetched
via a URL:
npm install http://somecompany.com/somemodule.tgz
Now you can make use of a familiar syntax in your Node application development.
If you’re no longer using a module, you can uninstall it:
npm uninstall modulename
The following command tells npm to check for new modules, and perform an update
if any are found:
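npm update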
If you just want to check to see if any packages are outdated, use the following:
npm outdated
The la and ll options provide extended descriptions. The following is the text I get
running npm ll on my Windows 7 machine:
C:\Users\Shelley>npm ll
npm WARN jsdom >= 0.2.0 Unmet dependency in C:\Users\Shelley\node_modules\html5
C:\Users\Shelley
├── [email protected]
├── [email protected]
├── [email protected]
├─┬ [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ └── [email protected]
├─┬ [email protected]
│ ├── UNMET DEPENDENCY jsdom >= 0.2.0
│ ├── [email protected]
│ └─┬ [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├── [email protected]
│ ├─┬ [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ ├── [email protected]
│ │ └── [email protected]
│ ├── [email protected]
│ └── [email protected]
└─┬ [email protected]
└── [email protected]
Note the warning about an unmet dependency for the HTML5 module. The HTML5
module requires an older version of the JSDOM library. To correct this, I installed the
necessary version of the module:
npm install [email protected]
You can also directly install all dependencies with the -d flag. For instance, in the di-
rectory for the module, type the following:
npm install -d
If you want to install a version of the module that hasn’t yet been uploaded to the npm
registry, you can install directly from the Git repository:
npm install https://github.com/visionmedia/express/tarball/master
Use caution, though, as I’ve found that when you install a not-yet-released version of
a module, and you do an npm update, the npm registry version can overwrite the version
you’re using.
To see which modules are installed globally, use:
npm ls -g
You can learn more about your npm installation using the config command. The fol-
lowing lists the npm configuration settings:
npm config list
You can get a more in-depth look at all configuration settings with:
npm config ls -l
You can modify or remove configuration settings from the command line:
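npm config set <key> <value>
npm config delete <key>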
You can also search for a module using whatever terms you think might return the best
selection:
npm search html5 parser
The first time you do a search, npm builds an index, which can take a few minutes.
When it’s finished, though, you’ll get a list of possible modules that match the term or
terms you provided. The search terms html5 and parser returned just two modules:
HTML5, an HTML parser that includes support for SVG and MathML; and Fabric, an
object model with support for HTML5 Canvas and an SVG-to-Canvas parser.
The npm website provides a registry of modules you can browse through, and an up-
to-date listing of modules most depended on—that is, modules most used by other
modules or by Node applications. In the next section, I’ll cover a sampling of these
modules.
I’ll cover other npm commands later in this chapter, in the section
“Creating Your Own Custom Module” on page 74.
Finding Modules
Though Node.js has been active only for a few years, it’s already attracted a large body
of work. When you look at the Node.js modules wiki page, you’ll find a significant
number of modules. The good thing is, you can find a lot of useful modules that im-
plement the functionality you need. The bad news is, it’s difficult to determine which
module to use—in other words, which modules are “best of breed.”
Using a search tool like Google can give you a fair idea of which modules are popular.
For example, it quickly became apparent when I was exploring middleware and frame-
work modules that Connect and Express were very popular.
In addition, when you look at the GitHub registry for the item, you can see if it’s actively
supported and up to date with the current Node installation. As another example, I
was checking out a tool named Apricot, which does HTML parsing and is recom-
mended in the Node documentation, but then I noticed it hadn’t been updated for
some time, and when I tried to use the module, I found it didn’t work with my instal-
lation of Node (at least, not when this book was written).
Install the Colors module with npm install colors, and include it with a call to
require('colors'). Because the Colors module is installed in the current location's
node_modules subdirectory, Node is able to find it rather quickly.
Now try something out, such as the following:
console.log('This Node kicks it!'.rainbow.underline);
The result is a colorful, underlined rendering of your message. The style applies only
for the one message—you’ll need to apply another style for another message.
If you’ve worked with jQuery, you recognize the chaining used to combine effects. The
example makes use of two: a font effect, underlined, and a font color, rainbow.
Try another, this time zebra and bold:
console.log('We be Nodin'.zebra.bold);
You can change the style for sections of the console message:
console.log('rainbow'.rainbow, 'zebra'.zebra);
Why would something like Colors be useful? One reason is that it enables us to specify
formatting for various events, such as displaying one color for errors in one module,
another color or effect for warnings in a second module, and so on. To do this, you can
use the Colors presets or create a custom theme:
> colors.setTheme({
....... mod1_warn: 'cyan',
....... mod1_error: 'red',
....... mod2_note: 'yellow'
....... });
> console.log("This is a helpful message".mod2_note);
This is a helpful message
> console.log("This is a bad message".mod1_error);
This is a bad message
Optimist
The Optimist module, installed with npm install optimist, makes short work of parsing command-line arguments. You can run the application with short options. The following prints the values of 1
and 2 out to the console:
./app.js -o 1 -t 2
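A minimal app.js consistent with that command might look like the following (a sketch; Optimist exposes the parsed options on its argv object):

var argv = require('optimist').argv;

// -o and -t arrive as argv.o and argv.t
console.log(argv.o, argv.t);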
You can also use the Optimist module to process Boolean and unhyphenated options.
However, you can also run the Node application as a standalone application with a
couple of modifications.
First, add the following as the first line in the application:
#!/usr/local/bin/node
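Second, make the file executable:
chmod a+x app.js
Now you can run the application directly:
./app.js -o 1 -t 2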
Underscore
Install the Underscore module with:
npm install underscore
Of course, the problem with the underscore is that this character has a specific meaning
in REPL. No worries, though—we can just use another variable, us:
var us = require('underscore');
us.each(['apple','cherry'], function(fruit) { console.log(fruit); });
> us.mixin({
... betterWithNode: function(str) {
..... return str + ' is better with Node';
..... }
... });
> console.log(us.betterWithNode('chocolate'));
chocolate is better with Node
You’ll see the term mixin used in several Node modules. It’s based on a
pattern where properties of one object are added (“mixed in”) to an-
other.
Of course, it makes more sense to extend Underscore from a module that we can reuse
in our applications, which leads us to our next topic—creating our own custom
modules.
Creating Your Own Custom Module
Say you have a small JavaScript library with a function, concatArray, that takes a string and an array of strings, and concatenates the string to each element. You want to use this function, as well as others, in your Node applications.
To convert your JavaScript library for use in Node, you’ll need to export all of your
exposed functions using the exports object, as shown in the following code:
exports.concatArray = function(str, array) {
return array.map(function(element) {
return str + ' ' + element;
});
};
To use concatArray in a Node application, import the library using require, assigning
the library to a variable name. Once the library is assigned, you can call any of the
exposed functions in your code:
var newArray = require('./arrayfunctions.js');
console.log(newArray.concatArray('hello', ['test1','test2']));
Instead of a single file, you can provide a whole directory for a module; Node then needs a way to find the entry point. The first way is to include a package.json file in the directory: its name property is the name of the module, and its main property indicates the entry point for the module.
The second way is to include either an index.js or index.node file in the directory to
serve as the main module entry point.
Why would you provide a directory rather than just a single module? The most likely
reason is that you’re making use of existing JavaScript libraries, and just providing a
“wrapper” file that wraps the exposed functions with exports statements. Another rea-
son is that your library is so large that you want to break it down to make it easier to
modify.
Regardless of the reason, be aware that all of the exported objects must be in the one
main file that Node loads.
The npm init command simplifies creating the package.json file: run it, and the tool will run through the required fields, prompting you for each. When it's done,
it generates a package.json file.
In Chapter 3, Example 3-13, I started an object called inputChecker that checks in-
coming data for commands and then processes the command. The example demon-
strated how to incorporate EventEmitter. Now we’re going to modify this simple object
to make it usable by other applications and modules.
First, we’ll create a subdirectory in node_modules and name it inputcheck, and then
move the existing inputChecker code file to it. We need to rename the file to index.js.
Next, we need to modify the code to pull out the part that implements the new object.
We’ll save it for a future test file. The last modification we’ll do is add the exports
object, resulting in the code shown in Example 4-1.
Example 4-1. Application from Example 3-13 modified to be a module object
var util = require('util');
var eventEmitter = require('events').EventEmitter;
var fs = require('fs');
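// sketch of the elided constructor; the write-stream options are assumptions
function inputChecker(name, file) {
  this.name = name;
  this.writeStream = fs.createWriteStream('./' + file + '.txt',
      {flags: 'a', encoding: 'utf8'});
}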
exports.inputChecker = inputChecker;
util.inherits(inputChecker,eventEmitter);
inputChecker.prototype.check = function check(input) {
var self = this;
var command = input.toString().trim().substr(0,3);
if (command == 'wr:') {
self.emit('write',input.substr(3,input.length));
} else if (command == 'en:') {
self.emit('end');
} else {
self.emit('echo',input);
}
};
We can’t export the object function directly, because util.inherits expects an object
to exist in the file named inputChecker. We’re also modifying the inputChecker object’s
prototype later in the file. We could have changed these code references to use
exports.inputChecker, but that’s kludgy. It’s just as easy to assign the object in a sep-
arate statement.
To create the package.json file, I ran npm init and answered each of the prompts. The
resulting file is shown in Example 4-2.
Example 4-2. Generated package.json for inputChecker module
{
"author": "Shelley Powers <[email protected]> (http://burningbird.net)",
"name": "inputcheck",
"description": "Looks for commands within the string and implements the commands",
"version": "0.0.1",
"homepage": "http://inputcheck.burningbird.net",
"repository": {
"url": "
},
"main": "inputcheck.js",
"engines": {
"node": "~0.6.10"
},
"dependencies": {},
"devDependencies": {},
"optionalDependencies": {}
}
The npm init command doesn’t prompt for dependencies, so we need to add them
directly to the file. However, the inputChecker module isn’t dependent on any external
modules, so we can leave these fields blank in this case.
At this point, we can test the new module to make sure it actually works as a module.
Example 4-3 is the portion of the previously existing inputChecker application that
tested the new object, now pulled out into a separate test application.
Example 4-3. InputChecker test application
var inputChecker = require('inputcheck').inputChecker;
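// create the instance, plus the echo handler (sketch of the elided lines)
var ic = new inputChecker('Shelley', 'output');

ic.on('echo', function(data) {
  console.log(this.name + ' wrote ' + data);
});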
ic.on('write', function(data) {
this.writeStream.write(data, 'utf8');
});
ic.on('end', function() {
process.exit();
});
process.stdin.resume();
process.stdin.setEncoding('utf8');
process.stdin.on('data', function(input) {
ic.check(input);
});
We can now move the test application into a new examples subdirectory within the
module directory to be packaged with the module as an example. Good practice de-
mands that we also provide a test directory with one or more testing applications, as
well as a doc directory with documentation. For a module this small, a README file
should be sufficient. Lastly, we create a gzipped tarball of the module.
Once we’ve provided all we need to provide, we can publish the module.
Before publishing, the npm documentation recommends we test that the module can cleanly install.
To test for this, type the following in the root directory for the module:
npm install . -g
At this point, we’ve tested the inputChecker module, modified the package.json package
to add directories, and confirmed that the package successfully installs.
Next, we need to add ourselves as npm users if we haven’t done so already. We do this
by typing:
npm adduser
and following the prompts to add a username, a password, and an email address.
There’s one last thing to do:
npm publish
We can provide the path to the tarball or the directory. As the npm documentation warns us,
everything in the directory is exposed unless we use a .npmignore file to exclude
material. It's better, though, just to remove anything that's not needed before
publishing the module.
Once published—and once the source is also uploaded to GitHub (if that’s the repos-
itory you’re using)—the module is now officially ready for others to use. Promote the
module on Twitter, Google+, Facebook, your website, and wherever else you think
people would want to know about the module. This type of promotion isn’t
bragging—it’s sharing.
Node might seem intimidating at times, with discussions about asynchronous events
and callbacks and new objects such as EventEmitter—not to mention all that new
server-side functionality we have to play with. If you’ve worked with any of the modern
JavaScript libraries, though, you’ve experienced much of the functionality that goes
into Node, at least when it comes to asynchronous development.
For instance, if you’ve used a timer in JavaScript, you’ve used an asynchronous func-
tion. If you’ve ever developed in Ajax, you’ve used an asynchronous function. Even the
plain old onclick event handler is an asynchronous function, since we never know when
the user is going to click that mouse or tap that keyboard.
Any method that doesn’t block the control thread while waiting for some event or result
is an asynchronous function. When it comes to the onclick handling, the application
doesn’t block all other application processing, waiting for that user’s mouse click—
just as it doesn’t block all functionality while the timer is in effect, or while waiting for
the server to return from an Ajax call.
In this chapter, we’re going to look more closely at exactly what we mean by the term
asynchronous control. In particular, we’re going to look at some asynchronous design
patterns, as well as explore some of the Node modules that provide finer control over
program flow when we’re working in this new environment. And since asynchronous
control can add some new and interesting twists when it comes to error handling, we’re
also going to take a closer look at exception handling within an asynchronous Node
environment.
Promises, No Promises, Callback Instead
A promise is an object that represents the result of an asynchronous action; it's also known as a future, a delay, or simply
deferred. The CommonJS design model embraced the concept of the promise.
In the earlier Node implementation, a promise was an object that emitted exactly two
events: success and error. Its use was simple: if an asynchronous operation succeeded,
the success event was emitted; otherwise, the error event was emitted. No other events
were emitted, and the object would emit one or the other, but not both, and no more
than once. Example 5-1 incorporates a previously implemented promise into a function
that opens and reads in a file.
Example 5-1. Using a previously implemented Node promise
function test_and_load(filename) {
var promise = new process.Promise();
fs.stat(filename).addCallback(function (stat) {
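    // elided here (sketch): filter out non-files, then read the file
    // and emit success with the data
    if (!stat.isFile()) { promise.emitSuccess(); return; }
    fs.readFile(filename).addCallback(function (data) {
      promise.emitSuccess(data);
    }).addErrback(function (error) {
      promise.emitError(error);
    });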
}).addErrback(function (error) {
promise.emitError(error);
});
return promise;
}
Each asynchronous method would return a promise object. The code to process a successful result
would be passed as a function to the promise object’s addCallback method, which had
one parameter, the data. The code to process the error would be passed as a function
to the promise object’s addErrback method, which received the error as its one and only
parameter:
var File = require('file');
var promise = File.read('mydata.txt');
promise.addCallback(function (data) {
// process data
});
promise.addErrback(function (err) {
// deal with error
})
The promise object ensured that the proper functionality was performed whenever the
event finished—either the results could be manipulated, or the error processed.
The code for Example 5-1 is one of a number of examples of possible
asynchronous function techniques documented at
http://groups.google.com/group/nodejs/browse_thread/thread/8dab9f0a5ad753d5,
as part of the discussions about how Node would handle this concept
in the future.
The promise object was pulled from Node in version 0.1.30. As Ryan Dahl noted at
the time, the reasoning was:
Because many people (myself included) only want a low-level interface to file system
operations that does not necessitate creating an object, while many other people want
something like promises but different in one way or another. So instead of promises we’ll
use last argument callbacks and consign the task of building better abstraction layers to
user libraries.
Rather than the promise object, Node incorporated the last argument callbacks we’ve
used in previous chapters. All asynchronous methods feature a callback function as the
last argument. The first argument in this callback function is always an error object.
To demonstrate the fundamental structure of the callback functionality, Example 5-2
is a complete Node application that creates an object with one method, doSomething.
This method takes three arguments, the second of which must be a string, and the third
being the callback. In the method, if the second argument is missing or is not a string,
the object creates a new Error object, which is passed to the callback function. Other-
wise, whatever the result of the method is gets passed to the callback function.
Example 5-2. The fundamental structure of the last callback functionality
var obj = function() { };
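// sketch of the elided method definition: the callback is whatever
// function arrives as the last argument
obj.prototype.doSomething = function(arg1, arg2_) {
  var callback = typeof arguments[arguments.length - 1] === 'function' ?
      arguments[arguments.length - 1] : null;
  var arg2 = typeof arg2_ === 'string' ? arg2_ : null;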
if (!arg2)
return callback(new Error('second argument missing or not a string'));
callback(null, arg1);
}
var test = new obj();
try {
test.doSomething('test', 3.55, function(err,value) {
if (err) throw err;
console.log(value);
});
} catch(err) {
  console.error(err);
}
Several key elements make up the callback functionality in the code.
The first key functionality is to ensure the last argument is a callback function. Well,
we can’t determine the user’s intent, but we can make sure the last argument is a func-
tion, and that will have to do. The second key functionality is to create the new Node
Error object if an error occurs, and return it as the result to the callback function. The
last critical functionality is to invoke the callback function, passing in the method’s
result if no error occurs. In short, everything else is changeable, as long as these three
key functionalities are present:
• Ensure the last argument is a function.
• Create a Node Error and return it if an error occurs.
• If no error occurs, invoke the callback function, passing the method’s result.
With the existing code in Example 5-2, the application output is the following error
message printed out to the console:
[Error: second argument missing or not a string]
Changing the method call to pass a string as the second argument:
test.doSomething('test', 'test', function(err,value) {
results in test being printed out to the console. Changing it then to the following:
test.doSomething('test',function(err,value) {
again results in an error, this time because the second argument is missing.
If you look through the code in the lib directory of the Node installation, you’ll see the
last callback pattern repeated throughout. Though the functionality may change, this
pattern remains the same.
This approach is quite simple and ensures consistent results from asynchronous meth-
ods. However, it also creates its own unique challenges, as we’ll cover in the next
section.
Consider a simple example: read in a text file, replace every apple with orange, and write the result out. Done with the synchronous File System functions, it's straightforward:
var fs = require('fs');
try {
var data = fs.readFileSync('./apples.txt','utf8');
console.log(data);
var adjData = data.replace(/[A|a]pple/g,'orange');
fs.writeFileSync('./oranges.txt', adjData);
} catch(err) {
console.error(err);
}
Since problems can occur and we can’t be sure errors are handled internally in any
module function, we wrap all of the function calls in a try block to allow for graceful
—or at least, more informative—exception handling. The following is an example of
what the error looks like when the application can’t find the file to read:
{ [Error: ENOENT, no such file or directory './apples.txt']
errno: 34,
code: 'ENOENT',
path: './apples.txt',
syscall: 'open' }
While perhaps not very user-friendly, at least it’s a lot better than the alternative:
node.js:201
throw e; // process.nextTick error, or 'error' event on first tick
^
Error: ENOENT, no such file or directory './apples.txt'
at Object.openSync (fs.js:230:18)
at Object.readFileSync (fs.js:120:15)
at Object.<anonymous> (/home/examples/public_html/node/read.js:3:18)
at Module._compile (module.js:441:26)
at Object..js (module.js:459:10)
at Module.load (module.js:348:31)
at Function._load (module.js:308:12)
at Array.0 (module.js:479:10)
at EventEmitter._tickCallback (node.js:192:40)
In the example, we’re going to have expected results because each function call is
performed in sequence.
Example 5-4 performs the same operations, this time with the asynchronous versions of the functions:
try {
fs.readFile('./apples2.txt','utf8', function(err,data) {
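    // sketch of the elided body: check the error, modify the data, write it out
    if (err) throw err;
    var adjData = data.replace(/[A|a]pple/g,'orange');
    fs.writeFile('./oranges.txt', adjData, function(err) {
      if (err) throw err;
    });
  });
} catch(err) {
  console.error(err);
}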
In Example 5-4, the input file is opened and read, and only when both actions are
finished does the callback function passed as the last parameter get called. In this func-
tion, the error is checked to make sure it’s null. If not, the error is thrown for catching
in the outer exception-handling block.
If no error occurs, the data is processed and the asynchronous writeFile method is
called. Its callback function has only one parameter, the error object. If it’s not null,
it’s thrown for handling in the outer exception block.
If an error occurred, it would look similar to the following:
/home/examples/public_html/node/read2.js:11
if (err) throw err;
^
Error: ENOENT, no such file or directory './boogabooga/oranges.txt'
Including another sequential function call adds another level of callback nesting. In
Example 5-5, we access a listing of files for a directory. In each of the files, we replace
a generic domain name with a specific domain name using the string replace method,
and the result is written back to the original file. A log is maintained of each changed
file, using an open write stream.
Example 5-5. Retrieving directory listing for files to modify
var fs = require('fs');

var writeStream = fs.createWriteStream('./log.txt', {flags: 'a'});

try {
  // get list of files
  fs.readdir('./data/', function(err, files) {
    files.forEach(function(name) {
      // modify contents
      fs.readFile('./data/' + name,'utf8', function(err,data) {
        var adjData = data.replace(/somecompany\.com/g,'burningbird.net');
        // write to file
        fs.writeFile('./data/' + name, adjData, function(err) {
          // log write
          writeStream.write('changed ' + name + '\n', 'utf8', function(err) {
            if (err) throw err;
          });
        });
      });
    });
  });
} catch(err) {
  console.error(err);
}
The order of the log entries varies with each run; the following entries come from a couple of runs of the application:
changed data3.txt
changed data1.txt
changed data5.txt
changed data2.txt
changed data4.txt
changed data1.txt
changed data3.txt
changed data5.txt
changed data4.txt
changed data2.txt
Another issue arises if you want to check when all of the files have been modified in
order to do something. The forEach method invokes the iterator callback functions
asynchronously, so it doesn’t block. Adding a statement following the use of forEach,
like the following:
console.log('all done');
doesn’t really mean the application is all finished, just that the forEach method didn’t
block. If you add a console.log statement at the same time you log the changed file:
writeStream.write('changed ' + name + '\n', 'utf8', function(err) {
  if (err) throw err;
  console.log('all done');
});
You’d then get the expected result: an “all done” message displays after all the files
have been updated.
The application works quite well—except if the directory we’re accessing has subdir-
ectories as well as files. If the application encounters a subdirectory, it spits out the
following error:
/home/examples/public_html/node/example5.js:20
if (err) throw err;
^
Error: EISDIR, illegal operation on a directory
Example 5-6 prevents this type of error by using the fs.stat method to return an object
representing the data from a Unix stat command. This object contains information
about the directory object, including whether it's a file or not. The fs.stat method is, of course,
another asynchronous method, requiring yet more callback nesting.
Example 5-6. Adding in a stats check of each directory object to make sure it’s a file
var fs = require('fs');

var writeStream = fs.createWriteStream('./log.txt', {flags: 'a'});

try {
  // get list of files
  fs.readdir('./data/', function(err, files) {
    files.forEach(function(name) {
      // check to see if the object is a file
      fs.stat('./data/' + name, function(err, stats) {
        if (err) throw err;
        if (stats.isFile())
          // modify contents
          fs.readFile('./data/' + name,'utf8', function(err,data) {
            var adjData = data.replace(/somecompany\.com/g,'burningbird.net');
            // write to file
            fs.writeFile('./data/' + name, adjData, function(err) {
              // log write
              writeStream.write('changed ' + name + '\n', 'utf8',
                function(err) {
                  if(err) throw err;
                });
            });
          });
      });
    });
  });
} catch(err) {
  console.error(err);
}
Again, the application performs its purpose, and performs it well—but how difficult it
is to read and maintain! I’ve heard this type of nested callback called callback
spaghetti and the even more colorful pyramid of doom, both of which are apt terms.
The nested callbacks continue to push against the right side of the document, making
it more difficult to ensure we have the right code in the right callback. However, we
can’t break the callback nesting apart because it’s essential that the methods be called
in turn:
1. Start the directory lookup.
2. Filter out subdirectories.
3. Read each file’s contents.
4. Modify the contents.
5. Write back to the original file.
What we’d like to do is find a way of implementing this series of method calls but
without having to depend on nested callbacks. For this, we need to look at third-party
modules that provide asynchronous control flow.
Step
Step is a focused utility module that enables simplified control flow for serial and par-
allel execution. It can be installed using npm as follows:
npm install step
The Step module exports exactly one object. To use the object for serial execution,
wrap your asynchronous function calls within functions that are then passed as pa-
rameters to the object. For instance, in Example 5-7, Step is used to read the contents
of a file, modify the contents, and write them back to the file.
Example 5-7. Using Step to perform serial asynchronous tasks
var fs = require('fs'),
Step = require('step');
try {
Step (
function readData() {
fs.readFile('./data/data1.txt', 'utf8', this);
},
function modify(err, text) {
if (err) throw err;
return text.replace(/somecompany\.com/g,'burningbird.net');
},
function writeData(err, text) {
if (err) throw err;
fs.writeFile('./data/data1.txt', text, this);
}
);
} catch(err) {
console.error(err);
}
The first function in the Step sequence, readData, reads a file's contents into a string,
which is then passed to a second function. The second function modifies the string,
and passes the result to a third, which writes it back to the file.
In more detail, the first function wraps the asynchronous fs.readFile. However, rather
than pass a callback function as the last parameter, the code passes the this context.
When the function is finished, its data and any possible error are sent to the next func-
tion, modify. The modify function isn’t an asynchronous function, as all it’s doing is
replacing one substring for another in the string. It doesn’t require the this context,
and just returns the result at the end of the function.
The last function gets the newly modified string and writes it back to the original file.
Again, since it’s an asynchronous function, it gets this in place of the callback function.
If we didn’t include this as the last parameter to the final function, any errors that occur
wouldn’t be thrown and caught in the outer loop. If the boogabooga subdirectory didn’t
exist with the following modified code:
function writeFile(err, text) {
if (err) throw err;
fs.writeFile('./boogabooga/data/data1.txt');
}
Step can also handle the case where the same asynchronous function must be called for every member of a collection, gathering the results for the next function in the sequence via its group() method. The following uses group() to modify all the files in a directory:
var fs = require('fs'),
    Step = require('step'),
    _dir = './data/',
    files;
try {
Step (
function readDir() {
fs.readdir(_dir, this);
},
function readFile(err, results) {
if (err) throw err;
files = results;
var group = this.group();
results.forEach(function(name) {
fs.readFile(_dir + name, 'utf8', group());
});
},
function writeAll(err, data) {
if (err) throw err;
for (var i = 0; i < files.length; i++) {
var adjdata = data[i].replace(/somecompany\.com/g,'burningbird.net');
fs.writeFile(_dir + files[i], adjdata, 'utf8',this);
}
}
);
} catch(err) {
console.log(err);
}
To preserve the filenames, the readdir result is assigned to a global variable, files. In
the last Step function, a regular for loop cycles through the data to modify it, and then
cycles through the files variable to get the filename. Both the filename and modified
data are used in the last asynchronous call to writeFile.
One other approach we could have used if we wanted to hardcode the change to each
file is to use the Step parallel feature. Example 5-9 performs a readFile on a couple
of different files, passing in this.parallel() as the last parameter. This results in a
parameter being passed to the next function for each readFile in the first function. The
parallel function call also has to be used in the writeFile function in the second func-
tion, to ensure that each callback is processed in turn.
Example 5-9. Reading and writing to a group of files using Step's parallel feature
var fs = require('fs'),
Step = require('step'),
files;
try {
Step (
function readFiles() {
fs.readFile('./data/data1.txt', 'utf8',this.parallel());
fs.readFile('./data/data2.txt', 'utf8',this.parallel());
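      },
      // sketch of the elided second function: one argument arrives per parallel
      // callback, in call order; each write also passes this.parallel()
      function writeFiles(err, data1, data2) {
        if (err) throw err;
        data1 = data1.replace(/somecompany\.com/g,'burningbird.net');
        data2 = data2.replace(/somecompany\.com/g,'burningbird.net');
        fs.writeFile('./data/data1.txt', data1, 'utf8', this.parallel());
        fs.writeFile('./data/data2.txt', data2, 'utf8', this.parallel());
      }
   );
} catch(err) {
   console.log(err);
}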
It works, but it’s clumsy. It would be better to reserve the use of the parallel functionality
for a sequence of different asynchronous functions that can be implemented in parallel,
and the data processed post-callback.
As for our earlier application, rather than trying to force Step into contortions to fit our
use case, we can use another library that provides the additional flexibility we need:
Async.
Async
The Async module provides functionality for managing collections, such as its own
variation of forEach, map, and filter. It also provides some utility functions, including
ones for memoization. However, what we’re interested in here are its facilities for han-
dling control flow.
As mentioned earlier, Async provides control flow capability for a variety of asynchro-
nous patterns, including serial, parallel, and waterfall. Like Step, it gives us a tool
to tame the wild nested callback pyramid, but its approach is quite different. For one,
we don’t insert ourselves between each function and its callback. Instead, we incorpo-
rate the callback as part of the process.
As an example, we’ve already identified that the pattern of the earlier application
matches with Async’s waterfall, so we’ll be using the async.waterfall method. In
Example 5-10, I used async.waterfall to open and read a data file using fs.readFile,
perform the synchronous string substitution, and then write the string back to the file
using fs.writeFile.
Example 5-10. Using async.waterfall to read, modify, and write a file's contents
var fs = require('fs'),
    async = require('async');
try {
async.waterfall([
function readData(callback) {
fs.readFile('./data/data1.txt', 'utf8', function(err, data){
callback(err,data);
});
},
function modify(text, callback) {
var adjdata=text.replace(/somecompany\.com/g,'burningbird.net');
callback(null, adjdata);
},
function writeData(text, callback) {
fs.writeFile('./data/data1.txt', text, function(err) {
callback(err,text);
});
}
], function (err, result) {
if (err) throw err;
console.log(result);
});
} catch(err) {
console.log(err);
}
The async.waterfall method takes two parameters: an array of tasks and an optional
final callback function. Each asynchronous task function is an element of the
async.waterfall method array, and each function requires a callback as the last of its
parameters. It is this callback function that allows us to chain the asynchronous call-
back results without having to physically nest the functions. However, as you can see
in the code, each function’s callback is handled as we would normally handle it if we
were using nested callbacks—other than the fact that we don’t have to test the errors
in each function. The callbacks look for an error object as first parameter. If we pass
an error object in the callback function, the process is ended at this point, and the final
callback routine is called. The final callback is when we can test for an error, and throw
the error to the outer exception handling block (or otherwise handle).
The readData function wraps our fs.readFile call. Rather than checking for an error
itself, it passes the error (or null) and the data to the callback as its last operation.
This is the trigger that tells Async to invoke the next function, passing along any
relevant data. The next function isn't asynchronous, so it does its processing, passing
null as the error object along with the modified data. The last function, writeData,
calls the asynchronous fs.writeFile, using the data passed in from the previous callback
and forwarding any error via its own callback routine.
The processing is very similar to what we had in Example 5-4, but without the nesting
(and having to test for an error in each function). It may seem more complicated than
what we had in Example 5-4, and I wouldn’t necessarily recommend its use for such
simple nesting, but look what it can do with a more complex nested callback. Exam-
ple 5-11 duplicates the exact functionality from Example 5-6, but without the callback
nesting and excessive indenting.
Example 5-11. Getting objects from a directory, testing for files, reading each file, modifying it,
writing it back out, and logging the results
var fs = require('fs'),
async = require('async'),
_dir = './data/';
Every last bit of functionality from Example 5-6 is present. The fs.readdir method is
used to get an array of directory objects. The native JavaScript forEach method (not the
Async forEach) is used to access each specific object. The fs.stat method is used to get
the stats for each object. The stats are used to check for files, and when a file is found,
it's opened and its data accessed. The data is then modified and passed on to be written
back to the file via fs.writeFile. The operation is logged in the logfile and also echoed
to the console.
Note that there is more data passed in some of the callbacks. Most of the functions
need the filename as well as the text, so this is passed in the last several methods. Any
amount of data can be passed in the methods, as long as the first parameter is the error
object (or null, if no error object) and the last parameter in each function is the callback
function.
We don’t have to check for an error in each asynchronous task function either, because
Async tests the error object in each callback, and stops processing and calls the final
callback function if an error is found. And we don’t have to worry about using special
processing when handling an array of items, as we did when we used Step earlier in the
chapter.
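The full listing for Example 5-11 isn't reproduced here. A sketch of its waterfall, based
on the description above (the function names are illustrative, and it processes only the
first directory entry rather than looping over all of them), looks like this:
var fs = require('fs'),
    async = require('async'),
    _dir = './data/';

async.waterfall([
   function readDir(callback) {
      fs.readdir(_dir, callback);
   },
   function checkFile(files, callback) {
      // illustrative: check just the first directory entry
      var name = files[0];
      fs.stat(_dir + name, function (err, stats) {
         callback(err, name, stats);
      });
   },
   function readData(name, stats, callback) {
      if (!stats.isFile()) return callback(null, name, null);
      fs.readFile(_dir + name, 'utf8', function (err, data) {
         callback(err, name, data);
      });
   },
   function modify(name, text, callback) {
      if (!text) return callback(null, name, null);
      callback(null, name,
         text.replace(/somecompany\.com/g, 'burningbird.net'));
   },
   function writeData(name, text, callback) {
      if (!text) return callback(null, name);
      fs.writeFile(_dir + name, text, function (err) {
         callback(err, name);
      });
   }
], function (err, name) {
   if (err) return console.error(err);
   console.log('modified ' + name);
});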
The other Async control flow methods, such as async.parallel and async.series, perform
in a like manner, with an array of tasks as the first method parameter, and a final
optional callback as the second. How they process the asynchronous tasks differs,
though, as you would expect.
The async.parallel method calls all of the asynchronous functions at once, and when
they are each finished, calls the optional final callback. Example 5-12 uses
async.parallel to read in the contents of three files in parallel. However, rather than
an array of tasks, this example uses Async's alternative syntax and passes in an object
whose properties name each task:
try {
async.parallel({
data1 : function (callback) {
fs.readFile('./data/data1.txt', 'utf8', function(err, data){
callback(err,data);
});
},
data2 : function (callback) {
fs.readFile('./data/data2.txt', 'utf8', function(err, data){
callback(err,data);
});
},
data3 : function readData3(callback) {
fs.readFile('./data/data3.txt', 'utf8', function(err, data){
callback(err,data);
});
}
}, function (err, result) {
   if (err) throw err;
   console.log(result);
});
} catch(err) {
   console.log(err);
}
The results are returned in a single object, with each file's contents tied to the
matching property. If the three data files in the example had the following content:
• data1.txt: apples
• data2.txt: oranges
• data3.txt: peaches
the result of running Example 5-12 is:
{ data1: 'apples\n', data2: 'oranges\n', data3: 'peaches\n' }
I’ll leave the testing of the other Async control flow methods as a reader exercise. Just
remember that when you're working with the Async control flow methods, all you need
to do is pass a callback to each asynchronous task, and call that callback when you're
finished, passing in an error object (or null) and whatever data you need.
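To get started on that exercise, here's a minimal async.series sketch (not one of the
book's examples): two file reads run one after the other, and their results arrive in
a single array:
var fs = require('fs'),
    async = require('async');

async.series([
   function (callback) {
      fs.readFile('./data/data1.txt', 'utf8', callback);
   },
   function (callback) {
      fs.readFile('./data/data2.txt', 'utf8', callback);
   }
], function (err, results) {
   if (err) throw err;
   // results[0] is data1's contents, results[1] is data2's
   console.log(results);
});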
One helpful Node.js style guide is Felix's Node.js Style Guide, at http://nodeguide.com/style.html.
Routing Traffic, Serving Files, and Middleware
Click a link in a web page, and you expect something to happen. That something is
typically a page being loaded. However, there’s actually a lot that goes on before that
web resource loads—some of which is mostly out of our control (such as packet rout-
ing), and some of which is dependent on us having software installed that understands
how to respond based on the link’s contents.
Of course, when we use web servers such as Apache, and software such as Drupal,
much of the mechanics of serving a file or a resource are handled behind the scenes.
However, when we’re creating our own server-side applications in Node and bypassing
our usual technology, we have to get more involved in ensuring that the right resource
gets delivered at the right time.
This chapter focuses on the technology available to Node developers for providing the
very basic routing and middleware functionality we need to ensure that resource A gets
delivered to user B correctly and quickly.
Creating a static file server from scratch requires the following steps:
1. Create an HTTP server and listen for requests.
2. When a request arrives, parse the request URL to determine the location of the file.
3. Check that the file exists.
4. If the file doesn't exist, respond accordingly.
5. If the file does exist, open the file for reading.
6. Prepare a response header.
7. Write the file to the response.
8. Wait for the next request.
Creating an HTTP server and reading files requires the HTTP and File System modules.
The Path module will also come in handy, because it has a way of checking to make
sure a file exists before trying to open it for reading. In addition, we’ll want to define a
global variable for the base directory, or use the predefined __dirname (more on this in
the upcoming sidebar “Why Not Use __dirname?” on page 110).
The top of the application has the following code at this point:
var http = require('http'),
path = require('path'),
fs = require('fs'),
base = '/home/examples/public_html';
Creating a server using the HTTP module isn’t anything new. And the application can
get the document requested by directly accessing the HTTP request object’s url prop-
erty. To double-check the response compared to requests, we'll also throw in a
console.log of the requested file's pathname. This is in addition to the console.log
message that's written when the server is first started:
http.createServer(function (req, res) {

   var pathname = base + req.url;
   console.log(pathname);

   // processing of the request goes here

}).listen(8124);

console.log('Server running at 8124/');
Before attempting to open the file for reading and writing to the HTTP response, the
application needs to check that it exists. The path.exists function is a good choice at
this point. If the file doesn’t exist, write a brief message to this effect and set the status
code to 404: document not found.
path.exists(pathname, function(exists) {
if (exists) {
// insert code to process request
} else {
res.writeHead(404);
res.write('Bad request 404\n');
res.end();
}
});
Now we’re getting into the meat of the new application. In examples in previous chap-
ters, we used fs.readFile to read in a file. The problem with fs.readFile, though, is
that it wants to read the file completely into memory before making it available.
The path.exists method has been deprecated in Node 0.8. Instead, use
fs.exists. The example files referenced in the preface include applications
that support both environments.
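With Node 0.8, only the method name changes:
fs.exists(pathname, function(exists) {
   // same processing as with path.exists
});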
Instead of using fs.readFile, the application creates a read stream via the
fs.createReadStream method, using the default settings. Then it's a simple matter to just pipe the
file contents directly to the HTTP response object. Since the stream sends an end signal
when it’s finished, we don’t need to use the end method call with the read stream:
res.setHeader('Content-Type', 'text/html');

// create and pipe the readable stream
var file = fs.createReadStream(pathname);
file.on('open', function() {
   file.pipe(res);
});
The read stream has two events of interest: open and error. The open event is sent when
the stream is ready, and the error if a problem occurs. The application calls the pipe
method in the callback function for the open event.
At this point, the static file server looks like the application in Example 6-1.
Example 6-1. A simple static file web server
var http = require('http'),
    path = require('path'),
    fs = require('fs'),
    base = '/home/examples/public_html';

http.createServer(function (req, res) {

   var pathname = base + req.url;
   console.log(pathname);

   path.exists(pathname, function(exists) {
      if (!exists) {
         res.writeHead(404);
         res.write('Bad request 404\n');
         res.end();
      } else {
         res.setHeader('Content-Type', 'text/html');

         // create and pipe the readable stream
         var file = fs.createReadStream(pathname);
         file.on('open', function() {
            file.pipe(res);
         });
         file.on('error', function(err) {
            console.log(err);
         });
      }
   });
}).listen(8124);

console.log('Server running at 8124/');
I tested it with a simple HTML file, which has nothing more than an img element, and
the file loaded and displayed properly:
<!DOCTYPE html>
<head>
<title>Test</title>
<meta charset="utf-8" />
</head>
<body>
<img src="./phoenix5a.png" />
</body>
I then tried it with another example file I had, which contained an HTML5 video
element:
<!DOCTYPE html>
<head>
<title>Video</title>
<meta charset="utf-8" />
</head>
<body>
<video id="meadow" controls>
<source src="videofile.mp4" />
<source src="videofile.ogv" />
<source src="videofile.webm" />
</video>
</body>
Though the file would open and the video displayed when I tried the page with Chrome,
the video element did not work when I tested the page with Internet Explorer 10.
Looking at the console output provided the reason why:
Server running at 8124/
/home/examples/public_html/html5media/chapter1/example2.html
/home/examples/public_html/html5media/chapter1/videofile.mp4
/home/examples/public_html/html5media/chapter1/videofile.ogv
/home/examples/public_html/html5media/chapter1/videofile.webm
The application has to be modified to test for the file extension for each file and then
return the appropriate MIME type in the response header. We could code this func-
tionality ourselves, but I’d rather make use of an existing module: node-mime.
You can install node-mime using npm: npm install mime. The GitHub
site is at https://github.com/broofa/node-mime.
The node-mime module can return the proper MIME type given a filename (with or
without path), and can also return file extensions given a content type. The node-mime
module is added to the requirements list like so:
mime = require('mime');
The returned content type is used in the response header, and also output to the con-
sole, so we can check the value as we test the application:
// content type
var type = mime.lookup(pathname);
console.log(type);
res.setHeader('Content-Type', type);
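A quick check in the Node REPL shows what node-mime returns (the filenames here are
illustrative):
var mime = require('mime');

console.log(mime.lookup('videofile.mp4'));   // video/mp4
console.log(mime.lookup('example1.html'));   // text/html
console.log(mime.extension('text/html'));    // html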
Now when we access the file with the video element in IE10, the video file works.
What doesn’t work, though, is when we access a directory instead of a file. When this
happens, an error is output to the console, and the web page remains blank for the user:
{ [Error: EISDIR, illegal operation on a directory] errno: 28, code: 'EISDIR' }
We not only need to check if the resource being accessed exists, but we also need to
check whether it’s a file or a directory. If it’s a directory being accessed, we can either
display its contents, or we can output an error—it’s the developer’s choice.
The fs.stat method handles both checks in its one callback: an error for a nonexistent
resource, and stats.isFile() to distinguish files from directories. Example 6-2 shows
the finished server.
Example 6-2. Final version of the static file server
var http = require('http'),
    fs = require('fs'),
    mime = require('mime'),
    base = '/home/examples/public_html';

http.createServer(function (req, res) {

   var pathname = base + req.url;
   console.log(pathname);

   fs.stat(pathname, function(err, stats) {
      if (err) {
         res.writeHead(404);
         res.write('Bad request 404\n');
         res.end();
      } else if (stats.isFile()) {

         // content type
         var type = mime.lookup(pathname);
         console.log(type);
         res.setHeader('Content-Type', type);

         // create and pipe the readable stream
         var file = fs.createReadStream(pathname);
         file.on('open', function() {
            file.pipe(res);
         });
         file.on('error', function(err) {
            console.log(err);
         });
      } else {
         res.writeHead(403);
         res.write('Directory access is forbidden');
         res.end();
      }
   });
}).listen(8124);
console.log('Server running at 8124/');
When accessing a web page that contains both image and video file links, the console
output lists each requested pathname along with its content type. Note the proper
handling of the content types. Figure 6-1 shows one web page that
contains a video element loaded into Chrome, and the network access displayed in the
browser’s console.
Figure 6-1. Displaying the browser console while loading a web page served by the simple static file
server from Example 6-2
You get a better feel for how the read stream works when you load a page that has a
video element and begin to play it. The browser grabs the read stream output at a speed
it can manage, filling its own internal buffer, and then pauses the output. If you close
the server while the video content is playing, the video continues to play...up to the
point where it exhausts its current video buffer. The video element then goes blank
because the read stream is no longer available. It’s actually a little fascinating to see
how well everything works with so little effort on our part.
Though the application works when tested with several different documents, it’s not
perfect. It doesn’t handle many other types of web requests, it doesn’t handle security
or caching, and it doesn’t properly handle the video requests. One web page application
I tested that uses HTML video also makes use of the HTML5 video element API to
output the state of the video load process. This application didn’t get the information
it needs to work as designed.
There are many little gotchas that can trip us when it comes to creating a static file
server. Another approach is to use an existing static file server. In the next section, we’ll
look at one included in the Connect middleware.
Middleware
What is middleware? That’s a good question, and one that, unfortunately, doesn’t have
a definitive answer.
Generally, middleware is software that exists between you, as the developer, and the
underlying system. By system, we can mean either the operating system, or the under-
lying technology, such as we get from Node. More specifically, middleware inserts itself
into the communication chain between your application and the underlying system—
hence its rather descriptive name.
Among the middleware options usable with Node are JSGI and Connect. I'm covering only
Connect in this book, for three reasons. One, it's simpler to use. JSGI
would require us to spend too much time trying to understand how it works in general
(independent of its use with Node), whereas with Connect, we can jump right in. Two,
Connect provides middleware support for Express, a very popular framework (covered
in Chapter 7). Three, and perhaps most importantly, over time Connect has seemingly
floated to the top as best in breed. It’s the most used middleware if the npm registry is
any indication.
Connect Basics
You can install Connect using npm:
npm install connect
Connect is, in actuality, a framework in which you can use one or more middleware
applications. The documentation for Connect is sparse. However, it is relatively simple
to use once you’ve seen a couple of working examples.
Working with Alpha Modules
At the time I wrote the first draft of this chapter, the npm registry had the stable version
(1.8.5) of Connect, but I wanted to cover the development version, 2.x, since it will
most likely be the version you’ll be using.
I downloaded the source code for Connect 2.x directly from GitHub, and moved it into
my development environment's node_modules directory. I then changed to the Connect
directory and installed it using npm, but without specifying the module’s name, and
using the -d flag to install the dependencies:
npm install -d
You can use npm to install directly from the Git repository. You can also use Git directly
to clone the version and then use the technique I just described to install it.
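For instance, assuming the senchalabs/connect repository, either approach works:
npm install git://github.com/senchalabs/connect.git
or:
git clone git://github.com/senchalabs/connect.git
cd connect
npm install -d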
Be aware that if you install a module directly from source, and you perform an npm
update, npm will overwrite the module with what it considers to be the “latest” module
—even if you are using a newer version of the module.
In Example 6-3, I created a simple server application using Connect, and using two of
the middleware1 bundled with Connect: connect.logger and connect.favicon. The
logger middleware logs all requests to a stream (in this case, the default stdout
stream), and the favicon middleware serves up the favicon.ico file. The application
includes the middleware using the use method on the Connect request listener, which
is then passed as a parameter to the HTTP object’s createServer method.
Example 6-3. Incorporating the logger and favicon middleware into a Connect-based application
var connect = require('connect');
var http = require('http');

var app = connect()
   .use(connect.favicon())
   .use(connect.logger())
   .use(function(req, res) {
      res.end('Hello World\n');
   });

http.createServer(app).listen(8124);
You can use any number of middleware, either built in with Connect or provided by
a third party, just by including additional use statements.
Rather than create the Connect request listener first, we can also incorporate the Con-
nect middleware directly into the createServer method, as shown in Example 6-4.
1. Connect refers to the individual middleware options as just “middleware.” I follow its convention in this
chapter.
http.createServer(connect()
.use(connect.favicon())
.use(connect.logger())
.use(function(req,res) {
res.end('Hello World\n');
})).listen(8124);
Connect Middleware
Connect comes bundled with at least 20 middleware. I’m not going to cover them all
in this section, but I am going to demonstrate enough of them so that you have a good
understanding of how they work together.
connect.static
Earlier, we created a simplified static file server from scratch. Connect provides mid-
dleware that implements the functionality of that server, and more. It is extremely easy
to use—you just specify the connect.static middleware option, passing in the root
directory for all requests. The following implements most of what we created in Ex-
ample 6-2, but with far less code:
var connect = require('connect'),
http = require('http'),
__dirname = '/home/examples';
http.createServer(connect()
.use(connect.logger())
.use(connect.static(__dirname + '/public_html', {redirect: true}))
).listen(8124);
The connect.static middleware takes the root path as the first parameter, and an op-
tional object as the second. Among the options supported in the second object are:
maxAge
Browser cache in milliseconds: defaults to 0
hidden
Set to true to allow transfer of hidden files; default is false
redirect
Set to true to redirect to trailing / when the pathname is a directory
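For instance, to allow hidden files and cache static content in the browser for a day
(the values here are illustrative), pass the options as the second parameter:
.use(connect.static(__dirname + '/public_html',
     { maxAge : 86400000, hidden : true, redirect : true }))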
This short Connect middleware application represents a big difference in behavior from
the earlier scratch application. The Connect solution handles the browser cache, pro-
tects against malformed URLs, and more properly handles HTML5 video over HTTP, which
the server built from scratch could not. Its only shortcoming when compared to the
scratch server is that we have more control over error handling with the scratch server.
However, the connect.static middleware does provide the appropriate response and
status code to the browser.
The code just shown, and the earlier examples in the section, also demonstrate another
Connect middleware: connect.logger. We’ll discuss it next.
connect.logger
The logger middleware module logs incoming requests to a stream, set to stdout by
default. You can change the stream, as well as other options including buffer duration,
format, and an immediate flag that signals whether to write the log immediately or on
response.
There are several tokens with which you can build the format string, in addition to four
predefined formats you can use:
default
':remote-addr - - [:date] ":method :url HTTP/:http-version" :status :res[content-length] ":referrer" ":user-agent"'
short
':remote-addr - :method :url HTTP/:http-version :status :res[content-length] - :response-time ms'
tiny
':method :url :status :res[content-length] - :response-time ms'
dev
Concise output colored by response status for development use
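You can also assemble your own format from these tokens. For instance, this illustrative
string logs the remote address, method, URL, and status:
.use(connect.logger(':remote-addr :method :url :status'))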
The default format generates log entries like the following:
99.28.217.189 - - [Sat, 25 Feb 2012 02:18:22 GMT] "GET /example1.html HTTP/1.1" 304
- "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko)
Chrome/17.0.963.56 Safari/535.11"
99.28.217.189 - - [Sat, 25 Feb 2012 02:18:22 GMT] "GET /phoenix5a.png HTTP/1.1" 304
- "http://examples.burningbird.net:8124/example1.html"
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko)
Chrome/17.0.963.56 Safari/535.11"
99.28.217.189 - - [Sat, 25 Feb 2012 02:18:22 GMT] "GET /favicon.ico HTTP/1.1"
304 - "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko)
Chrome/17.0.963.56 Safari/535.11"
99.28.217.189 - - [Sat, 25 Feb 2012 02:18:28 GMT]
"GET /html5media/chapter2/example16.html HTTP/1.1" 304 - "-"
"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.11 (KHTML, like Gecko)
Chrome/17.0.963.56 Safari/535.11"
You can change both the format and the stream. For instance, to use the dev format and
direct the log entries to a file, create a writable stream and pass it in the logger
options:
var fs = require('fs'),
    connect = require('connect'),
    http = require('http');

var writeStream = fs.createWriteStream('./log.txt',
      {flags : 'a', encoding : 'utf8'});

http.createServer(connect()
.use(connect.logger({format : 'dev', stream : writeStream }))
.use(connect.static(__dirname + '/public_html'))
).listen(8124);
While not as informative, this is a handy way of checking request state and load times.
connect.cookieParser and connect.cookieSession
The connect.cookieParser middleware parses cookies out of the incoming request, and
connect.cookieSession provides session support on top of it. The following application
parses the request's cookies and logs the username value from them:
var app = connect()
.use(connect.logger('dev'))
.use(connect.cookieParser())
.use(function(req, res, next) {
console.log('tracking ' + req.cookies.username);
next();
})
.use(connect.static('/home/examples/public_html'));
http.createServer(app).listen(8124);
console.log('Server listening on port 8124');
I’ll get into the use of the anonymous function, and especially the purpose of next, in
the section "Custom Connect Middleware" on page 118. Focusing for now on
connect.cookieParser, we see that this middleware intercepts the incoming request, pulls
the cookie data out of the header, and stores the data in the request object. The anony-
mous function then accesses the username data from the cookies object, outputting it
to the console.
To create an HTTP response cookie, we pair connect.cookieParser with
connect.cookieSession, which provides secure session persistence. A secret key for the
session data is passed as a string to the connect.cookieParser function. The data is
added directly to the session object. To clear the session data, set the session object
to null.
Example 6-7 creates two functions—one to clear the session data, and one to output a
tracking message—that are used as middleware for incoming requests. They’re added
as middleware in addition to logger, cookieParser, cookieSession, and static. The user
is prompted for his or her username in the client page, which is then used to set a request
cookie. On the server, the username and the number of resources the person has ac-
cessed in the current session are persisted via an encrypted response cookie.
Example 6-7. Using a session cookie to track resource accesses
var connect = require('connect')
, http = require('http');
// clear session data if the user accesses /clear
// (middleware bodies sketched from the chapter's description)
function clearSession(req, res, next) {
   if ('/clear' == req.url) {
      req.session = null;
      res.statusCode = 302;
      res.setHeader('Location', '/');
      res.end();
   } else {
      next();
   }
}

// track user: log the username and a count of resources accessed
function trackUser(req, res, next) {
   req.session.ct = req.session.ct || 0;
   console.log(req.cookies.username + ' accessed ' +
      req.session.ct++ + ' resources this session');
   next();
}

// cookie parsing and session support, plus the custom middleware
var app = connect()
   .use(connect.logger('dev'))
   .use(connect.cookieParser('secret'))
   .use(connect.cookieSession())
   .use(clearSession)
   .use(trackUser);

// static server
app.use(connect.static('/home/examples/public_html'));
// start server and listen
http.createServer(app).listen(8124);
console.log('Server listening on port 8124');
Figure 6-2 shows a web page accessed through the server application in Example 6-7.
The JavaScript console is open to display both cookies. Note that the response cookie,
unlike the request, is encrypted.
Figure 6-2. JavaScript console open in Chrome, displaying request and response cookies
The number of documents the user accesses is tracked, either until the user accesses
the /clear URL (in which case the session object is set to null) or closes the browser,
ending the session.
Example 6-7 also made use of a couple of custom middleware functions. In the next
(and final) section on Connect, we’ll discuss how these work with Connect, and how
to create a third-party middleware.
Custom Connect Middleware
The reason I cover connect.favicon, other than its usefulness, is that it's one of the
simplest middleware, and therefore easy to reverse engineer.
The source code for connect.favicon, especially when compared with that of other
middleware, shows that all exported middleware return a function with the following
minimum signature or profile:
return function(req, res, next)
The next callback, passed as the last parameter to the function, is called if the middle-
ware does not process the current request, or doesn’t process it completely. The next
callback is also called if the middleware has an error, and an error object is returned
as the parameter, as shown in Example 6-8.
Example 6-8. The favicon Connect middleware
module.exports = function favicon(path, options){
var options = options || {}
, path = path || __dirname + '/../public/favicon.ico'
, maxAge = options.maxAge || 86400000;
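Generalizing from favicon, a middleware module reduces to the following skeleton
(a sketch; myMiddleware and the URL test are placeholders):
module.exports = function myMiddleware(options) {
   options = options || {};
   return function(req, res, next) {
      if (req.url === '/handled-by-me') {
         // this middleware handles the request completely
         res.end('handled\n');
      } else {
         // otherwise, pass control on down the chain
         next();
      }
   };
};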
The next callback is, of course, how the chained functions are called, in sequence. For
an incoming request, if the middleware can completely handle the request (such as a
request for favicon.ico), no further middleware are invoked. This is why you would
include the connect.favicon middleware before connect.logger in your applications:
to prevent requests for favicon.ico from cluttering up the logs:
http.createServer(connect()
.use(connect.favicon('/public_html/favicon.ico'))
.use(connect.logger())
.use(connect.static(__dirname + '/public_html'))
).listen(8124);
You’ve seen how you can create a custom Connect middleware directly in the appli-
cation, and how a bundled Connect middleware looks, but how would you create a
third-party middleware that’s not going to be embedded directly in the application?
To create an external Connect middleware, create the module as you would any other
module, but make sure it has all the pieces that Connect requires—specifying the three
parameters (req, res, and next), and that it calls next if it doesn’t completely handle
the request.
Example 6-9 creates a Connect middleware that checks to see if the requested file exists
and that it is a file (not a directory). If the request is a directory, it returns a 403 status
code and a custom message. If the file doesn’t exist, it returns a 404 status code and,
again, a custom message. If neither happens, then it calls next to trigger the Connect
middleware into invoking the next function (in this case, connect.static).
Example 6-9. Creating a custom error handler middleware module
var fs = require('fs');
module.exports = function(path, missingmsg, directorymsg) {
return function customHandler(req, res, next) {
var pathname = path + req.url;
console.log(pathname);
fs.stat(pathname, function(err, stats) {
if (err) {
res.writeHead(404);
res.write(missingmsg);
res.end();
} else if (!stats.isFile()) {
res.writeHead(403);
res.write(directorymsg);
res.end();
} else {
next();
}
});
}
}
The custom Connect middleware can throw an error while it's being created, but if an
error occurs within the returned function, the error isn't thrown; instead, next is
called with the error object:
next(err);
The following code shows how we can use this custom middleware in an application:
var connect = require('connect'),
http = require('http'),
fs = require('fs'),
custom = require('./custom'),
base = '/home/examples/public_html';
http.createServer(connect()
.use(connect.favicon(base + '/favicon.ico'))
.use(connect.logger())
.use(custom(base, '404 File Not Found',
'403 Directory Access Forbidden'))
.use(connect.static(base))
).listen(8124);
Connect does have an errorHandler function, but it doesn’t serve the purpose we’re
trying to achieve. Rather, its purpose is to provide a formatted output of an exception.
You’ll see it in use with an Express application in Chapter 7.
There are several other bundled middleware, as well as a significant number of third-
party middleware you can use with Connect. In addition, Connect forms the middle-
ware layer for the Express application framework, discussed in Chapter 7. First, though,
let’s take a quick look at two other types of services necessary for many Node
applications: routers and proxies.
Routers
A router provides a way of extracting a service request from a URL and directing it to
the proper handler. The Node module I'm using for routing in this chapter is Crossroads.
The module provides an extensive and well-documented API, but I'm going to focus on
only three methods:
only on three different methods:
addRoute
Defines a new route pattern listener
parse
Parses a string and dispatches a match to the appropriate route
matched.add
Maps a route handler to a route match
We define a route using a regular expression that can contain curly brackets ({}) de-
limiting named variables that will be passed to the route handler function. For instance,
both of the following route patterns:
{type}/{id}
node/{id}
will match:
http://something.org/node/174
The difference is that a type parameter is passed to the route handler for the first pattern,
but not the second.
You can also use a colon (:) to denote optional segments. The following:
category/:type:/:id:
will match:
category/
category/tech/
category/history/143
To trigger the route handler, you parse the request:
crossroads.parse(request);
If the request matches any of the existing route handler functions, that function is called.
In Example 6-10, I created a simple application that looks for any given category, and
an optional publication and publication item. It prints out to the console the action
specified in the request.
Example 6-10. Using Crossroads to route a URL request into specific actions
var crossroads = require('crossroads'),
http = require('http');
crossroads.addRoute('/category/{type}/:pub:/:id:', function(type,pub,id) {
if (!id && !pub) {
console.log('Accessing all entries of category ' + type);
return;
} else if (!id) {
console.log('Accessing all entries of category ' + type +
' and pub ' + pub);
return;
} else {
console.log('Accessing item ' + id + ' of pub ' + pub +
' of category ' + type);
}
});
http.createServer(function(req,res) {
crossroads.parse(req.url);
res.end('processing');
}).listen(8124);
To match how something like Drupal works, with its combination of type of object
and identifier, Example 6-11 uses another Crossroads method, matched.add, to map a
route handler to a specific route.
Example 6-11. Mapping a route handler to a given route
var crossroads = require('crossroads'),
http = require('http');
function onTypeAccess(type,id) {
console.log('access ' + type + ' ' + id);
};
var typeRoute = crossroads.addRoute('/{type}/{id}');
typeRoute.matched.add(onTypeAccess);
http.createServer(function(req,res) {
crossroads.parse(req.url);
res.end('processing');
}).listen(8124);
Proxies
A proxy is a way of routing requests from several different sources through one server
for whatever reason: caching, security, even obscuring the originator of the request. As
an example, publicly accessible proxies have been used to get around restrictions on
some people's access to certain web content, by making it seem that a request originates
from someplace other than its actual origin. This type of proxy is also called a
forward proxy.
A reverse proxy is a way of controlling how requests are sent to a server. As an example,
you may have five servers, but you don’t want people directly accessing four of them.
Instead, you direct all traffic through the fifth server, which proxies the requests to the
other servers. Reverse proxies can also be used for load balancing, and to improve the
overall performance of a system by caching requests as they are made.
In Node, the most popular proxy module is http-proxy. This module provides all of
the proxy uses I could think of, and some I couldn’t. It provides forward and reverse
proxying, can be used with WebSockets, supports HTTPS, and can incorporate latency.
It’s used at the popular nodejitsu.com website, so, as the creators claim, it’s battle
hardened.
The simplest use of http-proxy is to create a standalone proxy server that listens for
incoming requests on one port, and proxies them to a web server listening on another:
var http = require('http'),
httpProxy = require('http-proxy');
httpProxy.createServer(8124, 'localhost').listen(8000);
All this simple application does is listen for requests on port 8000 and proxy them to
the HTTP server listening on port 8124.
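To try it, something must be listening at port 8124. Any of this chapter's servers will
do, or a minimal stand-in like the following:
var http = require('http');

http.createServer(function (req, res) {
   res.end('Hello from port 8124\n');
}).listen(8124);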
When I ran this application on my system and accessed port 8000 in a browser, the
output included the request headers, with the proxy-related fields plainly visible.
Notice the request cookie still hanging around from an earlier example?
You can also use http-proxy from the command line. In the bin directory, there is a
command-line application, which takes port, target, a configuration file, a flag to in-
dicate silencing the proxy log output, or -h (for help). To listen for requests in port
8000 and proxy to port 8124 on the localhost, use:
./node-http-proxy --port 8000 --target localhost:8124
It can’t get much simpler than this. If you want to run the proxy in the background,
attach the ampersand (&) to the end.
I’ll demonstrate some of the http-proxy capabilities with WebSockets and HTTPS later
in the book, but for now, we’ll pull together the technologies demonstrated in this
chapter—a static file server, the Connect middleware, the Crossroads router, and the
http-proxy proxy—to create one last example, so you can try a working application
that combines all these pieces.
In Example 6-12, I’m using the http-proxy to test for a dynamic incoming request (the
request URL starts with /node/). If a match is found, the router proxies the request to
one server, which uses the Crossroads router to parse out the relevant data. If the re-
quest isn’t for a dynamic resource, the proxy then routes the request to a static file
server that’s utilizing several Connect middleware, including logger, favicon, and
static.
Example 6-12. Combining Connect, Crossroads, and http-proxy to handle dynamic and static content
requests
var connect = require('connect'),
http = require('http'),
fs = require('fs'),
crossroads = require('crossroads'),
httpProxy = require('http-proxy'),
base = '/home/examples/public_html';
httpProxy.createServer(function(req,res,proxy) {
if (req.url.match(/^\/node\//))
proxy.proxyRequest(req, res, {
host: 'localhost',
port: 8000
});
else
proxy.proxyRequest(req,res, {
host: 'localhost',
port: 8124
});
}).listen(9000);
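The rest of Example 6-12 isn't reproduced here, but based on the chapter's earlier
examples, the two backend servers look roughly like this sketch:
// dynamic requests: Crossroads router listening on port 8000
crossroads.addRoute('/node/{id}', function(id) {
   console.log('accessed node ' + id);
});

http.createServer(function(req, res) {
   crossroads.parse(req.url);
   res.end('processing');
}).listen(8000);

// static requests: Connect middleware listening on port 8124
http.createServer(connect()
   .use(connect.favicon(base + '/favicon.ico'))
   .use(connect.logger('dev'))
   .use(connect.static(base))
).listen(8124);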
Accessing both dynamic (/node/) and static resources through the proxy at port 9000
results in the following console entries, as well as the proper response being returned
to the browser:
accessed node 345
GET /example1.html 304 3ms
GET /phoenix5a.png 304 1ms
accessed node 800
GET /html5media/chapter2/example14.html 304 1ms
GET /html5media/chapter2/bigbuckposter.jpg 304 1ms
I wouldn’t say we’re halfway to our own CMS (content management system), but we’re
getting the tools we need if we wanted to build one. But then, why build our own when
we can use Node-enabled frameworks (covered in the next chapter)?
The Express Framework
The Geddy.js site is at http://geddyjs.org/. Flatiron can be found at http://flatironjs.org/,
the Ember.js GitHub page is at https://github.com/emberjs/ember.js, and the primary CoreJS
site is at http://echo.nextapp.com/site/corejs. The Express GitHub page is at
https://github.com/visionmedia/express. You can find the Express documentation at
http://expressjs.com/.
Express: Up and Running
To get a feel for Express, the best first step is to use the command-line version of the
tool to generate an application. Since you’re never sure what an application will do,
you’ll want to run this application in a clean directory—not a directory where you have
all sorts of valuable stuff.
I named my new application site, which is simple enough:
express site
It also provides a helpful message to change to the site directory and run npm install:
npm install -d
Once the new application is installed, run the generated app.js file with node:
node app.js
It starts a server at port 3000. Accessing the application shows a web page with the
words:
Express
Welcome to Express
You’ve created your first Express application. Now let’s see what we need to do to make
it do something more interesting.
The app.js File in More Detail
The generated app.js requires the needed modules and creates the Express application
object before configuring it:
var express = require('express')
  , routes = require('./routes')
  , http = require('http');

var app = express();

app.configure(function(){
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
});
app.configure('development', function(){
app.use(express.errorHandler());
});
app.get('/', routes.index);
http.createServer(app).listen(3000);
From the top, the application includes three modules: Express, Node’s HTTP, and a
module just generated, routes. In the routes subdirectory, an index.js file has the fol-
lowing code:
/*
* GET home page.
*/

exports.index = function(req, res){
   res.render('index', { title: 'Express' });
};
A little digging in the code shows us that the Express response object’s render method
renders a given view with a set of options—in this case, a title of “Express.” I’ll cover
this more later in this chapter, in the section “Routing” on page 134.
The first app.configure call applies its settings in all environments, while the next
configure call ensures that its settings apply only in a development environment:
app.configure('development', function() { ... });
This mode can be any that you designate, and is controlled by an environmental vari-
able, NODE_ENV:
$ export NODE_ENV=production
or:
$ export NODE_ENV=ourproduction
You can use any term you want. By default, the environment is development.
To ensure that your application always runs in a specific mode, add a NODE_ENV export
to the user profile file.
We can also call the middleware as methods when we create the server:
var app = express.createServer(
express.logger(),
express.bodyParser()
);
The bodyParser middleware, like the other middleware, comes directly from Connect.
All Express does is re-export it.
I covered logger, favicon, and static in the previous chapter, but not bodyParser. This
middleware parses the incoming request body, converting it into request object prop-
erties. The methodOverride option also comes to Express via Connect, and allows Ex-
press applications to emulate full REST capability via a hidden form field named
_method.
The last configuration item is app.router. This optional middleware contains all the
defined routes and performs the lookup for any given route. If omitted, the first call to
app.get—app.post, etc.—mounts the routes instead.
Just as with Connect, the order of middleware is important. The favicon middleware
is called before logger, because we don’t want favicon.ico accesses cluttering the log.
The static middleware is included before bodyParser and methodOverride, because
neither of these is useful with the static pages—form processing occurs dynamically in
the Express application, not via a static page.
Error Handling
Express provides its own error handling, as well as access to the Connect errorHandler.
The Connect errorHandler provides a way of handling exceptions. It’s a development
tool that gives us a better idea of what’s happening when an exception occurs. You can
include it like you’d include other middleware:
app.use(express.errorHandler());
You can also generate HTML for an exception using the showStack flag:
app.use(express.errorHandler({showStack : true, dumpExceptions : true}));
To reiterate: this type of error handling is for development only—we definitely don’t
want our users to see exceptions. We do, however, want to provide more effective
handling for when pages aren’t found, or when a user tries to access a restricted
subdirectory.
One approach we can use is to add a custom anonymous function as the last middleware
in the middleware list. If none of the other middleware can process the request, it should
fall gracefully to this last function:
app.configure(function(){
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
app.use(function(req, res, next){
res.send('Sorry ' + req.url + ' does not exist');
});
});
In the next chapter, we’ll fine-tune the response by using a template to generate a nice
404 page.
We can use another form of error handling to capture thrown errors and process them
accordingly. In the Express documentation, this type of error handler is named
app.error, but it didn’t seem to exist at the time this book was written. However, the
function signature does work—a function with four parameters: error, request,
response, and next.
I added a second error handler middleware function and adjusted the 404 middleware
function to throw an error rather than process the error directly:
app.use(function(req, res, next){
   throw new Error(req.url + ' not found');
});
app.use(function(err, req, res, next) {
   console.log(err);
   res.send(err.message);
});
Now I can process the 404 error, as well as other errors, within the same function. And
again, I can use templates to generate a more attractive page.
You can also enable static caching with the staticCache middleware:
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.staticCache());
app.use(express.static(__dirname + '/public'));
If you’re using express.directory with routing, though, make sure that the directory
middleware follows the app.router middleware, or it could conflict with the routing.
You can also use third-party Connect middleware with Express. Use
caution, though, when combining it with routing.
Routing
The core of all the Node frameworks—in fact, many modern frameworks—is the con-
cept of routing. I covered a standalone routing module in Chapter 6, and demonstrated
how you can use it to extract a service request from a URL.
Express routing is managed using the HTTP verbs GET, PUT, DELETE, and POST. The
methods are named accordingly, such as app.get for GET and app.post for POST. In the
generated application, shown in Example 7-1, app.get is used to access the application
root ('/'), and is passed a request listener—in this instance, the routes index function
—to process the data request.
The routes.index function is simple:
exports.index = function(req, res){
res.render('index', { title: 'Express' });
};
It makes a call to the render method on the response object. The render method takes
the name of the file that provides the template. Since the application has already identified
the view engine:
app.set('view engine', 'jade');
it’s not necessary to provide an extension for the file. However, you could also use:
res.render('index.jade', { title: 'Express' });
You can find the template file in another generated directory named views. It has two
files: index.jade and layout.jade. index.jade is the template file referenced in the
render method, and has the following contents:
extends layout
block content
h1= title
p Welcome to #{title}
The index.jade file is what provides the content for the body defined in layout.jade.
I cover the use of Jade templates and CSS with Express applications in
Chapter 8.
// a sketch of the modified routes/index.js: write the content directly to
// the response and echo it to the console (reconstructed)
var fs = require('fs');

exports.index = function(req, res){
   fs.readFile(__dirname + '/../public/index.html', 'utf8',
      function(err, data) {
         if (err) {
            res.send('Problem reading file');
         } else {
            console.log(data);
            res.write(data);
            res.end();
         }
      });
};
Just as we hoped, the application now writes the content to the console as well as the
browser. This just demonstrates that, though we’re using an unfamiliar framework, it’s
all based on Node and functionality we’ve used previously. Of course, since this is a
framework, we know there has to be a better method than using res.write and
res.end. There is, and it’s discussed in the next section, which looks a little more closely
at routing paths.
Routing Path
The route, or route path, given in Example 7-1 is just a simple / (forward slash) signi-
fying the root address. Express compiles all routes to a regular expression object inter-
nally, so you can use strings with special characters, or just use regular expressions
directly in the path strings.
To demonstrate, I created a bare-bones routing path application in Example 7-2 that
listens for three different routes. If a request is made to the server for one of these routes,
the parameters from the request are returned to the sender using the Express response
object’s send method.
Example 7-2. Simple application to test different routing path patterns
var express = require('express')
  , http = require('http')
  , app = express();

app.configure(function(){
});

// a route defined with a regular expression: a single identifier, or a
// range of identifiers such as 1..5 (pattern reconstructed from the
// description that follows)
app.get(/^\/node\/?(?:(\d+)(?:\.\.(\d+))?)?\/?/, function(req, res) {
   res.send(req.params);
});
app.get('/content/*',function(req,res) {
res.send(req.params);
});
app.get("/products/:id/:operation?", function(req,res) {
console.log(req);
res.send(req.params.operation + ' ' + req.params.id);
});
http.createServer(app).listen(3000);
The regular expression is looking for a single identifier or a range of identifiers, given
as two values with a range indicator (..) between them. Anything after the identifier
or range is ignored. If no identifier or range is provided, the parameters are null.
The code to process the request doesn’t use the underlying HTTP response object’s
write and end methods to send the parameters back to the requester; instead, it uses
the Express send method. The send method determines the proper headers for the re-
sponse (given the data type of what’s being sent) and then sends the content using the
underlying HTTP end method.
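For instance, all of the following are valid (the values are illustrative), and each
results in an appropriate Content-Type or status:
res.send({ id: 1, name: 'a widget' });  // object: sent as JSON
res.send('<p>some html</p>');           // string: sent as HTML
res.send(404);                          // number: status code, default body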
The next app.get is using a string to define the routing path pattern. In this case, we’re
looking for any content item. This pattern will match anything that begins with /con-
tent/. The following requests:
/content/156
/content/this_is_a_story
/content/apples/oranges
all match the pattern. The asterisk (*) is liberal in what it accepts, and everything
after content/ is returned.
The last app.get method is looking for a product request. If a product identifier is
given, it can be accessed directly via params.id. If an operation is given, it can be
accessed directly via params.operation. Any combination of the two values is allowed,
as long as at least an identifier or an operation is provided.
I tested the pattern with the following URLs:
/products/laptopJK3444445/edit
/products/fordfocus/add
/products/add
/products/tablet89/delete
/products/
The application outputs the request object to the console. When running the applica-
tion, I directed the output to an output.txt file so I could examine the request object
more closely:
node app.js > output.txt
The request object wraps the underlying socket, of course, and we'll recognize much of the object from
our previous work exploring the Node HTTP request object. What we’re mainly in-
terested in is the route object added via Express. Following is the output for the
route object for one of the requests:
route:
{ path: '/products/:id/:operation?',
method: 'get',
callbacks: [ [Function] ],
keys: [ [Object], [Object] ],
regexp: /^\/products\/(?:([^\/]+?))(?:\/([^\/]+?))?\/?$/i,
params: [ id: 'laptopJK3444445', operation: 'edit' ] },
Note the generated regular expression object, which converts my use of the parameter
marker (:) and the optional indicator (?) in the path string into something meaningful
for the underlying JavaScript engine (thankfully, too, since I'm lousy at regular
expressions).
Now that we have a better idea of how the routing paths work, let’s look more closely
at the use of the HTTP verbs.
Any request that doesn’t match one of the three given path patterns just
generates a generic 404 response: Cannot GET /whatever.
Let’s say our application is managing that most infamous of products, the widget. To
create a new widget, we’ll need to create a web page providing a form that gets the
information about the new widget. We can generate this form with the application,
and I’ll demonstrate this approach in Chapter 8, but for now we’ll use a static web
page, shown in Example 7-3.
Example 7-3. Sample HTML form to post widget data to the Express application
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Widgets</title>
</head>
<body>
<form method="POST" action="/widgets/add"
enctype="application/x-www-form-urlencoded">
<p>Widget name: <input type="text" name="widgetname" id="widgetname"
   size="25" required /></p>
<p>Widget Price: <input type="text"
   pattern="^\$?([0-9]{1,3},([0-9]{3},)*[0-9]{3}|[0-9]+)(.[0-9][0-9])?$"
   name="widgetprice" id="widgetprice" size="25" required /></p>
<p>Widget Description: <br />
   <textarea name="widgetdesc" id="widgetdesc" cols="20" rows="5"
   required></textarea></p>
<p>
   <input type="submit" name="submit" id="submit" value="Submit"/>
</p>
</form>
</body>
</html>
The page takes advantage of the new HTML5 attributes required and pattern to pro-
vide validation of data. Of course, this works only with browsers that support HTML5,
but for now, I’ll assume you’re using a modern HTML5-capable browser.
The widget form requires a widget name, price (with an associated regular expression
to validate the data structure in the pattern attribute), and description. Browser-based
validation should ensure we get the three values, and that the price is properly formatted
as US currency.
In the Express application, we’re just going to persist new widgets in memory, as we
want to focus purely on the Express technology at this time. As each new widget is
posted to the application, it’s added to an array of widgets via the app.post method.
Each widget can be accessed by its application-generated identifier via the app.get
method. Example 7-4 shows the entire application.
Example 7-4. Express application to add and display widgets
var express = require('express')
, http = require('http')
, app = express();
app.configure(function(){
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(app.router);
});
app.configure('development', function(){
app.use(express.errorHandler());
});
// in-memory data store, seeded with one widget
var widgets = [
   { id : 1,
     name : 'My Special Widget',
     price : 100.00,
     descr : 'A widget beyond price'
   }
];

// show a widget
app.get('/widgets/:id', function(req, res) {
   var indx = parseInt(req.params.id) - 1;
   if (!widgets[indx])
      res.send('There is no widget with id of ' + req.params.id);
   else
      res.send(widgets[indx]);
});

// add widget
app.post('/widgets/add', function(req, res) {
var indx = widgets.length + 1;
widgets[widgets.length] =
{ id : indx,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
descr : req.body.widgetdesc };
console.log('added ' + widgets[indx-1]);
res.send('Widget ' + req.body.widgetname + ' added with id ' + indx);
});
http.createServer(app).listen(3000);
The first widget is seeded into the widget array, so we have existing data if we want to
immediately query for a widget without adding one first. Note the conditional test in
app.get to respond to a request for a nonexistent or removed widget.
Running the application (example4.js in the examples), and accessing the application
using / or /index.html (or /example3.html, in the examples) serves up the static HTML
page with the form. Submitting the form generates a page displaying a message about
the widget being added, as well as its identifier. We can then use the identifier to display
the widget—in effect, a dump of the widget object instance:
http://whateverdomain.com:3000/widgets/2
The form to update a widget looks much like the form to add one, except that the fields
carry the widget's current values. The price field, for instance:
<p>Widget Price: <input type="text"
pattern="^\$?([0-9]{1,3},([0-9]{3},)*[0-9]{3}|[0-9]+)(.[0-9][0-9])?$"
name="widgetprice" id="widgetprice" size="25" value="100.00" required/></p>
Since PUT and DELETE are not supported in the form method attribute, we have to add
them using a hidden field with a specific name, _method, and give them a value of either
put, for PUT, or delete for DELETE.
The form to delete the widget is simple: it contains the hidden _method field, and a
button to confirm the deletion of widget 1:
<p>Are you sure you want to delete Widget 1?</p>
<form method="POST" action="/widgets/1/delete"
enctype="application/x-www-form-urlencoded">
<input type="hidden" name="_method" value="delete" />
<p>
<input type="submit" name="submit" id="submit" value="Delete Widget 1"/>
</p>
</form>
To ensure that the HTTP verbs are handled properly, we need to add another middle-
ware, express.methodOverride, following express.bodyParser in the app.configure
method call. The express.methodOverride middleware alters the HTTP method to
whatever is given as value in this hidden field:
app.configure(function(){
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
});
Next, we’ll need to add functionality to process these two new verbs. The update re-
quest replaces the widget object’s contents with the new contents, while the delete
request deletes the widget array entry in place, deliberately leaving a null value since
we do not want to reorder the array because of the widget removal.
Example 7-5 pulls the entire application together.
Example 7-5. Widget application with the ability to add, show, update, and delete widgets
var express = require('express')
  , http = require('http')
  , app = express();

app.configure(function(){
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
});
app.configure('development', function(){
app.use(express.errorHandler());
});
// in memory data store
var widgets = [
{ id : 1,
name : 'My Special Widget',
price : 100.00,
descr : 'A widget beyond price'
}
]
// add a widget
app.post('/widgets/add', function(req, res) {
var indx = widgets.length + 1;
widgets[widgets.length] =
{ id : indx,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
descr : req.body.widgetdesc };
console.log(widgets[indx-1]);
res.send('Widget ' + req.body.widgetname + ' added with id ' + indx);
});
// delete a widget
app.del('/widgets/:id/delete', function(req,res) {
var indx = req.params.id - 1;
delete widgets[indx];
console.log('deleted ' + req.params.id);
res.send('deleted ' + req.params.id);
});
// update/edit a widget
app.put('/widgets/:id/update', function(req,res) {
var indx = parseInt(req.params.id) - 1;
widgets[indx] =
{ id : indx,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
descr : req.body.widgetdesc };
console.log(widgets[indx]);
res.send ('Updated ' + req.params.id);
});
http.createServer(app).listen(3000);
After running the application, I add a new widget, list the widgets out, update widget
1’s price, delete the widget, and then list the widgets out again. The console.log mes-
sages for this activity are:
Express server listening on port 3000
{ id: 2,
name: 'This is my Baby',
price: 4.55,
descr: 'baby widget' }
POST /widgets/add 200 4ms
GET /widgets 200 2ms
GET /edit.html 304 2ms
{ id: 0,
name: 'My Special Widget',
price: 200,
descr: 'A widget beyond price' }
PUT /widgets/1/update 200 2ms
GET /del.html 304 2ms
deleted 1
DELETE /widgets/1/delete 200 3ms
GET /widgets 200 2ms
Notice the HTTP PUT and DELETE verbs in bold text in the output. When I list the widgets
out the second time, the values returned are:
[ null,
  { id: 2,
    name: 'This is my Baby',
    price: 4.55,
    descr: 'baby widget' } ]
We now have a RESTful Express application. But we also have another problem.
If our application managed only one object, it might be OK to cram all the functionality
into one file. Most applications, however, manage more than one object, and the func-
tionality for all of those applications isn’t as simple as our little example. What we need
is to convert this RESTful Express application into a RESTful MVC Express application.
We’re already there for most of the functionality—we just need to clean it up a bit.
Just a reminder: you also might have issues with existing middleware
when implementing the MVC change. For instance, the use of the direc
tory middleware, which provides a pretty directory printout, conflicts
with the create action, since they work on the same route. Solution?
Place the express.directory middleware after the app.router in the
configure method call.
First, we’re going to create a controllers subdirectory and create a new file in it named
widgets.js. Then we're going to copy all of our app.get, app.post, app.put, and app.del
method calls into
this new file.
Next, we need to convert the method calls into the appropriate MVC format. This
means converting the routing method call into a function for each, which is then ex-
ported. For instance, the function to create a new widget:
// add a widget
app.post('/widgets/add', function(req, res) {
var indx = widgets.length + 1;
widgets[widgets.length] =
{ id : indx,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice)};
console.log(widgets[indx-1]);
res.send('Widget ' + req.body.widgetname + ' added with id ' + indx);
});
Each function still receives the request and response objects. The only difference is that
there isn’t a direct route-to-function mapping.
Example 7-6 shows the new widgets.js file in the controllers subdirectory. Two of the
methods, new and edit, are placeholders for now, to be addressed in Chapter 8. We're
interested here in the methods that actually create, show, update, and destroy widgets.
Example 7-6. The widgets controller (controllers/widgets.js)
// in-memory data store, seeded with one widget
var widgets = [
   { id : 1,
     name : 'My Special Widget',
     price : 100.00,
     descr : 'A widget beyond price'
   }
];

// index: return all widgets
exports.index = function(req, res) {
   res.send(widgets);
};
// add a widget
exports.create = function(req, res) {
var indx = widgets.length + 1;
widgets[widgets.length] =
{ id : indx,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice) };
console.log(widgets[indx-1]);
res.send('Widget ' + req.body.widgetname + ' added with id ' + indx);
};
// show a widget
exports.show = function(req, res) {
var indx = parseInt(req.params.id) - 1;
if (!widgets[indx])
res.send('There is no widget with id of ' + req.params.id);
else
res.send(widgets[indx]);
};
// delete a widget
exports.destroy = function(req, res) {
var indx = req.params.id - 1;
delete widgets[indx];
Notice that edit and new are both GET methods, as their only purpose is to serve a form.
It’s the associated create and update methods that actually change the data: the former
is served as POST, the latter as PUT.
To map the routes to the new functions, I created a second module, maproutecontroller,
with one exported function, mapRoute. It has two parameters—the Express app
object and a prefix representing the mapped controller object (in this case, widgets).
It uses the prefix to access the widgets controller object, and then maps the methods it
knows are in this object (because the object is a controller and has a fixed set of required
methods) to the appropriate route. Example 7-7 has the code for this new module.
Example 7-7. Function to map routes to controller object methods
exports.mapRoute = function(app, prefix) {
   // load the matching controller object (e.g., ./controllers/widgets)
   var prefixObj = require('./controllers/' + prefix);
   // index
   app.get(prefix, prefixObj.index);
// add
app.get(prefix + '/new', prefixObj.new);
// show
app.get(prefix + '/:id', prefixObj.show);
// create
app.post(prefix + '/create', prefixObj.create);
// edit
app.get(prefix + '/:id/edit', prefixObj.edit);
// update
app.put(prefix + '/:id', prefixObj.update);
// destroy
app.del(prefix + '/:id', prefixObj.destroy);
};
Example 7-8 shows the finished application. I added back in the original
routes.index view, except I changed the title value in the routes/index.js file from
“Express” to “Widget Factory.”
Example 7-8. Application that makes use of the new MVC infrastructure to maintain widgets
var express = require('express')
, routes = require('./routes')
, map = require('./maproutecontroller')
, http = require('http')
, app = express();
app.configure(function(){
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.staticCache({maxObjects: 100, maxLength: 512}));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
app.use(express.directory(__dirname + '/public'));
app.use(function(req, res, next){
throw new Error(req.url + ' not found');
});
app.use(function(err, req, res, next) {
console.log(err);
res.send(err.message);
});
});
app.configure('development', function(){
app.use(express.errorHandler());
});
app.get('/', routes.index);
var prefixes = ['widgets'];
// map the routes for each controller object
prefixes.forEach(function(prefix) {
   map.mapRoute(app, prefix);
});
http.createServer(app).listen(3000);
Cleaner, simpler, extensible. We still don’t have the view part of the MVC, but I’ll cover
that in the next chapter.
To test the widgets index, use cURL, passing the request method with the --request flag:
curl --request GET http://examples.burningbird.net:3000/widgets
Following the request option, specify the method (in this case, GET), and then the
request URL. You should get back a dump of all widgets currently in the data store.
To test creating a new widget, first issue a request for the new object:
curl --request GET http://examples.burningbird.net:3000/widgets/new
A message is returned about retrieving the new widget form. Next, test adding a new
widget, passing the data for the widget in the cURL request, and changing the method
to POST:
curl --request POST http://examples.burningbird.net:3000/widgets/create
--data 'widgetname=Smallwidget&widgetprice=10.00'
Run the index test again to make sure the new widget is displayed:
curl --request GET http://examples.burningbird.net:3000/widgets
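You can also exercise the update route by switching the method to PUT and passing changed data (the values here are illustrative):
curl --request PUT http://examples.burningbird.net:3000/widgets/2
--data 'widgetname=Smallwidget&widgetprice=15.00'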
Once you’ve verified the data was changed, go ahead and delete the new record, chang-
ing the HTTP method to DELETE:
curl --request DELETE http://examples.burningbird.net:3000/widgets/2
Now that we have the controller component of the MVC, we need to add the view
components, which I cover in Chapter 8. Before moving on, though, read the sidebar
“Beyond Basic Express” on page 151 for some final tips.
Frameworks such as Express provide a great deal of useful functionality, but one thing
they don’t provide is a way of separating the data from the presentation. You can use
JavaScript to generate HTML to process the result of a query or update, but the effort
can quickly become tedious—especially if you have to generate every part of the page,
including sidebars, headers, and footers. Sure, you can use functions, but the work can
still verge on overwhelming.
Luckily for us, as framework systems have developed, so have template systems, and
the same holds true for Node and Express. In Chapter 7, we briefly used Jade, the
template system installed by default with Express, to generate an index page. Express
also supports other compatible template systems, including another popular choice,
EJS (embedded JavaScript). Jade and EJS take completely different approaches, but
both deliver the expected results.
In addition, though you can manually create CSS files for your website or application,
you can also use a CSS engine that can simplify this aspect of your web design and
development. Rather than having to remember to add in all of the curly braces and
semicolons, you use a simplified structure that can be cleaner to maintain. One such
CSS engine that works quite nicely with Express and other Node applications is Stylus.
In this chapter I’ll primarily focus on Jade, since it is installed by default with Express.
However, I’m going to briefly cover EJS, so you can see two different types of template
systems and how they work. I’ll also introduce the use of Stylus to manage the CSS to
ensure that the pages display nicely.
The EJS GitHub page can be found at https://github.com/visionmedia/ejs.
In the code, the EJS is embedded directly into HTML, in this example providing the
data for the individual list items for an unordered list. The angle brackets and
percentage sign pairs (<%, %>) are used to delimit EJS instructions: a conditional test ensures
that an array has been provided, and then the JavaScript processes the array, outputting
the individual array values.
EJS is based on the Ruby ERB templating system, which is why you’ll
frequently see “erb-like” used to describe its format.
The values themselves are output with the equals sign (=), which is a shortcut for “print
this value here”:
<%= name %>
The value is escaped when it’s printed out. To print out an unescaped value, use a dash
(-), like so:
<%- name %>
If for some reason you don’t want to use the standard open and closing EJS tags (<%,
%>), you can define custom ones using the EJS object’s open and close methods:
ejs.open('<<');
ejs.close('>>');
You can then use these custom tags instead of the default ones:
<h1><<=title >></h1>
Unless you have a solid reason for doing so, though, I’d stick with the default.
Once EJS is installed, you can use it directly in a simple Node application—you don’t
have to use it with a framework like Express. As a demonstration, render HTML from
a given template file as follows:
<html>
<head>
<title><%= title %></title>
</head>
<body>
<% if (names.length) { %>
<ul>
<% names.forEach(function(name){ %>
<li><%= name %></li>
<% }) %>
</ul>
<% } %>
</body>
</html>
Call the EJS object’s renderFile method directly. Doing so opens the template and uses
the data provided as an option to generate the HTML.
Example 8-1 uses the standard HTTP server that comes with Node to listen for a request
on port 8124. When a request is received, the application calls the EJS renderFile
method, passing in the path for the template file, as well as a names array and a document
title. The last parameter is a callback function that either provides an error (and a
fairly readable error, at that) or the resulting generated HTML. In the example, the
result is sent back via the response object if there’s no error. If there is an error, an error
message is sent in the result, and the error object is output to the console.
var http = require('http'),
    ejs = require('ejs');

http.createServer(function (req, res) {
   // data to render
   var names = ['Joe', 'Mary', 'Sue', 'Mark'];
   var title = 'Testing EJS';
   // render or error
   ejs.renderFile(__dirname + '/views/test.ejs',
      {title : 'testing', names : names},
      function(err, result) {
         if (!err) {
            res.end(result);
         } else {
            res.end('An error occurred accessing page');
            console.log(err);
         }
      });
}).listen(8124);
One variation of the rendering method is render, which takes the EJS template as a
string and then returns the formatted HTML:
var str = fs.readFileSync(__dirname + '/views/test.ejs', 'utf8');
// render the template string with the same data
var html = ejs.render(str, {title : 'testing', names : names});
res.end(html);
A third rendering method is compile, which takes an EJS
template string and returns a JavaScript function that can be invoked to render HTML
each time it's called. You can also use this method to enable EJS for Node in client-side
applications.
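A minimal sketch of the pattern (the template file is the same test.ejs used above):
var ejs = require('ejs');
var fs = require('fs');

// compile the template string into a reusable function
var str = fs.readFileSync(__dirname + '/views/test.ejs', 'utf8');
var fn = ejs.compile(str, {filename : __dirname + '/views/test.ejs'});

// invoke the function with different data each time HTML is needed
console.log(fn({title : 'First pass', names : ['Joe', 'Mary']}));
console.log(fn({title : 'Second pass', names : ['Sue', 'Mark']}));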
The filters can be chained together, with the result of one being piped to the next. The
use of the filter is triggered by the colon (:) following the equals sign (=), which is then
followed by the data object. The following example of the use of filters takes a set of
people objects, maps a new object consisting solely of their names, sorts the names,
and then prints out a concatenated string of the names:
var people = [
   {name : 'Joe Brown', age : 32},
   {name : 'Mary Smith', age : 54},
   {name : 'Tom Thumb', age : 21},
   {name : 'Cinder Ella', age : 16}];
with the data processed in the template by:
<%=: people | map:'name' | sort | join %>
The filters aren’t documented in the EJS for Node documentation, and you have to be
careful using them interchangeably because some of the filters want a string, not an
array of objects. Table 8-1 contains a list of the filters, and a brief description of what
type of data they work with and what they do.
Table 8-1. EJS for Node filters
Filter Type of data Purpose
first Accepts and returns array Returns first element of array
last Accepts and returns array Returns last element of array
capitalize Accepts and returns string Capitalizes first character in string
downcase Accepts and returns string Lowercases all characters in string
upcase Accepts and returns string Capitalizes all characters in string
sort Accepts and returns array Applies Array.sort to array
Using EJS with Express rather than Jade takes little more than setting the view engine:
app.configure(function(){
app.set('views', __dirname + '/views');
app.set('view engine', 'ejs');
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
});
app.configure('development', function(){
app.use(express.errorHandler());
});
app.get('/', routes.index);
http.createServer(app).listen(3000);
The index.js route doesn't require any change at all, because it's not using anything
that's specific to any template system; it's using the Express response object's render
method, which works regardless of template system (as long as the system is compatible
with Express):
exports.index = function(req, res){
res.render('index', { title: 'Express' }, function(err, stuff) {
if (!err) {
console.log(stuff);
res.write(stuff);
res.end();
}
});
};
In the views directory, the index.ejs file (note the extension) uses EJS for Node
annotation rather than the Jade we saw in Chapter 7:
<html>
<head>
<title><%= title %></title>
</head>
<body>
<h1><%= title %></h1>
<p>Welcome to <%= title %></p>
</body>
</html>
This demonstrates the beauty of working with an application that separates the model
from the controller from the view: you can swap technology in and out, such as using
a different template system, without impacting the application logic or data access.
To recap what’s happening with this application:
1. The main Express application uses app.get to associate a request listener function
(routes.index) with an HTTP GET request.
2. The routes.index function calls res.render to render the response to the GET
request.
3. The res.render function invokes the application object’s render function.
4. The application render function renders the specified view, with whatever options
—in this case, the title.
5. The rendered content is then written to the response object, and back to the user’s
browser.
In Chapter 7, we focused on the routing aspects of the application, and now we'll focus
on the view. We'll take the application we created at the end of Chapter 7, in
Example 7-6 through Example 7-8, and add in the views capability. First, though, we need
to do a little restructuring of the environment to ensure that the application can grow
as needed.
The routes and the controllers directories can stay as they are, but the views and the
public directory need to be modified to allow for different objects. Instead of placing
all widget views directly in views, we add them to a new subdirectory of views named,
appropriately enough, widgets:
/application directory
   /views
      /widgets
Instead of placing all widget static files directly in the public directory, we also place
them in a widgets subdirectory:
/application directory
/public
/widgets
Now, we can add new objects by adding new directories, and we’ll be able to use
filenames of new.html and edit.ejs for each, without worrying about overwriting existing
files.
Note that this structure assumes we may have static files for our application. The next
step is to figure out how to integrate the static files into the newly dynamic environment.
We used the relative indicator .. since the public directory is located off the
controllers directory's parent. However, we can't use this path in sendfile as is, because
sendfile treats relative paths containing .. as potentially malicious and won't serve the
file; the path first has to be resolved into an absolute one (for instance, against __dirname).
The HTML page with the form is nothing to get excited about—just a simple form, as
shown in Example 8-3. However, we did add the description field back in to make the
data a little more interesting.
Example 8-3. HTML new widget form
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Widgets</title>
</head>
<body>
<h1>Add Widget:</h1>
<!-- field names match what the widget controller reads from req.body -->
<form method="POST" action="/widgets/create"
      enctype="application/x-www-form-urlencoded">
<p>Widget name: <input type="text" name="widgetname" required /></p>
<p>Widget price: <input type="text" name="widgetprice" required /></p>
<p>Widget description: <textarea name="widgetdesc" cols="20" rows="5"></textarea></p>
<p><input type="submit" value="Submit" /> <input type="reset" value="Reset" /></p>
</form>
</body>
</html>
Now we’re ready to convert the widget controller so it uses templates, starting with the
code to add a new widget.
The actual processing of the data in the widget controller for the new widget doesn’t
change. We still pull the data from the request body, and add it to the in-memory widget
store. However, now that we have access to a template system, we’re going to change
how we respond to the successful addition of a new widget.
I created a new EJS template, named added.ejs, shown in Example 8-4. All it does is
provide a listing of the widget’s properties, and a message consisting of the title sent
with the widget object.
Example 8-4. “Widget added” confirmation view template
<head>
<title><%= title %></title>
</head>
<body>
<h1><%= title %> | <%= widget.name %></h1>
<ul>
<li>ID: <%= widget.id %></li>
<li>Name: <%= widget.name %></li>
<li>Price: <%= widget.price.toFixed(2) %></li>
<li>Desc: <%= widget.desc %></li>
</ul>
</body>
In the controller's create method, the only change after storing the widget is to render
the new view rather than send a plain message:
// generate widget id
var indx = widgets.length + 1;
// add widget
widgets[widgets.length] =
{ id : indx,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
  desc : req.body.widgetdesc };
// render the confirmation view, passing in a title and the new widget
res.render('widgets/added', { title : 'Widget Added', widget : widgets[indx-1] });
The two options sent to the view are the page title and the widget object. Figure 8-1
shows the informative, though plain, result.
The code to process a new widget doesn’t do any validation of the data
or check for authority or SQL injection hacks. Data validation, security,
and authorization are covered in Chapter 15.
The next two processes to convert to templates, update and deletion, require a way of
specifying which widget to perform the action on. In addition, we also need to convert
the index page, which lists all of the widgets in a table, to a template.
The controller code to trigger this new view is extremely simple: a call to render the
view, sending the entire array of widgets through as data:
// index listing
exports.index = function(req, res) {
   res.render('widgets/index', {title : 'Widgets', widgets : widgets});
};
In Example 8-5, if the object has a length property (is an array), its element objects are
traversed and their properties are printed out as table data, in addition to the links to
edit and delete the object. Figure 8-2 shows the table after several widgets have been
added to our in-memory data store.
Figure 8-2. Widget display table after the addition of several widgets
The link (route) to delete the object is actually the same as the link (route) to show
it: /widgets/:id. We’ll add a mostly hidden form to the Show Widget page that includes
a button to delete the widget if it’s no longer needed. This allows us to incorporate the
necessary trigger for the deletion without having to add a new route. It also provides
another level of protection to ensure that users know exactly which widget they’re
deleting.
Rather than incorporate the delete request into the Show Widget page,
it’s also perfectly acceptable to create another route, such as /widgets/:id/
delete, and generate an “Are you sure?” page from the index page link,
which then triggers the deletion.
Very little modification is required in the controller code for either the show or the
destroy methods. I’ve left the destroy method as is for now. All it does is delete the
object from the in-memory store and send a message back to this effect:
exports.destroy = function(req, res) {
   var indx = req.params.id - 1;
   delete widgets[indx];
   res.send('deleted ' + req.params.id);
};
The show method required little change—simply replacing the send message with a call
to render the new view:
// show a widget
exports.show = function(req, res) {
var indx = parseInt(req.params.id) - 1;
if (!widgets[indx])
res.send('There is no widget with id of ' + req.params.id);
else
res.render('widgets/show', {title : 'Show Widget', widget : widgets[indx]});
};
Figure 8-3 demonstrates what the Show Widget page looks like, complete with the
Delete Widget button at the bottom.
By now, you’ve seen how simple it is to incorporate views into the application. The
best thing of all about this system is that you can incorporate changes into the view
templates without having to stop the application: it uses the changed template the next
time the view is accessed.
One last view for the update widget, and we’re done converting the widget application
to use the EJS template system.
Figure 8-4 shows the page with a widget loaded. All you need to do is edit the field
values, and then click Submit to submit the changes.
The modification to the controller code is as simple as the other modifications have
been. The Edit view is accessed using res.render, and the widget object is passed as
data:
// edit a widget
exports.edit = function(req, res) {
   var indx = parseInt(req.params.id) - 1;
   res.render('widgets/edit', {title : 'Edit Widget', widget : widgets[indx]});
};
The code to process the update is very close to what we had in Chapter 7, except that
instead of sending a message that the object is updated, we’re using a view. We’re not
creating a new view, though. Instead, we’re using the widgets/added.ejs view we used
earlier. Since both just display the object’s properties and can take a title passed in as
data, we can easily repurpose the view just by changing the title:
// update a widget
exports.update = function(req, res) {
var indx = parseInt(req.params.id) - 1;
widgets[indx] =
{ id : indx + 1,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
desc : req.body.widgetdesc}
console.log(widgets[indx]);
res.render('widgets/added', {title: 'Widget Edited', widget : widgets[indx]})
};
Again, the view used doesn't impact what route (URL) is shown, so it doesn't matter
if we reuse a view. Being able to reuse a view can save us a lot of work as the application
grows in complexity.
You’ve had a chance to see pieces of the controller code throughout these sections as
we convert it to use templates. Example 8-8 is an entire copy of the changed file, which
you can compare to Example 7-6 in Chapter 7 to see how easily views incorporate into
the code, and how much work they can save us.
Example 8-8. The widget controller implemented with views
var widgets = [
{ id : 1,
name : "The Great Widget",
price : 1000.00,
desc: "A widget of great value"
}
]
// add a widget
exports.create = function(req, res) {
   // generate widget id
   var indx = widgets.length + 1;
   // add widget
   widgets[widgets.length] =
      { id : indx,
        name : req.body.widgetname,
        price : parseFloat(req.body.widgetprice),
        desc : req.body.widgetdesc };
   res.render('widgets/added', { title : 'Widget Added', widget : widgets[indx-1] });
};
// show a widget
exports.show = function(req, res) {
var indx = parseInt(req.params.id) - 1;
if (!widgets[indx])
res.send('There is no widget with id of ' + req.params.id);
else
res.render('widgets/show', {title : 'Show Widget', widget : widgets[indx]});
};
// delete a widget
exports.destroy = function(req, res) {
   var indx = req.params.id - 1;
   delete widgets[indx];
   res.send('deleted ' + req.params.id);
};
// update a widget
exports.update = function(req, res) {
var indx = parseInt(req.params.id) - 1;
widgets[indx] =
{ id : indx + 1,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
desc : req.body.widgetdesc}
console.log(widgets[indx]);
res.render('widgets/added', {title: 'Widget Edited', widget : widgets[indx]})
};
Instead of writing the markup out in full, in Jade you have:
html
  head
    title This is it
  body
    p Say Hi to the world
The contents of both the title and the paragraph elements are just included after the
element name. There are no ending tags—they’re assumed—and again, indentation
triggers nesting. Another example is the following, which also makes use of both class
name and identifier, as well as additional nesting:
html
  head
    title This is it
  body
    div.content
      div#title
        p nested data
This generates:
<html>
  <head>
    <title>This is it</title>
  </head>
  <body>
    <div class="content">
      <div id="title">
        <p>nested data</p>
      </div>
    </div>
  </body>
</html>
If you have large bodies of content, such as text for a paragraph, you can use the vertical
bar, or pipe (|), to concatenate the text:
p
  | some text
  | more text
  | and even more
This becomes:
<p>some text more text and even more</p>
Another approach is to end the paragraph element with a period (.) indicating that the
block contains only text and allowing us to omit the vertical bar:
p.
  some text
  more text
  and even more
If we want to include HTML as the text, we can; it ends up being treated as HTML in
the generated source:
body.
  <h1>A header</h1>
  <p>A paragraph</p>
Form elements generally have attributes, and they’re incorporated in Jade in paren-
theses, including setting their values (if any). The attributes need only be separated by
whitespace, but I list them each on a separate line to make the template more readable.
The following Jade template:
html
  head
    title This is it
  body
    form(method="POST"
         action="/widgets"
         enctype="application/x-www-form-urlencoded")
      input(type="text"
            name="widgetname"
            id="widgetname"
            size="25")
The layout template, layout.jade, supplies the page structure, with the title filled in
from data passed to the view:
doctype 5
html(lang="en")
  head
    title #{title}
    meta(charset="utf-8")
  body
    block content
Notice the use of the pound sign and curly braces (#{}) for the title. This is how we
embed data passed to the template in Jade. The use of the identifier doesn’t change
from EJS, just the syntax.
To use the new layout template, we start off each of the content templates with:
extends layout
The use of extends lets the template engine know where to find the layout template for
the page view, while the use of block instructs the template engine about where to place
the generated content.
You don’t have to use content for the block name, and you can use more than one
block. In addition, you can also include other template files if you want to break up the
layout template even further. I modified layout.jade to include a header rather than the
markup directly in the layout file:
doctype 5
html(lang="en")
  include header
  body
    block content
I then defined the header content in a file named header.jade, with the following:
head
  title #{title}
  meta(charset="utf-8")
There are two things to note in the new layout.jade and header.jade code.
First, the include directive is relative. If you split the views into the following
subdirectory structure:
/views
   /widgets
   layout.jade
   /standard
      header.jade
you'll need to adjust the include path in layout.jade to match:
include standard/header
The file doesn’t have to be Jade, either—it could be HTML, in which case you’ll need
to use the file extension:
include standard/header.html
Second, do not use indentation in the header.jade file. The indentation comes in from
the parent file and doesn’t need to be duplicated in the included template file. In fact,
if you do so, your template will generate an error.
Now that we’ve defined the layout template, it’s time to convert the EJS views into Jade.
Now is also the time you might consider swapping the static Add Widget
form file for a dynamic one so that it can also take advantage of the new
layout template.
extends layout

block content
  h1 #{title} | #{widget.name}
  ul
    li id: #{widget.id}
    li Name: #{widget.name}
    li Price: $#{widget.price.toFixed()}
    li Desc: #{widget.desc}
Notice how we can still use the toFixed method to format the price output.
The block is named content, so it integrates with the expectations of the block name
set in the layout.jade file. The simplified HTML for an h1 header and an unordered list
is integrated with the data passed from the controller—in this case, the widget object.
Running the widget application and adding a new widget generates the same HTML
as generated with the EJS: a header and a list of widget properties for the newly added
widget—all without our changing any of the controller code.
In the row template, each link must be included on a separate line; otherwise, we lose
the nesting indication with the indentation.
The main index.jade file that references the newly created row template is shown in
Example 8-11. This template introduces two new Jade constructs: a conditional test
and an iteration. The conditional is used to test for the length property on the
widgets object, assuring us we’re dealing with an array. The iteration construct uses an
abbreviated form of the Array.forEach method, where the array is traversed and each
instance is assigned to the new variable, widget.
Example 8-11. The index template for creating a table of widgets
extends layout

block content
  table
    caption Widgets
    if widgets.length
      tr
        th ID
        th Name
        th Price
        th Description
        th
        th
      each widget in widgets
        include row
This is a whole lot less work than having to manually enter all those angle brackets,
especially with the table headers (th). The results of the Jade template are identical to
those from the EJS template: an HTML table with widgets in each row, and the ability
to edit or delete each widget.
extends layout

block content
  h1 Edit #{widget.name}
  form(method="POST"
       action="/widgets/#{widget.id}"
       enctype="application/x-www-form-urlencoded")
    p Widget Name:
    input(type="text"
          name="widgetname"
          id="widgetname"
          size="25"
          value="#{widget.name}"
          required)
    p Widget Price:
    input(type="text"
          name="widgetprice"
          id="widgetprice"
          size="25"
          value="#{widget.price}"
          pattern="^\$?([0-9]{1,3},([0-9]{3},)*[0-9]{3}|[0-9]+)(\.[0-9][0-9])?$"
          required)
    p Widget Description:
    br
    textarea(name="widgetdesc"
             id="widgetdesc"
             cols="20"
             rows="5") #{widget.desc}
    p
      input(type="hidden"
            name="_method"
            id="_method"
            value="put")
      input(type="submit"
            name="submit"
            id="submit"
            value="Submit")
      input(type="reset"
            name="reset"
            id="reset"
            value="reset")
I then modified the added.jade file from Example 8-10 to use this new template:
extends layout

block content
  h1 #{title} | #{widget.name}
  include widget
The new Show Widget template also makes use of the new widget.jade template, as
demonstrated in Example 8-13.
Example 8-13. The new Show Widget template in Jade
extends layout

block content
  h1 #{widget.name}
  include widget
  form(method="POST"
       action="/widgets/#{widget.id}"
       enctype="application/x-www-form-urlencoded")
    input(type="hidden"
          name="_method"
          id="_method"
          value="delete")
    input(type="submit"
          name="submit"
          id="submit"
          value="Delete Widget")
You can see how modularizing the templates makes each template that much cleaner,
and thus easier to maintain.
With the newly modularized template, we can now show and delete a specific
widget...and that leads to a quirk where the Jade template differs from the EJS template.
In the widget application, when widgets are deleted, they are deleted in place. This
means the array element is basically set to null, so that the widget location in the array
is maintained relative to its identifier. This in-place maintenance doesn't cause a
problem when we add and delete widgets and display them in the index page in EJS, but it
does cause a problem in Jade: the template throws an error when it tries to access the
properties of the deleted entries. The fix is to test each array element in the iteration,
and include the row only when the widget actually exists:
extends layout

block content
  table
    caption Widgets
    if widgets.length
      tr
        th ID
        th Name
        th Price
        th Description
        th
        th
      each widget in widgets
        if widget
          include row
And now, all of the template views have been converted to Jade, and the application is
complete. (Well, until we add in the data portion in Chapter 10.)
But while the application is complete, it’s not very attractive. Of course, it’s easy enough
to add a stylesheet into the header to modify the presentation of all the elements, but
we’ll also briefly take a look at another approach: using Stylus.
The stylesheet is now immediately available to all of our application views because they
all use the layout template, which uses this header.
Now you can definitely see the value of converting the static new.html
file into a template view: making the change to the header doesn't
impact it, and it has to be manually edited.
Stylus is not like the Jade template system. It doesn’t create dynamic CSS views. What
it does is generate static stylesheets from a Stylus template the first time the template
is accessed, or each time the template is modified.
To incorporate Stylus into the widget application, we have to include the module within
the main application file's (app.js) require section. Then we have to include the Stylus
middleware along with the others in the configure method call, passing in an option
with the source for the Stylus templates, and the destination where the compiled
stylesheets are to be placed. Example 8-14 shows the newly modified app.js file with these
changes in bold text.
Example 8-14. Adding Stylus CSS template support to the widget application
var express = require('express')
, routes = require('./routes')
, map = require('./maproutecontroller')
, http = require('http')
, stylus = require('stylus')
, app = express();
app.configure(function(){
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.staticCache({maxObjects: 100, maxLength: 512}));
app.use(stylus.middleware({
src: __dirname + '/views'
, dest: __dirname + '/public'
}));
app.use(express.static(__dirname + '/public'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(app.router);
app.use(express.directory(__dirname + '/public'));
app.use(function(req, res, next){
throw new Error(req.url + ' not found');
});
app.use(function(err, req, res, next) {
   console.log(err);
   res.send(err.message);
});
});
app.configure('development', function(){
app.use(express.errorHandler());
});
app.get('/', routes.index);
http.createServer(app).listen(3000);
The first time you access the widget application after making this change, you may
notice a very brief hesitation. The reason is that the Stylus module is generating the
stylesheet—an event that happens when a new or modified stylesheet template is added
and the application is restarted. After the stylesheet has been generated, though, it’s
the generated copy that’s served up—it isn’t recompiled with every page access.
You will need to restart your Express application if you make changes
to the stylesheet template.
The Stylus stylesheet templates all have the same extension: .styl. The source directory
is set to views, but it expects the stylesheet templates to be in a stylesheets directory
under views. When it generates the static stylesheets, it places them in a stylesheets
directory under the destination directory (in this case, /public).
After working with Jade, you should find the Stylus syntax very familiar. Again, each
element that is being styled is listed, followed by the indented stylesheet setting. The
syntax strips away the need for curly braces, colons, and semicolons.
For example, to change the background color for the web page to yellow, and the text
color to red, use the following for the Stylus template:
body
  background-color yellow
  color red
If you want elements to share settings, list them on the same line with a comma between
them, just like you would with CSS:
p, tr
  background-color yellow
  color red
If you want to use a pseudoclass, such as :hover or :visited, use the following syntax:
textarea
input
  background-color #fff
  &:hover
    background-color cyan
The ampersand (&) represents the parent selector. All combined, the following Stylus
template:
p, tr
  background-color yellow
  color red

textarea
input
  background-color #fff
  &:hover
    background-color cyan
generates CSS equivalent to the following:
p, tr {
  background-color: yellow;
  color: red;
}
textarea,
input {
  background-color: #fff;
}
textarea:hover,
input:hover {
  background-color: cyan;
}
There’s more to working with Stylus, but I’ll leave that to you as an off-book exercise.
The Stylus website provides a good set of documentation for the syntax. Before leaving
this chapter, though, we’ll create a Stylus stylesheet that enhances the presentation of
the widget application.
Specifically, we’ll add a border and spacing to the HTML table element in the index
widget listing page. We’re also going to change the font for the headers and remove the
Figure 8-5 shows the index page after several widgets have been added. Again, it’s
nothing fancy, but the data content is a lot easier to read with the new stylesheet.
When it comes to data, there are relational databases and Everything Else, otherwise
known as NoSQL. In the NoSQL category, one type of structured data is based on key/
value pairs, typically stored in memory for extremely fast access. The three most
popular in-memory key/value stores are Memcached, Cassandra, and Redis. Happily for
Node developers, there is Node support for all three stores.
Memcached is primarily used as a way of caching data queries for quick access in
memory. It’s also quite good with distributed computing, but has limited support for
more complex data. It’s useful for applications that do a lot of queries, but less so for
applications doing a lot of data writing and reading. Redis is the superior data store for
the latter type of application. In addition, Redis can be persisted, and it provides more
flexibility than Memcached—especially in its support for different types of data.
However, unlike Memcached, Redis works only on a single machine.
The same factors also come into play when comparing Redis and Cassandra. Like
Memcached, Cassandra has support for clusters. However, also like Memcached, it has
limited data structure support. It’s good for ad hoc queries—a use that does not favor
Redis. However, Redis is simple to use, uncomplicated, and typically faster than
Cassandra. For these reasons, and others, Redis has gained a greater following among Node
developers, which is why I picked it over Memcached and Cassandra to cover in this
chapter on key/value in-memory data stores.
I’m going to take a break from the more tutorial style of technology coverage in the
previous chapters and demonstrate Node and Redis by implementing three use cases
that typify the functionality these two technologies can provide:
• Building a game leaderboard
• Creating a message queue
• Tracking web page statistics
These applications also make use of modules and technologies covered in earlier
chapters, such as the Jade template system (covered in Chapter 8), the Async module
(covered in Chapter 5), and Express (covered in Chapter 7 and Chapter 8).
The Redis site is at http://redis.io/. Read more on Memcached at http://
memcached.org/, and on Apache Cassandra at http://cassandra.apache.org/.
I also recommend using the hiredis library, as it's nonblocking and improves
performance. Install it using the following:
npm install hiredis redis
To use redis in your Node applications, you first include the module:
var redis = require('redis');
Then you’ll need to create a Redis client. The method used is createClient:
var client = redis.createClient();
The createClient method can take three optional parameters: port, host, and options
(outlined shortly). By default, the host is set to 127.0.0.1, and the port is set to 6379.
The port is the one used by default for a Redis server, so these default settings should
be fine if the Redis server is hosted on the same machine as the Node application.
The third parameter is an object that supports the following options:
parser
The Redis protocol reply parser; set by default to hiredis. You can also use
javascript.
return_buffers
Defaults to false. If true, all replies are sent as Node buffer objects rather than
strings.
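For example, here's a client created with the default port and host spelled out, using the JavaScript parser instead of hiredis:
var redis = require('redis');

// default port and host, with an explicit options object
var client = redis.createClient(6379, '127.0.0.1', {
   parser : 'javascript',
   return_buffers : false
});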
The hset command sets a value, so there's no return data, only the Redis
acknowledgment. If you call a method that gets multiple values, such as client.hvals, the second
parameter in the callback function will be an array—either an array of single strings,
or an array of objects:
client.hvals(obj.member, function (err, replies) {
   if (err) {
      return console.error("error response - " + err);
   }
   // print each returned value
   replies.forEach(function (reply, i) {
      console.log(i + ": " + reply);
   });
});
Because the Node callback is so ubiquitous, and because so many of the Redis
commands are operations that just reply with a confirmation of success, the redis module
provides a redis.print method you can pass as the last parameter:
client.set("somekey", "somevalue", redis.print);
client.on('error', function(err) {
console.log('Error ' + err);
});
}).listen(8124);
The Redis connection is established when the server is created, and closed when the
server is closed. Another approach is to create a static client connection that persists
across requests, but this has disadvantages. For more on when to create the Redis client,
see the upcoming sidebar “When to Create the Redis Client” on page 200.
To render the template, the application reads in the template file (using a synchronous
file read, since this occurs only once, when the application is first started) and then uses
it to compile a template function:
var layout = require('fs').readFileSync(__dirname + '/score.jade', 'utf8');
var fn = jade.compile(layout, {filename: __dirname + '/score.jade'});
The compiled Jade function can then be used anytime you want to render the HTML
from the template, passing in whatever data the template is expecting:
var str = fn({scores : result});
res.end(str);
This will all make more sense when we see the complete server application, but for
now, let’s return to the Redis part of the application.
The top scores application is using two Redis calls: zrevrange to get a range of scores,
and hgetall to get all the hash fields for each member listed in the top scores. And this
is where things get a little tricky.
You can easily combine results from multiple tables when you’re using a relational
database, but the same doesn’t hold true when you’re accessing data from a key/value
data store such as Redis. It’s doable, but since this is a Node application, we have the
extra complexity of each Redis call being asynchronous.
This is where a library such as Async comes in handy. I covered Async in Chapter 5,
and demonstrated a couple of the Async methods (waterfall and parallel). One
method I didn’t demonstrate was series, which is the ideal function for our use here.
The Redis functions need to be called in order, so the data is returned in order, but
each interim step doesn’t need the data from previous steps. The Async parallel func-
tionality would run all the calls at once, which is fine, but then the results from each
are returned in a random order—not guaranteed to return highest score first. The
waterfall functionality isn’t necessary, because again, each step doesn’t need data from
the previous step. The Async series functionality ensures that each Redis hgetall call
is made in sequence and the data is returned in sequence, but takes into account that
each functional step doesn’t care about the others.
So we now have a way for the Redis commands to get called in order and ensure the
data is returned in proper sequence, but the code to do so is clumsy: we have to add a
separate step in the Async series for each hgetall Redis call and return the result once
all of the steps have completed. A helper function that manufactures a callback function
for each member keeps the code manageable:
// helper function
function makeCallbackFunc(member) {
return function(callback) {
client.hgetall(member, function(err, obj) {
callback(err,obj);
});
};
}
http.createServer(function(req,res) {
Before the HTTP server is created, we set up the Jade template function and also
establish a running client to the Redis data store. When a new request is made of the
server, we filter out all requests for the favicon.ico file (no need to call Redis for a
favicon.ico request), and then access the top five scores using zrevrange. Once the
application has the scores, it uses the Async series method to process the Redis hash
requests one at a time and in sequence so it can get an ordered result back. This resulting
array is passed to the Jade template engine.
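Inside the request handling, that sequence might look like the following sketch (the sorted-set key Scores is an assumption, async refers to the Async module, and fn is the compiled Jade template from earlier):
// get the top five members, highest score first (key name assumed)
client.zrevrange('Scores', 0, 4, function(err, scores) {
   if (err) return console.log(err);
   // one hgetall step per member, in score order
   var callFunctions = [];
   scores.forEach(function(member) {
      callFunctions.push(makeCallbackFunc(member));
   });
   // run the steps in sequence; results come back in the same order
   async.series(callFunctions, function(err, result) {
      if (err) return console.log(err);
      res.end(fn({scores : result}));
   });
});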
Figure 9-1 shows the application after I’ve added in several different scores for different
folks.
// graphics test
var re2 = /\.gif|\.png|\.jpg|\.svg/;
logs.on('exit', function(code) {
console.log('child process exited with code ' + code);
client.end();
});
Typical console log entries for this application are given in the following block of code,
with the entries of interest (the image file accesses) in bold:
/robots.txt
/weblog
/writings/fiction?page=10
/images/kite.jpg
/node/145
/culture/book-reviews/silkworm
/feed/atom/
/images/visitmologo.jpg
/images/canvas.png
/sites/default/files/paws.png
/feeds/atom.xml
Example 9-5 contains the code for the message queue. It’s a simple application that
starts a TCP server and listens for incoming messages. When it receives a message, it
extracts the data from the message and stores it in the Redis database. The application
uses the Redis rpush command to push the data on the end of the images list (bolded
in the code).
Example 9-5. Message queue that takes incoming messages and pushes them onto a Redis list
var net = require('net');
var redis = require('redis');
// Redis client used to push each incoming message onto the list
var client = redis.createClient();
var server = net.createServer(function(conn) {
   console.log('connected');
   conn.on('data', function(data) {
      console.log(data + ' from ' + conn.remoteAddress + ' ' + conn.remotePort);
// store data
client.rpush('images',data);
});
}).listen(3000);
server.on('close', function(err) {
client.quit();
});
The message queue application console log entries would typically look like the
following:
listening on port 3000
connected
/images/venus.png from 173.255.206.103 39519
/images/kite.jpg from 173.255.206.103 39519
/images/visitmologo.jpg from 173.255.206.103 39519
/images/canvas.png from 173.255.206.103 39519
/sites/default/files/paws.png from 173.255.206.103 39519
The last piece of the message queue demonstration application is the HTTP server that
listens on port 8124 for requests, shown in Example 9-6. As the HTTP server receives
each request, it accesses the Redis database, pops off the next entry in the images list,
and prints out the entry in the response. If there are no more entries in the list (i.e., if
Redis returns null as a reply), it prints out a message that the message queue is empty.
Example 9-6. HTTP server that pops off messages from the Redis list and returns to the user
var redis = require("redis"),
    http = require('http');
var messageServer = http.createServer(function(req, res) {
   // create the Redis client
   var client = redis.createClient();
   // select the database holding the message queue
   client.select(6);
   // pop the next message off the front of the images list
   client.lpop('images', function(err, reply) {
// if data
if (reply) {
res.write(reply + '\n');
} else {
res.write('End of queue\n');
}
res.end();
});
client.quit();
});
messageServer.listen(8124);
console.log('listening on 8124');
Accessing the HTTP server application with a web browser returns a URL for the image
resource on each request (browser refresh) until the message queue is empty.
// set database to 2
client.select(2);
// add IP to set
client.sadd('ip',req.socket.remoteAddress);
client.quit();
next();
}
}
The statistics interface is accessed at the top-level domain, so we’ll add the code for the
router to the index.js file in the routes subdirectory.
First, we need to add the route to the main application file, just after the route for the
top-level index:
app.get('/', routes.index);
app.get('/stats',routes.stats);
The controller code for the statistic application makes use of the Redis transaction
control, accessible via the multi function call. Two sets of data are accessed: the set of
unique IP addresses, returned by smembers, and the URL/count hash, returned with
hgetall. Both functions are invoked, in sequence, when the exec method is called, and
both sets of returned data are appended as array elements in the exec function’s callback
method, as shown in Example 9-8. Once the data is retrieved, it’s passed in a render
call to a new view, named stats. The new functionality for the index.js file appears in
bold text.
Example 9-8. The routes index file with the new controller code for the statistics application
var redis = require('redis');
// home page
exports.index = function(req, res){
res.render('index', { title: 'Express' });
};
// stats
client.select(2);
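Fleshed out, the rest of the stats controller might look like the following sketch; the hash key myurls is an assumption, while the ip set and the ips/urls template variables come from the code shown earlier and later:
// stats
exports.stats = function(req, res){
   var client = redis.createClient();
   client.select(2);
   // queue both reads; exec runs them in sequence and returns both
   // result sets at once
   client.multi()
      .smembers('ip')
      .hgetall('myurls')
      .exec(function(err, results) {
         if (err) {
            console.log(err);
            return res.send('error retrieving statistics');
         }
         // results[0] is the set of IPs, results[1] the URL/count hash
         res.render('stats', {title : 'Stats',
                              ips : results[0],
                              urls : results[1]});
         client.quit();
      });
};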
I mentioned that multi and exec are Redis transaction control commands. These aren’t
the same type of transaction controls you’re probably used to with a relational database.
All the multi command does is collect a set of Redis commands that are then processed
sequentially when the exec command is given. This type of functionality is useful in the
Node world because it provides a way of getting multiple collections of data that are
all returned at the exact same time—no need for nested callback functions or having
to use something like Step or Async to get all the data at once.
Having said that, don’t let the fact that the Redis commands are seemingly chained
together fool you into thinking that the data from one command is then available in
the next, as can happen with JavaScript functions that are chained together. Each Redis
command is processed in isolation, and the data is just added as an array element in
the result, and everything is returned at once.
The last piece of the application is the view, created as a Jade template. The template
is very simple: the IP addresses displayed in an unordered list, and the URL/counter
statistics displayed in a table. The Jade for...in syntax is used to loop through the IP
array, while the each...in syntax is used to access the property names and values of
the object that’s returned with the Redis hgetall. The template is shown
in Example 9-9.
block content
h1= title
h2 Visitor IP Addresses
ul
for ip in ips
li=ip
table
caption Page Visits
each val, key in urls
tr
td #{key}
td #{val}
Figure 9-2 shows the statistics page after several widget application resource pages have
been accessed from a couple of different IP addresses.
We don’t have to first create the hash when using hincrby. If the hash key doesn’t exist,
Redis creates it automatically and sets the value to 0 before the value is incremented.
However, this approach means we would have to get all the keys (the URLs), and then
get the counters for each URL. We can’t necessarily accomplish all of this using
multi, and because of the asynchronous nature of accessing the data, we’d end up
having to use nested callbacks or some other approach to pull all this data together.
There’s no need to go through all of that extra effort when we have built-in functionality
via the Redis hash and the hincrby command.
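For instance, counting a page access is a single call (the hash key myurls is again an assumption):
// creates the hash and/or field if needed, then increments the counter
client.hincrby('myurls', req.url, 1);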
Chapter 9 covered one popular NoSQL database structure (key/value pairs via Redis),
and this chapter covers another: document-centric data stores via MongoDB.
Where MongoDB differs from relational database systems, such as MySQL, is in its
support for storing structured data as documents, rather than implementing the more
traditional tables. These documents are encoded as BSON, a binary form of JSON,
which probably explains its popularity among JavaScript developers. Instead of a table
row, you have a BSON document; instead of a table, you have a collection.
MongoDB isn’t the only document-centric database. Other popular versions of this
type of data store are CouchDB by Apache, SimpleDB by Amazon, RavenDB, and even
the venerable Lotus Notes. There is some Node support of varying degrees for most
modern document data stores, but MongoDB and CouchDB have the most. I decided
to cover MongoDB rather than CouchDB for the same reasons I picked Express over other
frameworks: I feel it’s easier for a person with no exposure to the secondary technology
(in this case, the data store) to be able to grasp the Node examples without having to
focus overmuch on the non-Node technology. With MongoDB, we can query the data
directly, whereas with CouchDB, we work with the concept of views. This higher level
of abstraction does require more up-front time. In my opinion, you can hit the ground
running faster with MongoDB than CouchDB.
There are several modules that work with MongoDB, but I’m going to focus on two:
the MongoDB Native Node.js Driver (a driver written in JavaScript), and Mongoose,
an object modeling tool providing ORM (object-relational mapping) support.
Though I won’t get into too many details in this chapter about how
MongoDB works, you should be able to follow the examples even if you
have not worked with the database system previously. There’s more on
MongoDB, including installation help, at http://www.mongodb.org/.
The MongoDB Native Node.js Driver
The MongoDB Native Node.js Driver module is a native MongoDB driver for Node.
Using it to issue MongoDB instructions is little different from issuing these same
instructions into the MongoDB client interface.
After you have installed MongoDB (following the instructions outlined at the
MongoDB website) and started a database, install the MongoDB Native Node.js Driver with
npm:
npm install mongodb
Before trying out any of the examples in the next several sections, make sure MongoDB
is installed locally, and is running.
If you’re already using MongoDB, make sure to back up your data before
trying out the examples in this chapter.
All communication with the MongoDB occurs over TCP. The server constructor
accepts the host and port as the first two parameters—in this case, the default
localhost and port 27017. The third parameter is a set of options. In the code, the
auto_reconnect option is set to true, which means the driver attempts to reestablish a
connection if it’s lost. Another option is poolSize, which determines how many TCP
connections are maintained in parallel.
MongoDB uses one thread per connection, which is why the database
creators recommend that developers use connection pooling.
The first parameter is the database name, and the second is the MongoDB server
connection. A third parameter is an object with a set of options. The default option values
are sufficient for the work we're doing in this chapter, and the MongoDB driver
documentation covers the different values, so I'll skip repeating them in this chapter.
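Putting the constructor pieces together, opening a connection looks something like this sketch (the database name exampleDb is an assumption):
var mongodb = require('mongodb');

// server connection: host, port, and server options
var server = new mongodb.Server('localhost', 27017, {auto_reconnect : true});

// database: name, server connection, and database options
var db = new mongodb.Db('exampleDb', server, {safe : true});

db.open(function(err, db) {
   if (err) return console.log(err);
   console.log('connected to exampleDb');
   db.close();
});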
If you’ve not worked with MongoDB in the past, you may notice that the code doesn’t
have to provide username and password authentication. By default, MongoDB runs
without authentication. When authentication isn’t enabled, the database has to run in
a trusted environment. This means that MongoDB allows connections only from trus-
ted hosts, typically only the localhost address.
You can pass an optional second parameter to both methods, {safe : true}, which
instructs the driver to issue an error if the collection does not exist when used with
db.collection, and an error if the collection already exists when used with
db.createCollection:
db.collection('mycollection', {safe : true}, function(err, collection){});
db.createCollection('mycollection', {safe : true}, function(err, collection){});
If you use db.createCollection on an existing collection, you’ll just get access to the
collection—the driver won’t overwrite it. Both methods return a collection object in
the callback function, which you can then use to add, modify, or retrieve document
data.
If you want to completely drop the collection, use db.dropCollection:
db.dropCollection('mycollection', function(err, result){});
Note that all of these methods are asynchronous, and are dependent on nested callbacks
if you want to process the commands sequentially. This is demonstrated more fully in
the next section, where we’ll add data to a collection.
Once you have a reference to a collection, you can add documents to it. The data is
structured as JSON, so you can create a JSON object and then insert it directly into the
collection.
collection.insert(widget1);
//close database
db.close();
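Pieced together, the insert sequence looks roughly like the following sketch (the database and collection names are assumptions):
var mongodb = require('mongodb');
var server = new mongodb.Server('localhost', 27017, {auto_reconnect : true});
var db = new mongodb.Db('exampleDb', server, {safe : true});

db.open(function(err, db) {
   if (err) return console.log(err);
   db.collection('widgets', function(err, collection) {
      if (err) return console.log(err);
      var widget1 = {title : 'First Great widget',
                     desc : 'greatest widget of all',
                     price : 14.99};
      collection.insert(widget1, {safe : true}, function(err, result) {
         if (err) return console.log(err);
         console.log(result);
         // close database
         db.close();
      });
   });
});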
The output to the console after the second insert is a variation on:
[ { title: 'Second Great widget',
desc: 'second greatest widget of all',
price: 29.99,
_id: 4fc108e2f6b7a3e252000002 } ]
MongoDB generates a unique system identifier for each document. You can access
documents with this identifier at a future time, but you’re better off adding a more
meaningful identifier—one that can be determined easily by context of use—for each
document.
As mentioned earlier, we can insert multiple documents at the same time by providing
an array of documents rather than a single document. The following code demonstrates
how both widget records can be inserted in the same command. The code also
incorporates an application identifier with the id field:
// create two records
var widget1 = {id: 1, title : 'First Great widget',
desc : 'greatest widget of all',
price : 14.99};
var widget2 = {id: 2, title : 'Second Great widget',
desc : 'second greatest widget of all',
price : 29.99};
// insert both records with a single call
collection.insert([widget1, widget2], {safe : true}, function(err, result) {
   if (!err) {
      console.log(result);
      // close database
      db.close();
   }
});
If you do batch your document inserts, you’ll need to set the keepGoing option to
true to be able to keep inserting documents even if one of the insertions fails. By default,
the application stops if an insert fails.
The options allow for a great deal of flexibility with queries, though most queries will
most likely need only a few of them. I’ll cover some of the options in the examples, but
I recommend you try the others with your example MongoDB installation.
The simplest query for all documents in the collection is to use the find method without
any parameters. You immediately convert the results to an array using toArray, passing
in a callback function that takes an error and an array of documents. Example 10-2
shows the application that performs this functionality.
Example 10-2. Inserting four documents and then retrieving them with the find method
var mongodb = require('mongodb');
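The retrieval at the heart of the example follows this pattern (a sketch, assuming an open collection as in the earlier examples):
// retrieve all documents, converting the result to an array
collection.find().toArray(function(err, docs) {
   if (err) {
      console.log(err);
   } else {
      console.log(docs);
      // close database
      db.close();
   }
});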
The result printed out to the console shows all four newly added documents, with their
system-generated identifiers:
[ { id: 1,
title: 'First Great widget',
desc: 'greatest widget of all',
price: 14.99,
type: 'A',
_id: 4fc109ab0481b9f652000001 },
{ id: 2,
title: 'Second Great widget',
desc: 'second greatest widget of all',
price: 29.99,
type: 'A',
_id: 4fc109ab0481b9f652000002 },
{ id: 3,
title: 'third widget',
desc: 'third widget',
price: 45,
type: 'B',
_id: 4fc109ab0481b9f652000003 },
{ id: 4,
title: 'fourth widget',
desc: 'fourth widget',
price: 60,
Rather than return all of the documents, we can provide a selector. In the following
code, we’re querying all documents that have a type of A, and returning all the fields
but the type field:
// return all documents
collection.find({type:'A'},{fields:{type:0}}).toArray(function(err, docs) {
if(err) {
console.log(err);
} else {
console.log(docs);
//close database
db.close();
}
});
We can also access only one document using findOne. The result of this query does not
have to be converted to an array, and can be accessed directly. In the following, the
document with an ID of 1 is queried, and only the title is returned:
// return one document
collection.findOne({id:1},{fields:{title:1}}, function(err, doc) {
if (err) {
console.log(err);
} else {
console.log(doc);
//close database
db.close();
}
});
If a document already exists in the database and you want to modify the title, you can use the update method to do so, as shown
in Example 10-3. You can supply all of the fields, and MongoDB does a replacement
of the document, but you’re better off using one of the MongoDB modifiers, such as
$set. The $set modifier instructs the database to just modify whatever fields are passed
as properties to the modifier.
Example 10-3. Updating a MongoDB document
var mongodb = require('mongodb');
//update
collection.update({id:4},
{$set : {title: 'Super Bad Widget'}},
{safe: true}, function(err, result) {
if (err) {
console.log(err);
} else {
console.log(result);
// query for updated record
collection.findOne({id:4}, function(err, doc) {
if(!err) {
console.log(doc);
//close database
db.close();
}
});
}
});
});
}
});
There are additional modifiers that provide other atomic data updates of interest:
$inc
Increments a field’s value by a specified amount
$set
Sets a field, as demonstrated
$unset
Deletes a field
$push
Appends a value to the array if the field is an array (converts it to an array if it wasn’t)
$pushAll
Appends several values to an array
$addToSet
Adds to an array only if the field is an array
More importantly, the use of modifiers ensures that the action is performed in place,
providing some assurance that one person’s update won’t overwrite another’s.
Though we used none in the example, the update method takes four options:
• safe for a safe update
• upsert, a Boolean set to true if an insert should be made if the document doesn’t
exist (default is false)
• multi, a Boolean set to true if all documents that match the selection criteria should
be updated
• serializeFunction to serialize functions on the document
If you’re unsure whether a document already exists in the database, set the upsert
option to true.
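A minimal sketch of an upsert, with an illustrative selector and fields:
// inserts the document if no match for the selector exists;
// otherwise, updates the matching document
collection.update({id : 5},
   {$set : {title : 'Brand New Widget', price : 19.99}},
   {upsert : true, safe : true}, function(err, result) {
      if (!err) console.log(result);
});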
Example 10-3 did a find on the modified record to ensure that the changes took effect.
A better approach would be to use findAndModify. The parameters are close to what’s
used with the update, with the addition of a sort array as the second parameter. If
multiple documents are returned, updates are performed in sort order:
//update
collection.findAndModify({id:4}, [['id',1]],
   {$set : {title: 'Super Widget', desc: 'A really great widget'}},
   {new: true}, function(err, doc) {
      if (err) {
         console.log(err);
      } else {
         console.log(doc);
      }
});
You can use the findAndModify method to remove a document if you use the remove
option. If you do, no document is returned in the callback function. You can also use
the remove and the findAndRemove methods to remove the document. Earlier examples
have used remove, without a selector, to remove all the documents before doing an
insert. To remove an individual document, provide a selector:
collection.remove({id:4},
{safe: true}, function(err, result) {
if (err) {
console.log(err);
} else {
console.log(result);
}
});
The result is the number of documents removed (in this case, 1). To see the document
being removed, use findAndRemove:
collection.findAndRemove({id:3}, [['id',1]],
function(err, doc) {
if (err) {
console.log(err);
} else {
console.log(doc);
}
});
I’ve covered the basic CRUD (create, read, update, delete) operations you can perform
from a Node application with the Native driver, but there are considerably more ca-
pabilities, including working with capped collections, indexes, and the other MongoDB
modifiers; sharding (partitioning data across machines); and more. The Native driver
documentation covers all of these and provides good examples.
The examples demonstrate some of the challenges associated with handling data access
in an asynchronous environment, discussed more fully in the sidebar “Challenges of
Asynchronous Data Access” on page 220.
Instead of issuing commands directly against a MongoDB database, with Mongoose you
define objects using the Mongoose Schema object, and then sync them with the database
using the Mongoose model object:
var Widget = new Schema({
sn : {type: String, required: true, trim: true, unique: true},
name : {type: String, required: true, trim: true},
desc : String,
price : Number
});
When we define the object, we provide information that controls what happens to that
document field at a later time. In the code just provided, we define a Widget object with
four explicit fields: three of type String, and one of type Number. The sn and name fields
are both required and trimmed, and the sn field must be unique in the document
database.
The collection isn’t made at this point, and won’t be until at least one document is
created. When we do create it, though, it’s named widgets—the widget object name is
lowercased and pluralized.
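As a minimal sketch (the connection URI, database name, and widget values here are
assumptions), compiling the schema into a model and creating that first document
looks like this:

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/widgetdb'); // database name assumed

// compile the schema into a model; the backing collection becomes 'widgets'
var WidgetModel = mongoose.model('Widget', Widget);

var widget = new WidgetModel({sn: 'X1', name: 'First Widget', price: 14.99});
widget.save(function(err) {
   if (err) console.log(err);
});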
In the app.js file, I added a route for a new stats page, served by the main controller:
app.get('/stats', main.stats);
Next, I added a new subdirectory named models. The MongoDB model definitions are
stored in this subdirectory, as the controller code is in the controllers subdirectory. The
directory structure now looks like the following:
/application directory
/main - home directory controller
/controllers - object controllers
/public - static files
/widgets
/views - template files
/widgets
The next change to the application is related to the structure of the data. Currently, the
application’s primary key is an ID field, system-generated but accessible by the user via
the routing system. To show a widget, you’d use a URL like the following:
http://localhost:3000/widgets/1
The application now uses a user-provided serial number (sn) as the key for finding
individual widgets. This modification necessitates some changes to the user interface,
but they're worthwhile. The Jade templates also need to be changed, but the change is
minor: basically replacing references to id with references to sn, and adding a field
for serial number to any form.
Rather than duplicate all the code again to show minor changes, I’ve
made the examples available at O’Reilly’s catalog page for this book
(http://oreilly.com/catalog/9781449323073); you’ll find all of the new
widget application files in the chap12 subdirectory.
The more significant change is to the controller code in the widget.js file. The changes
to this file, and others related to adding a MongoDB backend, are covered in the next
section.
// connect to the database (the database name here is an assumption)
mongoose.connect('mongodb://127.0.0.1/widgets');

mongoose.connection.on('open', function() {
   console.log('Connected to Mongoose');
});
Notice the URI for the MongoDB. The specific database is passed as the last part of the
URI.
This change and the aforementioned change converting routes to main are all the
changes necessary for app.js.
The next change is to maproutecontroller.js. The routes that reference id must be
changed to now reference sn. The modified routes are shown in the following code
block:
// show
app.get(prefix + '/:sn', prefixObj.show);
// edit
app.get(prefix + '/:sn/edit', prefixObj.edit);
// update
app.put(prefix + '/:sn', prefixObj.update);
// destroy
app.del(prefix + '/:sn', prefixObj.destroy);
If we don’t make this change, the controller code expects sn as a parameter but gets
id instead.
The next code is an addition, not a modification. In the models subdirectory, a new file
is created, named widgets.js. This is where the widget model is defined. To make the
model accessible outside the file, it’s exported, as shown in Example 10-4.
Example 10-4. The new widget model definition
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

// the same schema defined earlier in the chapter
var WidgetSchema = new Schema({
   sn : {type: String, required: true, trim: true, unique: true},
   name : {type: String, required: true, trim: true},
   desc : String,
   price : Number
});

// export the compiled model
module.exports = mongoose.model('Widget', WidgetSchema);
The controller code in controllers/widgets.js then requires this model and uses it for
all data access. For example, the create method builds a widget object from the form
fields and saves it:
var Widget = require('../models/widgets.js');

// add a widget
exports.create = function(req, res) {
var widget = {
sn : req.body.widgetsn,
name : req.body.widgetname,
price : parseFloat(req.body.widgetprice),
desc: req.body.widgetdesc};
var widgetObj = new Widget(widget);
widgetObj.save(function(err, data) {
if (err) {
res.send(err);
} else {
console.log(data);
res.render('widgets/added', {title: 'Widget Added', widget: widget});
}
});
};
// show a widget
exports.show = function(req, res) {
   var sn = req.params.sn;
   Widget.findOne({sn : sn}, function(err, doc) {
      if (err)
         res.send('There is no widget with sn of ' + sn);
      else
         // view name assumed from the pattern of the other methods
         res.render('widgets/show', {title: 'Show Widget', widget: doc});
   });
};
// delete a widget
exports.destroy = function(req, res) {
   var sn = req.params.sn;
   Widget.remove({sn : sn}, function(err) {
      res.send('deleted widget ' + sn); // response wording assumed
   });
};
// update a widget
exports.update = function(req, res) {
   var sn = req.params.sn;
   var widget = {
      sn : req.body.widgetsn,
      name : req.body.widgetname,
      price : parseFloat(req.body.widgetprice),
      desc : req.body.widgetdesc};
   // update the matching document; view name assumed
   Widget.update({sn : sn}, {$set : widget}, function(err) {
      if (err)
         res.send('There is no widget with sn of ' + sn);
      else
         res.render('widgets/updated', {title: 'Widget Updated', widget: widget});
   });
};
Now the widget application’s data is persisted to a database, rather than disappearing
every time the application is shut down. And the entire application is set up in such a
way that we can add support for new data entities with minimal impact on the stable
components of the application.
In traditional web development, relational databases are the most popular means of
data storage. Node, perhaps because of the type of applications it attracts, or perhaps
because it attracts use cases that fall outside the traditional development box, doesn't
follow this pattern: there is far more support for data stores such as Redis and Mon-
goDB than there is for relational databases.
There are some relational database modules you can use in your Node applications,
but they may not be as complete as you’re used to with database bindings in languages
such as PHP and Python. In my opinion, the Node modules for relational databases
are not yet production ready.
On the positive side, though, the modules that do support relational databases are quite
simple to use. In this chapter I’m going to demonstrate two different approaches to
integrating a relational database, MySQL, into Node applications. One approach uses
mysql (node-mysql), a popular JavaScript MySQL client. The other approach uses db-
mysql, which is part of the new node-db initiative to create a common framework for
database engines from Node applications. The db-mysql module is written in C++.
Neither of the modules mentioned currently supports transactions, but mysql-queues
adds this type of functionality on top of node-mysql. I'll provide a quick demonstration
on this, and also offer a brief introduction to Sequelize, an ORM (object-relational
mapping) library that works with MySQL.
There are a variety of relational databases, including SQL Server, Oracle, and SQLite.
I'm focusing on MySQL because installations are available for both Windows and Unix
environments, it's free for noncommercial use, and it's the database most of us have
already used with web applications. It's also the relational database with the most
support in Node.
The test database used in the chapter is named nodetest2, and it contains one table
with the following structure:
id - int(11), primary key, not null, autoincrement
title - varchar(255), unique key, not null
text - text, nulls allowed
created - datetime, nulls allowed
The db-mysql module provides two classes to interact with the MySQL database. The
first is the database class, which you use to connect and disconnect from the database
and do a query. The query class is what’s returned from the database query method.
You can use the query class to create a query either through chained methods repre-
senting each component of the query, or directly using a query string; db-mysql is very
flexible.
Results, including any error, are passed in the last callback function for any method.
You can use nested callbacks to chain actions together, or use the EventEmitter event
handling in order to process both errors and database command results.
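For instance, a minimal sketch of the two styles (using this chapter's nodetest2 table,
and assuming a connected db object like the one created below) might look like the
following:

// chained methods, with results in the callback to execute
db.query().select('*').from('nodetest2')
   .execute(function(err, rows, columns) {
      if (err) console.log(err);
      else console.log(rows);
});

// or the same query as a direct query string
db.query('SELECT * FROM nodetest2')
   .execute(function(err, rows, columns) {
      if (err) console.log(err);
      else console.log(rows);
});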
When creating the connection to a MySQL database, you can pass several options that
control the connection. You'll need to provide, at minimum, a hostname or a port or
a socket, as well as a user, password, and database name:
var db = new mysql.Database({
hostname: 'localhost',
user: 'username',
password: 'userpass',
database: 'databasenm'
});
The options are detailed in the db-mysql documentation, as well as in the MySQL
documentation.
// connect
db.connect();
db.on('error', function(error) {
console.log("CONNECTION ERROR: " + error);
});
// database connected
db.on('ready', function(server) {
   console.log('connected to ' + server.hostname);
});
The database object emits a ready event once the database is connected, or error if
there’s a problem with making the connection. The server object passed as a parameter
for the callback function to the ready event contains the following properties:
hostname
The database hostname
If the success event is for a query that performs an update, delete, or insert, the success
event callback function receives a result object as a parameter. I’ll cover this object in
more detail in the next section.
Though the queries are each handled using different approaches, both have to be im-
plemented within the database’s success event callback function. Since db-mysql is
Node functionality, the methods are asynchronous. If you tried to do one of the queries
outside of the database connect callback function, it wouldn’t succeed because the
database connection won’t be established at that point.
Placeholders can be used either with a direct query string or with the chained methods.
Placeholders are a way of creating the query string ahead of time and then just passing
in whatever values are needed. The placeholders are represented by question marks
(?) in the string, and each value is given as an array element in the second parameter to
the method.
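A minimal sketch of a placeholder query (again using the nodetest2 table) might look
like this:

// the value 1 replaces the ? when the query is run
var qry = db.query('SELECT * FROM nodetest2 WHERE id = ?', [1]).execute();
qry.on('success', function(rows) {
   console.log(rows);
});
qry.on('error', function(error) {
   console.log('Error: ' + error);
});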
The result of the operation being performed on the database is reflected in the parameter
returned in the callback for the success event. In Example 11-2, a new row is inserted
into the test database. Note that it makes use of the MySQL NOW function to set the
created field with the current date and time. When using a MySQL function, you’ll
need to place it directly into the query string—you can’t use a placeholder.
// connect
db.connect();
db.on('error', function(error) {
console.log("CONNECTION ERROR: " + error);
});
// database connected
db.on('ready', function(server) {
   // the insert itself, reconstructed from the description above; NOW() goes
   // directly in the string, while the other values use placeholders
   var qry = db.query('INSERT INTO nodetest2 (title, text, created) ' +
      'VALUES (?, ?, NOW())',
      ['Fourth item', 'This is the fourth item']).execute();
qry.on('success', function(result) {
console.log(result);
});
qry.on('error', function(error) {
console.log('Error: ' + error);
});
});
The id is the generated identifier for the table row; the affected property shows the
number of rows affected by the change (1); and the warnings property shows how many
warnings the query generated (in this case, 0).
Database table row updates and deletions are handled in the same manner: either use
the exact syntax you’d use in a MySQL client, or use placeholders. Example 11-3 adds
a new record to the test database, updates the title, and then deletes the same record.
You'll notice I created a different query object for each query. Though you can run the
same query multiple times, each query has its own arguments—including the number
of arguments it expects each time the query is run—so each operation gets its own
query object and its own replacement values:
// connect
db.connect();
db.on('error', function(error) {
console.log("CONNECTION ERROR: " + error);
});
// database connected
db.on('ready', function(server) {
   // ... the insert, update, and delete queries of Example 11-3 run here;
   // the full example is in the book's downloadable code ...
});
One thing you might notice from the example is there’s no way to roll back previous
SQL statements if an error occurs in any of them. At this time, there is no transaction
management in db-mysql. If you need to ensure database consistency, you’ll have to
provide it yourself in your application. You can do this by checking for an error after
each SQL statement is executed, and then reversing previous successful operation(s) if
a failure occurs. It’s not an ideal situation, and you’ll have to be careful about the use
of any autoincrementing.
// connect
db.connect();
db.on('error', function(error) {
console.log("CONNECTION ERROR: " + error);
});
// database connected
db.on('ready', function(server) {
   // ... the chained-method queries of this example run here ...
});
I’m not overfond of the chained methods, though I think they’re handy if you’re bring-
ing in data from an application, or if your application may support multiple databases.
The node-mysql module is quite simple to use. You create a client connection to the MySQL
database, select the database to use, and use this same client to do all database opera-
tions via the query method. A callback function can be passed as the last parameter in
the query method, and provides information related to the last operation. If no callback
function is used, you can listen for events to determine when processes are finished.
var mysql = require('mysql');

// create client (credentials assumed)
var client = mysql.createClient({
   user: 'username',
   password: 'userpass'
});
client.query('USE databasenm');

// create
client.query('INSERT INTO nodetest2 ' +
   'SET title = ?, text = ?, created = NOW()',
   ['A seventh item', 'This is a seventh item'], function(err, result) {
   if (err) {
      console.log(err);
   } else {
      var id = result.insertId;
      console.log(result.insertId);

      // update
      client.query('UPDATE nodetest2 SET ' +
         'title = ? WHERE ID = ?', ['New title', id], function (err, result) {
         if (err) {
            console.log(err);
         } else {
            console.log(result.affectedRows);

            // delete
            client.query('DELETE FROM nodetest2 WHERE id = ?',
               [id], function(err, result) {
               if (err) {
                  console.log(err);
               } else {
                  console.log(result.affectedRows);

                  // retrieve data
                  getData();
               }
            });
         }
      });
   }
});

function getData() {
   client.query('SELECT * FROM nodetest2 ORDER BY id',
      function(err, result, fields) {
      if (err) {
         console.log(err);
      } else {
         console.log(result);
         console.log(fields);
         client.end();
      }
   });
}
The query results are what we’d expect: an array of objects, each representing one row
from the table. The following is an example of the output, representing the first returned
row:
[ { id: 1,
title: 'This was a better title',
text: 'this is a nice text',
created: Mon, 16 Aug 2010 15:00:23 GMT },
... ]
The fields parameter also matches our expectations, though the format can differ from
other modules. Rather than an array of objects, what’s returned is an object where each
table field is an object property, and its value is an object representing information
about the field. I won’t duplicate the entire output, but the following is the information
returned for the first field, id:
{ id:
{ length: 53,
received: 53,
number: 2,
type: 4,
catalog: 'def',
db: 'nodetest2',
table: 'nodetest2',
originalTable: 'nodetest2',
name: 'id',
originalName: 'id',
charsetNumber: 63,
fieldLength: 11,
fieldType: 3,
flags: 16899,
decimals: 0 }, ...
The module doesn’t support multiple SQL statements concatenated onto each other,
and it doesn’t support transactions. The only way to get a close approximation to
transaction support is with mysql-queues, discussed next.
var mysql = require('mysql');
var queues = require('mysql-queues');

// connect to database
var client = mysql.createClient({
   user: 'username',
   password: 'password'
});
client.query('USE databasenm');

// attach mysql-queues to the client, then create a queue
queues(client);
var q = client.createQueue();
// do insert
q.query('INSERT INTO nodetest2 (title, text, created) ' +
'values(?,?,NOW())',
['Title for 8', 'Text for 8']);
// update
q.query('UPDATE nodetest2 SET title = ? WHERE title = ?',
['New Title for 8','Title for 8']);
q.execute();
If you want transactional support, you’ll need to start a transaction rather than a queue.
And you’ll need to use a rollback when an error occurs, as well as a commit when you’re
finished with the transaction. Again, once you call execute on the transaction,
any queries following the method call are queued until the transaction is committed
or rolled back:
var mysql = require('mysql');
var queues = require('mysql-queues');

// connect to database
var client = mysql.createClient({
   user: 'username',
   password: 'password'
});
client.query('USE databasenm');

// attach mysql-queues
queues(client);
// create transaction
var trans = client.startTransaction();
// do insert
trans.query('INSERT INTO nodetest2 (title, text, created) ' +
'values(?,?,NOW())',
['Title for 8', 'Text for 8'], function(err,info) {
if (err) {
trans.rollback();
} else {
console.log(info);
// update
trans.query('UPDATE nodetest2 SET title = ? WHERE title = ?',
['Better Title for 8','Title for 8'], function(err,info) {
if(err) {
trans.rollback();
} else {
console.log(info);
trans.commit();
}
});
}
});
trans.execute();
Defining a Model
To use Sequelize, you define the model, which is a mapping between the database
table(s) and JavaScript objects. In our previous examples, we worked with a simple
table, nodetest2, with the following structure:
id - int(11), primary key, not null
title - varchar(255), unique key, not null
text - text, nulls allowed
created - datetime, nulls allowed
You create the model for this database table using the appropriate data types and flags
for each field:
var Sequelize = require('sequelize');

// initialize (database name and credentials assumed)
var sequelize = new Sequelize('databasenm', 'username', 'userpass');

// define model
var Nodetest2 = sequelize.define('nodetest2',
{id : {type: Sequelize.INTEGER, primaryKey: true},
title : {type: Sequelize.STRING, allowNull: false, unique: true},
text : Sequelize.TEXT,
created : Sequelize.DATE
});
When you do so, and examine the table in the database, you’ll find that the table and
the model are different because of changes Sequelize makes to the table. For one, it’s
now called nodetest2s, and for another, there are two new table fields:
id - int(11), primary key, autoincrement
title - varchar(255), unique key, nulls not allowed
text - text, nulls allowed
created - datetime, nulls allowed
createdAt - datetime, nulls not allowed
updatedAt - datetime, nulls not allowed
These are changes that Sequelize makes, and there’s no way to prevent it from making
them. You’ll want to adjust your expectations accordingly. For starters, you’ll want to
drop the column created, since you no longer need it. You can do this using Sequelize
by deleting the field from the class and then running the sync again:
// define model
var Nodetest2 = sequelize.define('nodetest2',
{id : {type: Sequelize.INTEGER, primaryKey: true},
title : {type: Sequelize.STRING, allowNull: false, unique: true},
text : Sequelize.TEXT,
});
// sync
Nodetest2.sync().error(function(err) {
console.log(err);
});
Now you have a JavaScript object representing the model that also maps to a relational
database table. Next, you need to add some data to the table.
// define model
var Nodetest2 = sequelize.define('nodetest2',
{id : {type: Sequelize.INTEGER, primaryKey: true},
title : {type: Sequelize.STRING, allowNull: false, unique: true},
text : Sequelize.TEXT,
});
// sync
Nodetest2.sync().error(function(err) {
console.log(err);
});
// create an object to work with first (values assumed), then update it twice
Nodetest2.create({title: 'New object', text: 'newest object'}).success(function() {
   // first update
   Nodetest2.find({where : {title: 'New object'}}).success(function(test) {
      test.title = 'New object title';
      test.save().error(function(err) {
         console.log(err);
      });
      test.save().success(function() {
         // second update
         Nodetest2.find(
            {where : {title: 'New object title'}}).success(function(test) {
            test.updateAttributes(
               {title: 'An even better title'}).success(function() {
               // find all
               Nodetest2.findAll().success(function(tests) {
                  console.log(tests);
               });
            });
         });
      });
   });
});
When printing out the results of the findAll, you might be surprised at how much data
you’re getting back. Yes, you can access the properties directly from the returned value,
first by accessing the array entry, and then accessing the value:
tests[0].id; // returns identifier
But the rest of the data associated with this new object demonstrates that you're no
longer in the world of simple relational database bindings. Here's an example of one
returned object:
[ { attributes: [ 'id', 'title', 'text', 'createdAt', 'updatedAt' ],
validators: {},
__factory:
{ options: [Object],
name: 'nodetest2',
tableName: 'nodetest2s',
rawAttributes: [Object],
daoFactoryManager: [Object],
associations: {},
validate: {},
autoIncrementField: 'id' },
__options:
{ underscored: false,
hasPrimaryKeys: false,
timestamps: true,
paranoid: false,
instanceMethods: {},
classMethods: {},
validate: {},
freezeTableName: false,
id: 'INTEGER NOT NULL auto_increment PRIMARY KEY',
title: 'VARCHAR(255) NOT NULL UNIQUE',
text: 'TEXT',
... } ]
If you have several operations to perform, you can use the Sequelize QueryChainer to
add the queries and run them all at once, rather than nesting callbacks:
// define model
var Nodetest2 = sequelize.define('nodetest2',
{id : {type: Sequelize.INTEGER, primaryKey: true},
title : {type: Sequelize.STRING, allowNull: false, unique: true},
text : Sequelize.TEXT,
});
// sync
Nodetest2.sync().error(function(err) {
console.log(err);
});
var chainer = new Sequelize.Utils.QueryChainer;
chainer.add(Nodetest2.create({title: 'A second object',text: 'second'}))
.add(Nodetest2.create({title: 'A third object', text: 'third'}));
chainer.run()
   .error(function(errors) {
      console.log(errors);
   })
   .success(function() {
      console.log('chained queries finished');
   });
This is much simpler, and much easier to read, too. Plus the approach makes it simpler
to work with a user interface or an MVC application.
There is much more about Sequelize at the module’s documentation website, including
how to deal with associated objects (relations between tables).
Node provides numerous opportunities to work with several different graphics appli-
cations and libraries. Since it’s a server technology, your applications can make use of
any server-based graphics software, such as ImageMagick or GD. However, since it’s
also based on the same JavaScript engine that runs the Chrome browser, you can work
with client-side graphics applications, such as Canvas and WebGL, too.
Node also has some support for serving up audio and video files via the new HTML5
media capabilities present in all modern browsers. Though Node's capabilities for
working directly with audio and video are limited, we can serve files of both types, as we've
seen in previous chapters. We can also make use of server-based technologies, such as
FFmpeg.
No chapter on web graphics would be complete without mentioning PDFs at least once.
Happily for those of us who make use of PDF documents in our websites, we have
access to a very nice PDF generation Node module, as well as access to various helpful
PDF tools and libraries installed on the server.
I’m not going to exhaustively cover every form of graphics or media implementation
and management capability from Node. For one, I’m not familiar with all of them, and
for another, some of the support is still very primitive, or the technologies can be ex-
tremely resource intensive. Instead, I’ll focus on more stable technologies that make
sense for a Node application: basic photo manipulation with ImageMagick, HTML5
video, working with and creating PDFs, and creating/streaming images created with
Canvas.
You have a couple of options for working with PDFs from a Node application. One
approach is to use a Node child process to access an operating system tool, such as the
PDF Toolkit or wkhtmltopdf directly on Linux. Another approach is to use a module,
such as the popular PDFKit. Or you can always use both.
After installing wkhtmltopdf itself, I had to install a tool (xvfb) that allows it to run
headless in a virtual X server (bypassing the X Windows dependency):
apt-get install xvfb
Next, I created a shell script, named wkhtmltopdf.sh, to wrap the wkhtmltopdf in xvfb.
It contains one line:
xvfb-run -a -s "-screen 0 640x480x16" wkhtmltopdf $*
I then moved the shell script to /usr/bin, and changed permissions with chmod a+x. Now
I’m ready to access wkhtmltopdf from my Node applications.
The wkhtmltopdf tool supports a large number of options, but I’m going to demon-
strate how to use the tool simply from a Node application. On the command line, the
following takes a URL to a remote web page and then generates a PDF using all default
settings (using the shell script version):
wkhtmltopdf.sh http://remoteweb.com/page1.html page1.pdf
To implement this in Node, we need to use a child process. For extensibility, the
application should also take the name of the input URL, as well as the output file. The
entire application is in Example 12-1.
Example 12-1. Creating a PDF from a web page using a wkhtmltopdf child process
var spawn = require('child_process').spawn;

// input URL and output filename from the command line
// (argument handling reconstructed from the description above)
var url = process.argv[2];
var output = process.argv[3];

var wkhtmltopdf = spawn('wkhtmltopdf.sh', [url, output]);

wkhtmltopdf.stdout.setEncoding('utf8');
wkhtmltopdf.stdout.on('data', function (data) {
   console.log(data);
});
You typically wouldn’t use wkhtmltopdf in a Node application by itself, but it can be
a handy addition to any website or application that wants to provide a way to create a
persistent PDF of a web page.
The format is easily converted into an object for simpler access of the individual
properties.
PDF Toolkit is a reasonably responsive tool, but you’ll want to use caution when hold-
ing up a web response waiting for it to finish. To demonstrate how to access PDF Toolkit
from a Node web application, and how to deal with the expected lag time that working
with a computationally expensive graphics application can cause, we’ll build a simple
PDF uploader.
The form to upload the PDF is basic, needing little explanation. It uses a file input field
in addition to a field for the person’s name and email address, and sets the method to
POST and the action to the web service. Since we're uploading a file, the enctype
attribute must be set to multipart/form-data. The finished form page can be seen
in Example 12-2.
Example 12-2. Form to upload a PDF file
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8" />
<title>Upload PDF</title>
<script>
window.onload=function() {
document.getElementById('upload').onsubmit=function() {
document.getElementById('submit').disabled=true;
};
}
</script>
</head>
<body>
<form id="upload" method="POST" action="http://localhost:8124"
enctype="multipart/form-data">
<p><label for="username">User Name:</label>
<input id="username" name="username" type="text" size="20" required /></p>
<p><label for="email">Email:</label>
<input id="email" name="email" type="text" size="20" required /></p>
<p><label for="pdffile">PDF File:</label>
<input type="file" name="pdffile" id="pdffile" required /></p>
<p>
<input type="submit" name="submit" id="submit" value="Submit"/>
</p>
</form>
</body>
</html>
We have a chance to brush up on our client-side JavaScript skills by disabling the submit
button when the form is submitted. The form makes use of the HTML5 required at-
tribute, which ensures that the proper data is provided.
The web service application that processes both the request for the form and the PDF
upload uses the Connect middleware, this time without the Express framework.
In the service, the Connect static middleware is used to serve up static files, and the
directory middleware is used to pretty-print a directory listing when a directory is ac-
cessed. The only other functionality that's needed is the process to parse out both the
PDF file and the form data from the upload. The application uses the Connect
bodyParser middleware, which is capable of processing any type of posted data:
connect()
.use(connect.bodyParser({uploadDir: __dirname + '/pdfs'}))
.use(connect.static(__dirname + '/public'))
.use(connect.directory(__dirname + '/public'))
.use(upload)
.listen(8124);
The data is then made available to a custom middleware named upload, which handles
both the data and the PDF—invoking a custom module to process the PDF file. The
bodyParser middleware makes the username and email available on the request.body
object, and the uploaded file on the request.files object. If a file is uploaded, it’s
uploaded as an object named pdffile because that’s the name of the file upload field.
You’ll need an additional test on the file type to ensure that the file uploaded is a PDF.
Example 12-3 has the complete code for the PDF service application.
Example 12-3. PDF upload web service application
var connect = require('connect');
var pdfprocess = require('./pdfprocess');
// if POST
// upload file, kick off PDF burst, respond with ack
function upload(req, res, next){
   if ('POST' != req.method) return next();
   res.setHeader('Content-Type', 'text/html');
   if (req.files.pdffile && req.files.pdffile.type === 'application/pdf') {
      // hand the file and form data to the PDF module
      // (function name and signature assumed)
      pdfprocess.processFile(req.body.username, req.body.email,
         req.files.pdffile.path, req.files.pdffile.name);
      res.write('<p>Thanks ' + req.body.username +
         ' for uploading ' + req.files.pdffile.name + '</p>');
      res.end("<p>You'll receive an email with file links when processed.</p>");
   } else {
      res.end('<p>The file uploaded was not a PDF.</p>');
   }
}
The custom module pdfprocess is where the application performs the following steps
to process the PDF file:
1. A directory is created for the user under the public pdfs subdirectory if none exists.
2. A timestamp value is used with the file to create a unique name for the current
uploaded PDF.
3. The timestamp is used with the PDF filename to create a new subdirectory for the
PDFs under the user’s subdirectory.
4. The PDF is moved from the temporary upload directory to this new directory, and
renamed to the original PDF filename.
5. The PDF Toolkit burst operation is performed on this file, with all the individual
PDFs placed in the pdfs directory.
6. An email is sent to the user providing a URL/link where he can access the new
directory containing the original uploaded PDF and the individual PDF pages.
The filesystem functionality is provided by the Node File System module, the email
functionality is handled by Emailjs, and the PDF Toolkit functionality is managed in a
child process. There is no data returned from this child process, so the only events
captured are child process exit and error events. Example 12-4 contains the code for
this final piece of the application.
Example 12-4. Module to process PDF file and send user email with location of processed files
var fs = require('fs');
var spawn = require('child_process').spawn;
var emailjs = require('emailjs');

// exported entry point; the function name and signature are assumed, and the
// directory, file, and URL values are derived per the steps listed above
exports.processFile = function(username, email, path, filename) {
   var dir = __dirname + '/public/pdfs/' + username + '/' + Date.now();
   var newfile = dir + '/' + filename;
   var url = 'http://examplesite.com/pdfs/' + username; // site URL assumed

   fs.mkdir(dir, function(err) {
      if (err)
         return console.log(err);

      // move the uploaded file into place, then burst the pdf
      fs.rename(path, newfile, function(err) {
         if (err)
            return console.log(err);

         //burst pdf
         var pdftk = spawn('pdftk', [newfile, 'burst', 'output',
            dir + '/page_%02d.pdf' ]);

         // no data events to capture; wait for the child process to exit
         pdftk.on('exit', function(code) {
            console.log('sending email');

            // send email
            var headers = {
               text : 'You can find your split PDF at ' + url,
               from : 'youremail',
               to : email,
               subject: 'split pdf'
            };
            // ... the SMTP connection and send call are sketched below ...
         });
      });
   });
};
The actual child process call to PDF Toolkit is in bold text in the code. The command-
line syntax used is the following:
pdftk filename.pdf burst output /home/location/page_%02d.pdf
The filename is given first, then the operation, and then an output directive. The op-
eration is, as mentioned earlier, the burst operation, which splits the PDF into separate
pages. The output directive instructs PDF Toolkit to place the newly split PDF pages
in a specific directory, and provides formatting for the page names—the first page
would be page_01.pdf, the second page_02.pdf, and so on. I could have used Node’s
process.chdir to change the process to the directory, but it really wasn’t necessary since
I can make the PDF Toolkit operation place the files in a specified directory.
The email is sent using the Gmail SMTP server, which utilizes TLS (transport layer
security), over port 587 and with a given Gmail username and password. You could,
of course, use your own SMTP server. The message is sent both in plain text and with
a given HTML-formatted attachment (for those folks who use an email reader capable
of processing HTML).
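A minimal sketch of those emailjs calls (the account values are placeholders, and url
and email come from the surrounding module) might look like the following:

// connect to the Gmail SMTP server over TLS, port 587
var server = emailjs.server.connect({
   user : 'gmail.account.name',
   password : 'gmail.account.password',
   host : 'smtp.gmail.com',
   port : 587,
   tls : true
});

// plain-text body plus an HTML alternative
server.send({
   text : 'You can find your split PDF at ' + url,
   from : 'youremail',
   to : email,
   subject : 'split pdf',
   attachment : [{data: '<p>You can find your split PDF <a href="' +
      url + '">here</a>.</p>', alternative: true}]
}, function(err, message) {
   if (err) console.log(err);
});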
The end result of the application is a link sent to the user that takes her to the directory
where she’ll find the uploaded PDF and the split pages. The Connect directory mid-
dleware ensures that the contents of the directory are attractively displayed. Fig-
ure 12-1 shows the results of uploading one very large PDF file on global warming.
With this approach—providing acknowledgment to the user in an email—the user
doesn’t have to wait around for (and the Node service isn’t hung up waiting on) the
PDF processing.
Of course, the user still has to spend time uploading the PDF file—this
application doesn’t touch on the issues associated with large file
uploads.
With PDFKit, you create a document object and then add a font, a new page, and
graphics, all with the exposed API. The API methods can all be chained to simplify
development.
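A minimal sketch of getting that far (the font path is an assumption) looks like this:

// create a document, set a font, and add the first text
var PDFDocument = require('pdfkit');
var doc = new PDFDocument();

doc.font('fonts/GoodDog-Regular.ttf')
   .fontSize(25)
   .text('Some text with an embedded font!', 100, 100);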
The application then adds a new PDF page, again changes the font size to 25 pixels,
and draws new text at 100, 100:
doc.addPage()
.fontSize(25)
.text('Here is some vector graphics...', 100, 100);
The document coordinate system is saved, and the vector graphics functionality is used
to draw a red triangle:
doc.save()
.moveTo(100, 150)
.lineTo(100, 250)
.lineTo(200, 250)
.fill("#FF3300");
The next section of code scales the coordinate system to 0.6, translates the origin, draws
a path in the shape of a star, fills it with red, and then restores the document back to
the original coordinate system and scale:
doc.scale(0.6)
.translate(470, -380)
.path('M 250,75 L 323,301 131,161 369,161 177,301 z')
.fill('red', 'even-odd')
.restore();
If you’ve worked with other vector graphics systems, such as Canvas, much of this
should seem familiar. If you haven’t, then you might want to check out the Canvas
examples later in the book and then return to this example.
Another page is added, the fill color is changed to blue, and a link is added to the page.
The document is then written out to a file named output.pdf:
doc.addPage()
.fillColor("blue")
.text('Here is a link!', 100, 100)
.underline(100, 100, 160, 27, {color: "#0000FF"})
.link(100, 100, 160, 27, 'http://google.com/');
doc.write('output.pdf');
It’s tedious to create a PDF document manually. However, we can easily program the
PDFKit API to take content from a data store and generate a PDF on the fly. We could
also use PDFKit to generate a PDF document of web page content on demand, or to
provide a persistent snapshot of data.
The convert tool is the ImageMagick workhorse. With it, you can perform some pretty
amazing transformations on an image and then save the results to a separate file. You
can provide an adaptive blur, sharpen the image, annotate the image with text, position
it on a backdrop, crop it, resize it, and even replace every pixel in the image with its
color complement. There is little you can’t do to an image with ImageMagick. Of
course, not every operation is equal, especially if you’re concerned about how long it
will take. Some of the image conversions occur quickly, while others can take consid-
erable time.
To demonstrate how to use convert from a Node application, the small, self-contained
application in Example 12-5 specifies an image filename on the command line and
scales that image so it fits into a space no more than 150 pixels wide. The image is also
transformed into a PNG, regardless of its original type.
The command-line version of this process is:
convert photo.jpg -resize '150' photo.jpg.png
We’ll need to capture four command arguments in the array for the child process: the
original photo, the -resize flag, the value for the -resize flag, and the name of the new
image.
Example 12-5. Node application to use a child process to scale an image with the ImageMagick convert tool
var spawn = require('child_process').spawn;

// get photo
var photo = process.argv[2];

// conversion array: original file, resize flag and value, new filename
var opts = [
   photo,
   '-resize',
   '150',
   photo + '.png'];

// convert
var im = spawn('convert', opts);

// capture any error output and the exit event
im.stderr.on('data', function (data) {
   console.log(data.toString());
});
im.on('exit', function (code) {
   console.log('exited with code ' + code);
});
The ImageMagick convert tool processes the image silently, so there is no child process
data event to process. The only events we’re interested in are the error and the exit,
when the image processing is finished.
Where an application like ImageMagick can get tricky is when you’re interested in
doing a much more involved process. One of the more popular effects people have
applied to images using ImageMagick is the Polaroid effect: rotating the image slightly
around its center and adding a border and a shadow to make the image look like a
Polaroid photo. The effect is now so popular that there’s a predefined setting for it, but
prior to this new setting, we had to use a command similar to the following (from the
ImageMagick usage examples):
convert thumbnail.gif \
-bordercolor white -border 6 \
-bordercolor grey60 -border 1 \
-background none -rotate 6 \
-background black \( +clone -shadow 60x4+4+4 \) +swap \
-background none -flatten \
polaroid.png
This is a lot of arguments, and the arguments are in a format you may not have seen
previously. So how does this get converted into a child process arguments array?
Minutely.
What looks like a single argument on the command line (\(+clone -shadow 60x4+4+4
\)) is anything but to the Node child process. Example 12-6 is a variation of the con-
version tool in Example 12-5, except now a Polaroid effect is being applied rather than
the image being scaled. Pay particular attention to the line in bold text.
Example 12-6. Applying a Polaroid effect to a photo using ImageMagick from a Node application
var spawn = require('child_process').spawn;

// get photo
var photo = process.argv[2];

// argument array reconstructed from the command line shown above; note how
// \( +clone -shadow 60x4+4+4 \) becomes five separate array entries
var opts = [
   photo,
   '-bordercolor', 'white', '-border', '6',
   '-bordercolor', 'grey60', '-border', '1',
   '-background', 'none', '-rotate', '6',
   '-background', 'black',
   '(', '+clone', '-shadow', '60x4+4+4', ')', '+swap',
   '-background', 'none', '-flatten',
   'polaroid.png'];

var im = spawn('convert', opts);
The bolded code in the example demonstrates how what appears to be a single argu-
ment on the command line becomes five arguments to the child process. The end result
of running the application is shown in Figure 12-2.
Figure 12-2. Result of running Node application to apply a Polaroid effect to a photo
It’s unlikely that you’ll use the Node application with an ImageMagick child process
directly on the command line. After all, you can just run ImageMagick’s tools directly.
However, you can use the combined child process/ImageMagick tool to run several
different conversions on a single image, or to provide image-conversion services from
a website (such as generating thumbnails of uploaded photos).
Support for ranges extends beyond serving HTML5 video. Ranges can
also be used to download larger files.
Ranges are an HTTP header that provides a start and end position for loading a re-
source, such as a video file. Here are the steps we need to take to add support for HTTP
ranges:
1. Signal willingness to accept range requests with response header Accept-Ranges:
bytes.
2. Look for a range request in the request header.
3. If a range request is found, parse out the start and end values.
The first modification necessary for the minimal web server is to add the new header:
res.setHeader('Accept-Ranges','bytes');
The client will then send through range requests of the following format:
bytes=startnum-endnum
Where the startnum/endnum values are the starting and end numbers for the range. Sev-
eral of these requests can be sent during playback. For example, the following are actual
range requests sent from the web page with the HTML5 video after starting the video
and then clicking around on the timeline during playback:
bytes=0-
bytes=7751445-53195861
bytes=18414853-53195861
bytes=15596601-18415615
bytes=29172188-53195861
bytes=39327650-53195861
bytes=4987620-7751679
bytes=17251881-18415615
bytes=17845749-18415615
bytes=24307069-29172735
bytes=33073712-39327743
bytes=52468462-53195861
bytes=35020844-39327743
bytes=42247622-52468735
The next addition to the minimal web server is to check to see if a range request has
been sent, and if so, to parse out the start and end values. The code to check for a range
request is:
if (req.headers.range) {...}
// split the range string; substr(6) skips the leading 'bytes=' prefix
var rangearray = req.headers.range.split('-');
start = parseInt(rangearray[0].substr(6));
end = parseInt(rangearray[1]);
if (isNaN(start)) start = 0;
if (isNaN(end)) end = len - 1;
The content length (Content-Length) response is also prepared, calculated as the end
value minus the start value. In addition, the HTTP status code is set to 206, for Partial
Content.
Last, the start and end values are also sent as an option to the createReadStream method
call. This ensures that the stream is properly repositioned for streaming.
Example 12-7 pulls all of these pieces together into a modified minimal web server that
can now serve HTML5 video (or other resource) ranges.
Example 12-7. The minimal web server, now with support for ranges
var http = require('http'),
    url = require('url'),
    fs = require('fs'),
    mime = require('mime');

// parse the start and end values out of the range request
function processRange(res, ranges, len) {
   var start, end;
   var rangearray = ranges.split('-');
   start = parseInt(rangearray[0].substr(6));
   end = parseInt(rangearray[1]);
   if (isNaN(start)) start = 0;
   if (isNaN(end)) end = len - 1;
   return {start: start, end: end};
}

// ... within the request handler, after fs.stat returns the file's stats ...

var opt = {};
// assume no range
res.statusCode = 200;
var len = stats.size;
// check for a range request
if (req.headers.range) {
   opt = processRange(res, req.headers.range, len);
   res.statusCode = 206;
   // adjust length
   len = opt.end - opt.start + 1;
   // set header
   var ctstr = 'bytes ' + opt.start + '-' +
      opt.end + '/' + stats.size;
   res.setHeader('Content-Range', ctstr);
}
res.setHeader('Content-Length', len);
// content type
var type = mime.lookup(pathname);
res.setHeader('Content-Type', type);
res.setHeader('Accept-Ranges','bytes');

// open the stream with the start/end options and pipe to the response
var file = fs.createReadStream(pathname, opt);
file.pipe(res);
});
file.on("error", function(err) {
console.log(err);
});
} else {
res.writeHead(403);
res.write('Directory access is forbidden');
res.end();
}
});
}).listen(8124);
console.log('Server running at 8124/');
Modifying the minimal web server demonstrates that HTTP and other network func-
tionality isn't necessarily complicated—just tedious. The key is to break the work down
into separate subtasks, and then add code to manage each subtask one at a time (testing
after each).
Now the web page (included in the examples) that allows the user to click around on
a timeline works correctly.
All of the standard Canvas functionality you have in a client page is available via the
node-canvas module. You create a Canvas object and then a context, do all your drawing
in the context, and then either display the result or save the result in a file as a JPEG or
PNG.
There are also a couple of additional methods available on the server that you wouldn’t
have on the client. These allow us to stream a Canvas object to a file (either as a PNG
or JPEG), persisting the results for later access (or serving in a web page). You can also
convert the Canvas object to a data URI and include an img element in a generated HTML
web page, or read an image from an external source (such as a file or a Redis database)
and use it directly in the Canvas object.
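For instance, a minimal sketch of the data URI approach (assuming an existing canvas
object) looks like this:

// convert the canvas to a data URI and embed it in an img element
var uri = canvas.toDataURL(); // defaults to image/png
var html = "<img src='" + uri + "' />";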
Jumping right in to demonstrate how to use the node-canvas module, Example 12-8
creates a canvas drawing and then streams it to a PNG file for later access. The example
uses a rotated graphic image from an example at the Mozilla Developer Network, and
adds a border and shadow to it. Once finished, it’s streamed to a PNG file for later
access. Most of the functionality could be used in a client application as well as the
Node application. The only real Node-specific component is persisting the graphic as
a file in the end.
Example 12-8. Creating a graphic using node-canvas and persisting the result to a PNG file
var Canvas = require('canvas');
var fs = require('fs');

// create the canvas and drawing context (dimensions assumed)
var canvas = new Canvas(350, 350);
var ctx = canvas.getContext('2d');

ctx.fillStyle = '#fff';
ctx.fillRect(30,30,300,300);

// ... the rotated graphic, border, and shadow are drawn here; the full
// drawing code is in the book's downloadable examples ...

// stream the canvas to a PNG file
var out = fs.createWriteStream(__dirname + '/shadow.png');
var stream = canvas.createPNGStream();

stream.on('data', function(chunk){
   out.write(chunk);
});

stream.on('end', function(){
   console.log('saved png');
});
Once you’ve run the Node application, access the shadow.png file from your favorite
browser. Figure 12-3 shows the generated image.
In this chapter, we’re working in both the client and server environments, because both
are necessary when it comes to WebSockets and Socket.IO.
WebSockets is a relatively new web technology that enables bidirectional, real-time
communication directly from within a client to a server application, and back again.
The communication occurs over TCP (Transmission Control Protocol), via sockets.
The Socket.IO libraries provide the support necessary to implement this technology.
Not only does Socket.IO provide a module to use in your Node application, but it also
provides a client-side JavaScript library to enable the client end of the communication
channel. For an added bonus, it also works as an Express middleware.
In this chapter I’ll introduce WebSockets more fully by demonstrating how Socket.IO
works, both in the client and in the server.
WebSockets
Before jumping into using Socket.IO, I want to provide a quick overview of WebSock-
ets. To do that, I also need to explain bidirectional full-duplex communication.
The term full duplex describes any form of data transmission that allows communica-
tion in both directions. The term bidirectional means that both endpoints of a trans-
mission can communicate, as opposed to unidirectional communication, when one end
of a data transmission is a sender and all other endpoints are receivers. WebSockets
provides the capability for a web client, such as a browser, to open up bidirectional
full-duplex communication with a server application. And it does so without having to
use HTTP, which adds unnecessary overhead to the communication process.
WebSockets is standardized as part of a specification called the WebSockets API at the
World Wide Web Consortium (W3C). The technology has had a bumpy start, because
no sooner had some browsers begun implementing WebSockets in 2009 than serious
security concerns led those same browsers to either pull their implementation, or enable
WebSockets only as an option.
The WebSockets protocol was revamped to address the security concerns, and Firefox,
Chrome, and Internet Explorer support the new protocol. At this time, Safari and Opera
support only the older versions of the technology, and you must enable WebSockets in
their configuration settings. In addition, most mobile browsers have only limited support,
or support only the older WebSockets specification.
Socket.IO addresses the issue of uneven support for WebSockets by using several dif-
ferent mechanisms to enable the bidirectional communication. It attempts to use the
following, in order:
• WebSockets
• Adobe Flash Socket
• Ajax long polling
• Ajax multipart streaming
• Forever iFrame for IE
• JSONP Polling
The key point to take away from this list is that Socket.IO supports bidirectional com-
munication in most, if not all, browsers in use today—desktop and mobile.
An Introduction to Socket.IO
Before we jump into code that implements the WebSockets application, you’ll need to
install Socket.IO on your server. Use npm to install the module and supporting Java-
Script library:
npm install socket.io
A Socket.IO application requires two different components: a server and a client ap-
plication. In the examples in this section, the server application is a Node application,
and the client application is a JavaScript block in an HTML web page. Both are adap-
tions of example code provided at the Socket.IO website.
The server application uses HTTP to listen for incoming requests, and serves up only
one file: the client HTML file. When a new socket connection is made, it emits a mes-
sage to the client with the text of Counting... to an event labeled news.
When the server gets an echo event, it takes the text sent with the event and appends a
counter value to it. The counter is maintained in the application and incremented every
time the echo event is transmitted. When the counter gets to 50, the server no longer
transmits the data back to the client. Example 13-2 contains all the code for the server
application.
Example 13-2. Server application in the Socket.IO application
var app = require('http').createServer(handler)
, io = require('socket.io').listen(app)
, fs = require('fs');
var counter;
app.listen(8124);
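The excerpt above omits the handler function and the socket handling. A minimal
sketch of the omitted portion, reconstructed from the description above (the file name
and exact counter handling are assumptions), might look like the following:

// serve the single client HTML file
function handler(req, res) {
   fs.readFile(__dirname + '/index.html', function (err, data) {
      if (err) {
         res.writeHead(500);
         return res.end('Error loading index.html');
      }
      counter = 1;
      res.writeHead(200);
      res.end(data);
   });
}

// on connection, start the count; echo back until the counter reaches 50
io.sockets.on('connection', function (socket) {
   socket.emit('news', 'Counting...');
   socket.on('echo', function (msg) {
      counter++;
      if (counter <= 50) {
         socket.emit('news', msg + ' ' + counter);
      }
   });
});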
After the client application is loaded in the browser, you can watch the counter update
until it reaches the target end value. The web page doesn't have to be reloaded, and the
user doesn’t have to do anything special for the application to execute. The application
exhibits the same behavior in all modern browsers, though the underlying technology
that implements the effect differs by browser.
Both news and echo are custom events. The only socket events Socket.IO supports out
of the box are connection, passed during the initial connection, and the following events
on the server socket:
message
Emitted whenever a message sent using socket.send is received
disconnect
Emitted when either the client or server disconnects
And the following events on the client socket:
connect
Emitted when the socket connection is made
connecting
Emitted when the socket connection is being attempted
disconnect
Emitted when the socket is disconnected
connect_failed
Emitted when the connection fails
error
Emitted when an error occurs
message
Emitted when message sent with socket.send is received
On the client, the application can also listen for the message event, and use send to
communicate back:
socket.on('message', function (data) {
var html = '<p>' + data + '</p>';
document.getElementById("output").innerHTML=html;
socket.send('OK, got the data');
});
This example uses send to manually acknowledge receipt of the message. If we want
an automatic acknowledgment that the client received the event, we can pass a callback
function in as the last parameter of the emit method:
io.sockets.on('connection', function (socket) {
socket.emit('news', { news: "All the news that's fit to print" },
function(data) {
console.log(data);
});
});
In the client, we can then pass a message back using this callback function:
socket.on('news', function (data, fn) {
var html = '<p>' + data.news + '</p>';
document.getElementById("output").innerHTML=html;
fn('Got it! Thanks!');
});
The socket passed as a parameter to the connection event handler is the unique con-
nection between the server and the client, and persists as long as the connection persists.
If the connection terminates, Socket.IO attempts to reconnect.
Now you can have several concurrent users, and they each get the exact same com-
munication. The socket object exists until the socket connection is closed and can’t be
reestablished.
You may be wondering if you have to specifically place this code in the top level of your
web server—you don’t.
In the server application, when the HTTP web server was created, it was passed to the
Socket.IO’s listen event:
var app = require('http').createServer(handler)
, io = require('socket.io').listen(app)
What happens is that Socket.IO intercepts requests sent to the web server and listens
for requests for:
/socket.io/socket.io.js
Socket.IO does a clever bit of behind-the-scenes finagling that determines what’s re-
turned in the response. If the client supports WebSockets, the JavaScript file returned
is one that uses WebSockets to implement the client connection. If the client doesn’t
support WebSockets, but does support Forever iFrame (IE9), it returns that particular
JavaScript client code, and so on.
Configuring Socket.IO
Socket.IO comes with several default settings that we usually won’t need to change. In
the examples in the preceding section, I didn’t alter any of the default settings. If I
wanted to, though, I could by using Socket.IO’s configure method, which operates in
a manner similar to what we’ve used with Express and Connect. You can even specify
different configurations based on which environment the application is running.
Socket.IO contains a wiki page (at https://github.com/learnboost/socket.io/wiki/) that
lists all of the options, and I don’t want to repeat the rather extensive list here. Instead,
I want to demonstrate a couple that you may want to consider modifying as you’re
learning to work with Socket.IO.
You can also define different configurations for different environments, such as
production and development:
io.configure('production', function() {
io.set('transports', [
'websocket',
'jsonp-polling']);
});
io.configure('development', function() {
io.set('transports', [
'websocket',
'flashsocket',
'htmlfile',
'xhr-polling',
'jsonp-polling']);
});
Another option controls the amount of detail output to the logger (you’ll notice the
logger output as debug statements to the console on the server). If you want to turn off
the logger output, you can set the log level option to 1:
io.configure('development', function() {
io.set('log level', 1);
});
Some of the options—such as store, which determines where client data is persisted
—have requirements other than just having an option in a configuration method call.
With io.sockets.emit, anyone who has a socket connection to the server gets the message.
You can also broadcast a message to everyone but a specific individual by issuing a
broadcast.emit on the socket of the person you don't want to see the message:
socket.broadcast.emit('news', data);
In the simple chat application, when a new client connects, the client application
prompts for a name and then broadcasts to other connected clients that this person has
now entered the chat room. The client application also provides a text field and button
to send messages, and provides a place where new messages from all participants are
printed. Example 13-4 shows the client application code.
Example 13-4. Client chat application
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>bi-directional communication</title>
<script src="/socket.io/socket.io.js"></script>
<script>
var socket = io.connect('http://localhost:8124');
socket.on('connect', function() {
   // prompt for the user's name, then register with the server
   socket.emit('addme', prompt('Your name?'));
});
socket.on('chat',function(username, data) {
var p = document.createElement('p');
p.innerHTML = username + ': ' + data;
document.getElementById('output').appendChild(p);
});
window.addEventListener('load',function() {
document.getElementById('sendtext').addEventListener('click',
function() {
var text = document.getElementById('data').value;
socket.emit('sendchat', text);
}, false);
}, false);
</script>
</head>
<body>
<div id="output"></div>
<div id="send">
<input type="text" id="data" size="100" /><br />
<input type="button" id="sendtext" value="Send Text" />
</div>
</body>
</html>
Other than the addition of basic JavaScript functionality to capture the click event on
the button, and the prompt to get the person’s name, the functionality isn’t much
different than earlier examples.
In the server, the new person’s username is attached as data to the socket. The server
acknowledges the person directly, and then broadcasts the person’s name to other chat
room participants. When the server receives any new chat message, it attaches the
username to the message so everyone can see who sent it. Finally, when a client dis-
connects from the chat room, another message is broadcast to all connected users in-
dicating that the person is no longer participating. Example 13-5 has the complete code
for the server application.
Example 13-5. Server chat application
var app = require('http').createServer(handler)
, io = require('socket.io').listen(app)
, fs = require('fs');
app.listen(8124);

// handler function elided (it serves the client page, as in Example 13-2)
io.sockets.on('connection', function(socket) {
socket.on('addme',function(username) {
socket.username = username;
socket.emit('chat', 'SERVER', 'You have connected');
socket.broadcast.emit('chat', 'SERVER', username + ' is on deck');
});
socket.on('sendchat', function(data) {
io.sockets.emit('chat', socket.username, data);
});
socket.on('disconnect', function() {
io.sockets.emit('chat', 'SERVER', socket.username + ' has left the building');
});
});
Figure 13-1 shows the results of the application when I tested it from four different
browsers (Chrome, Firefox, Opera, and IE).
Figure 13-1. Trying out the chat application enabled by Socket.IO in several different browsers
// module setup reconstructed for context (Express 3-style server creation)
var express = require('express'),
    sio = require('socket.io'),
    http = require('http');

var app = express();
var server = http.createServer(app);

app.configure(function () {
app.use(express.static(__dirname + '/public'));
app.use(app.router);
});
var io = sio.listen(server);
server.listen(8124);
io.sockets.on('connection', function(socket) {
socket.on('addme',function(username) {
socket.username = username;
socket.emit('chat', 'SERVER', 'You have connected');
socket.broadcast.emit('chat', 'SERVER', username + ' is on deck');
});
socket.on('sendchat', function(data) {
io.sockets.emit('chat', socket.username, data);
});
socket.on('disconnect', function() {
   io.sockets.emit('chat', 'SERVER', socket.username + ' has left the building');
});
});
As the Express application is passed to the HTTP server, the HTTP server is in turn
passed to Socket.IO. All three modules work together to ensure that all requests—
whether web service or chat—are handled properly.
Though the chat client is a static page, it would be a simple matter to incorporate a
template. The only issues are ensuring the integrity of the scripting block where the
client application code resides, and remembering to include a link to the Socket.IO
library.
In previous chapters, the only debugging aid used in the examples was printing infor-
mation to the console. For smaller and less complex applications that are still in
development, this is sufficient. However, as your application grows and gets more
complicated, you’ll need to use other, more sophisticated tools for debugging.
You’re also going to want to incorporate more formal testing, including the use of test-
creation tools that can be used by others to test your module or application in their
environments.
Debugging
Frankly, console.log will always remain my debugger of choice, but its usefulness does
degrade as your application increases in size and complexity. Once you’ve moved be-
yond a simple application, you’re going to want to consider using more sophisticated
debugging tools. We’ll go over some options in the sections that follow.
Node comes with a built-in command-line debugger. To use it, insert debugger statements
wherever you want a breakpoint, and then run the application with node debug. The
following proxy application (app2.js, shown in the debugger listings that follow) sets a
breakpoint inside its request handler; the require statements are reconstructed from
those listings:
var connect = require('connect'),
    http = require('http'),
    fs = require('fs'),
    httpProxy = require('http-proxy'),
    crossroads = require('crossroads');

httpProxy.createServer(function(req, res, proxy) {
debugger;
if (req.url.match(/^\/node\//))
proxy.proxyRequest(req, res, {
host: 'localhost',
port: 8000
});
else
proxy.proxyRequest(req,res, {
host: 'localhost',
port: 8124
});
}).listen(9000);
In debug mode, the application stops at the beginning of the file. To go to the first
breakpoint, type cont, or its abbreviation, c. This causes the debugger to stop at the
first breakpoint; the application then sits, waiting for input from the user (such as a
web request):
< debugger listening on port 5858
connecting... ok
break in app2.js:1
1 var connect = require('connect'),
2 http = require('http'),
3 fs = require('fs'),
debug> cont (--> note it is just waiting at this point for a web request)
break in app2.js:11
9 httpProxy.createServer(function(req,res,proxy) {
10
11 debugger;
12 if (req.url.match(/^\/node\//))
13 proxy.proxyRequest(req, res, {
debug>
You have several options at this point. You can step through the code using the next
(n) command, step into a function using step (s), or step out of a function using out
(o). In the following code, the debugger stops at the breakpoint, and the next few lines
are stepped over with next until line 13, which has a function call. I use step at this
point to step into the function. I can then traverse the function code using next, and
return to the application using out:
debug> cont
break in app2.js:11
9 httpProxy.createServer(function(req,res,proxy) {
10
11 debugger;
12 if (req.url.match(/^\/node\//))
13 proxy.proxyRequest(req, res, {
debug> next
break in app2.js:12
10
11 debugger;
12 if (req.url.match(/^\/node\//))
13 proxy.proxyRequest(req, res, {
14 host: 'localhost',
debug> next
break in app2.js:13
11 debugger;
You can also set a new breakpoint, either on the current line with setBreakpoint (sb), or
on the first line of a named function or script file:
break in app2.js:22
20 port: 8124
21 });
22 }).listen(9000);
23
24 // add route for request for dynamic resource
debug> sb()
17 else
18 proxy.proxyRequest(req,res, {
19 host: 'localhost',
20 port: 8124
21 });
*22 }).listen(9000);
23
24 // add route for request for dynamic resource
25 crossroads.addRoute('/node/{id}/', function(id) {
26 debugger;
27 });
To examine the value of any variable within the current execution context, use the repl
command at a breakpoint:

break in app2.js:11
9 httpProxy.createServer(function(req,res,proxy) {
10
11 debugger;
12 if (req.url.match(/^\/node\//))
13 proxy.proxyRequest(req, res, {
debug> repl
Press Ctrl + C to leave debug repl
> req.url
'/node/174'
debug>
The backtrace command is helpful for printing a backtrace (a list of currently active
function calls) of the current execution frame:
debug> backtrace
#0 app2.js:22:1
#1 exports.createServer.handler node-http-proxy.js:174:39
Anytime you want to see which commands are available to you, type help:
debug> help
Commands: run (r), cont (c), next (n), step (s), out (o), backtrace (bt),
setBreakpoint (sb), clearBreakpoint (cb), watch, unwatch, watchers, repl, restart,
kill, list, scripts, breakpoints, version
The built-in debugger is very helpful, but sometimes you want a little bit more. You
have other options, including accessing the V8 debugger directly by using the
--debug command-line flag:
node --debug app.js
This starts the application with the V8 debugger listening on a TCP port (5858 by
default); you can then connect to that port and enter the V8 debug commands directly.
This is an interesting option, but it does require a great deal of understanding of how
the V8 debugger works (and what the commands are).
Another option is to use debugging via a WebKit browser—through an application
such as Node Inspector, covered next.
To use the functionality, you’ll first need to start the application using the V8 debugger
flag:
node --debug app.js
Then you’ll need to start the Node Inspector, in either the background or foreground:
node-inspector
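Once both are running, open the debugging interface in a WebKit browser such as
Chrome (the port and path shown are Node Inspector's defaults):

http://localhost:8080/debug?port=5858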
Figure 14-1. Running Node Inspector in Chrome on a Node application running on a remote server
Node Inspector is, by far, a superior approach to debugging the server application.
Using the command line is OK, but being able to see all the code at once, and to use a
toolset we’re familiar with, more than compensates for the little extra effort necessary
to enable the Node Inspector setup.
Unit Testing
Unit testing is a way of isolating specific components of an application for testing. Many
of the tests provided in the test subdirectory of Node modules are unit tests, and the
tests in the test subdirectory of the Node installation itself are all unit tests.
You can run a module’s test scripts using npm. In the module subdirectory, type:
npm test
This command runs a module test script if one is provided. When I ran the test script
in the subdirectory for node-redis (covered in Chapter 9), the resulting output displayed
successful test results, such as the portion displayed here:
Connected to 127.0.0.1:6379, Redis server version 2.4.11
Many of these unit tests are built using the Assert module, which we’ll go over next.
To see how to use Assert, let’s look at how existing modules use it. The following test
is in the test.js script found with the node-redis installation:
var name = "FLUSHDB";
client.select(test_db_num, require_string("OK", name));
The test uses a function, require_string, which returns a function that uses the Assert
module methods assert.strictEqual and assert.equal:

function require_string(str, label) {
  return function (err, results) {
    assert.strictEqual(null, err, "result sent back unexpected error: " + err);
    assert.equal(str, results, label + " " + str + " does not match " + results);
    return true;
  };
}
The first test, assert.strictEqual, fails if the err object returned in the Redis test isn't
null. The second test, using assert.equal, fails if results are not equal to the expected
string. Only if both tests are successful (i.e., neither test fails) does the code fall through
to the return true statement.
What is actually tested is whether the Redis select command succeeds. If an error
occurs, the error is output. If the result of the selection isn’t what’s expected (a return
value of OK), a message is output to that effect, including the test label where the test
failed.
The Node application also makes use of the Assert module in its module unit tests. For
instance, there’s a test application called test-util.js that tests the Utilities module. The
following code is the section that tests the isArray method:
// isArray
assert.equal(true, util.isArray([]));
assert.equal(true, util.isArray(Array()));
assert.equal(true, util.isArray(new Array()));
assert.equal(true, util.isArray(new Array(5)));
assert.equal(true, util.isArray(new Array('with', 'some', 'entries')));
assert.equal(true, util.isArray(context('Array')()));
assert.equal(false, util.isArray({}));
assert.equal(false, util.isArray({ push: function() {} }));
assert.equal(false, util.isArray(/regexp/));
assert.equal(false, util.isArray(new Error));
assert.equal(false, util.isArray(Object.create(Array.prototype)));
Both the assert.equal and the assert.strictEqual methods have two mandatory pa-
rameters: an expected response and an expression that evaluates to a response. In the
earlier Redis test, the assert.strictEqual test expects a result of null for the err argu-
ment. If this expectation fails, the test fails. In the assert.equal isArray test in the
Node source, if the expression evaluates to true, and the expected response is true, the
assert.equal method succeeds and produces no output—the result is silent.
If, however, the expression evaluates to a response other than what’s expected, the
assert.equal method responds with an exception. If I take the first statement in the
isArray test in the Node source and modify it to:
assert.equal(false, util.isArray([]));

the method responds with an AssertionError, since the expression evaluates to true
rather than the expected false.
The assert.equal and assert.strictEqual methods also have a third optional param-
eter, a message that’s displayed rather than the default in case of a failure:
assert.equal(false, util.isArray([]), 'Test 1Ab failed');
This can be a useful way of identifying exactly which test failed if you’re running several
in a test script. You can see the use of a message (a label) in the node-redis test code:
assert.equal(str, results, label + " " + str + " does not match " + results);
The message is what’s displayed when you catch the exception and print out the
message.
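To see the message, catch the AssertionError that the method throws:

try {
  assert.equal(false, util.isArray([]), 'Test 1Ab failed');
} catch (err) {
  console.log(err.message); // Test 1Ab failed
}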
The following Assert module methods all take the same three parameters, though how
the test value and expression relate to each other varies, as the name of the test implies:
assert.equal
Fails if the expression results and given value are not equal
assert.strictEqual
Fails if the expression results and given value are not strictly equal
assert.notEqual
Fails if the expression results and given value are equal
assert.notStrictEqual
Fails if the expression results and given value are strictly equal
assert.deepEqual
Fails if the expression results and given value are not deeply equal
assert.notDeepEqual
Fails if the expression results and given value are deeply equal
The latter two methods, assert.deepEqual and assert.notDeepEqual, work with com-
plex objects, such as arrays or objects. The following succeeds with assert.deepEqual,
even though the two arrays are distinct objects:

assert.deepEqual([1,2,3],[1,2,3]);

There's also an assert.ok method, which takes an expression and an optional message,
and fails if the expression doesn't evaluate to true. The following:

assert.ok(val == 3, 'Equal');

is equivalent to:

assert.equal(true, val == 3, 'Equal');
The assert.ifError function takes a value and throws an exception only if the value
resolves to anything truthy. As the Node documentation states, it's a good test for
the error object as the first argument in a callback function:
assert.ifError(err); //throws only if true value
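For instance, with illustrative values:

assert.ifError(null);             // fine: null is falsy
assert.ifError(0);                // fine: 0 is falsy
assert.ifError(new Error('bad')); // throws the error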
The last assert methods are assert.throws and assert.doesNotThrow. The first expects
an exception to get thrown; the second doesn’t. Both methods take a code block as the
first required parameter, and an optional error and message as the second and third
parameters. The error object can be a constructor, regular expression, or validation
function. In the following code snippet, the error message is printed out because the
error regular expression as the second parameter doesn’t match the error message:
try {
  assert.throws(
    function() {
      throw new Error("Wrong value");
    },
    /something/
  );
} catch(e) {
  console.log(e.message);
}
You can create sophisticated tests using the Assert module. The one major limitation
with the module, though, is the fact that you have to do a lot of wrapping of the tests
so that the entire testing script doesn’t fail if one test fails. That’s where using a higher-
level unit testing framework, such as Nodeunit (discussed next), comes in handy.
Nodeunit
Nodeunit provides a way to easily run a series of tests without having to wrap everything
in try/catch blocks. It supports all of the Assert module tests, and provides a couple of
methods of its own in order to control the tests. Tests are organized as test cases, each
of which is exported as an object method in the test script. Each test case gets a control
object, typically named test. The first method call in the test case is to the test ele-
ment’s expect method, to tell Nodeunit how many tests to expect in the test case. The
last method call in the test case is to the test element’s done method, to tell Nodeunit
the test case is finished. Everything in between composes the actual test unit:
module.exports = {
'Test 1' : function(test) {
test.expect(3); // three tests
... // the tests
test.done();
},
'Test 2' : function (test) {
test.expect(1); // only one test
... // the test
test.done();
}
};
To run the tests, type nodeunit, followed by the name of the test script:
nodeunit thetest.js
Example 14-1 has a small but complete testing script with six assertions (tests). It con-
sists of two test units, labeled Test 1 and Test 2. The first test unit runs four separate
tests, while the second test unit runs two. The expect method call reflects the number
of tests being run in the unit.
Example 14-1. Nodeunit test script, with two test units, running a total of six tests
var util = require('util');

module.exports = {
  'Test 1' : function(test) {
    test.expect(4);
    test.equal(true, util.isArray([]));
    test.equal(true, util.isArray(new Array(3)));
    test.equal(true, util.isArray([1,2,3]));
    test.notEqual(true, (1 > 2));
    test.done();
  },
  'Test 2' : function(test) {
    test.expect(2);
    // two assertions to complete the unit
    test.deepEqual([1,2,3], [1,2,3]);
    test.ok('str' === 'str');
    test.done();
  }
};
The result of running the Example 14-1 test script with Nodeunit is:
example1.js
✔ Test 1
✔ Test 2
Symbols in front of the tests indicate success or failure: a check for success, and an x
for failure. None of the tests in this script fails, so there’s no error script or stack trace
output.
Mocha
Mocha is another popular testing framework, and it works in both browsers and Node
applications. Install Mocha with npm:

npm install mocha -g

You'll also need to install the should.js library before running the test:

npm install should
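A minimal Mocha test using the should.js assertion style looks like the following (the
file name and assertions are just illustrative; run it with mocha test.js):

var should = require('should');

describe('Array', function() {
  describe('#indexOf()', function() {
    it('should return -1 when the value is not present', function() {
      [1,2,3].indexOf(4).should.equal(-1);
    });
  });
});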
Jasmine
Jasmine is a behavior-driven development (BDD) framework that can be used with
many different technologies, including Node with the node-jasmine module. The node-
jasmine module can be installed with npm:
npm install jasmine-node -g
Note the module name: jasmine-node, rather than the format of node-
modulename (or the shortened form, modulename) that you’ve seen so far
in the book.
The test spec follows this general shape (a sketch; jasmine-node picks up files ending
in spec.js from the test directory):

var http = require('http');

describe('jasmine-node', function(){
  it('should respond with Hello, World!', function(done) {
    http.get({ host: 'localhost', port: 8124, path: '/' }, function(res) {
      var body = '';
      res.on('data', function(chunk) { body += chunk; });
      res.on('end', function() {
        expect(body).toEqual('Hello, World!\n');
        done();
      });
    });
  });
});
The web server is from Chapter 1, and all it does is return the “Hello, World!” message.
Note the use of the newline character—the test will fail if you don’t include it.
I ran the test with the following command line:

jasmine-node --test-dir /home/examples/public_html/node

The result was a successful test.
If the script had been in CoffeeScript, I would have added the --coffee parameter:
jasmine-node --test-dir /home/examples/public_html/node --coffee
Vows
Vows is another BDD testing framework, and has one advantage over others: more
comprehensive documentation. Testing is composed of testing suites, themselves made
up of batches of sequentially executed tests. A batch consists of one or more contexts,
executed in parallel, and each consisting of a topic, which is when we finally get to the
executable code. The test within the code is known as a vow. Where Vows prides itself
on differing from other testing frameworks is in providing a clear separation
between what is being tested (the topic) and the test itself (the vow).
I know those are some strange uses of familiar words, so let’s look at a simple example
to get a better idea of how a Vows test works. First, though, we have to install Vows:
npm install vows
To try out Vows, I’m using the simple circle module I created earlier in the book, now
edited to set precision:
var PI = Math.PI;

// circumference, rounded to four decimal places
exports.circumference = function (radius) {
  return (2 * PI * radius).toFixed(4);
};

// area, rounded to four decimal places
exports.area = function (radius) {
  return (PI * radius * radius).toFixed(4);
};
The test script creates a test suite, adds a batch with a single context and two vows,
and then runs the suite:

var vows = require('vows'),
    assert = require('assert');

var circle = require('./circle');
var suite = vows.describe('Test Circle');

suite.addBatch({
'An instance of Circle': {
topic: circle,
'should be able to calculate circumference': function (topic) {
assert.equal (topic.circumference(3.0), 18.8496);
},
'should be able to calculate area': function(topic) {
assert.equal (topic.area(3.0), 28.2743);
}
}
}).run();
Running the application with Node runs the test because of the addition of the run
method at the end of the addBatch method:
node example2.js
The topic is always an asynchronous function or a value. Instead of using circle as the
topic, I could have directly referenced the object methods as topics—with a little help
from function closures:
var vows = require('vows'),
    assert = require('assert');

var circle = require('./circle');
var suite = vows.describe('Test Circle');

suite.addBatch({
  'Testing Circle Circumference': {
    topic: function() { return circle.circumference;},
    'should be able to calculate circumference': function (topic) {
      assert.equal (topic(3.0), 18.8496);
    }
  },
  'Testing Circle Area': {
    topic: function() { return circle.area;},
    'should be able to calculate area': function (topic) {
      assert.equal (topic(3.0), 28.2743);
    }
  }
}).run();
In this version of the example, each context is the object given a title: Testing Circle
Circumference and Testing Circle Area. Within each context, there’s one topic and
one vow.
You can incorporate multiple batches, each with multiple contexts, which can in turn
have multiple topics and multiple vows.
Acceptance Testing
Acceptance testing differs from unit testing in that the former’s primary purpose is to
determine if the application meets user requirements. Unit tests ensure that the appli-
cation is robust, while acceptance tests ensure that the application is useful.
Acceptance testing can be accomplished through the use of predefined scripts that users
actually design and implement in a coordinated setting. Acceptance testing can also be
automated—again through the use of scripts, but scripts that are implemented by tools
rather than people. These tools don’t completely satisfy all aspects of acceptance testing
because they can’t measure subjective perspectives (“This web page form is awkward
to use”), nor can they pinpoint those difficult-to-find bugs that users always seem to
drive out, but they can make sure program requirements are met.
The Soda module provides a Node interface to Selenium. The following test script
opens a site and types a phrase into a search field:

var soda = require('soda'),
    assert = require('assert');

var browser = soda.createClient({
  host: 'localhost',
  port: 4444,
  url: 'http://www.google.com',
  browser: 'firefox'
});

browser
  .chain
  .session()
  .open('/')
  .type('q', 'Hello World')
  .end(function(err){
    browser.testComplete(function() {
      console.log('done');
      if(err) throw err;
    });
  });
The code is actually quite intuitive. First, you create a browser object, specifying which
browser to open, the name of the host and port, and what website is being accessed.
Start a new browser session, load a web page ('/'), and type a phrase into an input
field with a given identifier of q. When finished, print done to the console, and
throw any error that occurs.
To run a Soda application, you’ll need to ensure that Java is installed. Then, copy the
Selenium RC Java .jar file to your system and run it:
java -jar selenium.jar
The application expects Firefox to be installed, since this is the browser specified in the
application. While I didn’t have it on my Linux box, I did on my Windows laptop and
was able to easily get the application running. It's rather fascinating, but a little
unnerving, to watch the browser window open on its own and run through the test
steps without any intervention on your part.
A variation of the test runs against Sauce Labs' hosted Selenium service. The heart of
the test script is the same chained set of commands:

browser
  .chain
  .session()
  .setTimeout(8000)
  .open('/login')
  .waitForPageToLoad(5000)
  .type('username', 'Sally')
  .type('password', 'badpassword')
  .clickAndWait('//input[@value="Submit"]')
  .assertTextPresent('Invalid password')
  .end(function(err){
    if (err) throw err;
  });
In the test application, a browser object is created with a given browser, browser ver-
sion, and operating system—in this case, Firefox 3.x on Linux. Note also the different
browser client: soda.createSauceClient, not soda.createClient. In the browser object,
I’m restricting testing time to no more than five minutes; the site accessed is http://
examples.burningbird.net:3000; and we’ve just covered where to get the username and
API key.
As each command is issued, it’s logged. We want to have a log so we can check
responses and look for failures and abnormalities:
// Log commands as they are fired
browser.on('command', function(cmd, args){
console.log(' \x1b[33m%s\x1b[0m: %s', cmd, args.join(', '));
});
Last is the actual test. Typically, the tests would have to be nested callbacks (since this
is an asynchronous environment), but Soda provides a chain getter that greatly simpli-
fies adding tasks. The very first task is to start a new session, and then each separate
item in the testing script is encoded. In the end, the application prints out the URLs for
the job, log, and video of the test.
The output from running the application is:
setTimeout: 8000
open: /login
waitForPageToLoad: 5000
type: username, Sally
type: password, badpassword
clickAndWait: //input[@value="Submit"]
assertTextPresent: Invalid password
setContext: sauce:job-info={"passed": true}
testComplete:
https://saucelabs.com/jobs/d709199180674dc68ec6338f8b86f5d6
https://saucelabs.com/rest/shelleyjust/jobs/d709199180674dc68ec6338f8b86f5d6/
results/video.flv
https://saucelabs.com/rest/shelleyjust/jobs/d709199180674dc68ec6338f8b86f5d6/
results/selenium-server.log
You can access the results directly, or you can log into Sauce Labs and see the results
of all your tests, as shown in Figure 14-2.
Zombie resembles Soda in that you create a browser and then run tests that emulate
the actions of a user at a browser. It even supports chained methods to circumvent the
issues with nested callbacks.
I converted the test case against the login form in Example 14-3 to Zombie, except this
time the test uses the proper password and tests for success rather than failure (the user
is redirected to the /admin page). Example 14-4 has the code for this acceptance test.
var Browser = require('zombie'),
    assert = require('assert');

var browser = new Browser();

browser.visit('http://examples.burningbird.net:3000/login', function() {
  browser.
    fill('username', 'Sally').
    fill('password', 'apple').
    pressButton('Submit', function() {
      assert.equal(browser.location.pathname, '/admin');
    });
});
The test is silent, since the assert at the end is successful—the browser location
is /admin, which is the page that should open if the login works, signaling a successful
test.
In the next couple of sections, the test applications are working with
Redis, so if you haven’t read Chapter 9, you may want to do that now.
It’s important to provide the final slash, as ab expects a full URL, including path.
ab provides a rather rich output of information. An example is the following output
(excluding the tool identification) from one test:
Concurrency Level: 10
Time taken for tests: 20.769 seconds
Complete requests: 15000
Failed requests: 0
Write errors: 0
Total transferred: 915000 bytes
HTML transferred: 345000 bytes
Requests per second: 722.22 [#/sec] (mean)
Time per request: 13.846 [ms] (mean)
Time per request: 1.385 [ms] (mean, across all concurrent requests)
Transfer rate: 43.02 [Kbytes/sec] received
// set database to 1
client.select(1);
console.time('test');
var obj = {
member : 2366,
game : 'debiggame',
first_name : 'Sally',
last_name : 'Smith',
email : '[email protected]',
score : 50000 };
scoreServer.listen(8124);
console.log('listening on 8124');
I was curious about performance if I changed one parameter in the application: from
maintaining a persistent connection to Redis to grabbing a connection when the web
service was accessed, and releasing it as soon as the request was finished. That led to
the second version of the application, shown in Example 14-6.
Example 14-6. Modified application with nonpersistent Redis connections
var redis = require("redis"),
http = require('http');
console.time('test');
// set database to 1
client.select(1);
req.addListener("end", function() {
var obj = {
member : 2366,
game : 'debiggame',
first_name : 'Sally',
last_name : 'Smith',
email : '[email protected]',
score : 50000 };
scoreServer.listen(8124);
console.log('listening on 8124');
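The essence of the change is this pattern (a sketch, not the example's exact code):

var redis = require('redis'),
    http = require('http');

http.createServer(function (req, res) {
  // grab a connection for this request only
  var client = redis.createClient();
  client.select(1);

  // ... the work against Redis goes here ...

  // release the connection as soon as the request is finished
  client.quit();
  res.end('done\n');
}).listen(8124);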
The tests give us a fairly good indication that maintaining a persistent connection en-
hances performance. This is further borne out, in rather dramatic fashion, with a second
test.
When I ran the test 100,000 times, with 1,000 concurrent users, the Node application
that maintained a persistent connection to Redis finished the test, while the other op-
tion actually failed; too many concurrent users backed up at Redis, and it started re-
jecting connections. Exactly 67,985 tests completed before the application went toes
up.
When Nodeload is installed globally, you can access the command-line version (nl.js)
of the module application anywhere. The command-line arguments it takes are similar
to what we’ve used with ab:
nl.js -c 10 -n 10000 -i 2 http://examples.burningbird.net:8124
The application accesses the website 10,000 times, emulating 10 concurrent users. The
-i flag alters how frequently the statistics are reported (every 2 seconds rather than the
default 10 seconds). The flags used here are -n (--number), the number of requests to
make; -c, the number of concurrent users to emulate; and -i, the reporting interval.
The rest of the flags are documented with the module.
The graphics file is also persisted for later access, as is a logfile of the test results. At the
end of the test, summary results are given that are very close to ab in nature.
If you want to provide your own custom test, you can use the Nodeload module to
develop a testing application. The module provides live monitoring, graphics capabil-
ity, statistics, as well as distributed testing capability.
Nodemon wraps your application. Instead of using Node to start the application, use
Nodemon:
nodemon app.js
Nodemon sits quietly monitoring the directory (and any contained directories) where
you ran the application, checking for file changes. If it finds a change, it restarts the
application so that it picks up the recent changes.
You can pass parameters to the application:
nodemon app.js param1 param2
There are other flags, documented with the module. The module can be found at
https://github.com/remy/nodemon/.
Security in web applications goes beyond ensuring that people don’t have access to the
application server. Security can be complex, and even a little intimidating. Luckily,
when it comes to Node applications, most of the components we need for security have
already been created. We just need to plug them in, in the right place and at the right
time.
In this chapter, I break down security into four major components: encryption,
authentication and authorization, attack prevention, and sandboxing:
Encryption
Ensures that data transmitted over the Internet is safe, even if it is intercepted
midroute. The only receiver that can actually decrypt the data is the system that
has the proper credentials (typically a key). Encryption is also used for data that
must be stored confidentially.
Authentication and authorization
Consist of the logins we get whenever we need to access protected areas of an
application. Not only do these logins ensure that a person has access to a section
of an application (authorization), they also ensure the person is who she says she
is (authentication).
Attack prevention
Ensures that someone who is submitting data via a form isn’t trying to tack on text
that can attack the server or the database you’re using.
Sandboxing
Barricades script so it doesn’t have access to the system resources—it operates only
within a limited context.
Encrypting Data
We send a lot of data over the Internet. Most of it isn’t anything essential: Twitter
updates, web page history, comments on a blog post. Much of the data, though, is
private, including credit card data, confidential email messages, or login information
to our servers. The only way to ensure that these types of data transmissions are kept
private, and aren’t hacked in any way during transit, is to use encryption with the
communication.
Setting Up TLS/SSL
Secure, tamper-resistant communication between a client and a server occurs over SSL
(Secure Sockets Layer), and its upgrade, TLS (Transport Layer Security). TLS/SSL provides
the underlying encryption for HTTPS, which I cover in the next section. However,
before we can develop for HTTPS, we have to do some environment setup.
A TLS/SSL connection requires a handshake between client and server. During the
handshake, the client (typically a browser) lets the server know what kind of security
functions it supports. The server picks a function and then sends through an SSL cer-
tificate, which includes a public key. The client confirms the certificate, generates a
random number, encrypts it with the server's public key, and sends it back. The server then
uses its private key to decrypt the number, which in turn is used to enable the secure
communication.
For all this to work, you’ll need to generate both the public and private key, as well as
the certificate. For a production system, the certificate would be signed by a trusted
authority, such as our domain registrars, but for development purposes you can make
use of a self-signed certificate. Doing so generates a rather significant warning in the
browser, but since the development site isn’t being accessed by users, there won’t be
an issue.
The tool used to generate the necessary files is OpenSSL. If you’re using Linux, it should
already be installed. There’s a binary installation for Windows, and Apple is pursuing
its own Crypto library. In this section, I’m just covering setting up a Linux environment.
To start, type the following at the command line:
openssl genrsa -des3 -out site.key 1024
The command generates the private key, encrypted with Triple-DES and stored in PEM
(privacy-enhanced mail) format, making it ASCII readable.
You’ll be prompted for a password, and you’ll need it for the next task, creating a
certificate-signing request (CSR).
When generating the CSR, you’ll be prompted for the password you just created. You’ll
also be asked a lot of questions, including the country designation (such as US for
United States), your state or province, city name, company name and organization, and
The private key wants a passphrase. The problem is, every time you start up the server,
you’ll have to provide this passphrase, which is an issue in a production system. In the
next step, you’ll remove the passphrase from the key. First, rename the key:
mv site.key site.key.org
Then type:
openssl rsa -in site.key.org -out site.key
If you do remove the passphrase, make sure your server is secure by ensuring that the
file is readable only by root.
The next task is to generate the self-signed certificate. The following command creates
one that’s good only for 365 days:
openssl x509 -req -days 365 -in site.csr -signkey site.key -out final.crt
Now you have all the components you need in order to use TLS/SSL and HTTPS.
var https = require('https'),
    fs = require('fs');

var options = {
  key: fs.readFileSync('site.key'),
  cert: fs.readFileSync('final.crt')
};

https.createServer(options, function(req,res) {
  res.writeHead(200);
  res.end("Hello Secure World\n");
}).listen(443);
The key and certificate files are opened, and their contents are read synchronously.
The data is attached to the options object, passed as the first parameter in
the https.createServer method. The callback function for the same method is the one
we’re used to, with the server request and response object passed as parameters.
Accessing the page demonstrates what happens when we use a self-signed certificate,
as shown in Figure 15-1. It’s easy to see why a self-signed certificate should be used
only during testing.
Figure 15-1. What happens when you use Chrome to access a website using HTTPS with a self-signed
certificate
The browser address bar demonstrates another way that the browser signals that the
site’s certificate can’t be trusted, as shown in Figure 15-2. Rather than displaying a lock
indicating that the site is being accessed via HTTPS, it displays a lock with a red x
showing that the certificate can’t be trusted. Clicking the icon opens an information
window with more details about the certificate.
Encrypting communication isn’t the only time we use encryption in a web application.
We also use it to store user passwords and other sensitive data.
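The workhorse for this is the Crypto module's createHash method. The basic pattern,
sketched here with placeholder salt and password variables, is:

var crypto = require('crypto');

// hash the salted password, encoding the digest as a hex string
var hashpassword = crypto.createHash('sha512')
                         .update(salt + password)
                         .digest('hex');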
The digest encoding is set to hexadecimal. By default, encoding is binary, and base64
can also be used.
Many applications use a hash for this purpose. However, there’s a problem with storing
plain hashed passwords in a database, a problem that goes by the innocuous name of
rainbow table.
Put simply, a rainbow table is basically a table of precomputed hash values for every
possible combination of characters. So, even if you have a password that you’re sure
can’t be cracked—and let’s be honest, most of us rarely do—chances are, the sequence
of characters has a place somewhere in a rainbow table, which makes it much simpler
to determine what your password is.
The way around the rainbow table is with salt (no, not the crystalline variety), a unique
generated value that is concatenated to the password before encryption. It can be a
single value that is used with all the passwords and stored securely on the server. A
better option, though, is to generate a unique salt for each user password, and then
store it with the password. True, the salt can also be stolen at the same time as the
password, but it would still require the person attempting to crack the password to
generate a rainbow table specifically for the one and only password—adding immensely
to the complexity of cracking any individual password.
Example 15-2 is a simple application that takes a username and a password passed as
command-line arguments, encrypts the password, and then stores both as a new user
in a MySQL database table. The table is created with the following SQL:
CREATE TABLE user (userid INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(userid),
username VARCHAR(400) NOT NULL, password VARCHAR(400) NOT NULL,
salt VARCHAR(400) NOT NULL);
The salt consists of a date value multiplied by a random number and rounded. It’s
concatenated to the password before the resulting string is encrypted. All the user data
is then inserted into the MySQL user table.
Example 15-2. Using Crypto’s createHash method and a salt to encrypt a password
var mysql = require('mysql'),
    crypto = require('crypto');

var client = mysql.createClient({
  user: 'username',
  password: 'password'
});

client.query('USE databasenm');

// username and password from the command line
var username = process.argv[2];
var password = process.argv[3];

// salt: a date value multiplied by a random number, rounded
var salt = Math.round((new Date().valueOf() * Math.random())) + '';

// hash the salted password
var hashpassword = crypto.createHash('sha512')
                         .update(salt + password)
                         .digest('hex');

// insert the new user record
client.query('INSERT INTO user SET username = ?, password = ?, salt = ?',
  [username, hashpassword, salt], function(err, result) {
    if (err) console.log(err);
    client.end();
});
The application to test a username and password, shown in Example 15-3, queries the
database for the password and salt based on the username. It uses the salt to, again,
encrypt the password. Once the password has been encrypted, it’s compared to the
password stored in the database. If the two don’t match, the user isn’t validated. If they
match, then the user’s in.
Example 15-3. Checking a username and a password that has been encrypted
var mysql = require('mysql'),
crypto = require('crypto');
client.query('USE databasenm');
Trying out the applications, we first pass in a username of Michael, with a password of
apple*frk13*:

node password.js Michael apple*frk13*
Of course, we don’t expect our users to log in via the command line. Neither do
we always use a local password system to authenticate people. We’ll go over the
authentication process next.
Passport isn’t the only module that provides authentication and au-
thorization, but I found it to be the easiest to use.
Passport utilizes strategies that are installed independently from the framework. All
Passport strategies have the same basic requirements:
• The strategy must be installed.
• The strategy must be configured in the application.
• As part of the configuration, the strategy incorporates a callback function, used to
verify the user’s credentials.
• All strategies require additional work depending on the authority vetting the cre-
dentials: Facebook and Twitter require an account and account key, while the local
strategy requires a database with usernames and passwords.
• All strategies require a local data store that maps the authority’s username with an
application username.
• Passport-provided functionality is used to persist the user login session.
In this chapter, we’re looking at two Passport strategies: local authentication/authori-
zation, and authentication through Twitter using OAuth.
app.configure(function(){
...
app.use(passport.initialize());
app.use(passport.session());
...
});
Then configure the local strategy. The format for configuring the local strategy is the
same as that for configuring all other strategies: a new instance of the strategy is passed
to Passport via the use method, similar to the approach utilized by Express:
passport.use(new localStrategy(function (user, password, done) { ... }));
The passport-local module expects that the username and password are passed to the
web application via a posted form, and that the values are contained in fields named
username and password. If you want to use two other field names, pass them as an option
when creating the new strategy instance:
var options =
{ usernameField : 'appuser',
passwordField : 'userpass'
};
passport.use(new localStrategy(options, function(user, password, done) { ... }));
The callback function passed to the strategy construction is called after the username
and password have been extracted from the request body. The function performs the
actual authentication, returning:
• An error, if an error occurs
• A message that the user doesn’t authenticate if he fails authentication
• The user object, if the user does authenticate
Whenever a user tries to access a protected site, Passport is queried to see if he is au-
thorized. In the following code, when the user tries to access the restricted admin page,
a function named ensureAuthenticated is called to determine whether he is authorized:
app.get('/admin', ensureAuthenticated, function(req, res){
res.render('admin', { title: 'authenticate', user: req.user });
});
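A common implementation of ensureAuthenticated, consistent with Passport's own
examples, checks req.isAuthenticated and redirects to the login page otherwise:

function ensureAuthenticated(req, res, next) {
  if (req.isAuthenticated()) { return next(); }
  res.redirect('/login');
}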
To persist the login for the session, Passport provides two methods, serializeUser and
deserializeUser. We have to provide the functionality in the callback function that is
passed to these two methods. Basically, passport.serializeUser serializes the user’s
identifier, while passport.deserializeUser uses this identifier to find the user in what-
ever data store we’re using, and return an object with all the user information:
passport.serializeUser(function(user, done) {
done(null, user.id);
});
passport.deserializeUser(function(id, done) {
...
});
Serialization to the session isn’t a requirement for Passport. If you don’t want to serialize
the user, don’t include the passport.session middleware:
app.use(passport.session());
If you do decide to serialize the user to the session (and you should; otherwise, you’ll
have a very annoyed user, as he’ll keep getting login requests), you must ensure that
the Passport middleware is included after the Express session middleware:
app.use(express.cookieParser('keyboard cat'));
app.use(express.session());
app.use(passport.initialize());
app.use(passport.session());
If you don’t maintain the proper order, the user never authenticates.
The last chunk of functionality is handling what happens when the person doesn’t
validate. During the authentication, if a user’s username isn’t found in the data store,
an error message is generated. If the username is found, but the password doesn’t match
what’s stored, an error is generated. We need to communicate these error messages
back to the user.
Passport uses the Express 2.x req.flash method to queue error messages for display
back to the user. I didn’t cover req.flash in earlier chapters because the functionality
was deprecated in Express 3.x. However, to ensure that Passport works with Express
2.x and 3.x, the Passport developer created a new module, connect-flash, that adds this
functionality back in.
The connect-flash module can be installed with npm:
npm install connect-flash
Now, in the POST login route, if the user doesn’t authenticate, he’s redirected to the
login form and given a notification that an error occurred:
app.post('/login',
passport.authenticate('local', { failureRedirect: '/login', failureFlash: true }),
function(req, res) {
res.redirect('/admin');
});
The error message(s) generated via the authentication process can be passed on to the
views engine via req.flash when the login form is rendered:
app.get('/login', function(req, res){
var username = req.user ? req.user.username : '';
res.render('login', { title: 'authenticate', username: username,
message: req.flash('error') });
});
The views engine can then display the error message in addition to the login form
elements, as this Jade template demonstrates:
extends layout
block content
h1 Login
if message
p= message
form(method="POST"
action="/login"
enctype="application/x-www-form-urlencoded")
p Username:
input(type="text"
name="username"
id="username"
size="25"
value="#{username}"
required)
p Password:
input(type="password"
name="password"
id="password"
size="25"
required)
input(type="submit"
name="submit"
id="submit"
value="Submit")
input(type="reset"
name="reset"
id="reset"
value="reset")
client.query('USE databasenm');
// database error
if (err) {
return done(err);
// check password
} else {
var newhash = crypto.createHash('sha512')
.update(result[0].salt + password)
.digest('hex');
// if passwords match
if (result[0].password === newhash) {
var user = {id : result[0].userid,
username : username,
password : newhash };
return done(null, user);
app.configure(function(){
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');
app.use(express.favicon());
app.use(express.logger('dev'));
app.post('/login',
passport.authenticate('local', { failureRedirect: '/login', failureFlash: true }),
function(req, res) {
res.redirect('/admin');
});
http.createServer(app).listen(3000);
Example 15-4 is a longer example than I normally like to include in a book, but stubbing
in the data source portions of the example wouldn’t give you a real feel for how the
Passport component works with the password hashing component, discussed earlier.
Let’s take a closer look at the authentication method. Once the application has queried
for the user record given the username, it invokes the callback function with the data-
base error, if an error occurs. If an error does not occur, but the username isn’t found,
the application invokes the callback function with the username set to false to signal
that the username wasn’t found, and provides an appropriate message. If the user is
found, but the passwords don’t match, the same thing happens: a value of false is
returned for the user and a message is generated.
Only when no database error occurred, the user exists in the user table, and the pass-
words match is a user object created and returned via the callback function:
// database error
if (err) {
return done(err);
// check password
} else {
var newhash = crypto.createHash('sha512')
.update(result[0].salt + password)
.digest('hex');
// if passwords match
if (result[0].password === newhash) {
var user = {id : result[0].userid,
username : username,
password : newhash };
return done(null, user);
This user object is then serialized to the session, and the user is given access to the
admin page. He’ll continue to have access without challenge to the admin page as long
as the session is alive.
To use OAuth to authenticate a user through Twitter, you need to set up a developer’s
account at Twitter, and get a consumer key and a consumer secret. These are used in
the application to form part of the OAuth request.
Once you have your consumer key and secret, use these, in addition to the callback
URL, to create the Twitter strategy:
passport.use(new TwitterStrategy(
  { consumerKey: TWITTER_CONSUMER_KEY,
    consumerSecret: TWITTER_CONSUMER_SECRET,
    callbackURL: "http://examples.burningbird.net:3000/auth/twitter/callback"},
  function(token, tokenSecret, profile, done) {
    findUser(profile.id, function(err, user) {
      console.log(user);
      if (err) return done(err);
      if (user) return done(null, user);

      // no local record yet: create one from the Twitter profile
      // (createUser as described in the text)
      createUser(profile, token, tokenSecret, function(err, user) {
        return done(err, user);
      });
    });
  }
));
Though Twitter provides authentication, you’re still most likely going to need a way
to store information about the user. In the Twitter strategy code block, notice that the
callback function passed lists several parameters: token, tokenSecret, profile, and then
the last callback function. Twitter provides the token and tokenSecret parameters when
it responds to the request for authentication. The token and tokenSecret values can
then be used to interact with the individual’s Twitter account—for example, to repub-
lish recent tweets, tweet to her account, or discover information about her lists and
followers. The Twitter API exposes all the information the user herself sees when she
interacts with Twitter directly.
The profile object, though, is the object we’re interested in here. It contains a wealth
of information about the person: her Twitter screen name, full name, description, lo-
cation, avatar image, number of followers, number of people followed, number of
tweets, and so on. It’s this data that we’re going to mine in order to store some relevant
information about the user in our local database. We’re not storing a password; OAuth
doesn’t expose the individual’s authentication information. Rather, we’re just storing
information we may want to use in our web applications to personalize the individual’s
experience at our sites.
When the person first authenticates, the application does a lookup on her Twitter
identifier in the local database. If the identifier is found, an object is returned with the
information stored about the person locally. If it’s not found, a new database record is
created for the person. Two functions are created for this process: findUser and crea
teUser. The findUser function is also used when Passport deserializes the user from the
session:
passport.deserializeUser(function(id, done) {
findUser(id, function(err, user) {
done(err,user);
});
});
There is no longer a login page, because Twitter provides the login form. In the appli-
cation, the only login provided is a link to authenticate via Twitter:
extends layout
block content
h1= title
p
a(href='/auth/twitter') Login with Twitter
If the person isn’t logged into Twitter, she’s presented a login page like the one shown
in Figure 15-3.
Once the user is logged in, the web page is then redirected to the application, which
then displays the administrative page for the user. Now, however, the page is person-
alized with data drawn directly from Twitter, including the person’s display name and
avatar:
extends layout
block content
h1 #{title} Administration
p Welcome, #{user.name}
p
img(src='#{user.img}',alt='avatar')
This data is some of what’s stored when the person first authenticates. If you look into
your Twitter account settings page and then click through to the Apps, you’ll see the
application among those listed, as shown in Figure 15-4.
Example 15-5 has the complete application code for authenticating the user via Twitter
and storing her data in a MySQL database. You can, of course, also store the data in
MongoDB, or even Redis, if you persist your Redis data. The Crypto module is no
longer needed, because we’re no longer storing passwords—a distinct advantage to
authenticating via a third-party service.
Example 15-5. Complete application authenticating a user via Twitter
var express = require('express')
  , flash = require('connect-flash')
  , passport = require('passport')
  , TwitterStrategy = require('passport-twitter').Strategy
  , mysql = require('mysql')
  , http = require('http');

// Twitter application credentials (placeholders)
var TWITTER_CONSUMER_KEY = "yourkey";
var TWITTER_CONSUMER_SECRET = "yoursecret";

var client = mysql.createClient({
  user: 'username',
  password: 'password'
});

client.query('USE nodetest2');
passport.serializeUser(function(user, done) {
done(null, user.id);
});
passport.deserializeUser(function(id, done) {
findUser(id, function(err, user) {
done(err,user);
});
});
passport.use(new TwitterStrategy(
  { consumerKey: TWITTER_CONSUMER_KEY,
    consumerSecret: TWITTER_CONSUMER_SECRET,
    callbackURL: "http://examples.burningbird.net:3000/auth/twitter/callback"},
  function(token, tokenSecret, profile, done) {
    findUser(profile.id, function(err, user) {
      console.log(user);
      if (err) return done(err);
      if (user) return done(null, user);
      createUser(profile, token, tokenSecret, function(err, user) {
        return done(err, user);
      });
    });
  }
));
app.configure(function(){
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');
app.use(express.favicon());
app.use(express.logger('dev'));
app.use(express.bodyParser());
app.use(express.methodOverride());
app.use(express.cookieParser('keyboard cat'));
app.use(express.session());
app.use(passport.initialize());
app.use(passport.session());
app.use(flash());
app.use(app.router);
app.use(express.static(__dirname + '/public'));
});
app.get('/auth', function(req,res) {
res.render('auth', { title: 'authenticate' });
});
app.get('/auth/twitter',
  passport.authenticate('twitter'),
  function(req, res){
    // the request is redirected to Twitter for authentication,
    // so this callback never executes
  });
app.get('/auth/twitter/callback',
passport.authenticate('twitter', { failureRedirect: '/login' }),
function(req, res) {
res.redirect('/admin');
});
http.createServer(app).listen(3000);
Since Node is using the V8 engine, we know that we have access to the JSON object,
so we don’t have to worry about cross-browser workarounds.
You can check that the incoming data is of a format consistent with its use, such as
checking to ensure that incoming text is an email address. The check and sanitize
methods come from the node-validator module:

var check = require('validator').check,
    sanitize = require('validator').sanitize;

try {
check(email).isEmail();
} catch (err) {
console.log(err.message); // Invalid email
}
The node-validator application throws an error whenever the data doesn’t check out.
If you want a better error message, you can provide it as an optional second parameter
in the check method:
try {
check(email, "Please enter a proper email").isEmail();
} catch (err) {
console.log(err.message); // Please enter a proper email
}
The sanitize filter ensures that the string is sanitized according to whatever method
you use:
var newstr = sanitize(str).xss(); // prevent XSS attack
You can access the check, sanitize, and other provided methods directly on the request
object:
app.get('/somepage', function (req, res) {
  ...
  req.check('zip', 'Please enter zip code').isInt();
  req.sanitize('newdata').xss();
  ...
});
Sandboxed Code
The vm Node module provides a way to safely sandbox JavaScript. It provides access
to a new V8 virtual machine in which you can run JavaScript passed as a parameter.
You can then run the script in a separate context, passing in any data it might need as
an optional object:
script_obj.runInNewContext(sandbox);
Example 15-7 has a small but complete example of using vm to compile a JavaScript
statement, utilizing two sandbox object properties, and creating a third.
Example 15-7. Simple example of using Node’s vm module to sandbox a script
var vm = require('vm');
var util = require('util');

// sandbox object: the script's only connection with the caller
var sandbox = { name: 'Shelley', domain: 'burningbird.net' };

// compile script
var script_obj = vm.createScript("var str = 'My name is ' + name + ' at ' + domain",
  'test.vm');

// run in a new context; the script adds str as a third sandbox property
script_obj.runInNewContext(sandbox);

console.log(util.inspect(sandbox));
The object passed to the new context is the point of connection between the calling
application and the sandboxed script. The script has no other access to the parent
context. If you tried to use a global object, such as console, in your sandboxed Java-
Script, you’d get an error.
To demonstrate, Example 15-8 modifies the Example 15-7 to load a script in from a
file and run it. The script being loaded is nothing but a slight variation of what we had
in the preceding example, with the addition of a console.log request:
var str = 'My name is ' + name + ' from ' + domain;
console.log(str);
The vm.createScript can’t read in the file directly. The second (optional) parameter
isn’t an actual file, but a name used as a label in a stack trace—it’s for debugging
purposes only. We’ll need to use the filesystem’s readFile to read in the script file
contents.
var vm = require('vm');
var util = require('util');
var fs = require('fs');

// the script file name here is a placeholder
fs.readFile('script.js', 'utf8', function(err, data) {
  if (err) return console.log(err);
  try {
    console.log(data);
    var obj = { name: 'Shelley', domain: 'burningbird.net'};

    // compile and run the script in a new context
    var script_obj = vm.createScript(data, 'test.vm');
    script_obj.runInNewContext(obj);
    console.log(util.inspect(obj));
  } catch(e) {
    console.log(e);
  }
});
The error occurs—and rightly so—because there is no console object within the virtual
machine; it’s a V8 virtual machine, not a Node virtual machine. We’ve seen how we
can implement any process with child processes in a Node application. We certainly
don’t want to expose that kind of power to sandboxed code.
We can run the script within a V8 context, which means it has access to the global
object. Example 15-9 re-creates the application from Example 15-8, except this time
the runInContext method is used, with a context object passed to the method. The
context object is seeded with the object that has the parameters the script is expecting.
Printing out the inspection results on the object after the script execution, though,
shows that the newly defined property, str, is no longer present. We need to inspect
the context to see the object as it exists both in the current context and the sandbox
context.
Example 15-9. Running the code in context, with context object passed to vm
var vm = require('vm');
var util = require('util');
var fs = require('fs');

// the script file name here is a placeholder
fs.readFile('script.js', 'utf8', function(err, data) {
  if (err) return console.log(err);
  try {
    var obj = { name: 'Shelley', domain: 'burningbird.net' };

    // compile script
    var script_obj = vm.createScript(data, 'test.vm');

    // create context
    var ctx = vm.createContext(obj);

    // run the script within the created context
    script_obj.runInContext(ctx);

    // inspect object
    console.log(util.inspect(obj));

    // inspect context
    console.log(util.inspect(ctx));
  } catch(e) {
    console.log(e);
  }
});
The examples used a precompiled script block, which is handy if you’re going to run
the script multiple times. If you want to run it just once, though, you can access both
the runInContext and runInThisContext methods directly off the virtual machine. The
difference is that you have to pass in the script as the first parameter:
var obj = { name: 'Shelley', domain: 'burningbird.net' };

// create context
var ctx = vm.createContext(obj);

// run the script source directly in the context
vm.runInContext(data, ctx, 'test.vm');

// inspect context
console.log(util.inspect(ctx));
Again, within a supplied context, the sandbox script does have access to a global object
defined via createContext, seeded with any data the sandboxed code needs. And any
resulting data can be pulled from this context after the script is run.
At some point in time, you’re going to want to take your Node application from de-
velopment and testing to production. Depending on what your application does and
what services it provides (or needs), the process can be simple, or it can be very complex.
I’m going to briefly touch on the possible combinations and issues related to production
deployment of a Node application. Some require only minimal effort on your part, such
as installing Forever to ensure that your Node application runs, well, forever. Others,
though, such as deploying your application to a cloud server, can take considerable
time and advance planning.
Chapter 14 covered unit, acceptance, and performance testing, and Chapter 15 covered
security. Here, we’ll look at implementing the other necessary components of deploying
a Node application to production on your own server.
I’m not covering all the possible data values in package.json, only those
meaningful for a Node application.
To start, we need to provide the application’s basic information, including its name,
version, and primary author:
{
"name": "WidgetFactory",
"preferGlobal": "false",
"version": "1.0.0",
"author": "Shelley Powers <[email protected]> (http://burningbird.net)",
"description": "World's best Widget Factory",
Note that the name property value cannot have any whitespace.
The author values could also be split out, as follows:
"author": { "name": "Shelley Powers",
"email": "[email protected]",
"url": "http://burningbird.net"},
If the application provides a command-line tool, the tool is listed in the bin property.
Nodeload (covered in Chapter 14), for instance, includes an entry along these lines:

"bin": { "nl.js": "./nl.js" },

What this setting tells me is that when the module is installed globally, I can run the
Nodeload application just by typing nl.js.
The widget application doesn’t have a command-line tool. It also doesn’t have any
scripts. The scripts keyword identifies any scripts that are run during the package life
cycle. There are several events that can happen during the life cycle, including prein
stall, install, publish, start, test, update, and so on, and scripts can be run with each.
If you issue the following npm command in a Node application or module directory:
npm test

npm runs whatever script is associated with the test event.
You should include any unit test script for the widget application in scripts, in addition
to any other script necessary for installation (such as scripts to set up the environment
for the application). Though the Widget Factory doesn’t have a start script yet, your
application should, especially if it is going to be hosted in a cloud service (discussed
later in the chapter).
If you don’t provide a script for some values, npm provides defaults. For the start script,
the default is to run the application with Node:
node server.js
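For the widget application, the scripts entries might eventually look something like this
(the file names are hypothetical):

"scripts": {
  "test": "nodeunit test/test-widgets.js",
  "start": "node app.js"
},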
The repository property isn’t essential unless you’re publishing your application
source (though you can restrict source access to a specific group of people). One of the
advantages of providing this information is that users can access your documentation
with npm docs:
npm docs packagename
For example, I opened the docs for Passport, the authentication module covered in Chapter 15:
npm docs passport
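The repository entry itself follows npm's standard form (the URL here is hypothetical):

"repository": {
  "type": "git",
  "url": "https://github.com/yourname/widgetfactory.git"
},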
The widget application has several different dependencies, for both production and
development environments. These are listed individually—the former in dependencies,
the latter in devDependencies. Each module dependency is listed as the property, and
the version needed as the value:
"dependencies": {
"express": "3.0",
"jade": "*",
"stylus": "*",
"redis": "*",
"mongoose": "*"
},
"devDependencies": {
"nodeunit": "*"
}
If there are any operating system or CPU dependencies, we can also list these:
"cpu" : ["x64", "ia32"],
"os": ["darwin","linux"]
There are some publishing values, including private, to ensure that the application
isn’t accidentally published:
"private": true,
We can test the package.json file by copying the Widget Factory’s code to a new location
and then typing npm install -d to see if all the dependencies are installed and the
application runs.
Rather than start an application with Node directly, start it with Forever:
forever start -a -l forever.log -o out.log -e err.log httpserver.js
The preceding command starts a script, httpserver.js, and specifies the names for the
Forever log, the output log, and the error log. It also instructs the application to append
the log entries if the logfiles already exist.
If something happens to the script to cause it to crash, Forever restarts it. Forever also
ensures that a Node application continues running, even if you terminate the terminal
window used to start the application.
Forever has both options and actions. The start value in the command line just shown
is an example of an action; others include stop, stopall, restart, restartall, and list.
There are also a significant number of options, including the logfile settings just
demonstrated, as well as running the script silently (-s or --silent) and turning on
Forever's verbose messages (-v or --verbose).
You can incorporate the use of Forever directly in your code, as demonstrated in the
documentation for the application:
var forever = require('forever');

var child = new (forever.Monitor)('your-filename.js', {
  max: 3,
  silent: true,
  options: []
});

child.on('exit', this.callback);
child.start();
Additionally, you can use Forever with Nodemon (introduced in Chapter 14), not only
to restart the application if it unexpectedly fails, but also to ensure that the application
is refreshed if the source is updated. You simply wrap Nodemon within Forever and
specify the --exitcrash option to ensure that if the application crashes, Nodemon exits
cleanly, passing control to Forever:
forever nodemon --exitcrash httpserver.js
If the application does crash, Forever starts Nodemon, which in turn starts the Node
script, ensuring that not only is the running script refreshed if the source is changed,
but also that an unexpected failure doesn’t take your application permanently offline.
If you want your application to start when your system is rebooted, you need to set it
up as a daemon. Among the examples provided for Forever is one labeled initd-
example. This example is the basis of a script that starts your application with Forever
when the system is rebooted. You’ll need to modify the script to suit your environment
and also move it to /etc/init.d, but once you’ve done so, even if the system is restarted,
your application restarts without your intervention.
One approach is to use mod_rewrite to proxy requests for the Node application through
to the port it listens on. A rule along these lines (the path pattern here is illustrative)
proxies anything under /node to the application:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteRule ^node/(.*)$ http://localhost:8124/$1 [P]
</IfModule>
If you have the proper permissions, you can also create a subdomain specifically for
your Node application and have Apache proxy all requests to the Node application.
This is an approach used in other environments of this type, such as running Apache
and Tomcat together:
<VirtualHost someipaddress:80>
   ServerAdmin [email protected]
   ServerName examples.burningbird.net
   ServerAlias www.examples.burningbird.net

   ProxyRequests off

   <Proxy *>
      Order deny,allow
      Allow from all
   </Proxy>

   <Location />
      ProxyPass http://localhost:8124/
      ProxyPassReverse http://localhost:8124/
   </Location>
</VirtualHost>
These will work, and the performance should be more than acceptable if you don't
expect your Node application to be accessed frequently. The problem with both
approaches, though, is that all requests are channeled through Apache, which spins off a
process to handle each. The whole point of Node is to avoid this costly overhead. If
you expect your Node application to get heavy use, another approach—but one that's
dependent on your having root control of your system—is to modify the Apache
ports.conf file and change the port Apache listens to, from:
Listen 80
to another port, such as:
Listen 8080
Then use a Node proxy, like http-proxy, to listen for and proxy requests to the
appropriate port. As an example, if Apache is to handle all requests to subdirectory public,
and Node handles all requests to node, you could create a standalone proxy server that
takes incoming requests and routes them accordingly. A minimal sketch using
http-proxy's router table (the hostname is illustrative, and Apache is assumed to now be
listening on port 8080):
var httpProxy = require('http-proxy');

var options = {
   router: {
      'examples.burningbird.net/public': '127.0.0.1:8080',
      'examples.burningbird.net/node': '127.0.0.1:8124'
   }
};

httpProxy.createServer(options).listen(80);
The user never sees any of the port magic that is happening behind the scenes. The
http-proxy module also works with WebSocket requests, as well as HTTPS.
Why continue to use Apache? Because applications such as Drupal and others
use .htaccess files to control access to their contents. In addition, several subdomains
at my site use .htpasswd to password-protect the contents. These are all examples of
Apache constructs that have no equivalent in Node server applications.
We have a long-established history with Apache. Tossing it aside in favor of Node
applications is more complicated than just creating a static server using Express.
Improving Performance
There are additional steps you can take to boost the performance of your Node appli-
cation, depending on your system’s resources. Most are not trivial, and all are beyond
the scope of this book.
If your system is multicore, and you're willing to use experimental technology, you can
use Node clustering. The Node.js documentation contains an example of clustering in
which a worker process is forked for each CPU, with all of the workers listening for
incoming requests on the same port.
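Here's a minimal sketch along the lines of the example in the Node documentation (the port number matches the examples used elsewhere in this book):
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
   // fork one worker process per CPU
   for (var i = 0; i < numCPUs; i++) {
      cluster.fork();
   }
} else {
   // all workers listen on the same port
   http.createServer(function(req, res) {
      res.writeHead(200);
      res.end('Hello from worker ' + process.pid + '\n');
   }).listen(8124);
}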
In some future version of Node, we’ll be able to automatically take advantage of a
multicore environment just by passing a parameter of --balance when starting the
application.
You can also take advantage of a distributed computing architecture, utilizing a module
such as hook.io.
There are tricks and techniques to improve your Node application’s performance. Most
will take a considerable amount of work. Instead, you can host your application on a
cloud service and take advantage of whatever performance improvements the host
provides. We’ll go over that option next.
You can add and edit files, and then run the application directly in the IDE. Cloud9
also supports debugging.
The Cloud9 IDE is free to start working with an application, but you’ll need to sign up
for premium service if you want to deploy. It supports a variety of languages, though
it's primarily focused on HTML and Node applications. It also supports multiple
repositories, including GitHub, Bitbucket, Mercurial repositories, Git repositories, and
FTP servers.
Amazon EC2
Amazon Elastic Compute Cloud, or EC2, has some history behind it now, which makes
it an attractive option. It also doesn't impose a lot of requirements on the Node
developer looking to host an application in this cloud service.
Setting up on Amazon EC2 is little different from setting up on a more traditional VPS
(virtual private server). You specify your preferred operating system, update it with
the necessary software to run Node, deploy the application using Git, and then use a
tool like Forever to ensure that the application persists.
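In broad strokes, and assuming an Ubuntu-based instance with Node already installed (the repository name and paths are illustrative), the setup might look like:
sudo apt-get update
sudo apt-get install git
git clone https://github.com/username/widget-factory.git
cd widget-factory
npm install -d
sudo npm install -g forever
forever start httpserver.js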
The Amazon EC2 service provides a web interface that makes it simple to set up an
instance. It doesn't provide a free service like Joyent does, but the charges are
reasonable—about $0.02 an hour while you're trying out the service.
If your application is using MongoDB, the MongoDB website provides very detailed
Amazon EC2 setup instructions.
Nodejitsu
Nodejitsu is currently in beta and is offering beta accounts. Like many of the other
excellent cloud services, it lets you try the service out for free.
Like Heroku, Nodejitsu provides a tool, jitsu, to simplify the deployment process. You
install it using npm. Log into Nodejitsu with jitsu, and deploy simply by typing:
jitsu deploy
Git is a version control system, similar to CVS (Concurrent Versions System) or
Subversion. Where Git differs from these more conventional version control systems
is in how it maintains the source as you make modifications. A version control system like
CVS stores version changes as differences from a base file. Git, on the other hand, stores
snapshots of the code at a specific point in time. If a file isn't changed, Git just links to
the previous snapshot.
To begin using Git, you first need to install it on your system. There are binaries for
Windows and Mac OS X, as well as source code for various flavors of Unix. Installing
it on my Linux (Ubuntu 10.04) server required only one command:
sudo apt-get install git
Once Git is installed, it needs to be configured. You'll need to provide a Git username
(typically, your first and last name) and an email address. These form the two
components of the commit author, used to mark your edits:
git config --global user.name "your name"
git config --global user.email "your email"
Since you’re going to be working with GitHub, the hosting service that houses most (if
not all) Node modules, you’re also going to need to set up a GitHub account. You can
use whatever GitHub username you want—it doesn’t have to match the username you
just specified. You'll also need to generate an SSH (secure shell) key to provide to
GitHub, following the instructions in the GitHub help documentation.
Most Git tutorials start you out by creating a simple repository (or repo, to use the
common terminology) of your own work. Since we're interested mainly in Git with Node,
we'll start out by cloning an existing repository rather than creating our own. Before you
can clone the source, though, you must first fork (create your own copy of) the repository
at the GitHub website by clicking the Fork button located on the upper-right side of
the repository's main GitHub web page, as shown in Figure A-1.
Then you can access the forked repository in your profile. You'll find the
Git URL on the newly forked repository's web page. For instance, when I forked
the node-canvas module (covered in Chapter 12), the URL was
git@github.com:shelleyp/node-canvas.git. The command to clone the forked repository
is git clone URL:
git clone git@github.com:shelleyp/node-canvas.git
You can also clone over HTTP, though the GitHub folks don't recommend it. However,
it is a good approach if you want a read-only copy of the repository source, either for
examples and other material that may not be included when you install the module
with npm, or for a copy of work that hasn't yet been pushed out to npm.
Access the HTTP read-only URL from each repository's web page; the command takes
the same form:
git clone https://github.com/username/node-whatever.git
Now you have a copy of the node-canvas repository (or whatever repository you want
to access). You can make changes in any of the source files if you wish. You stage new
or changed files by issuing the git add command, and then commit those changes with
the git commit command.
If you want to see if the file is staged and ready to commit, you can type the git
status command:
git status
If you want to submit the changes to be included back as part of the original repository,
you’ll issue a pull request. To do so, open the forked repository on which you want to
issue the request in your browser, and look for the button labeled Pull Request, as
shown in Figure A-2.
Figure A-2. Click the Pull Request button at GitHub to initiate a pull request
Clicking the Pull Request link opens up a Pull Request preview pane, where you can
enter your name and a description of the change, as well as preview exactly what’s going
to be committed. You can change the commit range and destination repository at this
time.
Once satisfied, send the request. This puts the item in the Pull Request queue for the
repository owner to merge. The repository owner can review the change; have a
discussion about the change; and, if he decides to merge the request, do a fetch and merge
of the change, a patch and apply, or an automerge.
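The same commands come into play when you publish a module of your own. As a sketch, assuming a new module directory (the name echoes the example that follows):
mkdir MyBeautiful-Module
cd MyBeautiful-Module
git init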
Provide a README for the repository, using your favorite text editor. This is the file
whose contents are displayed when someone views the module's main page at GitHub.
Once the file is created, add it and commit it:
git add README
git commit -m 'readme commit'
To connect your repository to GitHub, you’ll need to establish a remote repository for
the module and push the commits to it:
git remote add origin git@github.com:username/MyBeautiful-Module.git
git push -u origin master
Once you’ve pushed your new module to GitHub, you can begin the fun of promoting
it by ensuring that it gets listed in the Node module listings as well as the npm registry.
This is a quick run-through of the documentation that you can find at the GitHub
website, under the Help link.
We’d like to hear your suggestions for improving our indexes. Send email to [email protected].
363
using OpenID, 323–324 using ImageMagick from, 260–264
autocomplete text in REPL, 27 chunked transfer encoding, 44
auto_reconnect option, 208 cipher method, 319
.clear command, 27, 31
clearBreakpoint command, 289
B clearInterval function, 41
backtrace, 290 clearTimeout function, 41
--balance parameter, 353 client side requirements for WebSockets, 279
batchSize option, 214 client-side debugging, 290–291
battle hardened, 124 close event, 43, 44, 52
benchmark testing, 307–311 close method, 47, 154
benefits of using Node, 19–20 cloud services, deploying to, 353–358
bidirectional, 273 Amazon EC2 (Elastic Compute Cloud),
Bitbucket, 330 357
block keyword, 175 Heroku, 357
bodyParser middleware module, 131, 142, Joyent SmartMachines, 356
254 Nodejitsu, 357–358
.break command, 27 Windows Azure, 354–356
browsers, testing in multiple, 107 Cloud9 IDE deployment options, 330
BSON format, 207 clustering, 325
Buffer class, 39–40 cmd.exe, 53, 54
built-in debugging, 287–290 CMS (content management system), 126, 323
burst, 251 Coffee-script module, 70
CoffeeScript, ix, 313
C collections in MongoDB, 209–213
Colors module, 71–72
c-ares library, 54
colors, changing in log, 71
cache object, 64
command-line prompt for REPL, 21
Cairo library, 268
Commander, 70
Calipso framework, 151
commands for REPL (read-eval-print loop), 27–
callback functions, 13
28
naming, 90
commit author, 359
vs. promises, 81–84
commit command, 360
callback spaghetti, 90
commit method, 240
camel case, 100
CommonJS module system, 63, 75
canvas element, 268–271
compile method, 156, 192
Cascading Style Sheets (CSS) engines, 153
Concurrent Versioning System (CVS), 359
Cassandra key/value store, 187
config command, 68
certificate-signing request (CSR), 316
configure method, 130, 132, 137, 142, 163,
chain getter, 304
174, 181, 279
chained methods, 236–237
Connect framework
chainer helper, 246
and Express, 133–134
changing colors in log, 71
cookieSession middleware module, 115–
check method, 339, 340
118
child processes, 50–54
creating custom middleware, 118–120
exec method, 52–53
lack of errorHandler function for, 120
execFile method, 52–53
logger middleware module, 114–115
fork method, 53
next callback function for, 119, 120
running in Windows, 53–54
overview, 111–112
spawn method, 50–52
364 | Index
parseCookie middleware module, 115–118 createSocket method, 46
req parameter for, 119 createWriteStream method, 60
res parameter for, 119 Crossroads method, 123
static middleware module, 113–114 CRUD (create, read, update and delete)
connect method, 230 with node-mysql module, 237–239
Connect module, 70 with Sequelize module, 244–245
connectionListener callback, 40 Crypto module, 319–322
console object, 13 CSR (certificate-signing request), 316
constructor argument, 56 CSS (Cascading Style Sheets) engines, 153
content management system (CMS), 126, 323 cURL command, 150–151
Content-Length response header, 12, 265 custom middleware, 118–120
Content-Range response header, 265, 266 custom modules, 74–79
content-type, 12 package.json file for, 75–78
context, 37 packaging directory for, 75
control flow modules publishing, 78–79
Async module, 95–99 custom REPL (read-eval-print loop), 29–32,
Step module, 92–95 30
convert tool (ImageMagick), 261–262 CVS (Concurrent Versioning System), 359
cookieParser middleware module, 116, 133 Cygwin, 28
cookieSession middleware module, 115–118,
133
core
D
Buffer class, 39–40 -d flag, 68, 112
child processes, 50–54 Dahl, Ryan, 83
exec method, 52–53 data event, 43, 50
execFile method, 52–53 data type mapping
fork method, 53 for MongoDB, 210
running in Windows, 53–54 for Sequelize module, 242–243
spawn method, 50–52 db-mysql module, 230–237
DNS module, 54–56 chained methods, 236–237
EventEmitter object, 59–62 direct queries, 233–236
global namespace object, 36–38 overview, 230–233
HTTP module, 44–46 --debug flag, 290
Net module, 42–44 debugging, 287–291
process object, 38–39 built-in, 287–290
stream interface, 48–50 client-side, 290–291
timer functions, 40–41 decipher method, 319
UDP module, 46–47 deepEqual method, 294
Utilities module, 56–59 deferred, defined, 82
CoreJS framework, 127 delays, 82
CouchDB, 207 delete method, 236
crashes, application, 322 DELETE verb, 134, 139, 141, 142, 167
create, read, update and delete (CRUD) (see dependencies for modules, listing, 67
CRUD (create, read, update and deploying applications
delete)) to cloud service, 353–358
createClient method, 188 Amazon EC2 (Elastic Compute Cloud),
createHash method, 320 357
createReadStream method, 60, 105, 265, 266 Heroku, 357
createServer method, 12, 13, 40, 44, 112, 318 Joyent SmartMachines, 356
Nodejitsu, 357–358
Index | 365
Windows Azure, 354–356 encodings for strings, 40
to server, 345–353 encrypting data, 316–322
alongside another web server, 351–353 storing passwords, 319–322
improving performance of, 353 with HTTPS, 317–319
package.json file, 346–349 with TLS/SSL, 316–317
using Forever module, 349–351 enctype field, 253
deserializeUser method, 326 end method, 12, 42, 105, 135–137, 189
dev format, 115 equal method, 292–294
development environment, 2–10 ERB (embedded Ruby), 153
installing error event, 105, 231, 232, 255, 262
on Linux (Ubuntu), 2–4 error handling in Express framework, 132–
on Windows 7, 4–9 133
updating, 9–10 Error object, 83, 84
direct queries, 233–236 error parameter, 53, 132
directories field, 79 errorHandler function, 120, 132
directory middleware module, 133, 134, 146, escape method, 232
257 eval function, 337
disconnect method, 230 avoiding, 338
DNS module, 54–56 for custom REPL, 30
done method, 296–298 event loop, 1, 13–19
double quotes, 100 event-driven I/O (input/output), 1
downcase filter, 157 EventEmitter event, 230
Drizzle, 232 EventEmitter object, 42, 48, 59–62
Drupal, 223 exception handling, 84–90
dumpExceptions flag, 132 exec method, 52–53, 202, 203
execFile method, 52–53
execPath method, 38
E execute method, 240
each event, 232 exists method, 104, 105
each...in statement, 203 exit event, 52, 255, 262
echo event, 60 --exitcrash option, 351
ECMAScript, ix expect method, 296
EJS (embedded JavaScript) template system, exports statement, 74, 75
153–172 Express framework
filters for, 157–158 and Connect, 133–134
for Node, 155–156 app.js file, 129–132
syntax for, 154–155 error handling in, 132–133
using with Express, 158–172 installing, 128
displaying individual object, 166–168 module for, 70
generating picklist, 165–166 MVC structure with, 145–150
multiple object environment for, 160– routing, 134–145
161 and HTTP verbs, 139–145
processing object post, 163–165 path, 136–138
processing PUT request, 168–172 testing with cURL, 150–151
routing to static files, 161–162 using EJS template system with, 158–172
Emailjs module, 253 displaying individual object, 166–168
embedded Ruby (ERB), 153 generating picklist, 165–166
Ember.js framework, 127 multiple object environment for, 160–
emit method, 59, 60, 281 161
encoding parameter, 40
366 | Index
processing object post, 163–165
processing PUT request, 168–172
G
routing to static files, 161–162 -g option, 66
using Socket.IO with, 284–285 game leaderboard example, 190–195
extends keyword, 175 GD, 249
external modules, 65–69 Geddy framework, 127
get method, 134, 135, 137–141, 146, 160
GET verb, 134, 135, 139, 160
F Git, 330, 359–362
fail method, 295 GitHub, 65, 330, 359–362
favicon middleware module, 112, 118, 119, global installation of modules, 66
131 global namespace object, 35–38
Fedora system, 11 for modules, 24
FFmpeg, 249 in REPL, 25, 26
fibers, 92 --global option, 66
file servers, 103–110 globalAgent, 45
File System module, 15, 60 Google V8 JavaScript engine, ix, 1
files, reading asynchronously, 14–16 grep command, 51
filter function, 95 group object, 93
filters for EJS template system, 157–158
FIN (finish packet), 41
find command, 51
H
handshake, 316
find method, 213, 214
hash, 190
find options for MongoDB, 213–214
hash method, 319
findAndModify method, 217–221
headers object, 12
findAndRemove method, 217–221
heapTotal property, 39
finding modules, 69–71
heapUsed property, 39
findOne method, 213, 216
Hello, World example, 10–11
finish packet (FIN), 41
explained, 11–13
first filter, 157
for WebSockets, 281–284
flash method, 326, 327
Heroku, 357
Flatiron framework, 127
hgetall method, 193, 194, 202, 203
for loop, 94
hincrby method, 201, 204, 205
for...in statement, 203
hmac method, 319
forEach method, 88, 93, 95, 98
Holowaychuk, TJ, 130
Forever iFrame, 279
host parameter, 188
Forever module, 349–351
hset method, 189
--exitcrash option for, 351
HTML5
options for, 322–320
canvas content, 268–271
-s option for, 350
serving HTML5 video, 264–268
-v option for, 351
HTTP module, 32, 44–46
fork method, 53
HTTP verbs, 139–145
fork, defined, 360
http-proxy module, 124
format method, 55
HTTPS, encrypting data with, 317–319
forward proxy, 124
frameworks, defined, 127
fs (File System) module, 15 I
future, defined, 82 IDE (integrated development environment),
354
iisnode, 5
Index | 367
ImageMagick, 260–264
img element, 106, 269
L
immediate flag, 114 la option, 67
include directive, 175, 192 last argument callbacks, 83
incr method, 201 .leave command, 48
indenting code, 100 length parameter, 40
index function, 134, 135 length property, 166, 177
index.js file, 159 libraries
inherits method, 56–58, 60, 61, 77 defined, 127
injection attacks, 337–338 installing, 2
insert method, 211, 235, 236, 240 requirements, 2
inspect method, 56 libssl-dev, 2
install event, 347 limit option, and findOne method, 216
installing Linux
development environment installing development environment on, 2–
on Linux (Ubuntu), 2–4 4
on Mac, 1 making REPL executable, 32
on Windows 7, 4–9 list option, 67
Express framework, 128 listen method, 11–13, 15
libraries, 2 listening event, 13, 15
Redis module, 188–190 ll option, 67
integrated development environment (IDE), load
354 balancing using reverse proxy, 124
IPC (interprocess communication), 44 testing with Nodeload module, 311–313
isAuthenticated method, 325 loading modules, 63–65
isEqual method, 294 local installation of modules, 66
log method, 25, 31, 37, 47, 88
log, changing colors in, 71
J logger middleware module, 114–115, 116, 119,
Jade file, 192 131
Jade module, 70 lookup method, 54
Jade template system, 172–180 ls option, 67
modularizing views in, 174–180
syntax for, 172–174
Janczuk, Tomasz, 5
M
Jasmine framework, 298–299 -m flag, 360
JavaScript as basis for Node, ix, 1 Mac, installing on, 1
JavaScript Gateway Interface (JSGI), 111 main property, 75
journaling option, 214 map function, 95
Joyent SmartMachines, 356 maxLength option, 133
.js files for modules, 64 maxObjects option, 133
JSDOM module, 71 maxSockets property, 45
JSGI (JavaScript Gateway Interface), 111 McMahon, Caolan, 95
.json files for modules, 64 Memcached key/value store, 187
memoization, 95
memoryUsage method, 39
K Mercurial repositories, 330
keepGoing option, 213 message event, 47
keyboard shortcuts in REPL, 27 message queue
defined, 196
368 | Index
example using Redis, 196–201 querying data, 213–217
methodOverride option, 131, 142 remove method, 217–221
Microsoft Visual C++ 2010 Redistributable update method, 217–221
Package, 5 update modifiers for, 218–219
Microsoft Web Platform Installer, 5 Mongoose, 221–227
middleware, 110–120 adding database connection, 223–227
cookieSession middleware module, 115– refactoring widget for, 222–223
118 multi method, 201, 203
creating custom middleware, 118–120 multi parameter, 219
logger middleware module, 114–115 multiline code in REPL, 24–32
overview, 111–112 multipart/form-data content type, 253
parseCookie middleware module, 115–118 multiple object environment, 160–161
static middleware module, 113–114 multiple requests, and string values, 17
Mime module, 71 MVC (Model-View-Controller) framework and
MIME type, 107 Express, xi, 145–150
minimal static file server, 103–110 MX record, 54
mixin function, 73, 74 MySQL databases
Mocha framework, 297–298 db-mysql module, 230–237
Model-View-Controller (MVC) framework and chained methods, 236–237
Express (see MVC (Model-View- direct queries, 233–236
Controller) framework) overview, 230–233
modules node-mysql module, 237–242
Colors module, 71–72 CRUD with, 237–239
custom, 74–79 transactions with, 239–242
package.json file for, 75–78 mysql-queues module, 239–242
packaging directory, 75 mysql-series module, 229
publishing, 78–79
defined, 12
dependencies for, 67
N
external, 65–69 name property, 75
finding, 69–71 naming callback functions, 90
global installation of, 66 nested callbacks, 84–90, 209
global namespace for, 24 Net module, 42–44
loading with require statement, 63–65 next callback function, 118–120
local installation of, 66 next command, 288
Optimist module, 72–73 next parameter, 132
Underscore module, 73–74 nextTick method, 39, 91
MongoDB node command, 11
and asynchronous functionality, 220–221 .node files for modules, 64
data type mapping for, 210 Node Inspector, 290–291
find options for, 213–214 Node Package Manager (npm), xi
Mongoose with, 221–227 Node Style, 100–101
adding database connection, 223–227 Node Version Manager (Nvm), 9
refactoring widget for, 222–223 node-mysql module, 237–242
native driver for, 208–221 CRUD with, 237–239
collections in, 209–213 transactions with, 239–242
findAndModify method, 217–221 node-validator module, 339–340
findAndRemove method, 217–221 Nodejitsu, 357–358
overview, 208–209 Nodeload module
flags for, 311–312
Index | 369
load testing with, 311–313 passwords, encrypting, 319–322
Nodemon module, 313–314 PATH environment variable, 3
Nodeunit module, 296–297 path routing in Express framework, 136–138
NODE_ENV variable, 130 pattern attribute, 140
node_modules folder, 64 PDF files, 249–260
normalize method, 162 using PDF Toolkit
NoSQL databases, 187, 207 accessing data about file with, 251–252
npm (Node Package Manager), xi, 65 creating files with, 258–260
.npmignore list, 79 creating page to upload files, 252–257
NS record, 54 wkhtmltopdf utility, 250–251
Nvm (Node Version Manager), 9 PEM (privacy-enhanced mail) format, 316
performance
benchmark testing with ApacheBench
O module, 307–311
OAuth, 323–324, 331–337 improving, 353
object-relational mapping (ORM), 229 load testing with Nodeload module, 311–
ODM (object-document mapping), 221 313
offset parameter, 40 picklist
on method, 42, 59, 61, 62 defined, 165
onclick event handler, 81 generating, 165–166
open event, 105 pipe
open method, 154 defined, 48
open text fields, avoiding, 338–339 method, 105
OpenFeint, 190 placeholders, 233
OpenID, 323–324 platform method, 38
OpenSSL, 316 Polaroid effect, 262
Optimist module, 70, 72–73 poolSize option, 208
ORM (object-relational mapping), 229 port parameter, 188
os module, 32 post method, 134, 140
out command, 288 POST verb, 134, 139
output stream, 112 prefix configuration option, 3
overhead of single thread process, 19 prefork MPM (prefork multiprocessing model),
13
P preinstall event, 347
print method, 189, 190
package.json files
privacy-enhanced mail (PEM) format, 316
deploying to servers, 346–349
private keys, 316
for custom modules, 75–78
process method, 39
generating, 76
process object, 38–39
required fields in, 75
Procfile, 337
packaging directory, 75
profile parameter, 332
parallel method, 95, 98, 193
program flow and asynchronous functionality,
parse method, 55, 338
16–19
parseBody method, 254
promises vs. callback functions, 81–84
parseCookie middleware module, 115–118
proxies, 123–126, 123
passphrase, 317
public keys, 316
Passport module, 322–337
publish event, 347
storing locally with, 324–331
pull request, 361
using OAuth with, 323–324, 331–337
put method, 146
using OpenID with, 323–324
370 | Index
PUT verb, 134, 141, 142 Sequelize module, 242–247
PuTTY, 31 adding several objects at once, 246–247
pwd command, 50 CRUD with, 244–245
pyramid of doom, 90 defining model, 242–243
Python, 2 issues with, 247
remoteAddress property, 43
remotePort property, 43
Q remove method, 217–221
qs variable, 24 remove option, 220
query method, 230, 237 render method, 134, 135, 156, 159, 160, 167,
Query String module, 55 169, 202
querying in MongoDB, 213–217 renderFile method, 155
querystring object in REPL, 24 REPL (read-eval-print loop)
quit method, 189 > prompt in, 29
arrow keys in, 26
R autocomplete text in, 27
benefits of, 23
RailwayJS framework, 151
.break command in, 27
rainbow table, 320
.clear command in, 27, 31
Ranney, Matt, 188
command-line prompt for, 21
RavenDB, 207
commands for, 27–28
read-eval-print loop (REPL) (see REPL (read-
{} (curly braces) in, 24
eval-print loop))
global object in, 25, 26
readFile method, 15, 17, 93–96, 104, 105, 341
http module in, 32
readFileSync function, 44
keyboard shortcuts in, 27
reading files asynchronously, 14–16
log command in, 31
Readline module, 48
log method in, 25
README file, 362
making executable in Linux, 32
ready event, 231
multiline code in, 24–32
reasonPhrase method, 12
os module in, 32
reddish-proxy, 124
overview, 21–23
redirect method, 161
process.stdin in, 29
Redis key/value store, 187
qs variable in, 24
Redis module, 70, 201
querystring object in, 24
Redis support
require statements in, 24
game leaderboard example, 190–195
rlwrap utility in, 28–29, 32, 33
installing module for, 188–190
.save command in, 27, 28
message queue example, 196–201
saving in, 32–33
stats middleware using, 201–205
start method in, 29
refactoring, 222
stream option in, 30
refreshing code changes, 313–314
using custom, 29–32
regular expressions in routes, 122, 136
util module in, 32
relational database bindings
var keyword in, 22, 24
db-mysql module, 230–237
_ (underscore) in, 22
chained methods, 236–237
repl object, 29
direct queries, 233–236
replace method, 87
overview, 230–233
repository, 360
node-mysql module, 237–242
Representational State Transfer (REST), 131
CRUD with, 237–239
req parameter, 119
transactions with, 239–242
Index | 371
Request module, 70 secure shell (SSH) (see SSH (secure shell))
request object, 45, 46 Secure Sockets Layer (SSL), 316–317
request parameter, 132 security
requestListener method, 12, 44 authentication/authorization with Passport
require statements module, 322–337
in REPL, 24 locally stored, 324–331
loading modules using, 63–65, 74 using OAuth, 323–324, 331–337
section in file, 181 using OpenID, 323–324
required attribute, 140, 254 encrypting data, 316–322
requirements storing passwords, 319–322
libraries, 2 with HTTPS, 317–319
Python, 2 with TLS/SSL, 316–317
res parameter, 119 protecting against attacks, 337–340
resolve method, 54, 64 avoiding eval function, 338
response headers, 12 avoiding open text fields, 338–339
response parameter, 132 sanitizing data with node-validator
REST (Representational State Transfer), 131 module, 339–340
resume method, 38 sandboxing code, 340–343
reverse method, 54 Selenium, 301
reverse proxy, 124 self-signed certificates, 316
rewriting web requests, 324 semicolons, 100
rlwrap utility, 28–29, 32, 33, 49, 50 send method, 136, 137, 167, 281
rollback method, 240 sendfile method, 161
router middleware module, 131, 133, 146 Sequelize module, 242–247
routing, 121–123 adding several objects at once, 246–247
* (asterisk) in, 137 CRUD with, 244–245
in Express framework, 134–145 defining model, 242–243
and HTTP verbs, 139–145 issues with, 247
path, 136–138 sequence, 91, 92
regular expressions in, 122, 136 sequential functionality, 84–90
to static files, 161–162 sequential programming, 85
rpush method, 198 serial method, 95, 98
Ruby on Rails, 145 serializeFunction parameter, 219
runInContext method, 342, 343 serializeUser method, 326
runInThisContext methods, 343 series
defined, 91, 92
method, 193, 194, 195
S Server constructor, 208
-s option, 350 ServerRequest object, 44
sadd method, 201 ServerResponse object, 12, 44
safe parameter, 219 servers
salt, 320 deploying to, 345–353
sandboxing, 340–343, 340 alongside another web server, 351–353
Sanderson, Steve, 7 improving performance of, 353
sanitizing data package.json file, 346–349
sanitize method, 339, 340 using Forever module, 349–351
with node-validator module, 339–340 minimal static file server, 103–110
Sauce Labs, 301 session middleware module, 326
.save command, 27, 28 set method, 130, 201, 236
script element, 279
372 | Index
setBreakpoint command, 289 stringEqual method, 293
setEncoding method, 39, 40, 43, 48, 51 strings, encodings for, 40
setInterval function, 41 .styl files, 182
setMaxListeners method, 62 style tag, 192
setTimeout function, 17, 39, 40, 41 Stylus
sha1 algorithm, 320 in template systems, 180–184
shared hosting, 4 no dynamic CSS views in, 181
showStack flag, 132 Subversion, 359–362
sign method, 319 success event, 232, 233
SimpleDB, 207 sudo command, 4
single quotes, 100 superConstructor argument, 56
single thread superuser privileges, 4
for Node, ix syntax for EJS template system, 154–155
overhead of, 19
Socket.IO module, 70
and WebSockets, 274–279
T
configuring, 279–281 tail command, 197
using with Express, 284–285 TCP (Transmission Control Protocol), 40, 273
sockets, 41 template systems
Soda module, 301–305 EJS (embedded JavaScript) template system,
sorted set, 190 153–172
spawn method, 50–52, 50 filters for, 157–158
SSH (secure shell), 356, 359 for Node, 155–156
SSL (Secure Sockets Layer), 316–317 syntax for, 154–155
stack property, 87 using with Express, 158–172
Standard IO (STDIO), 36 Jade template system, 172–180
start event, 347 modularizing views in, 174–180
start method, 29 syntax for, 172–174
startnum/endnum values, 265 Stylus in, 180–184
stat command, 89 test event, 347
static files testing
routing to, 161–162 acceptance testing, 301–306
server for, 103–110 with Soda module, 301–305
static middleware module, 113–114, 131 with Tobi module, 305–306
static middleware option, 113, 114 with Zombie module, 305–306
staticCache middleware module, 133 in all browsers, 107
stats method, 89, 98 performance testing, 306–313
stats middleware module, 201–205 benchmark testing with ApacheBench
stderr stream, 38, 50, 51, 132 module, 307–311
stdin stream, 38, 46, 48, 51, 61 load testing with Nodeload module,
STDIO (Standard IO), 36 311–313
stdout stream, 38, 48, 50, 51, 114 unit testing, 292–301
step command, 288 with Assert module, 292–295
Step module, 92–95 with Jasmine framework, 298–299
Strata framework, 151 with Mocha framework, 297–298
stream interface, 48–50 with Nodeunit module, 296–297
stream option, 30 with Vows framework, 299–301
strict equality operator, 100 text/html content type, 107
strictEqual method, 293, 294 third-party authentication/authorization, 322
this context keyword, 58, 93
Index | 373
V413HAV
time-consuming operations, 15
timer functions, 40–41
V
TLS (Transport Layer Security), 41, 257, 316– -v option, 351
317 var keyword, 19, 20, 22, 24, 100
Tobi module, 305–306 verify method, 319
token parameter, 332 version method, 38
tokenSecret parameter, 332 video element, 106, 264–268
toString method, 47 virtual private network (VPN), 357
Tower.js framework, 151 VOIP (Voice over Internet Protocol), 46
transactions support, 239–242 Vows framework, 299–301
Transmission Control Protocol (TCP) (see TCP VPN (virtual private network), 357
(Transmission Control Protocol))
Transport Layer Security (TLS) (see TLS W
(Transport Layer Security)) W3C (World Wide Web Consortium), 273
transports option, 280 waterfall method, 92, 95, 96, 193
Triple-DES encryption, 316 WebDriver, 301
trusted authorities, 316 WebGL, 249
try blocks, 85 WebSockets protocol, 273–274
Twitter, 331 and Socket.IO, 274–279
type parameter, 122 browser support for, 274
client side requirements, 279
U Hello, World example, 281–284
Ubuntu, 2–4 in asynchronous application, 278–279
UDP (User Datagram Protocol), 41, 46–47 simple example using, 274–277
Uglify-js module, 70 where method, 236
Underscore module, 70, 73–74 Widget Factory, 337
unidirectional, 273 Windows 7
unit testing, 292–301 child processes in, 53–54
with Assert module, 292–295 installing development environment on, 4–
with Jasmine framework, 298–299 9
with Mocha framework, 297–298 Windows Azure, 354–356
with Nodeunit module, 296–297 wkhtmltopdf utility, 250–251
with Vows framework, 299–301 worker MPM (prefork multiprocessing model),
update event, 347 13
update method, 217–221, 235, 236, 240 World Wide Web Consortium (W3C), 273
update modifiers for MongoDB, 218–219 write method, 40, 43, 136, 137
upload files page, 252–257 writeFile method, 94, 96, 98
uppercase, use of, 100 writeHead method, 12
upserts
defined, 217 Z
parameter, 219 zero-sized chunk, 44
URL module, 55 Zombie module, 305–306
url property, 104 zrange method, 192
use method, 112
useGlobal flag, 37
User Datagram Protocol (UDP) (see UDP (User
Datagram Protocol))
Utilities module, 32, 56–59
374 | Index
About the Author
Shelley Powers has been working with, and writing about, web technologies—from the
first release of JavaScript to the latest graphics and design tools—for more than 12
years. Her recent O’Reilly books have covered the semantic web, Ajax, JavaScript, and
web graphics. She’s an avid amateur photographer and web development aficionado,
who enjoys applying her latest experiments on her many websites.
Colophon
The animal on the cover of Learning Node is a hamster rat (Beamys). There are two
species of hamster rats: the greater hamster rat (Beamys major) and the lesser hamster
rat (Beamys hindei).
The hamster rat inhabits the African forests from Kenya to Tanzania. This large rodent
prefers to make its home in moist environments: along riverbanks and in thickly
forested areas. It thrives in coastal or mountainous regions, although deforestation
threatens its natural habitat. Hamster rats live in multichambered burrows and are excellent
climbers.
This rodent has a very distinct appearance: it can be 7 to 12 inches long and weigh up
to a third of a pound. It has a short head and gray fur overall, with a white belly and a
mottled black and white tail. The hamster rat, like other rodents, has a variable diet; it
possesses cheek pouches for food storage.
The cover image is from Shaw’s Zoology. The cover font is Adobe ITC Garamond. The
text font is Linotype Birka; the heading font is Adobe Myriad Condensed; and the code
font is LucasFont’s TheSansMonoCondensed.