Teleport: Tutorial and Manual
Congratulations! You have obtained the most powerful offline browsing, site mirroring, and file-retrieving
tool on the Internet.
Part Swiss Army knife, part chainsaw, Teleport is a fully automated, multithreaded, link-following, file-
retrieving webspider. It will retrieve all the files you want — and only the files you want — from any part
of the Internet. Teleport can also:
Completely download a website, enabling you to “offline browse” the site at much greater speeds than
if you were to browse the site online
Create an exact duplicate, or “mirror” of a website, complete with subdirectory structure and all
required files
Search a website for files of a certain type (and size)
Automatically download a list of files from the Internet
Explore every website linked from a central website
Search a website for keywords
Make a list of all pages and files on a website
No more waiting for slow pages to download. No more clicking on links for hours, only to find garbage at
the end of your trail. Teleport can completely automate your website searching, mirroring, publishing, and
downloading tasks.
Contents
Using Teleport
How Teleport Explores
Address Properties
Offline Browsing
Opening and Working With Retrieved Files
New Project Wizard
Tutorial: Overview
Tutorial: Creating a New Project
Tutorial: Saving the Project
Tutorial: Running the Project
Tutorial: Viewing the Results
Other Sample Projects
Project Summary Page
Using Teleport
To use Teleport, you create a project file that contains one or more addresses to files on the Internet. You
also give Teleport some rules that define what links it will follow and what files it will retrieve. You then
send the spider on its mission by selecting the Start command on the File menu, or by pressing the Start
button on the toolbar.
Once activated, the Teleport spider will read your project's starting addresses and retrieve any files that it
finds there. It then reads all of the links on that page, follows those links, gets the files on those pages, and
so on, and so on, and so on... until it runs out of places to go.
You can tell Teleport to retrieve only certain types of files, and to follow only certain types of links. For
example, you may direct it to retrieve only jpg and gif files, the usual types of graphics files on the World
Wide Web. You may also direct it to follow only links within the same domain as the starting address, and
even set its “depth” of search. Your "program" for the spider's behavior will determine how far it goes, how
long it takes, and what types of files it will get.
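The exploration described above can be pictured as a breadth-first walk over links with a depth limit and an optional same-domain rule. The following is only a minimal sketch under stated assumptions: the function name explore, the in-memory SITE map, and the way depth is counted are illustrative and are not taken from Teleport itself.

```python
from collections import deque
from urllib.parse import urlparse


def explore(start, links, max_depth, same_domain_only=True):
    """Breadth-first walk of a link graph, limited by depth and domain.

    `links` is a hypothetical in-memory map of address -> linked addresses,
    standing in for the pages a spider would fetch and parse.
    """
    start_host = urlparse(start).netloc
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        address, depth = queue.popleft()
        order.append(address)
        if depth == max_depth:
            continue  # depth limit reached: do not follow links from here
        for target in links.get(address, []):
            if target in seen:
                continue  # every address is visited at most once
            if same_domain_only and urlparse(target).netloc != start_host:
                continue  # stay within the starting domain
            seen.add(target)
            queue.append((target, depth + 1))
    return order


# Hypothetical site used only for illustration.
SITE = {
    "http://example.com/": ["http://example.com/a.html", "http://other.example/x.html"],
    "http://example.com/a.html": ["http://example.com/b.html", "http://example.com/"],
    "http://example.com/b.html": ["http://example.com/c.html"],
}
```

Calling explore("http://example.com/", SITE, 2) visits the starting page, a.html, and b.html, but stops before c.html and never leaves example.com.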
The Teleport spider is extremely flexible. It has many customizable exploration parameters for specifying
which types of links to follow and which types of files to retrieve. Most of the time, however, you can let the
New Project Wizard set up your project’s exploration parameters for you. The New Project Wizard will
usually choose the best parameters for most common Teleport tasks.
Address Properties
The Teleport spider remembers the address of every file or html page that it reads. Each stored address also
has an enable/disable flag that tells Teleport whether it should explore that address again, when it becomes
out-of-date. You can view the address properties, and change the enable/disable flag, by using the
Properties, Enable, and Disable commands from the file context menu.
Note: When a page in the Project Map is disabled, all pages and files linked from that page are also
disabled. This allows you to disable an entire domain, for example, by disabling its gateway page.
Note: Teleport Pro can retrieve and explore only files with HTTP (World Wide Web) and FTP (File
Transfer Protocol) addresses, and files on a local or network drive. Teleport Ultra and Teleport VLX can
also retrieve files from secure servers (those with HTTPS addresses).
Offline Browsing
You can use Teleport to create a local copy of a complete website on your hard drive, allowing you to
“offline browse” the site later at much greater speeds than if you were to browse the site online.
The easiest way to offline browse a site is to create a new project using the New Project Wizard, and select
the first type of project, “Create a browsable copy of a website on my hard drive.” The New Project
Wizard will ask you for the starting address of the website, and then whether you want to copy only text
(html) files, or also graphics and sound files. When you’ve finished creating and saving the new project,
just press the Start button on the toolbar to run the project.
When the project is complete, open the first web page by right-clicking it in the Project Map to display the
file context menu, then select the Open command. Teleport will open the page in your default browser, and
you can then browse the site offline.
select. The project will initially contain one Starting Address (though you may of course add more later),
and the spider will be instructed to stay within the path of that address when exploring. Note that because
all of the files will be stored in a single folder on your hard drive, Teleport may rename some of them to
avoid filename collisions.
Duplicate a website, including directory structure is used for “site mirroring” and will create a “tiered”
copy of a website’s files on your hard drive, where every file is stored in a subfolder of your project’s main
folder, mirroring its storage location on the remote server. Use this project type if you would like to ensure
that the filenames Teleport uses are identical to those on the remote server, or if you just want to see or use
its directory structure. (When creating a “flat” copy of a website, as noted above, Teleport may rename
some files to avoid filename collisions.)
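The difference between a "flat" copy and a "tiered" copy can be illustrated with a short sketch. The renaming scheme (appending a number), the index.html default name, and the function names below are assumptions made for illustration; the names Teleport actually chooses may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def tiered_path(url):
    """Mirror the server layout: host/dir/.../file."""
    parts = urlparse(url)
    path = parts.path
    if path.endswith("/"):
        path += "index.html"  # assumed default name for directory addresses
    return str(PurePosixPath(parts.netloc) / path.lstrip("/"))


def flat_names(urls):
    """Store every file in one folder, renaming on a filename collision."""
    used = set()
    result = {}
    for url in urls:
        name = PurePosixPath(urlparse(url).path).name or "index.html"
        stem = PurePosixPath(name).stem
        suffix = PurePosixPath(name).suffix
        candidate, n = name, 1
        while candidate.lower() in used:
            candidate = f"{stem}{n}{suffix}"  # hypothetical renaming scheme
            n += 1
        used.add(candidate.lower())
        result[url] = candidate
    return result
```

Two files named logo.gif in different server directories keep their names in the tiered layout, while in the flat layout the second one has to be renamed.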
Search a website for files of a certain type is the project type of choice when you’re hunting for graphics
files, background images, sound files, or even ZIP files or programs! This option creates a project that
starts at a single Starting Address, then explores outward within the path of that address, looking for files of
a certain type. It will not create a browsable copy of the website, but if you don’t want web pages, this type
of project will run faster and fill your disk with only the types of files that you want.
Explore every site linked from a central site is the ideal project type if you want to just get an idea of
what types of files are contained in several sites that are linked from a “link page” or “link site,” such as a
Top 100 Websites list. This project does not retrieve files; it only retrieves their names, but it stores them
in the File List as if they were files. This type of project explores very rapidly because it does not need to
write any files to disk. You can easily change this, however, once you’ve determined that one or more of
the sites it’s explored have something you want. You can use the Retrieve Now function to grab individual
files, or disable the “Retrieve only file names” option on the Project Properties Retrieval page, and rerun the
project.
Retrieve one or more files at known addresses is convenient if you have a list of addresses you would
like to poll for data, or if you’d like to just download a set of files. The Wizard will display a multiline text
box in which you can enter (or paste) a list of addresses, one per line. The Wizard will then create a project
containing each of these addresses as a Starting Address. The spider will be instructed to explore only the
starting address page, so it will not follow any links away from the pages or files you specify.
Search a website for keywords will create a project that is identical to Create a browsable copy of a
website on my hard drive (above), except that every web page the spider encounters will be tested to see if
it contains any one or more of a set of keywords. Web pages, and any files embedded in them, will be
retrieved only if they contain one or more of your keywords.
Tutorial: Overview
This tutorial will show you step-by-step how to create and run a Teleport project. The Internet site used for
this tutorial is The State Hermitage Museum of St. Petersburg, Russia. It is a beautiful website, but it can be
slow because it is so graphics-intensive. Teleporting it to your hard drive makes browsing it fast and fun.
The State Hermitage Museum website is located at www.hermitagemuseum.org.
There are four essential steps to creating and running a Teleport project:
1 Create a new project
2 Save the project
3 Run the project
4 View the results
but then you will have to set your Project Properties manually.) The New Project Wizard will ask you a
series of questions to determine what you would like to do, and then create a project for you that will do it.
Page 1: Six of the most common Teleport tasks are presented in the Wizard’s first page.
For this tutorial, choose “Create a browsable copy of a website on my hard drive.”
Page 2: The Wizard will next ask you to enter the Starting Address for this project. The Starting Address
is the first place Teleport will begin looking for files. It will then follow links away from the Starting
Address, retrieving files as it goes, until it has exhausted all possible links.
For this tutorial, the starting address is www.hermitagemuseum.org and we want to set the depth
to 2 links from the starting point. The Hermitage Museum has a very large website, so even a depth of
two will take some time to run. Two is sufficient for this tutorial. If you are interested in the site, you
can simply change the depth later to three or more, and Teleport more of the site.
Note that the path portion of an Internet address can be case-sensitive (host names are not). Be sure that the address you type is exactly correct. A
mistyped letter or punctuation mark will invalidate the address. Copying and pasting an address out of your
browser is a good way to ensure you have a valid address. If Teleport doesn't retrieve any files or follow
any links for your starting address, an invalid address is the most common reason.
Note: Teleport can retrieve and explore only files with HTTP (World Wide Web) and FTP (File Transfer
Protocol) addresses, and files on a local or network drive. To use a file on a local or network drive, enter
the path to the file using ordinary Windows notation, e.g., “c:\temp\myfile.htm”.
You can also give your starting address an optional title.
Page 3: The Wizard will now ask you some basic questions about what you want your project to do. When
you want to “create a browsable copy of a website,” the Wizard will ask you what types of files you want to
retrieve for the website.
For this tutorial, select the default, “Everything”. You do not need to enter an account or password.
When you finally press the Next button to confirm your answers to the Wizard’s questions, the Wizard will
create a fully configured project for you.
Page 4: You must now press the Finish button to display the Save As dialog box, and save the project.
At this point, you may minimize the program window, and simply let it run in the background.
Note: You might like to observe the tutorial project as it runs. The project will typically retrieve about 200
files.
After the project has run (and also while it's running), you can view the results in the project window.
Retrieval Settings
The Project Properties File Retrieval Page lists a number of options that allow you to specify what types
and sizes the Teleport spider will retrieve, where it will store them, and whether it will modify them for
offline browsing. Click on the Project Properties button on the toolbar, or select Properties from the Project
menu, to bring up the Project Properties sheet. Select the File Retrieval tab to display that page.
Retrieve All Files (with a maximum size limitation): When this option is checked, Teleport will retrieve all
the files it encounters, regardless of name, type, or extension. For safety reasons, however, you can specify
a maximum file size that will prevent Teleport from trying to retrieve huge files.
Note: Setting the maximum size to zero disables the maximum size limitation.
Note: Teleport can only filter files based on size when the remote server transmits the file size. Some
servers do not send size information. In these cases, Teleport will continue retrieving the file until the
number of bytes received exceeds the size limitation, and then terminate the retrieval.
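The size limit behavior described in the notes above can be sketched as follows. This is an illustrative sketch only: the function name, the byte-based limit (the manual's limits are in kilobytes), and the use of an in-memory stream in place of a network connection are all assumptions.

```python
import io


def retrieve_with_limit(stream, max_bytes, declared_size=None, chunk_size=4096):
    """Return the file's bytes, or None if it exceeds max_bytes.

    max_bytes == 0 disables the limit. declared_size is the size the
    server reported, or None when it sent no size information.
    """
    if max_bytes and declared_size is not None and declared_size > max_bytes:
        return None  # size known in advance: rejected before downloading
    received = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(received)  # end of file reached within the limit
        received.extend(chunk)
        if max_bytes and len(received) > max_bytes:
            return None  # size unknown in advance: stop once over the limit
```

For example, retrieve_with_limit(io.BytesIO(b"x" * 100), 50) gives up partway through, while a limit of 0 lets the same file through.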
Retrieval Types: You can tell Teleport to retrieve only files of a specified type and size. When you select
Retrieval by Type and Size, use the Add, Edit, and Delete buttons to set the Retrieval Types list.
User Defined Types: The Add button menu has a preset list of the most common file types found on the
Internet. You can also create your own file type categories, by adding a “User Defined” file specification.
Teleport will display the Edit File Types dialog box. Enter a short description for your new category, and
then a list of DOS-style filename patterns separated by semicolons. There are two wildcard characters used
in DOS-style filename patterns:
An asterisk ( * ) matches zero or more of any characters. For example, the filename pattern *.cgi
matches any filename having the cgi extension, such as test.cgi, but not test.cgid. The filename
pattern bob*.* matches any filename beginning with bob, such as boba.jpg or bob.gif or even bob or
bobble.
A question mark ( ? ) matches any single character. For example, the filename pattern star???.jpg
matches any filename beginning with star and having exactly three more characters, and then the extension jpg.
This would match star005.jpg or starter.jpg, but not star.jpg, starry.jpg, or starting.jpg.
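The wildcard rules above can be approximated with Python's standard fnmatch module. This is a sketch under stated assumptions, not Teleport's own matcher: the function names are hypothetical, matching is assumed to ignore case, and the extra rule for a trailing ".*" is added so that bob*.* also matches names without an extension, as the example above describes.

```python
from fnmatch import fnmatchcase


def matches_pattern(filename, pattern):
    """Match one DOS-style pattern, ignoring case (an assumption).

    Note: fnmatch also treats [ and ] as special characters, which the
    two-wildcard scheme described in the manual does not mention.
    """
    name, pat = filename.lower(), pattern.lower()
    if fnmatchcase(name, pat):
        return True
    # DOS convention: a trailing ".*" also matches names with no extension.
    if pat.endswith(".*") and "." not in name:
        return fnmatchcase(name, pat[:-2])
    return False


def matches_any(filename, pattern_list):
    """pattern_list is semicolon-separated, for example "*.jpg;*.gif"."""
    patterns = [p.strip() for p in pattern_list.split(";") if p.strip()]
    return any(matches_pattern(filename, p) for p in patterns)
```

With this sketch, matches_any("photo.GIF", "*.jpg;*.gif") is true, and matches_pattern("test.cgid", "*.cgi") is false.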
You can also specify file size ranges (in kilobytes). This is often a good method of ensuring that Teleport
only retrieves, for example, high quality graphics files — just set the minimum file size at 16k. Or, if you
want to get only basic embedded graphics but not large image files, set the maximum file size to 8k.
Note: Setting the minimum or maximum size to zero disables that limit. So a minimum of 16, maximum of
zero, will allow Teleport to retrieve any files 16k or larger, no matter how big.
Note: Teleport can only filter files based on size when the remote server transmits the file size. Some
remote servers do not send size information. In these cases, Teleport will continue retrieving the file until
the number of bytes received exceeds the size limitation, and then terminate the retrieval.
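The size range rule, including the convention that zero disables a limit, amounts to a small check. The function name is hypothetical, and treating the maximum as inclusive is an assumption; the manual only states that a minimum of 16 admits files "16k or larger".

```python
def size_allowed(size_kb, min_kb=0, max_kb=0):
    """Return True if a file of size_kb kilobytes passes the size range.

    A limit of zero is treated as "no limit", as described above.
    """
    if min_kb and size_kb < min_kb:
        return False  # smaller than the minimum
    if max_kb and size_kb > max_kb:
        return False  # larger than the maximum (inclusive bound assumed)
    return True
```

With a minimum of 16 and a maximum of zero, size_allowed(5000, 16, 0) is true and size_allowed(10, 16, 0) is false.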
Retrieval Modes:
Retrieve Embedded Files tells the Teleport spider to get files that are embedded on web pages— that
is, graphics or video files that appear as part of the page. You will usually want this option on, but if
you are looking only for large files you may wish to turn it off.
Retrieve Background Files tells the Teleport spider to get embedded graphics and sound files that
appear in the “background” of a web page.
Retrieve Java Applets tells Teleport to retrieve embedded Java applets from web pages when it finds
them. Teleport does its best to locate all of the files an applet will need, but for security reasons,
Teleport does not execute the applets during the project session. Some Java applets will, unfortunately,
seek to load additional files from the server when they are run later (such as when browsing the site
offline). If these applets fail to locate the files they need, they may not run or will run incompletely.
For this reason, some Java applets cannot be completely loaded and viewed offline.
Retrieve Names Only tells the Teleport spider NOT to get the actual file, but only to determine if it
meets your other retrieval specifications, and, if so, to place its name in the File List. When this option
is on, Teleport will appear to work very quickly, because it is not actually retrieving files. You will not
be able to open the files it lists in the File List — but you can use the Retrieve Now command on the
file context menu to retrieve them later. This is usually a good option for rapidly exploring a large area
of the Internet.
Teleport Ultra and Teleport VLX only: The Accept HTML error pages as actual content option tells
Teleport to accept HTML error pages — that is, the content of pages sent with error codes 400 through 599
— as actual content to be read and stored. By default, Teleport ignores the content of error pages, as it
rarely contains valid or useful information, but in some rare cases it may be useful to keep this content in
the project. Default: Off.
Teleport Ultra and Teleport VLX only: The Set file date to match server-reported file date option tells
Teleport to set the local file modification date to match the file modification date reported by the server
(where available). Not all servers report file modification dates, and many report them only for some files
(usually actual files, and not for dynamic web pages and other generated responses). Default: Off.
Localize links for retrieved files causes the links in saved HTML pages to be “localized,” or altered to
point to files in your project folder, if those files are retrieved by Teleport. If you want to offline browse a
site, you should enable this option.
Links for unretrieved files controls how Teleport rewrites links for files that it doesn’t retrieve. There are
three ways for Teleport to handle these links:
Link to a message tells Teleport to rewrite the link as a short message that explains why that file was not
retrieved. The message will also contain a link directly to the Internet, which you may use with your
browser to continue exploration out onto the Internet.
Teleport Ultra and Teleport VLX only: The content and format of these messages can be defined using the
Message Options command under the File menu.
Link to the Internet address for the file tells Teleport to rewrite the link so that it is “externalized” to
point back out to the Internet. This means that Teleport will rewrite the link as, for example,
“http://www.tenmax.com”.
Link to a place where the local file will be stored tells Teleport to “predictively” link to a blank location
where that file will be stored, by Teleport, when the file is retrieved later. Use this option when building a
local website incrementally; you will not need to relink later when new files are retrieved in the project.
Note: Teleport will always “predictively” localize the links for embedded files, such as sound and
graphics files, background graphics, and Java applets. These links are never externalized or rewritten as a
message, because you cannot click on them.
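The three ways of handling links to unretrieved files, together with the special case for embedded files, can be summarized in a short decision function. The mode names, the function name, and the message page filename are hypothetical labels chosen for this sketch; they are not Teleport identifiers.

```python
def rewrite_link(url, local_path, retrieved, mode, embedded=False,
                 message_page="message.htm"):
    """Return the link target to write into a saved page.

    mode is one of "message", "internet", or "predictive" (labels
    invented for this sketch). message_page is a hypothetical filename.
    """
    if retrieved:
        return local_path  # localized link to the retrieved file
    if embedded or mode == "predictive":
        return local_path  # point at where the file will be stored later
    if mode == "internet":
        return url  # "externalized" link back out to the Internet
    if mode == "message":
        return message_page  # page explaining why the file is missing
    raise ValueError(f"unknown mode: {mode}")
```

An unretrieved image embedded in a page is always linked predictively in this sketch, whatever mode is selected, matching the note above.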
Link using 8.3 filenames tells Teleport to link localized files using the old DOS 8.3 filename system.
Teleport will continue to write files using their long filenames; however, the localized links it creates for
the web pages it writes will use the 8.3 filename. This ensures the local website copy is usable on both long
filename and short filename systems.
The Relink all files in the project now button can be used to immediately rewrite the links for all HTML
files in the project folder, using the current linkage system specifications.
Exploration Settings
The Project Properties Exploration Page lists a number of options that allow you to control how the
Teleport spider explores, how it asks for information, and how it updates files. Most of the time you will
not want to alter these settings. Advanced users, however, may find them useful. Click on the Project
Properties button on the toolbar, or select Properties from the Project menu, to bring up the Project
Properties sheet. Select the Exploration tab to display that page.
Explore server-side image maps tells the Teleport spider to “ping” server-side image maps to find their
links. Teleport will ping the entire image map at intervals that you can specify; usually a setting between 20
and 40 pixels is sufficiently accurate to pick up most links from a server-side map. Even though the
individual pings are very fast, pinging a map can take quite a while to complete. When localizing links,
Teleport will convert the server-side map into a client-side map, so that you can offline browse it. Default:
On.
Note: Because pinging a server-side map is a slow process, and because each “ping” query imposes only
a small burden on the server, Teleport will ignore the query delay setting in the project’s Netiquette
properties page, and will instead ping the server with all ten threads, as fast as possible.
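To see why pinging a server-side map takes a while, it helps to count the queries involved. The sketch below assumes the pings fall on a regular grid starting at the top-left corner of the image; the function name and the grid layout are assumptions for illustration.

```python
def ping_points(width, height, interval):
    """Return the (x, y) pixel coordinates to query on a regular grid.

    interval is the spacing between pings in pixels.
    """
    if interval <= 0:
        raise ValueError("interval must be a positive number of pixels")
    return [(x, y)
            for y in range(0, height, interval)
            for x in range(0, width, interval)]
```

Under these assumptions a 400 by 200 pixel map needs 200 pings at a 20-pixel interval and 50 pings at a 40-pixel interval, so a finer interval finds more links at the cost of many more queries.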
Explore frames lets the Teleport spider explore frame links, such as those read by some browsers. Default:
On.
Explore forms tells the Teleport spider to query forms as if it were the user. Because Teleport can’t know
your answer to most forms questions, exploring forms is usually ineffective except for simple forms that are
composed of only push buttons. However, Teleport can handle more complex forms if they contain only
hidden data. Default: On.
Process script and event code instructs Teleport to look for and process standard JavaScript commands
that can open new web pages or load images. When this option is enabled, Teleport will find commands
such as “window.open,” which can cause browsers to open new web pages. Teleport’s script handling
capabilities are, however, limited. The program can only find script links which require no interpretation
(that is, which do not require any processing of the script language to create). More complicated scripts,
such as those that create new links by manipulating string values, cannot be handled. Default: On.
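The distinction between links that need no interpretation and links built by the script can be shown with a simple pattern match. This sketch is not how Teleport parses scripts; it only assumes that a "link requiring no interpretation" is a quoted literal passed directly to window.open, and the names used are hypothetical.

```python
import re

# Matches window.open( followed directly by a quoted literal address.
WINDOW_OPEN = re.compile(r"""window\.open\(\s*(['"])([^'"]+)\1""")


def literal_script_links(script):
    """Return addresses passed to window.open as plain string literals."""
    return [match.group(2) for match in WINDOW_OPEN.finditer(script)]


SCRIPT = (
    "window.open('page2.html', 'w'); "
    "window.open(base + 'x.html'); "
    'window.open("pics/big.jpg")'
)
```

Here literal_script_links(SCRIPT) finds page2.html and pics/big.jpg, but not the address built from base + 'x.html', because recovering that one would require evaluating the script.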
Accept and return cookies lets Teleport accept and return “cookies,” which are small data tags that servers
use to identify and track clients (such as your browser, or Teleport). Teleport won’t store the cookies on
your computer -- it remembers the cookies only as long as it needs to explore the site. Some servers won’t
send data to a client unless the client sends back the cookies, so this option is on by default. Turning it off
gains you a small measure of privacy and efficiency, but it can also mean that Teleport can’t fully explore
some sites.
Launch ___ retrieval threads tells Teleport how many simultaneous data requests it can make. Default:
10. Note that Server Overload Protection™ may sometimes reduce the number of simultaneous data
requests to avoid overloading remote servers. In addition, if Teleport is metering its requests to the same
server (see Project Properties, Netiquette page), often fewer than all 10 threads will be launched. You are
likely to see all ten threads operating simultaneously only when Teleport can query two or more servers at
the same time.
Abort threads that show no activity after ___ seconds causes Teleport to abandon a retrieval attempt if
the remote server does not respond after a certain time. Default: 360 seconds.
Retry denied requests ___ times tells the Teleport spider to re-query a server for files when the server
denies a request because it is too busy. During high traffic times, some servers may issue a curt
“unavailable” response. You can usually get such a file with your browser by repeatedly pressing the
Refresh or Reload button. Teleport does this for you automatically, really fast, until it gets the file or it
exceeds the number you set here. Default: 5.
Retry incomplete requests ___ times tells Teleport to verify each file that it retrieves, and if the file is
incomplete or corrupted, to re-request the file until it is correctly retrieved (or the maximum number of
requests is exceeded). This option is useful when dealing with a slow or stubborn server; such servers,
especially during peak traffic periods, will sometimes drop the Internet connection during large file transfers
(especially graphics files), causing the file that Teleport retrieves to be incomplete or corrupted. Enabling
this option will usually guarantee that every file Teleport loads is accurate and complete, but at the same
time it can significantly slow the spider’s progress, if a server is unresponsive. Default: 5.
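The two retry settings above can be pictured as a single loop with two counters. This is a sketch under stated assumptions: fetch is a hypothetical callable that returns a status code, the body received, and the length the server announced (or None); a "denied" request is assumed to be an HTTP 503 response; and an incomplete file is detected only by comparing lengths.

```python
def fetch_with_retries(fetch, max_denied=5, max_incomplete=5):
    """Call fetch() until it yields a complete body or retries run out.

    fetch is a hypothetical callable returning
    (status, body, expected_length); expected_length may be None.
    Returns the body, or None if either retry limit is exceeded.
    """
    denied = incomplete = 0
    while True:
        status, body, expected = fetch()
        if status == 503:  # server too busy (assumed meaning of "denied")
            denied += 1
            if denied > max_denied:
                return None
            continue
        if expected is not None and len(body) != expected:
            incomplete += 1  # connection dropped before the file finished
            if incomplete > max_incomplete:
                return None
            continue
        return body
```

With the default of 5 retries, a server that is always busy is queried six times in total (the first attempt plus five retries) before the file is given up.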
Updating controls how Teleport updates files that it has previously retrieved. When Teleport updates a
file, it queries the remote server, asking whether the file has changed since the last time Teleport retrieved
it. If the file has changed, Teleport retrieves the new file — automatically overwriting the old one. If the
file hasn’t changed, Teleport does nothing.
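In HTTP, one common way to ask a server whether a file has changed is a conditional request carrying an If-Modified-Since header; a server that has no newer version answers with status 304 (Not Modified) and sends no file. Whether Teleport uses exactly this mechanism is an assumption. The sketch below only builds such a request with Python's standard library and does not contact any server.

```python
from email.utils import formatdate
from urllib.request import Request


def build_update_request(url, last_retrieved_timestamp):
    """Build a conditional request for a previously retrieved file.

    last_retrieved_timestamp is seconds since the epoch, recorded when
    the file was last retrieved (a hypothetical bookkeeping value).
    """
    stamp = formatdate(last_retrieved_timestamp, usegmt=True)
    return Request(url, headers={"If-Modified-Since": stamp})
```

The request can then be sent with urllib.request.urlopen; a 304 response corresponds to the "Teleport does nothing" case described above.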
Update only good/bad/both files causes Teleport to update only files that were correctly retrieved (e.g.,
they were properly linked and the server responded by sending the file); only those that were unavailable
(usually because of a bad link or a server error); or both. You probably don’t want Teleport to retry bad
files unless you believe they were bad because of a temporary server error; most bad links are unlikely to
become good later. Default: Good files only.
Update HTML/embedded/server-side maps/all other files determines what types of files Teleport will
try to update. Most of the time, only the HTML files change on a site, except for those that have “picture of
the week” or “sound of the week” files that change, but keep the same name. However, you can tell
Teleport to update all files, and even to requery server-side image maps. Default: Update only HTML
files.
Netiquette Settings
The Project Properties Netiquette Page lists a number of options that allow you to control how Teleport
behaves on the Internet. Netiquette (the practice of being polite on the Internet) is important not only for
human users, but also for automated agents like Teleport. Teleport is an extremely powerful tool, and can
impose a very large transaction burden on websites because of its querying speed and simultaneous retrieval
threads. Using the default netiquette settings on this page, however, prevents Teleport from overburdening
websites and intruding into areas marked as off-limits for robotic agents. For most projects, you should not
need to alter these settings, except for the Agent Identity setting. Advanced users and webmasters, however,
may find it useful to disable some of these features. Click on the Project Properties button on the toolbar,
or select Properties from the Project menu, to bring up the Project Properties sheet. Select the Netiquette
tab to display this page.
Domain Dispersed Querying™ is an important feature that, when enabled, causes the Teleport spider to
spread simultaneous requests as far apart as possible. This enables each retrieval thread to work to its
fullest potential, because if one server is slow in responding, other retrieval threads will be connecting to
different servers and will not be affected. Domain Dispersed Querying™ can often boost your overall
throughput by 20% to 50%, when Teleport has to visit more than one domain simultaneously. This setting
also prevents Teleport from jamming a single server with requests, when there are other servers that it can
query at the same time. Default: On.
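The idea of spreading requests across servers can be sketched as a round-robin ordering of pending addresses by host. This is an illustration of the general idea only; the function name is hypothetical, and the actual scheduling used by Domain Dispersed Querying is not documented here.

```python
from collections import deque
from urllib.parse import urlparse


def disperse(urls):
    """Reorder addresses so consecutive requests go to different hosts."""
    queues = {}
    for url in urls:
        queues.setdefault(urlparse(url).netloc, deque()).append(url)
    ordered = []
    while queues:
        for host in list(queues):  # one request per host per round
            ordered.append(queues[host].popleft())
            if not queues[host]:
                del queues[host]
    return ordered
```

Given three addresses on one host and one each on two others, the result alternates among the hosts first and only then returns to the remaining addresses on the busiest host.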
Server Overload Protection™ is another important feature that, when enabled, prevents Teleport from
overloading intermediate webservers. Because Teleport can launch multiple retrieval threads, it can
sometimes ask for more data than your modem connection can handle! Server Overload Protection™ slows
down Teleport’s requests whenever a dangerous amount of data is already being transmitted. When this
option is turned off, you risk losing data and receiving corrupted files. Default: On.
Obey the Robot Exclusion Standard tells Teleport to abide by an ad-hoc system set up by the world’s
webmasters, for limiting access by automated agents. The Robot Exclusion Standard is a voluntary system
under which automated agents can allow themselves to be directed away from certain parts of a website.
Webmasters commonly use the Robot Exclusion Standard to keep robots and webspiders out of sensitive
areas and files, such as those that control hit counters, voting, and user feedback. Obeying the Robot
Exclusion Standard has no effect on retrieval speed, and in most cases is invisible to the user. In fact, if this
system is disabled, Teleport can waste time trying to access files that are inaccessible anyway. You should
disable this option only if you have a very good reason for doing so. Default: On.
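Under the Robot Exclusion Standard, a site publishes its rules in a robots.txt file, which an agent reads before requesting other files. Python's standard library includes a parser for this format, so the check can be sketched as follows. The robots.txt content, the agent name ExampleSpider, and the function name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS = """User-agent: *
Disallow: /cgi-bin/
Disallow: /vote/
"""


def allowed_by_robots(robots_txt, agent, url):
    """Return True if the rules in robots_txt let agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With these rules, an ordinary page such as http://example.com/index.html is allowed, while anything under /cgi-bin/ or /vote/ is off-limits to every agent.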
Wait ____ seconds between requests to slow servers (or all servers) directs Teleport to delay between
sequential requests to the same webserver, if it is requesting more than two files at once. Without some
delay, Teleport can quickly overwhelm some sites’ ability to respond, impairing both Teleport’s efficiency
and the site’s performance for other users. Reducing this setting to zero seconds disables it, but will
produce only a marginal increase in performance. When “slow servers” are selected, the delay takes effect
only when the server responds slowly. Otherwise, the delay always takes effect. Note that Teleport may
still issue several requests at the same time, if it can issue them to different servers! Default: 1 second.
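The per-server delay can be sketched as a small bookkeeping class that remembers when each host may next be queried. This is a simplified illustration: the class and method names are hypothetical, the delay is applied to every server, and the "slow servers" option and the "more than two files at once" condition are left out.

```python
from urllib.parse import urlparse


class HostThrottle:
    """Track how long to wait before the next request to each host."""

    def __init__(self, delay_seconds=1.0):
        self.delay = delay_seconds
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait_time(self, url, now):
        """Return the seconds to wait before requesting url at time now."""
        host = urlparse(url).netloc
        start = max(self.next_allowed.get(host, now), now)
        self.next_allowed[host] = start + self.delay
        return start - now
```

A second request to the same host shortly after the first has to wait out the rest of the delay, while a request to a different host at the same moment can go out immediately.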
Use this domain as the “referrer” for all requests causes Teleport to use the domain you enter as the
“referrer” for all HTTP requests it makes. The “referrer” field in HTTP requests tells the server what page
linked to the one that you’re requesting. Normally, Teleport sets this field to the first page that links to the
URL it’s requesting. If you enter something in this box, however, Teleport will always use the value you entered as the
referrer. This can sometimes be useful in getting data from servers that require that you access certain
pages only when linked from another particular server. Default: Not set.
Agent Identity: Browsers, webspiders, and other Internet client programs can identify themselves when
making requests for files. Although it’s considered good netiquette to use your true identity, it can have
unwanted side effects. Sometimes a remote server will look at the agent identity to determine what type of
data to send it — or whether to send it data at all. For example, some websites will not send framed
versions of a site to older Microsoft Internet Explorer browsers, which were not capable of reading framed
sites. You may wish to try using a different agent identity if a site appears to be unresponsive to Teleport, yet
works with your browser.
Anonymous sends no identity at all.
Teleport uses the string “Teleport TYPE/VERSION” where TYPE is Pro, Ultra, or VLX, and
VERSION is the current version number.
Impersonate Microsoft will cause Teleport to identify itself as a Microsoft Internet Explorer browser.
Impersonate Firefox will cause Teleport to identify itself as a Mozilla Firefox browser.
Custom allows you to specify your own identity string. You can use this option for testing or for
compatibility with other browsers, agents, or webservers.
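To see what the referrer and agent identity settings correspond to at the HTTP level, here is a minimal Python sketch that attaches both values to a request. In HTTP the referrer header is spelled "Referer" and the agent identity travels in the "User-Agent" header. The function name build_request and the example values are hypothetical; this is not how Teleport itself is implemented.

```python
import urllib.request


def build_request(url, referrer=None, agent=None):
    """Build an HTTP request carrying optional Referer and User-Agent headers.

    Hypothetical helper for illustration only. Nothing is sent over the
    network until the request is passed to urllib.request.urlopen().
    """
    headers = {}
    if referrer:
        headers["Referer"] = referrer  # HTTP's own spelling of "referrer"
    if agent:
        headers["User-Agent"] = agent  # the agent identity string
    # Note: if no agent is given, urllib's default opener still adds its
    # own User-Agent when the request is actually sent.
    return urllib.request.Request(url, headers=headers)
```

For example, build_request("http://www.example.com/page.htm", referrer="http://www.example.com/", agent="ExampleAgent/1.0") produces a request that claims to have been linked from the example.com home page and identifies itself with a custom string.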
Project Automation
The Project Properties Advanced Page has a number of features that enable you, in conjunction with an
external scheduling program such as the Windows System Agent or Task Scheduler, to set up your project
to start automatically, run for a certain time, and quit (terminate Teleport) when finished. You can also set
your project to run continuously, which can be useful for intermittently polling a site for new data. And, if
you set up Teleport automatically to connect and disconnect as necessary, you can schedule projects for
complete unattended operation.
You can also set up Teleport to run from a command-line or batch file, specifying many of these project
options as command-line parameters.
Automatically begin running on open will cause Teleport to begin running the project as soon as it is
opened. This feature is useful for scheduling a Teleport session using a program like the Windows System
Agent or Task Scheduler. Just specify the name of the project on the System Agent, Task Scheduler,
Windows shortcut, or DOS command line that runs Teleport, and the project will begin running
automatically when the program starts.
Run for no more than X minutes will cause Teleport to abort the project (as if you had pressed the Abort
button) after it has run for a certain number of minutes. Note that this option will stop running the project
even if it is set to Run continuously (see below).
Run continuously tells Teleport to continuously run the project. After the project is completed, Teleport
will wait for a specified interval, and then begin running again. On subsequent runs, it’s likely that Teleport
will be merely updating files, but if it finds new files in the process, it will retrieve them as well.
Exit program when project stops running terminates the Teleport program when the project is
completed, or when the Run for no more than X minutes time limit has been reached, whichever occurs
first. If you have no time limit, and the project is set to Run continuously, enabling this option has no
effect.
Note: the Autoexit feature terminates Teleport with no questions asked. Unless you have also enabled the
Autosave feature, Teleport will not save project information it has accumulated while running. If you want
your project state to remain up-to-date and you’ve enabled the Autoexit feature, be sure to enable
Autosaving as well.
Teleport Ultra and Teleport VLX only: The Synchronize local site option, when enabled, will cause
Teleport to delete “dead” files from the project after exploration is complete. “Dead” files are those that
Teleport retrieved in an earlier project session, but which are now not available on the server. (Remember
that Teleport’s default mode is to “accumulate” files, meaning that it retains files even after they’re no
longer on the server. Enabling synchronization changes this behavior.) The dead files will be deleted from
the project folder, and links to them will be rewritten in the manner that Teleport normally rewrites dead
links (controlled by the project’s Browsing/Mirroring properties).
Teleport Ultra and Teleport VLX only: The Borrow cookies option instructs Teleport to use Internet
Explorer’s cookies when it communicates with webservers. (This feature is only compatible with Internet
Explorer.) To copy a site that requires form-based login or some other complex authentication procedure,
enable this option, and then log in or otherwise access the website with Internet Explorer. Once you’ve
completed the login using Internet Explorer, you can start the project. (You can sometimes close Internet
Explorer, but often it is best to leave it running, because some cookies will automatically expire when the
browser is closed.) Teleport will pick up whatever cookies it needs from Internet Explorer, and use them
when it crawls the site.
Note: For most sites, you will have to log in first with Internet Explorer each time you want to crawl them.
This is because the cookies Internet Explorer receives will expire. For this reason, projects that require
using Internet Explorer cookies are unlikely to work if scheduled for automatic execution.
Teleport Ultra and Teleport VLX only: The Custom HTTP headers field can be used to alter the way that
Teleport crawls the website. Put additional HTTP headers in this field, one per line. A common use for
this field is adding cookies that you know to be useful for crawling the site. For example, you could add the
field “Cookie: SHOPPER%5FID=236570453” if you know that that’s the cookie that identifies you when
accessing a shopping website. (This is just an example, don’t use it in a real project.) You can also inject
other fields that may be necessary for communicating with a proxy, or for testing webserver capabilities.
You may use the following macros in any additional headers: $URL will be replaced by the requested
URL; $FILE will be replaced by the filename part of the requested URL (including query parameters); and
$HOST will be replaced by the domain name (host name) of the requested URL. So, for example, to set the
Referer field to the current URL, you can use the header “Referer: $URL”.
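The macro substitution described above can be sketched in a few lines of Python. The function name expand_header_macros is hypothetical, and the exact way Teleport splits a URL into its filename part is an assumption here (the last path segment plus any query parameters).

```python
import re
from urllib.parse import urlsplit

# The three macros named in this manual.
_MACRO = re.compile(r"\$(URL|FILE|HOST)")


def expand_header_macros(header_text, url):
    """Expand $URL, $FILE, and $HOST in custom header lines, one header per line.

    Hypothetical helper for illustration. Assumes $FILE is the last path
    segment of the URL plus its query string, if any.
    """
    parts = urlsplit(url)
    filename = parts.path.rsplit("/", 1)[-1]
    if parts.query:
        filename += "?" + parts.query
    values = {"URL": url, "FILE": filename, "HOST": parts.hostname or ""}
    expanded = []
    for line in header_text.splitlines():
        line = line.strip()
        if line:  # skip blank lines in the field
            # Single-pass substitution, so inserted text is never re-expanded.
            expanded.append(_MACRO.sub(lambda m: values[m.group(1)], line))
    return expanded
```

With the header text "Referer: $URL" and the requested URL http://www.example.com/shop/item.asp?id=7, the sketch yields the single header "Referer: http://www.example.com/shop/item.asp?id=7".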
Like the main application toolbar, the Thread Bar can be moved, resized, floated, and docked along any
edge of the Teleport window.
Within the path of links to other servers allows Teleport to go beyond the path or domain of the
starting address. This setting is most useful for "link sites," web pages that contain lists of links to
other web pages with similar content. In this mode, Teleport will follow links to any page linked from
within the starting domain. Once inside an external domain, Teleport will change into its "stay within
path" mode and explore all pages within the path of the external link. When it has finished exploring
the external site, it will return to the starting domain and continue following links, possibly into other
external domains.
Up to ___ links away from any page not on the starting server also allows Teleport to go beyond
the domain of the starting address, but keeps it within a specified number of levels of the external link.
For example, if the Starting Address is http://www.tenmax.com and Teleport locates a link to
http://java.sun.com/somepage.htm, it will explore somepage.htm on java.sun.com, and also up to the
specified number of links away from that page (but never straying outside java.sun.com).
Be careful! When following external links the Teleport spider can travel very far. It can also stay out
a long time, and bring back megabytes, even gigabytes of files. Unless the number of external
domains linked from the starting domain is small, projects that allow external domain exploration will
usually require more than a day to complete.
Teleport Ultra and Teleport VLX only: The final boundary option, anywhere else, but only with
URLs matching the Inclusions listed below, prevents Teleport from getting any pages, even pages on
the starting server, unless they match at least one of the Inclusions expressions specified for the starting
address. This option is often useful for sites that are organized by id numbers or strings instead of
directory names, such as forum sites and blogs — using this option you can tell Teleport to explore
only URLs containing a particular forum id number specified as an Inclusion.
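As a rough picture of how the “Up to ___ links away” boundary limits exploration, the following Python sketch walks an in-memory link graph. It is a simplified model under stated assumptions, not Teleport's algorithm: the function name explore and the dictionary-based graph are hypothetical, pages on the starting server are treated as always in-bounds, the first page reached on an external server counts as zero links away, and links from one external server to a different external server are never followed.

```python
from collections import deque
from urllib.parse import urlsplit


def host_of(url):
    """Return the host name part of a URL."""
    return urlsplit(url).hostname


def explore(links, start_url, max_links_away):
    """Return {url: links travelled since leaving the starting server}.

    links maps each URL to the list of URLs it links to (a stand-in for
    real page retrieval). Hypothetical model for illustration only.
    """
    start_host = host_of(start_url)
    best = {start_url: 0}  # smallest known external depth for each URL
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        on_start = host_of(url) == start_host
        for target in links.get(url, []):
            target_host = host_of(target)
            if target_host == start_host:
                depth = 0  # assumption: the starting server is always in-bounds
            elif on_start:
                depth = 0  # first page reached on an external server
            elif target_host == host_of(url):
                depth = best[url] + 1  # one link deeper into the same external server
            else:
                continue  # never stray from one external server onto another
            if depth > max_links_away:
                continue
            if target not in best or depth < best[target]:
                best[target] = depth
                queue.append(target)
    return best
```

Even in this small model, each external link opens up a whole neighborhood of pages on another server, which is why projects that allow external exploration can grow so quickly.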
The Authentication block allows you to specify any account name and password that may be required to
access this site.
Note that these fields provide authentication for the remote server only. If you use a proxy and your proxy
server requires authentication, use the Proxy Settings dialog to specify your proxy account and password.
Teleport Ultra and Teleport VLX only: The Inclusions field allows you to specify one or more additional
areas of the website (or even of other websites) that are to be considered “in-bounds” when Teleport crawls
the site. Those areas will be crawled as long as they are within the depth setting you specify (that is, so
long as they are N or fewer links away from your starting address, where N is the depth setting for that
address). You can specify inclusions using wildcard expressions that match the URLs you want to include.
For example, to start on “www.server.com/news/index.htm” and crawl everything in the /news/ folder and
also include any URLs in any server’s /sports/ subdirectory, you could enter */sports/* in the inclusions
field. If you’re worried about staying only on the main server, then enter http://www.server.com*/sports/*.
You can enter more than one inclusion in the inclusions field; just separate them with semicolons. You can
use inclusions advantageously when a site has more than one server and you want to copy everything on
related servers. For example, if you want to make a copy of www.server.com and also any related servers
(say, shop.server.com and news.server.com), you could enter *.server.com in the inclusions field. Be sure
to make the depth setting high enough, though, to reach far enough onto each server.
Note: You can also use “range expressions” in an inclusion, to match a class of characters. For example,
you can use [a-z] to match any letter from a to z; [0-6] to match the digits zero to six (but not 7, 8, or 9);
or [aeiou] to match a, e, i, o, or u. You can also use “negated ranges” to match characters NOT in the
range: [^a-z] will match any character except a letter from a to z, for example. Finally, you can use \xNN,
where NN are two hexadecimal digits, to match any character by its ordinal number. For example, you
can match a literal question mark (normally a reserved wildcard character) with the expression \x3F.
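If you want to experiment with inclusion expressions before running a project, the following Python sketch approximates the matching rules described above by translating each expression into a regular expression. It rests on several assumptions that this manual does not confirm: that * matches any run of characters, that ? matches exactly one character, and that an expression may match anywhere within the full URL (which is what makes the *.server.com example work). The function names are hypothetical.

```python
import re


def inclusion_to_regex(pattern):
    """Translate one inclusion expression into a compiled regular expression.

    Hypothetical helper for illustration; Teleport's own matcher may differ.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "*":
            out.append(".*")  # assumed: any run of characters
        elif ch == "?":
            out.append(".")  # assumed: exactly one character
        elif ch == "[":
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end] if end != -1 else ""
            negated = body.startswith("^")
            members = body[1:] if negated else body
            if members:
                # Range expression such as [a-z], [aeiou], or [^a-z].
                out.append("[" + ("^" if negated else "") + members.replace("\\", "\\\\") + "]")
                i = end
            else:
                out.append(re.escape(ch))  # unmatched or empty bracket: literal
        elif (ch == "\\" and pattern[i + 1:i + 2] == "x"
              and re.fullmatch(r"[0-9A-Fa-f]{2}", pattern[i + 2:i + 4])):
            # \xNN: match the character with that hexadecimal ordinal literally.
            out.append(re.escape(chr(int(pattern[i + 2:i + 4], 16))))
            i += 3
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))


def matches_inclusions(url, inclusions):
    """Return True if url matches any of the semicolon-separated inclusions."""
    patterns = [p.strip() for p in inclusions.split(";") if p.strip()]
    return any(inclusion_to_regex(p).search(url) for p in patterns)
```

For instance, matches_inclusions(url, "*/sports/*;*.server.com") tests a URL against both of the example expressions from this section at once.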
Teleport Ultra and Teleport VLX only: The Aliases field allows you to specify additional names by which
the primary server may be known. It is increasingly common for a website to go by both “www” and “non-
www” names — for example, server.com and www.server.com may point to the same website. Some older
“server farm” sites will also use variants such as “www1.server.com” and “www2.server.com”. If you know
these additional names, you can specify them in the aliases field. Teleport will then translate any URLs it
sees that point to the alias names, so that they point to the server for the starting address. This helps you to
copy more of the site (because if Teleport sees a link to “server.com” leading from a page on
“www.server.com” it will know that the link should be followed), and it has the added benefit that URLs are
more effectively de-duplicated, resulting in a smaller copied site size, a smaller project database, and faster
performance.
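The effect of aliases on de-duplication can be illustrated with a small Python sketch that rewrites any URL on an alias host so that it points at the primary server. The function name canonicalize is hypothetical, and details such as case-insensitive host comparison are assumptions rather than documented Teleport behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url, primary_host, aliases):
    """Rewrite a URL on an alias host so that it points at the primary host.

    Hypothetical helper for illustration. Host names are compared
    case-insensitively; URLs on other hosts are returned unchanged.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in {alias.lower() for alias in aliases}:
        netloc = primary_host
        if parts.port:
            netloc += f":{parts.port}"  # keep an explicit port, if any
        parts = parts._replace(netloc=netloc)
    return urlunsplit(parts)
```

Once every URL is passed through such a translation, links to server.com/a.htm and www.server.com/a.htm collapse into a single entry, which is the de-duplication benefit described above.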
where the first quoted string is the complete path to the program, and the second quoted string is the
complete path to the project you wish to run. Be sure to enclose the paths within quotes, if they contain
spaces.
You may also wish to use command-line switches to force Teleport to automatically connect to the Internet,
disconnect, and exit, as necessary. Alternatively, you can set up the project’s own Advanced Settings so
that it will run automatically when opened and/or terminate Teleport when it finishes.
Automatic Connection/Disconnection
Teleport can automatically connect to and disconnect from the Internet, as required to run and complete the
Teleport projects you create. You can set up the automatic connect, disconnect, and reconnect services
using the Connections… command under the File menu.
When you open the Connections dialog box, Teleport presents you with the following options:
Use your default Internet connection allows Teleport to request whatever default networking or dial-up
networking service you have installed for accessing the Internet. Under Windows, this is most often your
default Dial-Up Networking connection, which you can set up using the Internet device in the Windows
Control Panel.
Use the following connection ___ with these parameters tells Teleport to connect using any of the
connection services it locates in your operating system. Choose which connection service you want to use,
then set any of the following parameters, as necessary:
Account and password are the account and password to use when logging in. If you leave these boxes
blank, Teleport will use the default account and password set up for that connection service.
Dial number is the telephone number Teleport will dial to access this service. If you leave this box
blank, Teleport will dial using whatever number is specified for that connection service.
Redial tries specifies the number of times Teleport will try to redial the service, if it can’t connect.
There is no default for this parameter, so be sure to enter something other than zero if you want
Teleport automatically to redial until connected.
Disconnect all other open connections… tells Teleport to hang up or disconnect any other connection
services before dialing this one. Use this parameter if you commonly use more than one connection
service, but have only one telephone line with which to dial out.
If the connection is closed prematurely, reconnect automatically allows Teleport to reconnect
automatically if the connection is shut down while a project is running.
You may also specify automatic disconnection services for Teleport.
Don’t disconnect disables automatic disconnection.
Disconnect whenever a project is completed or stopped tells Teleport automatically to hang up the open
connection whenever the project finishes, or is terminated using the Stop or Abort commands.
Disconnect when Teleport terminates tells Teleport automatically to hang up the open connection
whenever the program stops running — whether automatically, such as when your project is set up to
terminate the program on completion; or when you close the program or select the Exit command.
Command-line Arguments
Teleport can accept the name of any project file (.tpp file) as a command-line argument, which you may
specify on the DOS command line, in a batch file, or as the optional arguments in a Windows shortcut.
Teleport will automatically open the specified project file, and if that project’s properties require it to start
running, Teleport will begin running the project automatically.
Teleport has several command-line switches that you can specify, in addition to a project filename, to
control how Teleport will react when the project is opened. The switches are:
/r (run normally): this tells Teleport to begin running the project file as soon as it is opened. If the
project database contains unretrieved files, they will be retrieved automatically. If none of the files in
the project database are unretrieved, Teleport will switch into Update Mode and attempt to update all
eligible files in the project.
/c (run complete refresh): this tells Teleport to Clear the Project Database just before it runs the
project again. Use this to run your project over and over again from scratch. Note, however, that when
Teleport clears the project database, it will also delete all files in the project folder. If you are using a
batch file or other automated process, you can remove the project files before Teleport deletes them.
/u (run update only): this tells Teleport to begin running the project in Update Mode automatically.
Teleport will not queue any new files for retrieval, but will only update eligible files in the project
database.
/n (do not run): this command is useful if your project’s properties require it to Autorun on open, but
you want to open the project temporarily, perhaps to change its configuration, and don’t want it to run.
This command negates the Autorun property of any opened project.
/a<connection_name>: if you specify /a on the command line, followed immediately by the full name
of a Dial-Up Networking connections entry, Teleport will autoconnect to the Internet using that
connection when the project is started. This is useful for running the project from a command-line,
batch file, or other program, when you don’t want to specify a default connection setup within Teleport
itself.
Note: if the connection name has spaces in it, you must enclose the entire switch in quotes, like this:
"/aConcentric Boston"
/d (autodisconnect): tells Teleport to automatically disconnect from the Internet when the project is
finished running.
/e (autoexit): tells Teleport to exit as soon as the project is finished running.
/t<time_limit_in_minutes>: sets a project time limit, in minutes; if the project session lasts longer
than this limit, the project will stop automatically.
/l (relink project): This special command-line switch causes Teleport to perform the “Relink all files”
command on the specified project, and then exit immediately.
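If you launch Teleport from a script rather than a batch file, you might assemble the command line along the following lines. This Python sketch only builds the argument list from the switches documented above; the function name, the example paths, and the ordering of switches after the project file are assumptions, and the /t switch is assumed to take its number immediately after the letter, as /a does with the connection name.

```python
import subprocess

# Switches documented in this section; the keys are hypothetical labels.
_MODE_SWITCHES = {
    "run": "/r",       # run normally
    "refresh": "/c",   # clear the project database, then run
    "update": "/u",    # run in Update Mode only
    "no_run": "/n",    # open without running
    "relink": "/l",    # relink all files, then exit
}


def build_teleport_command(exe_path, project_path, mode=None, connection=None,
                           disconnect=False, exit_when_done=False, time_limit=None):
    """Return the argument list for launching Teleport with a project file.

    Hypothetical helper for illustration; it does not start the program.
    """
    args = [exe_path, project_path]
    if mode is not None:
        args.append(_MODE_SWITCHES[mode])
    if connection:
        args.append("/a" + connection)  # the name follows /a immediately
    if disconnect:
        args.append("/d")
    if exit_when_done:
        args.append("/e")
    if time_limit is not None:
        args.append(f"/t{int(time_limit)}")  # limit in minutes (assumed format)
    return args


def as_command_line(args):
    """Render the argument list as one Windows-style command line string."""
    return subprocess.list2cmdline(args)
```

Passing the list to subprocess.run() starts the program; as_command_line() shows the equivalent text for a batch file or shortcut and quotes any argument that contains spaces, so a connection named Concentric Boston comes out as "/aConcentric Boston", matching the note above.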
Proxy Settings
Teleport is fully compatible with proxy servers, firewalls, and most forms of corporate intranets.
If your Internet/Intranet system requires that you connect to a proxy server, select the Proxy Settings
command from the File menu to specify your proxy parameters. If you do not know your proxy settings,
contact your system administrator or service provider.
Most proxies do not require authentication. However, if your proxy server requires an account and
password, enter them in the Proxy Settings dialog box. Teleport will automatically handle proxy
authentication when it runs your project.
Running a Project
You can run a project by selecting the Start command from the Project menu, or pressing the Start button
(the solid arrow button) on the toolbar.
When the Start button is pressed, Teleport scans the project database, looking for files that match your
current project specifications. When Teleport finds such files, it queues them: for retrieval, if they
have never been retrieved, or for updating, if they have been retrieved before. The program then connects
to the Internet, and begins retrieving and updating files. As it explores and encounters links to new files, it
similarly checks them against the project specifications, and if they are also required, it queues the new files
for retrieval, as well.
Note that there are additional limitations that you can set on which files Teleport will update. See the
Project Properties, Exploration page for instructions for setting up your project’s update specifications. By
default, every file in the project that meets your retrieval specifications (except server-side image maps) will
be updated, but you can restrict updates to, for example, only html files, if you wish.
When updating files, Teleport first queries the remote server, asking if the file has changed since the last
time Teleport retrieved it. If the file has changed, Teleport retrieves the new file — and overwrites the old
version in your project folder. It then checks the file for links to more files. If Teleport finds new links to
files that should be retrieved, it queues these for retrieval, as well.
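The "has it changed?" query described above resembles a standard HTTP conditional request. This manual does not say which mechanism Teleport uses, so the following Python sketch is an assumption: it builds a request with an If-Modified-Since header, to which a server answers 304 (Not Modified) when the file is unchanged. The function names are hypothetical.

```python
import urllib.request
from email.utils import formatdate


def build_update_request(url, last_retrieved_epoch):
    """Build a conditional request for a file last retrieved at the given time.

    last_retrieved_epoch is seconds since 1970-01-01 UTC. Hypothetical
    helper for illustration; nothing is sent until the request is opened.
    """
    stamp = formatdate(last_retrieved_epoch, usegmt=True)  # HTTP date format
    return urllib.request.Request(url, headers={"If-Modified-Since": stamp})


def file_changed(status_code):
    """Interpret the server's reply: 304 means the local copy is still current."""
    return status_code != 304
```

Only when the server reports a change does the full file need to be transferred, which is why an update run is usually much quicker than the original retrieval.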
Note that Teleport only attempts to retrieve (or update) those files that meet your current Project Properties
retrieval settings and exploration rules. In other words, if in a previous project session, you directed
Teleport to retrieve graphics and text files; and then you change your retrieval settings to retrieve only text
files, running the project will retrieve and update only text files — because graphics files no longer meet the
project’s retrieval settings.
If you’d like to just check whether the file has been updated since Teleport last retrieved it, use the Update
Now command, instead. Teleport will check the file, and retrieve it only if the server reports that the file
has changed.
When using either the Retrieve Now or Update Now commands, Teleport will automatically begin
retrieving any new files that it finds linked from the pages it is retrieving.
Note: The Retrieve Now and Update Now commands will automatically overwrite existing files.
Finding Files
When in List mode, the File List displays files as a multi-columned list of names, in the order in which the
files were retrieved. If you want to search for a particular file in the File List, you can show file details by
clicking the Details View button on the toolbar. Once in Details view, you can press any of the column
headers in the list to sort the list by that column. Pressing the same column header a second time will
reverse the order of the sort.
Glossary
Domain: the file area contained within a server or a virtual server. The domain of the Internet address
"www.microsoft.com/home/software/" is "www.microsoft.com." Any Internet address beginning with
"www.microsoft.com" is located within that domain.
External domain: any domain that is not the same as the domain of the starting address.
Embedded graphics (also inline graphics): images (and movies) that appear within a web page, as opposed to those
that are linked from a web page (but do not appear on the page when you view it with a browser).
Toolbar: a row of buttons for frequently used commands, usually located at the top of the application window. The
toolbar is movable and dockable. To move it, grab it on the gray area between and around the buttons and drag it. To
dock it, drag it to any window edge until its shape changes to match the side.
Spider (also robot or agent): a program that travels unattended through a network. Most spiders perform a function at
each network location, such as verifying or reporting the location's contents.
Retrieval thread: one of the small subprograms launched by Teleport to retrieve files. Teleport can launch up to ten
of these mini-programs; each behaves independently as a running application, and terminates after it has retrieved (and
sometimes parsed) the file it was sent after. You can view the state of each thread using the Thread Bar.
Starting address: The Internet address at which the Teleport spider will begin its search. The starting address is
entered by the user. Usually, you will enter the address of an HTML page, but you can also enter the address of
programs, zip files, or anything else. All pages linked from a starting address, either directly or indirectly, "belong" to
the page and are indented below it in the Project Map.
Gateway Page: The first page that links into an external domain. The gateway page controls Teleport's access into the
external domain. If you disable a gateway page, all pages in the domain it controls are also disabled.
Project Map: the tree view showing web pages that the Teleport spider has visited for the project, displayed in the
left-hand pane of the project window. When you select pages in the Project Map, a list of files that were retrieved for
that page will appear in the File List. If the web page you select is contracted and it is a starting address or a gateway
page, you will see not only the files that were retrieved for that page, but also the files retrieved for any pages linked
from that page.
File List: the list view showing retrieved files for the project, displayed in the right-hand pane of the project window.
The File List can be switched between a list of names, and a details view, using the buttons on the toolbar.
Context menu: the short popup menu that appears when you press the right mouse button (left button for left-handed
users) on a page in the Project Map, or a file in the File List. The context menu has several file-specific commands.
When you’ve selected HTML files that are displayed as Folder icons, the context menu will display additional
commands (Disable, Enable, and Properties) that apply only to HTML files that “contain” other files.
Java applet: a small executable program that can be embedded in web pages. Java applets can run “inside” your
browser, and can produce interesting text or graphics effects, animations, or even perform calculations and display
data.
Proxy server (or firewall): a computer that acts as an intermediary between your computer and the Internet. Proxy
servers typically control access from within a company’s network, or “intranet”, to the Internet — and vice versa.
Server-Side Image Map: a special type of image map that is handled by a program on the remote computer. The
server-side image map does not publish its links — instead, it keeps them hidden, and only releases them, one at a time,
when the client program (usually your browser) requests them (usually when you click on a part of the image). The
Teleport spider handles server-side image maps through a brute-force "pinging" technique. When it encounters such a
map, it sets up a grid across the map and "pings" the map, asking the server to release its link for each grid location.
You can enable this advanced feature on the Project Properties Exploration Page.
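The grid "pinging" technique can be pictured with a short Python sketch. Browsers conventionally query a server-side image map by appending "?x,y" (the clicked pixel coordinates) to the map's URL; whether Teleport uses exactly this form, and what grid spacing it chooses, is not stated in this manual, so the function name and the step parameter below are assumptions.

```python
def image_map_probe_urls(map_url, width, height, step):
    """List the URLs that probe an image map on a regular grid.

    One probe is placed at the center of each step-by-step cell of a
    width-by-height image. Hypothetical helper for illustration.
    """
    urls = []
    for y in range(step // 2, height, step):
        for x in range(step // 2, width, step):
            urls.append(f"{map_url}?{x},{y}")  # conventional image-map query
    return urls
```

A finer grid (smaller step) finds more of the map's hidden links but costs more requests, since every grid point is a separate query to the server.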
The Trim dead files command (in the Project menu) removes “dead” files from the project folder, rewrites
any links to them, and removes them from the project database. (This procedure is performed automatically
at the end of a project session if the Synchronize project property (see above) is enabled.)
The Export home page command creates a simple HTML “index” page, named after the project (but with
a .htm extension) and in the same place as the project file and the project folder. For projects with just one
starting address, this page will contain a redirection to the project’s home page, within the project folder.
For projects with multiple starting addresses, this page will contain a list of links to the home pages for each
starting address.
The Message Options command (under the File menu) can be used to specify how messages are written for
links to files that are not retrieved (when you’ve selected to write messages for unretrieved links, on the
Browsing/Mirroring properties page). This command brings up the Message Options dialog box, which lets
you select from a variety of methods for writing these links:
Option 1 will write Javascript messages for each link to a file that Teleport did not retrieve. The
message for each link will give the full address of the file, and the reason why the file was not retrieved.
The message will be of the form, “This file was not retrieved by <product name>, because <reason>,”
where <product name> is the name you specify as the parameter for Option 1.
Option 2 will rewrite all links to files that Teleport does not retrieve, so that they point to a file located
at some absolute address. This page can be located on a local hard drive, on a CD-ROM, or on another
webserver. This table lists some versions of absolute links that you might use:
Link                                Effect
http://www.mysite.com/missing.htm   Will link to “missing.htm” on the www.mysite.com server.
\missing.htm                        Will link to “missing.htm” on the root of the server, CD-ROM,
                                    or hard drive where the HTML files are stored.
\special\missing.htm                Will link to “missing.htm” in the “special” directory, located on
                                    the root of the server, CDROM, or hard drive where the HTML
                                    files are stored.
c:\files\missing.htm                Will link to “missing.htm” on the c: drive. Use this type of link
                                    only if users will browse the local website copy only from their c:
                                    drive.
Option 3 will rewrite all links to files that Teleport does not retrieve, so that they point to a file that
you will place in the root of the project folder. Teleport will rewrite these links using a relative address
that always points to this file, so that users will always see this message, no matter where they browse
from, and no matter where the local website copy is stored. This option is the most robust because it
guarantees that your message is always properly linked. Its only restriction is that it requires you to
store your message page inside the project folder. Be careful to restore the message page linked with
Option 3 if you use the “Clear Project Database” command from within Teleport. That command
deletes all files in the project folder, including your message page. After clearing the database and
re-running the project, you will need to replace your custom message page inside the project folder.
Option 4 will rewrite all links to files that Teleport does not retrieve, so that they point to the message
that you specify. The user’s browser will display the message in a JavaScript confirmation dialog box,
so that the user can either open the file from the remote server, or can cancel and return to the page he
or she is browsing. You can use the string “$url” (without the quotes) to represent the URL of the link,
in the message.
Contact Us
Online help and FAQ information are available 24 hours a day, 7 days a week at the Tennyson Maxwell
Information Systems, Inc., website—
http://www.tenmax.com
We welcome your comments and suggestions! Please feel free to contact us in any of the following ways:
Email: [email protected]
Internet: http://www.tenmax.com
Mail: Tennyson Maxwell Information Systems, Inc.
PO Box 34
Garrison, NY 10524
Phone: +1 845-214-0633
Fax: +1 815-461-9518