Rfs Concepts
All rights reserved. No part of this publication may be reproduced, translated, stored in
any electronic retrieval system, or transmitted in any form or by any means, electronic,
mechanical, photocopying, recording, or otherwise, without the prior written permission
of FileTek, Inc.
Copyright 2001-2006 FileTek, Inc. As an Unpublished Licensed Work.
Publication Number: 900159 Rev. G
Information in this document is subject to change without notice and does not represent
a commitment on the part of FileTek, Inc. Further, FileTek, Inc. reserves the right to
supplement the document with information not available at the time of creation of the
document. FILETEK, INC. PROVIDES THIS PUBLICATION AS IS WITHOUT
WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT
NOT LIMITED TO THE IMPLIED WARRANTIES OR CONDITIONS OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, AND CANNOT
WARRANT THE RESULTS YOU MAY OBTAIN USING THE DOCUMENT. IN NO
EVENT SHALL FILETEK, INC. BE LIABLE FOR ANY LOSS OF PROFITS, LOSS OF
BUSINESS, LOSS OF USE OR DATA, INTERRUPTION OF BUSINESS, OR FOR
INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY KIND,
EVEN IF FILETEK, INC. HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES ARISING FROM ANY DEFECT OR ERROR IN THIS PUBLICATION. Some
states or jurisdictions do not allow disclaimer of express or implied warranties in certain
transactions; therefore, this statement may not apply to you.
FileTek and StorHouse are registered U.S. trademarks of FileTek, Inc. VRAM is a U.S.
trademark of FileTek, Inc. All other brand or product names are trademarks or registered
trademarks of their respective owners.
Documentation for FileTek's StorHouse product. Protected by the following U.S. Patents:
4,864,572; 5,247,660; 5,727,197; 6,049,804.
StorHouse/RFS Concepts
Contents
What is StorHouse/RFS?
What's new in StorHouse/RFS release 4.0?
    Enhanced security model
    Simplified StorHouse table creation
    Enhanced collection process
    Data integrity checks
    User file information reporting
    Dynamic timeout support
    Safety directory for user file changes
    StorHouse/RFS configuration file reorganization
    System file date support
    StorHouse/RFS utility changes
What is StorHouse?
    StorHouse/SM
    StorHouse/RM
    StorHouse/Control Center
What hardware is used by StorHouse/RFS?
What is StorHouse/RFS?
StorHouse/Relational File System (RFS) is the FileTek file system interface
that enables organizations to store a virtually unlimited number of files
on a StorHouse system. With StorHouse/RFS, your enterprise can:
errors. The CRC calculated when the file data is first written is retained
throughout the life of the data and rechecked when the archive data is
retrieved. No data is ever written to disk or retrieved from disk without
the computation and checking of a CRC.
User file information reporting. StorHouse/RFS provides a command
that reports metadata or metadata and media location for a user file. You
can submit the command interactively or programmatically. Metadata
includes staging and local collection information, primary and mirror
StorHouse system information, and the status of collections written to
StorHouse.
You can run the rfsmaint utility from any system that can access the
StorHouse/RFS configuration file or a copy of it and that has ODBC
and TCP/IP connectivity to StorHouse.
What is StorHouse?
StorHouse is FileTek's data storage and management system for capturing,
storing, moving, and accessing gigabytes (GB) to petabytes of relational
and non-relational enterprise data. StorHouse technology combines
industry-leading, scalable storage devices and Open System processors
with specialized storage management and relational database
management system (RDBMS) software components. StorHouse is a
comprehensive system that manages a complete storage hierarchy of the
following:
Shelf storage
StorHouse/SM
StorHouse/SM, the StorHouse storage management software component,
controls the hierarchy of storage devices and provides system-managed
storage that optimizes media usage, response time, and storage costs for
each application. StorHouse/SM is also responsible for critical system
management tasks, such as data migration, backup, and recovery. All
storage management features and benefits apply to the user files you store
through StorHouse/RFS.
StorHouse/RM
StorHouse/RM, the StorHouse RDBMS software component, works in
conjunction with StorHouse/SM to specifically administer the storage,
access, and movement of relational data. StorHouse/RM provides direct
row-level Structured Query Language (SQL) access to high volumes of
detail data on any storage layer, including tape, in the StorHouse storage
hierarchy.
StorHouse/RFS uses StorHouse/RM to store file locator data and to
provide a relational search capability for archived files. StorHouse/RFS
also supports the use of ODBC-compliant databases to provide this
relational capability. StorHouse/RFS release 4.0 runs with StorHouse/RM
release 3.3, build 03 and higher.
StorHouse/Control Center
StorHouse/Control Center, the StorHouse Windows-based application for
system and database administration, supports an easy-to-use graphical
user interface (GUI) that greatly simplifies StorHouse storage and
database management tasks. StorHouse/Control Center consists of one
or more StorHouse/Control Center servers communicating with
StorHouse/Control Center modules over a TCP/IP network. The three
modules are:
Application hosts
Windows 2000, Solaris 2.6 or higher, HP-UX 11 or HP-UX 11i,
or IBM AIX platform(s) to run the StorHouse/RFS server software
StorHouse system(s)
[Figure: user networks and application hosts connected to StorHouse over TCP/IP]
[Figure: user networks and application hosts connected over TCP/IP to a combined StorHouse and StorHouse/RFS server]
For disaster protection, StorHouse/RFS can store and access the same data
on two StorHouse systems. This feature is called StorHouse/RFS
duplexing. See "What is StorHouse/RFS duplexing?" on page 35 for more
information.
StorHouse/RFS server
The StorHouse/RFS server consists of collector and retriever components.
A collector accesses files from the virtual file system and stores them on
StorHouse in a single file, or collection, in a modified CD format. You
can define multiple collectors for each StorHouse/RFS installation.
See page 13 for more information about the collection and storage
process.
StorHouse/ODBC driver
The StorHouse/ODBC driver is FileTek's ODBC-compliant interface that
StorHouse/RFS uses to communicate with StorHouse databases managed
by StorHouse/RM. For example, a StorHouse/RFS retriever accesses file
locator data through ODBC when searching for files on StorHouse.
FileTek provides an ODBC driver appropriate for the client operating
system.
You can organize your files on the virtual file system in any way that best
satisfies your site requirements (by user departments, functional
requirements, criteria, and so on). For example, you can write all files to
one or more folders, or directories, for each user; or you can create
separate folders for different applications. Figure 4 shows a virtual file
system in a Windows environment. The drive letter is V, and the virtual
file system contains two folders for two e-mail archives.
Filesystem           kbytes   used   avail   capacity   Mounted on
/dev/dsk/c1t0d0s2                            38%        /filetek
[Figure: staging area (data files), rename directory (renamed data files), collection directory (file locator data), and StorHouse (StorHouse collection, StorHouse tables)]
A staging area consists of a staging directory and a user directory. You assign
a collector to a staging directory and a user directory. Your enterprise can
have any number of staging directories and any number of user
directories in the same staging directory. Each staging directory must be
at least one level down from the root level of the virtual file system or
drive. Each user directory must be unique in a StorHouse/RFS
configuration.
For example, assume you have defined staging areas in a Windows
environment for a mailbox e-mail archive and a journal e-mail archive.
The staging directory for both archives is called c:\mailarchive. The user
directory for the mailboxes archive is called \mailboxes and the user
directory for the journal archive is called \journal. Figure 7 illustrates these
multiple staging areas.
[Figure: mailbox staging area; labels: files, log, stage, v4r0]
StorHouse/RFS creates a folder in the virtual file system for each user
directory. Only the user directory appears in the virtual file system. The
staging directory is transparent. For example, if \mailarchive is the staging
directory and \mailboxes and \journal are the user directories, then
StorHouse/RFS creates folders in the virtual file system for the \mailboxes
and \journal user directories. The \mailarchive staging directory does not
appear in the virtual file system. Figure 9 illustrates these user directories
in a virtual file system. An authorized user or application writes files to
these folders.
Creates a new buffer that will hold the data plus a 16-bit CRC for
each 2 KB of data
Transfers the data to the new buffer with the computed CRC for each
2 KB of data
The file write process continues until the application has completed
writing the data.
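The per-block CRC framing described above can be sketched as follows. This is a minimal illustration only: the text does not say which 16-bit CRC polynomial StorHouse/RFS uses or how the CRC is laid out in the buffer, so the CRC-CCITT routine from the Python standard library (binascii.crc_hqx), the big-endian 2-byte trailer, and the function names are all assumptions.

```python
import binascii

BLOCK_SIZE = 2048  # 2 KB of data per CRC, as described in the text


def frame_with_crc(data: bytes) -> bytes:
    """Copy data into a new buffer, appending a 16-bit CRC after each 2 KB block.

    Assumption: CRC-CCITT via binascii.crc_hqx stands in for the product's
    unspecified 16-bit CRC.
    """
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        crc = binascii.crc_hqx(block, 0)
        out += block + crc.to_bytes(2, "big")
    return bytes(out)


def verify_and_strip(framed: bytes) -> bytes:
    """Recheck every block's CRC and return the original data."""
    out = bytearray()
    step = BLOCK_SIZE + 2  # one data block plus its 2-byte CRC
    for offset in range(0, len(framed), step):
        chunk = framed[offset:offset + step]
        block, stored = chunk[:-2], int.from_bytes(chunk[-2:], "big")
        if binascii.crc_hqx(block, 0) != stored:
            raise ValueError(f"CRC mismatch in block at offset {offset}")
        out += block
    return bytes(out)
```

A caller would frame the data once when it is first written and run the verification step on every later read, which mirrors the "computed once, rechecked on retrieval" behavior the manual describes.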
Renaming a file
where
collection_name is the name assigned to the collection in the StorHouse/RFS configuration file
yyyymmddhhmmss is the date and time the collection was created
(C)
system_name
StorHouse/RFS places the collection name before the file path and
appends the starting logical block number (LBN) to the file name. The
LBN indicates the position of the data in the final StorHouse collection.
For example, if the file RFS/Collectors/NewFiles/MyFile.txt is ready to be
collected and the name of the collection is
CollectionA20051115083020(C).ELF.FILETEK.COM, then the renamed file is:
CollectionA20051115083020(C).ELF.FILETEK.COM/RFS/NewFiles/MyFile.txt.1
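The naming rule can be illustrated with a short sketch. The function and parameter names are hypothetical, and two points are assumptions: the parenthesized letter is treated as an opaque marker whose meaning this section does not define, and the file path is passed in exactly as it should appear in the renamed file, because the text does not state how that path is derived from the original location.

```python
from datetime import datetime


def collection_name(base: str, created: datetime, system_name: str,
                    marker: str = "C") -> str:
    """Build a name of the form collection_nameyyyymmddhhmmss(C).system_name."""
    return f"{base}{created:%Y%m%d%H%M%S}({marker}).{system_name}"


def renamed_path(collection: str, file_path: str, start_lbn: int) -> str:
    """Place the collection name before the file path and append the starting LBN."""
    return f"{collection}/{file_path}.{start_lbn}"


name = collection_name("CollectionA", datetime(2005, 11, 15, 8, 30, 20),
                       "ELF.FILETEK.COM")
print(renamed_path(name, "RFS/NewFiles/MyFile.txt", 1))
# prints CollectionA20051115083020(C).ELF.FILETEK.COM/RFS/NewFiles/MyFile.txt.1
```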
You can specify the maximum amount of space a collector can use for
renamed files with the maximum collection space parameter in the
StorHouse/RFS configuration file.
Obtains standard file properties, such as file path, file name, and size
from the operating system
Stores the file locator data in a single load file, or .ldr file, in a
collection directory
comprise a local collection. Figure 10 depicts the directories and files that
make up a local collection.
[Figure 10: Local collection; labels: rename directory (Renamed File1, Renamed File2, Renamed File3, file data), collection directory, .ldr file, .dat file]
Any file that is found to be corrupt (fails the CRC check) is flagged and
renamed to an isolation directory called RFS_ISOLATED. StorHouse/RFS
sends an e-mail notification when files have been isolated. You can then
run the StorHouse/RFS checkfile utility to locate errors and to request a
copy of a file containing errors.
If StorHouse/RFS encounters corrupt files during the writing of the
collection to StorHouse, it terminates the collection and the load process.
On the next write cycle, StorHouse/RFS writes the collection again, this
time skipping the corrupt files.
Storing file locator data
Storing file locator data in a StorHouse collection. After
successfully writing the file data from the rename directory to the
StorHouse collection, StorHouse/RFS first writes the .ldr file containing
all of the file locator data to the StorHouse collection. StorHouse/RFS
reads the .ldr file a line at a time and compares the CRC with the
computed CRC. If an entry is correct, StorHouse/RFS writes it to the
StorHouse collection. When StorHouse/RFS finishes writing all entries in
the .ldr file, it closes the StorHouse collection.
Storing file locator data in a StorHouse table array. Then
StorHouse/RFS loads the file locator data into a StorHouse table in a
StorHouse database, again checking the CRC with each line read until the
load process for the collection is finally completed. StorHouse/RFS uses a
table array of up to 255 sets of 32 StorHouse tables to store file locator
data for one or more collections. StorHouse/RFS initially creates one set
of 32 tables that can contain information for approximately 1 billion
files. As required, the software can subsequently create up to a total of
255 sets of 32 tables to store file locator data for approximately 255
billion files.
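The table-array figures quoted above can be checked with a few lines of arithmetic. This is only a sizing sketch: it assumes each set is filled to roughly 1 billion entries before another set is created, which the text implies ("as required") but does not state, and the names are illustrative.

```python
TABLES_PER_SET = 32
MAX_SETS = 255
FILES_PER_SET = 1_000_000_000  # approximate capacity of one set, per the text


def sets_needed(file_count: int) -> int:
    """Return how many 32-table sets would hold file_count locator rows.

    Assumption: sets are filled one after another up to FILES_PER_SET each.
    """
    sets = max(1, -(-file_count // FILES_PER_SET))  # ceiling division, minimum 1
    if sets > MAX_SETS:
        raise ValueError("file count exceeds the 255-set table array")
    return sets


print(MAX_SETS * TABLES_PER_SET)   # total tables in a full array: 8160
print(MAX_SETS * FILES_PER_SET)    # approximate file capacity: 255000000000
print(sets_needed(2_500_000_000))  # 3
```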
Figure 11 illustrates the relationship between file locator tables and the
individual files in a collection. Each row in a file locator table corresponds
to a user file in a specific StorHouse collection.
[Figure 11: Journal table in a StorHouse database and journal collections on StorHouse. Rows 1 through n of the table each hold a file entry for File 1 through File n, which reside in Collection 1 through Collection n.]
Renames eligible files (those that have not been modified for 1
minute) to the local collection for 720 minutes or until the local
collection reaches 1 GB
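The thresholds in this example (1 minute idle, 720 minutes, 1 GB) can be expressed as two simple checks. The sketch below is illustrative only: the class and field names are hypothetical rather than StorHouse/RFS configuration parameters, and 1 GB is taken to mean 1024^3 bytes, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class CollectionPolicy:
    """Hypothetical container for the three thresholds in the example."""
    min_idle_seconds: float = 60.0          # file not modified for 1 minute
    max_open_seconds: float = 720 * 60.0    # local collection open for 720 minutes
    max_bytes: int = 1024 ** 3              # assumption: 1 GB = 1024^3 bytes


def is_eligible(policy: CollectionPolicy, last_modified: float, now: float) -> bool:
    """A file is eligible once it has been unmodified for the idle period."""
    return now - last_modified >= policy.min_idle_seconds


def should_close(policy: CollectionPolicy, opened_at: float, now: float,
                 collected_bytes: int) -> bool:
    """The local collection stops accepting files at the time or size limit."""
    return (now - opened_at >= policy.max_open_seconds
            or collected_bytes >= policy.max_bytes)
```

Timestamps here are plain seconds (for example, values from time.time()), so the checks can be exercised without touching the file system.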
[Figure: file retrieval flowchart. Stages: local search, StorHouse search. Decision points: File in local collection? File on StorHouse? File in cache? Outcomes: read from staging area; read data from the StorHouse collection into the cache directory; file not found.]
Although users may be able to browse the virtual file system, access to any
listed file is controlled through security measures. See "How does
StorHouse/RFS implement security?" on page 31 for more information.
File security
StorHouse/RFS supports the following levels of security for each user file,
or object:
File owner security: only the user or application that wrote the data
can retrieve it
These security levels are enforced for files on the local system and on
StorHouse. In the following sections, the term user refers to a person or to
an application.
Securing files on the local system
If only a file owner has read, write, and execute permission for a file,
then only the file owner can archive a file to a staging area and read a
file located in a staging area.
If all groups have write permission to all staging areas, then all groups
can write files to all staging areas.
Permissions for file owners, groups, and other from the operating
system
For Windows, user name and domain name from the operating
system
For Windows, value of the Permissions parameter (defines cross-system,
or other, permissions) in the StorHouse/RFS configuration file
If a requestor is the file owner, the requestor has access to the file on
StorHouse.
What is StorHouse/RFS duplexing?
StorHouse/RFS can write the same user data and file locator data to
separate StorHouse systems that can be geographically dispersed.
StorHouse/RFS can also access data on the secondary, or mirror,
StorHouse system should the primary StorHouse system become
unavailable. This disaster protection and data availability feature is called
StorHouse/RFS duplexing.
Figure 14 illustrates a StorHouse/RFS configuration with a secondary
StorHouse system used for StorHouse/RFS duplexing. Any number of
Generating statistics
StorHouse/RFS generates statistics to help you monitor local space usage
and to assess collection, storage, and retrieval activity. In a duplexing
environment, you can store duplicate copies of statistics on the primary
and secondary StorHouse systems. At a configured interval,
StorHouse/RFS writes statistics to a local file (in HTML, XML, and/or
text formats) that you can check at any time. You can also store statistics
in a StorHouse database for historical analysis.
Some of the statistics that StorHouse/RFS provides at each statistics
interval are as follows:
The program name that will open and read the file (such as Notepad
or cat) must precede the command.
<OWNER>ELF.FILETEK</OWNER>
<LBN>59</LBN>
<COLLECTION>DIRECT20050203170548(A).DEV-ELF.FILETEK.COM</COLLECTION>
<VOLUME>TDA"010161":A</VOLUME>
<LIB>L00</LIB>
<MEDIA>WRITTEN</MEDIA>
</INSTANCE>
<INSTANCE>
<STATUS>ACTIVE</STATUS>
<MODIFIED>2005-02-04 10:54:14</MODIFIED>
<SIZE>100000000</SIZE>
<OWNER>ELF.FILETEK</OWNER>
<LBN>59</LBN>
<COLLECTION>DIRECT20050203170548(A).DEV-ELF.FILETEK.COM</COLLECTION>
<VOLUME>TDA"010161":A</VOLUME>
<LIB>L00</LIB>
<MEDIA>WRITTEN</MEDIA>
</INSTANCE>
</SYSTEM>
</FILE>
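A report in this XML form can be read programmatically with a standard XML parser. The sketch below uses Python's xml.etree.ElementTree on a small sample assembled from the element names visible in the excerpt above; real output may carry attributes or additional elements on FILE and SYSTEM that the excerpt does not show, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample built from the elements shown in the excerpt; the structure of the
# FILE and SYSTEM wrappers beyond their tag names is an assumption.
SAMPLE = """<FILE>
  <SYSTEM>
    <INSTANCE>
      <STATUS>ACTIVE</STATUS>
      <MODIFIED>2005-02-04 10:54:14</MODIFIED>
      <SIZE>100000000</SIZE>
      <OWNER>ELF.FILETEK</OWNER>
      <LBN>59</LBN>
      <COLLECTION>DIRECT20050203170548(A).DEV-ELF.FILETEK.COM</COLLECTION>
      <VOLUME>TDA"010161":A</VOLUME>
      <LIB>L00</LIB>
      <MEDIA>WRITTEN</MEDIA>
    </INSTANCE>
  </SYSTEM>
</FILE>"""


def list_instances(xml_text: str) -> list:
    """Return one dict per INSTANCE element, mapping child tag to its text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in instance}
        for instance in root.iter("INSTANCE")
    ]


for instance in list_instances(SAMPLE):
    print(instance["COLLECTION"], instance["LBN"], instance["MEDIA"])
```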
File locator data generated for renamed files in the current collection
Modifications, such as file deletes, renames, and security changes, to
files that belong to other collections
The same method of writing files based on extensions applies to the load
file containing file locator data.
After writing the file locator data to the primary StorHouse system,
StorHouse/RFS changes the load file extension from .ldr to .ld2.
Files continue to reside on the StorHouse/RFS server until they have been
successfully written to both StorHouse systems.
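The extension change after the primary write can be pictured as a simple rename. This is a sketch of the idea only, not the product's implementation: the function name is hypothetical, and the other load file extensions mentioned in the glossary (.ldd, .ld1) are not modeled because this section does not describe when they apply.

```python
from pathlib import Path


def mark_written_to_primary(load_file: Path) -> Path:
    """Rename a .ldr load file to .ld2 once the primary write has succeeded.

    Assumption: the extension alone records progress, as the text suggests.
    """
    if load_file.suffix != ".ldr":
        raise ValueError(f"expected a .ldr file, got {load_file.name}")
    target = load_file.with_suffix(".ld2")
    load_file.rename(target)
    return target
```

Because the file stays on the StorHouse/RFS server under its new extension, a later pass can tell from the name alone that only the write to the second StorHouse system remains.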
You set a retention period for a collection set. However, the retention
for each user file in a collection set may expire at different times based
on each file's last modified time.
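To see why files in the same collection set can expire on different dates, consider the following sketch. It assumes that a file's retention ends at its last modified time plus the retention period, which is one plausible reading of the text rather than a documented formula, and it treats a period of 0 days as "no retention requirement". The function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def retention_expiry(last_modified: datetime,
                     retention_days: int) -> Optional[datetime]:
    """Return when retention ends for one file, or None if there is no retention.

    Assumption: expiry = last modified time + retention period.
    """
    if retention_days == 0:
        return None
    return last_modified + timedelta(days=retention_days)


# Two files in the same collection set, same 365-day retention period
print(retention_expiry(datetime(2005, 1, 10), 365))  # 2006-01-10 00:00:00
print(retention_expiry(datetime(2005, 3, 1), 365))   # 2006-03-01 00:00:00
```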
You can change the retention period at any time. The new value
applies only to uncollected files. In other words, any file collected
after the change has the new retention and any file collected before
the change has the old retention.
You can also remove the retention requirement later, that is, change a
retention period to 0 days. In this case, StorHouse/RFS may create file
versions and allow file versions to be deleted with the exception of the
original file version.
Glossary
The remainder of this document defines StorHouse/RFS terms.
cache directory. The location where StorHouse/RFS places data that it
reads from StorHouse.
checkfile utility. ... of a CRC error.
collection. A group of user files stored in a single file. A collection
residing on local or network storage accessible to StorHouse/RFS is called
a local collection. A collection written to StorHouse is called a StorHouse
collection.
collection definition.
collection set.
collector.
.dat file.
file version. A file that has been collected with the same file name and
path as a previously archived file. StorHouse/RFS creates file versions only
for files without retention requirements. StorHouse/RFS adds a negative
numeric suffix (-1, -2, and so on) to a file version name in the virtual
directory, for example, status.txt(-1).
isolation directory.
load file. A file, also called .ldr file, containing file locator data for a local
collection. StorHouse/RFS renames the .ldr file during various stages of
the write process to indicate which step(s) of the write process have been
completed. See also .ldd file, .ld1 file, and .ld2 file.
mirror system.
retriever.
rfsmaint utility.
security table.
staging area.
StorHouse database.
StorHouse index.
StorHouse search.