Varrays and Nested Tables Oracle
The VARRAY
A VARRAY allows you to store multi-line items as part of a single row within a table. For
example, you might want to store people’s addresses: usually, these are stored in a table
using columns such as ‘Address Line 1’, ‘Address Line 2’ and so on. By using a VARRAY, we
can store multiple lines of an address as one entity: in fact, as an “object” which is a
“collection” of otherwise separate items, hence the term often used for them,
‘collection objects’.
A VARRAY is so-called because it is a ‘variable array’ of items. It’s variable in the sense
that you get to define how many items it should be comprised of, and you can then use any
number of items up to that number. But once you’ve reached that number, it suddenly
gets a lot less variable: the upper limit is strictly fixed, and can’t be breached.
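The statement behind the ‘TYPE CREATED’ message has not survived in this copy. A minimal sketch of what such a definition might look like, assuming a limit of three entries of VARCHAR2(25) each (figures inferred from the later description of this type, not quoted from the original command):

```sql
-- Sketch only: a VARRAY type holding at most 3 address lines.
-- The limit (3) and element size (25) are assumptions inferred from the text.
CREATE TYPE address AS VARRAY(3) OF VARCHAR2(25);
/
```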
TYPE CREATED.
TABLE CREATED.
A perfectly ordinary create table statement, except that the data type for the second
column is the address “type” I created just a moment before.
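A statement matching that description might be sketched as follows. The table name EMPLOYEES and the column names NAME and ADDR come from the text and output elsewhere in this article; the width of NAME is an assumption.

```sql
-- Sketch only: the width of NAME is an assumption.
-- ADDR is declared with the user-defined ADDRESS varray type.
CREATE TABLE employees (
  name VARCHAR2(20),
  addr address
);
```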
1 ROW CREATED.
Notice here that to populate the table’s second column, I must reference not the column
name (‘Addr’), but the type name (‘Address’). Here, I’ve used all 3 of my available entries.
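An insert of the kind described, reconstructed from the values visible in the query output in this article and assuming the EMPLOYEES table has just the NAME and ADDR columns, might look like this:

```sql
-- Sketch only: the type name ADDRESS acts as the constructor for the varray value.
-- All three permitted entries are supplied.
INSERT INTO employees
VALUES ('HOWARD ROGERS',
        ADDRESS('16 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW 2170'));
```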
SQL> COMMIT;
COMMIT COMPLETE.
NAME
---------------
ADDR
HOWARD ROGERS
ADDRESS('16 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW 2170')
And here we see that selecting from the table is a perfectly normal operation, with no
particular syntactical tricks. Note, though, that the display includes the type name [i.e.,
“ADDRESS(…”], which makes it look almost as if you are applying a function to the data.
That means you’ll have to develop ways of stripping the type name out of the returned
data if you’re going to make use of it in your application.
What happens if you try to exceed the permitted array length? Well, this:
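The original demonstration is missing from this copy. As an illustration only, assuming the three-entry ADDRESS varray described above, an insert that supplies a fourth entry would be expected to be rejected, typically with ORA-22909 (exceeded maximum VARRAY limit). The fourth value below is made up for the example.

```sql
-- Sketch only: four entries supplied to a varray assumed to be limited to three.
-- 'AUSTRALIA' is a hypothetical fourth line; Oracle is expected to refuse the row.
INSERT INTO employees
VALUES ('HOWARD ROGERS',
        ADDRESS('16 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW 2170', 'AUSTRALIA'));
```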
Can the arrayed column be indexed? The answer is no, which means that any access to the
arrayed column is extremely likely to provoke a full table scan. That in turn means that
the use of VARRAYs is not something you’d undertake lightly if you were hoping to win some
performance awards for your application.
Finally, can you reference individual parts of the array? Well, the following ‘global’
command works:
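The command itself is missing from this copy; judging by the output and the follow-up question, it was presumably a plain select of the whole column, along these lines:

```sql
-- Sketch only: selects the entire varray column, with no attempt to subscript it.
SELECT addr FROM employees;
```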
ADDR
-------------------------------------------------------
ADDRESS('16 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW 2170')
…so perhaps something like ‘select addr(1) from employees’ will select just the first
element of the array?
Unfortunately not. In fact, you can’t select a single part of the array and, what’s rather
worse, you can’t update a single part of the array, either. You have to update the entire
thing, or none of it. If I move house from 16 Bradey Avenue to 15 Bradey Avenue, the
command has to look like this:
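The statement is missing from this copy. Assuming the EMPLOYEES table and the three-entry ADDRESS varray described earlier, a whole-array update of this kind might be sketched as:

```sql
-- Sketch only: every element is restated, even though only the first one changes.
UPDATE employees
SET    addr = ADDRESS('15 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW 2170')
WHERE  name = 'HOWARD ROGERS';
```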
In other words, you have to treat the entire thing as a single entity, and minor changes to
parts of it nevertheless require you to update all of it. That can lead, of course, to rather
a lot of redo and rollback being produced for what would have been, had the (in this case)
address been stored as three separate columns, a fairly small update.
Are VARRAYs useful, then? Not really. They make selecting just the data from the table
tricky (unless you particularly want to see the TYPE name appear with the data every
time). Updates are expensive. Selects are done via full table scans. And you lose any
ability to work with just particular parts of the data… it’s all or nothing.
Nested Tables
A ‘nested table’ is, in effect, an infinitely variable VARRAY: there’s no pre-defined limit on
the number of elements that can be stored within it, so if you need to add an extra
element, you are free to do so. What’s more, the various components of the nested table
can be updated and selected on separately (unlike, as we have just seen, the VARRAY
which is treated as a monolithic whole).
As an example, we can repeat our earlier demo, this time using a nested table:
TYPE DROPPED.
SQL> CREATE TYPE ADDRESS AS OBJECT (HOUSE VARCHAR2(25),STREET VARCHAR2(25), STATE VARCHAR2(25));
2 /
TYPE CREATED.
Having first gotten rid of our varray version of the address type, here we are creating a
type ADDRESS. In practical effect, this version of the type is much the same as our earlier
effort: 3 elements, each of varchar2(25). But note how we have to name and define each
of the elements separately this time, as though they were columns of a table (which is, in
effect, what they are about to become). That gives scope for greater flexibility than a
varray: there’s no reason why I couldn’t have used date or number data types for some of
my ‘columns’, if I’d needed them. The syntax of the varray doesn’t allow this sort of
mixing and matching of data types.
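The next step builds a table type on top of the ADDRESS object type. The command is missing from this copy; given the type name ‘addr_table’ used in the text, it might be sketched as:

```sql
-- Sketch only: a nested table type whose rows are ADDRESS objects.
CREATE TYPE addr_table AS TABLE OF address;
/
```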
TYPE CREATED.
Here, we’re creating an object table type, of the type ‘Address’ just created. It’s the
table type that can be used when we create our relational table:
TABLE CREATED.
As you see, we can finally create an ordinary relational table, but with its second column
defined as having a data type of ‘addr_table’, which is the table type we created a
moment earlier (and which itself, of course, uses the ‘address’ type created initially).
Note the last line of the syntax. That must be included so that the addresses we are going
to enter into the employees table get stored in a properly-named table of their own: the
nested table. (Failure to include this line, by the way, yields an “ORA-22913: MUST
SPECIFY TABLE NAME FOR NESTED TABLE COLUMN OR ATTRIBUTE” error message). In this
example, the addresses are going to be stored in a table called “Addresses”.
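The create table statement is missing from this copy. A sketch consistent with the description, assuming the same NAME and ADDR columns as before (the width of NAME is an assumption):

```sql
-- Sketch only: ADDR is a nested table column of type ADDR_TABLE.
-- The final clause names the storage table that will hold the nested rows.
CREATE TABLE employees (
  name VARCHAR2(20),
  addr addr_table
)
NESTED TABLE addr STORE AS addresses;
```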
Now, although we’ve just created a table called ‘addresses’, it won’t be listed in
USER_TABLES. Instead, we have to query USER_OBJECT_TABLES to see it. It’s possible to
do a ‘describe addresses’ in SQL*Plus, though, at which point you’ll see that it contains
three columns which are the three elements originally defined for the ‘address’ type (i.e.,
HOUSE, STREET and STATE in this example).
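The two checks just described might be sketched like this, assuming you are connected as the owner of the EMPLOYEES table:

```sql
-- Sketch only: the nested table's storage table is listed among the object tables.
SELECT table_name FROM user_object_tables;

-- SQL*Plus command (not SQL): shows the columns of the storage table.
DESCRIBE addresses
```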
How do we now get data into this table, and how do we then manipulate it? Well, as ever,
some examples might help:
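The insert statement is missing from this copy. Reconstructed from the values shown in the query output in this article, and assuming the EMPLOYEES table has just the NAME and ADDR columns, it might be sketched as:

```sql
-- Sketch only: ADDR_TABLE(...) constructs the collection,
-- and each ADDRESS(...) constructs one row (HOUSE, STREET, STATE) within it.
INSERT INTO employees
VALUES ('HOWARD ROGERS',
        ADDR_TABLE(
          ADDRESS('16 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW'),
          ADDRESS('4 JULIUS AVENUE', 'CHATSWOOD', 'NSW')));
```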
Now the syntax is rather obscure here (and it doesn’t help much when the word ‘address’
and all its variants keep cropping up). If you manage to bear in mind, though, that
‘addr_table’ is the data type we declared when creating the employees table, and
‘address’ is the object type we created right at the start, then hopefully it’s a little
clearer as to what keywords go where.
In particular, note that the actual name of the nested table, “Addresses”, doesn’t appear
anywhere in the syntax.
So much for getting one row, containing two addresses, into the employees table. What
about getting them back again? Well, you could try this:
NAME
--------------------
ADDR(HOUSE, STREET, STATE)
---------------------------------------------------------------------
HOWARD ROGERS
ADDR_TABLE(ADDRESS('16 BRADEY AVENUE', 'HAMMONDVILLE', 'NSW'), ADDRESS('4 JULIUS AVENUE',
'CHATSWOOD', 'NSW'))
Fortunately, this time, we can fix that… but it requires rather cleverer syntax than ye olde
‘select * from…’ that we know and love.
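The query is missing from this copy. One way to write a query of the kind discussed here, assuming a release that accepts the TABLE collection expression with the (+) outer-join marker, is sketched below. The aliases E and A are hypothetical, and the original may have used a different form.

```sql
-- Sketch only: unnests the ADDR column so each address appears as ordinary columns.
-- The (+) keeps employees who have no addresses at all (outer join).
SELECT e.name, a.house, a.street, a.state
FROM   employees e, TABLE(e.addr)(+) a;
```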
In fact, what we’ve had to do here is select the three separate elements from the address
object type as though they were columns from a table for which we have to perform a (left)
outer join. Which is all perfectly understandable, I suppose, except that you might have
expected to use the “Addresses” table name (since that’s what the nested table is actually
called, after all). Instead, you use as the table name what was actually supplied as the
column name when we defined the employees table.
You can see why we usually leave this stuff to the developers!
At which point, you are bound to ask: why not just query the ‘Addresses’ table directly?
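The attempt would presumably have been a direct query of the storage table, such as the following, which is what triggers the error:

```sql
-- Sketch only: querying the storage table directly is refused with ORA-22812.
SELECT * FROM addresses;
```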
ERROR AT LINE 1:
ORA-22812: CANNOT REFERENCE NESTED TABLE COLUMN'S STORAGE TABLE
Well, what about updates? The promise was, you’ll remember, that updates to part of the
address details would now be possible. And they are… but again, the syntax is not exactly
intuitive:
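The statement is missing from this copy. A sketch of a piecewise update of this kind, reusing the house move from the VARRAY example as an assumed change, and using a hypothetical alias A:

```sql
-- Sketch only: TABLE(...) exposes one employee's nested rows as an updatable row source.
-- The subquery must identify exactly one parent row.
UPDATE TABLE(SELECT addr
             FROM   employees
             WHERE  name = 'HOWARD ROGERS') a
SET    a.house = '15 BRADEY AVENUE'
WHERE  a.house = '16 BRADEY AVENUE';
```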
1 ROW UPDATED.
In other words, you use an Oracle-supplied function called ‘TABLE’ to turn the nested table
column into something like a real table, to which regular SQL syntax can then be applied.
(Somewhat bizarrely, the same function was named “THE” in Oracle 8.0 …we can be
grateful for the name-change in 8i!) It certainly does the trick:
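The same TABLE construct can be sketched for a delete, which is presumably what the ‘1 ROW DELETED’ message reports. Which address was actually removed in the original is not recorded, so the predicate here is an assumption:

```sql
-- Sketch only: removes a single nested row, leaving the parent row and
-- its other addresses untouched.
DELETE FROM TABLE(SELECT addr
                  FROM   employees
                  WHERE  name = 'HOWARD ROGERS') a
WHERE  a.street = 'CHATSWOOD';
```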
1 ROW DELETED.
Once again, if you’d thought you could hare off to the ‘Addresses’ table itself and do
direct updates, you’d be in for a surprise:
And likewise, trying to do direct deletes produces exactly the same error.
Incidentally, and whilst we’re trying to break the entire object-relational parts of Oracle,
you’ll get similar errors if you try performing various bits of DDL on the nested table
directly:
Such mischief aside, however, it does appear that we have a reasonably flexible thing
going for us with nested tables. What’s more, we can talk directly to the “Addresses”
table when it comes to indexing:
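The statement is missing from this copy. An index on one column of the storage table might be sketched as follows; the index name and the choice of column are hypothetical.

```sql
-- Sketch only: index name and column choice are assumptions.
CREATE INDEX addr_state_idx ON addresses (state);
```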
INDEX CREATED.
In other words, it is possible to put as many indexes on the nested table as we like, and on
as many columns as we wish. Concatenated indexes are OK, too:
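A concatenated index could be sketched in the same way; again, the index name and the column pair are hypothetical.

```sql
-- Sketch only: a two-column (concatenated) index on the storage table.
CREATE INDEX addr_house_street_idx ON addresses (house, street);
```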
INDEX CREATED.
Having said all of that, nested tables are still potentially a performance dog, and a
management nightmare. The trouble arises right back at the beginning, when you create
the ‘employees’ table with the magic NESTED TABLE … STORE AS clause.
What that actually does is to create three segments. Of course, you get the employees
table (though you also get an extra column created, which Oracle hides from you, but
which is visible in SYS.COL$, provided you know your object number). You also get the
object table called addresses. But you also end up with a new index, cunningly named
SYS_Cxxxxxxx. That’s an index, associated with a constraint, on that hidden column in the
employees table. The index is there to help tie the nested table’s rows back to the main
table’s data in as efficient a way as possible (that is, travelling from nested table to its
container table is pretty efficient).
Unfortunately (and this is where the management nightmares start), all three segments are
created in the one tablespace (the one where the ‘employees’ table is created), and
there’s nothing you can do about that. Of course, this being 8i, there’s nothing to stop
you now going ahead and moving the employees table to a different tablespace, which
leaves behind the nested table and the linking index (ALTER TABLE X MOVE TABLESPACE Y will
do the trick). You could then rebuild the index into yet another tablespace. That would
get all three segments nicely separated… but it’s a pain to have to remember to do these
things manually.
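The manual tidy-up might be sketched like this. The tablespace names are hypothetical, and the index name is a placeholder: the real system-generated name has to be looked up first (for example in USER_INDEXES).

```sql
-- Sketch only: DATA_TS and INDEX_TS are hypothetical tablespace names.
-- Move the container table; the nested table and linking index stay where they were.
ALTER TABLE employees MOVE TABLESPACE data_ts;

-- Rebuild the linking index elsewhere. Replace SYS_Cxxxxxxx with the actual
-- system-generated name; a move typically leaves the table's indexes needing a rebuild anyway.
ALTER INDEX SYS_Cxxxxxxx REBUILD TABLESPACE index_ts;
```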
What’s worse, as things stand, there is no index on the nested table to help speed up
nested row retrieval (that is, travelling from the container data to the relevant nested
data is pretty inefficient, and it’s that direction of travel which is likely to be the one
most frequently used, of course).
No worries: you can create your own index (though again, it’s a pain to have to remember
to do so). But the trouble is that the index you need is on a column of the nested table
which is yet again one of these hidden columns that seem to crop up all over the place as
soon as you start using objects. In fact, the column is called NESTED_TABLE_ID (and is
always so-called, regardless of what your nested table is actually called), so you need to
get into the habit of issuing something like this:
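The statement is missing from this copy; a sketch of it, with a hypothetical index name:

```sql
-- Sketch only: the index name is an assumption.
-- NESTED_TABLE_ID is the hidden column linking nested rows to their parent row.
CREATE INDEX addresses_ntid_idx ON addresses (nested_table_id);
```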
INDEX CREATED.
(Be aware, too, that the nested_table_id is the thing that ties nested rows back to their
parent. So, when “Howard Rogers” had two addresses listed in the reports I showed earlier,
the id would have appeared twice. If I’d had 65 addresses, the id would have
appeared 65 times. In other words, the nested_table_id has a lot of repetition, and can
thus probably benefit quite a lot from a ‘compress’ attribute… in this example, “compress
1” would have been appropriate).
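A compressed version of that index might be sketched as below. It is meant as an alternative to an uncompressed index on the same column, not an addition to it, and the index name is hypothetical.

```sql
-- Sketch only: COMPRESS 1 stores each repeated NESTED_TABLE_ID value once per
-- index block rather than once per entry.
CREATE INDEX addresses_ntid_cidx ON addresses (nested_table_id) COMPRESS 1;
```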
The net result of all of this is that, frankly, I can’t see why anyone would want to use
Nested Tables, just as VARRAYs seem to be a lot of cleverness that causes more trouble
than it’s worth. I suppose that in the right circumstances, and with all potentially nasty
issues duly thought about and resolved at design time, they might have a use. But I myself
honestly can’t see what those ‘right circumstances’ would be. It strikes me, instead, that
using the old relational model (where you create a separate child table for people’s phone
numbers or addresses, and link back to the parent record using old-fashioned referential
integrity constraints and a couple of good indexes) is simpler, more flexible, and a darned
sight less hassle, administratively.
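For comparison, the plain relational alternative might be sketched as follows. Every table, column, constraint and index name here is hypothetical; the address columns simply mirror the HOUSE, STREET and STATE attributes used above.

```sql
-- Sketch only: all names are hypothetical.
-- Parent table with an ordinary primary key.
CREATE TABLE emp (
  emp_id NUMBER CONSTRAINT emp_pk PRIMARY KEY,
  name   VARCHAR2(20)
);

-- Child table: one row per address, linked back by a foreign key.
CREATE TABLE emp_address (
  emp_id NUMBER NOT NULL
         CONSTRAINT emp_address_emp_fk REFERENCES emp (emp_id),
  house  VARCHAR2(25),
  street VARCHAR2(25),
  state  VARCHAR2(25)
);

-- Index the foreign key so that parent-to-child lookups need not scan the child table.
CREATE INDEX emp_address_emp_idx ON emp_address (emp_id);
```

Both tables are ordinary segments, so they can be queried, updated, indexed and placed in tablespaces directly, with no TABLE() syntax or hidden columns involved.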