Reference 6


Overview of Help

To find help on a specific topic click on the index button above.

For a general tutorial and introduction to UCINET see the online User's Guide which accompanies this
program.

An introduction to the general form of most help files in UCINET is contained in the Introduction Section
(see link below). Also below are links to the UCINET standard datasets together with help on the DL file
format.

Introduction Section
DL
Standard Datasets

To obtain technical support, send email to:

[email protected] (for United States users)


[email protected] (for all other users)
DATA>DESCRIBE>IMPORT LABELS

PURPOSE Import labels into a UCINET dataset

DESCRIPTION Imports labels held in a plain text file into a UCINET dataset. The file
should contain one label per line, i.e. the labels are separated by carriage returns.

PARAMETERS
Label File
Name of text file containing the labels

Import into:
Choices are:

Row Labels
Column Labels
Matrix Labels

LOG FILE None

TIMING N/A

COMMENTS None

REFERENCES None
FILE > DELETE

PURPOSE Delete a UCINET dataset

DESCRIPTION Both the header and the data files are deleted. When several files are
listed, the names should be separated by a space.

PARAMETERS
File(s) to be deleted
List of files to be deleted. Data type: any UCINET file.

LOG FILE None

TIMING N/A

COMMENTS None

REFERENCES None
FILE>RENAME UCINET FILE

PURPOSE Rename a UCINET dataset.

DESCRIPTION Renames both a header and data file of a UCINET dataset.

PARAMETERS
Original Dataset Name :
Name of file to be re-named

New Dataset Name:


Name of new UCINET dataset.

LOG FILE None

TIMING N/A

COMMENTS None

REFERENCES None
FILE>COPY UCINET DATASET

PURPOSE Copy a UCINET dataset to a new filename or folder.

DESCRIPTION Copies both a header and data file of a UCINET dataset.

PARAMETERS
Original Dataset Name :
Name of dataset to be copied. Data type: any UCINET file.

New Dataset Name:


Name of new UCINET dataset. This can be sent to a new folder. Default is the
same folder as the original file.

LOG FILE None

TIMING N/A

COMMENTS None

REFERENCES None
Introduction
This file gives technical information about all the routines contained within UCINET.

The manual assumes that users have a rudimentary knowledge of the Windows operating system and
of network terminology. Elementary information on UCINET is available in the accompanying User's Guide.

Each routine is documented in a standard way. This should help the user to understand some of the
non-standard routines once the documentation for routines with which they are familiar has been
thoroughly digested.

Command Format

Each routine is documented using the following keywords: MENU, PURPOSE, DESCRIPTION,
PARAMETERS, LOG FILE, TIMING, COMMENTS, and REFERENCES. The details of these are as follows:

MENU This gives the exact position of the routine within the UCINET menu system.
For example NETWORK>SUBGROUPS>K-PLEX can be found by first
selecting NETWORK on the top level of the menu and then from the pull down
submenu selecting SUBGROUPS and then finally from this submenu selecting
K-PLEX. The selection of all the options in the MENU list followed by a
mouse click will begin execution of the routine.

PURPOSE This gives a brief one or two line description of the routine.

DESCRIPTION Gives a fuller account of what the routine does. This description will include a
brief definition of some of the concepts required to understand the technique and
an outline of the algorithms employed. It should contain sufficient information
for a user to fully comprehend the action of the routine. An effort has been made
to make the descriptions succinct. Users should read descriptions carefully if
they are unfamiliar with the action of a particular algorithm.

PARAMETERS This gives a complete list of what information must be supplied by the user in
order to run a routine. It contains a list of all the information requested on the
forms when a routine is executed. This list is indented in such a way as to make
it clear what exactly appears on the forms.

For each entry on the form the manual gives the defaults provided by UCINET.
This can be useful in trying to locate files that have been created by the software,
or when re-running a particular routine with different parameters.

The manual also supplements the help line on the form with further information
about how to complete each entry on the form.

If the routine requires a dataset (which most usually do) then the manual
specifies precisely which type of data can be analyzed. These are as follows:

Graph - an n×n symmetric binary adjacency matrix.

Digraph - an n×n not necessarily symmetric binary adjacency matrix.

Valued graph - an n×n matrix. The entries are usually reals; sometimes the values
are restricted to integers or the matrix is restricted to be symmetric.

Square matrix - an n×n matrix. The entries are usually reals; sometimes the values
are restricted to integers or probabilities. Valued graph and square matrix are
the same data type; it is just convention which dictates usage.

Matrix - an n×m matrix. The entries are usually reals. These can be restricted to
binary or integer.

Each data type is contained within the next. So, for example, any routine that
accepts valued graphs will run on digraphs or graphs.
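
The nesting of the data types can be illustrated with a short Python sketch. The function name classify and the list-of-lists representation of a matrix are assumptions made for this illustration only; they are not part of UCINET.

```python
def classify(matrix):
    """Return the most specific data type name for a list-of-lists matrix."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    if n_rows != n_cols:
        return "Matrix"                      # n x m, not square
    binary = all(x in (0, 1) for row in matrix for x in row)
    symmetric = all(matrix[i][j] == matrix[j][i]
                    for i in range(n_rows) for j in range(n_cols))
    if binary and symmetric:
        return "Graph"
    if binary:
        return "Digraph"
    return "Valued graph"                    # same data type as "Square matrix"
```

Under this sketch, a routine documented as accepting valued graphs would also accept anything classified as Digraph or Graph.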

Some routines contain options which will run on different data types. In this
case the data type given in the manual is the most general. Certain options
dictated by the parameters may not run with this data type. It should be apparent
from the manual which data types will be applicable for the selected parameters.

Routines which take specific action on multirelational data have this indicated in
the data type specification. For example, the routine specified by

TRANSFORM>SEMIGROUP

has as its data type Digraph.Multirelational. This indicates that this routine acts
on multirelational data in a particular way. If this data type is not included and a
multirelational data set is submitted for analysis then UCINET will perform the
analysis on each relation separately, if possible. In some cases such an action
would not make network sense, and in other cases it is simply not technically
possible to do this. In these cases the routine only acts on the first relation.

LOG FILE The LOG FILE contains output generated by each routine. The contents of the
file are displayed on the screen and the user can browse, edit, save or print it.
For each routine a comprehensive account of the contents of the file is given.

TIMING The timing gives the order of the routine related to the longest dimension of the
data matrix, which is called N. Care should be taken on the interpretation of this
value since it only gives the order of the polynomial (if one exists) which
dictates the time. Hence a time O(N^3) means that for sufficiently large N the
time to execute will increase at the rate of N^3. It is quite possible for the user to
increase N for an O(N^3) routine by a factor of 2 say, and the execution time to
increase by 20-fold instead of the expected 8-fold increase. This would be
because N was not sufficiently large for the highest order to dominate. Equally
well it cannot be used to compare two different routines.

Whilst caution is wise for a strict interpretation, it will usually be true that for
an O(N^3) routine doubling the size of N will cause the execution time to increase
by approximately a factor of 8. Timings which are exponential mean that the
user should be aware that small increases in N may cause very large increases in
execution time.
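
The point that the order only describes behaviour for sufficiently large N can be seen in a small Python sketch. The cost model and its coefficients below are invented for illustration and do not describe any UCINET routine.

```python
def cost(n, a=1.0, b=500.0):
    """Hypothetical running time: a*N^3 plus a lower-order term b*N^2."""
    return a * n**3 + b * n**2

def doubling_ratio(n):
    """Factor by which the modelled time grows when N is doubled."""
    return cost(2 * n) / cost(n)

# Print the growth factor for a range of problem sizes.
for n in (10, 100, 1000, 100000):
    print(n, round(doubling_ratio(n), 2))
```

In this model the factor is close to 4 for small N, where the quadratic term dominates, and approaches 8 only as N grows. Effects that such a simple polynomial does not capture can move a measured factor in the other direction, as the text above notes.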

COMMENTS Additional comments which may be of help to the user are given in this section.

REFERENCES A 'sample' of useful references which should enable the interested user to gain
more information.
STANDARD DATASETS
UCINET comes with a collection of network datasets. Multirelational data are stored,
where possible, in a single multirelational data file. Each relation within a
multirelational set is labelled and information about the form of the data is described
for each individual matrix.

BERNARD & KILLWORTH FRATERNITY


BERNARD & KILLWORTH HAM RADIO
BERNARD & KILLWORTH OFFICE
BERNARD & KILLWORTH TECHNICAL
CAMP 92
COUNTRIES TRADE DATA
DAVIS SOUTHERN CLUB WOMEN
FREEMAN'S EIES DATA
GAGNON & MACRAE PRISON
GALASKIEWICZ'S CEO'S AND CLUBS
KAPFERER MINE
KAPFERER TAILOR SHOP
KNOKE BUREAUCRACIES
KRACKHARDT HIGH-TECH MANAGERS
KRACKHARDT OFFICE CSS
NEWCOMB FRATERNITY
PADGETT FLORENTINE FAMILIES
READ HIGHLAND TRIBES
ROETHLISBERGER & DICKSON BANK WIRING ROOM
SAMPSON MONASTERY
SCHWIMMER TARO EXCHANGE
STOKMAN-ZIEGLER CORPORATE INTERLOCKS
THURMAN OFFICE
WOLFE PRIMATES
ZACHARY KARATE CLUB
DATA>EDIT

PURPOSE Edit or create a UCINET dataset using a spreadsheet style editor.

DESCRIPTION All UCINET data files store the data as a matrix. Upon execution
of this routine a spreadsheet style editor is invoked. The
spreadsheet layout is very similar to that found on other
spreadsheets such as Excel, and hence should be familiar to
most users.

Each element of the data occupies a cell in the spreadsheet. The data matrix is
displayed exactly in matrix form. The user can move around the matrix using the
keys ↑, ↓, ← and → to move from one cell to an adjacent cell, and 'Page Up',
'Page Down', 'Home' and 'End' to move up one screen, down one screen, to the
beginning and to the end of the data respectively. When the cursor is located in a
particular cell the position of the cursor is shown on the screen by the
highlighted row and column numbers of the cell.

If the rows and/or columns are labeled then the labels are displayed at the top of
the screen. To edit or enter a new value in a particular cell, place the cursor
in the relevant cell. The new value is typed at the keyboard and appears at the
top of the screen. Once the value has been correctly typed it is confirmed using
the ENTER key. After ENTER has been pressed the value is placed in the
relevant cell.

Note that you can only type in the labels once some data has been filled in to the
relevant row or column. If you already know the size of your data, fill in the
last row and column entry first; you can then type in the labels at the beginning. If
your data is symmetric, click the Asymmetric mode button before you enter any
data; this will automatically fill in the other half of your data. You need only enter
the non-zero values in the spreadsheet; once these have been filled in, click
on the button marked Fill and all empty cells will be given a value of zero. If you
accidentally stray outside the size of your required matrix then you need to delete
the extra rows and columns rather than filling them in with blanks. If your data
has more than one relation then add the extra matrices using the + button on the
right side of the toolbar (the - button can be used to delete relations). Individual matrices
within the network can be named using the rename sheet button situated just to
the right of the add and delete worksheet buttons.

The editor gives access to some 2D and 3D graphics facilities. To use the
graphics, load a UCINET dataset into the editor. Block the data that you wish to
display. Click on edit>copy to move the data onto the clipboard and then click on
edit>paste to deposit the data into the spreadsheet graphics facility. Finally click
on the graph button on the tool bar, towards the right hand side just left of the
Symmetric/Asymmetric Mode button. The graphic wizard will take you through
the creation of your picture or chart.

The UCINET spreadsheet is limited to 255 columns, so this method cannot be
used for larger datasets.

PARAMETERS N/A.

LOG FILE None.

TIMING Linear.

REFERENCES None.
DATA > RANDOM > MATRIX

PURPOSE Generate matrices where the cell values are drawn randomly from a variety of
possible distributions.

DESCRIPTION Generate a set of m×n matrices whose elements are random numbers drawn from
any of the following distributions: uniform, normal, binomial, Poisson, gamma
or exponential.
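
As an illustration of what this routine produces, the following Python sketch draws matrices from the same six distributions using only the standard library. It is not UCINET's implementation: the function and argument names are hypothetical, None is used here to stand for a missing diagonal value, and Python's uniform generator returns values in [0, 1) rather than [0, 1].

```python
import math
import random

def draw(dist, rng, p=0.5, trials=1, mean=0.0, sd=1.0, rate=1.0, k=1):
    """Draw one cell value from the named distribution."""
    if dist == "uniform":
        return rng.random()                       # real in [0, 1)
    if dist == "normal":
        return rng.gauss(mean, sd)
    if dist == "binomial":                        # successes in `trials` trials
        return sum(rng.random() < p for _ in range(trials))
    if dist == "poisson":                         # Knuth's multiplication method
        limit, count, product = math.exp(-rate), 0, rng.random()
        while product > limit:
            count += 1
            product *= rng.random()
        return count
    if dist == "gamma":                           # waiting time for the k-th event
        return rng.gammavariate(k, 1.0)
    if dist == "exponential":                     # waiting time for the 1st event
        return rng.expovariate(1.0)
    raise ValueError("unknown distribution: " + dist)

def random_matrices(rows=10, cols=10, levels=1, dist="uniform",
                    diagonal=True, seed=None, **params):
    """Return `levels` matrices of size rows x cols as nested lists."""
    rng = random.Random(seed)
    result = []
    for _ in range(levels):
        m = [[draw(dist, rng, **params) for _ in range(cols)]
             for _ in range(rows)]
        if not diagonal:                          # None marks a missing value
            for i in range(min(rows, cols)):
                m[i][i] = None
        result.append(m)
    return result
```

Calling random_matrices twice with the same seed returns identical matrices, which mirrors the behaviour described for the Generator Seed parameter.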

PARAMETERS
# of rows: (Default = 10).
The number of rows in the random matrix to be generated.

# of columns: (Default = 10).


The number of columns in the random matrix to be generated.

# of levels: (Default = 1).


The number of matrices to be generated. All matrices will be of the same
dimension.

Probability distribution: (Default = Uniform).


The underlying distribution from which the elements of the matrix are taken.

Choices are:

Uniform
Each cell value is taken from a [0,1] uniform distribution so that each cell value
is between 0 and 1. The mean is 0.5.

Normal
Each cell value is taken from a normal distribution.

Upon execution of the routine with this option a new window will appear with
the following parameters:

Mean of normal distribution (Default = 0.0)

Standard deviation of normal distribution (Default = 1.0).

Binomial
Each cell is filled with the number of times an event with probability p occurs in
n trials.

Upon execution of the routine with this option a window will appear with the
following parameters:

Event probability: (Default = 0.5)


This gives the probability p of success, i.e. the probability of an event occurring
during one trial.

# of trials (Default = 1).


This gives the desired number of repeated trials n. The mean is np.

Poisson
Each cell is filled with the number of times an event occurred in a unit interval of
time assuming a Poisson process.

Upon execution of the routine a window will appear with the following
parameter:

Average # of occurrences per time period (Default = 1.0).


This gives the mean of the distribution.

Gamma
Each cell is filled with the time taken for the kth occurrence of an event to occur
assuming the event follows a Poisson process with an average of one occurrence
per time period.

Upon execution of the routine a window will appear with the following
parameter:

Desired # of occurrences (Default = 1).


The number k of events which must occur. The value k=1 gives the exponential
distribution. The mean is k.

Exponential
Each cell is filled with the time taken for the 1st occurrence of an event to occur
assuming the event follows a Poisson process with an average of one occurrence
per time period. The mean is 1.

Include diagonal values: (Default = YES).


NO will give missing values on the main diagonal.

Generator Seed:
A seed for random number generator. Use of the same number will create
exactly the same 'random' matrix twice. Any value from 1 to 32000 is
permissible. The default is randomly generated.

Output dataset: (Default = 'Random').


Name of data file which will contain random matrix.

LOG FILE Generated random matrix. The cells of the random matrix will be of the
following type:

UNIFORM - real range [0,1].
NORMAL - real range (-∞,∞).
BINOMIAL - integer range [0,∞).
POISSON - integer range [0,∞).
GAMMA - real range (0,∞).
EXPONENTIAL - real range (0,∞).

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
DATA > RANDOM > SOCIOMETRIC

PURPOSE A random digraph is created in which edges are generated with the constraint
that each vertex has a user specified out-degree.

PARAMETERS

Number of nodes (Default = 10)


The size of digraph to be constructed.

Number of graphs (Default = 1)


This specifies the number of relations to be generated.

# of choices per actor (out-degree)


Specifies the out-degree for each actor. A single number will specify the same
out-degree for each actor. Alternatively, the degree of each actor can be specified
by a list. Each element of the list is separated by a space or comma. If the list is
shorter than the number of nodes then it is extended by repeating from the first element.
Values greater than the maximum out-degree are reduced to the maximum value.
The list can also be specified by a UCINET data file. This must be of the form:

<filename> ROW (or COLUMN) <number>

where filename is the name of the data file. The command ROW or COLUMN
followed by the appropriate number specifies which row or column of the dataset
is to be used.

Generate self loops (Default = No)


If NO, edges connecting a node to itself will not be allowed.

Random generator seed:


A seed for the random number generator. Use of the same number will create
exactly the same 'random' graph. Any value from 1 to 32000 is permissible. The
default is randomly generated.

OUTPUT dataset (Default = 'SociometricRandomGraph')


Name of file which contains generated digraph.

LOG FILE Table of specified out-degrees.


Randomly generated digraph which conforms to the specification.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
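
The behaviour documented for DATA > RANDOM > SOCIOMETRIC can be sketched in Python as follows. This is an illustration under stated assumptions, not the UCINET algorithm: the function names are hypothetical, and the choice of targets uses random.sample from the standard library.

```python
import random

def expand_degrees(spec, n, self_loops=False):
    """Repeat `spec` cyclically to length n and cap at the maximum out-degree."""
    max_deg = n if self_loops else n - 1
    return [min(spec[i % len(spec)], max_deg) for i in range(n)]

def sociometric_digraph(n=10, out_degrees=(3,), self_loops=False, seed=None):
    """Return an n x n 0/1 matrix whose row i has the requested out-degree."""
    rng = random.Random(seed)
    degrees = expand_degrees(list(out_degrees), n, self_loops)
    matrix = [[0] * n for _ in range(n)]
    for i, d in enumerate(degrees):
        # Nodes that actor i is allowed to choose.
        candidates = [j for j in range(n) if self_loops or j != i]
        for j in rng.sample(candidates, d):       # d distinct targets
            matrix[i][j] = 1
    return matrix
```

For example, a list of two degrees applied to five nodes is extended by repetition, so expand_degrees([1, 2], 5) gives [1, 2, 1, 2, 1].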
DATA > RANDOM > BERNOULLI

PURPOSE Generate a random network taken from a Bernoulli distribution.

DESCRIPTION A random network is created in which edges are generated independently from a
Bernoulli distribution.

A random number between 0 and 1 is generated for each cell in an adjacency
matrix. If this number is less than a user specified probability then an edge is
created. Users can specify a single probability for the whole matrix, or different
probabilities for each row, column or cell. The whole procedure can be repeated
for a number of trials to create an integer valued network.
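
The procedure just described can be sketched in a few lines of Python. The names below are hypothetical and the code is only an illustration of the stated rule (an edge is created when a uniform random number falls below the cell's probability); it is not taken from UCINET.

```python
import random

def cell_probabilities(n, matrix=None, row=None, column=None):
    """Build the full n x n probability matrix from one of the simpler options."""
    if matrix is not None:                        # one probability for all cells
        return [[matrix] * n for _ in range(n)]
    if row is not None:                           # one probability per row
        return [[row[i]] * n for i in range(n)]
    return [[column[j] for j in range(n)] for _ in range(n)]  # one per column

def bernoulli_network(prob, trials=1, self_loops=False, seed=None):
    """prob is an n x n matrix of tie probabilities; returns an n x n count matrix."""
    rng = random.Random(seed)
    n = len(prob)
    net = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j and not self_loops:
                continue                          # diagonal stays at zero
            # Number of successes in `trials` independent trials.
            net[i][j] = sum(rng.random() < prob[i][j] for _ in range(trials))
    return net
```

With trials=1 the result is a binary matrix; larger values give the number of successes per cell.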

PARAMETERS
Number of nodes (Default = 10)
The size of the graph to be constructed.

Number of graphs (Default = 1)


This specifies the number of relations to be generated.

Number of trials per cell (Default = 1)


The number of repeated trials per cell. A value of 1 will give a binary matrix.
Values greater than 1 will give entries which correspond to the number of
successes in the given number of trials.

What probabilities will you supply (Default = Matrix)


Choices are:

Matrix - in which a single probability is used for the entire matrix.

Row - a set of probabilities, one for each row is used.

Column - a set of probabilities, one for each column is used.

Cell - a complete matrix of probabilities one for each cell is prescribed.

Once an option has been selected the routine highlights parameters which are
dependent on the option selected.

MATRIX option:

Probability of a tie (Default = 0.5)


A single probability applicable to the whole matrix should be specified.

ROW option:

Row probabilities dataset:


Name of file which contains the dataset of row probabilities.

Probabilities are ROW or COLUMN of the dataset (Default = Column)


Row means that probabilities will be taken from a particular row of the dataset.
Column specifies a column.

Which row/column (Default = 1)


Specifies which row or column of the dataset is to be used.

COLUMN option:

Column probabilities dataset:


Name of file which contains the dataset of column probabilities.

Probabilities are Row or Column of the dataset (Default = Column)


Row means that probabilities will be taken from a particular row of the dataset.
Column specifies a column.

Which row/column (Default = 1)


Specifies which row or column of the dataset is to be used.

CELL option:

Cell probabilities dataset:


Name of file which contains the matrix of probabilities.

Generate self-loops (Default = No)


No means that nodes cannot be connected to themselves.
Yes means that self-loops may be generated.

Random generator seed:


A seed for the random number generator. Use of the same number will create
exactly the same 'random' graph twice. Any value from 1 to 32000 is
permissible. The default is randomly generated.

Output dataset (Default = 'RandomBernoul')


Name of file which will contain random graph.

LOG FILE Generated random graph.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
DATA > RANDOM > MULTINOMIAL
PURPOSE Generate random valued graphs in which the values are distributed by user
assigned probabilities.

DESCRIPTION The user specifies N, the total number of cases in the simulated "sample". The
algorithm randomly distributes the N cases into the cells of the adjacency
matrix. This distribution can either be uniform, in which case each cell has the
same probability of being assigned one of the cases, or the distribution can be
user specified. In this case the algorithm randomly assigns each case in
proportion to the cell probabilities. The probabilities can be specified by row,
column or individual cells. The result is a value for each directed arc in the
network.
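
A minimal Python sketch of this allocation is given below. It assumes the cases are assigned independently, each cell being chosen in proportion to its weight, and uses random.choices from the standard library; the function names are hypothetical and the code is not UCINET's own.

```python
import random

def row_by_column(row_p, col_p):
    """Cell weight = row probability times column probability."""
    return [[r * c for c in col_p] for r in row_p]

def multinomial_network(n, cases=None, weights=None, self_loops=False, seed=None):
    """Distribute `cases` units over the cells of an n x n matrix.

    weights is an optional n x n matrix of non-negative cell weights;
    None means every permitted cell is equally likely.
    """
    rng = random.Random(seed)
    if cases is None:
        cases = n * (n - 1)                       # default stated in the manual
    cells = [(i, j) for i in range(n) for j in range(n)
             if self_loops or i != j]
    w = [1.0 if weights is None else weights[i][j] for i, j in cells]
    net = [[0] * n for _ in range(n)]
    for i, j in rng.choices(cells, weights=w, k=cases):
        net[i][j] += 1                            # one more case lands in this cell
    return net
```

The cell values always sum to the total number of cases, whichever probability option is used.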

PARAMETERS
Number of nodes (Default = 10)
Number of nodes in each valued adjacency matrix to be created.

Number of graphs (Default = 1)


Number of random matrices to be created.

Total number of cases (sum of values)


Total number of values to be distributed across all cells in adjacency matrix.
Default is n(n-1) where n is the number of nodes.

What probabilities will you supply (Default = Matrix)


Choices are:

Matrix - a single probability is used for the entire matrix.

Row - a set of probabilities, one for each row is used.

Column - a set of probabilities, one for each column is used.

Row*Column - two sets of probabilities are prescribed, one for the rows and
one for the columns. The probability for each cell is the product of the
probabilities prescribed for its row and column.

Cell - a complete matrix of probabilities, one for each cell is prescribed.

Once an option has been selected the routine highlights parameters which are
dependent on the option selected.

Row option

Row probabilities dataset:


Name of file which contains probabilities for each row, it is assumed that the
required probabilities will be contained in a matrix.

Probabilities are Row or Column of this dataset: (Default = Column)


Specify Row or Column as required.

Which Row/Column (Default = 1)


Number of row or column required.

Column option

Column probabilities dataset:


Name of file which contains probabilities for each column, it is assumed that the
required probabilities will be contained in a matrix.

Probabilities are Row or Column of this dataset: (Default = Column)


Specify Row or Column as required.

Which Row/Column: (Default = 1)


Number of row or column required.

Row*Column option
Two datasets are provided: row probabilities as in the Row option and column
probabilities as in the Column option.

Cell option

Cell probabilities dataset:


Name of file which contains matrix of probabilities.

Generate self loops: (Default = No)


If NO then there will be no ties on the diagonal.

Random number seed:


UCINET generates a different random number as a default each time it is run.
Use of the same seed will result in the same 'random' graph. The range is 1 to
32000.

Output dataset (Default = 'MultinomialRandomGraph')


Name of file which will contain generated random network.

LOG FILE The log file contains a display of each random matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
DATA > IMPORT > DL

PURPOSE Convert text (i.e. ASCII) data files in DL format to UCINET format.

DESCRIPTION Imports ASCII files, that is, plain text files in DL format, into UCINET.
These files can be created externally or using the UCINET text editor. More
information is contained in the User's Guide or in the DL help.

PARAMETERS
Input dataset:
Name of DL type file containing data to be imported. Data type: ASCII or text.

Output data type: (Default = Real)


Choices are:

Byte - whole numbers in the range 0 to 255 inclusive.


Missing values are not allowed.

Smallint - whole numbers in the range -32000 to 32000.


Missing values are not allowed.

Real - real numbers in the range -1.E36 to 1.E36.


Missing values permissible.

Output dataset:
Name of UCINET data file, this will be set to the same name as the text file by
default.

LOG FILE UCINET data file.

TIMING O(N^2).

COMMENTS None
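
The ranges given for the three output data types can be checked with a short Python sketch. The function name is hypothetical, the data are assumed to be a flat list of numbers, and None is used here to stand for a missing value; none of this is UCINET code.

```python
def smallest_output_type(values):
    """Pick the narrowest of Byte, Smallint, Real that can hold `values`.

    `values` is a flat list of numbers; None stands for a missing value.
    """
    has_missing = any(v is None for v in values)
    present = [v for v in values if v is not None]
    whole = all(float(v).is_integer() for v in present)
    # Byte and Smallint hold whole numbers only and allow no missing values.
    if not has_missing and whole:
        if all(0 <= v <= 255 for v in present):
            return "Byte"
        if all(-32000 <= v <= 32000 for v in present):
            return "Smallint"
    if all(-1e36 <= v <= 1e36 for v in present):
        return "Real"
    raise ValueError("value outside the documented Real range")
```

A binary adjacency matrix with no missing values fits in Byte, while any dataset with missing values needs Real.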
DATA > IMPORT > PAJEK
PURPOSE Convert Pajek data files into UCINET format.

DESCRIPTION Imports Pajek files for use by UCINET. Both the network, in the form of an
adjacency matrix, and the co-ordinates of the nodes in the plot may be imported.

PARAMETERS
Input dataset:
Name of file containing data to be imported. Data type: ASCII file.

Output UCINET Network


Name of UCINET data file to contain the network details, default is the same
name as the input dataset.

Output Coordinate dataset


Name of UCINET data file to contain the coordinate details, default is the same
name as the input dataset with Crd added to the name.

LOG FILE A display of the UCINET data file.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
DATA > IMPORT > KRACKPLOT
PURPOSE Convert Krackplot data files into UCINET format.

DESCRIPTION Imports Krackplot files for use by UCINET. Both the network, in the form of an
adjacency matrix, and the co-ordinates of the nodes in the plot may be imported.

PARAMETERS
Input dataset:
Name of file containing data to be imported. Data type: ASCII file.

(Output) Network dataset


Name of UCINET data file to contain the network details, default is the same
name as the input dataset.

(Output) Coordinate dataset (Default = 'Kpcrd')


Name of UCINET data file to contain the coordinate details, default is the same
name as the input dataset with Crd added to the name.

LOG FILE A display of the UCINET data file.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
DATA > IMPORT > UCINET 3

PURPOSE Convert UCINET 3 data into UCINET for Windows format.

DESCRIPTION Imports UCINET 3 data into UCINET for Windows format. This format is the
same as UCINET IV.

PARAMETERS
Input dataset:
Name of UCINET 3 file to be imported.

Output data type: (Default = Real)


Choices are:

Byte - whole numbers in the range 0 to 255 inclusive.


Missing values are not allowed.

Smallint - whole numbers in the range -32000 to 32000.


Missing values are not allowed.

Real - real numbers in the range -1.E36 to 1.E36.


Missing values permissible.

Output dataset:
Name of UCINET data file, this will be set to the same name as the input file by
default.

LOG FILE UCINET data file.

TIMING O(N^2).

COMMENTS None
DATA > IMPORT > RAW

PURPOSE Convert a text file (that is, an ASCII file) containing a matrix into UCINET for
Windows format.

DESCRIPTION Imports a text file (that is, an ASCII file) containing a matrix into UCINET for
Windows format. The data file must be pure text with spaces, commas or carriage
returns between the values.
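
Reading such a file can be sketched in Python as below. The function name is hypothetical, and the sketch assumes the values are listed row by row, which the manual does not state explicitly.

```python
import re

def parse_raw(text, n_rows, n_cols):
    """Split on spaces, commas and line breaks, then reshape row by row."""
    tokens = [t for t in re.split(r"[\s,]+", text) if t]
    if len(tokens) != n_rows * n_cols:
        raise ValueError("expected %d values, found %d"
                         % (n_rows * n_cols, len(tokens)))
    values = [float(t) for t in tokens]
    # Cut the flat list into n_rows consecutive slices of n_cols values each.
    return [values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
```

Because the separators carry no layout information, the number of rows and columns has to be supplied, as in the PARAMETERS of this routine.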

PARAMETERS
Input dataset:
Name of text file to be imported.

# of columns
The number of columns in the data matrix.

# of rows
The number of rows in the data matrix

Output data type: (Default = Real)


Choices are:

Byte - whole numbers in the range 0 to 255 inclusive.


Missing values are not allowed.

Smallint - whole numbers in the range -32000 to 32000.


Missing values are not allowed.

Real - real numbers in the range -1.E36 to 1.E36.


Missing values permissible.

Output dataset:
Name of UCINET data file, this will be set to the same name as the input file by
default.

LOG FILE UCINET data file.

TIMING O(N^2).
DATA > IMPORT > EXCEL

PURPOSE Convert EXCEL files (4.0 or 5.0/95) into UCINET format.

DESCRIPTION Imports simple EXCEL files (4.0 or 5.0/95) into UCINET format. Note that the
spreadsheet must have no extras such as shading or borders.

PARAMETERS
Input dataset:
Name of EXCEL type file containing data to be imported.

Output dataset:
Name of UCINET data file, this will be set to the same name as the input file by
default.

LOG FILE UCINET data file.

TIMING O(N^2).

COMMENTS This routine is very sensitive, and many users find it easier to copy and paste
from their spreadsheet into the UCINET spreadsheet. The easiest way is to copy the data
only (i.e. not the labels) and paste it into the UCINET spreadsheet, having first blocked
an area of the same dimensions as the data you wish to import. To import the labels, save
them to a text file and use the label import feature in DESCRIBE.
DATA > IMPORT > NEGOPY

PURPOSE Convert text files formatted for the Negopy program into UCINET datasets.

DESCRIPTION Reads the .dat and .nam Negopy files and creates a UCINET dataset.

PARAMETERS

Input link file: <*.dat>


Name of file, such as TRADE71.DAT, containing ties among actors. Format of
the file looks like this:

(2I3,1F5.1,1f3.1)
19 23 156.7 26.2
19 28 162.3 28.9
...

The first line is a Fortran format statement, required by Negopy but ignored by
UCINET. You can just put a blank line if you like. The second line indicates a
tie from person 19 to person 23, of strength 156.7 on the first relation, and of
strength 26.2 on the second relation.

Input name file: <*.nam>


Name of file, such as TRADE71.NAM, containing labels of actors. Format
looks like this:

(1I2,1X,1A30)
01 Billy-Bob
02 Johnny
...

Number of relations: (Default = 1)


Number of relations contained in the input link file (i.e., the number of columns
of data after the two actor id numbers).

Output dataset: (Default = 'Imported')


Name of UCINET dataset to be created.

LOG FILE Data displayed in matrix form.

TIMING O(N^2).

COMMENTS Negopy is a program written by Bill Richards and Andy Seary.
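
The two Negopy file layouts shown above can be read with a short Python sketch. The function names are hypothetical, and the sketch assumes the fields are separated by whitespace as in the examples; files that rely strictly on the fixed column widths of the Fortran format statement would need a different reader.

```python
def parse_negopy_links(text, n_relations=1):
    """Return {(sender, receiver): [strength per relation]} from link-file text."""
    lines = text.splitlines()[1:]            # first line: Fortran format, skipped
    ties = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 + n_relations:
            continue                         # blank or incomplete line
        sender, receiver = int(fields[0]), int(fields[1])
        ties[(sender, receiver)] = [float(x) for x in fields[2:2 + n_relations]]
    return ties

def parse_negopy_names(text):
    """Return {actor id: label} from name-file text."""
    names = {}
    for line in text.splitlines()[1:]:       # first line: Fortran format, skipped
        parts = line.split(None, 1)          # id, then the rest of the line
        if len(parts) == 2:
            names[int(parts[0])] = parts[1].strip()
    return names
```

Applied to the sample link file with two relations, the first data line yields a tie from actor 19 to actor 23 with strengths 156.7 and 26.2.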


DATA > EXPORT > DL

PURPOSE Convert UCINET data files into DL format.

DESCRIPTION Converts UCINET data files into DL format. For a full description of the DL
format see the DL help.

PARAMETERS
Input dataset:
Name of file containing data to be exported. Data type: Matrix.

Output format : (Default = "Full matrix")


Choices are:

Full matrix
A complete N×N matrix.

Lowerhalf
Gives the lower-triangle and should only be used for symmetric matrices.

Upper half
Gives the upper-triangle and should only be used for symmetric matrices.

Nodelist1
This is used on binary matrices only. Each line of data consists of a row number
(call it i) followed by a list of column numbers (call each one j) such that x(i,j) =
1.

Nodelist1B
This is used on binary matrices only. Each line of data corresponds to a matrix
row (call it i). The first number on the line is the number of non-zero cells in that
row. This is followed by a list of column numbers (call each one j) such that x(i,j)
= 1. Note that rows must appear in numerical order, and none may be skipped
(unlike the Nodelist1 format).

Nodelist2
Each line begins with a row id number followed by a list of column id numbers
that are connected to that row number. For use in 2-mode matrices

Edgelist1
This format is used on data forming a matrix in which the rows and columns
refer to the same kinds of objects (e.g., an illness-by-illness proximity matrix, or
a person-by-person network). The 1-mode matrix X is built from pairs of indices
(a row and a column indicator). Pairs are typed one to a line, with indices
separated by spaces or commas. The presence of a pair i,j indicates that there is a
link from i to j, which is to say a non-zero value in x(i,j). Optionally, the pair may
be followed by a value representing an attribute of the link, such as its strength or
quality. If no value is present, it is assumed to be 1.0. If a pair is omitted
altogether, it is assigned a value of 0.0.

Edgelist2
This is used on data forming a matrix in which the rows and columns refer to
different kinds of objects (e.g., illnesses and treatments). The 2-mode matrix X is
built from pairs of indices (a row and a column indicator). Pairs are one to a line,
with indices separated by spaces. The presence of a pair i,j indicates that there is
a link from row i to column j, which is to say a non-zero value in x(i,j). If the pair
is followed by a value then this is the strength of the tie. If no value is present, it
is assumed to be 1.0. If a pair is omitted altogether, it is assigned a value of 0.0.

Diagonals present (Default = Present)


If Absent, diagonal values will not be written to file.

(edgelist only) Type


Specify whether the data is directed or undirected

Decimal places: (Default = 0)


The number of decimal places required. The default will correspond to the
number of decimal places in the original UCINET data file. A smaller value
will result in rounding to the nearest value. A value of 0 will indicate integer
values only.

Field width (Default = Freefield)


Freefield will simply place each row of a matrix on a new line with no attempt to
align the columns.
Automatic will align the rows and columns into a matrix format. The user can
also specify the number of spaces for each field - this number should be greater
than the number of decimal places in the field.

Guaranteed space (Default = Yes)


Yes separates each number in every row by a space. No prints each number in a
continuous list.

Page width (Default=10000):


The maximum width of the output page.

Embed row labels(Default=No):


Whether the row labels should be embedded in the output.

Embed column labels(Default=No):


Whether the column labels should be embedded in the output.

Embed matrix labels(Default=No):


Whether the matrix labels should be embedded in the output.

Output dataset:
Name of file to be created with .txt file extension.

LOG FILE A text DL data file of type specified.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
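
As an illustration of the Edgelist1 format described above, the following Python sketch rebuilds a 1-mode matrix from edgelist lines. It is not UCINET code: the function name edgelist1_to_matrix and the sample data are hypothetical, and the sketch assumes 1-based integer node numbers and a known number of nodes n.

```python
import re

def edgelist1_to_matrix(lines, n):
    """Build an n-by-n matrix from 'i j [value]' lines (1-based indices)."""
    x = [[0.0] * n for _ in range(n)]        # omitted pairs keep the value 0.0
    for line in lines:
        # Indices may be separated by spaces or commas.
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if not fields:
            continue                          # skip blank lines
        i, j = int(fields[0]), int(fields[1])
        # A missing value is assumed to be 1.0.
        value = float(fields[2]) if len(fields) > 2 else 1.0
        x[i - 1][j - 1] = value
    return x

matrix = edgelist1_to_matrix(["1 2", "2,3,2.5", "3 1"], n=3)
for row in matrix:
    print(row)
```

The same approach would apply to Edgelist2 if separate row and column counts were used in place of the single n.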
DATA>EXPORT >KRACKPLOT
PURPOSE Convert UCINET data files into Krackplot format.

DESCRIPTION Converts UCINET data files including co-ordinate and attribute files into
Krackplot format.

PARAMETERS
(Input) Network dataset:
Name of file containing data to be exported. Data type: Matrix.

(Input) Co-ordinate dataset


Name of file containing co-ordinates of points for the layout of the data. These
are as in the co-ordinate output of MDS. If there are no co-ordinates then this can
be left blank.

Node attributes (if any)


Name of file containing actor attributes, given as a vector of shared attributes so
that (1,2,3,1,2,2) means that actors 1 and 4 share the same attribute, actors
2, 5 and 6 share the same attribute, and actor 3 has a different attribute from all
the others.

Output data file:


Name of file to be created.

LOG FILE Krackplot data file.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
DATA>EXPORT >MAGE
PURPOSE Convert UCINET data files into Mage format.

DESCRIPTION Converts UCINET data files including co-ordinate files and attribute files into
Mage format for 3D visualization.

PARAMETERS
(Input) Network dataset:
Name of file containing network data to be exported. Data type: Digraph

(Input) Co-ordinate dataset


Name of file containing co-ordinates of points for the layout of the data. These
are as in the co-ordinate output of MDS. If there are no co-ordinates then this can
be left blank.

Node attributes (if any)


Name of file containing actor attributes, given as a vector of shared attributes so
that (1,2,3,1,2,2) means that actors 1 and 4 share the same attribute, actors
2, 5 and 6 share the same attribute, and actor 3 has a different attribute from all
the others. These attributes can be used in Mage to color the nodes according to
the attribute.

Ball Size (Default = 0.15)


Radius of the nodes in the image. A value of zero eliminates nodes; typical
values are from 0.05 to 0.5.

Line thickness (Default = 2)


A number from 1 to 5 which specifies the thickness of the lines.

Arrow Size (Default = 0.25)


Size of arrow heads; typical values are from 0.05 to 0.5.

Arrow Angle (Default = 20)


The angle that the arrow makes with the edge in degrees.

Font Size (Default = 20)


Size of the font used on the image to display the node labels

Output File
Name of file to be created, normally the file extension should be .kin.

Launch Mage on Exit (Default = 'Yes')


If Yes, the exported file is immediately displayed in Mage.

LOG FILE Mage data file.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
DATA>EXPORT > PAJEK > NETWORK
PURPOSE Convert UCINET graph or digraph files into Pajek format together with any
categorical attribute files.

DESCRIPTION Converts UCINET data files into Pajek format. The conversion can dichotomize
valued data during the export and can also export associated categorical
attribute files together with co-ordinate files. Isolated vertices can be
deleted automatically if required.

PARAMETERS
(Input) Network dataset:
Name of file containing network data to be exported. Data type: Valued digraph.

Dichotomize vals > than:


For valued data, the cut-off value used to convert the data to a binary matrix.
Leave blank for binary data.

Delete isolates? (Default = 'No')


If Yes, isolated vertices are not included in the exported file.

(Input) Co-ordinate dataset


Name of file containing co-ordinates of points for the layout of the data. These
are as in the co-ordinate output of MDS. If there are no co-ordinates then this can
be left blank.

(Input) Attribute dataset


Name of file containing categorical actor attributes, given as a vector of shared
attributes so that (1,2,3,1,2,2) means that actors 1 and 4 share the same attribute,
actors 2, 5 and 6 share the same attribute, and actor 3 has a different attribute from
all the others. If there is more than one attribute, these can be combined into an
attribute matrix with the rows representing the actors and each column
corresponding to a different attribute.

Output Attribute file:


Name of Pajek attribute file to be created. If there is more than one attribute then
one file will be created for each attribute with the same file name but with the
column number added as the last character in the name. Pajek categorical
attribute files have the file extension .clu.

Output Network file:


Name of Pajek file containing the adjacency matrix of the network, the file has
.net as an extension.

Launch Pajek on exit?


If yes then Pajek is launched on exit.

LOG FILE Pajek .net data file.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
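
The two preprocessing options described above, dichotomizing at a cut-off and deleting isolates, can be sketched in Python as follows. This is only an illustration of the idea, not the routine UCINET uses: the function names are hypothetical, and the sketch assumes that an isolate is a vertex with no incoming or outgoing ties once the diagonal is ignored.

```python
def dichotomize(matrix, cutoff):
    """Return a binary matrix with 1 wherever the value is greater than cutoff."""
    return [[1 if v > cutoff else 0 for v in row] for row in matrix]

def drop_isolates(matrix):
    """Remove vertices with no ties; returns (reduced matrix, kept 0-based indices)."""
    n = len(matrix)
    keep = [
        k for k in range(n)
        if any(matrix[k][j] for j in range(n) if j != k)     # outgoing ties
        or any(matrix[i][k] for i in range(n) if i != k)     # incoming ties
    ]
    return [[matrix[i][j] for j in keep] for i in keep], keep

valued = [
    [0, 3, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
binary = dichotomize(valued, cutoff=1)
reduced, kept = drop_isolates(binary)
print(reduced)
print(kept)
```

In this example the fourth vertex becomes an isolate only after dichotomizing, because its single tie has the value 1, which is not greater than the cut-off.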
DATA>EXPORT > PAJEK > CATEGORICAL ATTRIBUTE
PURPOSE Convert UCINET categorical attribute files into a Pajek file.

DESCRIPTION Converts UCINET categorical attribute files into Pajek format, i.e. Pajek clu files.
The conversion can take a matrix of attributes and create a set of Pajek clu files,
one for each column of the matrix. These files can be used in Pajek to color the
nodes according to a particular attribute.

PARAMETERS

(Input) Attribute dataset


Name of file containing categorical actor attributes, given as a vector of shared
attributes so that (1,2,3,1,2,2) means that actors 1 and 4 share the same attribute,
actors 2, 5 and 6 share the same attribute, and actor 3 has a different attribute from
all the others. If there is more than one attribute, these can be combined into an
attribute matrix with the rows representing the actors and each column
corresponding to a different attribute.

Output file(s) prefix:


Name of Pajek attribute file to be created. If there is more than one attribute then
one file will be created for each attribute with the same file name but with the
column number added as the last character in the name. Pajek categorical
attribute files have the file extension .clu.

LOG FILE Lists the Pajek clu files created

TIMING O(N)

COMMENTS None.

REFERENCES None.
DATA>EXPORT > PAJEK > QUANTITATIVE ATTRIBUTE

PURPOSE Convert UCINET quantitative attribute files into a Pajek file.

DESCRIPTION Converts UCINET quantitative attribute files into Pajek format, i.e. Pajek vec files.
The conversion can take a matrix of attributes and create a set of Pajek vec files,
one for each column of the matrix. These files can be used in Pajek to change the
sizes of the nodes according to a particular attribute.

PARAMETERS (Input) Attribute dataset


Name of file containing quantitative actor attributes. These may be attributes of the
actors, e.g. age, or possibly network attributes, e.g. centrality. If there is more than
one attribute, these can be combined into an attribute matrix with the rows
representing the actors and each column corresponding to a different attribute.

Output file(s) prefix:


Name of Pajek attribute file to be created. If there is more than one attribute then
one file will be created for each attribute with the same file name but with the
column number added as the last character in the name. Pajek quantitative
attribute files have the file extension .vec.

LOG FILE Lists the Pajek vec files created

TIMING O(N)

COMMENTS None.

REFERENCES None.
DATA>EXPORT > METIS
PURPOSE Convert UCINET network files into Metis files.

DESCRIPTION Converts symmetric UCINET data files, either binary or valued, into data
files for the Metis partitioning software.
PARAMETERS

Input dataset
Name of UCINET data file containing network. Data Type: Valued symmetric
graph

Type of Data
Choices are Binary or Valued.

Output Dataset
Name of Metis file to be created, note there are no prescribed file extensions.

LOG FILE Metis file created

TIMING O(N)

COMMENTS None.

REFERENCES None.
DATA > EXPORT>RAW
PURPOSE Convert UCINET data files into raw format.

DESCRIPTION Converts UCINET data files into raw format. This is the same as the DL format
but without the headers. For full information on the DL formats, see the DL help topic.

PARAMETERS
Input dataset:
Name of file containing data to be exported. Data type: Matrix.

Output format : (Default = "Full matrix")


Choices are:

Full matrix
A complete N×N matrix;

Lowerhalf
Gives the lower-triangle and should only be used for symmetric matrices.

Upper half
Gives the upper-triangle and should only be used for symmetric matrices.

Nodelist1
This is used on binary matrices only. Each line of data consists of a row number
(call it i) followed by a list of column numbers (call each one j) such that x(i,j) =
1.

Nodelist1B
This is used on binary matrices only. Each line of data corresponds to a matrix
row (call it i). The first number on the line is the number of non-zero cells in that
row. This is followed by a list of column numbers (call each one j) such that x(i,j)
= 1. Note that rows must appear in numerical order, and none may be skipped
(unlike the Nodelist1 format).

Nodelist2
Each line begins with a row id number followed by a list of column id numbers
that are connected to that row number. For use with 2-mode matrices.

Edgelist1
This format is used on data forming a matrix in which the rows and columns
refer to the same kinds of objects (e.g., an illness-by-illness proximity matrix, or
a person-by-person network). The 1-mode matrix X is built from pairs of indices
(a row and a column indicator). Pairs are typed one to a line, with indices
separated by spaces or commas. The presence of a pair i,j indicates that there is a
link from i to j, which is to say a non-zero value in x(i,j). Optionally, the pair may
be followed by a value representing an attribute of the link, such as its strength or
quality. If no value is present, it is assumed to be 1.0. If a pair is omitted
altogether, it is assigned a value of 0.0.

Edgelist2
This is used on data forming a matrix in which the rows and columns refer to
different kinds of objects (e.g., illnesses and treatments). The 2-mode matrix X is
built from pairs of indices (a row and a column indicator). Pairs are one to a line,
with indices separated by spaces. The presence of a pair i,j indicates that there is
a link from row i to column j, which is to say a non-zero value in x(i,j). If the pair
is followed by a value then this is the strength of the tie. If no value is present, it
is assumed to be 1.0. If a pair is omitted altogether, it is assigned a value of 0.0.

Diagonals present (Default = Present)


If Absent, diagonal values will not be written to the file.

(edgelist only) Type


Specify whether the data is directed or undirected

Decimal places: (Default = 0)


The number of decimal places required. The default corresponds to the
number of decimal places in the original UCINET data file. A smaller value
results in rounding to the nearest value. A value of 0 gives integer
values only.

Field width (Default = Freefield)


Freefield will simply place each row of a matrix on a new line with no attempt to
align the columns.
Automatic will align the rows and columns into a matrix format. The user can
also specify the number of spaces for each field - this number should be greater
than the number of decimal places in the field.

Guaranteed space (Default = Yes)


Yes separates each number in every row by a space. No prints each number in a
continuous list.

Page width (Default=10000):


The maximum width of the output page.

Embed row labels(Default=No):


Whether the row labels should be embedded in the output.

Embed column labels(Default=No):


Whether the column labels should be embedded in the output.

Embed matrix labels(Default=No):


Whether the matrix labels should be embedded in the output.

Output dataset:
Name of file to be created with txt file extension.

LOG FILE A text data file of type specified.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
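
The difference between the Nodelist1 and Nodelist1B formats described above can be seen by writing the same binary matrix both ways. The Python sketch below is illustrative only; the function names are hypothetical, and whether UCINET itself omits rows without ties when writing Nodelist1 is an assumption (the format merely permits rows to be skipped).

```python
def to_nodelist1(matrix):
    """Row number followed by the column numbers j with x(i,j) = 1."""
    lines = []
    for i, row in enumerate(matrix, start=1):
        cols = [j for j, v in enumerate(row, start=1) if v == 1]
        if cols:                               # assumption: rows without ties are skipped
            lines.append(" ".join(map(str, [i] + cols)))
    return lines

def to_nodelist1b(matrix):
    """One line per row, in order: count of non-zero cells, then the columns."""
    lines = []
    for row in matrix:
        cols = [j for j, v in enumerate(row, start=1) if v == 1]
        lines.append(" ".join(map(str, [len(cols)] + cols)))
    return lines

binary = [
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
print(to_nodelist1(binary))    # ['1 2 3', '3 1']
print(to_nodelist1b(binary))   # ['2 2 3', '0', '1 1']
```

Note that Nodelist1B keeps a line for the empty second row, because its rows are identified by position rather than by an explicit row number.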
DATA > EXPORT > UCINET 3.0
PURPOSE Convert UCINET data files into Ucinet 3.0 format.

DESCRIPTION Converts UCINET data files into Ucinet 3.0 format.

PARAMETERS
Input dataset:
Name of file containing data to be exported. Data type: Matrix.

Output format:
Choices are:

Lower triangular matrix


Symmetric square matrix
Non-symmetric square matrix
Rectangular square matrix
Stacked square matrices
Stacked triangular matrices

Output data type:


Choices are:

Binary
Non-Binary

Decimal places:
The number of decimal places to include.

Output data file:


Name of file to be created.

LOG FILE Ucinet 3.0 data file.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
DATA>EXPORT>EXCEL
PURPOSE Export a UCINET dataset to Excel format.

DESCRIPTION Creates an Excel spreadsheet file in either Excel 4 or Excel 5 format.

PARAMETERS
Input dataset:
Name of dataset to be converted. Data type: any UCINET file.

Which version of Excel:


Choices are:

Excel 5 and 7
Excel 4

LOG FILE None

TIMING N/A

COMMENTS None

REFERENCES None
DATA >ATTRIBUTE
PURPOSE Create a network from attribute data.

DESCRIPTION Converts a vector of valued attributes to a matrix based upon exact
matches, differences, absolute differences, squared differences, products or sums
of the values.

PARAMETERS
Dataset containing attribute vector:
Name of data file containing vector of valued attributes. This vector must be a
row or column of a matrix; it can be the only row or column. Data type: Matrix

Vector is Row or Column?:


Choose either row or column

Which Row/Col (Default = 1)


The number of the row or column that contains the attributes to be converted.

Method: (Default = Absolute Difference).


Choices are:

Exact Matches
Matrix X is formed by X(i,j) = 1 if vector(i) = vector(j) and 0 otherwise.

Difference
Matrix X is formed by X(i,j) = vector(i) - vector(j).

Absolute Difference
Matrix X is formed by X(i,j) = ABS (Vector(i) - vector(j)).

Squared Difference
Matrix X is formed by X(i,j) = (vector(i) - vector(j))^2.

Product
Matrix X is formed by X(i,j) = vector(i) * vector(j).

Sum
Matrix X is formed by X(i,j) = vector(i) + vector(j).

Output dataset:
Name of file which contains constructed matrix.

LOG FILE Constructed matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
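
The six methods listed above all apply one pairwise function to every pair of vector entries. A minimal Python sketch, in which the function name attribute_to_matrix and the method keys are hypothetical labels rather than UCINET identifiers:

```python
import operator

# One pairwise function per method described above.
METHODS = {
    "exact": lambda a, b: 1 if a == b else 0,
    "difference": operator.sub,
    "absdiff": lambda a, b: abs(a - b),
    "sqdiff": lambda a, b: (a - b) ** 2,
    "product": operator.mul,
    "sum": operator.add,
}

def attribute_to_matrix(vector, method="absdiff"):
    """Form X with X(i,j) = f(vector(i), vector(j)) for the chosen method."""
    f = METHODS[method]
    return [[f(vi, vj) for vj in vector] for vi in vector]

ages = [20, 25, 20]
print(attribute_to_matrix(ages))            # [[0, 5, 0], [5, 0, 5], [0, 5, 0]]
print(attribute_to_matrix(ages, "exact"))   # [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
```

Only the Difference method produces a non-symmetric matrix, since vector(i) - vector(j) changes sign when i and j are swapped.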
DATA >AFFILIATIONS
PURPOSE Create a network from affiliation data.

DESCRIPTION Converts an m×n matrix to an m×m or n×n matrix by forming AA' or
A'A. Given an incidence matrix A where the rows represent
actors and the columns events, the matrix AA' gives the
number of events that each pair of actors attended together.
Hence AA'(i,j) is the number of events attended by both actor i
and actor j. The matrix A'A gives the number of actors who
attended each pair of events. Hence A'A(i,j) is the
number of actors who attended both event i and event j.

PARAMETERS
Input dataset:
Name of file containing 2-mode dataset. Data type: Matrix

Which mode: (Default = Row).


Choices are:

Row
Represents row by row matrix of overlaps, i.e. forms AA'

Column
Represents column by column matrix of overlaps, i.e. forms A'A.

Output dataset: (Default = 'Affiliations').


Name of file which contains new matrix.

LOG FILE New matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
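
The two products can be checked by hand on a small example. The Python sketch below computes AA' or A'A for a binary actor-by-event matrix; the function name overlaps and the sample data are hypothetical.

```python
def overlaps(a, mode="row"):
    """Return AA' (mode='row') or A'A (mode='column') for incidence matrix a."""
    if mode == "column":
        a = [list(col) for col in zip(*a)]     # transpose so that rows are events
    # Entry (i,j) is the inner product of rows i and j.
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in a] for r1 in a]

# Rows are actors, columns are events.
A = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]
print(overlaps(A, "row"))      # [[2, 1, 2], [1, 2, 2], [2, 2, 3]]
print(overlaps(A, "column"))   # [[3, 2, 2], [2, 2, 1], [2, 1, 2]]
```

The diagonal of AA' gives the number of events each actor attended, and the diagonal of A'A gives the number of actors at each event.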
DATA>CSS
PURPOSE Combines a number of different relations or cognitive "slices" of the same
network into a single pooled network. These may either be a number of views of
the whole network or the view of the whole network through all ego centered
networks.

DESCRIPTION The input is a set of k adjacency matrices, each of the form A(i,j) stacked into a
three-dimensional matrix, A(i,j,k). This form is useful for cognitive social
structures, where k refers to the perceiver of a relation from i to j. This routine
compresses this 3-D matrix into a two-dimensional matrix, A'(i,j) using one of
two methods. One is to compute the element-wise sum over the k matrices:
A'(i,j) = SUM over k of A(i,j,k). This matrix can be dichotomized around a
threshold to produce a "consensus" structure.

Alternatively, one can produce a "locally aggregated structure" (LAS) by setting
A'(i,j) = A(i,j,i)+A(i,j,j). In other words, the value of a given cell in the
aggregate matrix is a function only of the perceptions of the two individuals
involved, not the whole group. This matrix can also be dichotomized.

PARAMETERS
Input dataset:
Name of file containing any set of matrices representing the same network. Data
type: Valued graph. Multirelational.

Method of Pooling graphs (Default = Slice)


Choices are:

Slice. Take an individual's view of the network. This simply extracts a single
matrix from the structure.

Row LAS. Construct a matrix which uses each respondent's row as a row in the
data matrix. The result is that each row of the data corresponds to the
respondent's perception of that row.

Column LAS. Construct a matrix which uses each respondent's column as a
column in the data matrix. The result is that each column of the data corresponds
to the respondent's perception of that column.

Intersection LAS. Construct a matrix with a connection between i and j if both i
and j agree that such a connection exists.

Union LAS. Construct a matrix with a connection between i and j if either i or j
states that such a connection exists.

Median LAS. Construct a matrix with values A(i,j) which are the median of i's
value of the i,j connection and j's value of the connection.

Consensus. The consensus takes the sum of all the respondents' matrices and then
dichotomises the sum.

Average. The average of all the respondents' views of the network.

If the user chooses either Slice or Consensus then the corresponding parameter
below will be highlighted.

(For Slice Method) Which informants slice? (Default = 1)


Number of the actor to be the informant.

(For consensus method) Threshold value (Default =0.5)


Threshold value for dichotomising the aggregated matrix.

Output dataset (Default = 'Pooled')


Output file that will contain pooled graph .

LOG FILE Pooled graph adjacency matrix.

TIMING O(N^2)

COMMENTS None.

REFERENCES Krackhardt D. (1987). 'Cognitive social structures'. Social Networks 9, 104-134.
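
Some of the pooling methods described above can be sketched in Python as follows. This is an illustration under stated assumptions, not UCINET's implementation: the stack is stored as a[k][i][j] (perceiver k's view of the tie from i to j), the data are binary, perceiver k is the same person as actor k, and all function names are hypothetical.

```python
def pooled_sum(a):
    """A'(i,j) = sum over perceivers k of A(i,j,k)."""
    n = len(a[0])
    return [[sum(view[i][j] for view in a) for j in range(n)] for i in range(n)]

def row_las(a):
    """Row i is taken from actor i's own view: A'(i,j) = A(i,j,i)."""
    n = len(a)
    return [[a[i][i][j] for j in range(n)] for i in range(n)]

def intersection_las(a):
    """Tie from i to j only if both i and j report it."""
    n = len(a)
    return [[min(a[i][i][j], a[j][i][j]) for j in range(n)] for i in range(n)]

def union_las(a):
    """Tie from i to j if either i or j reports it."""
    n = len(a)
    return [[max(a[i][i][j], a[j][i][j]) for j in range(n)] for i in range(n)]

views = [
    [[0, 1], [1, 0]],   # actor 1's view of the network
    [[0, 1], [0, 0]],   # actor 2's view of the network
]
print(pooled_sum(views))         # [[0, 2], [1, 0]]
print(row_las(views))            # [[0, 1], [0, 0]]
print(intersection_las(views))   # [[0, 1], [0, 0]]
print(union_las(views))          # [[0, 1], [1, 0]]
```

Here the two actors agree on the tie from 1 to 2 but disagree on the tie from 2 to 1, so that tie appears in the union but not in the intersection.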


DATA>DISPLAY
PURPOSE Display UCINET datasets on the screen.

DESCRIPTION Allows display of all or part of any UCINET dataset.

PARAMETERS
Data Set Filename
Name of file to be displayed. Data type: Matrix.

Width of Fields (Default = Min)


The width of field gives the size allocated for the width of each cell. The default
value will display the number in each cell separated by a single space.

# of decimals (Default = Min)


Defines the number of places of decimals to be displayed. The default will give
the number of the original data up to a maximum of 2 places of decimals.

Print zeros as (Default = 0)


Enter blank to suppress zeros.

Scale factor (Default= 1)


Scales up entries by multiplying them by the scale factor. Useful for seeing
small numbers. Note the data is left unchanged.

Which rows (Default = All)


Rows are displayed in the order specified by a row list. Each
element of the list is separated by a comma or space. The keywords, TO, FIRST
and LAST are permissible. Hence 3, 7 TO 9, FIRST 2 will display rows 3, 7, 8,
9, 1 and 2 in that order.

Which cols (Default = All)


Columns are displayed in the order specified by a column list, in
the same way as the rows above.

Row blocking (if any): (Default = None)


To partition the rows of the displayed matrix into blocks, specify a blocking
vector by giving the dataset name, a dimension and an integer value. For
example, to use the second row of a dataset called ATTRIB, enter "ATTRIB
ROW 2". The program will then read the second row of ATTRIB and use that
information to sort the rows of the matrix. All rows with identical values on the
criterion vector (i.e. the second row of attrib) will be placed in the same block of
the matrix.

Column blocking (if any): (Default = None)


To partition the columns of the displayed matrix into blocks, specify a blocking
vector by giving the dataset name, a dimension and an integer value. For
example, to use the second row of a dataset called ATTRIB, enter "ATTRIB
ROW 2". The program will then read the second row of ATTRIB and use that
information to sort the columns of the matrix. All columns with identical values
on the criterion vector (i.e. the second row of attrib) will be placed in the same
block of the matrix.
LOG FILE Display of UCINET dataset, or part of dataset as prescribed.

TIMING Linear.

COMMENTS 'Width of Field' should be greater than the number of decimal places. If this is
not the case, data is still displayed but with no spaces between cells, causing the
labels to be incorrectly aligned.

REFERENCES None.
DATA>DESCRIBE
PURPOSE Gives a description of a UCINET dataset and allows the user to import, enter or
edit the labels

DESCRIPTION Displays the information contained in the UCINET header file. This includes the
data type, number of dimensions, size of matrix, title and labels. The labels can be
edited, entered or imported. To edit an existing label, double click on the
label and perform the edit. The edits will only be kept if the file is saved using
the 'save as' button. To type in a new set of labels, change the label flag from false
to true and double click in the label box. Proceed as for an edit, remembering to save
the file when you have finished. You can import labels saved in ASCII by
clicking on the import button and then entering the appropriate file name.

PARAMETERS None

LOG FILE None

TIMING Linear.

COMMENTS None.

REFERENCES None.
DATA>EXTRACT
PURPOSE To extract parts of a dataset from a UCINET dataset.

DESCRIPTION Extracts rows, columns or matrices, specified by lists, from UCINET IV
datasets.

PARAMETERS
Input dataset:
Name of file from which data is to be extracted. Data type: matrix.

Are you going to Keep or Delete (Default = Keep)


User can either specify which rows, columns or matrices form the new dataset or
which rows, columns or matrices will be deleted to form the new dataset.

Which rows (Default = All (None))


Rows to be kept or dropped are specified by a list. Each row number is listed
separated by a comma or space. The keywords TO, FIRST and LAST are
permissible. Hence FIRST 3, 5 TO 7, 10, 12 would give row numbers 1, 2, 3, 5,
6, 7, 10 and 12. ALL gives all possible rows, NONE gives no rows. Lists kept in
a UCINET dataset can be used. Enter the filename followed by ROW (or
COLUMN) and a number to specify which row or column of the file to use. The
list must be specified using a binary vector where a 1 in position k indicates that
vertex k is a member of the list, a zero indicates that k is not a member.

Which columns (Default = All (None))


Same as above but for columns.

Which matrices(Default = All (None))


In multirelational data, matrices from different levels can be selected using the
same list format as above.

Output dataset: (Default = 'Extract')


Name of UCINET dataset that will contain edited data.

LOG FILE Newly created dataset with labeled rows and columns.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
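
The list syntax used above (numbers separated by commas or spaces, with the keywords TO, FIRST, LAST, ALL and NONE) can be expanded mechanically. The following Python sketch shows one way to do so; the function name parse_list is hypothetical, and the argument n, the number of rows, columns or matrices in the dataset, is needed to resolve LAST and ALL. It does not handle lists read from a UCINET dataset.

```python
import re

def parse_list(spec, n):
    """Expand a list such as 'FIRST 3, 5 TO 7, 10' into 1-based numbers."""
    tokens = [t for t in re.split(r"[,\s]+", spec.strip().upper()) if t]
    result = []
    k = 0
    while k < len(tokens):
        tok = tokens[k]
        if tok == "ALL":
            result.extend(range(1, n + 1))
            k += 1
        elif tok == "NONE":
            k += 1
        elif tok == "FIRST":                   # FIRST m -> 1 .. m
            result.extend(range(1, int(tokens[k + 1]) + 1))
            k += 2
        elif tok == "LAST":                    # LAST m -> n-m+1 .. n
            count = int(tokens[k + 1])
            result.extend(range(n - count + 1, n + 1))
            k += 2
        elif k + 2 < len(tokens) and tokens[k + 1] == "TO":
            result.extend(range(int(tok), int(tokens[k + 2]) + 1))
            k += 3
        else:
            result.append(int(tok))
            k += 1
    return result

print(parse_list("FIRST 3, 5 TO 7, 10, 12", n=12))
# [1, 2, 3, 5, 6, 7, 10, 12]
```

The order of the items is preserved, which matters for routines that use the list as a display order or permutation rather than as a selection.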
DATA>EGONET
PURPOSE Construct an ego centered network from the whole network

DESCRIPTION The neighborhood of an actor is the set of actors they are connected to together
with the actors that are connected to them. An ego centered network is the
subgraph induced by the set of neighbors, that is, the network that consists of all
the neighbors and the connections between them. The idea of an ego network
can be extended to a group of actors, in which case the neighborhood is simply the
union of the neighborhoods of the group. This procedure returns the adjacency matrix
of the ego network and provides an option to include or exclude ego(s) from the
network.

PARAMETERS Input Dataset


Name of file containing the network from which the egonet is to be constructed.

Focal Nodes
The node or nodes on whom the neighborhood will be built. Nodes are specified
by a list. Each node is listed separated by a comma or space. The keywords TO,
FIRST and LAST are permissible. Hence FIRST 3, 5 TO 7, 10, 12 would give
nodes 1, 2, 3, 5, 6, 7, 10 and 12. Lists kept in a UCINET dataset can be used.
Enter the filename followed by ROW (or COLUMN) and a number to specify
which row or column of the file to use. The list must be specified using a binary
vector where a 1 in position k indicates that vertex k is a member of the list, a
zero indicates that k is not a member.

Include focal? (Default = 'Yes')


Whether to include the focal nodes in the network or not.

Output dataset (Default = 'Neighborhood')


Name of file containing adjacency matrix of the ego network.

LOG FILE Ego network adjacency matrix.

TIMING O(N)

COMMENTS None

REFERENCES None
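
The construction described above can be sketched in Python as follows. The sketch is illustrative only: the function name ego_network is hypothetical, nodes are numbered from 0 rather than from 1, and any non-zero cell is treated as a tie.

```python
def ego_network(matrix, focal, include_focal=True):
    """Return (submatrix, nodes) for the ego network of the 0-based focal nodes."""
    n = len(matrix)
    focal = set(focal)
    # Neighbors: anyone a focal node sends a tie to or receives a tie from.
    neighbors = {
        j for i in focal for j in range(n)
        if j != i and (matrix[i][j] or matrix[j][i])
    }
    nodes = neighbors | focal if include_focal else neighbors - focal
    nodes = sorted(nodes)
    # Induced subgraph: keep only rows and columns of the selected nodes.
    return [[matrix[i][j] for j in nodes] for i in nodes], nodes

network = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(ego_network(network, [0]))
# ([[0, 1, 0], [0, 0, 1], [1, 0, 0]], [0, 1, 2])
print(ego_network(network, [0], include_focal=False))
# ([[0, 1], [0, 0]], [1, 2])
```

Passing several focal nodes gives the union of their neighborhoods, as in the group case described above.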
DATA > UNPACK

PURPOSE To unpack matrices from a UCINET dataset.

DESCRIPTION Unpacks some or all matrices from a UCINET multirelational dataset. This
routine is similar to extract for matrices except it places each extracted matrix as
a single UCINET dataset. Hence extracting n matrices results in n different
single datasets.

PARAMETERS
Input dataset:
Name of file from which data is to be unpacked. Data type: matrix
multirelational.

Which relations to unpack (Default = ALL)


List of relations to unpack. Each matrix number is listed separated by a comma
or space. The keywords TO, FIRST and LAST are permissible. Hence FIRST 3,
5 TO 7, 10, 12 would give matrix numbers 1, 2, 3, 5, 6, 7, 10 and 12. ALL gives
all possible matrices.

LOG FILE Lists the filenames of the unpacked matrices

TIMING Linear

COMMENTS None.

REFERENCES None.
DATA>JOIN
PURPOSE Combine UCINET data files to form a single data file. Combines sets of single
matrices into a new matrix by merging all rows or all columns. Also combines
sets of single matrices or multi-relational matrices into one multi-relational
matrix.

DESCRIPTION Combines sets of single matrices, with equal columns, row wise into a larger
matrix. If A1, A2 ... AN are all matrices with R1, R2, ... RN rows respectively
and C columns then these are merged into the R1 + R2 +...+ RN by C matrix
(A1 A2 ... AN) transpose.

Also combines sets of single matrices, with equal rows, column wise into a larger
matrix. If A1, A2, ... AN are all matrices with R rows and C1, C2, ... CN
columns respectively then these are merged into the R by C1 + C2 +...+ CN matrix
(A1 A2 ... AN).

Certain UCINET routines permit the analysis of multiple relations on the same
set of actors. This routine can create a single data file which brings together all
the relevant networks or matrices and makes them suitable for analysis.

PARAMETERS
Files selected:
Names of datasets each containing one or more matrices. The names should be
entered in the order required in the merged data set. To enter a file, highlight one
or more files in the Possible Files and click on the > button and they will be
moved across. Clicking on < moves the files back. All possible files can be
moved across by clicking on >> or <<. To select more than one file press Ctrl and
then click. The files will be placed in the order they are selected.

Dims to join (Default = Rows)


Defines which method is to be used.
Choices are:

Rows
Matrices combine row-wise creating extra rows. Each matrix must be a single
relation with an equal number of columns.

Columns
Matrices combine column-wise creating extra columns. Each matrix must be a
single relation with an equal number of rows.

Matrices
Matrices appended as additional matrices or relations. Networks must all have
the same dimensions.

Destination filename (Default = 'Joined')


Name of the file which will contain merged dataset.

LOG FILE The merged data set with appropriate labels.


If Rows has been selected and the original matrices do not have row labels then
the new row labels are of the form i-j indicating that the row was formed from
row j of matrix i.

If Columns has been selected then the new columns are labeled in a similar way
to Row labels described above.

If Matrices has been selected then each relation is numbered sequentially.

TIMING Linear.

COMMENTS None.

REFERENCES None.
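
The row-wise and column-wise joins described above, together with the i-j labelling of unlabelled rows, can be sketched in Python as follows. The function names are hypothetical and the sketch handles single-relation matrices only.

```python
def join_rows(matrices):
    """Stack matrices with equal column counts; label each row 'i-j'."""
    width = len(matrices[0][0])
    if any(len(row) != width for m in matrices for row in m):
        raise ValueError("all matrices must have the same number of columns")
    rows = [row[:] for m in matrices for row in m]
    # Row j of matrix i receives the label 'i-j'.
    labels = [f"{i}-{j}"
              for i, m in enumerate(matrices, start=1)
              for j in range(1, len(m) + 1)]
    return rows, labels

def join_columns(matrices):
    """Place matrices with equal row counts side by side."""
    height = len(matrices[0])
    if any(len(m) != height for m in matrices):
        raise ValueError("all matrices must have the same number of rows")
    return [sum((m[r] for m in matrices), []) for r in range(height)]

a = [[1, 2], [3, 4]]
b = [[5, 6]]
c = [[7], [8]]
print(join_rows([a, b]))      # ([[1, 2], [3, 4], [5, 6]], ['1-1', '1-2', '2-1'])
print(join_columns([a, c]))   # [[1, 2, 7], [3, 4, 8]]
```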
DATA > PERMUTE
PURPOSE Re-order rows, columns or matrices in a dataset according to a user specified list.

DESCRIPTION Re-ordering of matrices can be by a list given at the keyboard or from a dataset.

PARAMETERS
Input dataset:
Name of dataset to be permuted. Data type: Matrix

New order of rows (Default is the natural order)


Rows are ordered as specified by a list. Each row number is listed separated by a
comma or space. The keywords TO, FIRST and LAST are permissible. Hence
5, FIRST 3, 6 TO 8, 4, LAST 2, 9 specifies the order 5, 1, 2, 3, 6, 7, 8, 4, 10, 11,
9.

A UCINET data file can be specified which contains the order. This must be of
the form

<file name> ROW (or COLUMN) <number>

where file name is the name of the data file. The command ROW or COLUMN
followed by the appropriate number specifies which row or column of the dataset
is to be used. The keyword RANDOM is also allowed.

New order of cols (Default is the natural order)


Columns are ordered by a list using the same convention as for rows.

New order of matrices (Default is the natural order)


Matrices are ordered by a list using the same convention as for rows.

Output Filename (Default = 'Permuted')


Name of file which will contain permuted dataset.

LOG FILE Permuted dataset.

TIMING O(N^2).

COMMENTS There is a limitation of 255 characters on keyboard entered lists. Lists longer
than 255 characters must be specified in a UCINET dataset.

REFERENCES None.
DATA > SORT
PURPOSE Re-orders nodes in a network so that they correspond to the monotonic ordering
of a prescribed vector.

DESCRIPTION Arranges the nodes of a network so that they are in the same order as an external
vector.
The sort can be either ascending or descending. Hence if the ASCENDING
option is chosen and the external vector is (V1, V2, ... VN), the nodes would be
ordered so that node i would be before node j if and only if Vi ≤ Vj. The external
vector can be selected from the rows or columns of any UCINET data matrix.

PARAMETERS
Input dataset
Name of dataset to be sorted. Data type: Matrix

Dimensions to be arranged:
Choices are:

Both
Both rows and columns are simultaneously sorted.

Rows
Just the rows are sorted and the column order is preserved.

Columns
Just the columns are sorted and the row order is preserved.

Sort order (Default = Ascending)


Choices are:

Ascending
Gives a sort which corresponds to placing the elements of the prescribed vector
in the order from smallest to largest.

Descending
Gives a sort which corresponds to placing the elements of the prescribed vector
in the order from largest to smallest.

Criterion vector (sort key) :


Either the name of the UCINET dataset from which the prescribed vector will be
taken with the row or column specified as follows:

<dataset> ROW (or COLUMN) <number>

where <dataset> is the name of the dataset containing the criterion vector. The
command ROW or COLUMN followed by the appropriate number specifies
which row or column of the dataset is to be used.

Alternatively, a list of values may be entered, one for each row or column being
sorted. Each list entry is separated by a comma or a space. There must be as
many values as rows or columns being sorted.

To sort in ascending or descending order the dataset itself should be used as the
key.

Output dataset: (Default = Sorted)


Name of file which will contain sorted dataset.
LOG FILE Sorted dataset.

TIMING O(N*LOG(N)).

COMMENTS Sorting into a user-prescribed order given as a keyboard list is provided by the
routine PERMUTE.

REFERENCES None.
DATA >TRANSPOSE
PURPOSE Take the transpose of a matrix.

DESCRIPTION Interchanges the rows and columns of a matrix. Note that this corresponds to
taking the converse of a directed graph. That is, reversing the direction of every
arc.
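
A minimal Python sketch of the same operation, assuming the matrix is held as a
list of rows (the function name transpose is illustrative only):

```python
def transpose(matrix):
    """Interchange rows and columns; for a digraph this reverses every arc."""
    return [list(column) for column in zip(*matrix)]


# Arcs 1->2 and 2->3 become 2->1 and 3->2 after transposing.
path = [[0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]
converse = transpose(path)
```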

PARAMETERS
Output dataset (Default = 'Transpose')
Name of file containing transposed data.

LOG FILE Transposed matrix.

TIMING O(N^2).

COMMENTS More complicated transposes for three-dimensional matrices can be done using
TOOLS>MATRIX>ALGEBRA

REFERENCES None.
DATA >PARTITION TO SETS

PURPOSE Transforms a partition indicator vector into a group by actor incidence matrix and
displays the partition by groups.

DESCRIPTION A partition indicator vector has the form (k1,k2,...,ki...) where ki assigns vertex i
to group ki. So that (1 1 2 1 2) assigns vertices 1, 2 and 4 to block 1; and 3 and
5 to block 2. A group by vertex incidence matrix has vertices as its columns and
the groups as the rows. A 1 in row i column j indicates that actor j is a member
of group i; the values are zero otherwise.
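
The conversion can be sketched in Python as below. The function name
partition_to_sets is an assumption made for illustration, and groups are
assumed to be listed in ascending order of their identifying number.

```python
def partition_to_sets(partition):
    """Build a group-by-vertex incidence matrix from a partition indicator vector."""
    groups = sorted(set(partition))
    # Row i holds a 1 in column j when vertex j belongs to the i-th group.
    return [[1 if member == group else 0 for member in partition]
            for group in groups]


incidence = partition_to_sets([1, 1, 2, 1, 2])
# Row 1 marks vertices 1, 2 and 4; row 2 marks vertices 3 and 5.
```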

PARAMETERS
Input dataset:
Partition indicator vector. This can either be entered at the keyboard by
specifying the elements of the vector, each number separated by a comma or
space or as a UCINET dataset.
For partitions kept in a UCINET data file enter the filename followed by ROW
(or COLUMN) and a number to specify which row or column of the file to use.
Data type: Partition indicator vector.

Output dataset: (Default = 'PartitionToSets').


Name of file which will contain group by vertex incidence matrix.

LOG FILE A list of the groups. Each group is numbered and specified by the vertices it
contains.

TIMING O(N^2)

COMMENTS Partition indicator vectors entered using the keyboard are restricted to 255
characters. Longer vectors should be specified using a UCINET dataset.

REFERENCES None.
DATA >RESHAPE
PURPOSE Reorganize the data into a matrix or matrices of a different size.

DESCRIPTION This routine treats any input data as one long list. The list is formed row by row
and, if applicable, level by level. The new matrix is then filled up row by row
and then level by level from this list.
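
A sketch of this fill order in Python, assuming the dataset is a list of
matrices (levels), each a list of rows. The name reshape and the requirement
that the requested shape use exactly all of the values are assumptions; the
help text does not say how a size mismatch is handled.

```python
def reshape(levels, n_rows, n_cols, n_mats=1):
    """Flatten the data row by row, level by level, then refill in the new shape."""
    flat = [value for matrix in levels for row in matrix for value in row]
    if len(flat) != n_rows * n_cols * n_mats:
        raise ValueError("requested shape does not match the number of values")
    values = iter(flat)
    return [[[next(values) for _ in range(n_cols)] for _ in range(n_rows)]
            for _ in range(n_mats)]


original = [[[1, 2, 3],
             [4, 5, 6]]]          # one 2 x 3 matrix
reshaped = reshape(original, 3, 2)  # one 3 x 2 matrix
```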

PARAMETERS
# of rows desired (Default = 0)
Number of rows in reshaped matrix.

# of columns desired (Default = 0)


Number of columns in reshaped matrix.

# of matrices desired (Default = 1)


Number of different matrices required.

Output dataset (Default = 'Reshaped')


Name of file containing reshaped data.

LOG FILE Reshaped matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
DATA > CREATE NODE SETS
PURPOSE To create a group indicator vector based on comparing two vectors or a vector
and a number.

DESCRIPTION Given a vector of attributes or values for every actor and a threshold number
then this routine selects actors which have a value which is less than (or
greater than) the threshold. More generally the threshold can itself be a vector so
that actors are selected if they have a value less than (or greater than) the value
in the corresponding cell in the threshold vector. An example of using two
vectors would be the selection of actors whose closeness centrality is less than
their degree centrality.
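
The selection rule can be sketched in Python as follows. The function name
create_node_set is hypothetical; the operator codes follow the Relational
Operator choices of this routine, and a single threshold is simply repeated for
every actor.

```python
import operator

OPERATORS = {"LT": operator.lt, "LE": operator.le, "EQ": operator.eq,
             "NEQ": operator.ne, "GE": operator.ge, "GT": operator.gt}


def create_node_set(values, op, threshold):
    """Return a 0/1 group indicator vector for `values op threshold`."""
    if not isinstance(threshold, (list, tuple)):
        threshold = [threshold] * len(values)   # scalar threshold
    compare = OPERATORS[op]
    return [1 if compare(v, t) else 0 for v, t in zip(values, threshold)]


closeness = [0.2, 0.9, 0.5]
degree = [0.5, 0.5, 0.5]
selected = create_node_set(closeness, "LT", degree)
```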

PARAMETERS
Variable 1:
Name of file which contains the value or attribute vector; this must be a
UCINET data file. Enter the filename followed by ROW (or COL) and a number
to specify which row or column of the file to use.

Relational Operator
Criterion by which to compare the actor values or attributes.
Choices are:

LT -Less than
LE -Less than or equal to
EQ -Equal to
NEQ -Not equal to
GE -Greater than or equal to
GT -Greater than

Variable 2
The threshold value or vector. If a single value is required then this can be typed
in directly. Vectors must be specified using a UCINET data file: enter the
filename followed by ROW (or COLUMN) and a number to specify which row
or column of the file to use.

Output dataset (Default = 'SELECTED')


Name of file to contain group indicator matrix. This will be a single column
vector with selected actors having a 1 and non-selected actors having a 0.

LOG FILE Displays the group indicator vector.

TIMING Linear

COMMENTS The group indicator vector can be used in routines such as Extract.

REFERENCES None.
TRANSFORM > BLOCK

PURPOSE Partition nodes in a data graph into blocks and calculate block densities, sums or
other statistics.

DESCRIPTION The adjacency matrix is partitioned into submatrices. The average, sum,
maximum, minimum, standard deviation, or sum of squares of each submatrix is
then calculated.
This routine is virtually identical to the Networks>Properties>Density routine,
except that it provides more options for aggregating cells within a matrix block.
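
As a rough illustration of the Average method, the Python sketch below computes
the mean of every block of a square matrix. It assumes the same partition is
applied to rows and columns, and it returns None for a block with no valid
cells; the name block_means is hypothetical.

```python
def block_means(matrix, partition, use_diagonal=False):
    """Mean of the cells in each block defined by a partition indicator vector."""
    groups = sorted(set(partition))
    result = []
    for row_group in groups:
        means = []
        for col_group in groups:
            cells = [matrix[i][j]
                     for i, gi in enumerate(partition) if gi == row_group
                     for j, gj in enumerate(partition) if gj == col_group
                     and (use_diagonal or i != j)]
            # An empty block has no defined average.
            means.append(sum(cells) / len(cells) if cells else None)
        result.append(means)
    return result


adjacency = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 0, 0, 1],
             [0, 0, 1, 0]]
densities = block_means(adjacency, [1, 1, 2, 2])
```

Swapping sum(cells) / len(cells) for sum, max or min gives the other simple
aggregation methods.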

PARAMETERS
Input dataset:
Name of file containing matrices to be blocked. Data type: Matrix.

Method: (Default = Average)


Choices are

Average -Arithmetic mean of all cells in each submatrix.
Sum -Simple sum of all cells in each submatrix.
Maximum -Largest value of all cells in each submatrix.
Minimum -Smallest value of all cells in each submatrix.
Std Dev -Standard deviation of all cells in each submatrix.
SSQ -Sum of squares of all cells in each submatrix.

Utilize Diagonal values (Default = No)


Whether diagonals are to be included in density calculations.

Row partition/blocking (if any):


To partition the rows of the data matrix into blocks, specify a blocking vector by
giving the dataset name, a dimension and an integer value. For example, to use
the second row of a dataset called ATTRIB, enter "ATTRIB ROW 2". The
program will then read the second row of ATTRIB and use that information to
sort the rows of the matrix. All rows with identical values on the criterion vector
(i.e. the second row of attrib) will be placed in the same block of the matrix.
Densities will then be computed separately for each block. The block partitions
can also be typed directly into the box by typing in a partition indicator vector. A
partition indicator vector has the form (k1,k2,...,ki...) where ki assigns vertex i to
group ki. So that (1 1 2 1 2) assigns vertices 1, 2 and 4 to block 1; and 3 and 5
to block 2.

Column partition/blocking (if any):


To partition the columns of the data matrix into blocks, specify a blocking vector
by giving the dataset name, a dimension and an integer value. For example, to
use the second row of a dataset called ATTRIB, enter "ATTRIB ROW 2". The
program will then read the second row of ATTRIB and use that information to
sort the columns of the data matrix. All columns with identical values on the
criterion vector (i.e. the second row of attrib) will be placed in the same block of
the matrix. Densities will then be computed separately for each block. The block
partitions can also be typed directly into the box by typing in a partition indicator
vector. A partition indicator vector has the form (k1,k2,...,ki...) where ki assigns
vertex i to group ki. So that (1 1 2 1 2) assigns vertices 1, 2 and 4 to block 1;
and 3 and 5 to block 2.
(Output) Reduced image dataset (Default = 'Blocked')
Name of dataset that will contain the reduced block density matrix.

(Output) Pre-image dataset (Default= 'PreImage')


Name of dataset that will contain the original data with the rows and columns
permuted to form the blocks.

LOG FILE List of block numbers together with their members. The pre-image matrix, i.e. the
permuted original data matrix. Blocked matrices. A blank in the matrix indicates
that a matrix value (such as the average), was undefined.

TIMING O(N^2)

COMMENTS Users who wish to produce a binary image matrix from the output of this routine
can obtain one by using Transform>Dichotomize.

REFERENCES None.
TRANSFORM>COLLAPSE

PURPOSE Combine one or more rows or columns of a matrix.

DESCRIPTION Combines rows, columns or both simultaneously to form a new smaller matrix.
The value of the combined cells can either be the average, the sum, the maximum
or the minimum of the set of cells which are to be collapsed.
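
A small Python sketch of collapsing rows is given below. It is an illustration
only: the name collapse_rows is hypothetical, row numbers are 1-based as in the
instructions window, and placing the combined row at the position of the first
collapsed row is an assumption about the output order.

```python
def collapse_rows(matrix, rows, aggregate=sum):
    """Combine the listed rows (1-based) into a single row using `aggregate`."""
    chosen = sorted(r - 1 for r in rows)
    n_cols = len(matrix[0])
    combined = [aggregate(matrix[r][c] for r in chosen) for c in range(n_cols)]
    collapsed = []
    for index, row in enumerate(matrix):
        if index == chosen[0]:
            collapsed.append(combined)      # the new combined row
        elif index not in chosen:
            collapsed.append(list(row))     # rows that were not collapsed
    return collapsed


data = [[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]]
summed = collapse_rows(data, [1, 3, 4])           # like "ROWS 1 3 4" with Sum
largest = collapse_rows(data, [1, 3, 4], max)     # like "ROWS 1 3 4" with Maximum
```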

PARAMETERS
Input dataset
Name of file containing matrix to be collapsed. Data type: Matrix.

Aggregation operation: (Default = Sum).


Specifies how to aggregate the cells which are to be collapsed.

Choices are:

Average - The arithmetic mean of all the cells.
Sum - The sum of all the cells.
Maximum- Maximum value of all the cells.
Minimum - Minimum value of all the cells.

Enter instructions for collapsing:


In the window provided the user must provide instructions to the routine which
specify which rows or columns must be collapsed.
The following keywords are used.

ROWS to combine rows.
COLS to combine columns
NODES to combine rows and columns simultaneously.

Each new line must commence with one of these keywords. Each keyword is
followed by a list of the rows, columns or nodes which are to be collapsed. The
list has elements separated by spaces or commas; the keyword TO is
permissible. For example:

ROWS 1 3 4

collapses rows 1, 3 and 4 to a single row;

COLS 2 TO 4
COLS 1, 6

collapses columns 2, 3 and 4 to one column and 1 and 6 to another column
separately.

(For Square Mats) Include diagonal values? (Default = No).


No excludes diagonal values from consideration.

OUTPUT dataset: (Default = 'Collapse').


Name of file which contains labeled collapsed matrix described below.

LOG FILE A list of assignments of rows and columns to blocks. The blocks specify the new
row and column numbers for each of the old row and column numbers.
The collapsed matrix. Each row or column is labeled. Rows or columns that
have been collapsed are labeled by B followed by their block number. Rows or
columns which have not been collapsed retain the label R (for row) or C (for
column) followed by their row or column number.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
TRANSFORM>RECODE

PURPOSE Change ranges of matrix values to new values.

DESCRIPTION The routine allows the user to change values or a range of values in a matrix to a
new value. Up to 5 values or ranges can be recoded.

PARAMETERS
Input dataset
Name of dataset to be recoded. Data type: Matrix.

Rows to recode (Default = All)


Rows to be recoded are specified by a list. Each row number is listed separated
by a comma or space. The keywords TO, FIRST and LAST are permissible.
Hence FIRST 3, 5 TO 7, 10, 12 would give row numbers 1, 2, 3, 5, 6, 7, 10 and
12. ALL gives all possible rows. Lists kept in a UCINET dataset can be used.
Enter the filename followed by ROW (or COLUMN) and a number to specify
which row or column of the file to use. The list must be specified using a binary
vector where a 1 in position k indicates that vertex k is a member of the list, a
zero indicates that k is not a member.
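
To make the list syntax concrete, here is a hypothetical Python parser for such
lists. It is not UCINET's own parser. The name parse_list and the reading of
LAST k as "the final k items" are assumptions; the help text only gives an
example for FIRST and TO.

```python
def parse_list(spec, n):
    """Expand a list such as 'FIRST 3, 5 TO 7, 10, 12' into item numbers 1..n."""
    tokens = spec.replace(",", " ").upper().split()
    selected = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "ALL":
            selected.extend(range(1, n + 1))
            i += 1
        elif token == "FIRST":
            selected.extend(range(1, int(tokens[i + 1]) + 1))
            i += 2
        elif token == "LAST":                       # assumed meaning: final k items
            count = int(tokens[i + 1])
            selected.extend(range(n - count + 1, n + 1))
            i += 2
        elif i + 2 < len(tokens) and tokens[i + 1] == "TO":
            selected.extend(range(int(token), int(tokens[i + 2]) + 1))
            i += 3
        else:
            selected.append(int(token))
            i += 1
    return sorted(set(selected))


rows = parse_list("FIRST 3, 5 TO 7, 10, 12", 12)
```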

Cols to recode (Default = All)


Columns to be recoded are specified by a list. Each column number is listed
separated by a comma or space. The keywords TO, FIRST and LAST are
permissible. Hence FIRST 3, 5 TO 7, 10, 12 would give column numbers 1, 2, 3,
5, 6, 7, 10 and 12. ALL gives all possible columns. Lists kept in a UCINET
dataset can be used. Enter the filename followed by ROW (or COLUMN) and a
number to specify which row or column of the file to use. The list must be
specified using a binary vector where a 1 in position k indicates that vertex k is a
member of the list, a zero indicates that k is not a member.

Mats (levels) to recode (Default = All)


Matrices to be recoded are specified by a list. Each matrix number is listed
separated by a comma or space. The keywords TO, FIRST and LAST are
permissible. Hence FIRST 3, 5 TO 7, 10, 12 would give matrix numbers 1, 2, 3,
5, 6, 7, 10 and 12. ALL gives all possible matrices. Lists kept in a UCINET
dataset can be used. Enter the filename followed by ROW (or COLUMN) and a
number to specify which row or column of the file to use. The list must be
specified using a binary vector where a 1 in position k indicates that vertex k is a
member of the list, a zero indicates that k is not a member.

Include diagonal values: (Default = No).


Yes means that diagonal values are recoded.
No ignores the diagonal in the recoding.

Five boxes of the form


values ___ to ___ are recoded as ___

If the values x, y and z are entered so that the completed line reads

values x to y are recoded as z

then all values of the matrix in the range from x to y inclusive are changed to the
value z. To change a single value set both x and y to the value. Note that the
value na can be used for missing values.
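
The recoding rule can be sketched in Python as follows. The name recode and the
(x, y, z) tuple form of each rule are illustrative assumptions; when ranges
overlap, this sketch applies the first matching rule, which the help text does
not specify. Missing values are not handled here.

```python
def recode(matrix, rules, include_diagonal=False):
    """Replace every value in the inclusive range x..y by z for each rule (x, y, z)."""
    recoded = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if i == j and not include_diagonal:
                continue                      # diagonal left untouched
            for low, high, new_value in rules:
                if low <= value <= high:
                    recoded[i][j] = new_value
                    break                     # first matching rule wins (assumption)
    return recoded


data = [[0, 1, 5],
        [2, 0, 3],
        [4, 9, 0]]
# "values 1 to 3 are recoded as 1" and "values 4 to 9 are recoded as 2"
result = recode(data, [(1, 3, 1), (4, 9, 2)])
```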

Output dataset: (Default = 'Recode').


Name of file which contains recoded matrix.

LOG FILE Recoded matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
TRANSFORM>REVERSE

PURPOSE Convert similarity data to distance data, or distance to similarity by a linear
transformation.

DESCRIPTION Subtract each value of the matrix from the sum of the maximum and minimum
entries.
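
In Python the transformation amounts to the following sketch, which reverses
the whole matrix including the diagonal; the function name reverse is
illustrative only.

```python
def reverse(matrix):
    """Subtract each value from (maximum + minimum) so that large becomes small."""
    values = [value for row in matrix for value in row]
    total = max(values) + min(values)
    return [[total - value for value in row] for row in matrix]


similarities = [[1, 2],
                [3, 5]]
distances = reverse(similarities)   # maximum 5 maps to 1, minimum 1 maps to 5
```

Because the largest and smallest entries simply swap, the reversed matrix has
the same range as the input.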

PARAMETERS
Input dataset:
Name of file containing matrix to be reversed. Data type: Matrix.

Rows to reverse: (Default = 'ALL')


Enter id numbers of all rows whose values are to be reversed.

Columns to reverse: (Default = 'ALL')


Enter id numbers of all columns whose values are to be reversed.

Matrices (levels) to reverse: (Default = 'ALL')


Enter id numbers of all matrices in dataset whose values are to be reversed.

(Sq. matrices only) Include diagonal values (Default = Yes).


Whether diagonals are to be included in the reversing process.

Output dataset: (Default = 'Reverse').


Name of file that will contain reversed matrix.

LOG FILE Display of reversed matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
TRANSFORM>DICHOTOMIZE

PURPOSE Form a binary matrix from a valued matrix.

DESCRIPTION Given a specified cut-off value then the valued matrix is made binary by
comparing each element with the cut-off value. Comparisons can be strictly
greater, greater than or equal, equal, less than or equal or strictly less than.

PARAMETERS
Input dataset:
Name of matrix to be dichotomized. Data type: Matrix.

Cut-off value: (Default = 0).


Any user-specified value. MEAN gives the average value of all the cells in the
input matrix.

Diagonal OK? (Default = No)


Yes means that diagonal elements are considered valid in calculating the mean.
No ignores diagonal values.

Cut-off operator: (Default = 'GT').


Choices are:

GT - Matrix values replaced by a 1 if they are strictly greater than the cut-off
value and 0 otherwise.
GE - Matrix values replaced by a 1 if they are greater than or equal to the cut-off
value and 0 otherwise.
EQ - Matrix values replaced by a 1 if they are equal to the cut-off value and 0
otherwise.
LE - Matrix values replaced by a 1 if they are less than or equal to the cut-off
value and 0 otherwise.
LT - Matrix values replaced by a 1 if they are strictly less than the cut-off value
and 0 otherwise.
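
A Python sketch of the comparison step is shown below. The name dichotomize is
hypothetical. Following the parameter descriptions, the diagonal option here
only affects how the MEAN cut-off is calculated; how the routine treats the
diagonal cells of the output is not stated, so this sketch compares them like
any other cell.

```python
import operator

OPERATORS = {"GT": operator.gt, "GE": operator.ge, "EQ": operator.eq,
             "LE": operator.le, "LT": operator.lt}


def dichotomize(matrix, cutoff=0, op="GT", diagonal_ok=False):
    """Replace each value by 1 if `value op cutoff` holds and by 0 otherwise."""
    if cutoff == "MEAN":
        cells = [value for i, row in enumerate(matrix)
                 for j, value in enumerate(row) if diagonal_ok or i != j]
        cutoff = sum(cells) / len(cells)
    compare = OPERATORS[op]
    return [[1 if compare(value, cutoff) else 0 for value in row]
            for row in matrix]


valued = [[0, 3, 1],
          [2, 0, 4],
          [5, 1, 0]]
binary = dichotomize(valued, cutoff=2, op="GT")
```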

Output dataset: (Default = 'Dichotomize').


Name of file which contains dichotomized matrix.

LOG FILE Dichotomized matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
TRANSFORM > DIAGONAL

PURPOSE Perform simple operations on the diagonal of a square matrix.

DESCRIPTION Set the diagonal of a matrix to a new value. Save the diagonal of a matrix.

PARAMETERS
Input dataset
Name of file on which to perform the transformations. Data type: Square matrix.

New diagonal value(s): (Default = 0).


A single value will set all diagonal elements to the value. A list will set the
diagonal to the values in the list; these values can be separated by a space or
comma. The name of a data file of any UCINET dataset consisting of a square
matrix of the same size. The diagonal of the input dataset will be set to the same
value of the diagonal of the specified data set.

(Output) Diagonal Dataset: (Default = 'DiagonalSaveDiag').


Name of file which contains a square matrix with the diagonal of the input
dataset as its diagonal and zeros elsewhere. This file is not displayed in the LOG
FILE.

(Output) Changed Matrix: (Default = 'DiagonalNewMat').


Name of file which contains matrix with new diagonal values.

LOG FILE Matrix with reset diagonal.

TIMING O(N).

COMMENTS None.

REFERENCES None.
TRANSFORM>SYMMETRIZE

PURPOSE Change an unsymmetric matrix into a symmetric matrix by using one of a variety
of criteria.

DESCRIPTION Produces a symmetric square matrix by one of the following methods. Replace
xij and xji by their maximum, minimum, average, sum, absolute difference,
product or xij/xji (provided xji is non zero) i < j. Alternatively make the lower
triangle equal the upper triangle or the upper triangle equal the lower triangle.

The routine also produces a symmetric matrix with binary values on all
off-diagonal cells by replacing xij and xji by 1 if xij > xji for i ≤ j. The > operation in xij >
xji can be replaced by ≥, =, <, ≤.

If the data has missing values

PARAMETERS
Input dataset:
Name of file containing matrix to be symmetrized. Data type: Square matrix.

Symmetrizing method (Default = Maximum).


Choices are:

Maximum - Replace xij and xji by max(xij,xji), i < j.
Minimum - Replace xij and xji by min(xij,xji), i < j.
Average - Replace xij and xji by (xij + xji)/2, i < j.
Sum - Replace xij and xji by xij + xji, i < j.
Difference - Replace xij and xji by abs(xij - xji), i < j.
Product - Replace xij and xji by xij * xji, i < j.
Division - Replace xij and xji by xij/xji, i < j provided xji is non zero.
Lower Half - Replace xij by xji, i < j.
Upper Half - Replace xji by xij, i < j.
Upper > Lower - Replace xij and xji by 1 if xij > xji, by 0 otherwise, i < j.
Upper ≥ Lower - Replace xij and xji by 1 if xij ≥ xji, by 0 otherwise, i < j.
Upper = Lower - Replace xij and xji by 1 if xij = xji, by 0 otherwise, i < j.
Upper ≤ Lower - Replace xij and xji by 1 if xij ≤ xji, by 0 otherwise, i < j.
Upper < Lower - Replace xij and xji by 1 if xij < xji, by 0 otherwise, i < j.
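
Several of the methods listed above can be sketched in Python as follows. The
function name symmetrize and the lower-case method strings are assumptions for
illustration; missing values and the Division method are not covered, and the
diagonal is left unchanged.

```python
def symmetrize(matrix, method="maximum"):
    """Replace xij and xji (i < j) by a single value computed from the pair."""
    combine = {
        "maximum": max,
        "minimum": min,
        "average": lambda upper, lower: (upper + lower) / 2,
        "sum": lambda upper, lower: upper + lower,
        "difference": lambda upper, lower: abs(upper - lower),
        "product": lambda upper, lower: upper * lower,
        "lower half": lambda upper, lower: lower,   # xij takes the value of xji
        "upper half": lambda upper, lower: upper,   # xji takes the value of xij
        "upper > lower": lambda upper, lower: int(upper > lower),
    }[method]
    result = [list(row) for row in matrix]
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            value = combine(matrix[i][j], matrix[j][i])
            result[i][j] = result[j][i] = value
    return result


data = [[0, 1, 4],
        [3, 0, 2],
        [0, 5, 0]]
strongest = symmetrize(data, "maximum")
```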

Handle missing
Specify how to treat missing data in the symmetrization process. Choose the
non-missing value allows the user to reduce or even eliminate the number of
missing values in the data. Both missing means that if either value is missing
then this is recorded as missing in the symmetrized data.

Output dataset (Default = 'Symmetrize').


Name of file containing symmetrized data.

LOG FILE Symmetrized matrix.

TIMING O(N^2).

COMMENTS None.
TRANSFORM>NORMALIZE
PURPOSE Normalize the values in a matrix.

DESCRIPTION This routine normalizes using a variety of techniques.

Each technique can be applied to either the whole matrix or just the rows or
columns. In addition an iterative facility is provided to Normalize both rows and
columns simultaneously. These operate on the matrix as follows:

Marginal: normalizes the sum to be 100. This is achieved by dividing by the
current sum of the rows, columns or matrix and multiplying by 100.

Mean: normalizes the mean to be zero. This is achieved by subtracting from
every row, column, or matrix element the current mean.

Standard Deviation: normalizes the standard deviation to be one. This is
achieved by dividing the rows, columns or matrix by the current standard
deviation.

Z-Score: standardizes the mean to be zero and the standard deviation to be one.
This is achieved by subtracting from every row, column or matrix element the
current mean and then dividing the rows, columns or matrix by the current
standard deviation.

Euclidean: standardizes the Euclidean norm to be one. This is achieved by
dividing the rows, columns or matrix by the current Euclidean norm.

Maximum: standardizes the rows, columns or matrix to each have a maximum
value of 100. This is achieved by dividing the matrix or each row or column by
the current maximum and multiplying by 100.

The routine also allows each of these options to be applied to the rows and
columns simultaneously. This involves an iterative procedure in which the
technique is first applied to the rows and then the columns and then the rows etc.
It is terminated when (and if) there is convergence.
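
The alternating row and column procedure can be sketched in Python for the
Marginal criterion. This is an illustration under stated assumptions rather
than UCINET's implementation: the names normalize_rows and normalize_both are
hypothetical, and convergence is judged by the largest change in any cell
between successive passes. A row or column that sums to zero would cause a
division by zero in this sketch.

```python
def normalize_rows(matrix, target=100.0):
    """Marginal normalization: scale each row so that it sums to `target`."""
    return [[target * value / sum(row) for value in row] for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def normalize_both(matrix, tolerance=0.001, max_iterations=100):
    """Alternate row and column normalization until the values stop changing."""
    current = [[float(value) for value in row] for row in matrix]
    for _ in range(max_iterations):
        by_rows = normalize_rows(current)
        # Normalizing columns is row normalization applied to the transpose.
        updated = transpose(normalize_rows(transpose(by_rows)))
        change = max(abs(a - b)
                     for old, new in zip(current, updated)
                     for a, b in zip(old, new))
        current = updated
        if change < tolerance:
            return current, True
    return current, False          # tolerance not reached: convergence failed


balanced, converged = normalize_both([[1, 2], [3, 4]])
```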

PARAMETERS
Input dataset
Name of file containing matrix to be standardized. Data type: Matrix.

Which dimension(s) to standardize: (Default = Columns).


Choices are:

Rows - Normalization is applied to the rows of the matrix independently.
Columns - Normalization is applied to the columns of the matrix independently.
Matrix - Normalization is applied to the entire matrix.
Both - Normalization is applied to the rows, then the columns, then the rows etc
iteratively until convergence.

Standardizing criterion: (Default = Marginal).


Choices are:

Marginal - Forces the sum of elements to be 100. By row, column, matrix or
row and column.
Mean - Forces the mean of elements to be zero. By row, column, matrix, or row
and column.

Std-Dev - Forces the standard deviation to be one. By row, column, matrix or
row and column. If standard deviation is initially zero then elements of matrix
are treated as missing.

Z-Score - Forces the mean of the elements to be zero and the standard deviation
to be 1. By row, column, matrix or row and column. If standard deviation is
initially zero then elements of matrix are treated as missing.

Euclidean - Forces the Euclidean norm to be one. By row, column, matrix or
row and column.

Maximum - Forces the maximum of the elements to be 100. By row, column or
row and column. Forces the maximum element to be one for the whole matrix.

Constant to replace zeros with (Default =0.0)


Zeros can cause this procedure to crash and this can be overcome by replacing
them with a relatively small value.

(Sq. matrices only) Include diagonal values? (Default = Yes).


Yes includes diagonals. No treats diagonal values as missing.

(For iterative norm.) Convergence tolerance (Default=0.001)


When Both is selected the routine iterates to convergence. The routine is deemed
to have converged when the values change by less than the tolerance from one
iteration to the next.

(For iterative norm.) Max # of iterations (Default=100)


When both is selected the routine iterates to convergence. Convergence will be
deemed to have failed if the tolerance has not been achieved before the maximum
number of iterations has taken place.

Output dataset: (Default = 'Normalize').


Name of file which contains normalized matrix.

LOG FILE Normalized matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
TRANSFORM > BIPARTITE

PURPOSE Convert a 2-mode dataset into a 1-mode adjacency matrix

DESCRIPTION Any 2-mode incidence matrix can be thought of as a bipartite graph. If the 2-
modes are actors and events then the bipartite graph consists of the union of the
actors and events as vertices with the edges only connecting actors with events
(ie no connections between actors or between events). This routine takes a 2-
mode incidence matrix and converts it to a 1-mode adjacency matrix of a
bipartite graph. If the incidence matrix had n rows and m columns then the
resultant adjacency matrix would be a square matrix of dimension m+n.
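
The construction can be sketched in Python as follows. The name bipartite is
hypothetical, actors are placed before events in the combined vertex list, and
the event-to-actor block is assumed to be zero unless the result is
symmetrized.

```python
def bipartite(incidence, fill=0.0, symmetric=False):
    """Build the (n+m) x (n+m) adjacency matrix of the bipartite graph."""
    n, m = len(incidence), len(incidence[0])
    size = n + m
    adjacency = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if (i < n) == (j < n):
                adjacency[i][j] = fill                   # within-mode cell
            elif i < n:
                adjacency[i][j] = incidence[i][j - n]    # actor-to-event tie
    if symmetric:
        for i in range(size):
            for j in range(i + 1, size):
                value = max(adjacency[i][j], adjacency[j][i])
                adjacency[i][j] = adjacency[j][i] = value
    return adjacency


# Three actors (rows) attending two events (columns).
attendance = [[1, 0],
              [1, 1],
              [0, 1]]
graph = bipartite(attendance, symmetric=True)
```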

PARAMETERS
Input 2-mode dataset:
Name of file containing incidence matrix.

Value to fill within-mode ties: (Default=0.0)


The incidence matrix specifies the values of ties from actors to events; the values
of the (non-existent) ties of actors to actors and events to events are not given. The
user can override the default value of zero by specifying their own within-mode
value.

Make result symmetric? (Default = 'No')


If yes is selected matrix is symmetrized by taking the maximum of Xij and Xji.

Output dataset:(Default='bi')
Name of file containing adjacency matrix of bipartite graph.

LOG FILE Adjacency matrix of bipartite graph.

TIMING Linear.

COMMENTS None.

REFERENCES None.
TRANSFORM > INCIDENCE
PURPOSE Convert an adjacency matrix to an incidence matrix.

DESCRIPTION An incidence matrix is a node by edge matrix. The rows represent the nodes of a
graph and the columns the edges. A one in row i column j indicates that node i is
incident to edge j. This representation is often called the hypergraph
representation.
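
A Python sketch of the conversion is given below. The name incidence_matrix is
hypothetical, and the order in which edges become columns (row by row through
the adjacency matrix) is an assumption.

```python
def incidence_matrix(adjacency, directed=False, self_loops=False):
    """Return a node-by-edge incidence matrix and the list of edges used."""
    n = len(adjacency)
    edges = []
    for i in range(n):
        for j in range(n):
            if i == j and not self_loops:
                continue
            if directed:
                present = bool(adjacency[i][j])
            else:
                # Undirected: record each pair once, whichever direction is present.
                present = j >= i and bool(adjacency[i][j] or adjacency[j][i])
            if present:
                edges.append((i, j))
    matrix = [[1 if node in edge else 0 for edge in edges] for node in range(n)]
    return matrix, edges


graph = [[0, 1, 0],
         [1, 0, 1],
         [0, 0, 0]]
undirected, edge_list = incidence_matrix(graph)
```

With directed set to True the reciprocated tie between the first two nodes
yields two columns, as noted under the "Treat data as directed" parameter.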

PARAMETERS
Input dataset:
Name of file containing adjacency matrix. Data type: Digraph

Treat data as directed: (Default=No)


If Yes then reciprocal ties will occur twice in an incidence matrix.

Include self loops: (Default = No)


If No self loop ties will be ignored.

Output Filename: (Default = 'Incidence')


Name of data file which will contain incidence matrix. The columns will be
labeled with edge labels.

LOG FILE Labeled incidence matrix.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
TRANSFORM > LINEGRAPH
PURPOSE Construct the line graph of a graph or network.

DESCRIPTION The line graph of a graph G is the graph obtained by using the edges of G as
vertices, two vertices being adjacent whenever the corresponding edges are. In a
digraph the arcs of a digraph are the vertices and two vertices are adjacent if the
corresponding arcs induce a walk.
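
For the digraph case the definition can be sketched in Python as follows. The
name line_graph is hypothetical; two arcs are taken to induce a walk when the
head of the first is the tail of the second, and an arc is never linked to
itself in this sketch.

```python
def line_graph(adjacency, self_loops=False):
    """Return the line-graph adjacency matrix and the arcs that label its vertices."""
    n = len(adjacency)
    arcs = [(i, j) for i in range(n) for j in range(n)
            if adjacency[i][j] and (self_loops or i != j)]
    # Arc (a, b) is adjacent to arc (b, c): together they form the walk a-b-c.
    matrix = [[1 if first[1] == second[0] and first != second else 0
               for second in arcs] for first in arcs]
    return matrix, arcs


path = [[0, 1, 0],
        [0, 0, 1],
        [0, 0, 0]]
lg, arc_labels = line_graph(path)
```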

PARAMETERS
Input Dataset:
Name of file containing graph from which to create the line graph. Data type:
Digraph.

Include self-loops: (Default = NO)


NO means that self loops will not generate vertices in the line graph.

Output dataset: (Default = 'Linegraph')


Name of file which contains constructed linegraph.

LOG FILE Adjacency matrix of the line graph vertices labeled with corresponding edges
from original graph

TIMING O(N^2).

COMMENTS Note that multirelational data cannot be converted to line graph format. Users
should do each relation separately.

REFERENCES None.
TRANSFORM > MULTIGRAPH
PURPOSE Convert a valued graph into a set of binary graphs.

DESCRIPTION A single binary graph is created for each different value of a valued graph. All
created graphs are stacked in a single dataset.

PARAMETERS
Input dataset:
Name of file containing valued data. Data type: Valued graph.

Splitting operator (Default = EQ)


Choices are:

GT - Greater than yields Mijk = 1 if xij > wk
GE - Greater than or equal yields Mijk = 1 if xij ≥ wk
EQ - Equal yields Mijk = 1 if xij = wk
LE - Less than or equal to yields Mijk = 1 if xij ≤ wk
LT - Less than yields Mijk = 1 if xij < wk

where Mijk is the (i,j) entry of the kth adjacency matrix, xij is the (i,j) entry of the
input data, and wk are the ordered values of the weights of the valued data placed
in ascending order.
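
The splitting rule can be sketched in Python as below. The name multigraph is
hypothetical; the sketch collects the distinct values wk in ascending order and
builds one binary matrix per value, with Mijk computed as in the definitions
above.

```python
import operator

OPERATORS = {"GT": operator.gt, "GE": operator.ge, "EQ": operator.eq,
             "LE": operator.le, "LT": operator.lt}


def multigraph(matrix, op="EQ", self_loops=False, count_zeros=False):
    """Split a valued matrix into one binary matrix per distinct value."""
    cells = [value for i, row in enumerate(matrix)
             for j, value in enumerate(row) if self_loops or i != j]
    weights = sorted(set(v for v in cells if count_zeros or v != 0))
    compare = OPERATORS[op]
    graphs = [[[1 if (self_loops or i != j) and compare(value, w) else 0
                for j, value in enumerate(row)]
               for i, row in enumerate(matrix)]
              for w in weights]
    return graphs, weights


valued = [[0, 2, 1],
          [1, 0, 2],
          [0, 1, 0]]
layers, values = multigraph(valued)   # one layer for value 1, one for value 2
```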

Include self- loops (Default = NO)


If NO then self loops are ignored.

Count zeros as valid relationships: (Default = NO)


If NO then no binary graph is created corresponding to the wk value zero. If YES
then a binary graph corresponding to the value zero is included in the
multigraph.

Output Dataset (Default = 'Multigraph')


Name of file that will contain multigraph as a set of binary graphs.

LOG FILE Constructed multigraph.

TIMING O(N^2).

COMMENTS The number of relations constructed will correspond to the number of different
values. Care should be taken not to enter datasets that will create a large number
of binary graphs.

REFERENCES None.
TRANSFORM>MULTIPLEX

PURPOSE Constructs a multiplex graph from a multirelational graph.

DESCRIPTION Technically, let G(V,{Ri}) be a multirelational graph with vertex set V and
relations {Ri}, i ∈ I. If v and w are two vertices of G then the bundle of relations
connecting v to w, Bvw, is defined as Bvw = {Ri: vRiw}. Let Mk be the set of all
bundles. The multiplex graph is the valued graph with valued adjacency matrix
Xi,j = k where k is the Mk bundle of relations connecting i to j. Non-technically,
the algorithm determines how many different distinct patterns of relations (the
bundles) link any pair of vertices and assigns each of these a numerical label.
The arcs in the output multiplex graph are then labeled with these identifying
numbers.
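
The labeling of bundles can be sketched in Python as follows. The name
multiplex is hypothetical. Numbering bundles in the order they are first
encountered, and giving pairs with no relation at all the value 0, are
assumptions; the routine's own numbering may differ.

```python
def multiplex(relations):
    """Label each ordered pair with a number identifying its bundle of relations."""
    n = len(relations[0])
    codes = {}                                   # bundle -> identifying number
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            bundle = tuple(k for k, relation in enumerate(relations)
                           if relation[i][j])
            if not bundle:
                continue                         # no relation links i to j
            if bundle not in codes:
                codes[bundle] = len(codes) + 1
            result[i][j] = codes[bundle]
    return result, codes


friendship = [[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]]
advice = [[0, 1, 0],
          [0, 0, 0],
          [1, 1, 0]]
combined, bundle_codes = multiplex([friendship, advice])
```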

PARAMETERS
Input dataset:
Name of file that contains multirelational binary network data. Valued data are
automatically converted to multirelational binary data using a technique
identical to Multigraph. Data type: Digraph. Multirelational.

Include transpose(s) in the multiplexing (Default = No).


For non-symmetric data the transposes can be automatically added as additional
relations.

Convert data to geodesic distances (Default = No)


Option to convert each relation in dataset to geodesic distances.

Output dataset (Default = 'Multiplex')


Output file that will contain multiplex graph.

LOG FILE Multiplex graph adjacency matrix.

TIMING Exponential.

COMMENTS In the worst case, the timing for the algorithm is exponential. The timing
depends on the number of possible bundles; up to 2 to the power N bundles can
occur when there are N different relations.

REFERENCES None.
TRANSFORM > SEMIGROUP
PURPOSE Construct the semigroup of a graph, digraph or multirelational graph.

DESCRIPTION The semigroup of a network is an algebraic representation of all compound
relations.

Given a set of adjacency matrices R1,R2,...,Rn of a multirelational graph then the
set of all possible Boolean products of pairs of matrices gives all possible
relations of length 2. If any of these products is repeated then they are discarded.
We continue with products of length 3 etc until no new matrices are found. The
set of all matrices constructed in this way together with the operation of Boolean
matrix multiplication form a semigroup.

This routine finds all members of the semigroup, or members of the semigroup
up to a certain length of product. In addition the semigroup is specified by a
multiplication table.
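
The generation of distinct compound relations can be sketched in Python as
below. This is a sketch under stated assumptions, not the routine's own
algorithm: the names bool_product and semigroup are hypothetical, words are
built by multiplying on the right by the original relations, and each distinct
matrix is stored with the first word found for it. The multiplication table is
omitted.

```python
def bool_product(a, b):
    """Boolean matrix product of two square 0/1 matrices (as tuples of tuples)."""
    n = len(a)
    return tuple(tuple(int(any(a[i][k] and b[k][j] for k in range(n)))
                       for j in range(n)) for i in range(n))


def semigroup(generators, max_length=9):
    """Map every distinct compound relation to a word in the generators."""
    gens = [tuple(tuple(int(bool(v)) for v in row) for row in g)
            for g in generators]
    elements = {}                        # matrix -> word (tuple of relation numbers)
    frontier = []
    for number, g in enumerate(gens, start=1):
        if g not in elements:
            elements[g] = (number,)
            frontier.append(g)
    length = 1
    while frontier and length < max_length:
        new_matrices = []
        for matrix in frontier:
            for number, g in enumerate(gens, start=1):
                product = bool_product(matrix, g)
                if product not in elements:      # repeated products are discarded
                    elements[product] = elements[matrix] + (number,)
                    new_matrices.append(product)
        frontier = new_matrices
        length += 1
    return elements


cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
words = semigroup([cycle])   # a 3-cycle generates three distinct relations
```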

PARAMETERS
Input dataset:
Name of file containing adjacency matrix or matrices. Data type: Digraph.
Multirelational.

Maximum length of "words": (Default = 9)


The products are called words. The maximum length of products to be
considered is known as the word length.

Save elements of semigroup ?: (Default = No)


If only the multiplication table and words are required then it is not necessary to
save the matrix elements. YES causes all generated matrices to be saved in a file
specified below.

Output semigroup: (Default = 'SEMIGROUP')


Name of file which will contain all compounded relations provided the save
elements of semigroup parameter was set to YES. These are given as a list, each
relation is sequentially numbered. This file does not appear in the LOG FILE.

Output multiplication table: (Default = 'MULTABLE')


Name of file which will contain the multiplication table specified below.

LOG FILE Semigroup multiplication table.

Each row (and column) is labeled with the compound relation number. The rows
also give the word that accounts for the compound. Hence if row 6 is labeled 1 1
2 1 then relation 6 is the matrix obtained by Boolean matrix multiplication of the
original relations numbered 1 1 2 1 in that order. The value in row i column j is
the result of the Boolean matrix multiplication of relation i and relation j.

If the word length is not sufficient to generate all elements of the semigroup then
the right multiplication table of the generated elements is displayed. This table
gives the product of the generated elements with the input matrices.

TIMING Algorithm is exponential.


COMMENTS Relatively small datasets can result in large semigroups.

REFERENCES None.
TOOLS > MDS > METRIC

PURPOSE Metric multidimensional scaling of a proximity matrix.

DESCRIPTION Given a matrix of proximities (similarities or dissimilarities) among a set of
items, the program finds a set of points in k-dimensional space such that the
Euclidean distances among these points correspond as closely as possible to the
input proximities.

PARAMETERS
Input dataset
Name of file containing proximity matrix. Data type: Square symmetric matrix.

No of dimensions: (Default = 2)
Number of dimensions to use in representing items in Euclidean space.

Similarities or Dissimilarities? (Default = Similarities)


Whether the data represent similarities or dissimilarities. If similarities, large
values of X(i,j) will draw i and j close together on the MDS map. If
dissimilarities, large values will push i and j apart on the map.

Starting Configuration: (Default = Classic)


How to generate initial location of points in space.
Choices are:

Classic - Performs Gower's classical metric ordination procedure.

File - Reads starting coordinates from UCINET dataset. If this option is chosen
then the user must complete the Starting Config Filename parameter.

Random - Locates points randomly in space.

Starting Config Filename


Name of the coordinate dataset if the file option is taken. This UCINET dataset
should consist of an nxk matrix of values. Each column corresponds to the co-
ordinates in each of the dimensions specified. Hence row i gives the co-ordinates
of the ith point.

Adjust data to nearest Euclidean (Default = Yes)


Iteratively adjusts the data so that it obeys the triangle inequality.
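The adjustment procedure itself is not documented here. The Python sketch below
only shows how one might check which triples of items violate the triangle
inequality in a dissimilarity matrix. The function name triangle_violations and
the example matrices are hypothetical.

```python
from itertools import permutations


def triangle_violations(d, tol=1e-12):
    """List triples (i, j, k) with i < j where d[i][j] > d[i][k] + d[k][j]."""
    n = len(d)
    return [(i, j, k) for i, j, k in permutations(range(n), 3)
            if i < j and d[i][j] > d[i][k] + d[k][j] + tol]


# Items 0 and 2 are 5 apart although both are only 1 away from item 1.
non_euclidean = [[0, 1, 5],
                 [1, 0, 1],
                 [5, 1, 0]]
# Side lengths of a 3-4-5 triangle satisfy the inequality.
euclidean = [[0, 3, 4],
             [3, 0, 5],
             [4, 5, 0]]

bad_triples = triangle_violations(non_euclidean)
ok_triples = triangle_violations(euclidean)
```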

Output dataset: (Default = 'MetricMdsCoord')


Name of file containing the co-ordinates of the points in Euclidean space.

LOG FILE The output first gives a 2D scatterplot of the first pair of co-ordinates. The x-axis
is the first co-ordinate set and the y-axis is the second. The scatterplot can be
saved or printed. Simple editing can be achieved using the options button. The
labels can be turned on or off and values can be attached to the points (or
removed). The scales can also be changed. More advanced editing is possible by
double clicking in the plot, which invokes the chart wizard. To find the label
attached to a single point when all the labels are removed, click on a single point
(this will highlight all the points), then click a second time to highlight one vertex.
Now double click on the vertex and the label will be highlighted in the chart
designer. The save button and the save chart data option allow the user to save
all the chart data into a file which can be reviewed using
Tools>Scatterplot>Review. The chart itself can be saved as a windows metafile
which can then be read into a word processing or graphics package. Only one
chart can be open at one time and the chart window will be closed if you click on
any other UCINET window. Behind the chart is a numeric display of coordinates
of each point in space together with information about the stress.

TIMING O(N^4)

COMMENTS MDS solutions are not unique, and they are subject to convergence to local
minima. The first point means that two or more maps can be equally good (same
stress) but place points in radically different locations. The second point means
that it is possible for the algorithm to fail to find the configuration with least
stress. If you suspect this has happened, run the program several times using
random starting configurations. Stress values below 0.1 are excellent and above
0.2 unacceptable.
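The exact stress formula used by the program is not given in this help file.
As a rough guide to what the number measures, the Python sketch below computes
Kruskal's stress formula 1 between the map distances and a matrix of target
dissimilarities. The function name stress and the data are hypothetical, and
the choice of formula is an assumption.

```python
import math


def stress(coords, dissim):
    """Kruskal stress-1: sqrt(sum (d_ij - delta_ij)^2 / sum d_ij^2) over pairs i < j."""
    n = len(coords)
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])  # distance on the map
            numerator += (d - dissim[i][j]) ** 2
            denominator += d ** 2
    return math.sqrt(numerator / denominator)


# Three points forming a 3-4-5 triangle.
coords = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
exact = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]      # reproduced perfectly by the map
distorted = [[0, 3, 4], [3, 0, 6], [4, 6, 0]]  # one dissimilarity is off by 1

perfect_fit = stress(coords, exact)
imperfect_fit = stress(coords, distorted)
```

A configuration that reproduces the input exactly has stress 0. The distorted
input gives a stress of about 0.14, which falls between the two thresholds
quoted above.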

This routine only works if the regional settings are set to UK or USA. If you do
not have these regional settings and do not get a plot then change them in the
settings control panel on your machine.

REFERENCES Gower
TOOLS > MDS > NON-METRIC

PURPOSE Non-metric multidimensional scaling of a proximity matrix.

DESCRIPTION Given a matrix of proximities (similarities or dissimilarities) among a set of
items, the program finds a set of points in k-dimensional space such that the
Euclidean distances among these points correspond as closely as possible to a
rank preserving transformation of the input proximities. The algorithm is based
on the MDS(X) MINISSA program.

PARAMETERS
Input dataset
Name of file containing proximity matrix. Data type: Square symmetric matrix.

No of dimensions: (Default = 2)
Number of dimensions to use in representing items in Euclidean space.

Similarities or Dissimilarities? (Default = Similarities)


Whether the data represent similarities or dissimilarities. If similarities, large
values of X(i,j) will draw i and j close together on the MDS map. If
dissimilarities, large values will push i and j apart on the map.

Starting Configuration: (Default = Torsca)


How to generate initial location of points in space.
Choices are:

Metric - Performs Gower's classical metric ordination procedure.

Torsca - Uses principal components of rank-order data.

File - Reads starting coordinates from UCINET dataset.

Random - Locates points randomly in space.

Starting Config Filename


Name of the coordinate dataset if the file option is chosen. This UCINET dataset
should consist of an nxk matrix of values. Each column corresponds to the co-
ordinates in each of the dimensions specified. Hence row i gives the co-ordinates
of the ith point.

Print Diagnostics (Default = No)


If Yes is selected then dyads with large discrepancies between the proximity data
and the plot distances will be printed.

Output dataset: (Default = 'NonMetricMdsCoord')


Name of file containing the co-ordinates of the points in Euclidean space.

LOG FILE The output first gives a 2D scatterplot of the first pair of co-ordinates. The x-axis
is the first co-ordinate set and the y-axis is the second. The scatterplot can be
saved or printed. Simple editing can be achieved using the options button. The
labels can be turned on or off and values can be attached to the points (or
removed). The scales can also be changed. More advanced editing is possible by
double clicking in the plot, which invokes the chart wizard. To find the label
attached to a single point when all the labels are removed, click on a single point
(this will highlight all the points), then click a second time to highlight one vertex.
Now double click on the vertex and the label will be highlighted in the chart
designer. The save button and the save chart data option allow the user to save
all the chart data into a file which can be reviewed using
Tools>Scatterplot>Review. The chart itself can be saved as a windows metafile
which can then be read into a word processing or graphics package. Only one
chart can be open at one time and the chart window will be closed if you click on
any other UCINET window. Behind the chart is a numeric display of coordinates
of each point in space together with information about the stress. If the print
diagnostics option has been selected then dyads with large differences between the
proximity data and the distances in the co-ordinate data are listed.

TIMING O(N^4)

COMMENTS MDS solutions are not unique, and they are subject to convergence to local
minima. The first point means that two or more maps can be equally good (same
stress) but place points in radically different locations. The second point means
that it is possible for the algorithm to fail to find the configuration with least
stress. If you suspect this has happened, run the program several times using
random starting configurations. Stress values below 0.1 are excellent and above
0.2 unacceptable.

This routine only works if the regional settings are set to UK or USA. If you do
not have these regional settings and do not get a plot then change them in the
settings control panel on your machine.

REFERENCES Kruskal J B and Wish M (1978). Multidimensional Scaling, Newbury Park: Sage
Publications.

Kruskal J B (1964). Multidimensional Scaling by optimizing goodness-of-fit to a
non-metric hypothesis. Psychometrika 29, 1-27.
TOOLS > CLUSTERING > HIERARCHICAL

PURPOSE Perform Johnson's hierarchical clustering on a proximity matrix.

DESCRIPTION Given a symmetric n-by-n matrix representing similarities or dissimilarities
among a set of n items, the algorithm finds a series of nested partitions of the items. The
different partitions are ordered according to decreasing [increasing] levels of
similarity [dissimilarity]. The algorithm begins with the identity partition (in
which all items are in different clusters). It then joins the pair of items most
similar (least different), which are then considered a single entity. The algorithm
continues in this manner until all items have been joined into a single cluster (the
complete partition).
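The procedure can be sketched in Python as follows. This is a minimal
illustration for dissimilarity data, not the program's own implementation: the
function name hierarchical and the 4-item matrix are hypothetical, and ties are
broken by taking the first closest pair found, which is an assumption.

```python
def hierarchical(dist, method="average"):
    """Agglomerative clustering; returns a list of (level, partition) pairs."""
    linkage = {
        "single": min,                                     # smallest dissimilarity
        "complete": max,                                   # largest dissimilarity
        "average": lambda values: sum(values) / len(values),
    }[method]
    clusters = [[i] for i in range(len(dist))]  # identity partition
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                values = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                level = linkage(values)
                if best is None or level < best[0]:
                    best = (level, a, b)
        level, a, b = best
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
        history.append((level, sorted(clusters)))
    return history


d = [[0, 1, 4, 5],
     [1, 0, 3, 6],
     [4, 3, 0, 2],
     [5, 6, 2, 0]]

single = hierarchical(d, "single")
complete = hierarchical(d, "complete")
average = hierarchical(d, "average")
```

In this example all three methods join items 0 and 1 first and items 2 and 3
second. They differ only in the level at which the final two clusters merge.
For similarity data the comparison would be reversed, so that the most similar
pair is joined first.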

PARAMETERS
Input dataset
Name of file containing proximity matrix to be clustered. Data type: Square
symmetric matrix.

Method: (Default = AVERAGE)


Choices are:

SINGLE_LINK
Also known as the "minimum" or "connectedness" method. Distance between
two clusters is defined as smallest dissimilarity (largest similarity) between
members.

COMPLETE_LINK
Also known as the "maximum" or "diameter" method. Distance between two
clusters is defined as largest dissimilarity (smallest similarity) between members.

AVERAGE
Distance between clusters defined as average dissimilarity (or similarity)
between members.

Similarities or Distances? (Default = Similarities)


Whether items i and j should be clustered together when X(i,j) is large or when it
is small. If data are Similarities, items i and j are clustered together if X(i,j) is
very large. If data are Dissimilarities, items i and j are clustered together if X(i,j)
is very small.

Compute ultrametric proximity matrix? (Default = NO)


Hierarchical clustering can be seen as transforming a dissimilarity matrix into an
ultrametric distance matrix. The ultrametric distances correspond monotonically
to the number of iterations (partitions) needed to join a given pair of items.

Diagram Type: (Default = Dendrogram)


The clustering can be shown as a dendrogram or a tree diagram.

Output Partition matrix: (Default = 'Part')


Name of dataset to contain the partition-by-item indicator matrix. Each column
of this matrix gives the cluster to which each item was assigned in a given
partition. The columns are labeled by the level of the cluster. A value of k in a
column labeled x and row j means that actor j was in cluster k at level x. Actor
k is always a member of cluster k and is a representative label for the group. It
can be used by procedures like Transform>Block to obtain density matrices at
any level of blocking. This file is not displayed in the LOG FILE.

Output Ultrametric matrix (if desired):


Name of dataset to contain the item-by-item ultrametric proximity matrix, if
desired.

LOG FILE The primary output consists of cluster diagrams. The first diagram (either a tree diagram or a
dendrogram) re-orders the actors so that they are located close to other actors in
similar clusters. The level at which any pair of actors are aggregated is the point
at which both can be reached by tracing from the start to the actors from right to
left. The scale at the top gives the level at which they are clustered. The diagram
can be printed or saved. Parts of the diagram can be viewed by moving the
mouse to the split point in a tree diagram or the beginning of a line in the
dendrogram and clicking. The first click will highlight a portion of the diagram
and the second click will display just the highlighted portion. To return to the
original, right click on the mouse. There is also a simple zoom facility: simply
change the values and then press enter. If the labels need to be edited
(particularly the scale labels) then you should take the partition indicator matrix
into the spreadsheet editor, remove or reduce the labels and then submit the edited
data to Tools>Dendrogram>Draw. The output also produces a standard Log file
that contains a different cluster diagram which looks like this:

        A B C D E F G H I J
                          1
Level   1 2 3 4 5 6 7 8 9 0
-----   - - - - - - - - - -
1.000   XXXXX XXX XXX XXXXX
1.422   XXXXX XXX XXXXXXXXX
1.578   XXXXXXXXX XXXXXXXXX
3.287   XXXXXXXXXXXXXXXXXXX

In this example, the data were distances among 10 items, labeled A through J.
The results are 4 nested partitions, corresponding to rows in the diagram. Within
a given row, an 'X' between two adjacent columns indicates that the items
associated with those columns were assigned to the same cluster in that partition.
For example, in the first partition (level 1.000), items D and E belong to the same
cluster, but C is a member of a different cluster. In the third partition (level
1.578), items D, E and C all belong to the same cluster.
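The reading rule just described can be expressed as a short Python sketch. It
assumes the layout of the example, in which each item occupies one character
and adjacent items are separated by one character. The function name
clusters_from_row is hypothetical.

```python
def clusters_from_row(row, labels):
    """Split items into clusters: an 'X' between two adjacent columns joins them."""
    clusters = [[labels[0]]]
    for k in range(1, len(labels)):
        # Item k sits at character 2*k, so character 2*k - 1 lies between
        # item k - 1 and item k.
        if row[2 * k - 1] == "X":
            clusters[-1].append(labels[k])
        else:
            clusters.append([labels[k]])
    return clusters


labels = list("ABCDEFGHIJ")
first = clusters_from_row("XXXXX XXX XXX XXXXX", labels)   # level 1.000
third = clusters_from_row("XXXXXXXXX XXXXXXXXX", labels)   # level 1.578
```

Applied to the first row this yields the clusters ABC, DE, FG and HIJ, and
applied to the third row it yields ABCDE and FGHIJ.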

The levels indicate the degree of association (similarity or dissimilarity) among
items within clusters. If, as in the example, the data are distances and the
clustering method is single link, then a level of 1.578 means that every item within
a cluster is no more than 1.578 units distant from at least one other item in that
cluster. If the clustering method is complete link, a level of 1.578 indicates that
every item in a cluster is no more than 1.578 units distant from every other item in
the cluster. For the average clustering method, a level of 1.578 indicates that the
average distance among items within the cluster is 1.578.

For similarity data, the meaning of the levels for the single link and complete link
methods is, in a sense, reversed. For the single link method, a level of 1.578
means that every item in a cluster is at least 1.578 units similar to at least one
other item in the cluster. For the complete link method, a level of 1.578 means
that every item in a cluster is at least 1.578 units similar to every other item in the
cluster.

TIMING O(N^3)

COMMENTS None.

REFERENCES Johnson, S C (1967). 'Hierarchical clustering schemes'. Psychometrika, 32, 241-253.
TOOLS > CLUSTERING > OPTIMISATION

PURPOSE Optimizes a cost function which measures the total distance or similarity within
classes for a proximity matrix.

DESCRIPTION Given a partition of a proximity matrix of similarities into clusters, then the
average similarity values within each gives a measure of the extent to which the
groups form clusters. A slightly different approach is required for distance data -
in this case the cost is measured by summing the values for each pair of actors
belonging to the same block. The routine attempts to optimize these measures to
try and find the best fit for a given number of blocks. The cost function can be
changed to give greater weight to relationships between the clusters. In this case
the cost simultaneously reflects a high degree of association within clusters and
a similarity of association between members of different clusters using a
correlation criterion. To do this, correlate the data with an ideal structure matrix
A(i,j) in which the (i,j)th entry is a one if actors i and j are in the same cluster
and zero otherwise. This correlation can either be Pearson correlation or a much
faster pseudocorrelation measure. This cost is then either maximized or
minimized depending on whether the proximity matrix contains similarities or
distances. The similarity value needs to be maximized and the distance measure
minimized. The routine uses a tabu search minimization procedure and
therefore, to maximize, it multiplies the costs by -1.
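To make the density and correlation criteria concrete, the Python sketch below
evaluates both for a given partition of similarity data. It is an illustration
only: the function names and the 5-actor matrix are hypothetical, the diagonal
is excluded (matching the default setting described below), and the
pseudocorrelation measure is not shown because it is not defined here.

```python
import math


def ideal_matrix(partition):
    """A(i,j) = 1 if actors i and j are in the same cluster, else 0."""
    n = len(partition)
    return [[int(partition[i] == partition[j]) for j in range(n)] for i in range(n)]


def offdiag(matrix):
    """Off-diagonal cells in row-major order."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_fit(data, partition):
    """Pearson correlation between the data and the ideal structure matrix."""
    return pearson(offdiag(data), offdiag(ideal_matrix(partition)))


def within_density(data, partition):
    """Average value over pairs of actors that share a cluster."""
    same = offdiag(ideal_matrix(partition))
    values = [v for v, s in zip(offdiag(data), same) if s]
    return sum(values) / len(values)


sim = [[0, 1, 1, 0, 0],
       [1, 0, 1, 0, 0],
       [1, 1, 0, 0, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0]]
good = [1, 1, 1, 2, 2]   # matches the two visible blocks
bad = [1, 1, 2, 1, 2]    # mixes the blocks

good_fit = correlation_fit(sim, good)
bad_fit = correlation_fit(sim, bad)
```

A search procedure would prefer the first partition because it gives the
larger value on either criterion. For distance data the within-cluster values
would be summed and the result minimized instead.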

PARAMETERS
Input dataset:
Name of file containing proximity matrix to be clustered. Data type: Square
symmetric matrix.

Number of clusters: (Default = 2)


Number of clusters into which the actors must be assigned.

Fit criterion
Density the average value within clusters for similarity and the sum for distance
data.
PseudoCorrelation a simple fast correlation measure between the clustered data
and the ideal structure matrix.
Correlation the Pearson correlation measure between the clustered data and the
ideal structure matrix.

Are diagonal values valid? (Default = No)


Whether diagonals are to be included in the cost function.

Type of Data:
Similarities causes large values to be clustered together. Distances causes small
values to be clustered together.

Max # of iterations in a series: (Default = 12)


The algorithm starts from an arbitrary partition and attempts to decrease the cost
by taking the steepest descent. If the cost cannot be reduced then the algorithm
continues its search in the neighborhood of the current partition. This search
direction is a mildest ascent direction and from there new search directions are
explored. This exploration only continues for a fixed number of iterations in a
series. If no improvement is made after the fixed number of iterations the
algorithm terminates with the current minimum. Increasing the parameter gives
a more exhaustive and therefore slower search. The recommended default value
is automatically entered on the form once the input data has been selected.

Length of time in penalty box: (Default = 5)


If the algorithm makes an ascending step then it is possible that the best possible
descending step is the reverse of the direction just taken. This parameter
prohibits a move along the reverse direction for a set number of steps. The
larger the value the more difficult it will be to come back to a previously
explored local minimum, however it will also be more difficult to explore the
vicinity of that minimum. The default has been shown experimentally to be the
most useful.

Number of random starts: (Default = 3)


The whole procedure is repeated with a different initial partition for each start.
The best of these results is then selected as the minimum.

Random Number Seed:


The random number seed generates the initial partition. UCINET generates a
different random number as default each time it is run. This number should be
changed if the user wishes to repeat the analysis with different initial
configurations. The range is 1 to 32000.

Output Partition Dataset: (Default = 'TabuCluster').


Name of output file which contains a partition indicator vector. This vector has
the form (k1,k2,...ki...) where ki assigns vertex i to block ki, so that (1 1 2 1 2)
assigns vertices 1, 2 and 4 to block 1, and 3 and 5 to block 2. This vector is not
displayed at output.
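For readers who post-process this dataset outside UCINET, the short Python
sketch below converts a partition indicator vector into a list of blocks. The
function name blocks_from_vector is hypothetical.

```python
from collections import defaultdict


def blocks_from_vector(vector):
    """Map each block number to the (1-based) vertices assigned to it."""
    blocks = defaultdict(list)
    for vertex, block in enumerate(vector, start=1):
        blocks[block].append(vertex)
    return dict(blocks)


blocks = blocks_from_vector([1, 1, 2, 1, 2])
```

For the vector used in the text this gives vertices 1, 2 and 4 in block 1 and
vertices 3 and 5 in block 2.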

LOG FILE The value of the cost function.


List of clusters. Each cluster is labeled and is specified by the vertices it
contains.
The blocked proximity matrix. The rows and columns of the original matrix are
permuted into clusters. The proximity matrix is displayed in terms of the matrix
clusters it contains.

TIMING Each iteration of the tabu search algorithm is O(N^2).

COMMENTS Care should be taken when using this routine.


The algorithm seeks to find the minimum of the cost function. Even if successful
this result may still have a high value in which case the blocking may not
conform very closely to structural equivalence.

In addition there may be a number of alternative partitions which also produce
the minimum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local minimum and does not
locate the desired global minimum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into the
reported blocks.

REFERENCES Glover F (1989). Tabu Search - Part I. ORSA Journal on Computing 1, 190-
206.

Glover F (1990). Tabu Search - Part II. ORSA Journal on Computing 2, 4-32.
TOOLS > 2 MODE > SVD

PURPOSE Perform a singular value decomposition of a real-valued matrix.

DESCRIPTION Given an n-by-m matrix X with n ≥ m, SVD finds matrices U, D, and V such that
X = UDV'. The matrix D is an r-by-r diagonal matrix containing r singular
values. The matrix U is an n-by-r matrix containing the r eigenvectors of XX'
and V is an m-by-r matrix containing the r eigenvectors of X'X. The
eigenvectors are sorted in descending order by eigenvalue. With symmetric data,
U and V are identical (except for sign reversals).

PARAMETERS
Input dataset:
File containing matrix X to be decomposed; must have at least as many rows as
columns (otherwise transpose the matrix then resubmit). Data type: Matrix.

How to scale row and column scores: (Default = Axes)


Choices are:

Coordinates -
Eigenvectors are weighted by their respective eigenvalues.

Loadings
Eigenvectors are weighted by the square root of the eigenvalues (yields factor
loadings when SVD is applied to correlation matrix).

Axes
No rescaling is performed.

No of factors to save: (Default = 3)


Maximum value of r, the number of eigenvectors used to decompose X.

Reconstruct matrix from factors: (Default = No)


If YES, the product UDV' is computed using r eigenvectors (see 'Number of
factors to save', above). The result is the best possible approximation of X using
matrices of rank r based on a least squares criterion.
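The reconstruction step can be illustrated with a small Python sketch. The
decomposition below was chosen by hand so that it can be checked by eye; the
function name reconstruct is hypothetical and the singular values are passed as
a plain list holding the diagonal of D.

```python
def reconstruct(u, d, v, r):
    """Return the product U D V' using only the first r factors."""
    n, m = len(u), len(v)
    return [[sum(u[i][k] * d[k] * v[j][k] for k in range(r)) for j in range(m)]
            for i in range(n)]


# X = [[3, 0], [0, 2], [0, 0]] has the decomposition below.
u = [[1, 0],
     [0, 1],
     [0, 0]]
d = [3, 2]
v = [[1, 0],
     [0, 1]]

full = reconstruct(u, d, v, r=2)      # reproduces X exactly
rank_one = reconstruct(u, d, v, r=1)  # keeps only the largest singular value
```

With r = 1 only the entry belonging to the largest singular value survives,
which is the sense in which the result approximates X with a matrix of rank r.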

(Output) File to contain row scores: (Default = 'RScores')


Name of dataset to contain U matrix.

(Output) File to contain column scores: (Default = 'CScores')


Name of dataset to contain V matrix.

(Output) File to contain singular values: (Default = 'Eigen')


Name of dataset to contain D matrix.

(Output) File to contain reconstructed matrix: (Default = 'Recon')


Name of dataset to contain the approximation X that is UDV'.

(Output) File to contain combined row/column scores: (Default = 'RCScores')


Name of dataset to contain concatenated U and V matrices to produce single
(m+n)-by-r matrix (useful for plotting row and column scores on same map).

LOG FILE The output first gives a 2D scatterplot of the first two dimensions (eigenvectors).
The scatterplot can be saved or printed. Simple editing can be achieved using the
options button. The labels can be turned on or off and values can be attached to
the points (or removed). The scales can also be changed. More advanced editing
is possible by double clicking in the plot, which invokes the chart wizard. To find
the label attached to a single point when all the labels are removed, click on a single
point (this will highlight all the points), then click a second time to highlight one
vertex. Now double click on the vertex and the label will be highlighted in the
chart designer. The save button and the save chart data option allow the user to
save all the chart data into a file which can be reviewed using
Tools>Scatterplot>Review. The chart itself can be saved as a windows metafile
which can then be read into a word processing or graphics package. Only one
chart can be open at one time and the chart window will be closed if you click on
any other UCINET window.

Behind the chart is a numeric display of coordinates (U and V matrices) of each
point (rows and columns of X) in r-space.

TIMING O(N^3).

COMMENTS This routine only gives a plot if the regional settings are set to UK or USA. If
you do not have these regional settings and do not get a plot then change them in
the settings control panel on your machine.

REFERENCES Press W H, Flannery B P, Teukolsky S A and Vetterling W T (1989). Numerical
Recipes in Pascal. New York: Cambridge University Press.
TOOLS > 2 MODE > FACTOR ANALYSIS

PURPOSE Perform a complete factor analysis of a 2-mode matrix.


DESCRIPTION Decomposes a matrix into factors using either principal components or minimum
residuals methods.

PARAMETERS
Input dataset.
Name of dataset containing 2-mode matrix to be factored. Data type: Matrix.

Method of factor analysis (Default = Principal Components)


Choices are

Principal Components
Perform a principal component analysis in which the matrix is factored into a
product of the most dominant eigenvectors.

Minimum Residuals
Factor the matrix into factors so that the residuals (the sum of squares of the
difference between the original data and the product of the factors) are
minimized.

Method of factor rotation: (Default = Varimax)


Choices are

None
No rotation is performed

Varimax. Maximizes purity of factors.

Quartimax. Maximizes purity of variables (minimizes loading on multiple
factors). Factors are rotated after deleting excess factors (see below).

Number of factors: (Default=3)


Number of factors into which to decompose the matrix. IMPORTANT NOTE:
Factors are rotated after deleting excess factors.

(OUTPUT) Factor Scores: (Default = 'Scores')


Name of file containing the factor scores for each actor on each factor.

(OUTPUT) Factor Loadings: (Default = 'Loadings')


Name of file containing the factor loadings for each actor on each factor.

(OUTPUT) Eigenvectors: (Default= 'Eigen')


Name of file containing eigenvalues corresponding to each eigenvector (factor).

(OUTPUT) Factor score coefficients: (Default='Coefs')


Name of file containing the factor coefficients for each actor on each factor.

LOG FILE The log file gives a full set of descriptive statistics of each actor's profile. These
are followed by the eigenvalues placed in descending order of size and labeled as
factors in ascending order. The value of each is expressed as a percentage of the
sum and a cumulative percentage of all the factors given so far is presented. The
final column gives the ratio of the factor below to the current factor. This is
followed by a matrix of factor loadings; entry X(i,j) is the loading of the jth
factor on actor i.

TIMING O(N^3)

COMMENTS None

REFERENCES None
TOOLS > 2 MODE > CORRESPONDENCE

PURPOSE Perform a correspondence analysis of a single real-valued matrix.

DESCRIPTION Given a non-negative, n-by-m matrix with n ≥ m, this routine represents the n
rows and m columns as vectors in a common multidimensional space. The
algorithm essentially performs a singular value decomposition of an adjusted
data matrix in which rows and columns have been separately normalized to yield
more equal marginals.

PARAMETERS
Input dataset:
Name of file containing matrix to be analyzed, it must have at least as many rows
as columns (otherwise transpose the matrix then resubmit). Data type: Matrix.

How to scale row and column scores: (Default = COORDINATES)


Choices are:

Coordinates - Scores for each point on each dimension adjusted both for point
marginals and dimension weights (eigenvalues).

CGS - According to Carroll-Green-Schaffer, this transformation makes distance
between a row and a column just as interpretable as distance between a row and a
row or a column and a column.

Optimal - Scores for each point are corrected for point marginals, but not
dimension weights.

Axes - No rescaling is performed.

Number of factors to save: (Default = 3)


Maximum value of r, the number of eigenvectors used to decompose the matrix.

Reconstruct matrix from factors: (Default = No)


If YES, the row and column scores are combined to approximate the data matrix
with r eigenvectors (see 'Number of factors to save', above). The result is the
best possible approximation of X using matrices of rank r based on a least
squares criterion.

Keep the trivial first factor: (Default = No)


The normalization step prior to singular value decomposition causes the first
eigenvector to be constant. If Yes, this factor is retained and eigenvalue
percentages include it. If No, the factor is dropped and eigenvalue percentages
do not include it.

(Output) File to contain row scores: (Default = 'CorrespondenceRScores')


Name of dataset to contain coordinates of row points.

(Output) File to contain column scores: (Default = 'CorrespondenceCScores')


Name of dataset to contain coordinates of column points.

(Output) File to contain singular values: (Default = 'CorrespondenceEigen')


Name of dataset to contain eigenvalue of each dimension.

(Output) File to contain reconstructed matrix: (Default = 'CorrespondenceRecon')
Name of dataset to contain the approximated data matrix (if any).

(Output) File to contain combined row/column scores: (Default = 'CorrespondenceRscores')
Name of dataset to contain concatenated row and column scores to produce
single (m+n)-by-r matrix (useful for plotting row and column scores on same
map).

LOG FILE The output first gives a 2D scatterplot of the first two dimensions (eigenvectors).
The scatterplot can be saved or printed. Simple editing can be achieved using the
options button. The labels can be turned on or off and values can be attached to
the points (or removed). The scales can also be changed. More advanced editing
is possible by double clicking in the plot, which invokes the chart wizard. To find
the label attached to a single point when all the labels are removed, click on a single
point (this will highlight all the points), then click a second time to highlight one
vertex. Now double click on the vertex and the label will be highlighted in the
chart designer. The save button and the save chart data option allow the user to
save all the chart data into a file which can be reviewed using
Tools>Scatterplot>Review. The chart itself can be saved as a windows metafile
which can then be read into a word processing or graphics package. Only one
chart can be open at one time and the chart window will be closed if you click on
any other UCINET window.

The log file has a numeric display of coordinates (eigenvectors) of each point in
r-space.

TIMING O(N^3).

COMMENTS See the SVD routine for more information.


This routine only gives a plot if the regional settings are set to UK or USA. If
you do not have these regional settings and do not get a plot then change them in
the settings control panel on your machine.

REFERENCES None.
TOOLS > SIMILARITIES

PURPOSE Compute similarities among rows or columns of a matrix using one of various
measures.

DESCRIPTION Given a matrix with n rows and m columns, the program computes either an n-
by-n matrix of similarities among the rows, or an m-by-m matrix of similarities
among the columns.

PARAMETERS

Input dataset:
Name of file containing matrix to be analyzed. Data type: Matrix.

Measure of profile similarity: (Default = CORRELATION)


Choices are:

Correlation - Pearson's product-moment correlation.


Covariance - Mean-centered cross products: Σxy/n - ΣxΣy/n^2
Cross-Products - Sum of products: Σxy
Matches - Proportion of cases in which xi = yi for all i
Positive Matches - Proportion of cases in which xi = yi given that either xi > 0 or
yi > 0 or both
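The measures listed above can be written out for two profiles x and y as in the
Python sketch below. The function names are hypothetical and the sketch ignores
missing values. Correlation is computed as the covariance divided by the
product of the standard deviations, which is the usual Pearson definition
rather than something stated in this help file.

```python
import math


def cross_products(x, y):
    return sum(a * b for a, b in zip(x, y))


def covariance(x, y):
    n = len(x)
    return cross_products(x, y) / n - sum(x) * sum(y) / n ** 2


def correlation(x, y):
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def matches(x, y):
    return sum(a == b for a, b in zip(x, y)) / len(x)


def positive_matches(x, y):
    # Only cases where at least one of the two values is positive are counted.
    pairs = [(a, b) for a, b in zip(x, y) if a > 0 or b > 0]
    return sum(a == b for a, b in pairs) / len(pairs)


x = [1, 0, 1, 0]
y = [1, 1, 0, 0]
```

For these two profiles the match proportion is 0.5 (cases 1 and 4 agree),
while the positive match proportion is 1/3 because the shared zero in case 4
is no longer counted.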

Compute similarities among Rows or Cols: (Default = COLUMNS)


If Rows, an n-by-n similarity matrix representing the similarity between each
pair of rows is computed. If Columns, an m-by-m similarity matrix is computed
representing the similarity between each pair of columns.

(For sq. mats) Diagonal valid (Default = YES)


If No, values along the main diagonal are treated as though they were missing.

Output dataset: (Default = 'Similarities')


Name of dataset to contain output similarity matrix.

LOG FILE Similarity matrix, displayed with 2 decimal places.

TIMING O(N^3).

COMMENTS Missing values are ignored.

REFERENCES None.
TOOLS > DISSIMILARITIES

PURPOSE Compute dissimilarities among rows or columns of a matrix using one of various
measures.

DESCRIPTION Given a matrix with n rows and m columns, the program computes either an n-
by-n matrix of dissimilarities among the rows, or an m-by-m matrix of
dissimilarities among the columns.

PARAMETERS
Input dataset:
Name of file containing matrix to be analyzed. Data type: Matrix.

Measure of profile similarity: (Default = 'EUCLIDEAN')


Choices are:

Euclidean
Euclidean distance: SQRT(Σ(xi-yi)^2). When missing values are present, the
computed distance is multiplied by n/m where n is the size of the vectors and m
is the number of non-missing values.

Manhattan
City-block distance: Σ abs(xi-yi). When missing values are present, the computed
distance is multiplied by n/m where n is the size of the vectors and m is the
number of non-missing values.

Normed SSD
Normed sum of squared differences: Σ(xi-yi)^2 / Σxi^2 Σyi^2

Non-Matches
Proportion of cases in which xi does not equal yi for all i.

Positive Non-Matches
Proportion of cases in which xi does not equal yi given that either xi > 0 or yi > 0
or both.
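The Python sketch below illustrates the Euclidean, Manhattan and Non-Matches
measures, including the n/m correction for missing values. It is a sketch under
stated assumptions: missing values are represented by None, a case counts as
non-missing only if both profiles have a value for it, and the function names
are hypothetical.

```python
import math


def _valid_pairs(x, y):
    """Cases for which neither profile has a missing (None) value."""
    return [(a, b) for a, b in zip(x, y) if a is not None and b is not None]


def euclidean(x, y):
    pairs = _valid_pairs(x, y)
    distance = math.sqrt(sum((a - b) ** 2 for a, b in pairs))
    return distance * len(x) / len(pairs)   # n/m correction


def manhattan(x, y):
    pairs = _valid_pairs(x, y)
    distance = sum(abs(a - b) for a, b in pairs)
    return distance * len(x) / len(pairs)   # n/m correction


def non_matches(x, y):
    pairs = _valid_pairs(x, y)
    return sum(a != b for a, b in pairs) / len(pairs)


x = [0, 3, None, 1]
y = [4, 0, 2, 1]

d_euclid = euclidean(x, y)     # 5 * 4/3, since 3 of the 4 cases are usable
d_manhattan = manhattan(x, y)  # 7 * 4/3
```

With no missing values the correction factor is 1 and the ordinary distances
are returned.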

Compute dissimilarities among Rows or Cols (Default = COLUMNS)


If Rows, an n-by-n dissimilarity matrix representing the dissimilarity between
each pair of rows is computed. If Columns an m-by-m dissimilarity matrix is
computed representing the dissimilarity between each pair of columns.

(For sq. mats) Diagonal valid (Default = YES)


If No, values along the main diagonal are treated as though they were missing.

Output dataset:(Default = Dissimilarities)


Name of dataset to contain output dissimilarity matrix.

LOG FILE Dissimilarity matrix.

TIMING O(N^3).

COMMENTS Missing values are ignored.

REFERENCES None.
TOOLS > STATISTICS > UNIVARIATE

PURPOSE Compute standard univariate statistics on values of a matrix.

DESCRIPTION Procedure computes mean, standard deviation, variance, Euclidean norm,
maximum, minimum and total number of observations for each row or column of
a matrix, or for the matrix taken as a whole.
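As a rough illustration, the statistics named above can be computed for each
column of a small matrix as in the following Python sketch. The function names
are hypothetical. The sketch divides by n when computing the variance; whether
the program divides by n or by n-1 is not stated here, so that choice is an
assumption.

```python
import math


def univariate(values):
    """Summary statistics for one row, one column, or the whole matrix."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # assumption: divide by n
    return {
        "mean": mean,
        "std_dev": math.sqrt(variance),
        "variance": variance,
        "euclidean_norm": math.sqrt(sum(v ** 2 for v in values)),
        "minimum": min(values),
        "maximum": max(values),
        "n_obs": n,
    }


def column_stats(matrix):
    """Apply univariate() to every column of a list-of-rows matrix."""
    return [univariate(list(column)) for column in zip(*matrix)]


data = [[1, 2],
        [3, 4],
        [5, 6]]
stats = column_stats(data)
```

Statistics for rows follow by passing each row to univariate directly, and
statistics for the matrix as a whole by passing all cells as one list.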

PARAMETERS
Input dataset
Name of file containing matrix to be analyzed. Data type: Matrix.

Which dimension to analyse: (Default = COLUMNS)


Choices are:

Rows - Statistics are computed separately for each row in matrix. Result is a
matrix whose rows correspond to the rows of the data matrix and the columns are
statistics.

Columns - Statistics are computed separately for each column in matrix. Result
is a matrix whose columns correspond to the columns of the data matrix and the
rows are statistics.

Matrices - Statistics are computed on the matrix as a whole.

(For square mats) Diagonal valid? (Default = YES)


Whether diagonal values in square matrices are to be ignored (treated like
missing values).

Output Dataset: (Default = 'UnivariateStats')


Name of data set to contain output statistics.

LOG FILE Matrix of statistics.

TIMING O(N^2).

COMMENTS Missing values are ignored.

REFERENCES None.
TOOLS > STATISTICS > MATRIX (QAP) > QAP-CORRELATION
PURPOSE Compute correlation between entries of two square matrices, and assess the
frequency of random correlations as large as actually observed.

DESCRIPTION The procedure is principally used to test the association between networks.
Often, one network is an observed network while the other is a model or
expected network.

The algorithm proceeds in two steps. In the first step, it computes Pearson's
correlation coefficient (as well as simple matching coefficient) between
corresponding cells of the two data matrices. In the second step, it randomly
permutes rows and columns (synchronously) of one matrix (the observed matrix,
if the distinction is relevant) and recomputes the correlation.

The second step is carried out hundreds of times in order to compute the
proportion of times that a random correlation is larger than or equal to the
observed correlation calculated in step 1. A low proportion (< 0.05) suggests a
strong relationship between the matrices that is unlikely to have occurred by
chance.
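
The two steps can be sketched as below. This is a minimal illustration under stated assumptions, not the program's implementation: the function names are hypothetical, diagonals are treated as missing (the default), only Pearson's correlation is computed, and Python's own random generator stands in for the program's.

```python
import math
import random

def offdiag_cells(a, order=None):
    """Off-diagonal cells of a, with rows and columns relabelled by order."""
    n = len(a)
    if order is None:
        order = list(range(n))
    return [a[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)

def qap_correlation(a, b, n_perm=500, seed=1):
    """Observed correlation and proportion of permuted correlations as large."""
    rng = random.Random(seed)
    xb = offdiag_cells(b)
    observed = pearson(offdiag_cells(a), xb)    # step 1
    order = list(range(len(a)))
    as_large = 0
    for _ in range(n_perm):                     # step 2, repeated
        rng.shuffle(order)                      # same order for rows and columns
        if pearson(offdiag_cells(a, order), xb) >= observed - 1e-12:
            as_large += 1
    return observed, as_large / n_perm
```

Because rows and columns are permuted together, diagonal cells stay on the diagonal, so excluding them after the permutation is consistent with excluding them before it.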

PARAMETERS
Data Matrix:
Name of dataset containing the first matrix (the observed or dependent matrix, if
such distinctions are meaningful). Data type: Square Matrix.

Structure Matrix:
Name of dataset containing the expected, modelled or independent matrix (if
such distinctions are meaningful). Data type: Square Matrix.

Number of random permutations: (Default = 500)


Number of correlations to compute between the data matrix and the randomly
permuted structure matrix. The larger the number of permutations, the better the
estimates of standard error and "significance", but the longer the computation
time.

Treat diagonals as valid? (Default = NO)


If YES, the values along the main diagonals of each matrix are included in the
computation of correlation. Otherwise, they are treated as missing.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

LOG FILE The following sample output is generated:

                          CORRELATION   MATCHES
Observed value:                 0.207     0.000
Average:                        0.001     0.000
Standard deviation:             0.113     0.000
Proportion as large:            0.036     1.000
Proportion as small:            0.964     1.000
Proportion as extreme:          0.036     1.000

The correlation column indicates that the observed correlation between the two
networks was 0.207. The average random correlation was almost zero with a
standard deviation of 0.113. The percentage of random correlations that were as
large as 0.207 was 3.6%. At a typical 0.05 level, this correlation would be
considered significant since 0.036 < 0.05.

TIMING O(N^2) per permutation.

COMMENTS The program ignores missing values.

REFERENCES None.
TOOLS > STATISTICS > MATRIX (QAP) > QAP-REGRESSION
PURPOSE Regress a dependent matrix on one or more independent matrices, and assess
significance of the r-square and regression coefficients.

DESCRIPTION The procedure is principally used to model a social relation (matrix) using values
of other relations.

The algorithm proceeds in two steps. In the first step, it performs a standard
multiple regression across corresponding cells of the dependent and independent
matrices.

In the second step, it randomly permutes rows and columns (together) of the
dependent matrix and recomputes the regression, storing resultant values of r-
square and all coefficients. This step is repeated hundreds of times in order to
estimate standard errors for the statistics of interest. For each coefficient, the
program counts the proportion of random permutations that yielded a coefficient
as extreme as the one computed in step 1. The primary requirement for
conducting a multiple regression quadratic assignment procedure is that all the
variables in the regression have to be one-mode, two-way matrices. That is, they
must all be NxN networks. Person-by-object or Person-by-event matrices can be
converted to NxN matrices using Data>Affiliations.
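
A minimal sketch of the procedure, assuming diagonals are treated as missing and using a small hand-written least-squares solver so that the example needs no external library. All names are hypothetical, and the solver does not guard against collinear predictors.

```python
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(y, xs):
    """Coefficients (intercept first) of y regressed on the vectors in xs."""
    cols = [[1.0] * len(y)] + xs
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def cells(m, order):
    """Off-diagonal cells with rows and columns relabelled by order."""
    n = len(m)
    return [float(m[order[i]][order[j]])
            for i in range(n) for j in range(n) if i != j]

def qap_regression(dep, indeps, n_perm=500, seed=1):
    """Observed coefficients and proportion of permutations as extreme."""
    rng = random.Random(seed)
    ident = list(range(len(dep)))
    xs = [cells(m, ident) for m in indeps]
    observed = ols(cells(dep, ident), xs)        # step 1
    extreme = [0] * len(observed)
    order = ident[:]
    for _ in range(n_perm):                      # step 2, repeated
        rng.shuffle(order)                       # rows and columns together
        coefs = ols(cells(dep, order), xs)
        for k, (c, o) in enumerate(zip(coefs, observed)):
            extreme[k] += abs(c) >= abs(o) - 1e-12
    return observed, [e / n_perm for e in extreme]
```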

PARAMETERS
Dependent variable:
Name of dataset containing the observed or dependent data: the matrix whose
values are to be predicted. Data type: Square Matrix.

Independent variables:
Names of datasets containing the independent or predictor matrices. To include
more than one dataset using the browse button highlight all required files by
pressing Ctrl and clicking with the mouse. If the file names are typed they should
be separated by commas with no spaces. Data type: Square Matrices.

Number of random permutations: (Default = 500)


Number of regressions to compute using the randomly permuted dependent
matrix. The larger the number of permutations, the better the
estimates of standard error and "significance", but the longer the computation
time.

Treat diagonals as valid? (Default = No)


If Yes, the values along the main diagonals of each matrix are included in the
computations. Otherwise, they are treated as missing.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

LOG FILE Two tables are output. The first looks like this:

R-Square     One-Tailed Probability
   0.023                      0.618

The table gives the observed r-square along with the proportion of random trials
yielding an r-square as large or larger than the observed.

The second table is as follows:

               Unstandardized     Two-Tailed
Independent       Coefficient    Probability
Intercept            0.385965          0.178
R1                  -0.007519          0.866
R2                  -0.150376          0.170
R3                   0.000000          0.838

This table gives the unstandardized regression coefficient for each independent
variable, including the intercept, along with the proportion of random trials
yielding a coefficient with an absolute value as large or larger than the observed.
In this example, all the coefficients have non-significant probabilities, indicating
that the observed values are well within the range of random variation.

TIMING O(N^2).

COMMENTS The program ignores missing values.

REFERENCES None.
TOOLS > STATISTICS > AUTOCORRELATION > CATEGORICAL > JOIN COUNT

PURPOSE Perform randomization test of autocorrelation for a symmetric adjacency matrix
which is partitioned into two groups.

DESCRIPTION Relates a dyadic binary variable (an actor-by-actor adjacency matrix) to a
monadic variable (a vector representing an attribute of each actor). For example,
if the dyadic variable consists of who is friends with whom, and the categorical
variable is gender, the procedure tests whether friendship is patterned by gender
(e.g., do boys prefer boys and girls prefer girls?). The routine is limited to two
groups and is based upon counting the entries within and between the groups and
comparing them with a randomized model.
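
The counting and the comparison with a randomized model can be sketched as follows. The names are hypothetical, groups are assumed to be coded 1 and 2, and the sketch shuffles the attribute vector, which is equivalent to permuting the rows and columns of the matrix together.

```python
import random

def join_counts(adj, groups):
    """Ties within group 1, between the groups, and within group 2."""
    n = len(adj)
    counts = {"1-1": 0, "1-2": 0, "2-2": 0}
    for i in range(n):
        for j in range(i + 1, n):          # symmetric matrix: each pair once
            if adj[i][j]:
                key = "-".join(str(g) for g in sorted((groups[i], groups[j])))
                counts[key] += 1
    return counts

def expected_counts(adj, groups):
    """Counts expected if the ties were spread at random over all pairs."""
    n = len(adj)
    ties = sum(1 for i in range(n) for j in range(i + 1, n) if adj[i][j])
    n1, n2 = list(groups).count(1), list(groups).count(2)
    pairs = {"1-1": n1 * (n1 - 1) / 2, "1-2": n1 * n2, "2-2": n2 * (n2 - 1) / 2}
    total = n * (n - 1) / 2
    return {k: ties * v / total for k, v in pairs.items()}

def join_count_test(adj, groups, n_perm=10000, seed=1):
    """Observed counts, expected counts, and P(random count >= observed)."""
    rng = random.Random(seed)
    observed = join_counts(adj, groups)
    as_large = {k: 0 for k in observed}
    shuffled = list(groups)
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # reassign attributes at random
        sim = join_counts(adj, shuffled)
        for k in observed:
            as_large[k] += sim[k] >= observed[k]
    p_large = {k: as_large[k] / n_perm for k in observed}
    return observed, expected_counts(adj, groups), p_large
```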

PARAMETERS
Input Dataset
Name of file containing matrix to be analyzed. Data type: Graph

Partition Vector:
The name of a UCINET dataset that contains a partition of the actors into two
groups. To partition the data matrix into groups specify a vector by giving the
dataset name, a dimension (either row or column) and an integer value. For
example, to use the second row of a dataset called ATTRIB, enter "ATTRIB
ROW 2". The program will then read the second row of ATTRIB and use that
information to define the groups. All actors with identical values on the criterion
vector (i.e. the second row of attrib) will be placed in the same group.

No. of Permutations: (Default = 10000)


The number of random permutations required in the test.

Treat diagonals as valid? (Default = No)


If Yes, the values along the main diagonals of each matrix are included in the
computations. Otherwise, they are treated as missing.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

LOG FILE The actor attributes are recoded to 1 and 2, and these are reported.
This is followed by a table which gives the observed and expected counts for the data. The first row
gives the counts within group 1, the second is the counts between the groups and
the third is the counts within group 2. The expected simply gives the values that
would be expected if the ones were randomly distributed within and between the
groups. The observed gives the counts of the data and the difference subtracts the
expected from the observed. The P>=Diff and P<=Diff give the relative
frequency that a randomly permuted matrix gets a difference as large or larger
and as small or smaller than the observed. These columns are used to test the
significance of the observed data.

TIMING O(N^2)

COMMENTS None

REFERENCES Cliff, A D and Ord, J K 1973 Spatial Autocorrelation. Pion, London.
TOOLS > STATISTICS > AUTOCORRELATION > CATEGORICAL > RCT ANALYSIS

PURPOSE Perform randomization test of autocorrelation for a symmetric adjacency matrix
which is partitioned into groups.

DESCRIPTION Relates a dyadic binary variable (an actor-by-actor adjacency matrix) to a
monadic variable (a vector representing an attribute of each actor). For example,
if the dyadic variable consists of who is friends with whom, and the categorical
variable is gender, the procedure tests whether friendship is patterned by gender
(e.g., do boys prefer boys and girls prefer girls?). The routine is similar to
performing a standard chi squared test except instead of using the chi squared
distribution the underlying distribution is constructed using a randomization
procedure.
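
The idea can be illustrated with the sketch below. It is a simplification under stated assumptions: the names are hypothetical, groups are assumed to be coded 0, 1, 2, ..., and the randomization is applied to the overall chi-square statistic rather than to each cell of the table separately.

```python
import random

def tie_table(adj, groups, k):
    """k-by-k table: cell (g, h) counts ties from group g to group h."""
    table = [[0] * k for _ in range(k)]
    n = len(adj)
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                table[groups[i]][groups[j]] += 1
    return table

def expected_table(adj, groups, k):
    """Expected counts if ties were spread evenly over all off-diagonal cells."""
    n = len(adj)
    ties = sum(1 for i in range(n) for j in range(n) if i != j and adj[i][j])
    sizes = [list(groups).count(g) for g in range(k)]
    density = ties / (n * (n - 1))
    # A within-group block has sizes[g] * (sizes[g] - 1) off-diagonal cells.
    return [[density * sizes[g] * (sizes[h] - (g == h)) for h in range(k)]
            for g in range(k)]

def chi_square(obs, exp):
    return sum((o - e) ** 2 / e
               for ro, re in zip(obs, exp) for o, e in zip(ro, re) if e > 0)

def rct_test(adj, groups, n_perm=1000, seed=1):
    """Observed chi-square and proportion of random values as large."""
    k = max(groups) + 1
    rng = random.Random(seed)
    exp = expected_table(adj, groups, k)   # unchanged by shuffling the labels
    observed = chi_square(tie_table(adj, groups, k), exp)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        hits += chi_square(tie_table(adj, shuffled, k), exp) >= observed - 1e-12
    return observed, hits / n_perm
```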

PARAMETERS
Input Dataset
Name of file containing matrix to be analyzed. Data type: Graph

Attribute:
The name of a UCINET dataset that contains a partition of the actors into
groups. To partition the data matrix into groups specify a vector by giving the
dataset name, a dimension (either row or column) and an integer value. For
example, to use the second row of a dataset called ATTRIB, enter "ATTRIB
ROW 2". The program will then read the second row of ATTRIB and use that
information to define the groups. All actors with identical values on the criterion
vector (i.e. the second row of attrib) will be placed in the same group.

No. of Permutations: (Default = 1000)


The number of random permutations required in the test.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

Output Dataset (Default= 'lltab')


Name of output dataset that contains the frequencies in the observed data
corresponding to the partition.

LOG FILE The actor attributes are recoded to run from 1 and these are reported.
This is followed by a table of the cross-classified frequencies, that is, a
contingency table corresponding to the attributes and the input dataset, and by
a table of the expected values of the frequencies assuming that the ties
are independent and randomly distributed throughout the groups.
The observed value in each cell of the first table divided by the corresponding
cell in the second table is then reported. This is followed by the observed
chi-square value, i.e. the sum over cells of the squared difference between the
observed and the expected value divided by the expected value.
The average permutation frequency table gives the mean values of the entries
from all the permutation tests. Each generated entry is compared with the
observed value, and the significance is the relative frequency with which the
generated value is larger than the observed.

TIMING O(N^2)

COMMENTS None

REFERENCES Cliff, A D and Ord, J K 1973 Spatial Autocorrelation. Pion, London.


TOOLS > STATISTICS > AUTOCORRELATION > CATEGORICAL > ANOVA / DENSITY

PURPOSE Perform randomization test of autocorrelation for a categorical variable.

DESCRIPTION Relates a dyadic variable (an actor-by-actor matrix) to a monadic variable (a
vector representing an attribute of each actor). For example, if the dyadic
variable consists of who is friends with whom, and the categorical variable is
gender, the procedure tests whether friendship is patterned by gender (e.g., do
boys prefer boys and girls prefer girls?). The test is based upon the densities
within each block and is similar to performing an analysis of variance. Three
different models which have different patterns of density are possible.
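
The block densities on which the test is based can be computed as in this sketch, which returns the sum, number of cells and mean for each block, the quantities the structural blockmodel option is described as reporting. The function name is hypothetical and diagonals are assumed to be treated as missing.

```python
def block_summary(matrix, groups):
    """Sum, number of cells and mean for each (row group, column group) block."""
    blocks = {}
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            if i != j:                       # diagonal treated as missing
                key = (groups[i], groups[j])
                total, cells = blocks.get(key, (0, 0))
                blocks[key] = (total + matrix[i][j], cells + 1)
    return {b: {"sum": t, "cells": c, "mean": t / c}
            for b, (t, c) in blocks.items()}
```

With an attribute vector such as (1, 1, 2), the block (1, 1) holds ties among the first two actors, while (1, 2) and (2, 1) hold ties between them and the third actor.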

PARAMETERS
Network or Proximity Matrix
Name of file containing matrix to be analyzed. Data type: Matrix.

Actor Attribute:
Name of file containing actor attributes, given as a vector of shared attributes so
that (1,2,3,1,2,2) means that actors 1 and 4 share the same attribute, actors 2, 5
and 6 share the same attribute, and actor 3 has a different attribute from all the others.

Model (Default = Structural Blockmodel)


Choices are:

Constant Homophily. Tests hypothesis that actors prefer to interact with
members of their own kind (as defined by the actor attribute), and assumes that
all groups have equal inbreeding tendencies.

Variable Homophily. Similar to the constant homophily model, except that it
assumes that each group or class of actors has a different homophilic tendency
(different inbreeding parameter).

Structural Blockmodel. Most general model. Just asks whether the different
classes have significantly different interaction patterns. For example, girls might
prefer girls (inbreeding), while boys also prefer girls (outbreeding).

Number of random perms: (Default=1000)


Number of autocorrelations to compute between the data matrix and the
randomly permuted structure matrix. The larger the number of permutations, the
better the estimates of standard error and "significance", but the longer the
computation time.

Treat diagonals as valid? (Default = No)


If Yes, the values along the main diagonals of each matrix are included in the
computations. Otherwise, they are treated as missing.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

Output dataset (Default= 'AUTOSIM')


LOG FILE The actor attributes are recoded so they run from 1 to n, and these are reported.
The between group and in-group means are reported if either of the homophily
models was chosen. For constant homophily the in-group mean is the overall
mean of all within group interactions. For variable homophily each separate
within group mean is reported. For the structural blockmodels option the total
sum, the average value and the number of cells within each block are reported. In
all cases this is followed by the value of the autocorrelation together with the r-
squared value, the root mean square and the sum of squares. Below this is the
autocorrelation averaged over all the permutations together with the standard
error. Finally the proportion of random values which are as large as the actual
autocorrelation is reported. This gives the significance of the calculated value, so
for example if this were below 0.05 we would conclude at the 5% level that the
dyadic variable is related to the categorical attribute.

TIMING O(N^2)

COMMENTS None

REFERENCES None
TOOLS>STATISTICS>AUTOCORRELATION>INTERVAL/RATIO

PURPOSE Perform a randomization test of autocorrelation with an interval or ratio level
attribute variable.

DESCRIPTION Relates a dyadic variable (an actor-by-actor matrix) to a monadic variable (a
vector representing an interval-scaled attribute of each actor). For example, if the
dyadic variable is who is friends with whom, and the monadic variable is height,
the procedure tests whether friendship is patterned by height (e.g., children
prefer to be friends with children who are the same height as themselves).

PARAMETERS
Network or Proximity Matrix
Name of file containing matrix to be analyzed. Data type: Matrix.

Actor Attribute(s)
Name of file containing actor attributes.

Model (Default = Geary)


Choices are:

Geary. Geary's C statistic (values below 1 indicate positive autocorrelation;
the smaller the value, the stronger the positive autocorrelation).

Moran. Moran's I statistic (larger positive values indicate greater positive
autocorrelation).

Number of random perms: (Default=1000)


Number of autocorrelations to compute between the data matrix and the
randomly permuted structure matrix. The larger the number of permutations, the
better the estimates of standard error and "significance", but the longer the
computation time.

Treat diagonals as valid? (Default = No)


If Yes, the values along the main diagonals of each matrix are included in the
computations. Otherwise, they are treated as missing.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

Output dataset (Default= 'AUTOSIM')
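
For orientation, the two statistics can be written out using their textbook definitions, with the network matrix w acting as the weights and x as the attribute vector. This is a sketch only: the names are hypothetical, diagonals are treated as missing, and the program's exact scaling may differ from these standard forms.

```python
import random

def moran_i(w, x):
    """Moran's I: (n / W) * sum w_ij (x_i - m)(x_j - m) / sum (x_i - m)^2."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(w[i][j] for i, j in pairs)
    cross = sum(w[i][j] * dev[i] * dev[j] for i, j in pairs)
    return (n / w_sum) * cross / sum(d * d for d in dev)

def geary_c(w, x):
    """Geary's C: ((n - 1) / 2W) * sum w_ij (x_i - x_j)^2 / sum (x_i - m)^2."""
    n = len(x)
    mean = sum(x) / n
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(w[i][j] for i, j in pairs)
    sq = sum(w[i][j] * (x[i] - x[j]) ** 2 for i, j in pairs)
    return ((n - 1) / (2 * w_sum)) * sq / sum((v - mean) ** 2 for v in x)

def permutation_proportions(stat, w, x, n_perm=1000, seed=1):
    """Observed statistic and proportions of random values as large / as small."""
    rng = random.Random(seed)
    observed = stat(w, x)
    vals = list(x)
    large = small = 0
    for _ in range(n_perm):
        rng.shuffle(vals)                  # reassign attribute values at random
        s = stat(w, vals)
        large += s >= observed - 1e-12
        small += s <= observed + 1e-12
    return observed, large / n_perm, small / n_perm
```

For a four-actor chain with attribute values 1, 2, 3, 4, neighbours have similar values, and these definitions give a Moran's I of 1/3 and a Geary's C of 0.3.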

LOG FILE The value of the autocorrelation followed by the autocorrelation averaged over
all the permutations together with the standard error. The proportion of random
values which are as large for Geary or small for Moran as the actual
autocorrelation gives the significance of the calculated value and this is reported.

TIMING O(N^2)

COMMENTS None

REFERENCES Cliff, A D and Ord, J K 1973 Spatial Autocorrelation. Pion, London.
TOOLS > STATISTICS > VECTOR > REGRESSION

PURPOSE Regress a dependent vector on one or more independent vectors, and assess
significance of the r-square and regression coefficients.

DESCRIPTION The procedure is principally used to model a vector using values of other vectors.

The algorithm proceeds in two steps. In the first step, it performs a standard
multiple regression across corresponding cells of the dependent and independent
vectors.

In the second step, it randomly permutes the elements of the dependent
vector and recomputes the regression, storing resultant values of r-square and all
coefficients. This step is repeated hundreds of times in order to estimate standard
errors for the statistics of interest. For each coefficient, the program counts the
proportion of random permutations that yielded a coefficient as extreme as the
one computed in step 1.

PARAMETERS
Dependent dataset:
Name of dataset containing the observed or dependent data: the vector whose
values are to be predicted. This is given as a column in a matrix. Data type:
Matrix.

Dependent column #: (Default=1)


Specifies which column of the data matrix contains the dependent vector.

Independent dataset:
Name of dataset containing the independent vectors. All independent vectors
must be contained in a single matrix. Data type: Matrix.

Independent column #s: (Default=1)


Specifies which columns of the independent dataset contain the independent
vectors. Columns to be selected are specified by a list. Each column number is
listed separated by a comma or space. The keywords TO, FIRST and LAST are
permissible. Hence FIRST 3, 5 TO 7, 10, 12 would give column numbers 1, 2, 3,
5, 6, 7, 10 and 12. ALL gives all possible columns. Lists kept in a UCINET
dataset can be used. Enter the filename followed by ROW (or COLUMN) and a
number to specify which row or column of the file to use. The list must be
specified using a binary vector where a 1 in position k indicates that column k is a
member of the list, and a zero indicates that k is not a member.
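
The list syntax can be illustrated with a small parser. This is a sketch, not the program's parser: the function names are hypothetical, and reading "LAST k" as the last k columns is an assumption made by analogy with "FIRST 3" in the example above.

```python
import re

def parse_column_list(spec, n_cols):
    """Expand a list such as 'FIRST 3, 5 TO 7, 10, 12' into column numbers."""
    tokens = [t for t in re.split(r"[,\s]+", spec.strip().upper()) if t]
    cols, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "ALL":
            cols.extend(range(1, n_cols + 1))
            i += 1
        elif tok == "FIRST":
            cols.extend(range(1, int(tokens[i + 1]) + 1))
            i += 2
        elif tok == "LAST":                  # assumed to mean the last k columns
            k = int(tokens[i + 1])
            cols.extend(range(n_cols - k + 1, n_cols + 1))
            i += 2
        elif i + 2 < len(tokens) and tokens[i + 1] == "TO":
            cols.extend(range(int(tok), int(tokens[i + 2]) + 1))
            i += 3
        else:
            cols.append(int(tok))
            i += 1
    return cols

def columns_from_binary(vector):
    """Column numbers k for which position k of a binary vector holds a 1."""
    return [k for k, flag in enumerate(vector, start=1) if flag == 1]
```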

Number of random permutations: (Default = 1000)


Number of regressions to compute between the original data and the randomly
permuted data. The larger the number of permutations, the better the estimates of
standard error and "significance", but the longer the computation time.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

(Output) Regression Coefficients: (Default='Coefs')


Name of file containing the regression coefficients.

(Output) Correlation Matrix:(Default= 'RegCorr')


Name of file containing the correlation matrix.

(Output) Inverse of correlation Matrix (Default='RegInv')


Name of file containing the inverse of the correlation matrix.

(Output) Predicted values and residuals. (Default='PredVals')


Name of file containing the predicted values and residuals.

LOG FILE The correlation matrix followed by information on the model fit. This is followed
by a table of regression coefficients. This table gives the unstandardized and
standardized regression coefficients for each independent variable, including the
intercept, along with the proportion of random trials yielding a coefficient i) as
large or larger, ii) as small or smaller and iii) as extreme as the observed value.
These values give the significance of the coefficients.

TIMING O(N^2).

COMMENTS The program ignores missing values.

REFERENCES None.
TOOLS > STATISTICS > VECTOR > ANOVA

PURPOSE Performs an ANOVA with a significance based upon a permutation test.

DESCRIPTION Undertakes a standard analysis of variance but uses a permutation test to generate
the significance level so that standard assumptions on independence and random
sampling are not required.
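
A minimal sketch of the idea: compute the usual one-way F statistic, then shuffle the dependent values across actors and count how often the shuffled F is as large as the observed one. The function names are hypothetical, and the sketch assumes every group has at least two members with some within-group variation.

```python
import random

def f_statistic(y, groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    labels = sorted(set(groups))
    grand = sum(y) / len(y)
    ss_between = ss_within = 0.0
    for g in labels:
        vals = [v for v, lab in zip(y, groups) if lab == g]
        mean = sum(vals) / len(vals)
        ss_between += len(vals) * (mean - grand) ** 2
        ss_within += sum((v - mean) ** 2 for v in vals)
    df_between, df_within = len(labels) - 1, len(y) - len(labels)
    return (ss_between / df_between) / (ss_within / df_within)

def permutation_anova(y, groups, n_perm=5000, seed=1):
    """Observed F and the proportion of permuted F values as large."""
    rng = random.Random(seed)
    observed = f_statistic(y, groups)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # break any link between y and groups
        hits += f_statistic(shuffled, groups) >= observed - 1e-12
    return observed, hits / n_perm
```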

PARAMETERS

Dependent (Y) variable:


Name of file containing the dependent vector, this must be a UCINET data file.
Enter the filename followed by ROW (or COL) and a number to specify which
row or column of the file to use.

Independent (X) variable:


Name of file containing the independent vector, this must be a UCINET data file.
Enter the filename followed by ROW (or COL) and a number to specify which
row or column of the file to use.

Number of random permutations: (Default = 5000)


The larger the number of permutations, the better the estimates of standard error
and "significance", but the longer the computation time.

Random number seed:


The random number seed sets off the random permutations. UCINET generates a
different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

LOG FILE A standard analysis of variance table together with the significance value derived
from the permutation test.

TIMING N/A

COMMENTS None

REFERENCES None
TOOLS > STATISTICS > VECTOR > T TEST

PURPOSE Performs a t-test with a significance based upon a permutation test.

DESCRIPTION Undertakes a standard t-test to compare the means of two groups but uses a
permutation test to generate the significance level so that standard assumptions
on independence and random sampling are not required.
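
The permutation logic can be sketched as below. As a simplification, the sketch uses the difference in group means as the test statistic rather than the t statistic itself; the names are hypothetical and the independent vector is assumed to take exactly two values.

```python
import random

def mean_difference(y, groups, first):
    """Mean of the group labelled `first` minus the mean of the other group."""
    a = [v for v, g in zip(y, groups) if g == first]
    b = [v for v, g in zip(y, groups) if g != first]
    return sum(a) / len(a) - sum(b) / len(b)

def permutation_t_test(y, groups, n_perm=5000, seed=1):
    """Observed difference, two one-tailed proportions and a two-tailed one."""
    rng = random.Random(seed)
    first = groups[0]
    observed = mean_difference(y, groups, first)
    shuffled = list(y)
    greater = less = extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        d = mean_difference(shuffled, groups, first)
        greater += d >= observed - 1e-12
        less += d <= observed + 1e-12
        extreme += abs(d) >= abs(observed) - 1e-12
    return observed, greater / n_perm, less / n_perm, extreme / n_perm
```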

PARAMETERS

Dependent (Y) variable:


Name of file containing the dependent vector, this must be a UCINET data file.
Enter the filename followed by ROW (or COL) and a number to specify which
row or column of the file to use.

Independent (X) variable:


Name of file containing the independent vector, this must be a UCINET data file.
Enter the filename followed by ROW (or COL) and a number to specify which
row or column of the file to use.

Number of random permutations: (Default = 5000)


The larger the number of permutations, the better the estimates of standard error
and "significance", but the longer the computation time.

Random number seed:


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

LOG FILE Gives standard statistics on each group followed by significance tests. The
difference in means is reported together with the two one tailed tests assessing
whether one mean is greater than the other and the two tailed test.

TIMING N/A

COMMENTS None

REFERENCES None
TOOLS > STATISTICS >COMPARE DENSITIES>PAIRED
PURPOSE Give a statistical test for the comparison of the densities of two networks in
which the actors are paired.

DESCRIPTION This routine uses a bootstrap technique to compare the densities of two not
necessarily independent networks with the same actors. This method is
analogous to the classical paired sample t-test for estimating the standard error
of the difference. Its main use would be in comparing the same relation on the
same set of actors at two different time points.

PARAMETERS
1st Network
Name of UCINET dataset containing one of the networks to be compared. Data
type: Valued graph.

2nd Network
Name of UCINET dataset containing the same actors (in the same order) as the
1st dataset. Data type: Valued graph.

Number of Samples
Gives the number of times sampling with replacement is used to construct the
distribution.

LOG FILE The output gives the density of both matrices together with the difference and
the number of samples taken. This is followed by a classical t-test. The
estimated bootstrap standard errors are then reported together with the bootstrap
standard error of the differences, the bootstrap 95% confidence intervals and the
bootstrap t-statistic assuming independent samples. The bootstrap standard error,
confidence interval, t-statistic and average value are then reported for the paired
samples. Finally, the proportions of bootstrap differences that are as large in
absolute value as, as large as, and as small as the observed difference are given.

TIMING

COMMENTS

REFERENCES Tom A.B. Snijders and Stephen P. Borgatti (1999) Non-Parametric Standard
Errors and Tests for Network Statistics. Connections 22(2): 1-11
TOOLS > STATISTICS >COMPARE>DENSITIES>THEORETICAL PARAMETER

PURPOSE Give a statistical test for the comparison of the density of a network to a
theoretical value.

DESCRIPTION This routine uses a bootstrap technique to compare the density of a network to a
specified value. In essence a distribution is built up by sampling the network
with replacement from the vertices. There is an assumption that vertices are
interchangeable.
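
A simplified sketch of a vertex bootstrap of this kind is given below. It is an illustration under stated assumptions, not the published procedure in full: the names are hypothetical, and cells where the same vertex has been drawn twice are simply skipped.

```python
import random

def density(m, nodes):
    """Mean tie value over ordered pairs of distinct sampled vertices."""
    vals = [m[a][b] for i, a in enumerate(nodes) for j, b in enumerate(nodes)
            if i != j and a != b]
    return sum(vals) / len(vals)

def bootstrap_density_test(m, expected, n_samples=5000, seed=1):
    """Observed density, bootstrap mean and standard error, and z-score."""
    rng = random.Random(seed)
    n = len(m)
    observed = density(m, list(range(n)))
    boot = []
    while len(boot) < n_samples:
        nodes = [rng.randrange(n) for _ in range(n)]   # vertices with replacement
        if len(set(nodes)) > 1:            # need at least two distinct vertices
            boot.append(density(m, nodes))
    mean = sum(boot) / n_samples
    se = (sum((d - mean) ** 2 for d in boot) / (n_samples - 1)) ** 0.5
    z = (observed - expected) / se if se > 0 else float("nan")
    return observed, mean, se, z
```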

PARAMETERS
1st Network
Name of UCINET dataset containing the network to be compared. Data type:
Valued graph.

Expected Density
Value of the theoretical parameter to which the observed value will be compared.

Number of Samples
Gives the number of times sampling with replacement is used to construct the
distribution.

LOG FILE The output gives the parameter value and the density of the matrix together with
the difference and the number of samples taken. This is followed by the actual
variance and the classical estimate of the standard error. The number of samples
in the bootstrap are then reported together with the estimated bootstrap standard
error, z-score and average density. Finally, the proportions of bootstrap
differences that are as large in absolute value as, as large as, and as small as
the observed difference are given.

TIMING

COMMENTS

REFERENCES Tom A.B. Snijders and Stephen P. Borgatti (1999) Non-Parametric Standard
Errors and Tests for Network Statistics. Connections 22(2): 1-11
TOOLS>STATISTICS>COMPARE AGGREGATE PROXIMITY MATRICES>PARTITION

PURPOSE Use a permutation test to compare proximity matrices aggregated from a
cognitive social structure into two mutually exclusive groups.

DESCRIPTION To compare aggregated proximity matrices from a partition of the respondents
into two mutually exclusive groups (e.g. male and female), we begin by correlating
the two matrices (or computing a dissimilarity measure). This is our observed
test statistic. Then we go back to the individual level data and divide the
respondents into two groups at random. We then aggregate the matrices
separately for each group, obtaining an aggregate proximity matrix for each
group. Next, we correlate these matrices (or compute dissimilarity measure) and
store the result. This process is repeated thousands of times to generate a
distribution of (dis)similarities under the null hypothesis of independence (i.e.,
judged proximities are independent of gender). We then count the proportion of
correlations (or dissimilarity measures) that are as small (or as large) as the
observed measure. The proportion of correlations as small as the observed (or,
equally, the proportion of dissimilarity coefficients as large as the observed)
gives the p-value: the likelihood that the difference we see could be obtained by
chance. Note that the aggregation is simply the mean of the matrices.
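
The procedure can be sketched as follows, using the Euclidean distance between the two aggregate matrices as the dissimilarity measure (the correlation variant is analogous, counting values as small instead of as large). The names are hypothetical, diagonals are excluded, and the labels are assumed to take exactly two values.

```python
import math
import random

def mean_matrix(mats):
    """Cell-by-cell mean of a list of equally sized matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
            for i in range(n)]

def euclidean(a, b):
    n = len(a)
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2
                         for i in range(n) for j in range(n) if i != j))

def group_distance(mats, labels):
    """Distance between the aggregate matrices of the two groups."""
    first = labels[0]
    g1 = [m for m, lab in zip(mats, labels) if lab == first]
    g2 = [m for m, lab in zip(mats, labels) if lab != first]
    return euclidean(mean_matrix(g1), mean_matrix(g2))

def aggregate_proximity_test(mats, labels, n_perm=2000, seed=1):
    """Observed distance and proportion of random splits with one as large."""
    rng = random.Random(seed)
    observed = group_distance(mats, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # random split with the same group sizes
        hits += group_distance(mats, shuffled) >= observed - 1e-12
    return observed, hits / n_perm
```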

PARAMETERS
Input Dataset
Name of dataset containing the cognitive social structure.
Data type: Valued graph, multirelational.

Utilize diagonal values (Default=No)


If YES, diagonal values are included.

Data are symmetric (Default = No)

Partition Vector
The name of a UCINET dataset. To partition the matrices of the data
matrix into groups, specify a blocking vector by giving the dataset name, a
dimension and an integer value. For example, to use the second row of a dataset
called ATTRIB, enter "ATTRIB ROW 2". The program will then read the second
row of ATTRIB and use that information to sort the matrices. All matrices with
identical values on the criterion vector (i.e. the second row of attrib) will be
placed in the same group. There should only be two groups and so the vector
should only contain two different values. The partition can also be typed in
directly so that 1 1 2 1 2 2 2 places matrices 1,2 and 4 in one group and matrices
3,5,6 and 7 in the other group.

No. of permutations (Default =2000)


Number of Permutations used in the permutation test.

Output Dataset (Default = 'agprox')


Name of file that will contain the mean of the matrices corresponding to each
group. Two files will be produced one for each group and they will be called
agprox1 and agprox2. These are not displayed in the logfile.

LOG FILE A listing of the partitions used in the aggregation procedure,
followed by the sizes of the two groups, the number of
observations and the number of permutations used in the test.
The observed correlation and Euclidean distance are the values
calculated between the two aggregated matrices. This is
followed by the average correlation and Euclidean distance over
all the random permutations. Finally the proportions of
permutations in which the correlation and Euclidean distance were
as high or higher and as low or lower are given. These values
are used to determine the significance of the observed values.

TIMING O(N^2)

COMMENTS None

REFERENCES Borgatti, S.P. () A Statistical Method for Comparing Aggregate Data Across a
Priori Groups
TOOLS>STATISTICS>COMPARE AGGREGATE PROXIMITY MATRICES>OVERLAPPING GROUPS

PURPOSE Use a permutation test to compare proximity matrices aggregated from a
cognitive social structure into two groups which may overlap.

DESCRIPTION To compare aggregated proximity matrices from a partition of the respondents
into two possibly overlapping groups (e.g. smokers and drinkers), we begin by
correlating the two matrices (or computing a dissimilarity measure). This is our
observed test statistic. Then we go back to the individual level data and divide
the respondents into two groups at random. We then aggregate the matrices
separately for each group, obtaining an aggregate proximity matrix for each
group. Next, we correlate these matrices (or compute dissimilarity measure) and
store the result. This process is repeated thousands of times to generate a
distribution of (dis)similarities under the null hypothesis of independence (i.e.,
judged proximities are independent of group membership). We then count the proportion of
correlations (or dissimilarity measures) that are as small (or as large) as the
observed measure. The proportion of correlations as small as the observed (or,
equally, the proportion of dissimilarity coefficients as large as the observed)
gives the p-value: the likelihood that the difference we see could be obtained by
chance. Note that the aggregation is simply the mean of the matrices.
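
The steps above can be sketched in Python. This is an illustration only, not UCINET's implementation. The names mean_matrix, pearson, group_corr and perm_test are hypothetical, diagonal cells are included, and the random regrouping is done by shuffling the rows of the group indicator matrix, which is one plausible reading of "divide the respondents into two groups at random" (it keeps the group sizes and the amount of overlap fixed).

```python
import random

def mean_matrix(mats):
    """Cell-wise mean of a list of equally sized square matrices."""
    n = len(mats[0])
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
            for i in range(n)]

def pearson(a, b):
    """Pearson correlation between the cells of two matrices.
    Assumes neither matrix is constant (otherwise the variance is zero)."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def group_corr(mats, indicator):
    """Correlate the aggregate (mean) matrices of group 1 and group 2.
    indicator[i] is a pair of 0/1 flags for respondent i."""
    g1 = [m for m, row in zip(mats, indicator) if row[0] == 1]
    g2 = [m for m, row in zip(mats, indicator) if row[1] == 1]
    return pearson(mean_matrix(g1), mean_matrix(g2))

def perm_test(mats, indicator, n_perm=2000, seed=0):
    """Return the observed correlation and the proportion of random
    regroupings whose correlation is as small as the observed one."""
    rng = random.Random(seed)
    observed = group_corr(mats, indicator)
    rows = [list(r) for r in indicator]
    as_small = 0
    for _ in range(n_perm):
        rng.shuffle(rows)              # reassign memberships at random
        if group_corr(mats, rows) <= observed:
            as_small += 1
    return observed, as_small / n_perm
```

Here mats holds one proximity matrix per respondent and indicator is the two-column incidence matrix described under PARAMETERS. A small proportion suggests that the two aggregate matrices are less alike than random regrouping would produce.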

PARAMETERS Input Dataset


Name of dataset containing the cognitive social structure.
Data type: Valued graph, multirelational.

Utilize diagonal values (Default=No)


If YES diagonal values are included

Data are symmetric (Default = No)

Group Indicator Matrix


The name of a UCINET dataset. This dataset must contain a row for each
actor and two columns representing the two groups. The (i,j)th entry is a 1 if
actor i is in group j (j= 1 or 2) and zero otherwise. The matrix is simply a
standard incidence matrix with two columns.

No. of permutations (Default =2000)


Number of Permutations used in the permutation test.

Output Dataset (Default = 'agprox')


Name of file that will contain the mean of the matrices corresponding to each
group. Two files will be produced one for each group and they will be called
agprox1 and agprox2. These are not displayed in the logfile.

LOG FILE A listing of the partitions used in the aggregation procedure,


followed by the sizes of the two groups, the number of
observations and the number of permutations used in the test.
The observed correlation and Euclidean distance are the values
calculated between the two aggregated matrices. This is
followed by the average correlation and Euclidean distance over
all the random permutations. Finally, the proportions of
permutations in which the correlation and Euclidean distance
were as high or higher, and as low or lower, than the observed
values are given as probabilities. These values are used to
determine the significance of the observed values.

TIMING O(N^2)

COMMENTS None

REFERENCES Borgatti, S.P. () A Statistical Method for Comparing Aggregate Data Across a

Priori Groups
TOOLS > STATISTICS > P1

PURPOSE Fits the Holland and Leinhardt P1 model for binary networks.

DESCRIPTION All dyads (i,j) in a sociometric choice matrix X can be classified as mutual (xij =
xji = 1), asymmetric (xij not equal to xji), or null (xij = xji = 0). The probabilities of
each type of dyad are modelled as a function of three sets of substantive
parameters: expansiveness of each actor, popularity of each actor, and
reciprocity. The probabilities of mutual, asymmetric and null dyads, denoted mij,
aij, and nij respectively, are modeled as follows:

mij = lij exp(r + 2q + ai + aj + bi + bj)
aij = lij exp(q + ai + bj)
nij = lij

In the equations, the a parameters are interpreted as "productivity" or


"expansiveness" measures for each node. The b parameters are interpreted as
"attractiveness" or "popularity" measures. The r parameter is interpreted as a
general measure of the tendency towards "reciprocity" or "mutuality" in the
network. The q parameter is a function of the density of the network, reflecting
the total number of arcs observed. Finally, the l parameters are normalizing
constants used to ensure that the modeled probabilities add to 1 for any given
dyad.
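
As a worked illustration of the role of the normalizing constant, the sketch below computes the four dyad probabilities for one pair of actors. It is not the fitting routine used by UCINET. The function name p1_dyad_probs and the parameter values are invented for the example, and it relies on the fact that a dyad has two asymmetric states (i chooses j only, or j chooses i only), so that lij is one over the sum of the four unnormalized terms.

```python
import math

def p1_dyad_probs(r, q, a_i, a_j, b_i, b_j):
    """Return (m_ij, a_ij, a_ji, n_ij) for one dyad under the P1 model.

    r = reciprocity, q = density parameter,
    a_* = expansiveness, b_* = popularity of actors i and j.
    """
    w_mutual = math.exp(r + 2 * q + a_i + a_j + b_i + b_j)
    w_ij = math.exp(q + a_i + b_j)      # i chooses j, j does not choose i
    w_ji = math.exp(q + a_j + b_i)      # j chooses i, i does not choose j
    lam = 1.0 / (1.0 + w_mutual + w_ij + w_ji)   # normalizing constant l_ij
    return lam * w_mutual, lam * w_ij, lam * w_ji, lam

# Hypothetical parameter values, chosen only for illustration.
m, a_ij, a_ji, n = p1_dyad_probs(r=1.0, q=-1.5, a_i=0.3, a_j=-0.3,
                                 b_i=0.2, b_j=-0.2)
print(round(m + a_ij + a_ji + n, 12))   # the four probabilities sum to 1
```

With all parameters set to zero the four outcomes are equally likely, each with probability 0.25.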

PARAMETERS
Input Dataset:
Name of file that contains network to be analyzed. Data type: Valued graph.

(Output) Parameter dataset (Default = 'Alphabet')


Name of file to contain alpha and beta parameters.

(Output) Expected values (Default = 'P1Expect')


Name of file to contain P1 expected values.

Output residual values (Default= 'P1Resid')


Name of file to contain P1 residuals.

LOG FILE G-squared negative goodness-of-fit value with degrees of freedom. Probabilities
are not printed because the theoretical distribution governing these values has not
yet been established.

Values of q and r .

Expansiveness (a) and popularity (b) parameters for each actor.

An nxn matrix containing the P1 expected value between each pair of actors.

An nxn matrix of residuals (observed data minus expected) between each pair of
actors.

A single-link hierarchical clustering of symmetrized residuals.

TIMING O(N^4).
COMMENTS The model would be more useful if the distribution of G-squared were known: as
it is, we cannot say for certain when the model fits and when it does not.

REFERENCES Holland P and Leinhardt S (1981). "An Exponential Family of Probability


Distributions for Directed Graphs." Journal of the American Statistical
Association 76:33-6
TOOLS > MATRIX ALGEBRA

PURPOSE Command-driven matrix algebra package.

DESCRIPTION Input and output are UCINET datasets. Capabilities are divided into functions
and procedures, which have different syntax. Further, within functions we can
distinguish three basic types:

Unary Operations. Those that operate on a single dataset and take no other arguments
(e.g. ABS, which takes the absolute value of every cell in the matrix);

Binary Operations. Those that perform algebraic and arithmetic operations on
two or more datasets (e.g. ADD, which adds corresponding cells of two
or more matrices);

Inner Products. Those that perform arithmetic operations on various dimensions


(i.e. rows, columns, matrices) of a single dataset (e.g. TOTAL, which sums
values of a matrix broken out by row, column, level or combinations of these).

When you choose Algebra from the menu, a command window will open
up. You can close the window by clicking on the close button. Commands are
typed in the command window; you can scroll back to previous commands by
using the up and down arrows.

The difference in the two kinds of commands is reflected in their syntax.

1. Functions

Functions have this basic syntax:

<output matrix> = <function>(<arguments>)

In the documentation to follow, an item enclosed in angle brackets denotes a


name or other input to be provided by the user. Hence, <output matrix> refers to
the name of a dataset to be supplied by the user. Items enclosed in square
brackets will denote optional arguments. Anything else, such as an equal sign or
parenthesis, is something to be typed verbatim.

An example of valid syntax for a function is this:

y = inverse(x)

In the example, x is a pre-existing dataset in the current folder, inverse is the


name of a function, and y is the name of a yet-to-be-created dataset to contain the
inverse of the matrix in x. Datasets may be named using their full pathnames, as
in:

a:tdavis = transpose(c:\ucinet\data\davis)

Most functions will have a single argument consisting of the name of an input
matrix. Others will have two or more arguments, again consisting of the names
of datasets. For instance, the syntax for the ADD command is as follows:

<matrix> = add(<matrix1>,<matrix2>,...)
An example would be:

mpx = add(business,marriage,friend)

A few functions take other kinds of arguments. For example, to generate an


identity matrix with 5 rows and columns, you would type:

junk = identity(5)

2. Procedures

The syntax for procedures differs from functions in that there is no output matrix:

<procedure> <arguments>

An example is:

display padgett

Another example is:

svd davis = u d v

This requests a singular value decomposition of the matrix davis into three
matrices (datasets) to be called u, d, and v.

3. Expressions

One useful fact to remember is that whenever the syntax for a function or
procedure calls for the name of a matrix, a function may be substituted instead.
For example, the command

y = inverse(transpose(inf))

requests that the inverse of the transpose of a matrix inf be calculated and saved
as dataset y. There is no limit to the amount of nesting. For example, the
following command is perfectly valid, though neither efficient nor very readable:

b = prod(inv(prod(transp(x),x)),prod(transp(x),y))

A less error-prone alternative would be the following series:

xt = transp(x)
xtx = prod(xt,x)
xty = prod(xt,y)
b = prod(inv(xtx),xty)
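
For readers who want to check the arithmetic outside UCINET, the same least-squares calculation can be mirrored in plain Python. This is a sketch under stated assumptions: transp, prod and inv below are small stand-in functions written for this example (they are not UCINET code), and the data are invented so that y = 1 + 2x exactly.

```python
def transp(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def prod(a, b):
    """Ordinary matrix product a times b."""
    bt = transp(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inv(m):
    """Gauss-Jordan inverse of a square non-singular matrix."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        p = max(range(c, n), key=lambda k: abs(aug[k][c]))  # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for k in range(n):
            if k != c:
                f = aug[k][c]
                aug[k] = [v - f * w for v, w in zip(aug[k], aug[c])]
    return [row[n:] for row in aug]

x = [[1, 0], [1, 1], [1, 2], [1, 3]]   # intercept column plus one predictor
y = [[1], [3], [5], [7]]               # y = 1 + 2x exactly

# Nested form, as in the single-line command.
b_nested = prod(inv(prod(transp(x), x)), prod(transp(x), y))

# Step-by-step form, as in the series of commands.
xt = transp(x)
xtx = prod(xt, x)
xty = prod(xt, y)
b_steps = prod(inv(xtx), xty)
```

Both forms give the same coefficients (an intercept of 1 and a slope of 2, up to rounding); the stepwise form simply makes each intermediate matrix available for inspection.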

FURTHER INFORMATION

Unary Functions

Binary Functions

Inner Products
Procedures
TOOLS > SCATTERPLOT> DRAW

PURPOSE Plots one matrix column against another in the (x,y) plane.

DESCRIPTION Plots two specified columns of a matrix against each other. The x co-ordinates
(horizontal axis) are the elements of the first column and the y co-ordinates
(vertical axis) are the corresponding elements of the second column. Points can
be labeled using ASCII characters.

PARAMETERS
Input dataset:
Name of file containing matrix with data to be plotted. Data type: Matrix.

Column to use for horizontal or x-axis: (Default = 1).


Column number for horizontal axis.

Column to use for vertical or y-axis: (Default = 2).


Column to use for vertical axis.

File containing point labels, if any:


If blank then points are labeled by row number. If used, the file should be ASCII and
contain the labels. The labels must be specified in a list, each separated by a
comma, and the list must contain the same number of labels as there are rows in the data
matrix.

LOG FILE A scatter plot with the tick marks on the axes. Each point on the scatter plot is
marked by the row of the column vectors or a label from the label file. If two
points have the same coordinates then the label corresponding to the highest row
number is used. The scatterplot can be saved or printed. Simple editing can be
achieved using the options button. The labels can be turned on or off and values
can be attached to the points (or removed). The scales can also be changed. More
advanced editing is possible by double clicking in the plot, which invokes the chart
wizard. To find the label attached to a single point when all the labels are moved,
click on a single point, which will highlight all the points, then click a second time
to highlight one vertex. Now double click on the vertex and the label will be
highlighted in the chart designer. The save button and the save chart data option
allow the user to save all the chart data into a file which can be reviewed using
Tools>Scatterplot>Review. The chart itself can be saved as a windows metafile
which can then be read into a word processing or graphics package. Only one
chart can be open at one time and the chart window will be closed if you click on
any other UCINET window.

TIMING Linear

COMMENTS This routine only works if the regional settings are set to UK or USA. If you do
not have these regional settings and do not get a plot then change them in the
settings control panel on your machine.

REFERENCES None.
TOOLS > SCATTERPLOT > REVIEW

PURPOSE Displays previously filed scatter plots.

DESCRIPTION Scatter plots can be saved as files and reviewed directly using this routine. They
are saved with the extension sdf.

PARAMETERS
Input dataset:
Name of scatterplot file to be displayed.

LOG FILE None but scatterplot is displayed.

TIMING N/A

COMMENTS None

REFERENCES None
TOOLS > DENDROGRAM /TREE DIAGRAM> DRAW
PURPOSE Generates a dendrogram or tree diagram from hierarchically nested partition data.

DESCRIPTION This routine allows for the creation of the hierarchical cluster diagrams from a
UCINET generated partition matrix. It is also possible to generate the diagrams
from user defined partition matrices.

PARAMETERS
Input dataset
Name of file containing a partition indicator matrix. A partition indicator matrix
has rows which correspond to different partitions and columns which represent
members of the groups. A value of k in row i and column j means that actor j is
in group k for the partition corresponding to row i. All other actors in the same
group should be assigned the same value in row i. Each successive row must
specify an increasingly finer (or coarser) partition. The row labels (if specified)
correspond to the levels of the partition.

LOG FILE A hierarchical clustering diagram, either a tree diagram or a dendrogram. The plot
re-orders the actors so that they are located close to other actors in similar
clusters. The level at which any pair of actors are aggregated is the point at which
both can be reached by tracing from the start to the actors from right to left. The
scale at the top gives the level at which they are clustered. The diagram can be
printed or saved. Parts of the diagram can be viewed by moving the mouse to the
split point in a tree diagram or the beginning of a line in the dendrogram and
clicking. The first click will highlight a portion of the diagram and the second
click will display just the highlighted portion. To return to the original, right click
on the mouse. There is also a simple zoom facility: simply change the values and
then press enter. If the labels need to be edited (particularly the scale labels) then
you should take the partition indicator matrix into the spreadsheet editor, remove
or reduce the labels, and then submit the edited data.

TIMING Linear

COMMENTS None

REFERENCES None.
TOOLS >DENDROGRAM/TREE DIAGRAM >REVIEW

PURPOSE Displays previously filed cluster diagrams.

DESCRIPTION Dendrograms and tree diagrams can be saved as bitmap files and reviewed
directly using this routine. They are saved with the extension bmp.

PARAMETERS
Input bitmap filename:
Name of file to be displayed.

LOG FILE None

TIMING N/A

COMMENTS None

REFERENCES None
UNARY OPERATIONS
ABSOLUTE - Syntax: abs(<mat>).
Takes the absolute value of every value in <mat>. May be abbreviated to "ABS". Example:

junk = abs(a:\atlanta\corrmat)

ARCTAN - Syntax: arc(<mat>). Takes the arctangent of each value in <mat>. Example:

junk = arc(a:\atlanta\corrmat)

COMMON LOG - Syntax: log10(<mat>). Takes the base 10 logarithm of each value of the
argument. Example:

junk = log10(a:\atlanta\corrmat)

COSINE - Syntax: cos(<mat>). Takes the cosine of each value in <mat>. Example:

junk = cos(a:\atlanta\corrmat)

EXPONENT - Syntax: exp(<mat>). Raises e (the base of natural logarithms) to the power
given by each cell of the argument. Example:

junk = exp(a:\atlanta\corrmat)

FILL - Syntax: fill(<mat>,<nr>,<nc>). Expands the matrix in <mat> to the dimensions given
by <nr> and <nc> by duplicating values. For example, given matrix X, the command

1 2 3
X= 4 5 6
7 8 9

y = fill(x,5,6)

yields:
1 2 3 1 2 3
4 5 6 4 5 6
Y= 7 8 9 7 8 9
1 2 3 1 2 3
4 5 6 4 5 6
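
The duplication rule in the example can be reproduced with a short Python sketch. The Python function fill below is a hypothetical stand-in written for this illustration; it assumes, as the example suggests, that values are repeated cyclically across both rows and columns.

```python
def fill(mat, nr, nc):
    """Expand mat to nr rows and nc columns by cycling through its values."""
    r, c = len(mat), len(mat[0])
    return [[mat[i % r][j % c] for j in range(nc)] for i in range(nr)]

x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
y = fill(x, 5, 6)
for row in y:
    print(*row)
```

Running the sketch prints the same 5-by-6 matrix as shown for Y above.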

GENERALISED INVERSE - Syntax: ginv(<mat>). Given a dataset <mat> containing a
matrix X (with at least as many rows as columns), the function computes a generalized inverse
X^- such that XX^-X = X. Example:

junk = ginv(a:\atlanta\corrmat)

IDENTITY - Syntax: id(<n>). Generates an identity matrix with <n> rows and columns.
Example:

i = id(100)

INVERSE - Syntax: inv(<mat>). Given a dataset <mat> containing a square non-singular


matrix X, the function computes the inverse X^-1 such that XX^-1 = I, where I is the identity
matrix. If the matrix is not square, or is not of full rank, use the generalized inverse ginv instead.
Example:

junk = inv(a:\atlanta\corrmat)

LOG - See NATURAL LOG or COMMON LOG.

LINEAR - Syntax: lin(<mat>,<real1>[,<real2>]). Given a dataset containing a matrix, the
function performs a linear transformation on every cell value. If a cell value is x, the
function forms <real1>*x + <real2>. If <real2> is omitted it is assumed to be zero. Example:

junk = lin(a:\atlanta\corrmat,3.2,4)

creates a new matrix junk which has each cell transformed by multiplying by 3.2 and adding 4.

MATRIX - Syntax: mat(<real>[,<nr>][,<nc>][,<nl>]). Converts a number into a matrix, or
creates a matrix of constants. If <nr>, <nc>, and <nl> are not specified, the function returns a 1-
by-1 matrix containing the value <real>. The parameter <nl> specifies the number of
levels/matrices to create. To specify <nl>, you must specify <nr> and <nc> as well. Examples:

junk = mat(3.92) {creates 1-by-1 matrix}


junk = mat(4,10,10) {creates 10-by-10 matrix containing only 4s}
junk = mat(4,10,10,2) {creates 2 10-by-10 matrices containing only 4s}

This function is useful for adding a constant to a matrix. For example,

junk = add(freqs,mat(0.01,8,10))

adds the constant 0.01 to every cell of the 8-by-10 matrix contained in freqs.

NATURAL LOG - Syntax: log(<mat>) or ln(<mat>). Takes the natural logarithm of each value
of the argument. Examples:

junk = log(a:\atlanta\corrmat)
junk = ln(a:\atlanta\corrmat)

NEGATIVE - Syntax: neg(<mat>). Multiplies each value of <mat> by -1. Example:

revcorr = neg(a:\atlanta\corrmat)

RECIPROCAL - Syntax: rec(<mat>). Takes the reciprocal (1/x) of each value of the argument. Example:

junk = rec(a:\atlanta\corrmat)

ROUND - Syntax: round(<mat>) or rnd(<mat>). Rounds each value of <mat> to the nearest
integer. Example:

junk = rnd(a:\atlanta\corrmat)

SINE - Syntax: sin(<mat>). Computes sine of each value in <mat>. Example:

junk = sin(a:\atlanta\corrmat)

SQUARE - Syntax: sqr(<mat>). Computes square of each value in <mat>. Example:

junk = sqr(a:\atlanta\corrmat)
SQUARE ROOT - Syntax: sqrt(<mat>). Computes square root of each value in <mat>.
Example:

junk = sqrt(a:\atlanta\corrmat)

TRUNCATE - Syntax: trunc(<mat>) or trnc(<mat>). Rounds each value of <mat> down to the
largest whole number contained by the value. Example:

junk = trunc(a:\atlanta\corrmat)

FURTHER INFORMATION

Binary Operations

Inner Products

Procedures

Matrix Algebra
BINARY OPERATIONS
AVERAGE - Syntax: avg(<mat1>,<mat2>,...). Takes the average value of corresponding cells
across two or more matrices. Example:

c = avg(a,b)

BOOLEAN PRODUCT - Syntax: bprod(<mat1>,<mat2>). Boolean multiplication of two


binary matrices. Example:

junk = bprod(business,marriage)

DIVIDE - Syntax: div(<mat1>,<mat2>). Divides each cell of <mat1> by the corresponding
cell of <mat2>. Divisions by zero result in missing values. Example:

junk = div(c:\atlanta\corrmat,mcorr)

EQUAL - Syntax: eq(<mat1>,<mat2>,...). Compares two or more matrices and puts a value of
1 where all matrices have the same value and a 0 where any are different. For example, typing

junk = eq(a,b)

gives a new binary matrix called junk which has 1s in those cells where a and b have the same
value, and has 0s elsewhere.

GREATER THAN - Syntax: gt(<mat1>,<mat2>,...). Compares two or more matrices, creating


a new matrix which is 1 for all cells where the first matrix is strictly larger than all subsequent
matrices, and 0 elsewhere.

c = gt(a,b)

In the example, the matrix c will have 1s only in those cells where a dominates b.

GREATER THAN OR EQUAL TO - Syntax: ge(<mat1>,<mat2>,...). Compares two or more


matrices, creating a new matrix which is 1 for all cells where the first matrix is larger than or
equal to all subsequent matrices, and 0 elsewhere.

c = ge(a,b)

In the example, the matrix c will have 1s only in those cells where a is not dominated by b.

LESS THAN - Syntax: lt(<mat1>,<mat2>,...). Compares two or more matrices, creating a new
matrix which is 1 for all cells where the first matrix is strictly less than all subsequent matrices,
and 0 elsewhere.

c = lt(a,b)

In the example, the matrix c will have 1s only in those cells where a is dominated by b.

LESS THAN OR EQUAL TO - Syntax: le(<mat1>,<mat2>,...). Compares two or more


matrices, creating a new matrix which is 1 for all cells where the first matrix is less than or equal
to all subsequent matrices, and 0 elsewhere.

c = le(a,b)
In the example, the matrix c will have 1s only in those cells where a is smaller than or equal to
the value of b.

MAXIMUM - Syntax: max(<mat1>,<mat2>,...). Takes the largest value of corresponding cells


across two or more matrices.

c = max(a,b)

MINIMUM - Syntax: min(<mat1>,<mat2>,...). Takes the smallest value of corresponding cells


across two or more matrices.

c = min(a,b)

MULTIPLY - Syntax: mul(<mat1>,<mat2>,...). Multiplies the values of corresponding cells
across two or more matrices (element-wise multiplication).

c = mul(a,b)

PRODUCT - Syntax: prod(<mat1>,<mat2>,...). Matrix multiplication of two matrices. This is


NOT element-wise multiplication of corresponding values (see MULTIPLY). Example:

buskin = prod(business,marriage)

In the example, the business matrix is post-multiplied by marriage.
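
The difference between MULTIPLY, PRODUCT and BOOLEAN PRODUCT can be seen on a pair of small binary matrices. The Python functions mul, prod and bprod below are stand-ins written for this illustration, not UCINET code. The sketch assumes that the Boolean product records a 1 wherever the ordinary matrix product is non-zero.

```python
def mul(a, b):
    """Element-wise product of corresponding cells (MULTIPLY)."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def prod(a, b):
    """Ordinary matrix product a times b (PRODUCT)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def bprod(a, b):
    """Boolean product: 1 where the matrix product is non-zero, else 0."""
    return [[1 if v else 0 for v in row] for row in prod(a, b)]

a = [[1, 1], [0, 1]]
b = [[1, 0], [1, 1]]
print(mul(a, b))     # [[1, 0], [0, 1]]
print(prod(a, b))    # [[2, 1], [1, 1]]
print(prod(b, a))    # [[1, 1], [1, 2]]
print(bprod(a, b))   # [[1, 1], [1, 1]]
```

Note that prod(a, b) and prod(b, a) differ, so the order of the arguments matters for PRODUCT but not for MULTIPLY.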

SQUARED DIFFERENCE - Syntax: sqrdif(<mat1>,<mat2>,...). Takes the squared difference


of corresponding cells across two or more matrices.

c = sqrdif(a,b)

One application of this function is to compare a data matrix with a predicted matrix, based on a
least squares criterion.

SUBTRACT - Syntax: sub(<mat1>,<mat2>,...). Subtracts the values of corresponding cells of


two or more matrices from the first matrix mentioned.

c = sub(a,b)

In the example, the values of b are subtracted from the values of a.

FURTHER INFORMATION

Unary Operations

Inner Products

Procedures

Matrix Algebra
INNER PRODUCTS

WAVERAGE - Syntax: wavg(<mat1> [R|C|L] [R|C|L]). Average values of <mat1>,
with optional breakout by one or two dimensions. Examples:

rowmeans = wavg(davis rows)


colmeans = wavg(davis cols)
density = wavg(davis)
avgtie = wavg(newcomb rows cols)

The last example averages all matrices contained in the newcomb dataset to get a single matrix. In
other words, it takes a 3-dimensional table (rows, columns and matrices) and aggregates across
matrices to obtain a table with just rows and columns.

TOTAL - Syntax: tot(<mat1> [R|C|L] [R|C|L]). Adds values of <mat1>, with optional
breakout by one or two dimensions. Examples:

rowsums = tot(davis rows)


colsums = total(davis cols)
nties = tot(davis)
allrels = tot(newcomb rows cols)

The last example totals all matrices contained in the newcomb dataset to get a single matrix. In
other words, it takes a 3-dimensional table (rows, columns and matrices) and aggregates across
matrices to obtain a table with just rows and columns.
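
The idea of a "breakout" can be sketched in Python for a small dataset with two levels. The Python function tot below is a hypothetical stand-in. It assumes that every dimension not named in the breakout is summed over, which is how the examples above read, and it returns the totals in a dictionary keyed by the indices of the dimensions that were kept.

```python
def tot(data, *breakout):
    """Sum a [level][row][col] array, keeping the dimensions named in breakout.

    breakout names start with 'r' (rows), 'c' (cols) or 'l' (levels).
    Keys of the result list the kept indices in row, column, level order.
    """
    nl, nr, nc = len(data), len(data[0]), len(data[0][0])
    keep = {name[0].lower() for name in breakout}
    sums = {}
    for l in range(nl):
        for r in range(nr):
            for c in range(nc):
                key = tuple(i for d, i in (("r", r), ("c", c), ("l", l))
                            if d in keep)
                sums[key] = sums.get(key, 0) + data[l][r][c]
    return sums

# Two 2-by-2 matrices (levels) invented for the example.
data = [[[1, 2], [3, 4]],
        [[10, 20], [30, 40]]]

print(tot(data))                   # {(): 110}
print(tot(data, "rows"))           # {(0,): 33, (1,): 77}
print(tot(data, "rows", "cols"))   # {(0, 0): 11, (0, 1): 22, (1, 0): 33, (1, 1): 44}
```

The last call corresponds to the allrels example: the two matrices are added cell by cell, leaving a table with just rows and columns.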

WMAXIMUM - Syntax: wmax(<mat1> [r|c|l] [r|c|l]). Takes the largest value
within a dataset, optionally broken out by one or more dimensions. Examples:

rowmax = wmax(ron1 rows)


matmax = wmax(krack lev)

WMINIMUM - Syntax: wmin(<mat1> [r|c|l] [r|c|l]). Takes the smallest value
within a dataset, optionally broken out by one or more dimensions. Examples:

rowmin = wmin(ron1 rows)


matmin = wmin(krack lev)

TRANSPOSE - Syntax: transp(<mat> [<dim> <dim>]). Exchanges any two dimensions of a
dataset. If no dimensions are given, rows and columns are assumed. Examples:

tdavis = transp(davis)
cent2 = transp(cent cols levs)

FURTHER INFORMATION

Unary Operations

Binary Operations

Procedures

Matrix Algebra
PROCEDURES
In this section we document each ALGEBRA procedure individually, giving the syntax and a
brief description for each one. The syntax gives the minimum abbreviation and any alternate
spellings. The procedures are arranged in alphabetical order by concept.

CHANGE FOLDER - Syntax: cd <drive:\folder>. Change default folder (and/or drive).
Affects where UCINET will look for data and where data will be saved.

cd\ucinet\data
cd a:

DISPLAY - Syntax: disp <mat> or dsp <mat>. Displays all cells of <mat> to the screen.

dsp c:\ucinet\data\padgett
dsp ginv(transp(davis))

LET - Syntax: let <function call>. Technically, the LET command is always implicit before any
function statement. For example, the following two commands are identical:

xtx = prod(transp(x),x)
let xtx = prod(transp(x),x)

The only reason to use LET is if your output dataset has the same name as an ALGEBRA
procedure, which would confuse the interpreter. For example, the following command would
NOT create a dataset called "DSP":

dsp = inverse(xtx)

Instead, the interpreter would assume that you wanted to display a matrix called "= inverse(xtx)".
However, the following would work:

let dsp = inverse(xtx)

QUIT - Syntax: quit or exit. Leave ALGEBRA and close the matrix algebra windows. Usage:

exit
quit

SINGULAR VALUE DECOMPOSITION - Syntax: svd <amat> = <umat> <dmat> <vtmat>,
where <amat> is an m-by-n data matrix of rank r, <umat> will be an m-by-r output matrix,
<dmat> will be a diagonal r-by-r output matrix, and <vtmat> will be an n-by-r output matrix.
The program requires m ≥ n. Usage:

svd davis = u d vt

The <umat> and <vtmat> matrices are often referred to as "row scores" and "column scores"
respectively. The <dmat> matrix contains singular values down the main diagonal and zeros
elsewhere.

The singular value decomposition of a square, symmetric matrix gives row and column scores
equal to the eigenvectors of the matrix, and the singular values are their eigenvalues. The SVD
of any matrix X gives row scores equal to the eigenvectors of XX' and column scores equal to
the eigenvectors of X'X. The singular values of X are the square roots of the non-zero eigenvalues of both XX'
and X'X.
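
The link between singular values and eigenvalues can be checked by hand on a small matrix. The sketch below does not compute a full decomposition. It forms X'X for a two-column matrix, finds its two eigenvalues in closed form, and takes square roots to obtain the singular values. The helper names gram and eig_sym_2x2 and the matrix itself are chosen only for the example.

```python
import math

def gram(x):
    """Return X'X for a matrix stored as a list of rows."""
    cols = list(zip(*x))
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]

def eig_sym_2x2(m):
    """Eigenvalues of a symmetric 2-by-2 matrix, largest first."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2
    half_gap = math.sqrt(((a - d) / 2) ** 2 + b * b)
    return mean + half_gap, mean - half_gap

x = [[3, 0],
     [4, 5]]
eigenvalues = eig_sym_2x2(gram(x))                   # eigenvalues of X'X
singular_values = [math.sqrt(v) for v in eigenvalues]
print(eigenvalues)
print(singular_values)
```

For this matrix X'X has eigenvalues 45 and 5, so the singular values are the square roots of 45 and 5. As a cross-check, their product equals 15, the absolute value of the determinant of X.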
FURTHER INFORMATION

Unary Operations

Binary Operations

Inner Products

Matrix Algebra
NETWORK > COHESION > DISTANCE
PURPOSE Constructs a distance or generalized distance matrix between all nodes of a
graph. Allows for transformation of this matrix from distance to nearness.

DESCRIPTION The length of a path is the number of edges it contains. The distance between
two nodes is the length of the shortest path. The generalized distance is the
length of an optimum path.

This optimum can be any of the following:


The cost of a path is the sum of all values on the edges of a path. The optimum is
the cheapest cost.

The strength of a path is the strength of its weakest link. The optimum is the
strongest path.

The probability of a path is the product of the probabilities of its edges. The
optimum is the most probable path.

If there is more than one optimum path then the algorithm uses the shortest
optimum path. For a binary adjacency matrix distance and generalized distance
will be equivalent.

The distance matrix can be converted to a nearness matrix by means of a


nearness transformation. This transformation can be achieved by taking
reciprocals, linear transformations, exponentiation or frequency decays.

PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Valued graph.

Type of Data: (Default = ADJACENCY)


Choices are:

Adjacency - standard binary data, distance corresponds to graph theoretic


geodesic.
Strengths - values indicate strengths or capacities of links between nodes. Optimum is
strongest path.
Costs - values indicate costs or lengths of links between nodes. Optimum is the cheapest
cost.
Probabilities - values indicate probability of link and restricted to [0,1].
Optimum is most probable path.

Nearness transformation: (Default = NONE)


Converts distance matrix to a nearness matrix by a variety of methods.
Choices are:

None - no transformation is applied and raw distances are given as output.

Multiplicative - distances between nodes are divided into the largest possible
distance. New values are given by Yij = (N-1)/Dij.

Additive - distances between nodes are subtracted from the total number of
nodes. New values are given by Yij = N - Dij.
Linear - distances between nodes are transformed linearly into [0,1]. New
values are given by Yij = 1 - (Dij - 1)/(N-1).

Exponential - distances between nodes are transformed using exponential decay.
New values are given by Yij = b^Dij. The attenuating factor b is selected by the
user and should satisfy 0 < b < 1.

Freq Decay - Uses Burt's 1976 frequency decay function. The nearness of i and
j is one minus the proportion of actors that are as close to i as j is.

Attenuation Factor: (Default = 0.5)


Value of the attenuation factor b when exponential is chosen. Larger values give
slower decay.
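
The first four nearness transformations are simple enough to state as a short Python sketch. The function name nearness and its method labels are hypothetical. The exponential case reads the formula as b raised to the power Dij. Burt's frequency decay is omitted because it depends on the whole distribution of distances from i, not on a single distance.

```python
def nearness(d, n, method, b=0.5):
    """Transform one finite distance d (d >= 1) in a network of n nodes."""
    if method == "multiplicative":
        return (n - 1) / d              # Yij = (N-1)/Dij
    if method == "additive":
        return n - d                    # Yij = N - Dij
    if method == "linear":
        return 1 - (d - 1) / (n - 1)    # Yij = 1 - (Dij - 1)/(N-1)
    if method == "exponential":
        return b ** d                   # Yij = b to the power Dij, 0 < b < 1
    raise ValueError("unknown method: " + method)

# Nearness of two nodes at distance 2 in a 5-node network.
for method in ("multiplicative", "additive", "linear", "exponential"):
    print(method, nearness(2, 5, method))
```

In each case adjacent nodes (distance 1) receive the largest nearness value, and the value falls as the distance grows.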

Output dataset: (Default = 'GeodesicDistance')


Name of data file containing distance matrix.

LOG FILE Matrix of distances between all pairs of nodes.

TIMING O(N^3)

COMMENTS Note the distances correspond to the number of links and not the optimum
values.
Optimum values are calculated by
NETWORK>COHESION>REACHABILITY

REFERENCES Doreian P (1974). 'On the connectivity of social networks'. Journal of


Mathematical Sociology, 3, 245-258.

Burt R (1976). 'Positions in networks'. Social Forces, 55, 93-122.


NETWORK>COHESION>NO. OF GEODESICS

PURPOSE Counts the number of geodesics connecting all pairs of vertices.

DESCRIPTION A geodesic is a shortest path. There may be more than one shortest path
connecting any two vertices. This procedure gives the number of shortest paths
connecting all pairs of vertices.
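
One common way to count geodesics is a breadth-first search from each vertex that accumulates path counts along the way. The sketch below illustrates that technique and is not necessarily the algorithm UCINET uses. The function name geodesic_counts is hypothetical, and the diagonal is set to zero by convention.

```python
from collections import deque

def geodesic_counts(adj):
    """Return a matrix whose (i, j) entry is the number of shortest paths i to j."""
    n = len(adj)
    counts = []
    for s in range(n):
        dist = [None] * n        # distance from s, None if not yet reached
        paths = [0] * n          # number of shortest paths from s
        dist[s], paths[s] = 0, 1
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if not adj[u][v]:
                    continue
                if dist[v] is None:              # first time v is reached
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:       # u lies on a geodesic to v
                    paths[v] += paths[u]
        paths[s] = 0             # convention in this sketch: ignore the diagonal
        counts.append(paths)
    return counts

# A 4-cycle 0-1-2-3-0: opposite corners are joined by two geodesics.
cycle = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
for row in geodesic_counts(cycle):
    print(*row)
```

In the 4-cycle each vertex has one geodesic to each neighbour and two to the opposite vertex.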

PARAMETERS

Input dataset:
Name of file containing network data. Data type: Digraph.

Output Filename: (Default = 'GeodesicsCount').


Name of dataset containing counts of geodesics for every pair of vertices.

LOG FILE An nxn matrix in which row i column j gives the number of geodesics connecting
i to j.

TIMING O(N^4).

COMMENTS None.

REFERENCES None.
NETWORK > COHESION > REACHABILITY
PURPOSE Constructs a matrix of reachability values for every pair of nodes.

DESCRIPTION The reachability for a pair of nodes is the value of an optimum path.

The algorithm produces a value in row i, col j of a matrix if node j is reachable


from node i and a blank otherwise.

This value can be any of the following:


The length of the shortest path.
The cost of the cheapest path, where the cost is the sum of all the values.
The strength of the strongest path, where the strength is the value of the weakest
link.

The probability of the most 'probable' path, where the probability of a path is the
product of the probabilities of its edges.
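
The three valued optima can all be computed with the same all-pairs scheme (a Floyd-Warshall style update) by changing how edge values are combined along a path and how competing paths are compared. The sketch below shows that idea and is not a description of UCINET's own routine. The function name optimum_paths is hypothetical, missing edges are marked with None, and unreachable pairs stay None, matching the "blank otherwise" rule above.

```python
def optimum_paths(w, kind):
    """All-pairs optimum path values; w[i][j] is None where there is no edge."""
    n = len(w)
    if kind == "cost":             # add edge values, keep the smallest total
        combine, better = (lambda x, y: x + y), min
    elif kind == "strength":       # a path is as strong as its weakest link
        combine, better = min, max
    elif kind == "probability":    # multiply edge values, keep the largest
        combine, better = (lambda x, y: x * y), max
    else:
        raise ValueError("unknown kind: " + kind)
    best = [row[:] for row in w]
    for k in range(n):             # allow node k as an intermediate step
        for i in range(n):
            for j in range(n):
                if i == j or best[i][k] is None or best[k][j] is None:
                    continue
                via = combine(best[i][k], best[k][j])
                best[i][j] = via if best[i][j] is None else better(best[i][j], via)
    return best

# Directed example: 0->1 (2), 1->2 (3) and a direct link 0->2 (6).
w = [[None, 2, 6],
     [None, None, 3],
     [None, None, None]]
print(optimum_paths(w, "cost")[0][2])       # 5: the two-step path is cheaper
print(optimum_paths(w, "strength")[0][2])   # 6: the direct link is stronger
```

For Adjacency data the same scheme with every edge value set to 1 and kind "cost" gives the length of the shortest path.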
PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Valued graph.

Type of Data: (Default = ADJACENCY)


Choices are:

Adjacency - standard binary data, distance corresponds to graph theoretic


geodesic.

Strengths - values indicate strengths or capacities of links between nodes. Optimum is
strongest path.

Costs - values indicate costs or lengths of links between nodes. Optimum is the cheapest cost.

Probabilities - values indicate probability of link and restricted to [0,1].


Optimum is most probable path.

Output dataset: (Default = 'Reachability')


Name of data file containing reachability matrix.

LOG FILE Matrix of reachability values between all pairs of nodes.

TIMING O(N^3)

COMMENTS None but see comments on


NETWORK>COHESION>DISTANCE

REFERENCES Doreian P (1974). 'On the connectivity of Social Networks'. Journal of


Mathematical Sociology, 3, 245-258.
NETWORK > COHESION > MAX FLOW
PURPOSE Compute the maximum flow (= the minimum cut) between all pairs of nodes in a
network.

DESCRIPTION In a valued or binary network the value of each edge (1 or 0 for binary networks)
can represent a capacity. Let c(x) denote the capacity of each edge of a network
N. A flow in N between two nodes s and t is a function f such that 0 ≤ f(x) ≤
c(x) for every edge x and, for every node z ≠ s or t, Σf(yz) = Σf(zw), so that
for each node, except s and t, the total amount of flow into the node equals the
total flow leaving the node.

The total flow leaving s is the same as that going into t; this value is called the
value of the flow. The maximum flow is simply the maximum value possible
between two vertices.

This procedure uses the algorithm due to Gomory and Hu to compute the
maximum flow between all pairs of vertices of a symmetric graph.
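
To make the definition concrete, the sketch below computes the maximum flow between one pair of nodes with the augmenting-path method (the Edmonds-Karp variant of Ford and Fulkerson's approach). This is a different and simpler technique than the Gomory and Hu procedure named above, which handles all pairs at once. The function name max_flow and the example capacities are invented for the illustration.

```python
from collections import deque

def max_flow(cap, s, t):
    """Maximum flow from s to t (s != t); cap is an n-by-n capacity matrix."""
    n = len(cap)
    residual = [row[:] for row in cap]     # remaining capacity on each arc
    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = [None] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] is None:
            u = queue.popleft()
            for v in range(n):
                if residual[u][v] > 0 and parent[v] is None:
                    parent[v] = u
                    queue.append(v)
        if parent[t] is None:              # no augmenting path is left
            return total
        # Find the bottleneck along the path, then push that much flow.
        push = float("inf")
        v = t
        while v != s:
            push = min(push, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= push
            residual[v][u] += push
            v = u
        total += push

# Symmetric integer-valued network on four nodes.
cap = [[0, 3, 2, 0],
       [3, 0, 1, 2],
       [2, 1, 0, 3],
       [0, 2, 3, 0]]
print(max_flow(cap, 0, 3))   # 5
```

In the example the two edges at node 0 have total capacity 5, and no cut separating node 0 from node 3 is smaller, so the maximum flow is 5 in either direction.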

PARAMETERS
Input dataset
Name of file containing network to be analyzed. Data type: Valued graph -
symmetric matrix only with integer values.

Output Filename (Default = 'MaxFlow').


Name of data file containing maximum flows between all pairs of vertices.

LOG FILE The Input dataset followed by an nxn matrix in which row i column j gives the
value of the maximum flow from vertex i to vertex j (i ≠ j).

TIMING O(N^4).

COMMENTS The maximum flow in a network is equal to the minimum cut. A cut between
two vertices s and t is a collection of edges which contains an edge from every s-
t path. The value of a cut is the sum of the value of the edges. A minimum cut is
the minimum value of all possible cuts between two vertices. For a binary
network this value is called the local edge connectivity.

REFERENCES Ford L R and Fulkerson D R (1956). 'Maximal flow through a network'.
Canadian Journal of Mathematics, 8, 399-404.

Gomory R E and Hu T C (1964). 'Synthesis of a communication network'.
Journal of SIAM (Appl Math), 12, 348.
NETWORK>COHESION>POINT CONNECTIVITY
PURPOSE Compute the local point connectivity between all pairs of nodes in a network.

DESCRIPTION The local (point) connectivity of two non-adjacent vertices is the minimum number
of vertices that need to be deleted so that no path connects them; this is equal to
the maximum number of vertex disjoint paths connecting them.
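
One common way to count vertex disjoint paths is to split every vertex into an "in" half and an "out" half joined by an arc of capacity 1, and then compute a maximum flow. The Python sketch below follows that idea. It is an illustration under stated assumptions, not the program's own routine: the function name point_connectivity is hypothetical, the input is assumed to be a 0/1 adjacency matrix of a digraph, and s and t are assumed to be non-adjacent.

```python
def point_connectivity(adj, s, t):
    """Maximum number of vertex disjoint paths from s to t in a digraph."""
    n = len(adj)
    size = 2 * n
    cap = [[0] * size for _ in range(size)]
    for v in range(n):
        # Vertex v becomes v_in = 2v and v_out = 2v + 1. The arc between them
        # has capacity 1, so at most one path can pass through v. The end
        # points s and t are allowed to carry every path.
        cap[2 * v][2 * v + 1] = n if v in (s, t) else 1
        for w in range(n):
            if adj[v][w] and v != w:
                cap[2 * v + 1][2 * w] = 1
    source, sink = 2 * s, 2 * t + 1

    def augment(u, seen):
        # Depth-first search for one more unit of flow in the residual network.
        if u == sink:
            return True
        seen.add(u)
        for v in range(size):
            if v not in seen and cap[u][v] > 0 and augment(v, seen):
                cap[u][v] -= 1
                cap[v][u] += 1
                return True
        return False

    paths = 0
    while augment(source, set()):
        paths += 1
    return paths
```

The recursion is adequate for small networks; a larger network would need an iterative search.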

PARAMETERS
Input dataset
Name of file containing network to be analyzed. Data type: Digraph

Output Filename (Default = 'PointConnectivity').


Name of data file containing the local point connectivities between all pairs of vertices.

LOG FILE An nxn matrix in which row i column j gives the local point connectivity from
vertex i to vertex j (i ≠ j). This value is precisely the maximum number of
vertex independent paths from i to j.

TIMING O(N^4).

COMMENTS None

REFERENCES None
NETWORK > REGIONS > COMPONENTS>SIMPLE GRAPHS
PURPOSE Identify the components of an undirected graph, or the weak or strong
components of a directed graph.

DESCRIPTION In an undirected graph two vertices are members of the same component if there
is a path connecting them. In a directed graph two vertices are in the same weak
component if there is a semi-path connecting them. Two vertices x and y are in
the same strong component if there is a path connecting x to y and a path
connecting y to x.
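
The definitions above translate directly into a small Python sketch. It is a plain restatement of the definitions rather than the program's algorithm, and it makes no attempt to match the stated timing. The function name components and the kind argument are hypothetical; the input is assumed to be a 0/1 adjacency matrix. The result is a partition vector in which a j in position i means that node i is in component j.

```python
def components(adj, kind="strong"):
    """Partition vector of strong or weak components of a digraph."""
    n = len(adj)
    weak = kind == "weak"

    def reachable(start):
        # Vertices reachable from start; weak components follow semi-paths,
        # so an arc in either direction counts as a link.
        seen = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                linked = adj[u][v] or (weak and adj[v][u])
                if linked and v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    reach = [reachable(v) for v in range(n)]
    partition = [0] * n
    label = 0
    for v in range(n):
        if partition[v] == 0:
            label += 1
            for w in reach[v]:
                # Same component only if each vertex can reach the other.
                if v in reach[w]:
                    partition[w] = label
    return partition
```

For an undirected (symmetric) matrix both settings of kind give the same partition.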

PARAMETERS
Input dataset:
Name of file containing network data to be analyzed. Data type: Directed graph.

Minimum Size to save: (Default = 3)


Size of smallest component which is to be saved in the component by actor
incidence matrix specified below.

Kind of components: (Default = Strong)


For directed data specify whether Strong or Weak components are required. For
undirected data either choice will yield the components.

Output sets: (Default = 'SubgroupComponentsSets')


Name of file which will contain a component by actor incidence matrix. A 1 in
row i column j means that node j is in component i. This file is not displayed in
the LOG FILE.

Output Partition: (Default = 'SubgroupComponentsPart')


Name of file which will contain a partition vector. A j in the ith position means
that node i is a member of component j. This file is not displayed in the LOG
FILE.

LOG FILE Number of components found.


List of all nodes indicating which labeled component each node is in.
List of components greater than minimum size, labeled - each component is
specified by the vertices it contains.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
NETWORK > REGIONS > COMPONENTS > VALUED GRAPHS

PURPOSE Identify the weak components corresponding to each cut-off value of a weighted
graph.

DESCRIPTION In a valued graph, the set of dichotomized graphs corresponding to each possible
weight form a nested sequence of graphs. The weak components of each of these
would also be nested and can be combined to form an hierarchical clustering of
weak components. Once two nodes have been placed in the same weak
component of a dichotomized graph for a particular cut-off value they remain in
the same weak component for all smaller cut-off values. This procedure
produces a hierarchical clustering based on these facts.
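
The nesting described above can be sketched in Python by dichotomizing the valued matrix at each distinct value, from largest to smallest, and recording the weak components at each level. This is an illustrative sketch only. The function name hierarchical_weak_components is hypothetical, a tie is assumed to be present at a cut-off when its value is greater than or equal to the cut-off, and each component is labeled by its lowest-numbered member.

```python
def hierarchical_weak_components(weights):
    """Map each cut-off value to a list giving every node's component label."""
    n = len(weights)
    cutoffs = sorted(
        {weights[i][j] for i in range(n) for j in range(n)
         if i != j and weights[i][j] > 0},
        reverse=True,
    )
    levels = {}
    for cutoff in cutoffs:
        parent = list(range(n))  # union-find forest for this cut-off

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        for i in range(n):
            for j in range(n):
                # Weak components ignore direction, so a tie either way merges.
                if i != j and weights[i][j] >= cutoff:
                    parent[find(i)] = find(j)
        smallest = {}
        labels = []
        for v in range(n):
            root = find(v)
            smallest.setdefault(root, v)  # first node seen is the lowest numbered
            labels.append(smallest[root])
        levels[cutoff] = labels
    return levels
```

Reading the result from the largest cut-off to the smallest shows components merging but never splitting, which is the nesting property the procedure relies on.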

PARAMETERS Input valued network


Name of file containing valued digraph. Data type: Valued graph.

Output Dataset (Default = 'hicomp')


Name of dataset to contain the partition indicator matrix. Each column of this
matrix gives the component to which each actor was assigned in a given level.
The columns are labeled by the corresponding cut-off value. A value of k in a
column labeled x and row j means that actor j was in component k at cut-off
value x.

LOG FILE Hierarchical clustering diagram of the components. The columns are rearranged
and labeled. A '·' in row label i column label j means that vertex j was not in a
weak component with any other vertex (i.e. it was an isolate) using a cut-off
value of i. An 'X' indicates that vertex j was in a non-trivial weak component
with all vertices on the same row as j which can be found by tracing across that
row without encountering a space.

TIMING O(N^4), actually N^2 times number of different values.

COMMENTS None

REFERENCES None
NETWORK > REGIONS > BICOMPONENTS

PURPOSE Finds all the bi-components or blocks of a graph.

DESCRIPTION A cutpoint of a graph is a vertex whose removal increases the number of
components. A non-separable graph is a graph that is connected non-trivially and
has no cutpoints. A block or bicomponent of a graph is a maximal non-separable
subgraph. The name bi-component reflects the fact that it requires the deletion of
two vertices to disconnect it. Bi-components overlap, but a vertex that is in more
than one bi-component must be a cutpoint.
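
The definition of a cutpoint can be checked directly by deleting each vertex in turn and recounting the components. The Python sketch below does exactly that. It is a brute-force illustration of the definition, not the program's algorithm, and it lists cutpoints only; it does not enumerate the bi-components themselves. The function names count_components and cutpoints are hypothetical, and the input is assumed to be a symmetric 0/1 adjacency matrix.

```python
def count_components(adj, removed=None):
    """Number of components, optionally with one vertex deleted."""
    n = len(adj)
    seen = set() if removed is None else {removed}
    count = 0
    for start in range(n):
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if adj[u][v] and v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


def cutpoints(adj):
    """Vertices whose removal increases the number of components."""
    base = count_components(adj)
    return [v for v in range(len(adj))
            if count_components(adj, removed=v) > base]
```

In a graph made of two triangles that share a single vertex, the shared vertex is the only cutpoint and each triangle is a bi-component.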

PARAMETERS
Input dataset:
Name of file containing graph to be analyzed. Data type: Graph.

Output dataset (Default = 'Blocks')


Name of file that will contain a block by actor incidence matrix. A 1 in column i
row j means that node j is in bi-component i. This file is not displayed in the
LOG FILE.

LOG FILE Number of bi-components found.


List of bi-components, labeled - each bi-component is specified by the vertices it
contains.

TIMING O(N^2).

COMMENTS None.

REFERENCES None.
NETWORK > REGIONS > K-CORES

PURPOSE List all k-cores of a graph.

DESCRIPTION A k-core in an undirected graph is a connected maximal induced subgraph which
has minimum degree greater than or equal to k. This procedure finds all k-cores
for every possible value of k.
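
A k-core can be found by repeatedly deleting vertices of degree less than k and then splitting what remains into connected pieces. The Python sketch below shows this for a single value of k. It is a minimal illustration, not the program's implementation; the function name k_cores is hypothetical and the input is assumed to be a symmetric 0/1 adjacency matrix.

```python
def k_cores(adj, k):
    """List the k-cores of an undirected graph for one value of k."""
    n = len(adj)
    alive = set(range(n))
    changed = True
    while changed:
        # Peel off any vertex with fewer than k surviving neighbours.
        changed = False
        for v in list(alive):
            degree = sum(1 for w in alive if w != v and adj[v][w])
            if degree < k:
                alive.discard(v)
                changed = True
    # Each connected piece of the survivors is one k-core.
    cores = []
    unseen = set(alive)
    while unseen:
        start = min(unseen)
        unseen.discard(start)
        piece = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for v in list(unseen):
                if adj[u][v]:
                    unseen.discard(v)
                    piece.add(v)
                    stack.append(v)
        cores.append(sorted(piece))
    return cores
```

Calling the function for k = 0, 1, 2, and so on until it returns an empty list gives the nested family of cores that the dendrogram displays.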

PARAMETERS
Input dataset:
Name of file containing data to be analyzed. Data type: Graph.

Output dataset: (Default = 'Kcores')


Name of file which will contain a k-core by actor partition matrix. The partition
by actor matrix is defined as follows: a value of k in a column labeled i and row
labeled j means that node j is in partition k for the i-core partition. From the
remarks above it follows that if there is only one value k in the column labeled i
then node j is not a member of any i-core. Otherwise all other members of j's i-
core will have a value of k in the same column.

LOG FILE A single link hierarchical clustering dendrogram. The actors are re-ordered so that
they are located close to other actors in similar k-cores. The level at which any
pair of actors are aggregated is the point at which both can be reached by tracing
from the start to the actors from right to left. The scale at the top gives the level
at which they are clustered. The diagram can be printed or saved. Parts of the
diagram can be viewed by moving the mouse to the beginning of a line in the
dendrogram and clicking. The first click will highlight a portion of the diagram
and the second click will display just the highlighted portion. To return to the
original, right click on the mouse. There is also a simple zoom facility: simply
change the values and then press enter. If the labels need to be edited
(particularly the scale labels) then you should take the partition indicator matrix
into the spreadsheet editor, remove or reduce the labels, and then submit the edited
data to Tools>Dendrogram>Draw. In the clustering diagram each level
corresponds to a different value of 'k' in k-core.

Behind the dendrogram is a clustering diagram representing the same thing. The
rows are labeled by the possible values of k. The columns are rearranged and
labeled. A '·' in row i column label j indicates that vertex j is not in any i-core.
An 'X' indicates that vertex j is in an i-core; all other members of j's i-core are
found by tracing along row i in both directions from column j until a space is
encountered in each direction. The column labels corresponding to an 'X' which
are connected to j's 'X' are all members of j's i-core.

TIMING O(N^3)

COMMENTS K-cores are not necessarily cohesive subsets but they do identify areas of the
graph which contain clique-like structures.

REFERENCES Seidman S (1983). 'Network structure and minimum degree'. Social Networks,
5, 269-287.
NETWORK > SUBGROUPS > CLIQUES
PURPOSE Find all cliques in a network.

DESCRIPTION A clique is a maximal complete subgraph.

The program implements the Bron and Kerbosch (1973) algorithm to find all
Luce and Perry (1949) cliques greater than a specified size. The routine will also
provide an analysis of the overlapping structure of the cliques. This analysis
gives information on the number of times each pair of actors are in the same
clique, and gives a hierarchical clustering based upon this information. It
also performs the dual operation by examining the number of actors a pair of
cliques has in common. This too is submitted to an hierarchical clustering routine.
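
For readers unfamiliar with the Bron and Kerbosch approach, the Python sketch below shows its basic form without pivoting. It is a textbook illustration and may differ in detail from the program's implementation. The function name find_cliques is hypothetical, the input is assumed to be a symmetric 0/1 adjacency matrix, and min_size is treated here as the smallest size that is reported.

```python
def find_cliques(adj, min_size=3):
    """All maximal complete subgraphs with at least min_size members."""
    n = len(adj)
    neighbours = [{w for w in range(n) if w != v and adj[v][w]}
                  for v in range(n)]
    cliques = []

    def expand(current, candidates, excluded):
        # current is complete; candidates can still extend it; excluded holds
        # vertices already tried, which would only repeat earlier cliques.
        if not candidates and not excluded:
            if len(current) >= min_size:
                cliques.append(sorted(current))
            return
        for v in sorted(candidates):
            expand(current | {v},
                   candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(n)), set())
    return sorted(cliques)
```

A set is reported only when nothing can be added to it and it is not contained in a clique found earlier, which is what makes each reported set maximal.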

PARAMETERS
Input dataset
Name of file containing data to be analyzed. Data type: Graph.

Minimum Size: (Default = 3)


This gives the smallest group size which is to be considered a clique. The range
is 1 to N.

Analyze pattern of overlaps? (Default = YES).


Yes means that an analysis of clique overlap will be performed. This includes the
construction of a clique co-membership matrix, and an hierarchical clustering
which is saved in a partition indicator matrix as described below. The co-clique
matrix is also constructed and this is also submitted to an hierarchical clustering
routine.
No restricts the analysis to identifying cliques only.

Diagram Type: (Default = 'Tree diagram')


When analyzing the overlap the clustering diagram can either be a Tree Diagram
or a Dendrogram.

(Output) Clique indicator matrix: (Default = 'CliquesSets').


Name of file which contains a clique by actor incidence matrix. A 1 in column i
row j indicates that actor j is a member of clique i. This matrix is not displayed
in the LOG FILE.

(Output) Co-membership matrix: (Default = 'CliquesOver').


Name of file which contains clique overlap matrix described in LOG FILE
below. Note that if no analysis of pattern overlaps was chosen then this file is not
created.

(Output) Partition indicator matrix: (Default = 'CliquePart').


Name of file which contains partition indicator matrix derived from overlap
analysis. The partition indicator matrix corresponds to the hierarchical clustering
displayed in the LOG FILE. A value of k in a column labeled i and row j means
that actor j is in partition k and is in i cliques with every other member of
partition k. Actor k is always a member of partition k, and is a representative
label for the group.

LOG FILE Number of cliques found.


List of cliques, labeled - each clique is specified by the vertices it contains.
The following output is also produced if YES was inserted on the form in reply
to the question 'Analyze pattern of overlaps?' The first part of the output will be
the tree diagram or dendrogram corresponding to the clustering of the actor by
actor co-membership matrix. In the matrix a value of k in row i column j means
that vertices i and j occurred in the same clique k times. The ith diagonal entry
gives the number of cliques which contain i.

The tree diagram (or a dendrogram) re-orders the actors so that they are located
close to other actors in similar clusters. The level at which any pair of actors are
aggregated is the point at which both can be reached by tracing from the start to
the actors from right to left. The scale at the top gives the level at which they are
clustered and corresponds to the number of overlaps. The diagram can be printed
or saved. Parts of the diagram can be viewed by moving the mouse to the split
point in a tree diagram or the beginning of a line in the dendrogram and clicking.
The first click will highlight a portion of the diagram and the second click will
display just the highlighted portion. To return to the original, right click on the
mouse. There is also a simple zoom facility: simply change the values and then
press enter. If the labels need to be edited (particularly the scale labels) then you
should take the partition indicator matrix into the spreadsheet editor, remove or
reduce the labels, and then submit the edited data to Tools>Dendrogram>Draw.

Behind the diagram is a window containing the number of cliques and a list as
specified above. This is followed by a clustering diagram representing the same
clustering as the tree diagram (or dendrogram). The columns are rearranged and
labeled. A '·' in row label i column label j means that vertex j was not in i cliques
with any other vertex. An 'X' indicates that vertex j was in i cliques with all
vertices on the same row as j which can be found by tracing across that row
without encountering a space.

This is followed by the clique by clique co-membership matrix. In the matrix a
value of k in row i column j means that cliques i and j contain k actors in
common. The ith diagonal entry gives the number of actors in clique i. This is
followed by a clustering diagram corresponding to an hierarchical clustering of
the clique by clique co-membership matrix.
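
The two overlap matrices described in this section are simple counts and can be sketched in a few lines of Python. The function names co_membership and clique_overlap are hypothetical; cliques are assumed to be given as lists of actor numbers starting at 0.

```python
def co_membership(cliques, n_actors):
    """Actor by actor matrix: entry (i, j) counts the cliques containing both."""
    matrix = [[0] * n_actors for _ in range(n_actors)]
    for members in cliques:
        for i in members:
            for j in members:
                matrix[i][j] += 1  # when i == j this counts cliques containing i
    return matrix


def clique_overlap(cliques):
    """Clique by clique matrix: entry (i, j) counts the actors in common."""
    sets = [set(members) for members in cliques]
    return [[len(a & b) for b in sets] for a in sets]
```

With the cliques [0, 1, 2] and [2, 3, 4], actor 2 has a diagonal entry of 2 in the first matrix, and the second matrix has 3 on the diagonal and 1 off the diagonal.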

TIMING Algorithm is exponential.

COMMENTS None.

REFERENCES Luce R and Perry A (1949). A method of matrix analysis of group structure.
Psychometrika 14, 95-116.

Bron C and Kerbosch J (1973). Finding all cliques of an undirected graph.
Comm of the ACM 16, 575-577.
NETWORK > SUBGROUPS > N-CLIQUES
PURPOSE Find all n-cliques in a network.

DESCRIPTION An n-clique of an undirected graph is a maximal subgraph in which every pair of
vertices is connected by a path of length n or less. These are found using an
adapted version of the Bron and Kerbosch (1973) algorithm. The routine will
also provide an analysis of the overlapping structure of the n-cliques. This
analysis gives information on the number of times each pair of actors are in the
same n-clique and gives an hierarchical clustering based upon this information.
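
One way to see what an n-clique is: join every pair of vertices whose distance in the original graph is n or less, then find the ordinary cliques of that new graph. The Python sketch below takes this route with a basic Bron and Kerbosch search. It is an illustration of the definition and may differ from the adapted algorithm the program uses. The names distances_from and n_cliques are hypothetical, and the input is assumed to be a symmetric 0/1 adjacency matrix.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first distances from source; None marks unreachable vertices."""
    dist = [None] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(len(adj)):
            if adj[u][v] and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def n_cliques(adj, n_value=2, min_size=3):
    """Maximal sets in which every pair is at distance n_value or less."""
    size = len(adj)
    close = []
    for v in range(size):
        dist = distances_from(adj, v)
        close.append({w for w in range(size)
                      if w != v and dist[w] is not None and dist[w] <= n_value})
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            if len(current) >= min_size:
                found.append(sorted(current))
            return
        for v in sorted(candidates):
            expand(current | {v}, candidates & close[v], excluded & close[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(size)), set())
    return sorted(found)
```

Distances are measured in the whole graph, so the connecting paths may pass through vertices outside the n-clique.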

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Graph.

Value of N: (Default = 2)
All members of an n-clique are connected by a path of length n or less. A value
of 1 would give all Luce and Perry cliques; the maximum value of N-1 would
give the components of the graph.

Minimum Size: (Default = 3)


This gives the smallest group size which is to be considered an n-clique. The
range is 1 to N.

Analyze pattern of overlaps? (Default = YES).


Yes means that an analysis of n-clique overlap will be performed. This includes
the construction of an n-clique co-membership matrix, and an hierarchical
clustering which is saved in a partition indicator matrix as described below.
No restricts the analysis to identifying n-cliques only.

Diagram Type: (Default = 'Tree diagram')


When analyzing the overlap the clustering diagram can either be a Tree
Diagram or a Dendrogram.

(Output) n-clique indicator matrix: (Default = 'NClqSets').


Name of file which contains an n-clique by actor incidence matrix. A 1 in column
i row j indicates that actor j is a member of n-clique i. This matrix is not
displayed in the LOG FILE.

(Output) Co-membership matrix: (Default = 'NClqOver').


Name of file which contains n-clique overlap matrix described in LOG FILE
below. Note that if no analysis of pattern overlaps was chosen then this file is
not created.

(Output) Partition indicator matrix: (Default = 'NClqPart').


Name of file which contains partition indicator matrix derived from overlap
analysis. The partition indicator matrix corresponds to the hierarchical clustering
displayed in the LOG FILE. A value of k in a column labeled i and row j means
that actor j is in partition k and is in i n-cliques with every other member of
partition k. Actor k is always a member of partition k, and is a representative
label for the group.

LOG FILE Number of n-cliques found.


List of n-cliques, labeled - each n-clique is specified by the vertices it contains.
The following output is also produced if YES was inserted on the form in reply
to the question 'Analyze pattern of overlaps?' The first part of the output will be
the tree diagram or dendrogram corresponding to the single link clustering of the
n-clique overlap matrix. In the n-clique overlap matrix a value of k in row i
column j means that vertices i and j occurred in the same n-clique k times. The
ith diagonal entry gives the number of n-cliques which contain i.

The tree diagram (or a dendrogram) re-orders the actors so that they are located
close to other actors in similar clusters. The level at which any pair of actors are
aggregated is the point at which both can be reached by tracing from the start to
the actors from right to left. The scale at the top gives the level at which they are
clustered and corresponds to the number of overlaps. The diagram can be printed
or saved. Parts of the diagram can be viewed by moving the mouse to the split
point in a tree diagram or the beginning of a line in the dendrogram and clicking.
The first click will highlight a portion of the diagram and the second click will
display just the highlighted portion. To return to the original, right click on the
mouse. There is also a simple zoom facility: simply change the values and then
press enter. If the labels need to be edited (particularly the scale labels) then you
should take the partition indicator matrix into the spreadsheet editor, remove or
reduce the labels, and then submit the edited data to Tools>Dendrogram>Draw.

Behind the diagram is a window containing the number of n-cliques and a list as
specified above. This is followed by a clustering diagram representing the same
clustering as the tree diagram (or dendrogram). The columns are rearranged and
labeled. A '·' in row label i column label j means that vertex j was not in i n-
cliques with any other vertex. An 'X' indicates that vertex j was in i n-cliques
with all vertices on the same row as j which can be found by tracing across that
row without encountering a space.

TIMING Algorithm is exponential.

COMMENTS Usually only 2-cliques or 3-cliques are of significance.

REFERENCES Luce R (1950). Connectivity and generalized cliques in sociometric group
structure. Psychometrika 15, 169-190.

Bron C and Kerbosch J (1973). Finding all cliques of an undirected graph.
Comm of the ACM 16, 575-577.
NETWORK > SUBGROUPS > N-CLAN
PURPOSE Find all n-clans in a network.

DESCRIPTION An n-clan is an n-clique which has diameter less than or equal to n as an induced
subgraph. These are found by using the n-clique routine and checking the
diameter condition.

The routine will also provide an analysis of the overlapping structure of the n-
clans. This analysis gives information on the number of times each pair of actors
are in the same n-clan and gives an hierarchical clustering based upon this
information.
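
The diameter check that turns an n-clique into an n-clan can be sketched in Python as follows. The sketch assumes the n-cliques have already been found and are passed in as lists of vertex numbers. The function names induced_diameter and n_clans are hypothetical, and the input graph is assumed to be a symmetric 0/1 adjacency matrix.

```python
from collections import deque


def induced_diameter(adj, members):
    """Diameter of the subgraph induced by members (infinite if disconnected)."""
    members = sorted(members)
    worst = 0
    for source in members:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in members:  # only vertices inside the subgraph are used
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(members):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst


def n_clans(adj, n_clique_list, n_value=2):
    """Keep the n-cliques whose induced diameter is n_value or less."""
    return [members for members in n_clique_list
            if induced_diameter(adj, members) <= n_value]
```

An n-clique fails the test when some pair of its members is only within distance n by way of a vertex outside the group.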

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Graph.

Value of N: (Default = 2)
All members of an n-clan are in an n-clique and have the additional property that
they are connected by a path of length n or less in which each vertex is also a
member of the n-clique. A value of 1 would give all Luce and Perry cliques; the
maximum value of N-1 would give the components of the graph.

Minimum Size: (Default = 3)


This gives the smallest group size which is to be considered an n-clan. The range
is 1 to N.

Analyze pattern of overlaps? (Default = YES).


Yes means that an analysis of n-clan overlap will be performed. This includes the
construction of an n-clan co-membership matrix, and an hierarchical clustering
which is saved in a partition indicator matrix as described below.
No restricts the analysis to identifying n-clans only.

Diagram Type: (Default = 'Tree diagram')


When analyzing the overlap the clustering diagram can either be a Tree Diagram
or a Dendrogram.

(Output) n-clan indicator matrix: (Default = 'NClanSets').


Name of file which contains an n-clan by actor incidence matrix. A 1 in column i
row j indicates that actor j is a member of n-clan i. This matrix is not displayed
in the LOG FILE.

(Output) Co-membership matrix: (Default = 'NClanOver').


Name of file which contains n-clan overlap matrix described in LOG FILE
below. Note that if no analysis of pattern overlaps was chosen then this file is not
created.

(Output) Partition indicator matrix: (Default = 'NClanPart').


Name of file which contains partition indicator matrix derived from overlap
analysis. The partition indicator matrix corresponds to the hierarchical clustering
displayed in the LOG FILE. A value of k in a column labeled i and row j means
that actor j is in partition k and is in i n-clans with every other member of
partition k. Actor k is always a member of partition k, and is a representative
label for the group.

LOG FILE Number of n-clans found.


List of n-clans, labeled - each n-clan is specified by the vertices it contains.

The following output is also produced if YES was inserted on the form in reply
to the question 'Analyze pattern of overlaps?' The first part of the output will be
the tree diagram or dendrogram corresponding to the single link clustering of the
n-clan overlap matrix. In the n-clan overlap matrix a value of k in row i column j
means that vertices i and j occurred in the same n-clan k times. The ith diagonal
entry gives the number of n-clans which contain i.

The tree diagram (or a dendrogram) re-orders the actors so that they are located
close to other actors in similar clusters. The level at which any pair of actors are
aggregated is the point at which both can be reached by tracing from the start to
the actors from right to left. The scale at the top gives the level at which they are
clustered and corresponds to the number of overlaps. The diagram can be printed
or saved. Parts of the diagram can be viewed by moving the mouse to the split
point in a tree diagram or the beginning of a line in the dendrogram and clicking.
The first click will highlight a portion of the diagram and the second click will
display just the highlighted portion. To return to the original, right click on the
mouse. There is also a simple zoom facility: simply change the values and then
press enter. If the labels need to be edited (particularly the scale labels) then you
should take the partition indicator matrix into the spreadsheet editor, remove or
reduce the labels, and then submit the edited data to Tools>Dendrogram>Draw.

Behind the diagram is a window containing the number of n-clans and a list as
specified above. This is followed by a clustering diagram representing the same
clustering as the tree diagram (or dendrogram). The columns are rearranged and
labeled. A '·' in row label i column label j means that vertex j was not in i n-
clans with any other vertex. An 'X' indicates that vertex j was in i n-clans with all
vertices on the same row as j which can be found by tracing across that row
without encountering a space.

TIMING Algorithm is exponential.

COMMENTS Usually only 2-clans or 3-clans are of significance.

REFERENCES Mokken R (1979). Cliques, clubs and clans. Quality and Quantity 13, 161-173.
NETWORK > SUBGROUPS > K-PLEX
PURPOSE Find all k-plexes in a network.

DESCRIPTION A k-plex is a maximal subgraph with the following property: each vertex of the
induced subgraph is connected to at least n-k other vertices, where n is the
number of vertices in the induced subgraph. The basic algorithm is a depth first
search.

The routine will also provide an analysis of the overlapping structure of the k-
plexes. This analysis gives information on the number of times each pair of
actors are in the same k-plex and gives an hierarchical clustering based upon this
information.
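
The k-plex condition is easy to test for a given set of vertices, and for very small graphs the maximal k-plexes can be listed by brute force. The Python sketch below does this by checking every subset, largest first. It is not the depth first search used by the program and is practical only for a handful of vertices. The function names is_k_plex and maximal_k_plexes are hypothetical, and the input is assumed to be a symmetric 0/1 adjacency matrix.

```python
from itertools import combinations


def is_k_plex(adj, members, k):
    """True if every member is adjacent to at least len(members) - k others."""
    members = list(members)
    size = len(members)
    return all(
        sum(1 for w in members if w != v and adj[v][w]) >= size - k
        for v in members
    )


def maximal_k_plexes(adj, k=2, min_size=3):
    """Brute-force list of maximal k-plexes with at least min_size members."""
    n = len(adj)
    found = []
    # Work from the largest subsets down, so that any k-plex contained in one
    # already found can be rejected as non-maximal.
    for size in range(n, min_size - 1, -1):
        for members in combinations(range(n), size):
            candidate = set(members)
            if is_k_plex(adj, candidate, k) and not any(
                candidate < bigger for bigger in found
            ):
                found.append(candidate)
    return sorted(sorted(s) for s in found)
```

The containment test is sufficient because removing a vertex from a k-plex always leaves a k-plex, so every non-maximal k-plex sits inside a maximal one found at a larger size.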

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Graph.

Value of K: (Default = 2)
The value of k specifies the relative minimum size of the degree of each vertex
compared with the size of the k-plex. A value of 1 corresponds to a Luce and
Perry clique. Every vertex in a k-plex of size n has degree at least n-k in the
subgraph induced by the k-plex. The range of k is 1 to N. (A value of N would
give the whole graph as the only k-plex).

Minimum Size: (Default = 3)


This gives the smallest group size which is to be considered a k-plex. The range
is 1 to N, normally this should be at least K+2.

Analyze pattern of overlaps? (Default = YES).

Yes means that an analysis of k-plex overlap will be performed. This includes the
construction of a k-plex co-membership matrix, and an hierarchical clustering
which is saved in a partition indicator matrix as described below.

No restricts the analysis to identifying k-plexes only.

Diagram Type: (Default = 'Tree diagram')


When analyzing the overlap the clustering diagram can either be a Tree Diagram
or a Dendrogram.

(Output) k-plex indicator matrix: (Default = 'KPlexSet').


Name of file which contains a k-plex by actor incidence matrix. A 1 in column i
row j indicates that actor j is a member of k-plex i. This matrix is not displayed
in the LOG FILE.

(Output) Co-membership matrix: (Default = 'KplexOvr').


Name of file which contains k-plex overlap matrix described in LOG FILE
below. Note that if no analysis of pattern overlaps was chosen then this file is not
created.

(Output) Partition indicator matrix: (Default = 'KplexPrt').


Name of file which contains partition indicator matrix derived from overlap
analysis. The partition indicator matrix corresponds to the hierarchical clustering
displayed in the LOG FILE. A value of m in a column labeled i and row j means
that actor j is in partition m and is in i k-plexes with every other member of
partition m. Actor m is always a member of partition m, and is a representative
label for the group.

LOG FILE Number of k-plexes found.


List of k-plexes, labeled - each k-plex is specified by the vertices it contains.

The following output is also produced if YES was inserted on the form in reply
to the question 'Analyze pattern of overlaps?' The first part of the output will be
the tree diagram or dendrogram corresponding to the single link clustering of the
k-plex overlap matrix. In the k-plex overlap matrix a value of m in row i column
j means that vertices i and j occurred in the same k-plex m times. The ith
diagonal entry gives the number of k-plexes which contain i.

The tree diagram (or a dendrogram) re-orders the actors so that they are located
close to other actors in similar clusters. The level at which any pair of actors are
aggregated is the point at which both can be reached by tracing from the start to
the actors from right to left. The scale at the top gives the level at which they are
clustered and corresponds to the number of overlaps. The diagram can be printed
or saved. Parts of the diagram can be viewed by moving the mouse to the split
point in a tree diagram or the beginning of a line in the dendrogram and clicking.
The first click will highlight a portion of the diagram and the second click will
display just the highlighted portion. To return to the original, right click on the
mouse. There is also a simple zoom facility: simply change the values and then
press enter. If the labels need to be edited (particularly the scale labels) then you
should take the partition indicator matrix into the spreadsheet editor, remove or
reduce the labels, and then submit the edited data to Tools>Dendrogram>Draw.

Behind the diagram is a window containing the number of k-plexes and a list as
specified above. This is followed by a clustering diagram representing the same
clustering as the tree diagram (or dendrogram). The columns are rearranged and
labeled. A '·' in row label i column label j means that vertex j was not in i k-
plexes with any other vertex. An 'X' indicates that vertex j was in i k-plexes with
all vertices on the same row as j which can be found by tracing across that row
without encountering a space.

TIMING Algorithm is exponential.

COMMENTS It is advisable to initially select k and the minimum size n so that k < (n+2)/2; in
this case the diameter of the k-plex is 2 (or less). If a k-plex is connected and k ≥
(n+2)/2 then the diameter is always less than or equal to 2k-n+1. However, it
should not be assumed that the k-plex is connected; this would need to be
checked.

REFERENCES Seidman S and Foster B (1978). A graph theoretic generalization of the clique
concept. Journal of Mathematical Sociology, 6, 139-154.

Seidman S and Foster B (1978). A note on the potential for genuine cross-
fertilization between anthropology and mathematics. Social Networks 1, 65-72.
NETWORK > SUBGROUPS > LAMBDA SETS
PURPOSE List all lambda sets of a graph.

DESCRIPTION The edge connectivity of a pair of vertices is the minimum number of edges
which must be deleted so that there is no path connecting them.

A lambda set is a maximal subset of vertices with the property that the edge
connectivity of any pair of vertices within the subset is strictly greater than the
edge connectivity of any pair of vertices, one of which is in the subset and one of
which is outside.

Hence if λ(a,b) represents the edge-connectivity of two vertices a and b from a
graph G(V,E) then a subset S is a lambda set if it is a maximal set with the
property that λ(a,b) > λ(c,d) for all a, b, c ∈ S and d ∈ V-S.

The algorithm employed first computes the maximum flow (i.e. the connectivity)
between all pairs of vertices (see NETWORK>COHESION>MAX FLOW)
and uses this information to construct the lambda sets.
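
Given the matrix of pairwise edge connectivities, lambda sets can be read off level by level: at level k, link every pair whose connectivity is at least k and take the connected groups. Every pair inside such a group has connectivity of at least k, and every pair with one member outside has connectivity below k. The Python sketch below illustrates this under stated assumptions; it is not necessarily how the program constructs the sets. The function name lambda_sets is hypothetical, and the input is assumed to be a symmetric matrix of maximum flows whose diagonal is ignored.

```python
def lambda_sets(flow):
    """Map each connectivity level to the lambda sets found at that level."""
    n = len(flow)
    levels = sorted(
        {flow[i][j] for i in range(n) for j in range(n)
         if i != j and flow[i][j] > 0},
        reverse=True,
    )
    result = {}
    for level in levels:
        label = [None] * n
        for start in range(n):
            if label[start] is not None:
                continue
            # Collect everything linked to start through pairs whose
            # connectivity is at least the current level.
            label[start] = start
            stack = [start]
            while stack:
                u = stack.pop()
                for v in range(n):
                    if label[v] is None and u != v and flow[u][v] >= level:
                        label[v] = start
                        stack.append(v)
        groups = {}
        for v, root in enumerate(label):
            groups.setdefault(root, []).append(v)
        # Single vertices are dropped; only groups of two or more are kept.
        result[level] = [g for g in groups.values() if len(g) > 1]
    return result
```

For a triangle with one extra vertex attached by a single edge, the triangle is a lambda set at level 2 and the whole graph is a lambda set at level 1.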

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Graph.

(Output) Partition Matrix: (Default = 'LambdaSetsPart')


Name of file which contains partition indicator matrix which corresponds to the
hierarchical clustering produced in the LOG FILE. A value of k in a column
labeled i and row j means that actor j is in partition k; the other members of the
partition form a lambda set with minimum edge-connectivity i. Actor k is always
a member of partition k, and is a representative label for the group. This matrix is
not displayed in the LOG FILE.

(Output) Lambda Matrix: (Default = 'LambdaSetsFlow')


Name of data file containing maximum flows between all pairs of vertices.

(Output) Permutation Vector: (Default = 'LambdaSetsPerm')


Name of data file which contains the permutation of the nodes used in
constructing the single link hierarchical clustering diagram below.

LOG FILE An hierarchical clustering dendrogram, each level corresponding to a different
degree of minimum internal edge-connectivity. This value characterizes the
lambda set. The level at which any pair of actors are aggregated is the point at
which both can be reached by tracing from the start to the actors from right to
left. The scale at the top gives the level at which they are clustered. The diagram
can be printed or saved. Parts of the diagram can be viewed by moving the
mouse to the beginning of a line in the dendrogram and clicking. The first click
will highlight a portion of the diagram and the second click will display just the
highlighted portion. To return to the original, right click on the mouse. There is
also a simple zoom facility: simply change the values and then press enter. If the
labels need to be edited (particularly the scale labels) then you should take the
partition indicator matrix into the spreadsheet editor, remove or reduce the labels,
and then submit the edited data to Tools>Dendrogram>Draw.

Behind the dendrogram is a clustering diagram representing the same thing. In
this diagram each level corresponds to a different value of minimum edge-
connectivity. The columns are rearranged and labeled. A '·' in row labeled i
column label j indicates that vertex j is not in a lambda set of minimum
connectivity i. An 'X' indicates that vertex j is a member of a lambda set; all
other members of j's lambda set are found by tracing along row labeled i in both
directions from column j until a space is encountered in each direction. The
column labels corresponding to an 'X' which are connected to j's 'X' are all
members of j's lambda set with minimum connectivity i.

The single link hierarchical diagram is followed by a maximum flow matrix. The
maximum flow between i and j is given by the value in row i column j. The
diagonal is set equal to the number of vertices; theoretically this value should be
infinite.

TIMING O(N^4).

COMMENTS Note this algorithm works on integer valued graphs by the natural extension of
connectivity to minimum weight cutsets.

REFERENCES Borgatti S P, Everett M G and Shirey P R (1990). 'LS Sets, Lambda Sets and
other cohesive subsets'. Social Networks 12, 337-357.
NETWORK >SUBGROUPS >FACTIONS
PURPOSE Optimizes a cost function which measures the degree to which a partition
consists of clique like structures using a tabu search method.

DESCRIPTION Given a partition of a binary network of adjacencies into n groups, then a count
of the number of missing ties within each group summed with the ties between
the groups gives a measure of the extent to which the groups form separate
clique like structures. The routine uses a tabu search minimization procedure to
optimize this measure to find the best fit.
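
The cost function described above can be sketched as follows. This is only an illustration of the measure being minimized, not the tabu search itself and not UCINET's own code; the function name faction_cost is hypothetical, and counting each ordered pair (i, j) separately is an assumption made so that directed data can be handled.

```python
def faction_cost(adj, partition):
    """Missing within-faction ties plus observed between-faction ties."""
    n = len(adj)
    cost = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal is ignored
            same_faction = partition[i] == partition[j]
            if same_faction and not adj[i][j]:
                cost += 1  # a tie missing inside a faction
            elif not same_faction and adj[i][j]:
                cost += 1  # a tie running between factions
    return cost


# Vertices 1, 2 and 4 form one clique; vertices 3 and 5 form another.
adj = [
    [0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
print(faction_cost(adj, [1, 1, 2, 1, 2]))  # partition matching the cliques
print(faction_cost(adj, [1, 1, 1, 2, 2]))  # a poorer partition
```

A partition that matches the clique structure gives the lowest cost; any search procedure, tabu or otherwise, would compare candidate partitions using a function of this kind.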

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph.

Number of factions: (Default = 2)


Number of partitions into which the data needs to be split.

Maximum # of iterations in a series: (Default = 20)


The algorithm starts from an arbitrary partition and attempts to decrease the cost
by taking the steepest descent. If the cost cannot be reduced then the algorithm
continues its search in the neighborhood of the current partition. This search
direction is a mildest ascent direction and from there new search directions are
explored. This exploration only continues for a fixed number of iterations in a
series. If no improvement is made after the fixed number of iterations the
algorithm terminates with the current minimum. Increasing the parameter gives a
more exhaustive and therefore slower search.

Length of time in penalty box: (Default = 15)


If the algorithm makes an ascending step then it is possible that the best possible
descending step is the reverse of the direction just taken. This parameter
prohibits a move along the reverse direction for a set number of steps. The larger
the value the more difficult it will be to come back to a previously explored local
minimum, however it will also be more difficult to explore the vicinity of that
minimum. The default has been shown experimentally to be the most useful.

Number of random starts: (Default = 10 )


The whole procedure is repeated with a different initial partition for each start. The best of
the resulting solutions is then selected as the minimum.

Random Number Seed:


The random number seed generates the initial partition. UCINET generates a
different random number as default each time it is run. This number should be
changed if the user wishes to repeat the analysis with different initial
configurations. The range is 1 to 32000.

Output partition dataset: (Default = 'FactionsPart')


Name of dataset which contains a partition indicator vector. This vector has the
form (k1,k2,...ki,...) where ki assigns vertex i to faction ki so that (1 1 2 1 2)
assigns vertices 1, 2 and 4 to faction 1 and 3 and 5 to faction 2. This vector is not
displayed in the LOG FILE.

Output sets dataset: (Default ='FactionsSets')


Name of dataset which contains the sets information.
LOG FILE The value of the cost function.
The group assignments. A list of the factions labeled, each faction is specified by
the vertices it contains.
A grouped adjacency matrix. A blocked permuted adjacency matrix where the
diagonal blocks correspond to the factions.

TIMING Each iteration of the tabu search algorithm is O(N^2). Random tests with default
parameters as specified indicate O(N^2.5).

COMMENTS Care should be taken when using this routine.

The algorithm seeks to find the minimum of the cost function. Even if successful
this result may still be a high value, in which case the factions may not represent
cohesive subgroups.

In addition there may be a number of alternative partitions which also produce
the minimum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local minimum and does not
locate the desired global minimum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into the
reported factions.

REFERENCES de Amorim S G, Barthélemy J P and Ribeiro (1990). Clustering and Clique
Partitioning: Simulated Annealing and Tabu Search Approaches. Research
report from Groupe d'études et de recherche en analyse des décisions. Ecole des
Hautes Etudes Commerciales, Ecole Polytechnique, Université McGill.

F Glover (1989). Tabu Search - Part I. ORSA Journal on Computing 1, 190-206.

F Glover (1990). Tabu Search - Part II. ORSA Journal on Computing 2, 4-32.
NETWORK > EGO NETWORKS > DENSITY

PURPOSE Compute standard ego network measures for every actor in a network.

DESCRIPTION This routine systematically constructs the ego network for every actor within the
network and computes a collection of ego network measures. For directed data
both in and out networks can be considered separately or together.

PARAMETERS
Input network:
Name of file which contains network to be analyzed. Data type: Digraph.

Type of ego neighborhood: (Default = UNDIRECTED)


Choices are:

UNDIRECTED - considers all actors connected to and from ego.
IN-NEIGHBORHOOD - considers only actors with a tie to ego.
OUT-NEIGHBORHOOD - considers only actors with a tie from ego.

Output dataset (Default = EgoNet)


Name of file containing ego-by-variable matrix.

LOG FILE A table of ego network measures. All measures exclude ties involving ego itself.
The measures include the following:

Size. The number of actors (alters) that ego is directly connected to.

Ties. The total number of ties in the ego network (not counting ties involving
ego).

Pairs. The total number of pairs of alters in the ego network -- i.e., potential ties.

Density. The number of ties divided by the number of pairs, times 100.

Avgdist. The average geodesic (graph-theoretic) distance between pairs of alters.
This is only computed for networks in which every alter is reachable from every
other.

Diameter. The longest geodesic distance within the ego network (unless
infinite).

NweakComp. The number of weak components in the ego network.

PweakComp. The number of weak components as a percentage of the number of
alters.

2StepReach. The number of alters that are within 2 links of ego.

ReachEffic. 2-step reach as a percentage of the number of alters plus the sum of
their network sizes.
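
The first four measures (Size, Ties, Pairs, Density) can be sketched as below for the UNDIRECTED neighborhood. This is a minimal illustration, not UCINET's implementation; the function name ego_measures is hypothetical, and counting ties and pairs as ordered (directed) pairs is an assumption.

```python
def ego_measures(adj, ego):
    """Size, Ties, Pairs and Density for one ego (undirected neighborhood)."""
    n = len(adj)
    # Alters are all actors with a tie to or from ego.
    alters = [j for j in range(n) if j != ego and (adj[ego][j] or adj[j][ego])]
    size = len(alters)
    # Ties among alters only; ties involving ego are excluded.
    ties = sum(1 for a in alters for b in alters if a != b and adj[a][b])
    # Assumption: potential ties are counted as ordered pairs of alters.
    pairs = size * (size - 1)
    density = 100.0 * ties / pairs if pairs else 0.0
    return {"Size": size, "Ties": ties, "Pairs": pairs, "Density": density}


# Ego 0 is tied to actors 1, 2 and 3; among the alters only 1 and 2 are tied.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
print(ego_measures(adj, 0))
```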

TIMING O(N^3)
COMMENTS None

REFERENCES None
NETWORK > EGO NETWORKS > STRUCTURAL HOLES

PURPOSE Compute measures of structural holes.

DESCRIPTION Compute several measures of structural holes, including all of the measures
developed by Ron Burt. The measures are computed for all nodes in the network,
treating each one in turn as ego.

PARAMETERS
Input dataset:
Name of file containing network to analyze. Data type: Directed Graph.

Output structural holes dataset: (default = 'holes')


Name of actor-by-variable matrix to hold structural hole measures.

Output dyadic redundancy dataset: (default = 'redund').


Name of actor-by-actor matrix that indicates the extent to which the column
actor (an alter) is a redundant contact for the row actor (ego).

Output dyadic constraint dataset: (default = 'const').


Name of actor-by-actor matrix that indicates the extent to which the row actor
(ego) is constrained by each other actor in its ego network.

LOG FILE Three tables are output. First is the set of monadic (nodal) structural hole
measures based on redundancy and constraint. The following measures are
displayed:

effsize. Burt's measure of the effective size of ego's network (essentially, the
number of alters minus the average degree of alters within the ego network, not
counting ties to ego).

efficiency. The effective size divided by the number of alters in ego's network.

constraint. Burt's constraint measure (equation 2.4, pg. 55 of Burt, 1992).
Essentially a measure of the extent to which ego is invested in people who are
invested in ego's other alters.

hierarchy. Burt's adjustment of constraint (equation 2.9, pg. 71), indicating the
extent to which constraint on ego is concentrated in a single alter.

The second table is the dyadic redundancy matrix. For each ego (rows) it gives
the extent to which each of its alters are tied to all of ego's other alters (i.e., the
extent to which the alter is redundant).

The third table is the dyadic constraint matrix. For each ego (rows) it gives the
extent to which it is constrained by each of its alters. Ego is constrained by alter j
if (a) j represents a large proportion of ego's relational investment, and (b) if ego
is heavily invested in other people who are in turn heavily invested in j. In short,
j constrains ego if ego is heavily invested in j directly and indirectly.
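
The effsize, efficiency and constraint measures can be sketched as below for binary data. This is an illustration under stated assumptions, not UCINET's code: the function name structural_holes is hypothetical, effective size uses the simplified description given above (alters minus the average degree of alters), and the proportional investments p(i, j) are computed over the whole network, which is only one of several possible conventions.

```python
def structural_holes(adj, ego):
    """Effective size, efficiency and constraint of one ego (binary data)."""
    n = len(adj)
    alters = [j for j in range(n) if j != ego and (adj[ego][j] or adj[j][ego])]
    if not alters:
        return {"effsize": 0.0, "efficiency": 0.0, "constraint": 0.0}

    # Degree of each alter inside the ego network, not counting ties to ego.
    alter_degree = [
        sum(1 for b in alters if b != a and (adj[a][b] or adj[b][a]))
        for a in alters
    ]
    effsize = len(alters) - sum(alter_degree) / len(alters)
    efficiency = effsize / len(alters)

    def p(i, j):
        # Proportion of i's relational investment that goes to j.
        total = sum(adj[i][k] + adj[k][i] for k in range(n) if k != i)
        return (adj[i][j] + adj[j][i]) / total if total else 0.0

    constraint = 0.0
    for j in alters:
        # Indirect investment in j through every third party q.
        indirect = sum(p(ego, q) * p(q, j) for q in range(n) if q not in (ego, j))
        constraint += (p(ego, j) + indirect) ** 2
    return {"effsize": effsize, "efficiency": efficiency, "constraint": constraint}


triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # ego's two alters are tied
open_pair = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]  # ego's two alters are not tied
print(structural_holes(triangle, 0))
print(structural_holes(open_pair, 0))
```

In the open pair ego spans a structural hole, so its effective size is higher and its constraint lower than in the closed triangle.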

TIMING O(N^3)
REFERENCES Burt, R.S. 1992. Structural Holes: The social structure of competition.
Cambridge: Harvard University Press.
NETWORK > CENTRALITY > DEGREE
PURPOSE Calculates the degree and normalized degree centrality of each vertex and gives
the overall network degree centralization.

DESCRIPTION The number of vertices adjacent to a given vertex in a symmetric graph is the
degree of that vertex. For non-symmetric data the in-degree of a vertex u is the
number of ties received by u and the out-degree is the number of ties initiated by
u. In addition if the data is valued then the degrees (in and out) will consist of
the sums of the values of the ties. The normalized degree centrality is the degree
divided by the maximum possible degree expressed as a percentage. The
normalized values should only be used for binary data.

For a given binary network with vertices v1....vn and maximum degree centrality
cmax, the network degree centralization measure is Σ(cmax - c(vi)) divided by the
maximum value possible, where c(vi) is the degree centrality of vertex vi.

The routine calculates these measures and some descriptive statistics based on
these measures. Directed graphs may be symmetrized and the analysis is
performed as above, or an analysis of the in and out degrees can be performed.
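
For symmetric binary data the three quantities above can be sketched as follows. This is a minimal illustration, not UCINET's code; the function name degree_centrality is hypothetical, and the centralization denominator (N-1)(N-2), the value attained by a star, is the standard Freeman maximum for undirected graphs with at least three vertices.

```python
def degree_centrality(adj):
    """Degree, normalized degree (%) and degree centralization (%)."""
    n = len(adj)
    # Self loops on the diagonal are ignored.
    degree = [sum(1 for j in range(n) if j != i and adj[i][j]) for i in range(n)]
    # The maximum possible degree is n - 1.
    ndegree = [100.0 * d / (n - 1) for d in degree]
    cmax = max(degree)
    # The sum of differences is largest for a star: (n - 1)(n - 2).
    centralization = 100.0 * sum(cmax - d for d in degree) / ((n - 1) * (n - 2))
    return degree, ndegree, centralization


# A star with vertex 0 at the centre.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
print(degree_centrality(star))
```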

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued Graph

Treat data as symmetric: (Default = Yes).


If Yes, directed data is automatically converted to undirected by taking the
underlying graph.
No gives a separate analysis for in and out-degrees.

Count reflexive ties (diagonal values)? (Default = No).


No means that self loops are ignored.

Output dataset: (Default = 'FreemanDegree').

Name of file which will contain degree and normalized degree centrality of each
vertex.

LOG FILE A table which contains a list of the degree and normalized degree (n Degree)
centralities expressed as a percentage for each vertex.
Descriptive statistics which give the mean, standard deviation, variance,
minimum value and maximum value for each list generated. This is followed by
the degree network centralization index expressed as a percentage.

For directed data the tables are the same as for undirected except that separate
values are calculated for in and out degrees.

TIMING O(N).

COMMENTS Degree centrality measures network activity. For valued data the non-normalized
values should be used and the degree centralization should be ignored.

REFERENCES Freeman L C (1979). 'Centrality in Social Networks: Conceptual clarification',
Social Networks 1, 215-239.
NETWORK > CENTRALITY > CLOSENESS
PURPOSE Calculates the farness and normalized closeness centrality of each vertex and
gives the overall network closeness centralization.

DESCRIPTION The farness of a vertex is the sum of the lengths of the geodesics to every other
vertex. The reciprocal of farness is closeness centrality. The normalized
closeness centrality of a vertex is the minimum possible farness divided by the
farness, expressed as a percentage. As an alternative to taking
the reciprocal after the summation, the reciprocals can be taken before. In this
case the closeness is the sum of the reciprocated distances so that infinite
distances contribute a value of zero. This can also be normalized by dividing by
the maximum value. In addition the routine allows the user to measure
distance by the sums of the lengths of all the paths or all the trails. If the data is
directed the routine calculates separate measures for in-closeness and
out-closeness.

For a given network with vertices v1....vn and maximum closeness centrality cmax,
the network closeness centralization measure is Σ(cmax - c(vi)) divided by the
maximum value possible, where c(vi) is the closeness centrality of vertex vi.

The routine calculates centrality, network closeness centralization and some
descriptive statistics based on these measures for symmetric and directed graphs.
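
The Freeman and reciprocal-distance variants can be sketched as below for binary data. This is an illustration only, not UCINET's code; the function names are hypothetical, and returning None for the farness of a vertex that cannot reach every other vertex is an assumption about how to treat infinite distances.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distances from source; None marks an unreachable vertex."""
    n = len(adj)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if adj[u][v] and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj):
    """Per vertex: (farness, normalized closeness %, sum of reciprocal distances)."""
    n = len(adj)
    rows = []
    for i in range(n):
        dist = bfs_distances(adj, i)
        reachable = [dist[j] for j in range(n) if j != i and dist[j] is not None]
        # Farness is only defined here when every other vertex is reachable.
        farness = sum(reachable) if len(reachable) == n - 1 else None
        # The minimum possible farness is n - 1 (every other vertex at distance 1).
        ncloseness = 100.0 * (n - 1) / farness if farness else None
        # Unreachable vertices contribute zero to the reciprocal version.
        reciprocal = sum(1.0 / d for d in reachable)
        rows.append((farness, ncloseness, reciprocal))
    return rows


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # the path 0 - 1 - 2
for row in closeness(path):
    print(row)
```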

PARAMETERS

Input dataset:
Name of file containing network to be analyzed. Data type: Digraph

Type:
Choices are:
Freeman (geodesic paths) - distances are lengths of geodesic paths, the standard
Freeman measure.
Reciprocal Distances - distances are the reciprocal of the lengths of the geodesic
paths.
All paths - distances between actors are the sums of the distances on all paths
connecting them.
All trails - distances between the actors are the sums of the distances on all trails
connecting them.

Output Dataset: (Default = 'Closeness')


Name of file which will contain farness and normalized closeness centrality of
each vertex.

LOG FILE A table which contains a list of the farness (or closeness) and normalized
closeness centrality expressed as a percentage, for each vertex. Descriptive
statistics which give the mean, standard deviation, variance, minimum value and
maximum value for both lists. This is followed by the closeness network
centralization index expressed as a percentage. If the data is directed then
separate in and out values are calculated.

TIMING O(N^3) for Freeman and reciprocal distances, the other two can be exponential.
COMMENTS Closeness centrality can be thought of as an index of the expected time-until-arrival
for things flowing through the network via optimal paths.

REFERENCES Freeman L C (1979). 'Centrality in Social Networks: Conceptual clarification'.
Social Networks 1, 215-239.
NETWORK > CENTRALITY > BETWEENNESS > NODES
PURPOSE Calculates the betweenness and normalized betweenness centrality of each vertex
and gives the overall network betweenness centralization.

DESCRIPTION Let bjk be the proportion of all geodesics linking vertex j and vertex k which pass
through vertex i. The betweenness of vertex i is the sum of all bjk where i, j and
k are distinct. Betweenness is therefore a measure of the number of times a
vertex occurs on a geodesic. The normalized betweenness centrality is the
betweenness divided by the maximum possible betweenness expressed as a
percentage.

For a given network with vertices v1....vn and maximum betweenness centrality
cmax, the network betweenness centralization measure is Σ(cmax - c(vi)) divided by
the maximum value possible, where c(vi) is the betweenness centrality of vertex
vi.

The routine calculates these measures, and some descriptive statistics based on
these measures, for symmetric and unsymmetric graphs.
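
The definition of bjk translates fairly directly into code: a geodesic from j to k passes through i exactly when d(j,i) + d(i,k) = d(j,k), and the number of such geodesics is the product of the geodesic counts on the two halves. The sketch below follows that idea for binary data. It is an illustration, not UCINET's code; the function names are hypothetical, and halving the total for symmetric data (so that each unordered pair is counted once) is an assumption.

```python
from collections import deque


def geodesic_counts(adj, source):
    """Distances from source and the number of geodesics to each vertex."""
    n = len(adj)
    dist = [None] * n
    count = [0] * n
    dist[source], count[source] = 0, 1
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if not adj[u][v]:
                continue
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]  # every geodesic to u extends to v
    return dist, count


def betweenness(adj, symmetric=True):
    """Betweenness and normalized betweenness (%) of every vertex."""
    n = len(adj)
    info = [geodesic_counts(adj, s) for s in range(n)]
    scores = [0.0] * n
    for j in range(n):
        dist_j, count_j = info[j]
        for k in range(n):
            if k == j or dist_j[k] is None:
                continue
            for i in range(n):
                if i in (j, k) or dist_j[i] is None:
                    continue
                dist_i, count_i = info[i]
                if dist_i[k] is not None and dist_j[i] + dist_i[k] == dist_j[k]:
                    scores[i] += count_j[i] * count_i[k] / count_j[k]
    if symmetric:
        scores = [s / 2.0 for s in scores]  # each unordered pair was seen twice
    maximum = (n - 1) * (n - 2) / (2.0 if symmetric else 1.0)
    return scores, [100.0 * s / maximum for s in scores]


star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(betweenness(star))
```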

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph.

Output dataset: (Default = 'FreemanBetweenness').


Name of file which will contain betweenness and normalized betweenness
centrality of each vertex.

LOG FILE A table which contains a list of the betweenness and normalized betweenness
centrality expressed as a percentage for each vertex.
Descriptive statistics which give the mean, standard deviation, variance,
minimum value and maximum value for both lists. This is followed by the
betweenness network centralization index expressed as a percentage.

TIMING O(N^3).

COMMENTS Betweenness centrality measures information control.

Care should be taken in interpreting betweenness for directed data.

REFERENCES Freeman L C (1979). 'Centrality in Social Networks: Conceptual Clarification'.
Social Networks 1, 215-239.
NETWORK > CENTRALITY > REACH CENTRALITY

PURPOSE Counts the number of nodes each node can reach in k or fewer steps. For k = 1, this
is equivalent to degree centrality. For directed networks, both in-reach and out-
reach are calculated.

DESCRIPTION The input is a binary network. The output is a node by distance matrix X in
which xij indicates the proportion of nodes that node i can reach in j or fewer
steps. In a connected network, each row will eventually reach 1 (100%). The
routine also calculates the eccentricity of each node, that is, the distance from the
node in question to the node that is furthest away from it.

In addition, the routine calculates some descriptive statistics based on these
measures for symmetric graphs.
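
The node by distance matrix and the eccentricities can be sketched as below for binary data. This is an illustration, not UCINET's code; the function names are hypothetical, and dividing by N-1 (so that a node does not count itself) is an assumption about how the proportion is formed.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distances from source; None marks an unreachable node."""
    n = len(adj)
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if adj[u][v] and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def reach_centrality(adj):
    """Proportion of nodes reached within 1, 2, ... steps, plus eccentricity."""
    n = len(adj)
    finite = []
    for i in range(n):
        dist = bfs_distances(adj, i)
        finite.append([d for j, d in enumerate(dist) if j != i and d is not None])
    # Eccentricity: distance to the furthest reachable node (0 for an isolate).
    eccentricity = [max(row) if row else 0 for row in finite]
    max_k = max(eccentricity)
    proportions = [
        [sum(1 for d in row if d <= k) / (n - 1) for k in range(1, max_k + 1)]
        for row in finite
    ]
    return proportions, eccentricity


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # the path 0 - 1 - 2
print(reach_centrality(path))
```

For directed data the same function applied to the transposed matrix would give the in-reach values.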

PARAMETERS

Input dataset:
Name of file containing network to be analyzed. Data type: Digraph

Output Dataset: (Default = 'ReachCentrality')


Name of file which will contain reach proportions for each node at each level of
distance.

LOG FILE A table that gives the proportion of nodes reached by each node at each level of
distance. The proportion is expressed as a value from zero to one. A value of x in
row i column j means that 100x% of nodes are reachable from i in a path of
length j or less. For directed data, both the proportion of nodes that can be reached from
the node and the proportion that can reach it are reported. Descriptive statistics
which give the mean, standard deviation, variance, minimum value and
maximum value for the proportion are given.
Finally the eccentricity of each node is given; for directed data both in and out
eccentricity are calculated.

TIMING O(N^2).

COMMENTS When searching for key individuals who are well positioned to reach many
people in a small number of steps, this measure provides a natural metric for
assessing each node.

REFERENCES
NETWORK > CENTRALITY > BETWEENNESS >LINES

PURPOSE Calculates the betweenness centrality of each line.

DESCRIPTION Let bjk be the proportion of all geodesics linking vertex j and vertex k which pass
through edge i. The betweenness of edge i is the sum of all bjk where j and k are
distinct. Betweenness is therefore a measure of the number of times an edge
occurs on a geodesic.
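
Edge betweenness can be sketched in the same way as vertex betweenness: a geodesic from j to k uses the edge (u,v) exactly when d(j,u) + 1 + d(v,k) = d(j,k). The code below is an illustration for binary data, not UCINET's code; the function names are hypothetical, each direction of an edge is scored separately, and the loops are written for clarity rather than to match the stated timing.

```python
from collections import deque


def geodesic_counts(adj, source):
    """Distances from source and the number of geodesics to each vertex."""
    n = len(adj)
    dist = [None] * n
    count = [0] * n
    dist[source], count[source] = 0, 1
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if not adj[u][v]:
                continue
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                count[v] += count[u]
    return dist, count


def edge_betweenness(adj):
    """Matrix whose (u, v) entry is the betweenness of the edge u -> v."""
    n = len(adj)
    info = [geodesic_counts(adj, s) for s in range(n)]
    result = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u == v or not adj[u][v]:
                continue
            dist_v, count_v = info[v]
            for j in range(n):
                dist_j, count_j = info[j]
                if dist_j[u] is None:
                    continue
                for k in range(n):
                    if k == j or dist_j[k] is None or dist_v[k] is None:
                        continue
                    # Geodesics j..u, then the edge u -> v, then geodesics v..k.
                    if dist_j[u] + 1 + dist_v[k] == dist_j[k]:
                        result[u][v] += count_j[u] * count_v[k] / count_j[k]
    return result


path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # the path 0 - 1 - 2
for row in edge_betweenness(path):
    print(row)
```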

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph.

Output dataset: (Default = 'EdgeBetweenness').


Name of file which will contain the betweenness centrality of each edge.

LOG FILE A matrix in which the i,j th entry gives the edge betweenness of the edge (i,j).

TIMING O(N^3).

COMMENTS Betweenness centrality measures information control.

REFERENCES Freeman L C (1979). 'Centrality in Social Networks: Conceptual Clarification'.
Social Networks 1, 215-239.
NETWORK > CENTRALITY > BETWEENNESS >HIERARCHICAL
REDUCTION
PURPOSE Produces a hierarchically nested set of vertices based on betweenness.

DESCRIPTION The betweenness of each vertex is calculated and those with a score of zero are
deleted, the procedure is then repeated on the reduced graph until all vertices
have been deleted. Initially all vertices are placed in the hierarchy and then at
each level the deleted vertices are removed.

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph.

(Output) (Default = 'hierbet').


Name of file which will contain the partition vector. The vector consists of a
single row with each column corresponding to a vertex. A value k in column i
means that actor i was deleted after k iterations.

(Output) Partition (Default =hierbetpart)


Name of dataset to contain the partition-by-item incidence matrix. Each column
of this matrix corresponds to a cluster labeled by the level of the cluster. A value
of 1 in a column labeled x and row j means that actor j was in the cluster at level
x.

LOG FILE The partition vector described above. A cluster diagram in which the columns
have been re-arranged so that actors in the same cluster at each level are
consecutive. A value of 1 in a row labeled x and column labelled j means that
actor j was in the cluster at level x.

TIMING O(N^3).

COMMENTS

REFERENCES Freeman L C (1979). 'Centrality in Social Networks: Conceptual Clarification'.
Social Networks 1, 215-239.
NETWORK > EGO NETWORKS > BROKERAGE

PURPOSE Calculates the brokerage measures proposed by Gould & Fernandez (1989).

DESCRIPTION Given (a) a graph, and (b) a partition of nodes, this procedure calculates
measures of five kinds of brokerage. Brokerage occurs when, in a triad of nodes
A, B and C, A has a tie to B, and B has a tie to C, but A has no tie to C. That is, A
needs B to reach C, and B is therefore a broker. When A, B, and C may belong to
different groups, 5 kinds of brokerage are possible. The five kinds are named
using terminology from social roles. In the description below, the notation G(x)
is used to indicate the group that node x belongs to. Important: It is assumed that
a-->b-->c. For example, a (the source node) gives information to b (the broker),
who gives information to c (the destination node).

Coordinator. Counts the number of times b is a broker and G(a) = G(b) = G(c),
that is, all three nodes belong to the same group.

Consultant. Counts the number of times b is a broker and G(a) = G(c), but G(b) ≠
G(a); that is, the broker belongs to one group, and the other two belong to a
different group.

Gatekeeper. Counts the number of times b is a broker and G(a) ≠ G(b) and G(b)
= G(c), that is, the source node belongs to a different group.

Representative. Counts the number of times b is a broker and G(a) = G(b) and
G(c) ≠ G(b). That is, the destination node belongs to a different group.

Liaison. Counts the number of times b is a broker and G(a) ≠ G(b) ≠ G(c). That
is, each node belongs to a different group.

When b is not the only intermediary between a and c, it is possible to give b only
partial credit. That is, if there are two paths of length two between a and c, one of
which involves b, we can choose to give b only 1/2 point instead of a full point.
This is an option in the program.

The routine calculates these measures for each node in the network, and also the
total of the five.

The program also computes the expected values of each brokerage measure given
the number of groups and the size of each group. That is, the expected values
under the assumption that brokerage is independent of the group status of nodes.
A final output divides the observed brokerage values by these expected scores.
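
The five role counts, with and without partial credit, can be sketched as follows. This is an illustration of the definitions above, not UCINET's code; the function name brokerage and the role-keyed dictionaries are hypothetical, and the expected values are not computed.

```python
def brokerage(adj, group, weighted=False):
    """Brokerage role scores per node for directed ties a -> b -> c."""
    n = len(adj)
    roles = ("Coordinator", "Gatekeeper", "Representative", "Consultant", "Liaison")
    scores = [dict.fromkeys(roles, 0.0) for _ in range(n)]
    for a in range(n):
        for c in range(n):
            if a == c or adj[a][c]:
                continue  # brokerage requires that a has no tie to c
            brokers = [b for b in range(n)
                       if b not in (a, c) and adj[a][b] and adj[b][c]]
            # Weighted: share one point among all intermediaries of (a, c).
            credit = 1.0 / len(brokers) if weighted and brokers else 1.0
            for b in brokers:
                ga, gb, gc = group[a], group[b], group[c]
                if ga == gb == gc:
                    role = "Coordinator"
                elif ga == gc:
                    role = "Consultant"      # broker is the outsider
                elif gb == gc:
                    role = "Gatekeeper"      # source is the outsider
                elif ga == gb:
                    role = "Representative"  # destination is the outsider
                else:
                    role = "Liaison"         # all three groups differ
                scores[b][role] += credit
    return scores


# Ties 0 -> 1 -> 2 and 0 -> 3 -> 2; node 0 has no tie to node 2.
adj = [
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
group = [1, 1, 2, 3]
print(brokerage(adj, group)[1])
print(brokerage(adj, group, weighted=True)[3])
```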

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Digraph

Partition vector:
The name of a UCINET dataset that contains a partition of the actors. To
partition the data matrix into groups specify a vector by giving the dataset name,
a dimension (either row or column) and an integer value. For example, to use the
second row of a dataset called ATTRIB, enter "ATTRIB ROW 2". The program
will then read the second row of ATTRIB and use that information to define the
groups. All actors with identical values on the criterion vector (i.e. the second
row of attrib) will be placed in the same group.
Method: (default = 'unweighted')
Choices are 'unweighted' and 'weighted'. Unweighted directs the program to
simply count up the number of times that a given node b is in a brokering
position, regardless of how many other nodes are serving the same function with
the same pair of endpoints a and c. Weighted directs the program to give partial
scores in inverse proportion to the number of alternatives.

(Output) Un-normalized Brokerage


Name of the file containing the raw count of scores for each type of brokerage.

(Output) Normalized Brokerage


Name of file containing brokerage scores divided by the expected values.

LOG FILE 1) A table giving the brokerage scores for each node.
2) A table giving the brokerage scores divided by the expected values.
3) A table giving the expected values.

TIMING O(n^3).

COMMENTS None

REFERENCES Gould, R.V. and Fernandez, R.M. 1989. Structures of mediation: A formal approach to
brokerage in transaction networks. Sociological Methodology: 89-126.
NETWORK > CENTRALITY > FLOW BETWEENNESS
PURPOSE Calculates the flow betweenness and normalized flow
betweenness centrality of each vertex and gives the overall
network betweenness centralization.

DESCRIPTION Let mjk be the amount of flow between vertex j and vertex k
which must pass through i for any maximum flow. The flow
betweenness of vertex i is the sum of all mjk where i, j and k are
distinct and j < k. The flow betweenness is therefore a measure
of the contribution of a vertex to all possible maximum flows.

The normalized flow betweenness centrality of a vertex i is the
flow betweenness of i divided by the total flow through all pairs
of points where i is not a source or sink.

For a given binary network with vertices v1....vn and maximum flow betweenness
centrality cmax, the network flow betweenness centralization measure is Σ(cmax -
c(vi)) divided by the maximum value possible, where c(vi) is the flow
betweenness centrality of vertex vi.

The routine calculates these measures, and some descriptive statistics based on
these measures for symmetric, unsymmetric and valued graphs.

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued symmetric
graph - integer values only.

Output dataset: (Default = 'FlowBetweenness').


Name of file which will contain flow-betweenness and normalized flow
betweenness centrality of each vertex.

LOG FILE The maximum flow matrix. This gives the maximum flow between all pairs of
vertices - the diagonals give the network size.

A table which contains a list of the flow-betweenness and normalized flow
betweenness (nFlowbet) centrality expressed as a percentage for each vertex.
Descriptive statistics which give the mean, standard deviation, variance,
minimum value and maximum value for both lists.

This is followed by the flow betweenness network centralization index expressed
as a percentage.

TIMING O(N^4).

COMMENTS The measure is based upon the concept of information flow. In valued data the
values should in some way correspond to the capacity for flow, hence valued data
should represent similarity.

REFERENCES Freeman L C, Borgatti S P and White D R (1991). 'Centrality in valued graphs: A
measure of betweenness based on network flow'. Social Networks 13, 141-154.
NETWORK > CENTRALITY > EIGENVECTOR
PURPOSE Calculates the eigenvector of the largest positive eigenvalue as a measure of
centrality.

DESCRIPTION Given an adjacency matrix A, the centrality of vertex i (denoted ci), is given by
ci = αΣAijcj, where α is a parameter and the sum is over j. The centrality of each vertex is therefore
determined by the centrality of the vertices it is connected to. The parameter α is
required to give the equations a non-trivial solution and is therefore the
reciprocal of an eigenvalue. It follows that the centralities will be the elements of
the corresponding eigenvector. The normalized eigenvector centrality is the
scaled eigenvector centrality divided by the maximum difference possible
expressed as a percentage.

For a given binary network with vertices v1....vn and maximum eigenvector
centrality cmax, the network eigenvector centralization measure is Σ(cmax - c(vi))
divided by the maximum value possible, where c(vi) is the eigenvector centrality
of vertex vi.

This routine calculates these measures and some descriptive statistics based on
these measures. This routine only handles symmetric data and in these
circumstances the eigenvalues provide a measure of the accuracy of the centrality
measure. To help interpretation the routine calculates all positive eigenvalues but
only gives the eigenvector corresponding to the largest eigenvalue.
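
The eigenvector for the largest eigenvalue can be approximated by power iteration, sketched below for symmetric data. This is an illustration only: UCINET's own numerical method is not documented here, the function name eigenvector_centrality is hypothetical, the result is scaled to unit length (one possible scaling), and iterating with A + I instead of A is a device assumed here to avoid oscillation on bipartite graphs such as stars.

```python
import math


def eigenvector_centrality(adj, iterations=1000, tol=1e-12):
    """Principal eigenvector (unit length) and its eigenvalue, symmetric data."""
    n = len(adj)
    c = [1.0] * n
    for _ in range(iterations):
        # Multiply by (A + I): same eigenvectors as A, eigenvalues shifted by 1.
        new = [c[i] + sum(adj[i][j] * c[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        new = [x / norm for x in new]
        converged = max(abs(x - y) for x, y in zip(new, c)) < tol
        c = new
        if converged:
            break
    # Rayleigh quotient gives the eigenvalue of A itself.
    a_times_c = [sum(adj[i][j] * c[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(x * y for x, y in zip(c, a_times_c))
    return c, eigenvalue


star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
centrality, eigenvalue = eigenvector_centrality(star)
print(centrality, eigenvalue)
```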

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued Graph
(Symmetric data only).

Output dataset: (Default = 'BonacichCentrality').


Name of file which will contain eigenvector centrality measure for each vertex.

LOG FILE A table of positive eigenvalues. The eigenvalues are placed in descending order
under the heading VALUE. The table gives information on 'how dominant' the
largest eigenvalue is. The table gives the percentage and cumulative percentage
of the total eigenvalue sum for each eigenvalue. The ratio of each eigenvalue to
the next largest is also presented.

This is followed by a list of vertices which contains the eigenvector and
normalized eigenvector centrality measure for every vertex. These values should
be interpreted in terms of an interval scale.

Finally the network eigenvector centralization index expressed as a percentage is
given.

TIMING O(N^3).

COMMENTS The ratio of the largest eigenvalue to the next largest should be at least 1.5 and
preferably 2.0 or more for the centrality measure to be robust. If this is not the
case then a full factor analysis should be undertaken.

REFERENCES Bonacich P (1972). Factoring and Weighting Approaches to status scores and
clique identification. Journal of Mathematical Sociology 2, 113-120.
NETWORK > CENTRALITY > POWER
PURPOSE Compute Bonacich's power based centrality measure for every vertex and give an
overall network centralization index for this centrality measure.

DESCRIPTION Given an adjacency matrix A, the centrality of vertex i (denoted ci), is given by
ci = ΣAij(α + βcj), where α and β are parameters and the sum is over j. The centrality of each vertex is
therefore determined by the centrality of the vertices it is connected to.

The value of α is used to normalize the measure; the value of β is an attenuation
factor which gives the amount of dependence of each vertex's centrality on the
centralities of the vertices it is adjacent to. The normalization parameter is
automatically selected so that the sum of squares of the vertex centralities is the
size of the network.

The parameter β is selected by the user. Negative values should be selected if an
individual's power is increased by being connected to vertices with low power
and positive values selected if an individual's power is increased by being
connected to vertices with high power.

For a given binary network with vertices v1....vn and maximum power centrality
cmax, the network power centralization measure is Σ(cmax - c(vi)) divided by the
maximum value possible, where c(vi) is the power centrality of vertex vi.

The routine calculates power centrality and some descriptive statistics of the
measure.
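
The defining equation can be solved by repeated substitution whenever the attenuation factor is smaller in absolute value than the reciprocal of the largest eigenvalue. The sketch below does this and then rescales so that the sum of squares equals the size of the network. It is an illustration, not UCINET's code; the function name bonacich_power is hypothetical, and fixed-point iteration is a swapped-in technique chosen for simplicity in place of a matrix inverse.

```python
import math


def bonacich_power(adj, beta, iterations=10000, tol=1e-12):
    """Power centrality: c = A(1 + beta*c), rescaled so sum of squares = n."""
    n = len(adj)
    c = [0.0] * n
    for _ in range(iterations):
        # One substitution step with the normalizing parameter set to 1.
        new = [sum(adj[i][j] * (1.0 + beta * c[j]) for j in range(n))
               for i in range(n)]
        converged = max(abs(x - y) for x, y in zip(new, c)) < tol
        c = new
        if converged:
            break
    length = math.sqrt(sum(x * x for x in c))
    if length == 0.0:
        return c  # a network with no ties
    scale = math.sqrt(n) / length  # plays the role of the normalizing parameter
    return [scale * x for x in c]


star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
print(bonacich_power(star, 0.0))   # proportional to degree
print(bonacich_power(star, -0.3))  # being tied to weak actors adds power
```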

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued Graph
(Symmetric data only).

Value of attenuation factor (Beta): (Default = 0.0).


A value of 0 gives a centrality measure directly proportional to the degree of
each vertex. Positive values give weight to being connected to powerful actors,
negative values give weight to being connected to low powered actors. Larger
values in modulus give greater weight to actors further away.

Output dataset: (Default = 'BonacichPower').


Name of file which contains power centrality measure for every vertex.

LOG FILE A table which contains the power centrality of each actor.
Descriptive statistics which give the mean, standard deviation, variance,
minimum value and maximum value for the measure.

TIMING O(N^3).

COMMENTS It is advisable to select b so that its absolute value is less than the absolute value
of the reciprocal of the largest eigenvalue of the adjacency matrix. An upper
bound on the eigenvalues is given by the largest row (or column) sum of
the matrix.
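
As an illustration of this advice, the sketch below compares the exact limit on
b with the cheaper row-sum bound. The example matrix and the variable names are
assumptions made for the illustration only.

```python
import numpy as np

# Small symmetric example network (hypothetical data).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Exact limit: reciprocal of the largest eigenvalue in absolute value.
largest_eig = np.max(np.abs(np.linalg.eigvals(A)))
beta_limit = 1.0 / largest_eig

# Cheap bound: no eigenvalue exceeds the largest absolute row sum.
row_sum_bound = np.max(np.abs(A).sum(axis=1))

# A conservative choice that stays strictly inside the limit.
safe_beta = 0.9 / row_sum_bound
```

Because the row-sum bound can only overestimate the largest eigenvalue, a value
of b chosen from it is never too large, although it may be smaller than necessary.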

REFERENCES Bonacich P (1987). Power and Centrality: A Family of Measures. American
Journal of Sociology 92, 1170-1182.
NETWORK>CONNECTIONS>HUBBELL/KATZ (INFLUENCE)

PURPOSE Calculate the influence measure between every pair of vertices using the models
of Hubbell, Katz or Taylor.

DESCRIPTION Successive powers of matrices provide measures of influence since they
enumerate the number of possible walks of given length between all pairs of
nodes. Since longer walks are assumed to contribute less in terms of influence, an
attenuation factor is included and the sum of all walks is taken. Hubbell includes
the identity matrix in the series whereas Katz does not.

For Hubbell the influence matrix is I + the sum over i >= 1 of (bA)^i, which equals
the inverse of (I - bA) provided the series converges. It follows that for Katz the
influence matrix is the inverse of (I - bA), minus I, under the same condition.
Taylor's measure is a normalized version of the Katz measure: for each power in
the series, subtract the column marginals from the row marginals and normalize
by the total number of walks of that length.
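
A minimal sketch of the Hubbell and Katz matrices under these definitions is
given below. The function name influence and its arguments are hypothetical,
and Taylor's normalization is omitted.

```python
import numpy as np

def influence(A, beta, method="hubbell"):
    """Influence matrix sketch: inverse of (I - beta*A), minus I for Katz."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # The closed form is only valid when the series of powers converges.
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if spectral_radius > 0 and abs(beta) >= 1.0 / spectral_radius:
        raise ValueError("beta is too large: the series does not converge")
    hubbell = np.linalg.inv(np.eye(n) - beta * A)
    if method == "hubbell":
        return hubbell
    if method == "katz":
        return hubbell - np.eye(n)
    raise ValueError("method must be 'hubbell' or 'katz'")

# Directed chain 0 -> 1 -> 2: one walk of length 1 per tie, one of length 2.
chain = np.array([[0, 1, 0],
                  [0, 0, 1],
                  [0, 0, 0]])
katz = influence(chain, beta=0.5, method="katz")
hubbell = influence(chain, beta=0.5, method="hubbell")
```

For this chain the Katz entry for the pair (0, 2) is 0.25, that is, one walk of
length two attenuated by 0.5 squared.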

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued graph.

Computational Method:
Choices are:

Hubbell - influence matrix defined by the inverse of (I - bA), where A is the
adjacency matrix and b is the attenuation factor.

Katz - influence matrix defined by the inverse of (I - bA), minus I, where A is the
adjacency matrix and b is the attenuation factor.

Taylor - takes the Katz influence matrix, subtracts the column marginals from the
row marginals and normalizes.

Attenuation Factor (Beta): (Default = 0.5)


The value of the attenuation factor. This value should be smaller than the
reciprocal of the absolute value of the dominant eigenvalue. This can be
guaranteed by using the simple bound that all eigenvalues are smaller than the
largest row (or column) sum.

Divide matrix by overall sum: (Default = NO)


Dividing the initial matrix by the sum of all its elements guarantees that the
series will converge.

Output dataset:(Default = 'Influence')


Name of file which will contain the influence matrix. Row i column j will give
actor i's influence over actor j.

LOG FILE Influence matrix.

TIMING O(N^3).

COMMENTS None.
REFERENCES Hubbell C H (1965). 'An input-output approach to clique identification'.
Sociometry, 28, 377-399.

Katz L (1953). 'A new status index derived from sociometric data analysis'.
Psychometrika, 18, 34-43.

Taylor M (1969). 'Influence structures'. Sociometry 32, 490-502.


NETWORK > CENTRALITY > INFORMATION
PURPOSE Calculate the Stephenson and Zelen information centrality measure for each
vertex, and give an overall network information centralization index.

DESCRIPTION The weighted function of the set of all paths connecting vertex i to vertex j is any
weighted linear combination of the paths such that the sum of the weights is
unity. Assuming that each link in a path is independent, and the variance of a
single link is unity, it can be concluded that the variance of a path is simply its
length.

The information measure between two vertices i and j is the inverse of the
variance of the weighted function. The information centrality of a vertex i is the
harmonic mean of all the information measures between i and all other vertices in
the network.
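
The sketch below follows the usual matrix form of the Stephenson and Zelen
measure for an unvalued, connected, symmetric graph. It is an illustration
under those assumptions, and the function name is hypothetical.

```python
import numpy as np

def information_centrality(A):
    """Information centrality sketch for a connected symmetric binary graph."""
    A = np.asarray(A, dtype=float).copy()
    np.fill_diagonal(A, 0.0)                  # self-loops ignored
    n = A.shape[0]
    # B = (degree matrix - adjacency) + matrix of ones; C is its inverse.
    B = np.diag(A.sum(axis=1)) - A + np.ones((n, n))
    C = np.linalg.inv(B)
    T = np.trace(C)
    R = C.sum(axis=1)
    # Equivalent to the harmonic mean of the pairwise information measures
    # 1 / (c_ii + c_jj - 2 c_ij).
    return 1.0 / (np.diag(C) + (T - 2.0 * R) / n)

path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
info = information_centrality(path)
```

On the three-vertex path the end vertices score 1.0 and the middle vertex 1.5,
since the single path between the two ends has length (and hence variance) 2.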

The routine calculates these measures and some descriptive statistics based on
these measures for symmetric graphs.

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Graph.

Include diagonal in calculations? (Default = NO).


If NO, self-loops are ignored.

Output dataset: (Default = 'Information').


Name of file which will contain information content and normalized information
centrality of each vertex.

LOG FILE A table which contains a list of the information content, together with descriptive
statistics which give the mean, standard deviation, variance, minimum value and
maximum value.

TIMING O(N^3).

COMMENTS None

REFERENCES Stephenson K and Zelen M (1989). 'Rethinking Centrality: Methods and Examples'.
Social Networks 11, 1-37.
NETWORKS>CENTRALITY>MULTIPLE MEASURES

PURPOSE Computes four normalized centrality measures: degree, closeness, betweenness,
and eigenvector.

DESCRIPTION Only normalized versions of the measures for undirected data are given. No
descriptive statistics or centralization measures are produced.

PARAMETERS Input dataset:


Name of file containing network to be analyzed. Data type: Graph

Output Dataset: (Default = 'Centrality')


Name of file which will contain centrality measures for each node.

LOG FILE A table of centrality measures.

TIMING O(N^2).

COMMENTS

REFERENCES See individual measures.


GROUP > CENTRALITY > DEGREE > FIND
PURPOSE Find a group with a specified size with the highest group degree centrality.

DESCRIPTION The group degree centrality of a group of actors is the size of the set of actors
who are directly connected to group members. This routine uses a simple greedy
algorithm to optimize this measure for a fixed size group. The risk of stopping at
a local optimum is reduced by taking a number of different random starting
configurations.
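
One way such a search could look is sketched below. It is not UCINET's
algorithm: the swap-based greedy step, the function names and the choice to
count only actors outside the group are all assumptions for illustration.

```python
import numpy as np

def group_degree(A, members):
    """Number of non-members adjacent to at least one group member."""
    inside = np.zeros(A.shape[0], dtype=bool)
    inside[list(members)] = True
    reached = A[inside].sum(axis=0) > 0
    return int(np.sum(reached & ~inside))

def find_group(A, size, starts=100, seed=0):
    """Greedy swap search repeated from several random starting groups."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    best_group, best_score = None, -1
    for _ in range(starts):
        group = set(rng.choice(n, size=size, replace=False).tolist())
        score = group_degree(A, group)
        improved = True
        while improved:
            improved = False
            for out in list(group):
                for cand in range(n):
                    if cand in group:
                        continue
                    trial = (group - {out}) | {cand}
                    trial_score = group_degree(A, trial)
                    if trial_score > score:       # accept the first improvement
                        group, score, improved = trial, trial_score, True
                        break
                if improved:
                    break
        if score > best_score:
            best_group, best_score = sorted(group), score
    return best_group, best_score

# Star network: vertex 0 is tied to the five other vertices.
star = np.zeros((6, 6), dtype=int)
star[0, 1:] = 1
star[1:, 0] = 1
best_group, best_score = find_group(star, size=1, starts=10)
```

On the star network the search returns the hub as the best group of size one,
reaching all five other actors.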

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Graph.

Desired Group Size (Default = 10).


Specified size of group.

No. of starts: (Default = 100)


Number of random starts used to reduce the risk of stopping at a local optimum.

Output dataset: (Default = 'BestDegGroup')


Name of UCINET dataset containing a group indicator vector. The rows give the
actors and an actor is in the group with the largest group degree centrality if the
entry in the vector is a 1. This vector is not shown in the LOGFILE.

LOG FILE The fit is the percentage of actors (both within and outside) adjacent to group
members. The starting fit, final fit and the number of actors together with the
final number of actors connected to the group are reported. This is followed by a
list of the members of the group with the highest group degree centrality.

TIMING O(N^2).

COMMENTS Note that this routine just finds one group. There could be many others.

REFERENCE Everett, M.G. and Borgatti, S.P. (1999) The Centrality of Groups and Classes.
Journal of Mathematical Sociology 23 181-202.
GROUP > CENTRALITY > DEGREE > TEST

PURPOSE Performs a permutation test to assess whether a specified group has a high group
degree centrality score.

DESCRIPTION The group degree centrality of a group of actors is the size of the set of actors
who are directly connected to group members. This routine uses a simple
sampling procedure to test whether a specified group has a higher group degree
centrality measure than those produced at random.
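
The sampling procedure can be sketched as follows. The function names are
hypothetical, and the sketch assumes that group degree centrality counts only
actors outside the group; it is an illustration, not the routine's own code.

```python
import numpy as np

def group_degree(A, members):
    """Number of non-members adjacent to at least one group member."""
    inside = np.zeros(A.shape[0], dtype=bool)
    inside[list(members)] = True
    reached = A[inside].sum(axis=0) > 0
    return int(np.sum(reached & ~inside))

def group_degree_test(A, members, permutations=5000, seed=0):
    """Compare the observed score with scores of random groups of equal size."""
    rng = np.random.default_rng(seed)
    n, k = A.shape[0], len(members)
    observed = group_degree(A, members)
    samples = np.array([
        group_degree(A, rng.choice(n, size=k, replace=False))
        for _ in range(permutations)
    ])
    # Proportion of random groups scoring as high as or higher than observed.
    p_value = float(np.mean(samples >= observed))
    return observed, samples.mean(), samples.std(), p_value

# Star network: vertex 0 is tied to the five other vertices.
star = np.zeros((6, 6), dtype=int)
star[0, 1:] = 1
star[1:, 0] = 1
observed, mean, std, p_value = group_degree_test(star, [0], permutations=2000)
```

Here only the hub itself can match the observed score of 5, so the p-value
should be close to one sixth.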

PARAMETERS Input Network:


Name of file containing network to be analyzed. Data type:
Graph.

Central Group
Name of UCINET file containing a column vector which specifies the actors in
the specified group. A 1 in row j indicates that actor j is in the group and a 0
indicates that the actor is not a member.

Number of permutations (Default = 5000)


Number of permutations taken in the random sampling procedure.

LOG FILE The group degree centrality for the specified group, labelled as the
observed # reached. The mean and standard deviation of the group centrality for
the random samples. Finally, the proportion of random samples, expressed as a
p-value, that achieved a group centrality score as high as or higher than that of
the specified group.

TIMING O(N^2).

COMMENTS None

REFERENCE Everett, M.G. and Borgatti, S.P. (1999) The Centrality of Groups and Classes.
Journal of Mathematical Sociology 23 181-202.
NETWORK > CORE/PERIPHERY > CONTINUOUS

PURPOSE Fit a continuous (ratio-level) core/periphery model to a data network, and
estimate the coreness of each actor.

DESCRIPTION Simultaneously fits a core/periphery model to the data network and estimates the
degree of coreness, or closeness to the core, of each actor. This is done by finding
a vector C such that the product of C and C transpose is as close as possible to
the original data matrix.

In addition, a number of measures are calculated which assess the degree to which
the network falls into a core/periphery structure for different sizes of core. Each
measure starts by placing the actor with the highest coreness score in the core and
all other actors in the periphery. The core is then successively enlarged by moving
the actor with the highest coreness score from the periphery into the core. This
continues until the periphery consists of a single actor.

nDiff is a generalization of centralization. It sums the differences between the
lowest coreness score in the core and the scores of all actors in the periphery,
and adds to this the sum of the differences between the highest score in the
periphery and the scores of all actors in the core. This value is then normalized.
Diff is similar but applies a weighting equal to the square root of the core size,
so that the measure gives greater value to smaller cores. The correlation measure
correlates the given coreness scores with ideal scores of one for every core
member and zero for every actor in the periphery. Finally, Ident is the same as the
correlation measure but uses Euclidean distance in place of correlation.
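
The sketch below illustrates two of the ideas above for the simplest case: the
Euclidean-distance fit with diagonal values included, where C is the principal
eigenvector, and the correlation concentration measure. The function names and
the symmetrization step are assumptions, not part of the routine.

```python
import numpy as np

def coreness(A):
    """Principal eigenvector of the (symmetrized) data, scaled to unit length."""
    A = np.asarray(A, dtype=float)
    S = (A + A.T) / 2.0                    # assumption: symmetrize directed data
    values, vectors = np.linalg.eigh(S)
    c = vectors[:, np.argmax(values)]
    return -c if c.sum() < 0 else c        # fix the arbitrary sign

def correlation_profile(c):
    """Correlation of coreness with the ideal 1/0 scores for each core size."""
    order = np.argsort(-c)
    n = len(c)
    profile = []
    for size in range(1, n):
        ideal = np.zeros(n)
        ideal[order[:size]] = 1.0
        profile.append(np.corrcoef(c, ideal)[0, 1])
    return np.array(profile)

# Core of three mutually tied actors, each with one pendant peripheral actor.
A = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 0, 1],
              [1, 0, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0]])
c = coreness(A)
profile = correlation_profile(c)
suggested_core_size = int(np.argmax(profile)) + 1
```

For this example the correlation peaks at a core of size three, which is the
group of mutually tied actors.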

PARAMETERS Input dataset:


Name of file containing network to be analyzed. Data type: Valued Digraph.

Data are Pos or Neg: (Default = POSITIVE)


Use positive to indicate that larger values imply a stronger relationship. Use
negative to indicate that larger values in the data imply a more distant
relationship.

Use Corr or Distance: (Default = CORR)


Which measure of fit to use. Corr measures the correlation between the data
matrix and the product of C and C transpose. Distance uses Euclidean distance in
place of correlation; in this case C is simply the principal eigenvector. Minres is
factor analysis without diagonals.

Prevent Negatives:
It is possible for the best C to contain negative values; choosing yes prevents
this from happening.

Max # of iterations: (Default = 1000)


The maximum number of iterations used in the optimization procedure.

Diagonal values valid: (Default = NO)


If NO, diagonal values are ignored.

Output dataset: (Default = 'Coreness')


Name of file containing coreness values.

LOG FILE The correlation or Euclidean distance between the model and the data at the start
and end of the optimization procedure, together with the number of iterations
required. The Minres option just gives the final correlation.

The coreness of each actor, normalized so that the sum of squares is one. This is
followed by some descriptive statistics including Gini coefficients and a
heterogeneity measure. The Gini coefficient measures how the scores are
distributed over the population, that is, the amount of inequality in the
data. If everyone had the same score it gives a value of zero; if a single actor had
a value of 1 and everyone else had a score of zero it gives a value of 1. The
composite score is an adjusted measure which takes account of the fact that we
are looking for core/periphery structures. The heterogeneity measure is based on
a simple summing of proportions which measures the extent to which the scores
are evenly distributed.

This is followed by a table of the four concentration measures which assess the
extent to which the data fits a core/periphery structure. Each column gives a
different measure; the value in row i places the i actors with the highest coreness
in the core and the remainder in the periphery.

This is followed by a recommended core size based on the correlation measure.
See the comments below.

Finally the expected values are given. These are C times C transpose,
normalized to have the same mean and standard deviation as the data.

TIMING O(N^3)

COMMENTS The concentration measures can need careful interpretation. If nDiff has a clear
maximum which is not at 1 or n-1 then this indicates a solid core/periphery
structure. Often nDiff has a number of maxima, indicating that there is a group
of actors situated between the core and the periphery. If the user still wishes to
specify a core then the other measures can be used. Diff is a biased measure which
gives more weight to smaller cores; again, a clear maximum can
indicate a core. If this does not yield any conclusive results, or there is no
requirement to favor smaller cores, then it is recommended that the correlation is
used together with nDiff or Diff. The correlation measure can indicate an area on
which to focus and the other measures can be used to fine tune the choice of
core size. Ident should be used in the same way as correlation but it
places more weight on the absolute scores.

REFERENCES Borgatti SP and Everett M G (1999). Models of core/periphery structures. Social
Networks 21, 375-395.

Comrey AL (1962). The minimum residual method for factor analysis.
Psychological Reports 11, 15-18.
NETWORK > CORE/PERIPHERY > CATEGORICAL

PURPOSE Uses a genetic algorithm to fit a core/periphery model to the data.

DESCRIPTION Simultaneously fits a core/periphery model to the data network,
and identifies which actors belong in the core and which belong
in the periphery.

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type:
Valued Digraph.

Data are Pos or Neg: (Default = POSITIVE)


Use positive to indicate that larger values imply a stronger
relationship. Use negative to indicate that larger values in the
data imply a more distant relationship.

Algorithm: (Default = CORR)


Choices are:

CORR
The fit function is the correlation between the permuted data
matrix and an ideal structure matrix consisting of ones in the
core block interactions and zeros in the peripheral block
interactions. This value is maximized.

DENSITY
The fit function is the density of the core block interactions. This
value is maximized.

SXY
The fit function is the sum of the element-wise product of the permuted
data matrix and an ideal structure matrix consisting of ones in
the core block interactions and zeros in the peripheral block
interactions. This value is maximized.

EMPTYPER
The fit function is the number of entries in the peripheral block
interactions. This value is minimized.
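
To make the fit functions concrete, the sketch below evaluates CORR and DENSITY
for one given partition. It does not perform the genetic search. The function
name is hypothetical, and the sketch assumes that the diagonal and the
core-to-periphery blocks are ignored.

```python
import numpy as np

def core_periphery_fit(A, in_core):
    """Return (correlation with the ideal structure, core block density)."""
    A = np.asarray(A, dtype=float)
    in_core = np.asarray(in_core, dtype=bool)
    off_diagonal = ~np.eye(len(A), dtype=bool)
    core_block = np.outer(in_core, in_core) & off_diagonal
    periphery_block = np.outer(~in_core, ~in_core) & off_diagonal
    # Ideal structure: ones within the core, zeros within the periphery.
    data = np.concatenate([A[core_block], A[periphery_block]])
    ideal = np.concatenate([np.ones(core_block.sum()),
                            np.zeros(periphery_block.sum())])
    correlation = np.corrcoef(data, ideal)[0, 1]
    density = A[core_block].mean()
    return correlation, density

# Actors 0-2 are mutually tied; actors 3 and 4 are tied only to the core.
A = np.array([[0, 1, 1, 1, 0],
              [1, 0, 1, 0, 1],
              [1, 1, 0, 0, 0],
              [1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0]])
good_corr, good_density = core_periphery_fit(A, [1, 1, 1, 0, 0])
bad_corr, bad_density = core_periphery_fit(A, [0, 0, 0, 1, 1])
```

The first partition matches the ideal structure exactly, while the reversed
partition gives a correlation of -1 and an empty core block.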

Density of core-to-periphery blocks:


This sets the density of the core to periphery ties in the ideal
structure matrix. If left blank, or if the word missing is entered,
these ties are ignored. Any other value is entered into every cell
in the off diagonal blocks of the ideal structure matrix.

Maximum # of iterations: (Default = 200)


Sets the maximum number of iterations performed.

Population Size: (Default = 100)

Number of genes in the population.

Output partition: (Default = 'CLUSPART')


Name of output file which contains a cluster indicator vector.
This vector has the form (k1,k2,...ki...) where ki assigns vertex i
to cluster ki where ki is either 1 or 2 where 1 is the core and 2 is the periphery,
so that (1 1 2 1 2) assigns vertices 1, 2 and 4 to the core, and 3
and 5 to the periphery. This vector is not displayed at output.

Output cluster indicator matrix: (Default = 'CLUSTERS')


Name of file which contains a cluster by actor incidence matrix.
A 1 in row i column j indicates that actor j is a member of cluster
i, i = 1 or 2 with 1 representing the core and 2 the periphery. This matrix is
not displayed in the LOG FILE.

LOG FILE The starting and the final correlation of the ideal structure and
the permuted adjacency matrix (regardless of which option was
chosen). A listing of the members of the core and the periphery.
A blocked adjacency matrix dividing the actors into the core and
periphery.

TIMING O(N^2) per iteration. Correlation is considerably slower than the
other options.

COMMENTS Care should be taken when using this routine.

The algorithm seeks to find the minimum (maximum) of the cost function. Even if
successful this result may still be a high (low) value, in which case the partition
may not represent a core/periphery model.

In addition there may be a number of alternative partitions which also produce
the minimum (maximum) value; the algorithm does not search for additional
solutions. Finally it is possible that the routine terminates at a local minimum
(maximum) and does not locate the desired global minimum (maximum).

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into a
core/periphery structure.

REFERENCES Borgatti SP and Everett M G (1999). Models of core/periphery structures. Social
Networks 21, 375-395.
NETWORK > ROLES & POSITIONS > STRUCTURAL > PROFILE
PURPOSE Compute measures of structural equivalence based upon comparisons of rows
and columns of data matrices and forms clusters based upon the results.

DESCRIPTION The profile of an actor is the row vector corresponding to the actor in the
adjacency matrix. Multiple relations are permissible and the profile vector is the
concatenation of each individual relation profile vector. This matrix can be real
or binary.

Structurally equivalent actors have the same profile except for the diagonal
entries of the adjacency matrix. This routine compares the profile vectors of all
pairs of actors and hence computes a measure of profile similarity. Measures of
similarity can be made using Euclidean distance, Pearson correlation, exact
matches or matches of positive entries only. Euclidean distance produces a
distance matrix and all the other options produce a similarity matrix. This matrix
is then analyzed by single link hierarchical clustering.
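
A minimal sketch of the Euclidean distance option is shown below, using the
"Ignore" treatment of diagonal values (positions i and j are dropped when
actors i and j are compared). The function name is hypothetical and the
clustering step is not included.

```python
import numpy as np

def profile_distance(A, include_transpose=True):
    """Euclidean distance between the profiles of every pair of actors."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Profiles are rows, optionally concatenated with columns (the transpose).
    profiles = np.hstack([A, A.T]) if include_transpose else A
    repeats = profiles.shape[1] // n
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            keep = np.ones(n, dtype=bool)
            keep[[i, j]] = False              # treat diagonal pairs as missing
            mask = np.tile(keep, repeats)
            diff = profiles[i, mask] - profiles[j, mask]
            D[i, j] = D[j, i] = np.sqrt(np.sum(diff ** 2))
    return D

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
D = profile_distance(A)
```

Actors 0 and 1 have identical ties to the other actors, so their distance is
zero and they are structurally equivalent in this example.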

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Multirelational.

Measure of profile similarity/distance: (Default = EUCLIDEAN DISTANCE).


Choices are:

Euclidean Distance - The distance between the vectors in n-dimensional space,
i.e. the root of the sum of squared differences.

Correlation - Pearson product-moment correlation coefficient of every pair of profiles.

Matches - Proportion of exact matches between all pairs of profiles.

Positive Matches - Proportion of exact matches in which at least one element is
positive, between all pairs of profiles.

Method of handling diagonal values: (Default = RECIPROCAL)


Choices are:

Reciprocal - In considering adjacency matrix X and comparing the profile of actor i
with actor j we replace the comparison of elements xii with xji and xij with xjj by
the comparisons xii with xjj and xij with xji respectively.

Ignore - Diagonals are treated as missing values so that the comparisons of xii
with xji and xij with xjj are dropped.

Retain - Profile vectors are compared directly element by element, including the
xii and xjj elements.

Include transpose in calculations?: (Default = YES).


Including transposes means that profiles correspond to rows and columns. This
is obviously not necessary for symmetric data.

For binary data: convert to geodesic distances: (Default = NO).


Converts binary data to geodesic data before performing an analysis.

Diagram Type: (Default = 'Dendrogram')


The clustering diagram can either be a Tree Diagram or a Dendrogram.

(Output) Equivalence matrix: (Default = 'SE').


Name of data file containing actor by actor equivalence matrix.

(Output) Partition dataset: (Default = 'SEPart').


Name of data file containing partition indicator matrices derived from single link
hierarchical clustering. A value of k in row labeled x and column j means that
actor j is in partition k at level x. Actor k is always a member of partition k, and
is a representative label for the group. This matrix is not displayed in the LOG
FILE.

LOG FILE Single link hierarchical clustering dendrogram (or tree diagram) of the structural
equivalence matrix. The level at which any pair of actors are aggregated is the
point at which both can be reached by tracing from the start to the actors from
right to left. The diagram can be printed or saved. Parts of the diagram can be
viewed by moving the mouse to the split point in a tree diagram, or the beginning
of a line in the dendrogram, and clicking. The first click will highlight a portion of
the diagram and the second click will display just the highlighted portion. To
return to the original, right-click the mouse. There is also a simple zoom
facility: change the values and then press enter. If the labels need to be
edited (particularly the scale labels), take the partition indicator
matrix into the spreadsheet editor, remove or reduce the labels, and then submit
the edited data to Tools>Dendrogram>Draw.

Behind the plot is the actor by actor structural equivalence matrix. This is
followed by an alternative clustering diagram representing the same information
as above. The columns are rearranged and labeled. A '·' in column label j at level
x means that actor j is not in any cluster at level x. An x indicates that actor j is
in a cluster at this level together with those actors which can be traced across that
row without encountering a space.

TIMING O(N^2).

COMMENTS None.

REFERENCES Burt R (1976). Positions in Networks. Social Forces, 55, 93-122.


NETWORK > ROLES & POSITIONS > STRUCTURAL EQUIVALENCE > CONCOR
PURPOSE Partitions network data by splitting blocks based upon the CONvergence of
iterated CORrelations (CONCOR).

DESCRIPTION Given an adjacency matrix, or a set of adjacency matrices for different relations,
a correlation matrix can be formed by the following procedure. Form a profile
vector for a vertex i by concatenating the ith row in every adjacency matrix; the
i,jth element of the correlation matrix is the Pearson correlation coefficient of the
profile vectors of i and j. This (square, symmetric) matrix is called the first
correlation matrix.

The procedure can be performed iteratively on the correlation matrix until
convergence, at which point each entry is 1 or -1. This matrix is used to split the
data into two blocks such that members of the same block are positively correlated
and members of different blocks are negatively correlated.

CONCOR uses the above technique to split the initial data into two blocks.
Successive splits are then applied to the separate blocks. At each iteration all
blocks are submitted for analysis; however, blocks containing two vertices are not
split. Consequently n levels of splitting in the binary tree can produce up to 2^n
blocks.

Note that any similarity matrix can be used as input.
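
A single CONCOR split can be sketched as below. This is an illustration only:
the function name is hypothetical, diagonal values are retained, and vertices
whose profiles are constant (which would give undefined correlations) are not
handled.

```python
import numpy as np

def concor_split(A, tol=0.2, max_iter=25):
    """Split the vertices into two blocks by iterated correlations."""
    A = np.asarray(A, dtype=float)
    profiles = np.hstack([A, A.T])          # rows plus transpose (in-ties)
    M = np.corrcoef(profiles)               # first correlation matrix
    first = M.copy()
    for _ in range(max_iter):
        # Accept convergence once every entry is within tol of 1 or -1.
        if np.all(np.abs(M) >= 1.0 - tol):
            break
        M = np.corrcoef(M)
    same_side = M[0] > 0                    # sign relative to the first vertex
    block_a = np.where(same_side)[0].tolist()
    block_b = np.where(~same_side)[0].tolist()
    return block_a, block_b, first

# Two separate groups of three mutually tied actors.
A = np.zeros((6, 6), dtype=int)
A[:3, :3] = 1
A[3:, 3:] = 1
np.fill_diagonal(A, 0)
block_a, block_b, first = concor_split(A)
```

In this example the first correlation matrix has entries of 0.25 within groups
and -0.5 between groups, and the iteration separates the two groups.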

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Multirelational.

Include transpose in calculations?: (Default = YES).


For non-symmetric data each vertex's profile would depend on its out-ties only
(since we only consider rows). The in-ties can be considered by adding the
transpose of the data matrices as additional relations.

Method of handling diagonal values: (Default = RECIPROCAL)


Choices are:

Reciprocal - In considering adjacency matrix X and comparing the profile of actor i
with actor j we replace the comparison of elements xii with xji and xij with xjj by
the comparisons xii with xjj and xij with xji respectively.

Ignore - Diagonals are treated as missing values so that the comparisons of xii
with xji and xij with xjj are dropped.

Retain - Profile vectors are compared directly element by element, including the
xii and xjj elements.

Max depth of splits (not blocks): (Default =2).


How far down the binary tree splits are to be taken. A value of n can produce up
to 2^n blocks.

Convergence criteria: (Default = 0.2).


In practice iterations are not taken to convergence but taken to within a tolerance
TOL. Convergence is accepted on values of 1.0 - TOL and -1.0 + TOL. Smaller
values of TOL increase computation time but create more robust solutions.

Maximum iterations: (Default = 25).


The maximum number of iterations performed on the correlation matrix before
terminating through lack of convergence.

Input is corr mat: (Default ='No')


If the input dataset is a correlation matrix already then set to 'yes'.

(Output) Partition dataset: (Default = 'ConcorCPart').


Name of file which contains the partition by actor indicator matrix. The indicator
matrix has the number of rows specified by the 'Max depth of splits' parameter;
the number of columns equals the size of the network. The value k in row i column
label j means that the vertex labeled j is in block k at level i (that is, the ith
partition). All other members of block k can be found by simply locating all column
labels which correspond to an entry of k in the matrix. This matrix is not displayed
in the LOG FILE.

(Output) Permuted dataset: (Default = 'ConcorCCPerm').


Name of file which contains permuted vertex vector. Permuted vector is such
that vertices in the same block are grouped together. This vector is not displayed
in the LOG FILE.

(Output) First correlation matrix: (Default = 'Concor1stCorr').


Name of file which contains the correlation matrix constructed after the first
iteration.

LOG FILE The correlation matrix constructed during the first iteration.

Blocks represented in terms of a clustering dendrogram. The blocks are given for
each level specified in 'Max depth of splits'. The level at which any pair of actors
are aggregated is the point at which both can be reached by tracing from the start
to the actors from right to left. Hence to find all members of vertex i's block at
level k, simply locate the value of k on the line connected to i; all actors that
can be reached from this point by tracing to the left are in i's block. The diagram
can be printed or saved. Parts of the diagram can be viewed by moving the
mouse to the split point in a tree diagram, or the beginning of a line in the
dendrogram, and clicking. The first click will highlight a portion of the diagram
and the second click will display just the highlighted portion. To return to the
original, right-click the mouse. There is also a simple zoom facility:
change the values and then press enter. If the labels need to be edited
(particularly the scale labels), take the partition indicator matrix
into the spreadsheet editor, remove or reduce the labels, and then submit the edited
data to Tools>Dendrogram>Draw.

Behind the dendrogram is the correlation matrix constructed during the first
iteration. This is followed by an alternative cluster diagram. Members of the same
block are connected by a row of X's. Hence to find all members of vertex i's block at
level k, simply locate the X in column label i at level k and trace along in both
directions until a space is encountered. All column labels corresponding to the
X's found are members of i's block. A '·' indicates a singleton block.

A blocked adjacency matrix. The rows and columns of the original adjacency
matrix are permuted into blocks. The adjacency matrix is displayed in terms of
the matrix blocks it contains.

The correlation coefficient R-squared of the partitioned data matrix and an ideal
structure matrix. The structure matrix has the same dimension as the data matrix
but each cell in a block is set to the average value of the corresponding block in
the data matrix.

TIMING Each iteration is O(N^3).

COMMENTS The algorithm splits every non-trivial block at every level. The user may wish to
reject a split at some level. Since the history of all splits is given, it is a simple
matter to recombine clusters if the user so wishes.

REFERENCES Breiger R, Boorman S and Arabie P (1975). An algorithm for clustering
relational data, with applications to social network analysis and comparison with
multi-dimensional scaling. Journal of Mathematical Psychology, 12, 328-383.
NETWORKS>ROLES & POSITIONS>STRUCTURAL EQUIVALENCE>OPTIMIZATION>BINARY

PURPOSE Optimizes a cost function which measures the degree to which a
partition forms structurally equivalent blocks using a tabu
search method.

DESCRIPTION A partition of a network divides the adjacency matrix into matrix
blocks. For perfect structural equivalence each block should consist of all zeros or
all ones. The number of errors in a block is the least number of changes
required to make it either all zeros or all ones. The sum of the errors of all
the matrix blocks gives a measure or cost function of the degree
of structural equivalence for a given partition. The routine
attempts to optimize this cost function to try and find the best
partition of the vertices into a specified number of blocks.
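
The cost function itself can be illustrated with the short sketch below, which
counts the errors for one given partition. The tabu search is not shown and
the function name is hypothetical.

```python
import numpy as np

def se_errors(A, partition, diagonal_valid=False):
    """Total number of errors over all matrix blocks of a binary network."""
    A = np.asarray(A)
    partition = np.asarray(partition)
    n = len(A)
    valid = np.ones((n, n), dtype=bool)
    if not diagonal_valid:
        np.fill_diagonal(valid, False)       # diagonal cells are not counted
    total = 0
    labels = np.unique(partition)
    for r in labels:
        for c in labels:
            cells = np.outer(partition == r, partition == c) & valid
            ones = int(A[cells].sum())
            zeros = int(cells.sum()) - ones
            # Cheapest way to make the block all ones or all zeros.
            total += min(ones, zeros)
    return total

A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
errors = se_errors(A, [0, 0, 1, 1])
```

With vertices 0 and 1 in one block and 2 and 3 in the other, each of the two
off-diagonal matrix blocks contains three ones and a single zero, giving two
errors in total.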

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type:
Graph.

Number of blocks: (Default = 2).


Number of groups or blocks into which the vertices are to be
assigned. The number of matrix blocks will be the square of
this number.

Output sets dataset: (Default = 'SbmSets').


Name of file which contains a block by actor incidence matrix. A
1 in row i column j indicates that actor j is a member of block i.
This matrix is not displayed in the LOG FILE.

Output Partition Dataset: (Default = 'SbmPart').


Name of output file which contains a partition indicator vector.
This vector has the form (k1,k2,...ki...) where ki assigns vertex i
to block ki, so that (1 1 2 1 2) assigns vertices 1, 2 and 4 to
block 1 and 3 and 5 to block 2.
This vector is not displayed in the LOG FILE.

Additional
Are diagonal values valid? (Default = NO)
Whether diagonals are to be included in cost function.

Maximum # of iterations in a series: (Default = 50)


The algorithm starts from an arbitrary partition and attempts to
decrease the cost by taking the steepest descent. If the cost
cannot be reduced then the algorithm continues its search in the
neighborhood of the current partition. This search direction is a
mildest ascent direction and from there new search directions
are explored. This exploration only continues for a fixed number
of iterations in a series. If no improvement is made after the
fixed number of iterations the algorithm terminates with the
current minimum. Increasing the parameter gives a more
exhaustive and therefore slower search.

Random Number Seed:
The random number seed generates the initial partition. UCINET
generates a different random number as default each time it is
run. This number should be changed if the user wishes to
repeat the analysis with different initial configurations. The
range is 1 to 32000.

Length of time in penalty box: (Default =25)


If the algorithm makes an ascending step then it is possible that
the best possible descending step is the reverse of the direction
just taken. This parameter prohibits a move along the reverse
direction for a set number of steps. The larger the value the
more difficult it will be to come back to a previously explored
local minimum, however it will also be more difficult to explore
the vicinity of that minimum. The default has been shown
experimentally to be the most useful.

Number of random starts: (Default = 5)


The whole procedure is repeated with a different initial partition for each start.
The best of the resulting solutions is then selected as the minimum.

LOG FILE The number of errors and the R-squared value for the initial partition. The R-
squared value is the correlation coefficient of the partitioned data
matrix and an ideal structure matrix. The structure matrix has
the same dimension as the data matrix but each block is set to a
one or zero corresponding to the nearest block in the data matrix.

The final number of errors, the R-squared value and the errors in each block after
the optimization.

List of blocks. Each block is labeled and is specified by the
vertices it contains.

The blocked adjacency matrix. The rows and columns of the
original adjacency matrix are permuted into blocks. The
adjacency matrix is displayed in terms of the matrix blocks it
contains.

TIMING Each iteration of the tabu search algorithm is O(N^2).

COMMENTS Care should be taken when using this routine.

The algorithm seeks a minimum of the cost function. Even if it succeeds, the
minimum found may still have a high value, in which case the blocking may not
conform very closely to structural equivalence.

In addition there may be a number of alternative partitions which also produce
the minimum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local minimum and does
not locate the desired global minimum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into the
reported blocks.

REFERENCES Panning W (1982). 'Fitting blockmodels to data'. Social Networks 4, 81-101.

Glover F (1989). Tabu Search - Part I. ORSA Journal on Computing 1, 190-206.

Glover F (1990). Tabu Search - Part II. ORSA Journal on Computing 2, 4-32.
NETWORKS>ROLES & POSITIONS>STRUCTURAL EQUIVALENCE>OPTIMIZATION>VALUED

PURPOSE Optimizes a cost function which measures the degree to which a partition forms
structurally equivalent blocks using a tabu search method.

DESCRIPTION A partition of a network divides the adjacency matrix into matrix blocks. The
variance of the elements of a matrix block gives a measure of the extent to which
the elements within the matrix block conform to structural equivalence. The sum
of the variances of all the matrix blocks gives a measure or cost function of the
degree of structural equivalence for a given partition. The routine attempts to
optimize this cost function to try and find the best partition of the vertices into a
specified number of blocks.
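
As an illustration of the cost function just described, the sketch below sums the
variance of the cell values in every matrix block. It is not UCINET's code: the
function name is hypothetical, and the use of the population variance (rather than
the sample variance) and the default of ignoring the diagonal are assumptions.

```python
from statistics import pvariance

def block_variance_cost(mat, part, use_diagonal=False):
    """Sum of within-block variances for a valued matrix and a partition vector."""
    n = len(mat)
    blocks = sorted(set(part))
    cost = 0.0
    for a in blocks:
        for b in blocks:
            vals = [mat[i][j] for i in range(n) for j in range(n)
                    if part[i] == a and part[j] == b and (use_diagonal or i != j)]
            if len(vals) > 1:            # a block with 0 or 1 usable cells adds nothing
                cost += pvariance(vals)
    return cost

mat = [[0, 5, 1, 1],
       [5, 0, 1, 1],
       [2, 2, 0, 7],
       [2, 2, 7, 0]]
good = block_variance_cost(mat, [1, 1, 2, 2])   # every block is constant
poor = block_variance_cost(mat, [1, 2, 1, 2])   # blocks mix different values
print(good, poor > good)
```

A cost of zero means every block is constant, which is the valued analogue of a
perfect structural equivalence blocking.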

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued graph.

Number of blocks: (Default = 2).
Number of groups or blocks into which the vertices are to be assigned. The
number of matrix blocks will be the square of this number.

Output sets dataset: (Default = 'SbmSets').
Name of file which contains a block by actor incidence matrix. A 1 in row i
column j indicates that actor j is a member of block i. This matrix is not
displayed in the LOG FILE.

Output Partition Dataset: (Default = 'SbmPart').
Name of output file which contains a partition indicator vector. This vector has
the form (k1,k2,...ki...) where ki assigns vertex i to block ki, so that (1 1 2 1 2)
assigns vertices 1, 2 and 4 to block 1 and 3 and 5 to block 2.
This vector is not displayed in the LOG FILE.
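
The two output formats carry the same information. The short sketch below (the
function name is hypothetical) converts a partition indicator vector into the
block by actor incidence matrix, using the example vector given above.

```python
def partition_to_incidence(part):
    """Row i of the result has a 1 in column j when actor j belongs to block i."""
    blocks = sorted(set(part))
    return [[1 if p == b else 0 for p in part] for b in blocks]

sets = partition_to_incidence([1, 1, 2, 1, 2])
for row in sets:
    print(row)
# [1, 1, 0, 1, 0]
# [0, 0, 1, 0, 1]
```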

Additional
Are diagonal values valid? (Default = NO)
Whether diagonals are to be included in cost function.

Maximum # of iterations in a series: (Default = 50)
The algorithm starts from an arbitrary partition and attempts to decrease the cost
by taking the steepest descent. If the cost cannot be reduced then the algorithm
continues its search in the neighborhood of the current partition. This search
direction is a mildest ascent direction and from there new search directions are
explored. This exploration only continues for a fixed number of iterations in a
series. If no improvement is made after the fixed number of iterations the
algorithm terminates with the current minimum. Increasing the parameter gives a
more exhaustive and therefore slower search.

Random Number Seed:
The random number seed generates the initial partition. UCINET generates a
different random number as default each time it is run. This number should be
changed if the user wishes to repeat the analysis with different initial
configurations. The range is 1 to 32000.

Length of time in penalty box: (Default = 25)
If the algorithm makes an ascending step then it is possible that the best possible
descending step is the reverse of the direction just taken. This parameter
prohibits a move along the reverse direction for a set number of steps. The larger
the value the more difficult it will be to come back to a previously explored local
minimum; however, it will also be more difficult to explore the vicinity of that
minimum. The default has been shown experimentally to be the most useful.

Number of random starts: (Default = 5)
The whole procedure is repeated this many times, each time from a different initial
partition. The best of the resulting partitions is then selected as the minimum.
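
The search parameters above fit together roughly as in the following sketch. It
is a generic tabu search written for illustration, not UCINET's implementation:
the function and argument names are hypothetical, a move is assumed to mean
reassigning one vertex to another block, and refinements such as aspiration rules
are left out. The toy cost function simply counts ties between blocks plus
missing ties within blocks.

```python
import random

def tabu_partition(cost, n, k, max_series=50, tenure=25, starts=5, seed=1):
    """Search for a low-cost assignment of n vertices to blocks 0..k-1."""
    rng = random.Random(seed)
    best_part, best_cost = None, float("inf")
    for _ in range(starts):                        # number of random starts
        part = [rng.randrange(k) for _ in range(n)]
        run_best, run_cost = part[:], cost(part)
        tabu = {}                                  # (vertex, block) -> last barred step
        step = stale = 0
        while stale < max_series:                  # iterations without improvement
            step += 1
            move, move_cost = None, float("inf")
            for v in range(n):
                old = part[v]
                for b in range(k):
                    if b == old or tabu.get((v, b), 0) >= step:
                        continue                   # no move, or move is in the penalty box
                    part[v] = b
                    c = cost(part)
                    if c < move_cost:
                        move, move_cost = (v, b), c
                part[v] = old
            if move is None:                       # every move is currently barred
                stale += 1
                continue
            v, b = move                            # steepest descent or mildest ascent
            tabu[(v, part[v])] = step + tenure     # bar the reverse move for a while
            part[v] = b
            if move_cost < run_cost:
                run_best, run_cost, stale = part[:], move_cost, 0
            else:
                stale += 1
        if run_cost < best_cost:
            best_part, best_cost = run_best, run_cost
    return best_part, best_cost

# Toy data: two disconnected cliques, vertices 0-2 and 3-5.
adj = [[1 if i != j and (i < 3) == (j < 3) else 0 for j in range(6)] for i in range(6)]

def mismatch_cost(part):
    """Count tied pairs split across blocks plus untied pairs placed together."""
    n = len(part)
    return sum((adj[i][j] == 1) != (part[i] == part[j])
               for i in range(n) for j in range(i + 1, n))

best, value = tabu_partition(mismatch_cost, n=6, k=2)
print(best, value)
```

A larger tenure keeps the search away from partitions it has just left, at the
price of leaving fewer admissible moves at each step.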

LOG FILE The correlation coefficient R-squared of the partitioned data matrix and an ideal
structure matrix. The structure matrix has the same dimension as the data matrix
but each cell in a block is set to the average value of the corresponding block in
the data matrix.

List of blocks. Each block is labeled and is specified by the vertices it contains.

The blocked adjacency matrix. The rows and columns of the original adjacency
matrix are permuted into blocks. The adjacency matrix is displayed in terms of
the matrix blocks it contains.

TIMING Each iteration of the tabu search algorithm is O(N^2).

COMMENTS Care should be taken when using this routine.

The algorithm seeks a minimum of the cost function. Even if it succeeds, the
minimum found may still have a high value, in which case the blocking may not
conform very closely to structural equivalence.

In addition there may be a number of alternative partitions which also produce
the minimum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local minimum and does not
locate the desired global minimum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into the
reported blocks.
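
One way to quantify the agreement between two runs is a pair-counting index. The
sketch below uses the Rand index, which is a choice made for this example and is
not something UCINET reports: it is the share of vertex pairs that both partitions
treat alike (together in both, or apart in both). Block numbers themselves do not
matter, only who is grouped with whom.

```python
from itertools import combinations

def rand_index(p, q):
    """Fraction of vertex pairs on which partition vectors p and q agree."""
    pairs = list(combinations(range(len(p)), 2))
    agree = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agree / len(pairs)

same = rand_index([1, 1, 2, 1, 2], [2, 2, 1, 2, 1])   # same split, labels swapped
diff = rand_index([1, 1, 2, 2], [1, 2, 1, 2])
print(same, round(diff, 3))
```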

REFERENCES Panning W (1982). 'Fitting blockmodels to data'. Social Networks 4, 81-101.

Glover F (1989). Tabu Search - Part I. ORSA Journal on Computing 1, 190-206.

Glover F (1990). Tabu Search - Part II. ORSA Journal on Computing 2, 4-32.
NETWORK > ROLES > EXACT > OPTIMIZATION
PURPOSE Optimizes a cost function that gives an approximate measure of the degree to
which a partition corresponds to automorphically equivalent sets using a tabu
search.

DESCRIPTION Two vertices u and v of a labelled graph G are automorphically equivalent if all
the vertices can be relabelled to form an isomorphic graph with the labels of u
and v interchanged. Given a partition of the network then the partition divides the
adjacency matrix into blocks. For an automorphic partition the cell values for a
row or column within a block will have the same distribution of values. An
approximate measure of the extent to which these blocks conform to automorphic
equivalence is given by the following procedure. For each block calculate the
variance of the sum of squares of each row and the variance of the sum of
squares of each column. The approximate automorphic cost is the sum of all
these variances for every block. The routine attempts to optimize this cost
function to try and find the best partition of the vertices into a specified number
of blocks.
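
The approximate automorphic cost can be sketched directly from this description.
The code below is an illustration, not UCINET's implementation: the function name
is hypothetical, and the population variance and the default of ignoring diagonal
cells are assumptions.

```python
from statistics import pvariance

def automorphic_cost(mat, part, use_diagonal=False):
    """Sum, over all blocks, of the variance of row and column sums of squares."""
    n = len(mat)
    blocks = sorted(set(part))
    members = {b: [i for i in range(n) if part[i] == b] for b in blocks}
    cost = 0.0
    for a in blocks:
        for b in blocks:
            rows, cols = members[a], members[b]
            row_ss = [sum(mat[i][j] ** 2 for j in cols if use_diagonal or i != j)
                      for i in rows]
            col_ss = [sum(mat[i][j] ** 2 for i in rows if use_diagonal or i != j)
                      for j in cols]
            cost += pvariance(row_ss) + pvariance(col_ss)
    return cost

# Undirected path 1 - 2 - 3: the two end points are automorphically equivalent.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
ends_together = automorphic_cost(path, [1, 2, 1])
end_with_middle = automorphic_cost(path, [1, 1, 2])
print(ends_together, end_with_middle)
```

A zero cost is necessary but not sufficient for an automorphic partition, which is
why the measure is described as approximate.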

PARAMETERS

Input dataset:
Name of file containing network to be analyzed. Data type: Valued graph.

Number of blocks (Default = 2).
Number of groups or blocks into which the vertices are to be assigned.

Are diagonal values valid (Default = NO).
Whether diagonals are to be included in cost function.

For binary data: convert to geodesic distances (Default = NO).
No performs the analysis on the raw adjacency matrix.
Yes converts the adjacencies to distances and uses this as the input data. If there
is no path connecting two vertices then the distance of n is used, where n is the
number of vertices in the network.
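
The conversion can be pictured with a breadth-first search from every vertex, as
in this sketch (the function name is hypothetical and the code is not taken from
UCINET). Because the longest possible shortest path has n - 1 steps, the value n
can safely mark unreachable pairs.

```python
from collections import deque

def geodesic_matrix(adj):
    """Shortest path lengths for a binary matrix; unreachable pairs get n."""
    n = len(adj)
    dist = [[n] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[s][v] == n:   # tie u -> v, v not reached yet
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist

# Directed chain 1 -> 2 -> 3 with vertex 4 isolated.
chain = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
d = geodesic_matrix(chain)
print(d[0], d[2])
# [0, 1, 2, 4] [4, 4, 0, 4]
```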

Maximum # of iterations in a series (Default = max(2,n/3)).
The algorithm starts from an arbitrary partition and attempts to decrease the cost
by taking the steepest descent. If the cost cannot be reduced then the algorithm
continues its search in the neighborhood of the current partition. This search
direction is a mildest ascent direction and from there new search directions are
explored. This exploration only continues for a fixed number of iterations in a
series. If no improvement is made after the fixed number of iterations the
algorithm terminates with the current minimum. Increasing the parameter gives a
more exhaustive and therefore slower search. The recommended default value is
automatically entered on the form once the input data has been selected.

Length of time in penalty box (Default = 10).
If the algorithm makes an ascending step then it is possible that the best possible
descending step is the reverse of the direction just taken. This parameter prohibits
a move along the reverse direction for a set number of steps. The larger the value
the more difficult it will be to come back to a previously explored local
minimum; however, it will also be more difficult to explore the vicinity of that
minimum. The default of 10 has been shown experimentally to be the most
useful.

Number of random starts (Default = 10 - 2logn).
The whole procedure is repeated this many times, each time from a different initial
partition. The best of the resulting partitions is then selected as the minimum.

Random Number Seed
The random number seed generates the initial partition. UCINET generates a
different random number as default each time it is run. This number should be
changed if the user wishes to repeat the analysis with different initial
configurations. The range is 1 to 32000.

Output Partition Dataset (Default = 'ABMPart').
Name of output file to contain a partition indicator vector. This vector has the
form (k1,k2,...ki...) where ki assigns vertex i to block ki, so that (1 1 2 1 2) assigns
vertices 1, 2 and 4 to block 1 and 3 and 5 to block 2. This vector is not displayed
in the LOG FILE.

Output Indicator Dataset: (Default = 'ABMSets').
Name of file which contains a block by actor incidence matrix. A 1 in row i
column j indicates that actor j is a member of block i. This matrix is not
displayed in the LOG FILE.

LOG FILE The value of the cost function or Fit.

List of blocks. Each block is labelled and is specified by the vertices it contains.

The blocked adjacency matrix. The rows and columns of the original adjacency
matrix are permuted into blocks. The adjacency matrix is displayed in terms of
the matrix blocks it contains.

TIMING Each iteration of the tabu search algorithm is O(N^2).

COMMENTS Care should be taken when using this routine.

The algorithm seeks a minimum of the cost function. Even if it succeeds, the
minimum found may still have a high value, in which case the blocking may not
conform very closely to automorphic equivalence. In addition there may be a
number of alternative partitions that also produce the minimum value; the
algorithm does not search for additional solutions. Finally it is possible that the
routine terminates at a local minimum and does not locate the desired global
minimum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into the
reported blocks.

REFERENCES Glover F (1989). Tabu Search - Part I. ORSA Journal on Computing 1, 190-206.

Glover F (1990). Tabu Search - Part II. ORSA Journal on Computing 2, 4-32.
NETWORKS>ROLES&POSITIONS>AUTOMORPHIC>ALL PERMUTATIONS

PURPOSE Partitions the vertices of a graph into orbits by exhaustive search.

DESCRIPTION Two vertices u and v of a labelled graph G are automorphically equivalent if all
the vertices can be relabelled to form an isomorphic graph with the labels of u
and v interchanged. Automorphic equivalence is an equivalence relation and
therefore partitions the vertices into equivalence classes called orbits. This
routine finds the orbits by examining all possible relabelings of the graph. For a
graph of n vertices there are n! possible permutations of the labels.
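
The exhaustive search can be sketched in a few lines. This is an illustration
rather than UCINET's code, and the function name is hypothetical. A permutation p
is kept when relabelling every vertex i as p[i] leaves the adjacency matrix
unchanged; since the automorphisms form a group, the orbit of a vertex is simply
the set of vertices it is mapped to.

```python
from itertools import permutations

def automorphism_orbits(adj):
    """Return (automorphisms, orbit indicator vector) by trying all n! relabelings."""
    n = len(adj)
    autos = [p for p in permutations(range(n))
             if all(adj[i][j] == adj[p[i]][p[j]] for i in range(n) for j in range(n))]
    labels = [0] * n
    next_label = 0
    for i in range(n):
        if labels[i] == 0:             # first vertex of a new orbit
            next_label += 1
            for p in autos:            # the identity is always among them
                labels[p[i]] = next_label
    return autos, labels

# Undirected path 1 - 2 - 3.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
autos, orbits = automorphism_orbits(path)
print(len(autos), orbits)
# 2 [1, 2, 1]
```

Here 2 of the 6 permutations are automorphisms, so the hit rate is about 33
percent.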

PARAMETERS

Input dataset:
Name of file containing network to be analyzed. Data type: Digraph.

(Output) Orbit Dataset (Default = 'AllAutomorphismsOrbits').
Name of output file to contain orbit indicator vector. This vector has the form
(k1,k2,...ki...) where ki assigns vertex i to orbit ki, so that (1 1 2 1 2) assigns
vertices 1, 2 and 4 to orbit 1 and 3 and 5 to orbit 2. This vector is not displayed in
the LOG FILE.

(Output) Automorphism Dataset: (Default = 'AllAutomorphismsAuto').
Name of file that gives all automorphisms of the graph. The automorphisms are
specified in a numbered list with the original labelling at the head. A value of k
in row m column n means that for automorphism number m vertex n was relabelled
k. This matrix is not displayed in the LOG FILE.

LOG FILE The number of permutations examined.

The number of relabellings that produced an isomorphism.

The percentage of all permutations that produced an isomorphic graph (the hit
rate).

A list of the orbits.

TIMING Exponential.

COMMENTS This routine is very slow. It is inadvisable to try this on
graphs with more than 10 vertices, impossible on graphs with more than 15.
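
The size limits follow from the growth of n!. The arithmetic below is only a
worked illustration of how many relabelings an exhaustive search has to examine.

```python
from math import factorial

for n in (5, 10, 15):
    print(n, factorial(n))
# 5 120
# 10 3628800
# 15 1307674368000
```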

REFERENCES
NETWORK > ROLES > EXACT >EXCATREGE
PURPOSE Computes a single link hierarchical clustering and a measure of regular
equivalence for binary or nominal data using exact categorical REGE.

DESCRIPTION Two actors are exactly regularly equivalent if they are exactly equally related to
equivalent others. Nominal data is any integer valued adjacency matrix in which
the value represents a coding of the relationship in terms of a category.

For example, we could use 1 to represent close friend, 2 to represent friend and 3
to represent works with. The values 1, 2 and 3 DO NOT measure the strength of
the relationship, they simply refer to the categories.

Two actors are regularly equivalent for nominal data if in addition to the normal
regularity condition they relate to equivalent others in the same category.

The exact categorical REGE algorithm searches for matches in successive
neighborhoods. For binary data in the first iteration, vertices are classified as
sinks, sources or repeaters. At the next iteration the neighborhoods of all the
vertices are considered; two vertices would be classified differently if one
neighborhood contained a representative of one of these categories while the
other did not. The next iteration classifies the vertices in terms of the
neighborhood's classification in the previous iteration. The process continues
until stable (a maximum of n different categories are possible in a graph of n
vertices).

For nominal data the initial categories are included at the first iteration. The
process is easily extended to multiple relations.

From this procedure a similarity matrix can be formed with entries which give
the value of the iteration at which vertices were separated into different
categories.

Initially the procedure places all vertices in the same category; or into user
specified categories. Subsequent iterations split the groups into hierarchical
clusters.
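
A refinement of this kind can be sketched as below. The code is an illustration
under stated assumptions, not UCINET's routine: the function name is hypothetical,
a single relation is assumed, and neighbors are counted from the first pass
onward, which may differ from how the program seeds its first iteration. Two
vertices stay together only while they have the same number of in-neighbors and
out-neighbors of every (tie value, category) kind.

```python
from collections import Counter

def exact_coloring(adj, start=None):
    """Refine categories until no category splits; returns the stable coloring."""
    n = len(adj)
    color = list(start) if start else [0] * n
    while True:
        sig = []
        for i in range(n):
            out_c = Counter((adj[i][j], color[j])
                            for j in range(n) if j != i and adj[i][j])
            in_c = Counter((adj[j][i], color[j])
                           for j in range(n) if j != i and adj[j][i])
            sig.append((color[i], frozenset(out_c.items()), frozenset(in_c.items())))
        ids = {}
        new = [ids.setdefault(s, len(ids)) for s in sig]
        if len(ids) == len(set(color)):    # nothing split in this pass: stable
            return color
        color = new

# Ties 1 -> 2, 1 -> 3 and 4 -> 5.
adj = [[0, 1, 1, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0]]
colors = exact_coloring(adj)
print(colors)
# [0, 1, 1, 2, 3]
```

Vertices 1 and 4 are both sources, but vertex 1 sends two ties and vertex 4 only
one, so the exact criterion separates them and, in the next pass, their receivers.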

PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Valued graph - integer
values. Multirelational.

Dataset with starting partition (if any):
A null return will initially place all vertices in a single cluster. For user specified
partition enter the name of a data file which contains a partition indicator matrix.
A partition indicator matrix has each row as a separate partition. Each row is of
the form (k1,k2,...,ki...) where ki assigns vertex i to partition ki so that (1 1 2 1 2)
assigns vertices 1, 2 and 4 to partition 1 and 3 and 5 to partition 2.

Convert data to geodesic distances: (Default = 'YES')
Yes performs the analysis on the geodesic distance matrix.
No uses the raw adjacencies.

Diagram Type: (Default = 'Dendrogram')
The clustering diagram can either be a Tree Diagram or a Dendrogram.

(Output) Equivalence matrix: (Default = 'EXCATREGEQUIV').
Name of file which contains actor by actor regular similarity matrix described in
LOG FILE.

Output partition matrix: (Default = 'EXCATREGPART').
Name of file which contains a partition indicator matrix corresponding to the
single link hierarchical clustering displayed in the LOG FILE. A value of k in a
row labeled i and column j means that vertex j is in partition k at level i. Vertex k
is always a member of partition k and is a representative label for the group. This
matrix is not displayed in the LOG FILE.

LOG FILE Single link hierarchical clustering dendrogram (or tree diagram) of the regular
similarity measure. The level at which any pair of actors are aggregated is the
point at which both can be reached by tracing from the start to the actors from
right to left. Each level corresponds to an iteration; level 1 represents the initial
clustering specified in PARAMETERS. The top level gives strict regular
equivalence clusters. The higher the level the greater the degree of regular
equivalence. The diagram can be printed or saved. Parts of the diagram can be
viewed by moving the mouse to the split point in a tree diagram or the beginning
of a line in the dendrogram and clicking. The first click will highlight a portion of
the diagram and the second click will display just the highlighted portion. To
return to the original, right click on the mouse. There is also a simple zoom
facility: simply change the values and then press enter. If the labels need to be
edited (particularly the scale labels) then you should take the partition indicator
matrix into the spreadsheet editor, remove or reduce the labels, and then submit
the edited data to Tools>Dendrogram>Draw.

Behind the dendrogram is an alternative cluster diagram. The columns have been
rearranged and labeled. A '·' in row labeled i column label j indicates that vertex
j is in a singleton cluster at level i. An 'X' indicates that vertex j is in a non-trivial
cluster at level i; all other members of j's cluster are found by tracing along the
row labeled i in both directions from column j until a space is encountered in
each direction. The column labels corresponding to an 'X' which are connected
to j's X are all members of j's cluster at level i.

An actor by actor exact similarity matrix. A k in row i column j means that actors
i and j were separated at level k, provided k is less than the value on the diagonal.
If k is equal to the value on the diagonal then i and j are exactly regularly
equivalent.

TIMING O(N^3).

COMMENTS None.

REFERENCES Everett M G and Borgatti S P (1996). Exact colorations of graphs and digraphs.
Social Networks 18, 319-331.
NETWORK > ROLES & POSITIONS> EXACT > MAXSIM
PURPOSE Calculate a measure of approximate exact equivalence for valued data.

The measure is the Euclidean distance of independently sorted profiles. Binary
data is automatically converted to a distance matrix before analysis.

DESCRIPTION A coloring of a graph G is exact if whenever two vertices u and v of G are
colored the same they have the same color neighborhoods with exactly the same
number of each color.

The sorted profile of vertex i of a valued network is the row vector of i with the
elements placed in ascending order. The maxsim distance is the Euclidean
distance between the sorted profile of a pair of vertices. For directed data the
column profiles are automatically concatenated on to the row profiles.

Binary data is automatically converted to a reciprocal distance matrix so that the
i,jth entry contains the reciprocal of the distance between i and j.
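
The maxsim distance for valued data can be sketched as follows. This is an
illustration with hypothetical function names, not UCINET's code; it takes a
valued matrix as given and does not perform the reciprocal distance conversion
used for binary input. Diagonal cells are skipped by default, matching the NO
setting of the diagonal option.

```python
from math import dist   # Euclidean distance between two points, Python 3.8+

def sorted_profile(mat, i, directed=True, use_diagonal=False):
    """Row values of vertex i in ascending order, plus sorted column values if directed."""
    n = len(mat)
    row = sorted(mat[i][j] for j in range(n) if use_diagonal or j != i)
    if not directed:
        return row
    col = sorted(mat[j][i] for j in range(n) if use_diagonal or j != i)
    return row + col

def maxsim_distances(mat, directed=True, use_diagonal=False):
    """Matrix of Euclidean distances between sorted profiles."""
    n = len(mat)
    prof = [sorted_profile(mat, i, directed, use_diagonal) for i in range(n)]
    return [[dist(prof[i], prof[j]) for j in range(n)] for i in range(n)]

mat = [[0, 2, 5],
       [5, 0, 2],
       [1, 1, 0]]
d = maxsim_distances(mat)
print(sorted_profile(mat, 0), sorted_profile(mat, 1), d[0][1])
# [2, 5, 1, 5] [2, 5, 1, 2] 3.0
```

Because each profile is sorted independently, two vertices are close when they
have the same mix of tie strengths, regardless of which alters those ties go to.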

PARAMETERS
Input dataset:
Name of file containing network to be analyzed. Data type: Valued graph. Binary
data is automatically converted to a reciprocal distance matrix.

Treat diagonal values as valid? (Default = NO).
If NO diagonals are ignored.

Diagram Type: (Default = 'Dendrogram')
The clustering diagram can either be a Tree Diagram or a Dendrogram.

Output dataset: (Default = 'MaxSim').
Name of file which will contain maxsim distance matrix.

LOG FILE Single link hierarchical clustering dendrogram (or tree diagram) of the maxsim
distance matrix. The level at which any pair of actors are aggregated is the point
at which both can be reached by tracing from the start to the actors from right to
left. The diagram can be printed or saved. Parts of the diagram can be viewed by
moving the mouse to the split point in a tree diagram or the beginning of a line in
the dendrogram and clicking. The first click will highlight a portion of the
diagram and the second click will display just the highlighted portion. To return
to the original, right click on the mouse. There is also a simple zoom facility:
simply change the values and then press enter. If the labels need to be edited
(particularly the scale labels) then you should take the partition indicator matrix
into the spreadsheet editor, remove or reduce the labels, and then submit the
edited data to Tools>Dendrogram>Draw.

Behind the plot is the actor by actor maxsim matrix. This is followed by an
alternative clustering diagram representing the same information as above. The
columns are rearranged and labeled. A '·' in column label j at level x means that
actor j is not in any cluster at level x. An x indicates that actor j is in a cluster at
this level together with those actors which can be traced across that row without
encountering a space.

TIMING O(N^3).

COMMENTS This algorithm is not suitable for data in which the values have low variance or
are sparse.

The algorithm (by Borgatti and Everett) is an adaptation of an algorithm due to
Everett and Borgatti (1988).

REFERENCES Everett M G (1985). 'Role similarity and complexity in social networks'. Social
Networks 7, 353-359.

Everett M G and Borgatti S P (1988). 'Calculating role similarities: An algorithm
that helps determine the orbits of a graph'. Social Networks 10, 71-91.
NETWORKS > ROLES & POSITIONS > MAXIMAL REGULAR > REGE
PURPOSE Compute a measure of regular equivalence using the standard REGE algorithm.

DESCRIPTION Two actors are regularly equivalent if they are equally related to equivalent
others. REGE is an iterative algorithm; within each iteration a search is
implemented to optimize a matching function.

The matching function between vertices i and j is based upon the following. For
each k in i's neighborhood search for an m in j's neighborhood of similar value.
A measure of similar values is based upon the absolute difference of magnitudes
of ties. This measure is then weighted by the degree of equivalence between k
and m at the previous iteration. It is this match that is optimized. This is
summed for all members of i's neighborhood over all relations and normalized to
provide the current iteration's measure of equivalence between i and j. The
procedure is repeated for all pairs of vertices for a fixed number of iterations.

The result of this iterative procedure is a symmetric similarity matrix which
provides a measure of regular equivalence. This matrix is automatically
submitted to a single link hierarchical clustering routine.
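
The flavor of the iteration can be conveyed by a much simplified sketch. It is
not the REGE formula used by UCINET and should not be expected to reproduce its
output: the function name is hypothetical, a single relation is assumed, and the
quality of a match is measured by the smaller of the two tie values, which is an
assumption standing in for the absolute-difference measure described above. For
each neighbor k of i the best counterpart m of j is found, weighted by the
previous similarity of k and m, and the total is normalized by the largest match
attainable.

```python
def rege_sketch(x, iterations=3):
    """Simplified REGE-style similarities on a 0-100 scale for one relation."""
    n = len(x)
    sim = [[1.0] * n for _ in range(n)]            # start: everyone fully equivalent
    for _ in range(iterations):
        new = [[1.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                num = den = 0.0
                for a, b in ((i, j), (j, i)):      # match a's ties in b, both ways round
                    for k in range(n):
                        tie = x[a][k] + x[k][a]
                        if tie == 0:
                            continue               # k is not in a's neighborhood
                        num += max(sim[k][m] * (min(x[a][k], x[b][m]) +
                                                min(x[k][a], x[m][b]))
                                   for m in range(n))
                        den += tie
                new[i][j] = new[j][i] = num / den if den else 1.0
        sim = new
    return [[round(100 * v, 1) for v in row] for row in sim]

# Tree: 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 5.
tree = [[0, 1, 1, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]]
s = rege_sketch(tree)
print(s[1][2], s[3][4], s[0][1] < 100)
```

In the example the two middle vertices and the two leaves are regularly
equivalent and keep a similarity of 100, while the root falls below 100 against
every other vertex.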

PARAMETERS
Input dataset:
Name of file containing data to be analyzed. Data type: Valued graph.
Multirelational.
Undirected data will give a trivial result with all non-isolate vertices being
equivalent.

Maximum number of iterations: (Default = 3).
Number of iterations to be performed. Larger values increase the differentiation
between vertices. A value of 3 has often been used and is now customary.

Convert data to geodesic distances: (Default = NO).
YES performs the analysis on the valued distance matrix. If symmetric data is to
be analyzed then this option will provide a non-trivial analysis of the data.

Diagram Type: (Default = 'Dendrogram')
The clustering diagram can either be a Tree Diagram or a Dendrogram.

(Output) similarity matrix: (Default = 'Rege').
Name of file which contains REGE measure of regular equivalence described in
LOG FILE.

(Output) Partition Matrix: (Default = 'Regepart').
Name of file which contains a partition indicator matrix corresponding to the
single link hierarchical clustering displayed in the LOG FILE. A value of k in a
row labeled i and column j means that vertex j is in partition k at level i. Vertex k
is always a member of partition k and is a representative label for the group. This
matrix is not displayed in the LOG FILE.

LOG FILE Single link hierarchical clustering dendrogram (or tree diagram) of the regular
similarity measure. The level at which any pair of actors are aggregated is the
point at which both can be reached by tracing from the start to the actors from
right to left. The diagram can be printed or saved. Parts of the diagram can be
viewed by moving the mouse to the split point in a tree diagram or the beginning
of a line in the dendrogram and clicking. The first click will highlight a portion of
the diagram and the second click will display just the highlighted portion. To
return to the original, right click on the mouse. There is also a simple zoom
facility: simply change the values and then press enter. If the labels need to be
edited (particularly the scale labels) then you should take the partition indicator
matrix into the spreadsheet editor, remove or reduce the labels, and then submit
the edited data to Tools>Dendrogram>Draw.

Behind the dendrogram is an alternative cluster diagram. The columns have been
rearranged and labeled. A '·' in row labeled i column label j indicates that vertex
j is in a singleton cluster at level i. An 'X' indicates that vertex j is in a non-trivial
cluster at level i; all other members of j's cluster are found by tracing along the
row labeled i in both directions from column j until a space is encountered in
each direction. The column labels corresponding to an 'X' which are connected
to j's X are all members of j's cluster at level i.

An actor by actor REGE similarity matrix. Values vary between 0 and 100. A
value of 100 indicates strict regular equivalence.

TIMING O(N^5).

COMMENTS The values obtained for non-equivalent vertices are not robust measures of
equivalence. The number of iterations affects these values: there is little
correlation between the values from one iteration to the next, even at the rank
order level. This situation is improved if the number of iterations is increased.

For these reasons users with binary or nominal data are advised to use
CATEGORICAL REGE.

REFERENCES White D R (1984). REGE: A regular graph equivalence algorithm for computing
role distances prior to block modelling. Unpublished manuscript. University of
California, Irvine.

White D R and Reitz K P (1983). Graph and semi-group homomorphisms on
networks of relations. Social Networks 6, 193-235.
NETWORKS > ROLES > REGULAR EQUIVALENCE > CATREGE
PURPOSE Computes a single link hierarchical clustering and a measure of regular
equivalence for binary or nominal data using categorical REGE.

DESCRIPTION Two actors are regularly equivalent if they are equally related to equivalent
others. Nominal data is any integer valued adjacency matrix in which the value
represents a coding of the relationship in terms of a category.

For example, we could use 1 to represent close friend, 2 to represent friend and 3
to represent works with. The values 1, 2 and 3 DO NOT measure the strength of
the relationship, they simply refer to the categories.

Two actors are regularly equivalent for nominal data if in addition to the normal
regularity condition they relate to equivalent others in the same category.

The categorical REGE algorithm searches for matches in successive
neighborhoods. For binary data in the first iteration, vertices are classified as
sinks, sources or repeaters. At the next iteration the neighborhoods of all the
vertices are considered; two vertices would be classified differently if one
neighborhood contained a representative of one of these categories while the
other did not. The next iteration classifies the vertices in terms of the
neighborhood's classification in the previous iteration. The process continues
until stable (a maximum of n different categories are possible in a graph of n
vertices).

For nominal data the initial categories are included at the first iteration. The
process is easily extended to multiple relations.

From this procedure a similarity matrix can be formed with entries which give
the value of the iteration at which vertices were separated into different
categories.

Initially the procedure places all vertices in the same category; or into user
specified categories. Subsequent iterations split the groups into hierarchical
clusters.
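
The successive splitting and the resulting similarity matrix can be sketched as
below. This is an illustration, not UCINET's code: the function names are
hypothetical, a single relation is assumed, and the similarity entry is taken to
be the number of levels at which two vertices still share a category, which is
one reading of the description above. Starting from a single category, the first
pass separates sources, sinks, repeaters and isolates; later passes compare the
sets of (tie value, category) pairs found among in-neighbors and out-neighbors.

```python
def catrege_levels(adj):
    """Return the list of category vectors, one per level, until no category splits."""
    n = len(adj)
    cat = [0] * n                      # level 1: all vertices in one category
    levels = [cat[:]]
    while True:
        sig = []
        for i in range(n):
            out_s = frozenset((adj[i][j], cat[j])
                              for j in range(n) if j != i and adj[i][j])
            in_s = frozenset((adj[j][i], cat[j])
                             for j in range(n) if j != i and adj[j][i])
            sig.append((cat[i], out_s, in_s))
        ids = {}
        new = [ids.setdefault(s, len(ids)) for s in sig]
        if len(ids) == len(set(cat)):  # stable: no category was split
            return levels
        cat = new
        levels.append(cat[:])

def separation_matrix(levels):
    """Entry (i, j) counts the levels at which i and j share a category."""
    n = len(levels[0])
    return [[sum(lv[i] == lv[j] for lv in levels) for j in range(n)]
            for i in range(n)]

# Ties 1 -> 2, 1 -> 3 and 2 -> 4.
adj = [[0, 1, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
levels = catrege_levels(adj)
sim = separation_matrix(levels)
print(levels)
# [[0, 0, 0, 0], [0, 1, 2, 2], [0, 1, 2, 3]]
```

Vertices 3 and 4 are both sinks and stay together for two levels; they split at
the third because one receives from a source and the other from a repeater.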

PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Valued graph - integer
values. Multirelational.

Dataset with starting partition (if any):
A null return will initially place all vertices in a single cluster. For user specified
partition enter the name of a data file which contains a partition indicator matrix.
A partition indicator matrix has each row as a separate partition. Each row is of
the form (k1,k2,...,ki...) where ki assigns vertex i to partition ki so that (1 1 2 1 2)
assigns vertices 1, 2 and 4 to partition 1 and 3 and 5 to partition 2.

Convert data to geodesic distances: (Default = 'YES')
Yes performs the analysis on the geodesic distance matrix.
No uses the raw adjacencies.
Note that for undirected data the partitioning would be trivial; in this case the YES
option should be selected.

Diagram Type: (Default = 'Dendrogram')
The clustering diagram can either be a Tree Diagram or a Dendrogram.

(Output) Equivalence matrix: (Default = 'CATREGEQUIV').
Name of file which contains actor by actor regular similarity matrix described in
LOG FILE.

Output partition matrix: (Default = 'CATREGPART').
Name of file which contains a partition indicator matrix corresponding to the
single link hierarchical clustering displayed in the LOG FILE. A value of k in a
row labeled i and column j means that vertex j is in partition k at level i. Vertex k
is always a member of partition k and is a representative label for the group. This
matrix is not displayed in the LOG FILE.

LOG FILE Single link hierarchical clustering dendrogram (or tree diagram) of the regular
similarity measure. The level at which any pair of actors are aggregated is the
point at which both can be reached by tracing from the start to the actors from
right to left. Each level corresponds to an iteration; level 1 represents the initial
clustering specified in PARAMETERS. The top level gives strict regular
equivalence clusters. The higher the level the greater the degree of regular
equivalence. The diagram can be printed or saved. Parts of the diagram can be
viewed by moving the mouse to the split point in a tree diagram or the beginning
of a line in the dendrogram and clicking. The first click will highlight a portion of
the diagram and the second click will display just the highlighted portion. To
return to the original, right click on the mouse. There is also a simple zoom
facility: simply change the values and then press enter. If the labels need to be
edited (particularly the scale labels) then you should take the partition indicator
matrix into the spreadsheet editor, remove or reduce the labels, and then submit
the edited data to Tools>Dendrogram>Draw.

Behind the dendrogram is an alternative cluster diagram. The columns have been
rearranged and labeled. A '·' in row labeled i column label j indicates that vertex
j is in a singleton cluster at level i. An 'X' indicates that vertex j is in a non-trivial
cluster at level i; all other members of j's cluster are found by tracing along the
row labeled i in both directions from column j until a space is encountered in
each direction. The column labels corresponding to an 'X' which are connected
to j's X are all members of j's cluster at level i.

An actor by actor regular similarity matrix. A k in row i column j means that actors
i and j were separated at level k, provided k is less than the value on the diagonal.
If k is equal to the value on the diagonal then i and j are regularly equivalent.

TIMING O(N^3).

COMMENTS None.

REFERENCES Borgatti S P and Everett M G (1989). The class of all regular equivalences:
algebraic structure and computation. Social Networks 11, 65-88.

Borgatti S P and Everett M G (1993). Two algorithms for computing regular
equivalence. Social Networks 15, 361-376.
NETWORKS > ROLES > MAX REGULAR > OPTIMIZATION
PURPOSE Optimizes a cost function which measures the degree to which a partition forms
regularly equivalent sets for binary data using a tabu search method.

DESCRIPTION Two actors are regularly equivalent if they are equally related to equivalent
others. Given a partition of a network then the partition divides the adjacency
matrix into matrix blocks. In a binary matrix the partition is regular if each
block either contains all zeros (a zero block) or at least one 1 in every row and
every column (a one-block). A measure of the extent to which a partition is
regular is therefore given by the minimum number of changes required to the
elements of the adjacency matrix to satisfy this criterion.

This cost function assumes that any block above a certain specified density will
be changed to a one-block and below this density to a zero-block. The routine
attempts to optimize this cost function to try and find the best partition of the
vertices into a specified number of blocks.
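
The cost described above can be sketched in a few lines of Python. This is an illustration only, not UCINET's own code. The function names are hypothetical, diagonal cells are counted like any other cell, and the repair count for a one-block (the larger of the number of empty rows and empty columns) is an assumption about how "minimum number of changes" is measured.

```python
def block_cost(block, cutoff=0.01):
    """Cell changes needed to turn one block into a zero-block or a one-block."""
    cells = [v for row in block for v in row]
    if not cells:
        return 0
    ones = sum(1 for v in cells if v)
    if ones / len(cells) <= cutoff:
        # Treated as a zero-block: every 1 has to be removed.
        return ones
    # Treated as a one-block: every row and column needs at least one 1.
    # Assumption: one added 1 can repair an empty row and an empty column at once.
    empty_rows = sum(1 for row in block if not any(row))
    empty_cols = sum(1 for col in zip(*block) if not any(col))
    return max(empty_rows, empty_cols)


def regularity_cost(adj, partition, cutoff=0.01):
    """Total cost over all blocks induced by a partition indicator vector."""
    labels = sorted(set(partition))
    members = {b: [i for i, p in enumerate(partition) if p == b] for b in labels}
    total = 0
    for a in labels:
        for b in labels:
            block = [[adj[i][j] for j in members[b]] for i in members[a]]
            total += block_cost(block, cutoff)
    return total


# Vertices 1 and 2 each send one tie into the block holding vertices 3 and 4.
adj = [[0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0],
       [0, 0, 0, 0]]
print(regularity_cost(adj, [1, 1, 2, 2]))
```

A cost of zero means the partition is exactly regular; removing the tie from vertex 2 to vertex 4 in this example leaves one empty row and one empty column in the top-right block.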

PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Digraph.

Number of blocks: (Default = 2).


Number of groups or blocks into which the vertices are to be assigned.

Are diagonal values valid: (Default = NO).


Whether diagonals are to be included in cost function.

Maximum # of iterations in a series: (Default = max(2,n/3)).


The algorithm starts from an arbitrary partition and attempts to decrease the cost
by taking the steepest descent. If the cost cannot be reduced then the algorithm
continues its search in the neighborhood of the current partition.

This search direction is a mildest ascent direction and from there new search
directions are explored. This exploration only continues for a fixed number of
iterations in a series. If no improvement is made after the fixed number of
iterations the algorithm terminates with the current minimum. Increasing the
parameter gives a more exhaustive and therefore slower search.

Length of time in penalty box: (Default = 10).


If the algorithm makes an ascending step then it is possible that the best possible
descending step is the reverse of the direction just taken. This parameter
prohibits a move along the reverse direction for a set number of steps.

The larger the value the more difficult it will be to come back to a previously
explored local minimum, however it will also be more difficult to explore the
vicinity of that minimum.

The default of 10 has been shown experimentally to be the most useful.

Number of random starts: (Default = 10 - 2logn).


The whole procedure is repeated, each time from a different initial partition. The
best of the resulting solutions is then selected as the minimum.

Random Number Seed:


The random number seed generates the initial partition. UCINET generates a
different random number as default each time it is run. This number should be
changed if the user wishes to repeat the analysis with different initial
configurations. The range is 1 to 32000.

Cut off value for zero blocks: (Default = 0.010).


In evaluation of cost, blocks with density equal to or below this value will be
scored against a zero-block, and blocks above this value against a one-block.

Output Partition Dataset: (Default = 'RBMPart').


Name of output file which contains a partition indicator vector. This vector has
the form (k1,k2,...ki...) where ki assigns vertex i to block ki, so that (1 1 2 1 2)
assigns vertices 1, 2 and 4 to block 1 and, 3 and 5 to block 2. This vector is not
displayed in the LOG FILE.

Output Sets Dataset: (Default = 'RBMSets')


Name of the output file to contain the block by actor incidence matrix.

LOG FILE The value of the cost function or fit. A value of zero represents exact regular
equivalence.

List of blocks. Each block is labeled and is specified by the vertices it contains.

The blocked adjacency matrix. The rows and columns of the original adjacency
matrix are permuted into blocks. The adjacency matrix is displayed in terms of
the matrix blocks it contains.

TIMING Each iteration of the tabu search algorithm is O(N^2).

COMMENTS Care should be taken when using this routine.

The algorithm seeks to find the minimum of the cost function. Even if successful
this result may still have a high value in which case the blocking may not
conform very closely to regular equivalence.

In addition there may be a number of alternative partitions which also produce
the minimum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local minimum and does not
locate the desired global minimum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into the
reported blocks.
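
The search strategy described under PARAMETERS (steepest descent, mildest ascent when no descent exists, a penalty box for reversed moves, and a fixed number of non-improving iterations) can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the routine UCINET runs: the function name, the single-vertex move, and the argument names are all hypothetical, and any cost function over partition vectors can be passed in.

```python
import random


def tabu_search(cost, n_items, n_blocks, max_series=10, tenure=10, seed=1):
    """Minimise cost(partition) by moving one item at a time between blocks."""
    rng = random.Random(seed)
    current = [rng.randrange(n_blocks) for _ in range(n_items)]
    best, best_cost = current[:], cost(current)
    tabu = {}  # (item, block) -> last step at which that move is still forbidden
    step = since_improvement = 0
    while since_improvement < max_series:
        step += 1
        candidates = []
        for i in range(n_items):
            for b in range(n_blocks):
                if b == current[i] or tabu.get((i, b), 0) >= step:
                    continue
                trial = current[:]
                trial[i] = b
                candidates.append((cost(trial), i, b))
        if not candidates:
            break
        # Best available move: steepest descent, or mildest ascent if none descends.
        new_cost, i, b = min(candidates)
        # Penalty box: item i may not return to its old block for `tenure` steps.
        tabu[(i, current[i])] = step + tenure
        current[i] = b
        if new_cost < best_cost:
            best, best_cost = current[:], new_cost
            since_improvement = 0
        else:
            since_improvement += 1
    return best, best_cost


# Toy cost: number of items that disagree with a known target partition.
target = [0, 1, 0, 1, 1, 0]
solution, value = tabu_search(
    lambda p: sum(a != b for a, b in zip(p, target)),
    n_items=6, n_blocks=2, max_series=5, tenure=3, seed=7)
print(solution, value)
```

Running the function with several different seeds and comparing the returned partitions is one way to carry out the robustness check recommended above.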

REFERENCES Glover F (1989). Tabu Search - Part I. ORSA Journal on Computing 1, 190-
206.

Glover F (1990). Tabu Search - Part II. ORSA Journal on Computing 2, 4-32.

Batagelj V, Doreian P and Ferligoj A (1992). An optimization approach to
regular equivalence. Social Networks 14, 121-135.
NETWORKS > PROPERTIES > TRANSITIVITY
PURPOSE Gives the density of transitive triples in a network. For valued networks the
density of transitive triples defined more generally is given.

DESCRIPTION Three vertices u,v,w taken from a directed graph are transitive if whenever vertex
u is connected to vertex v and vertex v is connected to vertex w then vertex u is
connected to vertex w. The density of transitive triples is the number of triples
which are transitive divided by the number of paths of length 2, i.e. the number
of triples which have the potential to be transitive.

This definition can be extended to valued data. Strong transitivity occurs only if
the final edge is stronger than the two in the original path. This can be relaxed so
that the user can define the minimum value of the final edge (weak transitivity).
For distances transitivity can be defined in terms of the number of triples
satisfying the triangle inequality, and for probabilities in terms of the product of
probabilities of the edges.
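
As an illustration of these definitions, the sketch below counts ordered triples in which i-j-k is a path and checks each against one of the rules listed under "Type of transitivity". It is a hedged example rather than UCINET's implementation: the names are hypothetical, a path is assumed to mean that both xij and xjk are non-zero, and the WEAK rule (which needs the user-specified s and w) is left out.

```python
from itertools import permutations

# Each rule receives (xij, xjk, xik) and reports whether the triple is transitive.
RULES = {
    "adjacency": lambda xij, xjk, xik: xik != 0,
    "strong": lambda xij, xjk, xik: xik >= min(xij, xjk),
    "euclidean": lambda xij, xjk, xik: xik <= xij + xjk,
    "stochastic": lambda xij, xjk, xik: xik >= xij * xjk,
}


def transitivity(x, rule="adjacency"):
    """Return (transitive triples, triples in which i-j-k is a path)."""
    is_transitive = RULES[rule]
    paths = transitive = 0
    for i, j, k in permutations(range(len(x)), 3):
        if x[i][j] and x[j][k]:
            paths += 1
            if is_transitive(x[i][j], x[j][k], x[i][k]):
                transitive += 1
    return transitive, paths


binary = [[0, 1, 1],
          [0, 0, 1],
          [0, 0, 0]]
done, possible = transitivity(binary)
print(done, possible, done / possible)
```

The density of transitive triples is the first count divided by the second; it is undefined for a network that contains no path of length 2.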

PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Valued graph.

Type of transitivity: (Default = ADJACENCY)


Choices are:

Adjacency - A triple xik,xij,xjk is transitive if xik is 1 whenever xij and xjk are both
1.

Strong - A triple xik,xij,xjk is transitive if xik ≥ min(xij,xjk).

Weak - A triple xik,xij,xjk is transitive if whenever min(xij,xjk) ≥ s then xik ≥ w for
user-specified s and w. s is the strong tie value and w the weak tie value.

Euclidean - A triple xik,xij,xjk is transitive if xik ≤ xij + xjk.

Stochastic - A triple xik,xij,xjk is transitive if xik ≥ xij * xjk.

Min value of Strong tie:


Value of s for WEAK option described above.

Min value of Weak tie:


Value of w for WEAK option described above.

Output Dataset: (Default = 'Transitivity')


Name of file which will contain value of density of transitivity triples, where the
density is the number of transitive triples divided by the number of triples.

LOG FILE Number of non-vacuous transitive triples, number of triples, and number of
triples in which i-j-k is a path, then the number of non-vacuous transitive triples
expressed as a percentage of the number of triples and of the number of triples in
which i-j-k is a path.
TIMING O(N^3).

COMMENTS For valued data the following choices are recommended:

Similarities - STRONG, WEAK.


Distances, costs, dissimilarities - EUCLIDEAN.
Probabilities, correlations - STOCHASTIC.

REFERENCES None.
NETWORK > PROPERTIES > DENSITY
PURPOSE Calculate the density of a network or matrix.

DESCRIPTION The density of a binary network is the total number of ties divided by the total
number of possible ties. For a valued network it is the total of all values divided
by the number of possible ties. In this case the density gives the average value.
The routine will perform the analysis for non-square matrices.
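
A minimal sketch of this calculation, assuming the matrix is held as a list of rows; the function name and the use_diagonal flag are illustrative and simply mirror the "Utilize diagonal values" parameter below.

```python
def density(x, use_diagonal=False):
    """Sum of all tie values divided by the number of possible ties."""
    n_rows, n_cols = len(x), len(x[0])
    square = n_rows == n_cols
    total = cells = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # For square matrices the diagonal is skipped unless requested.
            if square and not use_diagonal and i == j:
                continue
            total += x[i][j]
            cells += 1
    return total / cells


binary = [[0, 1, 1],
          [1, 0, 0],
          [0, 0, 0]]
valued = [[0, 4],
          [2, 0]]
print(density(binary), density(valued))
```

For the binary matrix the result is the proportion of possible ties that are present; for the valued matrix it is the average tie value.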

PARAMETERS
Input dataset
Name of file containing dataset to be analyzed. Data type: Valued graph.

Utilize diagonal values: (Default = NO)


For square matrices NO means that diagonal entries are ignored.

Row partitioning / Blocking (if any):


The name of a UCINET dataset. To partition the rows of the data matrix into
blocks, specify a blocking vector by giving the dataset name, a dimension and an
integer value. For example, to use the second row of a dataset called ATTRIB,
enter "ATTRIB ROW 2". The program will then read the second row of ATTRIB
and use that information to sort the rows of the matrix. All rows with identical
values on the criterion vector (i.e. the second row of attrib) will be placed in the
same block of the matrix. Densities will then be computed separately for each
block.

Column partitioning / Blocking (if any):


The name of a UCINET dataset. To partition the columns of the data matrix into
blocks, specify a blocking vector by giving the dataset name, a dimension and an
integer value. For example, to use the second row of a dataset called ATTRIB,
enter "ATTRIB ROW 2". The program will then read the second row of ATTRIB
and use that information to sort the columns of the matrix. All columns with identical
values on the criterion vector (i.e. the second row of ATTRIB) will be placed in the
same block of the matrix. Densities will then be computed separately for each
block.

Output densities: (Default = 'DENSITY')


Name of data file which will contain density.

Output standard deviations: (Default = 'DENSITYSD')


Name of data file which will contain the standard deviations.

Output pre-image: (Default = 'DENSITYMODEL')


Name of data file which will contain the pre-image matrix.

LOG FILE Density value for each relation.

TIMING O(N^2)

COMMENTS None.

REFERENCES None.
NETWORK>PROPERTIES>E-I INDEX

PURPOSE Calculate the E-I index of a partition of a network and perform a permutation
test to evaluate its significance.

DESCRIPTION Given a partition of a network into a number of mutually exclusive groups then
the E-I index is the number of ties external to the groups minus the number of
ties that are internal to the group divided by the total number of ties. This value
can range from 1 to -1, but for a given network density and group sizes its range
may be restricted and so it can be rescaled. The index is also calculated for each
group and for each individual actor. A permutation test is performed to see
whether the network E-I index is significantly higher or lower than expected.
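
The index and the permutation test can be sketched as follows. This is an illustration under stated assumptions rather than the program's own procedure: function names are hypothetical, a tie is taken to be any non-zero cell, and the test reshuffles the group labels over the actors while leaving the network fixed.

```python
import random


def ei_index(x, groups, use_diagonal=False):
    """(external ties - internal ties) / total ties for a given grouping."""
    external = internal = 0
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            if not value or (i == j and not use_diagonal):
                continue
            if groups[i] == groups[j]:
                internal += 1
            else:
                external += 1
    return (external - internal) / (external + internal)


def permutation_test(x, groups, n_perms=1000, seed=1):
    """Share of random relabellings with an index >= and <= the observed one."""
    rng = random.Random(seed)
    observed = ei_index(x, groups)
    shuffled = list(groups)
    at_least = at_most = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        value = ei_index(x, shuffled)
        at_least += value >= observed
        at_most += value <= observed
    return observed, at_least / n_perms, at_most / n_perms


# Two pairs of actors, tied within each pair, plus one tie between the pairs.
x = [[0, 1, 1, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 1, 0]]
print(permutation_test(x, [1, 1, 2, 2], n_perms=200))
```

The sketch raises a division error for a network with no ties at all, in which case the index is undefined.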

PARAMETERS
Input Dataset
Name of UCINET dataset to be analyzed. Data type: Valued Graph.

Attribute
The name of a UCINET dataset that contains a partition of the actors. To
partition the data matrix into groups specify a vector by giving the dataset name,
a dimension (either row or column) and an integer value. For example, to use the
second row of a dataset called ATTRIB, enter "ATTRIB ROW 2". The program
will then read the second row of ATTRIB and use that information to define the
groups. All actors with identical values on the criterion vector (i.e. the second
row of attrib) will be placed in the same group.

Number of random perms: (Default= 10000)


Number of permutations used in the permutation test.

Diagonal Values Valid (Default = 'NO')


Whether to include the diagonal values.

Random Number Seed


The random number seed sets off the random permutations. UCINET generates
a different random number as default each time it is run. This number should be
changed if the user wishes to repeat an analysis. The range is 1 to 32000.

Output Dataset (Default = IndE-I)


Name of UCINET file that contains the E-I index for each individual actor.

LOG FILE Recoding of the attribute vector used to partition the dataset followed by a
blocked density matrix corresponding to the groups.

A table which gives the whole network results. The first column gives the
frequencies in the observed data, and the second gives these frequencies as a
percentage of the total number of ties in the data. The third column gives the
maximum possible given the group sizes. The final column, headed density, gives
the observed divided by the maximum possible for the internal and external ties,
with the final entry in the E-I column giving the value of the E-I index if all the
observed ties had been evenly spread within and between the groups, i.e. the
expected value. The important values from the table are then reproduced together
with the rescaled E-I index.

The results of the permutation test are presented in a table. The observed values
are repeated in column 1; the next four columns give the minimum, mean, maximum
and standard deviation derived from the permutation test. This is followed by the
number of times the random test obtains a value greater than or equal to the
observed and less than or equal to the observed. These are expressed as
probabilities and can be used as p values.

A table with the group level ties and E-I index.

Finally a table with the individual ties and E-I index.

TIMING O(N)

COMMENTS None

REFERENCES Krackhardt, David and Robert N. Stern (1988). Informal networks and
organizational crises: an experimental simulation. Social Psychology Quarterly
51(2), 123-140.
NETWORK>PROPERTIES>CLUSTERING COEFFICIENT

PURPOSE Calculate the clustering coefficient of every actor and the clustering and
weighted clustering coefficient of the whole network.

DESCRIPTION The clustering coefficient of an actor is the density of its open neighborhood. The
overall clustering coefficient is the mean of the clustering coefficient of all the
actors. The weighted overall clustering coefficient is the weighted mean of the
clustering coefficient of all the actors each one weighted by its degree. This last
figure is exactly the same as the transitivity index of each transitive triple
expressed as a percentage of the triples in which there is a path from i to j. See
NETWORKS>PROPERTIES>TRANSITIVITY.
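
The actor-level coefficient and the unweighted overall coefficient can be sketched as below. The names are hypothetical, a neighbor is assumed to be any actor tied to the focal actor in either direction, and actors with fewer than two neighbors are skipped because their neighborhood density is undefined; UCINET may treat these cases differently.

```python
def clustering_coefficients(adj):
    """Density of each actor's open neighborhood (the actor itself excluded)."""
    n = len(adj)
    coefficients = {}
    for v in range(n):
        neighbors = [u for u in range(n) if u != v and (adj[v][u] or adj[u][v])]
        k = len(neighbors)
        if k < 2:
            continue  # density of the neighborhood is undefined
        ties = sum(1 for a in neighbors for b in neighbors if a != b and adj[a][b])
        coefficients[v] = ties / (k * (k - 1))
    return coefficients


def overall_clustering(adj):
    """Unweighted mean of the actor-level coefficients."""
    coefficients = clustering_coefficients(adj)
    return sum(coefficients.values()) / len(coefficients)


# A triangle (actors 0, 1, 2) with one pendant actor attached to actor 2.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 1],
       [0, 0, 1, 0]]
print(clustering_coefficients(adj), overall_clustering(adj))
```
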
PARAMETERS
Input network dataset:
Name of file containing dataset to be analyzed. Data type: Digraph.

(output) Node-level coefficients (Default = 'ClusteringCoefficients')


Name of UCINET file that will contain the clustering coefficients for each actor
together with their degree.

LOG FILE The overall clustering coefficient and the weighted overall clustering coefficient.
A table with the actor level clustering coefficient together with their degree.

TIMING O(N^2)

COMMENTS None.

REFERENCES Watts D J (1999) Small worlds. Princeton University Press, Princeton, New
Jersey.
2-MODE>CATEGORICAL CORE/PERIPHERY
PURPOSE Uses a genetic algorithm to fit a core/periphery model to two mode data.

DESCRIPTION Simultaneously fits a core/periphery model to the data network, and identifies
which actors belong in the core and which belong in the periphery and which
events belong in the core and which events belong in the periphery. The rows and
columns are partitioned independently. The fit is simply the correlation between
the data matrix and an idealized structure matrix in which there is a one in the
core block interactions and a zero in the peripheral block interactions.
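
The fit measure can be illustrated with a short sketch that builds the idealized structure matrix and correlates it with the data. This is not UCINET's code. The names are hypothetical, partitions use 1 for core and 2 for periphery as in the output vectors described below, and cells linking core rows to periphery columns (or the reverse) are assumed to be 0 in the ideal matrix, a point the description leaves open.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def core_periphery_fit(x, row_part, col_part):
    """Correlation between the data and an ideal core/periphery pattern."""
    data, ideal = [], []
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            data.append(value)
            # Assumption: only the core-by-core block is 1 in the ideal matrix.
            ideal.append(1 if row_part[i] == 1 and col_part[j] == 1 else 0)
    return pearson(data, ideal)


# Three actors by three events; the first two of each form a perfect core.
x = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 0]]
print(core_periphery_fit(x, [1, 1, 2], [1, 1, 2]))
```

A search routine such as the genetic algorithm would evaluate this fit for many candidate pairs of row and column partitions and keep the best.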

PARAMETERS
Input dataset:
Name of file containing two-mode network to be analyzed. Data type: Matrix.

Row Partition: (Default = 'rowCPpart')


Name of output file which contains a cluster indicator vector for the row
partition. This vector has the form (k1,k2,...ki...) where ki assigns vertex i to
cluster ki and ki is either 1 or 2 where 1 is the core and 2 is the periphery, so that
(1 1 2 1 2) assigns vertices 1, 2 and 4 to the core, and 3 and 5 to the periphery.
This vector is not displayed at output.

Column Partition: (Default = 'colCPpart')


Name of output file which contains a cluster indicator vector for the column
partition. This vector has the form (k1,k2,...ki...) where ki assigns vertex i to
cluster ki and ki is either 1 or 2 where 1 is the core and 2 is the periphery, so that
(1 1 2 1 2) assigns vertices 1, 2 and 4 to the core, and 3 and 5 to the periphery.
This vector is not displayed at output.

LOG FILE The starting and the final correlation of the ideal structure and the permuted
incidence matrix. A blocked incidence matrix dividing the actors and events
independently into the core and periphery.

TIMING O(N^2) per iteration.

COMMENTS Care should be taken when using this routine.

The algorithm seeks to find the maximum of the fit function. Even if successful
this result may still be a low value in which case the partition may not represent a
core/periphery model.

In addition there may be a number of alternative partitions which also produce
the maximum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local maximum and does not
locate the desired global maximum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into a
core/periphery structure.

REFERENCES Borgatti S P and Everett M G (1999). Models of core/periphery structures.
Social Networks 21, 375-395.

Borgatti S P and Everett M G (1997). Network analysis of 2-mode data. Social
Networks 19, 243-269.
2-MODE > FACTIONS

PURPOSE Uses a genetic algorithm to simultaneously cluster rows and columns of a
2-mode matrix.

DESCRIPTION Clusters rows and columns of a 2-mode matrix X by finding a pair of
corresponding 2-class partitions such that if row i and column j are in
corresponding classes, then we expect xij to be a large value. In contrast, if i and
j are not in corresponding classes, then we expect xij to be small. The fit is
simply the correlation between the data matrix and an idealized structure matrix
in which there are large values within classes and small values between classes.

PARAMETERS
Input dataset:
Name of file containing two-mode network to be analyzed. Data type: Matrix.

Row Partition: (Default = 'rowfactionspart')


Name of output file which contains a cluster indicator vector for the row
partition. This vector has the form (k1,k2,...ki...) where ki assigns vertex i to
cluster ki and ki is either 1 or 2. This vector is not displayed at output.

Column Partition: (Default = 'colfactionspart')


Name of output file which contains a cluster indicator vector for the column
partition. This vector has the form (k1,k2,...ki...) where ki assigns vertex i to
cluster ki and ki is either 1 or 2. This vector is not displayed at output.

LOG FILE The starting and the final correlation of the ideal structure and the permuted
incidence matrix. A blocked incidence matrix dividing the rows and columns
independently into two clusters each.

TIMING O(N^2) per iteration.

COMMENTS Care should be taken when using this routine.

The algorithm seeks to find the maximum of the fit function. Even if successful
this result may still be a low value in which case the partition may not have
found cohesive clusters.

In addition there may be a number of alternative partitions which also produce
the maximum value; the algorithm does not search for additional solutions.
Finally it is possible that the routine terminates at a local maximum and does not
locate the desired global maximum.

To test the robustness of the solution the algorithm should be run a number of
times from different starting configurations. If there is good agreement between
these results then this is a sign that there is a clear split of the data into
subgroups.

See Factions.

REFERENCES Borgatti S P and Everett M G (1997). Network analysis of 2-mode data. Social
Networks 19, 243-269.
DL LANGUAGE
The DL protocol is specified below. Each command is given, followed by
a description of its usage. Examples of importing are given in the UCINET
User's Guide.

DL
DESCRIPTION Identifies the file as a Data Language file. This is a required command.

SYNTAX DL

COMMENTS Must be the first word in the data file.

N

DESCRIPTION Specifies the number of rows and columns in a matrix.

SYNTAX N = <integer>

COMMENTS Should be placed before any phrases that can only be interpreted if the number
of rows or columns is already known. For example, it should be placed before
any command regarding labels.

NR

DESCRIPTION Specifies the number of rows in a matrix.

SYNTAX NR = <integer>

COMMENTS Should be placed before any commands that depend on the number of rows,
such as the ROW LABELS command.

NM

DESCRIPTION Specifies the number of matrices in a dataset.

SYNTAX NM = <integer>

COMMENTS Should be placed before any commands that depend on the number of matrices,
such as the MATRIX LABELS command.
ROW LABELS:

DESCRIPTION Indicates the start of a series of row labels. The labels may be up to 18
characters in length (if longer they are truncated). They must be separated by
spaces, carriage returns, equal signs or commas. Labels with embedded spaces
are not advisable, but can be entered by surrounding the label in quotes (e.g.,
"Humpty Dumpty"). Labels are automatically converted to uppercase.

SYNTAX ROW LABELS:

COMMENTS Must not precede a dimension command like N or NR.

COLUMN LABELS:

DESCRIPTION Indicates the start of a series of column labels. The labels may be up to 18
characters in length (if longer they are truncated). They must be separated by
spaces, carriage returns, equal signs or commas. Labels with embedded spaces
are not advisable, but can be entered by surrounding the label in quotes (e.g.,
"Humpty Dumpty"). Labels are automatically converted to uppercase.

SYNTAX COLUMN LABELS:

COMMENTS Must not precede a dimension command like N or NC.

LABELS:

DESCRIPTION Indicates the start of a series of labels applicable to both the rows and the
columns. Warning: The matrix must be square! The labels may be up to 18
characters in length (if longer they are truncated). They must be separated by
spaces, carriage returns, equal signs or commas. Labels with embedded spaces
are not advisable, but can be entered by surrounding the label in quotes (e.g.,
"Humpty Dumpty"). Labels are automatically converted to uppercase.

SYNTAX LABELS:

COMMENTS Must not precede a dimension command like N.

MATRIX LABELS:

DESCRIPTION Signals the start of a series of matrix labels. The labels may be up to 19
characters in length (if longer they are truncated). They must be separated by
spaces, carriage returns, equal signs or commas. Labels with embedded spaces
are NOT advisable, but can be entered by surrounding the label in quotes (e.g.,
"Humpty Dumpty"). Labels are automatically converted to uppercase.

SYNTAX MATRIX LABELS:

COMMENTS Must not precede a dimension command like NM.

EMBEDDED

DESCRIPTION If present, this keyword always follows the word LABELS, as in ROW
LABELS EMBEDDED or LABELS EMBEDDED. It indicates that dimension
labels are found embedded in the data itself. For example in the case of ROW
LABELS EMBEDDED, it means the first item (up to a blank or comma) in
every line of the data is a row label. In the case of COL LABELS
EMBEDDED, it indicates that the first line of data should be treated as column
labels.

SYNTAX ROW LABELS EMBEDDED
COLUMN LABELS EMBEDDED
MATRIX LABELS EMBEDDED
LABELS EMBEDDED

COMMENTS None.

FORMAT

DESCRIPTION Identifies the layout of the data. The following formats are available:

FULLMATRIX. Indicates the data are in the form of a matrix. This is the default
format. Example (with DIAGONAL = PRESENT):

2 1 1 0
1 2 0 1
0 0 2 0
1 0 0 2

UPPERHALF. The data consist of the values xij where j > i or j ≥ i. Only the
values in the upper right triangle of a square matrix are included. The diagonal
may or may not be included, depending on the value of the DIAGONAL
parameter. Example (with DIAGONAL = PRESENT):

2 1 1 0
2 0 1
2 0
2

LOWERHALF. The data consist of the values xij where j < i or j ≤ i. Only the
values in the lower left triangle of a square matrix are included. The diagonal
may or may not be included, depending on the value of the DIAGONAL
parameter. Example (with DIAGONAL = PRESENT):

2
1 2
0 0 2
1 0 0 2

NODELIST1. This is used to read 1/0 matrices only. Each line of data consists of
a row number (call it i) followed by a list of column numbers (call each one j)
such that xij = 1. For example, the following matrix

1 1 1 0
1 1 0 1
0 0 0 0
1 0 0 1

is coded this way:

1 3 2 1
4 1 4
2 2 4 1
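
To make the decoding rule concrete, here is a small sketch of how NODELIST1 data lines map back to a 1/0 matrix. The function name is hypothetical and the sketch handles only the data lines, with the row and column numbers written as separate tokens; it is not UCINET's importer.

```python
def read_nodelist1(lines, n):
    """Build an n-by-n 1/0 matrix from 'row col col ...' lines."""
    matrix = [[0] * n for _ in range(n)]
    for line in lines:
        numbers = [int(token) for token in line.replace(",", " ").split()]
        if not numbers:
            continue
        row, columns = numbers[0], numbers[1:]
        for column in columns:
            matrix[row - 1][column - 1] = 1  # indices in the file start at 1
    return matrix


# Rows may appear in any order, and rows without ties (row 3) may be left out.
for row in read_nodelist1(["1 3 2 1", "4 1 4", "2 2 4 1"], 4):
    print(*row)
```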

NODELIST1B. This is used to read 1/0 matrices only. Each line of data
corresponds to a matrix row (call it i). The first number on the line is the number
of non-zero cells in that row. This is followed by a list of column numbers (call
each one j) such that xij = 1. For example, the following matrix

1 1 1 0
1 1 0 1
0 0 0 0
1 0 0 1

is coded this way:

3 1 2 3
3 1 2 4
0
2 1 4

Note that rows must appear in numerical order, and none may be skipped (unlike
the NODELIST1 format).

EDGELIST1. This format is used to read in data forming a matrix in which the
rows and columns refer to the same kinds of objects (e.g., an illness-by-illness
proximity matrix, or a person-by-person network). The 1-mode matrix X is built
from pairs of indices (a row and a column indicator). Pairs are typed one to a
line, with indices separated by spaces or commas. The presence of a pair i,j
indicates that there is a link from i to j, which is to say a non-zero value in xij.
Optionally, the pair may be followed by a value representing an attribute of the
link, such as its strength or quality. If no value is present, it is assumed to be 1.0.
If a pair is omitted altogether, it is assigned a value of 0.0. For example, the
following matrix,

0 0 5 3
0 0 0 0
0 0 0 0
0 1 0 0

is coded this way:

1 3 5.0
4 2
1 4 3.0

Node labels may be used instead of node numbers, as follows:

Amy Cathy 5
Denise Bonnie
Amy Denise 3

If the datafile includes a LABELS statement with the labels (Amy, Bonnie,
Cathy, Denise), in that order, the matrix will look like the matrix shown above.
However, if a LABELS statement is not present, then the program will assign
labels to rows/columns in the order in which they are encountered {Amy, Cathy,
Denise, Bonnie}. So the matrix will look like this:

0 5 3 0
0 0 0 0
0 0 0 1
0 0 0 0

If you do include labels as part of a LABELS statement, they must match the
labels in the data exactly. Otherwise, the labels in the data will be considered
additional nodes. Also, since the EDGELIST1 format automatically accepts
labels as part of the data, the LABELS=EMBEDDED statement is not necessary
(but doesn't hurt).
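
The way an edgelist is turned into a matrix, including the assignment of labels in order of first appearance, can be sketched as follows. The function name is hypothetical and the code is an illustration of the rules above, not UCINET's importer; in particular it does not convert labels to uppercase or truncate them.

```python
def read_edgelist1(lines, labels=None):
    """Return (label order, matrix) for 'source target [value]' lines."""
    order = list(labels) if labels else []
    index = {label: k for k, label in enumerate(order)}
    ties = []
    for line in lines:
        tokens = line.replace(",", " ").split()
        if len(tokens) < 2:
            continue
        source, target = tokens[0], tokens[1]
        value = float(tokens[2]) if len(tokens) > 2 else 1.0  # default tie value
        for label in (source, target):
            if label not in index:  # unseen labels become additional nodes
                index[label] = len(order)
                order.append(label)
        ties.append((index[source], index[target], value))
    n = len(order)
    matrix = [[0.0] * n for _ in range(n)]  # omitted pairs stay at 0.0
    for i, j, value in ties:
        matrix[i][j] = value
    return order, matrix


# The numeric example above, with node numbers 1 to 4 declared up front.
order, matrix = read_edgelist1(["1 3 5.0", "4 2", "1 4 3.0"],
                               labels=["1", "2", "3", "4"])
for row in matrix:
    print(*row)
```

Calling the function without the labels argument reproduces the "order in which they are encountered" behavior, so the same ties can land in different rows and columns.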

EDGELIST2. This is used to read in data forming a matrix in which the rows
and columns refer to different kinds of objects (e.g., illnesses and treatments).
The 2-mode matrix X is built from pairs of indices (a row and a column
indicator). Pairs are typed one to a line, with indices separated by spaces or
commas. The presence of a pair i,j indicates that there is a link from row i to
column j, which is to say a non-zero value in xij. Optionally, the pair may be
followed by a value representing an attribute of the link, such as its strength or
quality. If no value is present, it is assumed to be 1.0. If a pair is omitted
altogether, it is assigned a value of 0.0. For example, the following matrix,

6 4
3 5
7 9

is coded this way:

1 1 6
2 1 3
3 2 9
3 1 7
1 2 4
2 2 5

The row index is always given first, followed by the column index. Index labels
may be used instead of index numbers, as follows:

afghan size 6
beagle size 3
chow ferocity 9
chow size 7
afghan ferocity 4
beagle ferocity 5

For further details concerning labels, see the description of the EDGELIST1
format.

BLOCKMATRIX. This format is used to read highly structured matrices, such as
those representing simple models of real data. Values for blocks of adjacent cells
are given. For example, the matrix

2 1 1 1 1 0 0 0
1 2 1 1 1 0 0 0
1 1 2 1 1 0 0 0
1 1 1 2 1 0 0 0
1 1 1 1 2 0 0 0
0 0 0 0 0 2 1 1
0 0 0 0 0 1 2 1
0 0 0 0 0 1 1 2

is written like this:

rows 1 to 8
cols 1 to 8
value = 0
rows 1 to 5
cols 1 to 5
value = 1
rows 6 7 8
cols 6 to 8
value = 1
diagonal 0
value = 2

The first three lines of data assign a value of 0 to all cells in the matrix. The next
three lines isolate the top left quadrant of the matrix and assign all cells a value
of 1. The next three lines do the same for the bottom right quadrant. The last two
lines give a value of 2 to every cell along the main or 0th diagonal.

The keywords ROWS, COLUMNS, VALUE, and DIAGONAL may be
abbreviated to the first letter. Lists of row and column indices may use
conventions like ALL, FIRST <n>, and LAST <n>.
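
The effect of successive block assignments can be sketched in Python. The function and the tuple layout are hypothetical, chosen only for illustration. Later assignments overwrite earlier ones, the bottom-right block is taken to be rows and columns 6 to 8 (which is what the matrix shown above requires), and a positive diagonal number is assumed to mean a diagonal above the main one.

```python
def build_block_matrix(n, assignments):
    """Apply (kind, spec, value) assignments in order to an n-by-n matrix."""
    matrix = [[0] * n for _ in range(n)]
    for kind, spec, value in assignments:
        if kind == "block":
            rows, cols = spec
            for i in rows:
                for j in cols:
                    matrix[i - 1][j - 1] = value  # indices in the file start at 1
        elif kind == "diagonal":
            offset = spec  # 0 is the main diagonal (offset meaning is assumed)
            for i in range(n):
                j = i + offset
                if 0 <= j < n:
                    matrix[i][j] = value
    return matrix


steps = [
    ("block", (range(1, 9), range(1, 9)), 0),  # rows 1 to 8, cols 1 to 8
    ("block", (range(1, 6), range(1, 6)), 1),  # top left quadrant
    ("block", (range(6, 9), range(6, 9)), 1),  # bottom right quadrant
    ("diagonal", 0, 2),                        # main diagonal
]
for row in build_block_matrix(8, steps):
    print(*row)
```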

PARTITION. This format is used to read collections of partitions, such as pile
sorts. For example, the equivalence matrices

1 1 0 0
1 1 0 0
0 0 1 1
0 0 1 1

and

1 1 1 0
1 1 1 0
1 1 1 0
0 0 0 1

are coded as follows:

dl n=4 nm=2 format=partition
data:
1 2
3 4
#
1 2 3
4

The first line of data ("1 2") indicates that items 1 and 2 belong in the same class
or pile. The second line indicates that 3 and 4 belong together. The pound sign
(#) separates one partition from another.
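
The mapping from piles to equivalence matrices can be sketched as follows. The function name is hypothetical and only the data lines are handled, with item numbers written as separate tokens; this is an illustration, not UCINET's importer.

```python
def partitions_to_matrices(lines, n):
    """Turn pile-sort lines into one n-by-n equivalence matrix per partition."""
    matrices = []
    current = [[0] * n for _ in range(n)]
    for line in lines:
        tokens = line.split()
        if tokens == ["#"]:
            # The pound sign closes one partition and starts the next.
            matrices.append(current)
            current = [[0] * n for _ in range(n)]
            continue
        items = [int(token) for token in tokens]
        for i in items:
            for j in items:
                current[i - 1][j - 1] = 1  # items in the same pile are equivalent
    matrices.append(current)
    return matrices


first, second = partitions_to_matrices(["1 2", "3 4", "#", "1 2 3", "4"], 4)
for row in first:
    print(*row)
```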

SYNTAX FORMAT = <keyword>


where <keyword> is one of the following:

FULLMATRIX|FM
UPPERHALF|UH
LOWERHALF|LH
NODELIST1|NL1
NODELIST2|NL2
NODELIST1B|NL1B
EDGELIST1|EL1
EDGELIST2|EL2
BLOCKMATRIX|BM
PARTITION|PT|PS|PR

The vertical bar separates alternative spellings.

COMMENTS None.

DIAGONAL

DESCRIPTION For square matrices, indicates whether the main diagonal is present or absent.
The default is present. If absent, the program expects that diagonal values will
have been omitted from the file. Example of a 4-by-4 matrix with no diagonal:

  2 3 4
5   7 8
9 1   3
4 5 6

SYNTAX DIAGONAL = PRESENT|ABSENT

COMMENTS None.
BERNARD & KILLWORTH FRATERNITY

DATASET BKFRAT

DESCRIPTION Two 58x58 matrices:

BKFRAB symmetric, valued.


BKFRAC non-symmetric, valued (rankings).

BACKGROUND Bernard & Killworth, later with the help of Sailer, collected five sets of data on
human interactions in bounded groups and on the actors' ability to recall those
interactions. In each study they obtained measures of social interaction among
all actors, and ranking data based on the subjects' memory of those interactions.
The names of all cognitive (recall) matrices end in C, those of the behavioral
measures in B.

These data concern interactions among students living in a fraternity at a West
Virginia college. All subjects had been residents in the fraternity from three
months to three years. BKFRAB records the number of times a pair of subjects
were seen in conversation by an "unobtrusive" observer (who walked through
the public areas of the building every fifteen minutes, 21 hours a day, for five
days). BKFRAC contains rankings made by the subjects of how frequently they
interacted with other subjects in the observation week. Values range from 1,
representing no interaction, up to a maximum of 5.

REFERENCES Bernard H, Killworth P and Sailer L. (1980). Informant accuracy in social
network data IV. Social Networks, 2, 191-218.

Bernard H, Killworth P and Sailer L. (1982). Informant accuracy in social
network data V. Social Science Research, 11, 30-66.

Romney K and Weller S. (1984). Predicting informant accuracy from patterns of
recall among individuals. Social Networks, 6, 59-78.
BERNARD & KILLWORTH HAM RADIO
DATASET BKHAM

DESCRIPTION Two 44x44 matrices.

BKHAMB symmetric, valued.


BKHAMC non-symmetric, valued (rankings).

BACKGROUND Bernard & Killworth, later with the help of Sailer, collected five sets of data on
human interactions in bounded groups and on the actors' ability to recall those
interactions. In each study they obtained measures of social interaction among
all actors, and ranking data based on the subjects' memory of those interactions.
The names of all cognitive (recall) matrices end in C, those of the behavioral
measures in B.

BKHAMB records amateur HAM radio calls made over a one-month period, as
monitored by a voice-activated recording device. BKHAMC contains rankings
by the operators of how frequently they talked to other operators, judged
retrospectively at the end of the one-month sampling period. Values range from
0, meaning no interaction, up to a maximum of 9.

REFERENCES In addition to the references in the previous section, see:

Killworth P and Bernard H. (1976). Informant accuracy in social network data.
Human Organization, 35, 269-286.

Bernard H and Killworth P. (1977). Informant accuracy in social network data
II. Human Communication Research, 4, 3-18.

Killworth P and Bernard H. (1979). Informant accuracy in social network data
III. Social Networks, 2, 19-46.
BERNARD & KILLWORTH OFFICE

DATASET BKOFF

DESCRIPTION Two 40x40 matrices.

BKOFFB symmetric, valued.
BKOFFC non-symmetric, valued (rankings).

BACKGROUND Bernard & Killworth, later with the help of Sailer, collected five sets of data on
human interactions in bounded groups and on the actors' ability to recall those
interactions. In each study they obtained measures of social interaction among
all actors, and ranking data based on the subjects' memory of those interactions.
The names of all cognitive (recall) matrices end in C, those of the behavioral
measures in B.

These data concern interactions in a small business office, again recorded by an
"unobtrusive" observer. Observations were made as the observer patrolled a
fixed route through the office every fifteen minutes during two four-day periods.
BKOFFB contains the observed frequency of interactions; BKOFFC contains
rankings of interaction frequency as recalled by the employees over the two-
week period. The rankings go from 1 for the most frequent to 39 for the least
frequent.

REFERENCES See citations to the previous datasets.


BERNARD & KILLWORTH TECHNICAL

DATASET BKTEC

DESCRIPTION Two 34x34 matrices.

BKTECB symmetric, valued.
BKTECC non-symmetric, valued (rankings).

BACKGROUND Bernard & Killworth, later with the help of Sailer, collected five sets of data on
human interactions in bounded groups and on the actors' ability to recall those
interactions. In each study they obtained measures of social interaction among all
actors, and ranking data based on the subjects' memory of those interactions. The
names of all cognitive (recall) matrices end in C, those of the behavioral
measures in B.

These data concern interactions in a technical research group at a West Virginia
university. BKTECB contains a frequency record of interactions, made by an
observer every half-hour during one five-day work week. BKTECC contains the
personal rankings of the remembered frequency of interactions in the same
period. The rankings go from 1 for the most frequent up to 36 for the least
frequent.

REFERENCES See citations to the previous datasets.


DAVIS SOUTHERN CLUB WOMEN

DATASET DAVIS

DESCRIPTION One 18x14 matrix, binary.

BACKGROUND These data were collected by Davis et al in the 1930s. They represent observed
attendance at 14 social events by 18 Southern women. The result is a person-by-
event matrix: cell (i,j) is 1 if person i attended social event j, and 0 otherwise.
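
The coding rule above can be illustrated with a minimal Python sketch. It is not part of UCINET; the function name `affiliation_matrix` and the tiny attendance records are hypothetical, and indices are 0-based here purely for convenience.

```python
def affiliation_matrix(attendance, n_events):
    """Return a person-by-event 0/1 matrix from per-person sets of event indices."""
    return [[1 if j in events else 0 for j in range(n_events)]
            for events in attendance]


# Hypothetical records: entry i is the set of (0-based) events person i attended.
attendance = [{0, 1}, {1}, {0, 2}]
matrix = affiliation_matrix(attendance, n_events=3)

for row in matrix:
    print(row)
```

Each row is a person and each column an event, so the real DAVIS matrix would have 18 rows and 14 columns.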

REFERENCES Breiger R. (1974). The duality of persons and groups. Social Forces, 53, 181-
190.

Davis, A et al. (1941). Deep South. Chicago: University of Chicago Press.


OPTIONS > LOGFILE OPTIONS

PURPOSE Switch log output between overwrite and append, or change the name of the
log file.

DESCRIPTION The output from running any routine is placed in an ASCII file called
OUTPUT.LOG. It is this file that is used in all of the commands under the menu
heading OUTPUT. By default this file is overwritten each time a new routine is
run, but UCINET also allows the user to append each run to the file and so
keep a complete log of all output.

PARAMETERS
LOG FILE OVERWRITES or APPENDS: (Default = OVERWRITE).
OVERWRITE causes the current contents of the log file to be deleted each time
a new option from the menu is run.

APPEND causes the output from each procedure to be added to the log file.
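
The two settings behave like the usual write and append file modes. The Python sketch below is only an analogy and does not show how UCINET itself writes its log; the helper names `write_log` and `read_log` and the temporary file location are hypothetical.

```python
import os
import tempfile


def write_log(path, text, append=False):
    """Write text to the log file, overwriting it unless append is True."""
    mode = "a" if append else "w"
    with open(path, mode, encoding="ascii") as handle:
        handle.write(text)


def read_log(path):
    """Return the full contents of the log file."""
    with open(path, encoding="ascii") as handle:
        return handle.read()


# Hypothetical log location used only for this demonstration.
log_path = os.path.join(tempfile.mkdtemp(), "OUTPUT.LOG")

# OVERWRITE: only the output of the most recent run survives.
write_log(log_path, "run 1\n")
write_log(log_path, "run 2\n")
overwrite_result = read_log(log_path)

# APPEND: the output of each run is added to the end of the existing log.
write_log(log_path, "run 3\n", append=True)
append_result = read_log(log_path)
```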

LOG FILE None.

TIMING Constant.

COMMENTS None.

REFERENCES None.
GAGNON & MACRAE PRISON

DATASET PRISON

DESCRIPTION One 67x67 matrix, non-symmetric, binary.

BACKGROUND In the 1950s John Gagnon collected sociometric choice data from 67 prison
inmates. All were asked, "What fellows on the tier are you closest friends with?"
Each was free to choose as few or as many "friends" as he desired. The data were
analyzed by MacRae and characterized by him as "less clear cut" in their internal
structure than similar data from schools or residential populations.

REFERENCE MacRae J. (1960). Direct factor analysis of sociometric data. Sociometry, 23,
360-371.
KAPFERER MINE

DATASET KAPMINE

DESCRIPTION Two 15x15 matrices

KAPFMM symmetric, binary.
KAPFMU symmetric, binary.

BACKGROUND Bruce Kapferer (1969) collected data on men working on the surface in a mining
operation in Zambia (then Northern Rhodesia). He wanted to account for the
development and resolution of a conflict among the workers. The conflict
centered on two men, Abraham and Donald; most workers ended up supporting
Abraham.

Kapferer observed and recorded several types of interactions among the workers,
including conversation, joking, job assistance, cash assistance and personal
assistance. Unfortunately, he did not publish these data. Instead, the matrices
indicate the workers joined only by uniplex ties (based on one relationship only,
KAPFMU) or those joined by multiple-relation or multiplex ties (KAPFMM).
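
The distinction between uniplex and multiplex ties can be sketched as follows. Because the underlying relations were not published, the two toy relations below are invented purely for illustration, and the function name `split_by_multiplexity` is hypothetical. The sketch assumes a pair is uniplex when exactly one relation links it and multiplex when two or more do.

```python
def split_by_multiplexity(relations):
    """Split a list of binary matrices into uniplex and multiplex tie matrices."""
    n = len(relations[0])
    uniplex = [[0] * n for _ in range(n)]
    multiplex = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Count how many relations link actor i to actor j.
            count = sum(1 for rel in relations if rel[i][j])
            if count == 1:
                uniplex[i][j] = 1
            elif count >= 2:
                multiplex[i][j] = 1
    return uniplex, multiplex


# Invented example relations among three actors (not Kapferer's data).
conversation = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
joking = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

uniplex, multiplex = split_by_multiplexity([conversation, joking])
```

Here actors 0 and 1 are linked on both relations (multiplex), while actors 0 and 2 are linked by conversation only (uniplex).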

REFERENCES Kapferer B. (1969). Norms and the manipulation of relationships in a work
context. In J Mitchell (ed), Social networks in urban situations. Manchester:
Manchester University Press.

Doreian P. (1974). On the connectivity of social networks. Journal of
Mathematical Sociology, 3, 245-258.
KAPFERER TAILOR SHOP

DATASET KAPTAIL

DESCRIPTION Four 39x39 matrices

KAPFTS1 symmetric, binary
KAPFTS2 symmetric, binary
KAPFTI1 non-symmetric, binary
KAPFTI2 non-symmetric, binary

BACKGROUND Bruce Kapferer (1972) observed interactions in a tailor shop in Zambia (then
Northern Rhodesia) over a period of ten months. His focus was the changing
patterns of alliance among workers during extended negotiations for higher
wages.

The matrices represent two different types of interaction, recorded at two
different times (seven months apart), each over a period of one month. TI1 and TI2
record the "instrumental" (work- and assistance-related) interactions at the two
times; TS1 and TS2 the "sociational" (friendship, socioemotional) interactions.

The data are particularly interesting since an abortive strike occurred after the
first set of observations, and a successful strike took place after the second.

REFERENCE Kapferer B. (1972). Strategy and transaction in an African factory. Manchester:
Manchester University Press.
KNOKE BUREAUCRACIES

DATASET KNOKBUR

DESCRIPTION Two 10x10 matrices.

KNOKM non-symmetric, binary.
KNOKI non-symmetric, binary.

BACKGROUND In 1978, Knoke & Wood collected data from workers at 95 organizations in
Indianapolis. Respondents indicated with which other organizations their own
organization had any of 13 different types of relationships.

Knoke and Kuklinski (1982) selected a subset of 10 organizations and two
relationships. Money exchange is recorded in KNOKM, information exchange
in KNOKI. See Knoke & Kuklinski (1982) for details.

REFERENCES Knoke D. and Wood J. (1981). Organized for action: Commitment in voluntary
associations. New Brunswick, NJ: Rutgers University Press.

Knoke D. and Kuklinski J. (1982). Network analysis. Beverly Hills, CA: Sage.
KRACKHARDT OFFICE CSS

DATASET KRACKAD non-symmetric, binary.
KRACKFR symmetric, binary.

DESCRIPTION Each file contains twenty-one 21x21 matrices. Matrix n gives actor n's
perception of the whole network.

BACKGROUND David Krackhardt collected cognitive social structure data from 21 management
personnel in a high-tech, machine manufacturing firm to assess the effects of a
recent management intervention program. The relations queried were "Who does
X go to for advice and help with work?" (KRACKAD) and "Who is a friend of
X?" (KRACKFR). Each person indicated not only his or her own advice and
friendship relationships, but also the relations he or she perceived among all
other managers, generating a full 21 by 21 matrix of adjacency ratings from each
person in the group.

REFERENCE Krackhardt D. (1987). Cognitive social structures. Social Networks, 9, 104-134.


NEWCOMB FRATERNITY

DATASET NEWFRAT

DESCRIPTION Fifteen 17x17 matrices.

NEWC0 - NEWC15 (except NEWC9) non-symmetric, valued (rankings).

BACKGROUND These 15 matrices record weekly sociometric preference rankings from 17 men
attending the University of Michigan in the fall of 1956; data from week 9 are
missing. A "1" indicates first preference, and no ties were allowed.

The men were recruited to live in off-campus (fraternity) housing, rented for
them as part of the Michigan Group Study Project supervised by Theodore
Newcomb from 1953 to 1956. All were incoming transfer students with no prior
acquaintance of one another.

REFERENCES Newcomb T. (1961). The acquaintance process. New York: Holt, Rinehart &
Winston.

Nordlie P. (1958). A longitudinal study of interpersonal attraction in a natural
group setting. Unpublished doctoral dissertation, University of Michigan.

White H., Boorman S. and Breiger R. (1976). Social structure from multiple
networks, I. Blockmodels of roles and positions. American Journal of Sociology,
81, 730-780.
PADGETT FLORENTINE FAMILIES

DATASET PADGETT and PADGW

DESCRIPTION PADGETT

Two 16x16 matrices:

PADGB symmetric binary
PADGM symmetric binary

PADGW

One 16x3 matrix, valued.

BACKGROUND Breiger & Pattison (1986), in their discussion of local role analysis, use a subset
of data on the social relations among Renaissance Florentine families (person
aggregates) collected by John Padgett from historical documents. The two
relations are business ties (PADGB - specifically, recorded financial ties such as
loans, credits and joint partnerships) and marriage alliances (PADGM).

As Breiger & Pattison point out, the original data are symmetrically coded. This
is acceptable perhaps for marital ties, but is unfortunate for the financial ties
(which are almost certainly directed). To remedy this, the financial ties can be
recoded as directed relations using some external measure of power - for
instance, a measure of wealth. PADGW provides information on (1) each
family's net wealth in 1427 (in thousands of lira); (2) the number of priorates
(seats on the civic council) held between 1282 and 1344; and (3) the total number of
business or marriage ties in the total dataset of 116 families (see Breiger &
Pattison (1986), p 239).
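
One way to carry out such a recoding is sketched below in Python. This is an illustrative assumption and not a procedure prescribed by Breiger & Pattison or by UCINET: each symmetric tie is kept only in the direction from the wealthier family to the less wealthy one (ties between equally wealthy families stay reciprocal). The function name `direct_by_attribute` and the toy values are hypothetical.

```python
def direct_by_attribute(ties, attribute):
    """Turn a symmetric 0/1 matrix into a directed one using an actor attribute.

    A tie i -> j is kept only when attribute[i] >= attribute[j].
    """
    n = len(ties)
    directed = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if ties[i][j] and attribute[i] >= attribute[j]:
                directed[i][j] = 1
    return directed


# Invented three-family example (not the PADGB or PADGW values).
business = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
wealth = [10, 103, 48]

directed_business = direct_by_attribute(business, wealth)
```

In the toy example the middle family is the wealthiest, so both of its ties point outward from it.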

Substantively, the data include families who were locked in a struggle for
political control of the city of Florence in around 1430. Two factions were
dominant in this struggle: one revolved around the infamous Medicis (9), the
other around the powerful Strozzis (15).

REFERENCES Breiger R. and Pattison P. (1986). Cumulated social roles: The duality of persons
and their algebras. Social Networks, 8, 215-256.

Kent D. (1978). The rise of the Medici: Faction in Florence, 1426-1434. Oxford:
Oxford University Press.
READ HIGHLAND TRIBES

DATASET GAMA

DESCRIPTION Two 16-by-16 matrices

GAMAPOS symmetric, binary.
GAMANEG symmetric, binary.

BACKGROUND Hage & Harary (1983) use the Gahuku-Gama system of the Eastern Central
Highlands of New Guinea, described by Read (1954), to illustrate a clusterable
signed graph. Read's ethnography portrayed an alliance structure among three
tribal groups containing balance as a special case; among Gahuku-Gama the
enemy of an enemy can be either a friend or an enemy.

The signed graph has been split into two matrices: GAMAPOS for alliance
("rova") relations, GAMANEG for antagonistic ("hina") relations. To reconstruct
the signed graph, multiply GAMANEG by -1, and add the two matrices.
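
The reconstruction described above amounts to a cell-by-cell calculation, sketched here in plain Python. The two small matrices are invented stand-ins for GAMAPOS and GAMANEG, and the function name `signed_graph` is hypothetical.

```python
def signed_graph(pos, neg):
    """Combine positive and negative tie matrices into one signed matrix.

    Each cell is pos + (-1 * neg): +1 for alliance, -1 for antagonism, 0 for no tie.
    """
    return [[p + (-1 * q) for p, q in zip(pos_row, neg_row)]
            for pos_row, neg_row in zip(pos, neg)]


# Invented three-group example (not the Gahuku-Gama data).
pos = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
neg = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]

signed = signed_graph(pos, neg)
```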

REFERENCES Hage P. and Harary F. (1983). Structural models in anthropology. Cambridge:
Cambridge University Press. (See pp 56-60).

Read K. (1954). Cultures of the central highlands, New Guinea. Southwestern
Journal of Anthropology, 10, 1-43.
ROETHLISBERGER & DICKSON BANK WIRING ROOM

DATASET WIRING

DESCRIPTION Six 14x14 matrices

RDGAM symmetric, binary
RDCON symmetric, binary
RDPOS symmetric, binary
RDNEG symmetric, binary
RDHLP non-symmetric, binary
RDJOB non-symmetric, valued.

BACKGROUND These are the observational data on 14 Western Electric (Hawthorne Plant)
employees from the bank wiring room first presented in Roethlisberger &
Dickson (1939). The data are better known through a scrutiny made of the
interactions in Homans (1950), and the CONCOR analyses presented in Breiger
et al (1975).

The employees worked in a single room and include two inspectors (I1 and I3),
three solderers (S1, S2 and S3), and nine wiremen or assemblers (W1 to W9).
The interaction categories include: RDGAM, participation in horseplay;
RDCON, participation in arguments about open windows; RDPOS, friendship;
RDNEG, antagonistic (negative) behavior; RDHLP, helping others with work;
and RDJOB, the number of times workers traded job assignments.

REFERENCES Breiger R., Boorman S. and Arabie P. (1975). An algorithm for clustering
relational data with applications to social network analysis and comparison with
multidimensional scaling. Journal of Mathematical Psychology, 12, 328-383.

Homans G. (1950). The human group. New York: Harcourt-Brace.

Roethlisberger F. and Dickson W. (1939). Management and the worker.
Cambridge, MA: Harvard University Press.
SAMPSON MONASTERY

DATASET SAMPSON

DESCRIPTION Ten 18x18 matrices

SAMPLK1 non-symmetric, valued (rankings)
SAMPLK2 non-symmetric, valued (rankings)
SAMPLK3 non-symmetric, valued (rankings)
SAMPDLK non-symmetric, valued (rankings)
SAMPES non-symmetric, valued (rankings)
SAMPDES non-symmetric, valued (rankings)
SAMPIN non-symmetric, valued (rankings)
SAMPNIN non-symmetric, valued (rankings)
SAMPPR non-symmetric, valued (rankings)
SAMPNPR non-symmetric, valued (rankings)

BACKGROUND Sampson recorded the social interactions among a group of monks while
resident as an experimenter on vision, and collected numerous sociometric
rankings. The labels on the data have the abbreviated names followed by the
codings used by Breiger and Boorman in all their work. During his stay, a
political "crisis in the cloister" resulted in the expulsion of four monks (Nos. 2,
3, 17, and 18) and the voluntary departure of several others - most immediately,
Nos. 1, 7, 14, 15, and 16. (In the end, only 5, 6, 9, and 11 remained). All the
numbers used refer to the Boorman and Breiger numbering and are not row or
column labels. Hence in the end Bonaventure, Berthold, Ambrose and Louis all
remained.

Most of the present data are retrospective, collected after the breakup occurred.
They concern a period during which a new cohort entered the monastery near
the end of the study but before the major conflict began. The exceptions are
the "liking" data gathered at three times (SAMPLK1 to SAMPLK3), which reflect
changes in group sentiment over time (SAMPLK3 was collected in the same
wave as the data described below). Information about the senior monks was not
included.

Four relations are coded, with separate matrices for positive and negative ties on
the relation. Each member ranked only his top three choices on that tie. The
relations are esteem (SAMPES) and disesteem (SAMPDES), liking (SAMPLK)
and disliking (SAMPDLK), positive influence (SAMPIN) and negative
influence (SAMPNIN), praise (SAMPPR) and blame (SAMPNPR). In all
rankings 3 indicates the highest or first choice and 1 the last choice. (Some
subjects offered tied ranks for their top four choices).

REFERENCES Breiger R., Boorman S. and Arabie P. (1975). An algorithm for clustering
relational data with applications to social network analysis and comparison with
multidimensional scaling. Journal of Mathematical Psychology, 12, 328-383.

Sampson, S. (1969). Crisis in a cloister. Unpublished doctoral dissertation,
Cornell University.
SCHWIMMER TARO EXCHANGE

DATASET TARO

DESCRIPTION One 22-by-22 matrix, symmetric, binary.

BACKGROUND These data represent the relation of gift-giving (taro exchange) among 22
households in a Papuan village. Hage & Harary (1983) used them to illustrate a
Hamiltonian cycle in a graph. Schwimmer points out how these ties function to
define the appropriate persons to mediate the act of asking for or receiving
assistance among group members.

REFERENCES Hage P. and Harary F. (1983). Structural models in anthropology. Cambridge:
Cambridge University Press.

Schwimmer E. (1973). Exchange in the social structure of the Orokaiva. New
York: St Martins.
STOKMAN-ZIEGLER CORPORATE INTERLOCKS

DATASET SZCID, SZCIG

DESCRIPTION SZCID: One 16x16 matrix, symmetric, valued.

SZCIG: One 15-by-15 matrix, symmetric, valued.

BACKGROUND These data come from a six-year research project, concluded in 1976, on
corporate power in nine European countries and the United States. The matrices
represent corporate interlocks among the major business entities of two of
these countries: the Netherlands (SZCID) and West Germany (SZCIG).

The volume describing this study, referenced below, includes six chapters on
network theoretical and analytical issues related to data of this type.

REFERENCES Ziegler R., Bender R. and Biehler H. (1985). Industry and banking in the
German corporate network. In F. Stokman, R. Ziegler & J. Scott (eds), Networks
of corporate power. Cambridge: Polity Press, 1985.

Stokman F., Wasseur F. and Elsas D. (1985). The Dutch network: Types of
interlocks and network structure. In F. Stokman, R. Ziegler & J. Scott (eds),
Networks of corporate power. Cambridge: Polity Press, 1985.
THURMAN OFFICE

DATASET THUROFF

DESCRIPTION Two 15x15 matrices

THURA non-symmetric, binary
THURM symmetric, binary

BACKGROUND Thurman spent 16 months observing the interactions among employees in the
overseas office of a large international corporation. During this time, two major
disputes erupted in a subgroup of fifteen people. Thurman analyzed the outcome
of these disputes in terms of the network of formal and informal associations
among those involved.

THURA shows the formal organizational chart of the employees and THURM
the actors linked by multiplex ties.

REFERENCE Thurman B. (1979). In the office: Networks and coalitions. Social Networks, 2,
47-63.
WOLFE PRIMATES

DATASET WOLF, WOLFI

DESCRIPTION WOLF: Two 20x20 matrices

WOLFK non-symmetric, binary.
WOLFN symmetric, valued.

WOLFI: One 20x4 matrix, valued.

BACKGROUND These data represent 3 months of interactions among a troop of monkeys,
observed in the wild by Linda Wolfe as they sported by a river in Ocala, Florida.
Joint presence at the river was coded as an interaction and these were summed
within all pairs (WOLFN).
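
The coding described above, counting joint presence and summing within pairs, can be sketched in Python as follows. The observation records are invented for illustration, the function name `copresence_counts` is hypothetical, and the sketch makes no claim about how the original field records were structured.

```python
from itertools import combinations


def copresence_counts(observations, n_animals):
    """Return a symmetric matrix counting how often each pair was seen together."""
    counts = [[0] * n_animals for _ in range(n_animals)]
    for present in observations:
        # Every pair of animals present in the same observation interacts once.
        for i, j in combinations(sorted(present), 2):
            counts[i][j] += 1
            counts[j][i] += 1
    return counts


# Invented records: each set lists the (0-based) animals present at one observation.
observations = [{0, 1, 2}, {0, 1}, {2, 3}]
counts = copresence_counts(observations, n_animals=4)
```

The result is symmetric and valued, which matches the description of WOLFN.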

WOLFK indicates the putative kin relationships among the animals: 18 may be
the granddaughter of 19. WOLFI contains four columns of information about the
individual animals: (1) ID number of the animal; (2) age in years; (3) sex; (4)
rank in the troop.
ZACHARY KARATE CLUB

DATASET ZACHARY

DESCRIPTION Two 34x34 matrices.

ZACHE symmetric, binary.
ZACHC symmetric, valued.

BACKGROUND These are data collected from the members of a university karate club by Wayne
Zachary. The ZACHE matrix represents the presence or absence of ties among
the members of the club; the ZACHC matrix indicates the relative strength of
the associations (number of situations in and outside the club in which
interactions occurred).

Zachary (1977) used these data and an information flow model of network
conflict resolution to explain the split-up of this group following disputes among
the members.

REFERENCE Zachary W. (1977). An information flow model for conflict and fission in small
groups. Journal of Anthropological Research, 33, 452-473.
KRACKHARDT HIGH-TECH MANAGERS
DATASET Krack-High-Tec, High-Tec-Attributes

DESCRIPTION Krack-High-Tec Three 21x21 matrices

ADVICE non-symmetric, binary.
FRIENDSHIP non-symmetric, binary.
REPORTS_TO non-symmetric, binary.

High-Tec-Attributes One 21x4 valued matrix.

BACKGROUND These are data collected from the managers of a high-tech company. The company
manufactured high-tech equipment on the west coast of the United States and had
just over 100 employees with 21 managers. Each manager was asked "To whom do
you go for advice?" and "Who is your friend?"; the "To whom do you report?" relation
was taken from company documents. In addition, attribute information was collected.
This consisted of each manager's age (in years), length of service or tenure (in years),
level in the corporate hierarchy (coded 1, 2 and 3; 1 = CEO, 2 = Vice President, 3 =
manager) and department (coded 1, 2, 3, 4, with the CEO in department 0, i.e. not in a
department). These data are used by Wasserman and Faust in their network analysis
book.

REFERENCES Krackhardt D. (1987). Cognitive social structures. Social Networks, 9, 104-134.

Wasserman S and K Faust (1994). Social Network Analysis: Methods and
Applications. Cambridge University Press, Cambridge.
FREEMAN'S EIES DATA

DATASET Freeman's_EIES, Freeman's_EIES_Attribute

DESCRIPTION Freeman's_EIES Three 34x34 matrices

TIME_1 non-symmetric, valued.
TIME_2 non-symmetric, valued.
NUMBER_OF_MESSAGES non-symmetric, valued.

Freeman's_EIES_Attribute One 34x2 valued matrix.

BACKGROUND This data arose from an early experiment on computer mediated communication.
Fifty academics interested in interdisciplinary research were allowed to contact
each other via an Electronic Information Exchange System (EIES). The data
collected consisted of all messages sent plus acquaintance relationships at two
time periods (collected via a questionnaire). The data include the 32 actors who
completed the study. In addition, attribute data on primary discipline and number
of citations were recorded. TIME_1 and TIME_2 give the acquaintance
information at the beginning and end of the study. This is coded as follows: 4 =
close personal friend, 3 = friend, 2 = person I've met, 1 = person I've heard of but
not met, and 0 = person unknown to me (or no reply). NUMBER_OF_MESSAGES
is the total number of messages person i sent to person j over the entire
period of the study. The attribute data give the number of citations of the actors'
work in the Social Science Citation Index at the beginning of the study together
with a discipline code: 1 = Sociology, 2 = Anthropology, 3 =
Mathematics/Statistics, 4 = other. This data is used by Wasserman and Faust in
their network analysis book.

REFERENCES Freeman, S C and L C Freeman (1979). The networkers network: A study of the
impact of a new communications medium on sociometric structure. Social
Science Research Reports No 46. Irvine CA, University of California.

Wasserman S and K Faust (1994). Social Network Analysis: Methods and
Applications. Cambridge University Press, Cambridge.
COUNTRIES TRADE DATA
DATASET Trade, Trade_Attribute

DESCRIPTION Trade Five 24x24 matrices

MANUFACTURED_GOODS non-symmetric, binary.
FOODS non-symmetric, binary.
CRUDE_MATERIALS non-symmetric, binary.
MINERALS non-symmetric, binary.
DIPLOMATIC_EXCHANGE non-symmetric, binary.

Trade_Attribute One 24x4 valued matrix.

BACKGROUND This data has been selected by Wasserman and Faust (1994) from a list of 63
countries given by Smith and White (1988). The selection was intended to be a
representative sample of countries which spanned the globe physically,
economically and politically and was used by them in their network analysis
book. The data record the interaction of the countries with respect to trade in four
types of goods, namely manufactured goods, food and live animals, crude materials
(not food), and minerals and fuels. The final matrix records exchange of diplomats
between the countries. All trade (including the diplomats) is from the row to the
column. The Trade_Attribute data lists average population growth between 1970
and 1981, average GNP growth (per capita) over the same period, secondary
school enrollment ratio in 1981, and energy consumption in 1981 (in kilo coal
equivalents per capita).

REFERENCES Smith D and D White (1988). Structure and dynamics of the global economy:
Network analysis of international trade 1965-1980. Unpublished Manuscript.

Wasserman S and K Faust (1994). Social Network Analysis: Methods and
Applications. Cambridge University Press, Cambridge.
CAMP 92

DATASET CAMP92

DESCRIPTION One 18x18 valued matrix (rankings)

BACKGROUND These data were collected by Steve Borgatti, Russ Bernard, Bert Pelto and Gery
Ryan at the 1992 NSF Summer Institute on Research Methods in Cultural
Anthropology. This was a 3 week course given to 14 carefully selected
participants. Network data were collected at the end of each week. These data
were collected at the end of the second week. The data were collected by placing
each person's name on a card and asking each respondent to sort the cards in
order of how much interaction they had with that person since the beginning of
the course (known informally as "camp"). This results in rank order data in which
a "1" indicates the most interaction while a "17" indicates the least interaction.
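
The card-sort procedure maps directly onto one row of the rank matrix. The Python sketch below is illustrative only: the names are invented, the function name `ranks_from_sort` is hypothetical, and placing 0 in the respondent's own cell is an assumption rather than a documented feature of CAMP92.

```python
def ranks_from_sort(respondent, sorted_names, all_names):
    """Convert one respondent's card sort into a row of ranks.

    sorted_names lists the other people from most to least interaction,
    so the first card receives rank 1.
    """
    position = {name: rank for rank, name in enumerate(sorted_names, start=1)}
    # Assumption: the respondent's own (diagonal) cell is set to 0.
    return [0 if name == respondent else position[name] for name in all_names]


# Invented four-person example (CAMP92 itself has 18 people and ranks 1 to 17).
all_names = ["A", "B", "C", "D"]
row_for_a = ranks_from_sort("A", ["C", "D", "B"], all_names)
```

Stacking one such row per respondent would give a square, non-symmetric rank matrix of the kind described above.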

REFERENCES None
GALASKIEWICZ'S CEO'S AND CLUBS

DATASET Galask

DESCRIPTION One 26x15 affiliation matrix

BACKGROUND These data give the affiliation network linking 26 CEOs (and their spouses)
of major corporations and banks in the Minneapolis area to 15 clubs and corporate
and cultural boards. Membership was recorded for the period 1978-1981. These data
are used by Wasserman and Faust.

REFERENCES Galaskiewicz J (1985). Social Organization of an Urban Grants Economy. New
York: Academic Press.

Wasserman S and K Faust (1994). Social Network Analysis: Methods and
Applications. Cambridge University Press, Cambridge.
