Transformative Approaches Project
Visualization: Graphics environment for exploring
1. Software challenge
This note is concerned with presentations of information which will
be possible once a particular computer software problem has been solved.
The problem can be illustrated by three examples:
(a) Traffic network mapping: If a database contained entries
on 300 subway stations (or airports, or bus stops) and their direct route
links to one another, what is required is a software package to construct
one or more possible maps of the resulting network. The important point
is to be able to optimize the comprehensibility of such maps with minimum
manual intervention in the construction process.
(b) HyperCard stack mapping: With the widely acclaimed introduction
of Apple's HyperCard, in which complex networks of relationships between
database records can be handled, the problem remains of mapping the pattern
of relationships in the resulting HyperCard stack. The individual entries
may be said to constitute "data", but it is the pattern of relationships
between them which constitutes "knowledge" and "intelligence".
(c) Mind-mapping: This is a technique currently being strongly
promoted in management training and time-management courses. It consists
of manually drawing circles to represent key ideas, objectives or activities
and then interlinking them in a network of relationships. There is a clear
need for a software package to facilitate this process. This could take the
form of a non-hierarchical variant of the standard outline package used to
manage the chapter headings of a report, with the graphic element emphasized.
There are some resemblances to project scheduling software except that
here the emphasis is on relating concepts.
(d) Comment: Consider a relational database with records consisting
of subway stations and indications of which station was directly connected
to which other stations (and possibly on what "line").
The core problem is how to obtain/adapt/develop software which would
generate one or more maps of the subway station network. The principal
constraint is that the map should be comprehensible. It is neither required
nor desirable that the map should be subject to some equivalent of
"topographic" constraints (that is, the positions of the stations
should not be determined by geographic coordinates). Rather
the requirement is that the positions should be determined topologically
and mapped, at least for immediate purposes, onto a two-dimensional surface.
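As a minimal sketch of the data involved, the station records described above can be read into an adjacency structure on which any topological layout would operate. The sample records and the names `LINKS` and `build_network` are hypothetical, chosen only for illustration; the actual record layout of the database is not specified here.

```python
from collections import defaultdict

# Hypothetical sample records: (station, directly connected station, line).
LINKS = [
    ("A", "B", "1"),
    ("B", "C", "1"),
    ("C", "D", "2"),
    ("B", "D", "2"),
]

def build_network(links):
    """Return {station: {neighbour: set of line names}} for undirected links."""
    network = defaultdict(dict)
    for a, b, line in links:
        # Record the link in both directions, since a route link is symmetric.
        network[a].setdefault(b, set()).add(line)
        network[b].setdefault(a, set()).add(line)
    return dict(network)

network = build_network(LINKS)
```

Only the pattern of links is kept; no coordinates appear at this stage, which is consistent with the requirement that positions be derived topologically rather than geographically.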
There are additional problems which can be treated at lower levels of
priority, if at all. They include:
- A second problem is that the database in fact contains over 10,000
nodes and ways must be found to segment the network (possibly filtering
out lower levels of detail) so that maps for individual segments can be
interrelated. Such maps, in hardcopy form, will be bound together in a
book to form an "atlas".
- A third problem is that it is desirable that there should be some editorial
interaction with the map to improve its visual quality.
- A fourth problem is that it is desirable that it be possible to update
the database by introducing changes interactively to the map.
- A fifth problem is to open the way to using the map as a menu via which
the database can be queried for information on the nodes.
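One possible way of segmenting a large network, as raised in the second problem above, is to cut out the neighbourhood of a chosen node and to list the links that leave it, so that each map can point to adjacent maps in the atlas. The sketch below assumes a breadth-first search limited to a number of hops; the function names and the sample chain are hypothetical, and other segmentation criteria (such as subject area) would be equally valid.

```python
from collections import deque

# Hypothetical sample network: a simple chain A-B-C-D-E.
ADJACENCY = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

def neighbourhood(adjacency, seed, max_hops):
    """Return the set of nodes within max_hops links of seed (breadth-first)."""
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the segment boundary
        for nxt in adjacency.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return set(hops)

def cross_references(adjacency, segment):
    """Links leaving the segment: candidates for pointers to other maps."""
    return sorted(
        (a, b) for a in segment for b in adjacency.get(a, ()) if b not in segment
    )

segment = neighbourhood(ADJACENCY, "A", 2)
outgoing = cross_references(ADJACENCY, segment)
```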
2. Constraints and possibilities
(a) Conventional approach: The conventional approach to databases,
and to the reference books produced from them, is to focus on individual
entries. The user is not assisted in understanding the relationships between
entries, other than by fairly crude grouping of entries into categories.
(b) Hypertext approach: With the development of interactive databases,
hypertext (plus Apple's new HyperCard approach) and CD-ROM, data entries
can be organized so that they cross-reference one another to a high degree
and in a non-hierarchical manner.
For example, the current Yearbook of International Organizations
(1994/95) has over 30,000 organizations with some 80,000 relationships
between them (with the major organizations having an average of 70 each)
and with a further 192,552 links to membership countries. This Encyclopedia
provides an equivalent challenge with some 10,000 world problems linked
by 120,000 relationships between them. In database terms this is a major
step towards what is being called hypertext. Both publications are maintained
on a computer network from which a CD-ROM version is being produced.
(c) User need for "maps": Because of the overwhelming
volume of data, users need "maps" of the pathway between entries,
especially in complex subject areas. Such maps provide a sense of context
which is lost in many hierarchical presentations of data in linear text
form. It is only from such maps that users can quickly obtain an adequate
overview of data in an unfamiliar area to guide their efficient use of
conventional information tools. Such maps are of value precisely because
they are richer than simple hierarchically structured thesauri.
(d) Editorial need for a graphic interface: In preparing such
publications, editorial researchers need to be able to graphically represent
the networks of relationships they are endeavouring to clarify. This is
in part strongly related to mind-mapping. Without such a tool, editors
have to produce extensive mind maps in manual form before building up or
modifying the network of relationships. Ideally it should be possible to
communicate such maps to key resource people to obtain insights which are
not so easily indicated in normal text presentations. Interesting examples
of such graph displays, prepared manually, do exist.
(e) Existing techniques: Computer hardware and software for the
construction and manipulation of such networks of relationships have only
been developed for specific applications such as in chemistry, architecture
and engineering (CAD), or electronic circuit board design (PCB). It would
be possible to develop similar software to display relationships between
A number of software packages have been developed, especially for Apple
machines, which go some of the way towards the product required. These
include MORE and INSPIRATION. The disadvantage of these products is that
they have primarily been designed to work around a core concept (a "main
idea") which is the point of departure for a hierarchical structure.
This does not correspond to the essentially non-hierarchical presentation
(f) Atlas production: Once such maps can be successfully produced
and manipulated, computer tapes can be made to drive photocomposition machines
(with vector generators). These make high quality maps. Alternatively such
maps could be generated by standard graph plotters into camera-ready form.
A series of such maps, with facing explanatory text and/or mini-index,
may then be bound together as an "atlas".
Maps would be designed to cover clusters of organizations and/or problems
in a given subject or geographical area. They would have the advantage
of provoking input of new organizations and/or relationships when used
in the form of proofs. They also have important didactic uses. Enlargements
of the maps could also be distributed as wall-charts.
3. Software "modules"
(a) Relational database: The data is currently held and maintained
in an Advanced Revelation database (version 1.16) running on a Novell 3.11
network. The database has been specially developed as a text database with
facilities to manage networks of relationships between the records. It
is desirable that when the data is displayed in map form, interactive changes
to the map should be carried back as updates to the database. But since
the prime requirement is for publishable hardcopy maps, this requirement
may be sacrificed in the short term.
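To indicate what carrying map changes back to the database might involve, the following sketch compares the set of links held in the database with the set of links present on an edited map, and derives the additions and deletions to apply. It assumes undirected links without self-loops; the function names and sample data are hypothetical and do not reflect the Advanced Revelation interface.

```python
def normalise(edges):
    """Treat each link as undirected: (A, B) and (B, A) are the same link."""
    return {frozenset(edge) for edge in edges}

def map_changes(database_edges, edited_map_edges):
    """Return (links to add, links to delete) implied by the edited map."""
    before = normalise(database_edges)
    after = normalise(edited_map_edges)
    to_add = sorted(tuple(sorted(e)) for e in after - before)
    to_delete = sorted(tuple(sorted(e)) for e in before - after)
    return to_add, to_delete

# Hypothetical example: the editor removed B-C and drew a new link C-D.
db_edges = [("A", "B"), ("B", "C")]
map_edges = [("B", "A"), ("C", "D")]
additions, deletions = map_changes(db_edges, map_edges)
```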
(b) Map design: Several approaches may be taken to the problem
of map design:
- (i) Network analysis: This uses specialized extensions of sociometrics
to take data of the type described above and to position the elements in
relation to each other on the basis of various measures of distance, with
those most connected tending to be placed at the centre of a network and
those least connected at the periphery. The advantage of this approach
is that it endeavours to mirror the network on the basis of its internal
characteristics. A number of software packages exist to perform the necessary
computations. Various ways of describing a network and identifying key
components result from such analysis.
The disadvantage of such software is that it has been developed for relatively
small networks only (100 to 300 nodes). Few of the packages are designed
to permit mapping of the resultant network. Data is output in matrix form
only or as indices in relation to key elements. More seriously, the maps
produced from such networks, although they reflect the data, are not
designed to enhance the comprehensibility of the data (other than in a
purely scientific sense). Such computations can also consume considerable
amounts of computer time, even on fast machines.
This approach has been explored using test data from the UIA Revelation
database consisting of some 5,000 nodes. The work was done on a Mac II
using software developed at Dartmouth College by Joel Levine
of the Department of Mathematical Social Sciences. This software has not
been adapted to run under MS-DOS.
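The general tendency described for network analysis, with the most connected nodes at the centre and the least connected at the periphery, can be illustrated with a much cruder rule than the distance measures such packages actually compute. The sketch below is not the software mentioned above; it simply assumes that a node's distance from the centre decreases with its number of links. It expects a non-empty network, and `concentric_layout` is a hypothetical name.

```python
import math

# Hypothetical sample network: a hub H linked to three peripheral nodes.
ADJACENCY = {
    "H": ["a", "b", "c"],
    "a": ["H"],
    "b": ["H"],
    "c": ["H"],
}

def concentric_layout(adjacency):
    """Place nodes on a unit disc: more links means closer to the centre."""
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    max_degree = max(degree.values()) or 1  # avoid dividing by zero
    ordered = sorted(adjacency, key=lambda n: (-degree[n], n))
    positions = {}
    for i, node in enumerate(ordered):
        radius = 1.0 - degree[node] / max_degree
        angle = 2 * math.pi * i / len(ordered)  # spread nodes around the disc
        positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

layout = concentric_layout(ADJACENCY)
```

A rule of this kind runs quickly but ignores which nodes are linked to which, so linked nodes may end up far apart; that is precisely the information that the distance-based computations take into account.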
- (ii) "Crude mapping": A simplistic approach could be
taken. This would involve positioning the nodes on a grid determined by
the subjects with which they are associated. Such a subject grid (with
positions determined by a 4 character identifier) is in use to categorize
the UIA data into some 3,000 categories. Relationships would then be plotted
between the nodes.
In this case comprehensibility is achieved through the link to the subject
grid and not through determining the shape of the network. Use of a grid could
severely undermine the memorability of the network. It would however be
relatively easy to develop and quick to run. A key question would be what
kind of interaction it would be possible to have with such a map and whether
it would be possible to shift from a detailed focus on a specialized cell
of the grid to a wider focus and back (a zoom facility).
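A sketch of the grid approach follows. How the 4-character identifier determines a grid position is not specified here, so the code makes two labelled assumptions: that the first two characters select a column and the last two a row, and that a shorter prefix of the identifier corresponds to a wider focus (the zoom facility). The codes and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical subject codes attached to three nodes.
NODE_CODES = {"n1": "AA01", "n2": "AA02", "n3": "AB01"}

def grid_positions(node_codes):
    """Assumed scheme: characters 1-2 give the column, characters 3-4 the row."""
    columns = sorted({code[:2] for code in node_codes.values()})
    rows = sorted({code[2:] for code in node_codes.values()})
    return {
        node: (columns.index(code[:2]), rows.index(code[2:]))
        for node, code in node_codes.items()
    }

def cells(node_codes, prefix_len=4):
    """Group nodes by code prefix; a shorter prefix gives a wider focus."""
    grouped = defaultdict(list)
    for node, code in node_codes.items():
        grouped[code[:prefix_len]].append(node)
    return {prefix: sorted(nodes) for prefix, nodes in grouped.items()}

positions = grid_positions(NODE_CODES)
detailed = cells(NODE_CODES, prefix_len=4)
zoomed_out = cells(NODE_CODES, prefix_len=2)
```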
- (iii) Topological manipulation: In this approach, the network
of relationships between nodes would be simplified using topological constraints.
For example a string of interlinked nodes would be represented by a straight
line. The position of the nodes on the line might be equidistant or determined
by some logarithmic function based on the distance from the centre of the
line. The aim would be to introduce symmetry elements into the data so
that it acquires a distinct and memorable pattern or shape. Some of the
algorithms required presumably correspond to those of pattern recognition.
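The example of a string of interlinked nodes drawn as a straight line can be sketched as follows. The code assumes a simple undirected network (no duplicate links or self-loops), finds each string as a path whose interior nodes have exactly two links, and spaces its nodes equidistantly between two given end positions. Closed rings with no junction are not detected, and a string that returns to its own starting junction is not handled by `straighten`. The names `find_chains` and `straighten` are hypothetical.

```python
# Hypothetical sample network: a string A-B-C-D, with D also linked to E and F.
ADJACENCY = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E", "F"],
    "E": ["D"],
    "F": ["D"],
}

def find_chains(adjacency):
    """Return paths whose interior nodes each have exactly two links."""
    chains, used = [], set()
    for start in sorted(adjacency):
        if len(adjacency[start]) == 2:
            continue  # strings begin at end points or junctions only
        for nxt in sorted(adjacency[start]):
            if (start, nxt) in used:
                continue
            path = [start, nxt]
            while len(adjacency[path[-1]]) == 2:
                a, b = adjacency[path[-1]]
                path.append(b if a == path[-2] else a)
            used.add((path[-1], path[-2]))  # do not retrace from the far end
            if len(path) > 2:
                chains.append(path)
    return chains

def straighten(path, start_xy, end_xy):
    """Space the nodes of a path equidistantly on a straight line."""
    (x0, y0), (x1, y1) = start_xy, end_xy
    steps = len(path) - 1
    return {
        node: (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i, node in enumerate(path)
    }

chains = find_chains(ADJACENCY)
line = straighten(chains[0], (0, 0), (3, 0))
```

Replacing the equidistant spacing with a logarithmic function of the distance from the centre of the line, as suggested above, would only require changing the expression inside `straighten`.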
(c) Plotting: Once coordinates have been determined, software
is required to plot the network, whether onto the screen or onto a graph
plotter. Many packages exist for this purpose. A distinction should however
be made here between adequate quality plots (for working purposes) and
high-quality plots for publication in book form. The latter question is
The problem in plotting is to be able to introduce distinguishing elements
into the plot. These may include variations in line thickness (corresponding
to some measure of importance or proximity), variations in node size (corresponding
to the number of connections to the node) and the introduction of identifying
labels for the nodes.
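The distinguishing elements listed above can be illustrated with a short sketch that writes a plot as SVG, used here only as a convenient stand-in for screen or plotter output. Line thickness is taken from a weight attached to each link, node size from the number of connections, and each node receives a label. The function name, the sample coordinates and the sizing formula are assumptions made for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical coordinates and weighted links (node, node, weight).
POSITIONS = {"A": (50, 50), "B": (200, 50), "C": (200, 200)}
EDGES = [("A", "B", 1), ("B", "C", 3)]

def to_svg(positions, edges, size=400):
    """Return an SVG drawing of the network as a string."""
    degree = {node: 0 for node in positions}
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
    ]
    for a, b, weight in edges:
        (x1, y1), (x2, y2) = positions[a], positions[b]
        # Line thickness reflects the weight (importance or proximity).
        parts.append(
            f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
            f'stroke="black" stroke-width="{weight}"/>'
        )
    for node, (x, y) in positions.items():
        radius = 3 + 2 * degree[node]  # node size grows with connections
        parts.append(
            f'<circle cx="{x}" cy="{y}" r="{radius}" fill="white" stroke="black"/>'
        )
        parts.append(f'<text x="{x + radius + 2}" y="{y}">{escape(node)}</text>')
    parts.append("</svg>")
    return "\n".join(parts)

svg = to_svg(POSITIONS, EDGES)
```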
A key requirement is that the plot be made from the data as processed
by one of the above techniques, rather than from data which is manually
input. A distinction must also be made between a curve-fitting approach
and one in which the lines pass through the nodes, as is required here. A distinction
also needs to be made between plotting a graph (from left to right) and
plotting a network in which there is no privileged direction. The latter
form is more characteristic of CAD programs (see below).
(d) Drawing: It is desirable to move towards an interactive approach
to the data. In other words, once a plot is made for a segment of the overall
network, editors should be able to modify the network. Such modifications
might take one of two forms. The first would consist of simply moving portions
of the plot to make it more comprehensible, making room for labels and
improving the aesthetics. The second might also involve the capacity to
add or delete features from the network. It would of course be highly desirable
that the latter changes should be carried back into changes to the relational
database. This can raise severe problems of compatibility between the relational
database and the drawing/plotting software, whether in terms of software
or of intermediate files. Such features are available in many CAD programs.
It is however important to recognize that the CAD software is here used
to "design" logical or topological constructs rather than buildings
or mechanical parts. This is not a limitation but it may permit use of
simpler (and cheaper) CAD software.
It is appropriate to note that the variant of CAD software used for
interactive printed circuit board design (PCB) has many features of value
to the present application, especially the "auto-router" feature
which positions connections on the circuit board in the most economic manner
(avoiding cross-overs, etc). Unfortunately the positioning criteria do
not make for maximum comprehensibility.
(e) Interface software: In the case of Advanced Revelation there
exists a software product CAD/Base which offers "complete integration
of CAD drawings with a database environment", via industry standard
DXF files. The drawing is viewed as a Revelation file and the drawing elements
as Revelation records and fields. The drawing exists as a master file in
both the Revelation and CAD environments. Changes in one environment are
reflected in the other automatically without any intermediate file conversion.
Clearly this offers interesting opportunities for using the network
map as a menu through which users can select individual nodes on which
they can immediately access additional text data.
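Since DXF is named as the interchange format, a minimal sketch of writing a network map as a DXF file may be useful. It does not use or represent CAD/Base itself; it only emits an ENTITIES section containing LINE entities for links and TEXT entities for node labels, on the assumption that the receiving CAD program accepts a file without header or table sections (some programs may not). The function name and sample data are hypothetical.

```python
# Hypothetical coordinates and links for a three-node network.
POSITIONS = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (10.0, 10.0)}
EDGES = [("A", "B"), ("B", "C")]

def to_dxf(positions, edges, text_height=2.0):
    """Return a minimal DXF document: one LINE per link, one TEXT per node."""
    out = ["0", "SECTION", "2", "ENTITIES"]
    for a, b in edges:
        (x1, y1), (x2, y2) = positions[a], positions[b]
        # Group codes: 8 = layer, 10/20 = start point, 11/21 = end point.
        out += ["0", "LINE", "8", "0",
                "10", str(x1), "20", str(y1),
                "11", str(x2), "21", str(y2)]
    for name, (x, y) in positions.items():
        # Group codes: 10/20 = insertion point, 40 = height, 1 = text value.
        out += ["0", "TEXT", "8", "0",
                "10", str(x), "20", str(y),
                "40", str(text_height), "1", name]
    out += ["0", "ENDSEC", "0", "EOF"]
    return "\n".join(out) + "\n"

dxf = to_dxf(POSITIONS, EDGES)
```

If each label carries the key of the corresponding database record, selecting a label in the drawing gives the key needed to retrieve the text data for that node.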
(f) High-quality graphic output: One objective is the production
of maps to be printed in book form. To achieve this one approach might
be to produce output in a form which can be handled by PC-TeX to create
files for output on a high quality laser printer.
(g) Integration of features: It is possible that CAD/Base offers
an appropriate means of integrating the different features discussed above
(except the last). It is also possible that such a product, which is relatively
expensive, can be considered as "overkill", and that a more compact
approach would be more suitable and easier to make available to others.
If the emphasis is on the simpler strategy of generating hardcopy, this
would certainly be the case. To the extent that interaction with the data
is desirable, then more features would be required, even though only a
selection of standard CAD features would be necessary.
For the user, there is obviously great merit in ease of use as an adjunct
to normal text editing procedures. Ideally such a package would bear some
resemblance to the more sophisticated forms of "outliner", such
as MORE and INSPIRATION running on Apple machines. In these an essentially
hierarchical outline of topics can be opened up into standard text processing
or converted into bullet charts. What is required is an equivalent which
is tied into a relational database environment. The different approaches
to network "map design" noted above might then be options in
the way the data was manipulated for presentation, as is the case in standard
business graphics (bar charts, pie charts, etc).
From Encyclopedia of World Problems and Human Potential