A graph, of the kind discussed here, is a collection of vertices connected by edges. We'll be concerned exclusively with directed graphs where each edge
has a direction from its parent vertex to some child vertex in the collection. Directed graphs are particularly useful for describing large sets of relationships.
Here are some example graph models:
Build dependency relationships between software packages
Calling relationships between functions in a program
Relationships between types in a system, e.g., inheritance, composition, aggregation, and using
Communication connections, e.g., web hosts and client machines
Frequently graph structures are discussed in terms of adjacency lists like the example in the first diagram. Note that the adjacency list holds redundant nodes to
represent graph vertices, e.g., there are three nodes that represent vertex 4 in the diagram.
We won't want to directly implement this structure, not only because of
the unnecessary storage, but also because that opens up the possiblity of two or more representations of a node having different properties and values. When there are
storage redundancies errors like that are relatively easy to make and often very difficult to find.
The linked structure shown in the next diagram is preferable. It has no redundancies - each vertex holds references to child vertices, where both parent and children
reside in the same collection. These references could be implemented with pointers, but we will choose not to do that, and use indexes into the adjacency collection
to refer to child vertices. That makes creation of copies trivial. In fact, the compiler created member-wise copy constructor and assignment operator will be
correct for that design.
Design:
The Graph's logical structure is shown in the third diagram. The graph class is template-based with parameters V, a vertex type, and E, an edge type.
These could simply be strings that label vertices and edges. In this case a vertex might represent a town on a map with edges that are roads from that town to another.
Often, however, we may need a vertex or an edge to hold composite information containing several different values. In this case we simply define
a V class and/or an E class with the appropriate structure to hold that information.
The graph holds a collection of vertices in a std::vector<Vertex<V,E>>, named adj for adjacency list. Each vertex holds an instance of the V type and a collection of edges
in a std::vector<Edge>. An edge is simply a typedef for std::pair<size_t, E>. The size_t argument is an index into the graph's vertex collection, selecting the edge's
target vertex. The E argument is an instance of the E type, so each edge can hold a unique instance.
Simple XmlReader and XmlWriter classes are part of the graphLib archive and are used to create and retrieve graph models from disk storage.
These are not Document Object Model (DOM) based, but
simply implemented with string manipulation and a scope stack. You may find these usesful for other applicatations as well.
Source Code:
This graph implementation is written in C++ using Visual Studio 2013. It should port, with little difficulty, to Gnu gcc to run on Linux, for example.
One of my doctoral advisees, Mehmet Kaya developed strong component and topological softing algorithms for this graph class. He has been using
that to help develop new ways of restructuring C++ source code to improve its structure and maintainability. Here is a Syracuse University
Technical Report describing that work. Another advisee, Mubarek Mohammed, is using the
graph class to support research on visualization for large software systems.
Conclusions:
This graph class implementation is reasonably simple, fast, and robust. It supports persisting graph models to XML data files and retrieving back
into graph instances. We have an interest in algorithms, based on this graph facility, that support investigations into the structure and dynamics
of large systems. We will add algorithms we are working with to this package as they mature.