As a DBA, I have been asked many times to provide an ER diagram for an application schema. There are many tools that can do the job, mostly expensive CASE tools which have reverse engineering capabilities. There are also some rather exotic freeware tools which are largely unsupported and difficult to find.

Fortunately, there is a software tool ideal for drawing ER diagrams, called GraphViz. GraphViz is a general tool for drawing all kinds of graphs. It was created by AT&T Research specifically for graph drawing and it uses so called “dot” language. The tool itself is available on the following web page: http://graphviz.org

The tool uses rather simple language called “dot”, which would not be a problem to generate by PHP but the task can be simplified even further. It turns out that PHP Extension and Application Repository, known as “PEAR” contains a package with PHP bindings for GraphViz. The home page for this package is http://pear.php.net/package/Image_GraphViz. In order to use GraphViz, one has to install the GraphViz package, highly portable and available for many platforms, both MS Windows and Linux, among others. Installing the PHP bindings for GraphViz with PEAR installer is also a breeze:

#> pear install Image_GraphViz
downloading Image_GraphViz-1.2.1.tar ...
Starting to download Image_GraphViz-1.2.1.tar (Unknown size)
........done: 23,040 bytes
install ok: channel://pear.php.net/Image_GraphViz-1.2.1
#>

Now, let’s see the basic elements of the “dot” language:

Drawing graphs, in general

Graphs consist of nodes and edges. Nodes are represented with boxes and edges with lines. So, let’s analyze a sample graph, ER diagram for the SCOTT schema (Oracle demo schema, now obsolete, but still frequently used):

schema1

This graph consists of 4 nodes and one edge. There are 2 independent nodes and 2 nodes connected by an edge. The independent nodes are BONUS and SALGRADE, the nodes connected by an edge are EMP and DEPT. What do we need the “dot” language for? We need to define the entire graph properties, the node properties and the edge properties.

Let’s see what the dot source for that looks like:

digraph scott {
// Drawing the ER diagram for user SCOTT manually
// Label for the graph
label="**ER diagram for SCOTT";
// Put the label on top of the image
labelloc=t;
// Graph is oriented from left to right. That's the general direction of connecting nodes.
// The default is TD, from top to bottom.
rankdir=LR;
// Nodes
BONUS [ fillcolor=Lightgrey , fontsize=14, shape=record, style=filled];
EMP [ fillcolor=Lightgrey , fontsize=14, shape=record, style=filled];
DEPT [ fillcolor=Lightgrey , fontsize=14, shape=record, style=filled];
SALGRADE [ fillcolor=Lightgrey , fontsize=14, shape=record, style=filled];
// Edge
EMP -> DEPT [color=Grey, arrowhead=oarrow, arrowtail=crow, style=dashed];
}

That can be compiled into an image, like this:

dot scott.dot -Tpng -o scott.png

The “-T” argument specifies the type of picture to produce, in this case PNG. GraphViz package supports many formats. On the web, the most commonly used format is SVG (“Scalable Vector Graphics”) because most of the browser can display that type of graphics directly, without any additional tools. On the command line, I chose PNG because it’s a very common format which provides clear and pretty pictures. The result looks like this:

schema2

So, “dot” language defines nodes (EMP,DEPT,BONUS,SALGRADE), edges (node links) and various attributes and styles. Now, that we know what is the “dot” language all about, let’s examine the PHP bindings, installed earlier, by using PEAR installer.

PHP bindings for GraphViz

This article is not about GraphViz and its “dot” language, it’s about Oracle, PHP and ER diagrams. As we have previously said, there are PHP bindings for the “dot” language which means that we don’t have to program using the “dot” language directly. The PEAR package name is Image_GraphViz and the documentation is here:

http://pear.php.net/package/Image_GraphViz/docs/latest/GraphViz/Image_GraphViz.html

The API mimics the little “dot” snippet that we have created to draw the ER diagram for the SCOTT schema. The constructor opens a new digraph (“directed graph”) to which we can add attributes using the “addAttributes” function:

Attributes are related to the final outlook of the resulting picture. The “dot” language knows many attributes, most of which are self-explanatory. If there is doubt, the GraphViz site has an excellent reference manual. The choice of attributes depends on your preferences for color, fonts, or label location (the default is below the image, on the bottom of the picture). Now, we can add nodes to the graph:

It’s the same thing as above: node is added to the graph, with attributes specified in an associative array. Edges are also added in the same way:

Nodes and edges each have their attributes. When we’re done adding nodes and edges, we can show the graph to the world:

$graph->image(); 

That will display the graph to the browser, in “SVG” format. Most browsers are capable of displaying SVG format directly, Firefox, IE7, Opera9 and Safari among them. I didn’t have an opportunity to test any other.

Now, we have all the necessary knowledge, needed to draw ER diagrams. In order to demonstrate that, I will use ADOdb, for the sake of the code brevity. ADOdb is a very nice and convenient method of working with Oracle RDBMS, documented on http://adodb.sourceforge.net. It is a very popular object oriented library that supports exception handling and that enables me to write much more concise and clear code then the raw OCI8 interface. However, ADOdb uses OCI8 interface under the hood, it is just a nice OO library on top of OCI8 interface. The main reason for using ADOdb is its support of exceptions. In other words, the code looks like this:

Whatever DB operation with the ADOdb library, the error will be correctly handled and
presented. There is no need to insert those pesky error checks which have a tendency to grow the code to monstrous proportions. ADOdb calls are easy to learn, logical and well documented. The complete documentation can be found here: http://phplens.com/lens/adodb/docs-adodb.htm
There is also a fairly complete tutorial for using ADOdb and Oracle RDBMS here: http://phplens.com/lens/adodb/docs-oracle.htm

ER diagram, at last

In the ER diagram, nodes are tables and edges are foreign key constraints. Basically, we have to execute two SQL statements:

  • The first SQL statement lists all the tables in the schema.
  • The second SQL statement lists all the foreign key constraints between the schema tables.

Here are the two queries:

The first query simply lists all the tables belonging to the same owner, represented by the bind variable :OWN. The second query returns pairs of tables connected by foreign keys, also using the same bind variable. Tables will become nodes in our graphs, pairs of tables will define edges.

When combined with ImageGraph calls, the entire code looks like this:

This little script can be called from web server using either GET or POST method to get the schema name to graph. The DB connection descriptor is obtained from the associative array aptly named DSN. This is a session variable which contains TNS descriptor, username and password for connecting to the database. Of course, user can chose another method of logging into the database. The Oracle user used to run this script should be able to query the DBA views, which usually implies the DBA privilege. Below is the result: ER diagram for the HR schema, Oracle demo schema that has replaced venerable and still very popular SCOTT schema. HR schema is a bit more complex then the SCOTT schema, it has more tables and more foreign keys but is not a monster whose ER diagram wouldn’t fit on a page of an article. Here it is:

schema3

This can be done for any schema. Many output formats are supported PNG, JPG and GIF among others. Rendering of the graph is done by GraphViz and it is quite fast. It works many platforms, including not only Linux and Windows but also Solaris, AIX and HP-UX. There are no particular requirements on the hardware. A PC running CentOS 5.2 with 1GB RAM and 2 internal disks is all that was needed. Just as a curiosity, an instance of Oracle11 is also running on that same PC. This configuration was able to draw an ER diagram for the SYS schema (800 tables) in approximately 1 minute. GraphViz also has bindings for Perl and Python. Drawing ER diagrams just became easy.

About the author

Mladen Gogala is a long-time Oracle DBA and the author of Easy Oracle PHP (2006). He has extensive experience in UNIX scripting, Oracle tuning and UNIX system administration, using every popular dialect of UNIX and Linux. Mladen Gogala is also active on the Usenet Oracle forums where he assists in solving Oracle questions.