Positioning of classes in UML diagram

https://stackoverflow.com/questions/10020822

29-05-2021

Question

I'm creating a tool that displays a Python project as a UML diagram (and shows some detected code errors in a GUI).

I scan a project using Pyreverse, which gives me all the data I need to draw the UML diagram. The problem is positioning the class boxes on the canvas.

For a start, I decided to use an existing force-based algorithm to decide the positions of the classes. It works quite well; here's the result: https://github.com/jvorcak/gpylint/blob/master/screenshots/gpylint.png and here's the code (Python, but it's easy to understand even for non-Python programmers).

There is one problem: it's great for displaying graphs, but for UML I'd like some enhancements. For instance, if two classes extend one superclass, I'd expect them to be at the same level in the graph, as in graphs generated by the dot program.

Can you please suggest an algorithm for this, or at least give me some ideas?

Solution

It seems that the main enhancement you are missing is transforming your graph into a layered graph. This is no easy task, but it's doable (the quality of the result may vary with the amount of time and thought invested in the process).

The main idea is to do some kind of topological sorting on the graph to split it into layers, rearrange the vertices within the layers, and then draw the graph. (You can find Python code for a real topological sort online (example), but a real topological sort will just produce a long line-like graph, and we want something a little different.)

So I'll try to describe an algorithm to transform a given graph into a layered one:

1. Topological sorting doesn't work on graphs with cycles, so if the input graph is not already a directed acyclic graph, you'll have to find a set of edges that can be removed (or possibly reversed) to make it acyclic (you will add them back to the layered graph later, but that will break the layering and make the graph less pretty :). Since finding the smallest possible set of edges you can remove is NP-complete (very hard), I think you'll have to take some shortcuts here: don't insist on the minimal set of edges, just find a set in reasonable time.
2. Break the graph into layers. There are many optimizations that can be done here, but I suggest you keep it simple: repeatedly collect all the vertices that have no incoming edges into a new layer, remove them (together with their outgoing edges) from the graph, and continue until no vertices are left. This might produce a line-like graph in some simple cases, but it suits UML graphs quite well.
3. A good graph is one with the smallest number of edges crossing each other. It doesn't sound important, but it contributes greatly to the overall look of the graph. What determines the number of crossings is the order of the vertices within each layer. But again, finding the minimum number of crossings or finding a maximum crossing-free set of edges is NP-complete :( "so again it is typical to resort to heuristics, such as placing each vertex at a position determined by finding the average or median of the positions of its neighbors on the previous level and then swapping adjacent pairs as long as that improves the number of crossings."
4. The edges removed (or reversed) in the first step of the algorithm are restored to their original direction.
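As a rough illustration of the first two steps, here is a minimal Python sketch. It is not taken from any particular tool: the function names make_acyclic and assign_layers are made up for this example, the cycle-breaking shortcut (reversing every back edge found by a depth-first search) is just one cheap heuristic among many, and edges are assumed to point from superclass to subclass so that superclasses end up in the top layer. Every vertex that appears in an edge is assumed to be listed in nodes.

```python
from collections import defaultdict


def make_acyclic(nodes, edges):
    """Reverse every DFS back edge; return (acyclic_edges, set_aside_edges)."""
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    color = {n: WHITE for n in nodes}
    acyclic, set_aside = [], []

    def visit(u):
        color[u] = GREY
        for v in succ[u]:
            if v == u:                    # self-loop: reversing cannot help, drop it
                set_aside.append((u, v))
            elif color[v] == GREY:        # back edge closes a cycle: reverse it
                set_aside.append((u, v))
                acyclic.append((v, u))
            else:
                acyclic.append((u, v))
                if color[v] == WHITE:
                    visit(v)
        color[u] = BLACK

    for n in nodes:
        if color[n] == WHITE:
            visit(n)
    return acyclic, set_aside


def assign_layers(nodes, edges):
    """Peel off all vertices with no incoming edges, layer by layer."""
    indegree = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        indegree[v] += 1

    layers = []
    current = [n for n in nodes if indegree[n] == 0]
    while current:
        layers.append(current)
        following = []
        for u in current:
            for v in succ[u]:
                indegree[v] -= 1          # "remove" the edge u -> v
                if indegree[v] == 0:
                    following.append(v)
        current = following
    return layers


# Hypothetical input: Base has subclasses A and B, C extends A,
# and C also refers back to Base, which creates a cycle.
nodes = ["Base", "A", "B", "C"]
edges = [("Base", "A"), ("Base", "B"), ("A", "C"), ("C", "Base")]
dag_edges, set_aside = make_acyclic(nodes, edges)
layers = assign_layers(nodes, dag_edges)
print(layers)
print(set_aside)
```

The edges collected in set_aside are the ones to draw in their original direction at the end (the last step of the list), after the layout has been computed from the acyclic version.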

And there you have it! A nice layered graph for your UML.
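The crossing-reduction heuristic quoted above (average position of the neighbours on the previous level, followed by adjacent swaps) could be sketched roughly like this for a single pair of layers. The names count_crossings and order_layer are invented for the example, the crossing count is the naive quadratic one, and a real implementation would sweep up and down over all layers several times.

```python
def count_crossings(upper, lower, edges):
    """Count crossings among the edges that go from `upper` to `lower`."""
    pos_u = {n: i for i, n in enumerate(upper)}
    pos_l = {n: i for i, n in enumerate(lower)}
    spans = [(pos_u[u], pos_l[v]) for u, v in edges
             if u in pos_u and v in pos_l]
    crossings = 0
    for i in range(len(spans)):
        for j in range(i + 1, len(spans)):
            (a, b), (c, d) = spans[i], spans[j]
            if (a - c) * (b - d) < 0:     # endpoints are in opposite order
                crossings += 1
    return crossings


def order_layer(upper, lower, edges):
    """Reorder `lower` by neighbour average, then swap adjacent pairs."""
    pos_u = {n: i for i, n in enumerate(upper)}
    neighbours = {n: [] for n in lower}
    for u, v in edges:
        if u in pos_u and v in neighbours:
            neighbours[v].append(pos_u[u])

    def average_position(n):
        ps = neighbours[n]
        # Vertices without neighbours above keep their current slot.
        return sum(ps) / len(ps) if ps else lower.index(n)

    ordered = sorted(lower, key=average_position)

    improved = True
    while improved:                       # stops: crossings strictly decrease
        improved = False
        for i in range(len(ordered) - 1):
            before = count_crossings(upper, ordered, edges)
            ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
            if count_crossings(upper, ordered, edges) < before:
                improved = True
            else:                         # no gain: undo the swap
                ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
    return ordered


# Hypothetical two-layer example.
upper = ["A", "B", "C"]
lower = ["X", "Y", "Z"]
edges = [("A", "Z"), ("B", "X"), ("C", "Y"), ("A", "Y")]
crossings_before = count_crossings(upper, lower, edges)
ordered = order_layer(upper, lower, edges)
crossings_after = count_crossings(upper, ordered, edges)
print(crossings_before, ordered, crossings_after)
```

Because this is only a heuristic, the result is not guaranteed to be the minimum number of crossings; it just tends to be much better than an arbitrary order.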

If my explanation wasn't clear enough, try reading the Wikipedia article on layered graph drawing again, or ask me any questions and I'll try to respond.
Remember that this is an algorithm for the general case, and lots of optimizations can be made to better handle your specific case.
If you want more ideas for features for your UML tool, look at the wonderful work done by JetBrains for their IntelliJ UML tool.

Hope that my comments here are helpful in any way.

Important update: since you stated that you are "Looking for an answer drawing from credible and/or official sources," I'm attaching the formal documentation from Graphviz (of dot's algorithm), whose authors "describe a four-pass algorithm for drawing directed graphs. The first pass finds an optimal rank assignment using a network simplex algorithm. The second pass sets the vertex order within ranks by an iterative heuristic incorporating a novel weight function and local transpositions to reduce crossings. The third pass finds optimal coordinates for nodes by constructing and ranking an auxiliary graph. The fourth pass makes splines to draw edges. The algorithm makes good drawings and runs fast." http://www.graphviz.org/Documentation/TSE93.pdf

OTHER TIPS

Constrained layout of connected components is a non-trivial problem that you might be better off using existing tools to solve. You mentioned Graphviz, but I don't think that you'll find a straightforward algorithm to port to Python. A better solution might be to use pydot to interface with Graphviz and let it handle layout.

The flow would look something like:

1. Generate data for the UML diagram
2. Convert to the dot language using pydot
3. Lay out using Graphviz tools, outputting dot language that includes the layout
4. Parse the outputted layout with pydot
5. Display using Python

Graphviz handles the layout, but all the display is still within Python to allow whatever custom behavior you want to support.
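To make the flow concrete without relying on any particular pydot version, here is a rough sketch that swaps pydot for the standard library: it writes the dot source by hand, runs the dot executable with -Tplain, and parses the "node" lines of Graphviz's plain-text output. The helper names (to_dot, parse_plain, layout) and the sample class names are made up for illustration, class names are assumed to contain no whitespace, and Graphviz is assumed to be installed and on the PATH.

```python
import shutil
import subprocess


def to_dot(inheritance):
    """Build dot source from {child: [parents]}; edges go child -> parent."""
    lines = ["digraph uml {", "  rankdir=BT;", "  node [shape=box];"]
    for child, parents in inheritance.items():
        lines.append(f'  "{child}";')
        for parent in parents:
            lines.append(f'  "{child}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)


def parse_plain(plain_text):
    """Read node centres and sizes (in inches) from `dot -Tplain` output."""
    boxes = {}
    for line in plain_text.splitlines():
        fields = line.split()
        # Node lines look like: node <name> <x> <y> <width> <height> <label> ...
        if fields and fields[0] == "node":
            name = fields[1].strip('"')
            x, y, width, height = map(float, fields[2:6])
            boxes[name] = (x, y, width, height)
    return boxes


def layout(inheritance):
    """Let Graphviz compute the positions; return {class: (x, y, w, h)}."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' executable not found on PATH")
    result = subprocess.run(
        ["dot", "-Tplain"],
        input=to_dot(inheritance),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_plain(result.stdout)


if __name__ == "__main__" and shutil.which("dot"):
    # Hypothetical class hierarchy: A and B both extend Base.
    for name, box in layout({"Base": [], "A": ["Base"], "B": ["Base"]}).items():
        print(name, box)
```

In the plain format the origin is the lower-left corner, so the y coordinate usually has to be flipped for a canvas whose origin is at the top-left. With rankdir=BT and child-to-parent edges, dot places superclasses above their subclasses.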

Building on blahdiblah's answer: you can indeed use the suggested workflow to generate your UML diagrams successfully.

However, this seems like a roundabout path to your solution, which doesn't seem desirable for the design of your application. Specifically, we want to reduce the number of moving parts required to get this working.

Instead of using pyreverse, I recommend looking into the alternatives mentioned in this thread. Specifically, a tool such as Epydoc may meet your needs better, both for the reduction in dependencies and in its (MIT) licensing structure.

Regardless of which path you choose, best of luck with your application.

I'm not a Python programmer, but I can suggest something at the functional level.

You need to know how many lines each class will have.
Keep the level number of each class; that will help you organize the classes by level.

If you want to show classes in an ordered manner (parents at the top, children at the bottom), you should keep track of the "weight" of each class. What I mean by weight is the number of "parents".

For example, if B inherits from A, B.weight = 1 and A.weight = 0. And if C inherits from B, C.weight = 2. If you represent this as a line, class A will be printed on line 0, B on line 1 and C on line 2. Generally speaking, all classes of the same "weight" would be printed on the same virtual line.

Of course, that's just the basic idea; positioning elements will be more difficult than this if you want to support complex cases (multiple inheritance, etc.).
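A small sketch of the weight idea in Python might look like this. The names class_weights and rows_by_weight are invented for the example, and for multiple inheritance it makes one possible choice that the description above does not prescribe: a class is placed one line below its deepest parent.

```python
def class_weights(parents):
    """Map each class to its weight: 0 for roots, else 1 + deepest parent."""
    weights = {}

    def weight(cls):
        if cls not in weights:
            # Classes that are not keys of `parents` are treated as roots.
            bases = parents.get(cls, [])
            weights[cls] = 1 + max(map(weight, bases)) if bases else 0
        return weights[cls]

    for cls in parents:
        weight(cls)
    return weights


def rows_by_weight(weights):
    """Group classes into virtual lines, line 0 first."""
    rows = {}
    for cls, w in weights.items():
        rows.setdefault(w, []).append(cls)
    return [rows[w] for w in sorted(rows)]


# Hypothetical hierarchy: B and E extend A, C extends B, D extends A and C.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["A", "C"], "E": ["A"]}
weights = class_weights(parents)
rows = rows_by_weight(weights)
print(weights)
print(rows)
```

Here B and E share a line because both are direct subclasses of A, while D is pushed below C even though it also inherits from A directly.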

You are unlikely to get good results from real projects that weren't developed UML-first. That is a lesson we learned some 10 years ago using the first Java-UML round-trip tools (TogetherJ). In text mode it is much too easy to get away with code that is impossible to draw nicely. The dynamic browser-based views of a Smalltalk system are a much more effective way to get insight into the code than what UML tools can currently provide.

For the layout, just take a look at all the work done in CAD for electronics, especially printed circuit boards (PCBs). There are good placing and routing algorithms there. One of the things I've not seen an automated UML tool do right is handling a lot of subclasses, where you want layout to change from a single line of classes below the parent to a double line with the low nodes shifted half a node.
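As a rough idea of what the double-line arrangement could look like in code, here is a small sketch. The function name place_subclasses, its parameters, and the threshold max_per_row are all assumptions for illustration; it simply alternates subclasses between two rows and shifts the lower row by half a node pitch.

```python
def place_subclasses(count, parent_x, node_w=120.0, gap=20.0, max_per_row=6):
    """Return (x, row) centres for `count` subclass boxes below a parent."""
    pitch = node_w + gap                  # distance between neighbouring centres
    if count <= max_per_row:
        # Few subclasses: one row, centred under the parent.
        start = parent_x - pitch * (count - 1) / 2
        return [(start + i * pitch, 0) for i in range(count)]

    # Many subclasses: alternate between two rows, lower row shifted half a pitch.
    upper_count = (count + 1) // 2
    start = parent_x - pitch * (upper_count - 1) / 2
    positions = []
    for i in range(count):
        row, col = i % 2, i // 2
        positions.append((start + col * pitch + row * pitch / 2, row))
    return positions


print(place_subclasses(3, parent_x=0.0, node_w=100.0, gap=0.0, max_per_row=2))
```

The staggering keeps the inheritance lines to the lower row from running straight through the boxes in the upper row.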

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow