Merging C Callergraphs with Doxygen or determining union of all calls

https://stackoverflow.com/questions/478860

20-08-2019
|

Question

I have a collection of legacy C code which I'm refactoring to split the C computational code from the GUI. This is complicated by the heavily recursive mathematical core code being K&R style declarations. I've already abandoned an attempt to convert these to ANSI declarations due to nested use of function parameters (just couldn't get those last 4 compiler errors to go).

I need to move some files into a pure DLL and determine the minimal interface to make public, which is going to require wrapper functions writing to publish a typed interface.

I've marked up the key source files with the Doxygen @callergraph markup so informative graphs are generated for individual functions. What I'd like to do beyond that is amalgamate these graphs so I can determine the narrowest boundary of functions exposed to the outside world.

The original header files are no use - they expose everything as untyped C functions.

There are hundreds of functions so simple inspection of the generated callergraphs is painful.

I'm considering writing some kind of DOT merge tool - setting DOT_CLEANUP=NO makes Doxygen leave the intermediate DOT files there rather then just retaining the png files they generated.

I'm not obsessed by this being a graphical solution - I'd be quite happy if someone could suggest an alternative analysis tool (free or relatively cheap) or technique using Doxygen's XML output to achieve the same goal.

A callergraph amalgamated at the file level does have a certain appeal for client documentation rather than a plain list :-)

Solution

In your Doxyfile, set

GENERATE_XML = YES

and then you can find your call graph in the XML files. For each function marked with the callergraph, you'll find <referencedby> elements in the output that you can use. Here's a sample from one of my C files:

  <memberdef kind="function" id="spfs_8c_1a3"
             prot="public" static="yes" const="no" explicit="no"
             inline="no" virt="non-virtual">
    <type>svn_error_t *</type>
    <definition>svn_error_t * init_apr</definition>
    <argsstring>(apr_pool_t **ppool, int *argc, char const *const **argv)</argsstring>
    <name>init_apr</name>
    <!-- param and description elements clipped for brevity ... -->
    <location file="src/spfs.c" line="26" bodystart="101" bodyend="120"/>
    <referencedby refid="spfs_8c_1a13" compoundref="spfs_8c"
                  startline="45" endline="94">main</referencedby>
  </memberdef>

So for every memberdef/referencedby pair, you have a caller-callee relationship, which you can grab via XSLT:

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:apply-templates select="//memberdef[@kind eq 'function']"/>
    </xsl:template>

    <xsl:template match="memberdef">
        <xsl:variable name="function-name"
                      select="concat(definition, argsstring)"/>
        <xsl:for-each select="referencedby">
            <xsl:value-of select="concat(./text(), ' calls ', $function-name, '&#xA;')"/>
        </xsl:for-each>
    </xsl:template>

</xsl:stylesheet>

Which gives you a line per caller-callee like this:

main calls svn_error_t * init_apr(apr_pool_t **ppool, int *argc, char const *const **argv)

You'll want to tweak that XSLT and then partition that directed graph in the way that cuts across the fewest edges. Woo hoo, an NP-complete problem! Luckily, there are lots of papers on the subject, some are here: http://www.sandia.gov/~bahendr/partitioning.html

OTHER TIPS

I had a similar requirement. Wrote a perl script to merge a set of dot files into a single dot file.

https://github.com/bharanis/scripts/blob/master/doxygen_dot_merge.pl

merge multiple doxygen generated dot files. This is useful for generating a call map for a file or a bunch of files.

1) This command is to be run from outside the html directory where doxygen puts all the html, dot and map files.

2) This command assumes flat directory structure used in doxygen CREATE_SUBDIRS = NO

3) doxygen prefixes the source filename to the name of the output dot files. One dot file is generated per function

4) provide the list of doxygen generated dot files to be merged. eg:

./doxydotmerge.pl  `ls html/ssd_*_8c*_cgraph.dot  | grep -v test | grep -v buf`

You could use Scientific Toolworks to see your system-wide call graph.

If you want to automate the analysis and the cutting, you might consider the DMS Software Reengineering Toolkit. It can compute full call graphs for C (complete with points-to analysis for function pointers), and has been proven for systems of 35 million lines of code. It will produce full system DOT files to inspect; they are pretty big but you can pick out subsets to look at. See flow analysis and call graphs.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow