Provide callgraphs for the Bitcoin-ABC Codebase
Open, Needs Triage, Public

Description

Figure out how to generate call graphs and XML data files from clang, and share the reports with the group so we can tackle low-hanging functions for clean-up.

Event Timeline

schancel created this task. Feb 15 2018, 05:35

I'll try to work on this...

matra774 claimed this task. Feb 24 2018, 20:16

I'd rather have it generated from perf; at least we'd have proper data and all. I'm not sure what a giant inscrutable XML would bring us.

matra774 added a comment. Mar 3 2018, 15:51

I've done some preliminary analysis of call graph generation and I'm sharing the results here.
Unfortunately I did not have enough time to actually analyze the bitcoin-abc code and find the low-hanging fruit that the task description mentions.
Still, I hope that's a good start.

There are different ways of generating call graphs:

Option #1: Ask clang to do it

Add the -S and -emit-llvm compiler flags, so that the compiler outputs LLVM IR instead of object code.

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 374f91729..06d791f1a 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -24,14 +24,16 @@ add_compiler_flag(
        -Wvla
        -Wformat-security
        -Wcast-align
+       -S 
+       -emit-llvm
 )

Compile!
Link different files together to get them on the same graph. For example:

llvm-link libbitcoin_common*.o -o single.o -v

Ask optimizer to generate call graph:

opt -analyze -dot-callgraph single.o

Demangle function names:

cat callgraph.dot | c++filt | sed 's,>,\\>,g; s,-\\>,->,g; s,<,\\<,g' | gawk '/external node/{id=$1} $1 != id' >callgraph2.dot
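
For anyone who finds the one-liner above hard to read, here is a rough Python sketch of what its sed and gawk stages appear to do: escape angle brackets so that dot does not misread demangled template names, restore the -> edge arrows, and drop the "external node" entry together with the edges that start at it. The function names are made up for illustration, and the demangling done by c++filt is assumed to have happened already.

```python
def escape_angle_brackets(line):
    """Mirror the sed stage: escape < and > but keep '->' edge arrows intact."""
    line = line.replace(">", "\\>")    # every '>' becomes '\>'
    line = line.replace("-\\>", "->")  # undo the escape inside edge arrows
    return line.replace("<", "\\<")    # every '<' becomes '\<'


def drop_external_node(lines):
    """Mirror the gawk stage: skip the 'external node' line and every
    later line whose first field is that node's id (its outgoing edges)."""
    external_id = None
    kept = []
    for line in lines:
        fields = line.split()
        first = fields[0] if fields else ""
        if "external node" in line:
            external_id = first
        if first != external_id:
            kept.append(line)
    return kept


def clean_callgraph(lines):
    """Apply both stages in the same order as the shell pipeline."""
    return drop_external_node([escape_angle_brackets(l) for l in lines])
```

Edges that point at the external node from other functions are kept, just as in the shell version, because only the first field of each line is compared.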

Generate image (you might need to use SVG format if dot complains that the bitmap is too large):

dot -Tpng -ocallgraph.png callgraph2.dot

Result: You get a huge graph with a bunch of connected nodes, which contains too much noise for practical use. Example:

For more info, see:

Option #2: Roll your own analysis script
Start with https://github.com/Vermeille/clang-callgraph. Sample output:

  prevector::size()
    prevector::is_direct()
  prevector::size()
CScript::GetOp(prevector<28, unsigned char, unsigned int, int>::const_iterator &, opcodetype &)
  CScript::GetOp2(prevector<28, unsigned char, unsigned int, int>::const_iterator &, opcodetype &, std::vector<uint8_t> *)
    prevector::const_iterator::operator>=(prevector<28, unsigned char, unsigned int, int>::const_iterator)
    prevector::end()
    operator-(prevector<28, unsigned char, unsigned int, int>::const_iterator, prevector<28, unsigned char, unsigned int, int>::const_iterator)
    prevector::end()
    prevector::const_iterator::const_iterator(const prevector<28, unsigned char, unsigned int, int>::const_iterator &)
    prevector::const_iterator::operator*()
    prevector::const_iterator::operator++(int)
    operator-(prevector<28, unsigned char, unsigned int, int>::const_iterator, prevector<28, unsigned char, unsigned int, int>::const_iterator)
    prevector::end()

Remove duplicates, filter out the stuff you do not need and write a bunch of code to do the analysis.
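
As a starting point for that post-processing, here is a minimal Python sketch. It assumes the output is a call tree in which each callee is indented two spaces deeper than its caller, as the sample above suggests; that reading of the format, the function names and the default prefix list are my own assumptions, not something taken from clang-callgraph.

```python
from collections import Counter


def parse_call_tree(text, indent=2):
    """Turn indented call-tree text into a set of unique (caller, callee) edges."""
    edges = set()
    stack = []  # stack[d] holds the most recent function seen at depth d
    for raw in text.splitlines():
        name = raw.strip()
        if not name:
            continue
        depth = (len(raw) - len(raw.lstrip(" "))) // indent
        del stack[depth:]                            # leave deeper levels
        stack.extend([None] * (depth - len(stack)))  # input may start mid-tree
        if depth > 0 and stack[-1] is not None:
            edges.add((stack[-1], name))             # the set removes duplicates
        stack.append(name)
    return edges


def drop_noise(edges, ignored_prefixes=("prevector::", "operator")):
    """Keep only edges whose callee does not start with an ignored prefix."""
    return {(a, b) for a, b in edges if not b.startswith(ignored_prefixes)}


def fan_in(edges):
    """Count distinct callers per callee."""
    return Counter(callee for _, callee in edges)
```

Functions with a fan-in of zero or one are the obvious candidates to look at first when hunting for clean-up targets, though virtual calls and function pointers may not show up in this kind of output.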

Option #3: Use a tool that already supports advanced analysis

I've found https://www.cppdepend.com/ quite interesting.
It can be used in Open Source projects for free: https://www.cppdepend.com/CppDependForOSS
They have both Windows and Linux versions, although the Linux version is outdated (I've asked for a newer Linux trial but got no response), so I used the Windows version.

How to use it:

  • Run bear make to generate compile_commands.json.
  • Expose your Linux sources as a mapped drive (for example z:/bitcoin-abc, z:/usrinclude), so that CppDepend on the Windows machine can access them.
  • Convert compile_commands.json to a set of .proj files that can be imported into CppDepend. I've written a simple .NET utility to automate the conversion. I can share it if anyone finds it useful.
  • Run the analysis.
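
To give an idea of what the conversion step has to work with, here is a small Python sketch that reads compile_commands.json and pulls out the source file, include directories and defines for each entry. It is not the .NET utility mentioned above and it does not write .proj files; the function names are hypothetical and only the common -I, -isystem and -D flag forms are handled.

```python
import json
import shlex


def load_entries(path):
    """Read a compile_commands.json file into a list of dicts."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize(entry):
    """Extract source file, include dirs and defines from one entry."""
    # An entry carries either an "arguments" list or a single "command" string.
    args = entry.get("arguments") or shlex.split(entry["command"])
    includes, defines = [], []
    pending = None  # list that should receive the next argument as its value
    for arg in args[1:]:
        if pending is not None:
            pending.append(arg)
            pending = None
        elif arg in ("-I", "-isystem"):
            pending = includes
        elif arg == "-D":
            pending = defines
        elif arg.startswith("-I"):
            includes.append(arg[2:])
        elif arg.startswith("-D"):
            defines.append(arg[2:])
    return {"file": entry["file"], "directory": entry["directory"],
            "includes": includes, "defines": defines}


def count_cpp_files(entries):
    """Number of distinct .cpp files that take part in the build."""
    return len({e["file"] for e in entries if e["file"].endswith(".cpp")})
```

Typical use would be `entries = load_entries("compile_commands.json")` followed by `[summarize(e) for e in entries]`; the include paths would still need to be rewritten to the mapped drive before a Windows tool can follow them.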

CppDepend supports LINQ-based queries and offers different visualization methods.
Use the dependency matrix, because dependency graphs are not readable when there are a lot of nodes.
Here is an example:

(Fun fact: did you know that the number of cpp files included in the bitcoin-abc compilation is 333?)

matra774 removed matra774 as the assignee of this task. Mar 3 2018, 15:52