This section explains how to integrate with compilers that are either unknown to or unsupported by kwinject. Given a log from your own build or a build trace generated by kwinject, it shows how to determine which of your build commands are worth intercepting (and including in your compiler filter file, <compiler_name>_filter.py) and which are not.
Most C/C++ compilers follow the same (or a very similar) execution pattern. There is generally a user-visible front-end (for example, gcc/g++ in GNU and cc/CC in Sun Forte) to various lower-level programs (a so-called "toolchain"). For example:
- preprocessor - preprocesses the code, that is, expands #include and #define directives
- compiler - parses the source code and generates assembler/binary code for a particular target architecture
- assembler - translates assembler code into binary object files
The same front-end is often used for link-time tasks as well, such as combining object files and libraries into executables and/or dynamic libraries (also known as shared objects).
For example, the typical (sub)process tree for gcc looks like this:
- gcc -D_EXAMPLE_ -I/blah/blah/include example.c
  - cpp0 -lang-c -v -I/blah/blah/include -D__GNUC__=2 ... -D_EXAMPLE_ example.c /tmp/XXX.i
    - gcc 2.x preprocessor; since gcc 3.x, the preprocessor and compiler have been merged into a single tool
  - cc1 /tmp/XXX.i -quiet ... -o /tmp/XXX.s
    - gcc 2.x compiler
  - -- or --
  - cc1 -quiet -v -I/blah/blah/include -D_EXAMPLE_ example.c ... example.c -o /tmp/XXX.s
    - gcc 3.x preprocessor and compiler; generates assembler code
  - as -V -Qy -o /tmp/XXX.o /tmp/XXX.s
    - assembler; translates assembler code into a binary object file
  - collect2 ... -dynamic-linker /lib/ld-linux.so.2 ... /tmp/XXX.o
    - linker; combines an object file with default system/gcc libraries into an executable
Note that gcc 2.x uses a separate subprocess for each part of the toolchain, while gcc 3.x combines the preprocessor and compiler into a single step.
In both cases, the gcc front-end does not do any of the compilation work itself; it simply delegates the tasks to other programs.
When integrating Klocwork into your build, you can ignore the "internal" programs used by your compiler. The Klocwork compiler, kwcc, works at the same level as most compiler front-ends, so kwinject does not need to know any low-level details.
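As an illustration, the following is a minimal Python sketch of separating front-end invocations from internal toolchain steps in a trace. It is a standalone script, not the kwinject filter file format; the names INTERNAL_TOOLS and is_internal are assumptions made for this example, and the trace lines are simplified versions of the gcc process tree above.

```python
import os
import shlex

# Internal gcc toolchain programs, taken from the process tree above.
# (Hypothetical helper data; extend it for your own toolchain.)
INTERNAL_TOOLS = {"cpp0", "cc1", "as", "collect2"}


def is_internal(command_line):
    """Return True if the command runs an internal toolchain program."""
    argv = shlex.split(command_line)
    if not argv:
        return False
    return os.path.basename(argv[0]) in INTERNAL_TOOLS


# Simplified trace of one gcc 3.x invocation and its subprocesses.
trace = [
    "gcc -D_EXAMPLE_ -I/blah/blah/include example.c",
    "cc1 -quiet -v -I/blah/blah/include -D_EXAMPLE_ example.c -o /tmp/XXX.s",
    "as -V -Qy -o /tmp/XXX.o /tmp/XXX.s",
    "collect2 -dynamic-linker /lib/ld-linux.so.2 /tmp/XXX.o",
]

# Only the front-end invocation is worth intercepting.
front_end_commands = [cmd for cmd in trace if not is_internal(cmd)]
print(front_end_commands)
```

Only the gcc line survives the filter, which matches the level at which kwcc operates.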
Some toolchains (such as gcc = [cpp0 +] cc1 + as + collect2) are well known. Others can be easily discovered using a build trace. Use the following rules to determine which command lines are worth intercepting in a build integration:
- Look for the recurring sequence of executed programs that follows the compiler front-end invocation (such as ccmips.exe -> build.exe, ecommip.exe).
- If a command line always follows the compiler front-end and inherits most of its arguments (such as -D, -I, and the name of the input source file), it is most likely part of the toolchain and can be safely ignored. Alternatively, compare two consecutive command lines after discarding all uninteresting options and arguments (most likely everything but -D, -I, and the name of the source file). If intercepting and filtering both command lines would result in duplicate commands, the second command line should be ignored.
- Commands that use temporary files (/tmp/XXX or /var/tmp/XXX on UNIX, C:\Documents and Settings\JoeUser\Local Settings\Temp\xxx on Windows) as input or output are most likely "internal" and should be ignored.
- Look for three types of commands:
  - compiler type - takes one or more source files (such as .c, .cpp, or .cc) as input and generates an object file or executable (such as .o, .obj, or .exe) as output. The output is likely specified with one of the following options: -o, -out or -output.
  - linker type - takes a number of object files and libraries and generates an executable or run-time library (such as .exe, .dll, or .so). The same kind of output option is often used.
  - library type - takes a number of object files and generates a static library (xxx.lib, libxxx.a).
- Commands that do not fall into one of the three types above are most likely of no interest to kwinject. For example, the Green Hills gnm.exe looks like a linker command, since it takes object files and libraries as input, but it does not generate an executable or a dynamic/static library. Therefore, it should not be intercepted.
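The rules above can be sketched as a small Python script. This is a rough heuristic under stated assumptions, not the kwinject filter API: every function and constant name here is hypothetical, the extension sets are limited to the examples given in the rules, and static library commands without an output option are assumed to follow ar-style ordering (archive name before the object files).

```python
import os
import re

SOURCE_EXTS = {".c", ".cpp", ".cc"}
OBJECT_EXTS = {".o", ".obj"}
STATIC_LIB_EXTS = {".a", ".lib"}
OUTPUT_OPTIONS = {"-o", "-out", "-output"}
# Temporary locations named in the rules above.
TEMP_PATTERN = re.compile(r"^/tmp/|^/var/tmp/|\\Local Settings\\Temp\\",
                          re.IGNORECASE)


def ext(path):
    return os.path.splitext(path)[1].lower()


def uses_temp_files(argv):
    return any(TEMP_PATTERN.search(arg) for arg in argv[1:])


def signature(argv):
    """Reduce a command to its -D/-I options and source file names."""
    return tuple(sorted(a for a in argv[1:]
                        if a.startswith(("-D", "-I")) or ext(a) in SOURCE_EXTS))


def classify(argv):
    """Return 'compiler', 'linker', 'library', or 'ignore'."""
    if uses_temp_files(argv):
        return "ignore"
    output, files = None, []
    args = iter(argv[1:])
    for arg in args:
        if arg in OUTPUT_OPTIONS:
            output = next(args, None)  # the option's value is the output file
        elif ext(arg) in SOURCE_EXTS | OBJECT_EXTS | STATIC_LIB_EXTS:
            files.append(arg)
    if any(ext(f) in SOURCE_EXTS for f in files):
        return "compiler"
    if not any(ext(f) in OBJECT_EXTS for f in files):
        return "ignore"
    if output is not None:
        return "library" if ext(output) in STATIC_LIB_EXTS else "linker"
    # Assumption: ar-style commands name the archive before the object files.
    return "library" if ext(files[0]) in STATIC_LIB_EXTS else "ignore"


def select_commands(trace):
    """Keep interesting commands; drop one that duplicates its predecessor."""
    selected, previous = [], None
    for argv in trace:
        sig = signature(argv)
        if classify(argv) != "ignore" and not (sig and sig == previous):
            selected.append((classify(argv), argv))
        previous = sig
    return selected


trace = [
    ["gcc", "-D_EXAMPLE_", "-I/blah/blah/include", "-c", "example.c",
     "-o", "example.o"],
    ["cc1", "-quiet", "-I/blah/blah/include", "-D_EXAMPLE_", "example.c",
     "-o", "/tmp/XXX.s"],
    ["as", "-o", "example.o", "/tmp/XXX.s"],
    ["ar", "rcs", "libexample.a", "example.o"],
    ["gcc", "-o", "example", "example.o"],
]
for kind, argv in select_commands(trace):
    print(kind, " ".join(argv))
```

In this trace the cc1 and as steps are dropped because they touch temporary files, while the compile, archive, and link commands are kept. A heuristic like this only narrows the candidates; tools that merely read object files still need a manual check against the build trace.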