
How to avoid LLVM's Support CommandLine leaking library arguments?

I've been working on a compiler for a language of mine and wanted to utilize the LLVM Support Library CommandLine to handle argument parsing.

I have only added two simple declarations:

static cl::opt<std::string>
OutputFilename("o", cl::desc("Output filename"), cl::value_desc("filename"));

static cl::list<std::string> 
InputFilenames("i", cl::desc("Input files"), cl::value_desc("filenames"), cl::OneOrMore);

I then add the usual call in main:

int main(int argc, char *argv[])
{
    cl::ParseCommandLineOptions(argc, argv, " My compiler\n");

...

The problem is very apparent when passing -help to my program:

General options:

  -aarch64-neon-syntax             - Choose style of NEON code to emit from AArch64 backend:
    =generic                       -   Emit generic NEON assembly
    =apple                         -   Emit Apple-style NEON assembly
  -cppfname=<function name>        - Specify the name of the generated function
  -cppfor=<string>                 - Specify the name of the thing to generate
  -cppgen                          - Choose what kind of output to generate
    =program                       -   Generate a complete program
    =module                        -   Generate a module definition
    =contents                      -   Generate contents of a module
    =function                      -   Generate a function definition
    =functions                     -   Generate all function definitions
    =inline                        -   Generate an inline function
    =variable                      -   Generate a variable definition
    =type                          -   Generate a type definition
  -debugger-tune                   - Tune debug info for a particular debugger
    =gdb                           -   gdb
    =lldb                          -   lldb
    =sce                           -   SCE targets (e.g. PS4)
  -disable-spill-fusing            - Disable fusing of spill code into instructions
  -enable-implicit-null-checks     - Fold null checks into faulting memory operations
  -enable-load-pre                 - 
  -enable-objc-arc-opts            - enable/disable all ARC Optimizations
  -enable-scoped-noalias           - 
  -enable-tbaa                     - 
  -exhaustive-register-search      - Exhaustive Search for registers bypassing the depth and interference cutoffs of last chance recoloring
  -gpsize=<uint>                   - Global Pointer Addressing Size.  The default size is 8.
  -i=<filenames>                   - Input files
  -imp-null-check-page-size=<uint> - The page size of the target in bytes
  -join-liveintervals              - Coalesce copies (default=true)
  -limit-float-precision=<uint>    - Generate low-precision inline sequences for some float libcalls
  -merror-missing-parenthesis      - Error for missing parenthesis around predicate registers
  -merror-noncontigious-register   - Error for register names that aren't contigious
  -mfuture-regs                    - Enable future registers
  -mips16-constant-islands         - Enable mips16 constant islands.
  -mips16-hard-float               - Enable mips16 hard float.
  -mno-compound                    - Disable looking for compound instructions for Hexagon
  -mno-ldc1-sdc1                   - Expand double precision loads and stores to their single precision counterparts
  -mno-pairing                     - Disable looking for duplex instructions for Hexagon
  -mwarn-missing-parenthesis       - Warn for missing parenthesis around predicate registers
  -mwarn-noncontigious-register    - Warn for register names that arent contigious
  -mwarn-sign-mismatch             - Warn for mismatching a signed and unsigned value
  -nvptx-sched4reg                 - NVPTX Specific: schedule for register pressue
  -o=<filename>                    - Output filename
  -print-after-all                 - Print IR after each pass
  -print-before-all                - Print IR before each pass
  -print-machineinstrs=<pass-name> - Print machine instrs
  -regalloc                        - Register allocator to use
    =default                       -   pick register allocator based on -O option
    =fast                          -   fast register allocator
    =greedy                        -   greedy register allocator
    =pbqp                          -   PBQP register allocator
  -rewrite-map-file=<filename>     - Symbol Rewrite Map
  -rng-seed=<seed>                 - Seed for the random number generator
  -stackmap-version=<int>          - Specify the stackmap encoding version (default = 1)
  -stats                           - Enable statistics output from program (available with Asserts)
  -time-passes                     - Time each pass, printing elapsed time for each on exit
  -verify-debug-info               - 
  -verify-dom-info                 - Verify dominator info (time consuming)
  -verify-loop-info                - Verify loop info (time consuming)
  -verify-regalloc                 - Verify during register allocation
  -verify-region-info              - Verify region info (time consuming)
  -verify-scev                     - Verify ScalarEvolution's backedge taken counts (slow)
  -x86-asm-syntax                  - Choose style of code to emit from X86 backend:
    =att                           -   Emit AT&T-style assembly
    =intel                         -   Emit Intel-style assembly

Generic Options:

  -help                            - Display available options (-help-hidden for more)
  -help-list                       - Display list of available options (-help-list-hidden for more)
  -version                         - Display the version of this program

I would obviously like to cut down the noise to show only the relevant command line options I have exposed.

I realize the CommandLine utility relies on global variables and template metaprogramming, so my problem is probably due to linking with LLVM statically: every option declared as a global in any linked LLVM library registers itself in my program.
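
The leak follows from how self-registering globals behave in general. As a minimal sketch (ToyOption, registry, and helpNames are hypothetical stand-ins for illustration, not LLVM's actual classes), every option object adds itself to a process-wide list in its constructor, so a global defined in a statically linked library ends up right next to the tool's own options:

```cpp
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a command-line option; NOT LLVM's cl::Option.
struct ToyOption {
  std::string Name;
  std::string Desc;
  ToyOption(std::string N, std::string D);
};

// Process-wide registry, created on first use so that it already exists
// when the first global ToyOption constructor runs.
std::vector<ToyOption *> &registry() {
  static std::vector<ToyOption *> Options;
  return Options;
}

// Every option registers itself as a side effect of construction.
ToyOption::ToyOption(std::string N, std::string D)
    : Name(std::move(N)), Desc(std::move(D)) {
  registry().push_back(this);
}

// Imagine this global lives inside a statically linked backend library...
static ToyOption NeonSyntax("aarch64-neon-syntax", "Choose style of NEON code");

// ...and this one is declared by the tool itself.
static ToyOption OutputFilename("o", "Output filename");

// A help printer that walks the registry cannot tell the two apart.
std::vector<std::string> helpNames() {
  std::vector<std::string> Names;
  for (const ToyOption *O : registry())
    Names.push_back(O->Name);
  return Names;
}
```

Calling helpNames() in this model yields both names, which is the same effect as the long -help listing above: nothing in the registry records which library an option came from.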

I have found several links that seem to touch on the issue, but nothing concrete as a solution other than possibly linking to LLVM dynamically.


  • LLVM CommandLine: how to reset arguments?
  • LLVMdev: Command line options being put in Target backend libraries
  • Bugzilla 8860: command-line argument parser provides too many options when building with --enable-shared

I am running on OS X El Capitan Version 10.11.1

I installed llvm as follows:

git clone http://llvm.org/git/llvm.git
git clone http://llvm.org/git/clang.git llvm/tools/clang
git clone http://llvm.org/git/clang-tools-extra.git llvm/tools/clang/tools/extra
git clone http://llvm.org/git/compiler-rt.git llvm/projects/compiler-rt
git clone http://llvm.org/git/libcxx.git llvm/projects/libcxx
git clone http://llvm.org/git/libcxxabi.git llvm/projects/libcxxabi

mkdir build_llvm
cd build_llvm && cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/usr/local/llvm ../llvm
make

The relevant portions of my Makefile:

LLVMCONFIG = llvm-config
CPPFLAGS = `$(LLVMCONFIG) --cxxflags` -std=c++11
LDFLAGS = `$(LLVMCONFIG) --ldflags` -lpthread -ldl -lz -lncurses -rdynamic
LIBS = `$(LLVMCONFIG) --libs`

%.o: %.cpp
    clang++ -I /usr/local/llvm/include -c $(CPPFLAGS) $< -o $@

mycompiler: $(OBJECTS)
    clang++ -I /usr/local/llvm/include -g $^ $(LIBS) $(LDFLAGS) -o $@

Matthew Sanders asked Dec 08 '15 03:12


3 Answers

Geoff Reedy's answer worked great for me, so I accepted it. However, I wanted to give more details on how to set it up, as well as some other information I found.

As before, I have my options in declarative form above my main function. However, I have added a cl::OptionCategory declaration and passed it to each option in that category via cl::cat:

cl::OptionCategory 
CompilerCategory("Compiler Options", "Options for controlling the compilation process.");

static cl::opt<std::string>
OutputFilename("o", cl::desc("Output filename"), cl::value_desc("filename"), cl::cat(CompilerCategory));

static cl::list<std::string> 
InputFilenames("i", cl::desc("Input files"), cl::value_desc("filenames"), cl::OneOrMore, cl::cat(CompilerCategory));

Then I have the call to HideUnrelatedOptions within main just before ParseCommandLineOptions:

int main(int argc, char *argv[])
{
    cl::HideUnrelatedOptions( CompilerCategory );
    cl::ParseCommandLineOptions(argc, argv, " My compiler\n");

... 

Now my OPTIONS output looks much better:

OPTIONS:

Compiler Options:
Options for controlling the compilation process.

  -i=<filenames> - Input files
  -o=<filename>  - Output filename

Generic Options:

  -help          - Display available options (-help-hidden for more)
  -help-list     - Display list of available options (-help-list-hidden for more)
  -version       - Display the version of this program

This basically marks every option outside the given category (apart from the generic ones) as cl::ReallyHidden, so they do not even show up with -help-hidden :).

The other useful information I found was in the CommandLine Library Manual.
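
To make the hiding rule concrete, here is a minimal model of it. The names ToyOption, hideUnrelated, and visibleNames are hypothetical and the category is reduced to a plain string, so this is a sketch of the behavior described above rather than LLVM's implementation. It assumes, as the LLVM manual states, that -help lists only non-hidden options, -help-hidden additionally lists hidden ones, and really-hidden options are never listed:

```cpp
#include <string>
#include <vector>

// Mirrors the three visibility levels of the CommandLine library.
enum class Visibility { NotHidden, Hidden, ReallyHidden };

// Toy option record; the category is just a string in this model.
struct ToyOption {
  std::string Name;
  std::string Category;
  Visibility Hidden;
};

// Mark every option outside `Keep` (and outside the generic category)
// as ReallyHidden, which is the effect described for HideUnrelatedOptions.
void hideUnrelated(std::vector<ToyOption> &Options, const std::string &Keep) {
  for (ToyOption &O : Options)
    if (O.Category != Keep && O.Category != "Generic Options")
      O.Hidden = Visibility::ReallyHidden;
}

// ShowHidden == false models -help, ShowHidden == true models -help-hidden.
std::vector<std::string> visibleNames(const std::vector<ToyOption> &Options,
                                      bool ShowHidden) {
  std::vector<std::string> Names;
  for (const ToyOption &O : Options) {
    if (O.Hidden == Visibility::ReallyHidden)
      continue; // never printed
    if (O.Hidden == Visibility::Hidden && !ShowHidden)
      continue; // printed only by -help-hidden
    Names.push_back(O.Name);
  }
  return Names;
}
```

In this model, once hideUnrelated has run with "Compiler Options", even visibleNames(Options, true) returns only the compiler options and the generic ones.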

I know LLVM is in constant flux so just in case the page is gone in the future here is the example:

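#include "llvm/ADT/StringMap.h"
#include "llvm/Support/CommandLine.h"
#include <cassert>

using namespace llvm;
int main(int argc, char **argv) {
  cl::OptionCategory AnotherCategory("Some options");

  // Newer LLVM releases return the registered-option map by reference;
  // older releases filled a caller-supplied map via cl::getRegisteredOptions(Map).
  StringMap<cl::Option*> &Map = cl::getRegisteredOptions();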

  //Unhide useful option and put it in a different category
  assert(Map.count("print-all-options") > 0);
  Map["print-all-options"]->setHiddenFlag(cl::NotHidden);
  Map["print-all-options"]->setCategory(AnotherCategory);

  //Hide an option we don't want to see
  assert(Map.count("enable-no-infs-fp-math") > 0);
  Map["enable-no-infs-fp-math"]->setHiddenFlag(cl::Hidden);

  //Change --version to --show-version
  assert(Map.count("version") > 0);
  Map["version"]->setArgStr("show-version");

  //Change --help description
  assert(Map.count("help") > 0);
  Map["help"]->setDescription("Shows help");

  cl::ParseCommandLineOptions(argc, argv, "This is a small program to demo the LLVM CommandLine API");
  ...
}

This basically shows the power of the system and demonstrates how to modify specific options however you wish.

Matthew Sanders answered Nov 01 '22 15:11


As far as I can see, the best you can do is to put all of your options into particular categories and use void llvm::cl::HideUnrelatedOptions(ArrayRef<const cl::OptionCategory *> Categories) (or cl::HideUnrelatedOptions(cl::OptionCategory &Category) if you have only a single category). The documentation indicates that clang uses it to get control over the command line options, though I haven't checked the clang source to see if that is really true.

Geoff Reedy answered Nov 01 '22 14:11


Sorry for reviving this thread from long ago. However, I was facing the exact same problem and found the two existing answers invaluable. Thank you @geoff-reedy and @matthew-sanders! As I followed the existing answers' advice, I ran into a pretty big problem:

If I invoked the program with -help-hidden, the headers of the unrelated categories were still printed, even though none of the options they contained were displayed. It made for very curious output.

So, I dug in a little bit and came up with an alternate solution based on the awesome, existing answers. For posterity I will post it here in case it helps someone (probably a future version of me)!

Instead of declaring the command line values that I wanted my program to accept as file-scoped global variables (the typical advice for using the CommandLine library), I had to create them as variables local to some scope. That essentially delayed the options' addition to the GlobalParser object declared in the library's implementation until that scope was elaborated. Then, before that scope was elaborated, I invoked llvm::cl::ResetCommandLineParser. That removed all the leaked command-line options from the GlobalParser. Then, when the constructors for my options ran, they registered the only command-line options available in the GlobalParser.
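
The ordering is the whole trick, so here is a minimal model of it. ToyRegistry, ToyOption, and optionsSeenByTool are hypothetical names used only for illustration, and the registry stores plain names instead of option objects; this is a sketch of the sequence described above, not the library's code:

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the library's global parser state.
struct ToyRegistry {
  std::vector<std::string> Names;
  void reset() { Names.clear(); } // models ResetCommandLineParser
};

ToyRegistry &globalRegistry() {
  static ToyRegistry R;
  return R;
}

// Toy option that registers its name when constructed.
struct ToyOption {
  explicit ToyOption(std::string Name) {
    globalRegistry().Names.push_back(std::move(Name));
  }
};

// Registered during static initialization, i.e. before main() starts.
// This plays the role of an option leaked from a linked library.
static ToyOption LeakedBackendFlag("x86-asm-syntax");

std::vector<std::string> optionsSeenByTool() {
  globalRegistry().reset();        // 1. drop everything registered so far
  ToyOption Source("source-file"); // 2. locals register only now
  ToyOption NoLines("nl");
  return globalRegistry().Names;   // 3. snapshot of what a parser would see
}
```

In this model optionsSeenByTool() returns only "source-file" and "nl". Had the two options been file-scoped globals, they would have been registered before the reset and wiped out together with the leaked one.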

Finally, I specified a "discard" llvm::raw_ostream (llvm::raw_null_ostream) to pass to the Errs parameter of the invocation of llvm::cl::ParseCommandLineOptions. The presence of an argument for Errs in the call to llvm::cl::ParseCommandLineOptions forces the library not to terminate the program if there are errors parsing the arguments. I specified a null output stream because I wanted to generate my own error messages, but specifying a real stream would give the same signal to llvm::cl::ParseCommandLineOptions and allow you to print the parser-generated error messages for the user.

Of course, there were problems left to solve. Most notably, this approach meant that I had to specify and handle -help (or -help-hidden) myself. It was relatively easy to do that using the llvm::cl::PrintHelpMessage() function. Next, I had to handle the case where llvm::cl::Required options were not properly specified by the user. Handling this problem was also straightforward -- llvm::cl::ParseCommandLineOptions will return false when there is an error parsing the command-line arguments.

One of the benefits of declaring the command-line option variables at the global scope is that other functions can access them easily. With the approach that I took, this feature is no longer available. Although I did not handle that problem in the code I wrote, it might be possible to write a set of helper functions that retrieve the values of the command-line option variables from static, function-local storage.
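
One possible shape for such helpers is the function-local static accessor. The sketch below is an assumption about how that could look, not code from the program described here: ToyFlag is a hypothetical stand-in for llvm::cl::opt<bool>, and suppressLineNumbers and formatLine are invented names.

```cpp
#include <string>

// Hypothetical stand-in for llvm::cl::opt<bool>.
struct ToyFlag {
  std::string Name;
  bool Value;
};

// The flag is constructed on the first call and lives until program exit,
// so any function can reach it without a file-scoped global.
ToyFlag &suppressLineNumbers() {
  static ToyFlag Flag{"nl", false};
  return Flag;
}

// A function elsewhere in the program reads the option through the accessor.
std::string formatLine(int Number, const std::string &Text) {
  if (suppressLineNumbers().Value)
    return Text;
  return std::to_string(Number) + ": " + Text;
}
```

With a real cl::opt inside the accessor, the first call would presumably have to happen after llvm::cl::ResetCommandLineParser and before llvm::cl::ParseCommandLineOptions, since the option only registers itself when the static is constructed.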

For reference, here is the code that I ended up writing:

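#include "llvm/Support/CommandLine.h"
#include "llvm/Support/raw_ostream.h"
#include <cstdlib>
#include <string>

[[noreturn]] void usage() {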
  llvm::cl::PrintHelpMessage();
  exit(EXIT_FAILURE);
}

int main(int argc, char **argv) {  
  llvm::raw_null_ostream raw_discard{};
  llvm::cl::ResetCommandLineParser();

  llvm::cl::OptionCategory SeePlusPlusCategory(
      "See Plus Plus Options",
      "Options for controlling the behavior of See Plus Plus.");
  llvm::cl::opt<std::string> SourceCodeFilename(
      llvm::cl::Positional, llvm::cl::desc("<source file>"), llvm::cl::Required,
      llvm::cl::cat(SeePlusPlusCategory));
  llvm::cl::opt<bool> SuppressLineNumbers(
      "nl", llvm::cl::desc("Suppress line numbers in output."),
      llvm::cl::cat(SeePlusPlusCategory));
  llvm::cl::opt<bool> Help(
      "help", llvm::cl::desc("Display available options"),
      llvm::cl::ValueDisallowed, llvm::cl::cat(SeePlusPlusCategory));

  llvm::cl::HideUnrelatedOptions(SeePlusPlusCategory);
  if (!llvm::cl::ParseCommandLineOptions(argc, argv, "", &raw_discard)) {
    usage();
  }

  if (Help) {
    usage();
  }

  // Program continues from here.
}

Again, thank you to Matthew and Geoff for their great answers. I hope that what I wrote above helps someone else in the future as much as their answers helped me!

Will Hawkins answered Nov 01 '22 13:11