
Building a tool immediately so it can be used later in same CMake run

Tags:

cmake

I have an interesting chicken-and-egg problem and a potential solution to it (see my posted answer), but that solution uses CMake in an unusual way. Better alternatives or comments would be welcome.

THE PROBLEM:

The simple version of the problem can be described as a single CMake project with the following characteristics:

  1. One of the build targets is a command-line executable which I'll call mycomp. Its source lives in a directory I'll call mycompdir, and the contents of that directory cannot be modified.
  2. The project contains text files (I'll call them foo.my and bar.my) which need mycomp run on them to produce a set of C++ sources and headers and some CMakeLists.txt files defining libraries built from those sources.
  3. Other build targets in the same project need to link against the libraries defined by those generated CMakeLists.txt files. These other targets also have sources which #include some of the generated headers.

You can think of mycomp as being something like a compiler and the text files in step 2 as some sort of source files. This presents a problem, because CMake needs the CMakeLists.txt files at configure time, but mycomp is not available until build time and therefore isn't available on the first run to create the CMakeLists.txt files early enough.
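To make the ordering problem concrete, here is a minimal sketch of the naive arrangement and where it breaks. The project name and the generated foobar directory layout are assumptions for illustration; the -outdir option is borrowed from the answer's own invocation of mycomp.

```cmake
# Naive top-level CMakeLists.txt, shown only to illustrate the failure.
cmake_minimum_required(VERSION 3.5)
project(chicken_and_egg CXX)   # hypothetical project name

# Defines the mycomp executable target. The binary itself does not
# exist until build time.
add_subdirectory(mycompdir)

# This rule also only runs at build time, so the generated
# CMakeLists.txt cannot exist during the first configure step.
add_custom_command(
    OUTPUT  "${CMAKE_BINARY_DIR}/foobar/CMakeLists.txt"
    COMMAND mycomp -outdir "${CMAKE_BINARY_DIR}/foobar"
            "${CMAKE_SOURCE_DIR}/foo.my" "${CMAKE_SOURCE_DIR}/bar.my"
    DEPENDS mycomp "${CMAKE_SOURCE_DIR}/foo.my" "${CMAKE_SOURCE_DIR}/bar.my"
)

# Configure-time error on a fresh build tree: the directory given
# here has not been generated yet.
add_subdirectory("${CMAKE_BINARY_DIR}/foobar" "${CMAKE_BINARY_DIR}/foobar-build")
```

The last add_subdirectory() call needs a file that only a build step can produce, which is the chicken-and-egg situation described above.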

NON-ANSWER:

Normally, an ExternalProject-based superbuild arrangement would be a potential solution to this, but the above is a considerable simplification of the actual project I am dealing with and I don't have the freedom to split the build into different parts or perform other large scale restructuring work.
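For readers unfamiliar with the superbuild arrangement being ruled out, a rough sketch follows. The target names (mycomp_tool, main_project), the main/ source directory and the mycomp_EXECUTABLE cache variable are hypothetical, and the tool path assumes a single-config generator such as Unix Makefiles or Ninja.

```cmake
# Hypothetical superbuild CMakeLists.txt: it builds nothing itself and
# only orchestrates two separate sub-builds in the right order.
cmake_minimum_required(VERSION 3.5)
project(superbuild NONE)
include(ExternalProject)

# First sub-build: configure and build the tool on its own.
ExternalProject_Add(mycomp_tool
    SOURCE_DIR      "${CMAKE_SOURCE_DIR}/mycompdir"
    BINARY_DIR      "${CMAKE_BINARY_DIR}/mycomp-build"
    INSTALL_COMMAND ""   # no install step needed
)

# Second sub-build: the real project. It is configured only after the
# tool exists, so it could run mycomp at configure time.
ExternalProject_Add(main_project
    SOURCE_DIR      "${CMAKE_SOURCE_DIR}/main"   # hypothetical location
    BINARY_DIR      "${CMAKE_BINARY_DIR}/main-build"
    CMAKE_ARGS      "-Dmycomp_EXECUTABLE=${CMAKE_BINARY_DIR}/mycomp-build/mycomp"
    INSTALL_COMMAND ""
    DEPENDS         mycomp_tool
)
```

This requires splitting the project into separately configured parts, which is exactly the restructuring that is not available here.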

Asked Mar 18 '16 at 12:03 by Craig Scott



1 Answer

The crux of the problem is needing mycomp to be available when CMake is run so that the generated CMakeLists.txt files can be created and then pulled in with add_subdirectory(). A possible way to achieve this is to use execute_process() to run a nested cmake-and-build from the main build. That nested cmake-and-build would use the exact same source and binary directories as the top level CMake run (unless cross compiling). The general structure of the main top level CMakeLists.txt would be something like this:

# Usual CMakeLists.txt setup stuff goes here...

if(EARLY_BUILD)
    # This is the nested build and we will only be asked to
    # build the mycomp target (see (c) below)
    add_subdirectory(mycompdir)

    # End immediately, we don't want anything else in the nested build
    return()
endif()

# This is the main build, setup and execute the nested build
# to ensure the mycomp executable exists before continuing

# (a) When cross compiling, we cannot re-use the same binary dir
#     because the host and target are different architectures
if(CMAKE_CROSSCOMPILING)
    set(workdir "${CMAKE_BINARY_DIR}/host")
    execute_process(COMMAND ${CMAKE_COMMAND} -E make_directory "${workdir}")
else()
    set(workdir "${CMAKE_BINARY_DIR}")
endif()

# (b) Nested CMake run. May need more -D... options than shown here.
execute_process(COMMAND ${CMAKE_COMMAND} -G "${CMAKE_GENERATOR}"
                        -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
                        -DCMAKE_MAKE_PROGRAM=${CMAKE_MAKE_PROGRAM}
                        -DEARLY_BUILD=ON
                        ${CMAKE_SOURCE_DIR}
               WORKING_DIRECTORY "${workdir}")

# (c) Build just mycomp in the nested build. Don't specify a --config
#     because we cannot know what config the developer will be using
#     at this point. For non-multi-config generators, we've already
#     specified CMAKE_BUILD_TYPE above in (b).
execute_process(COMMAND ${CMAKE_COMMAND} --build . --target mycomp
                WORKING_DIRECTORY "${workdir}")

# (d) We want everything from mycompdir in our main build,
#     not just the mycomp target
add_subdirectory(mycompdir)

# (e) Run mycomp on the sources to generate a CMakeLists.txt in the
#     ${CMAKE_BINARY_DIR}/foobar directory. Note that because we want
#     to support cross compiling, working out the location of the
#     executable is a bit more tricky. We cannot know whether the user
#     wants debug or release build types for multi-config generators
#     so we have to choose one. We cannot query the target properties
#     because they are only known at generate time, which is after here.
#     Best we can do is hardcode some basic logic.
if(MSVC)
    set(mycompsuffix "Debug/mycomp.exe")
elseif(CMAKE_GENERATOR STREQUAL "Xcode")
    set(mycompsuffix "Debug/mycomp")
else()
    set(mycompsuffix "mycomp")
endif()
set(mycomp_EXECUTABLE "${workdir}/mycompdir/${mycompsuffix}")
execute_process(COMMAND "${mycomp_EXECUTABLE}" -outdir foobar
                        "${CMAKE_SOURCE_DIR}/foo.my" "${CMAKE_SOURCE_DIR}/bar.my"
                WORKING_DIRECTORY "${CMAKE_BINARY_DIR}")

# (f) Now pull that generated CMakeLists.txt into the main build.
#     It will create a CMake library target called foobar.
add_subdirectory(${CMAKE_BINARY_DIR}/foobar ${CMAKE_BINARY_DIR}/foobar-build)

# (g) Another target which links to the foobar library
#     and includes headers from there
add_executable(gumby gumby.cpp)
target_link_libraries(gumby PRIVATE foobar)
target_include_directories(gumby PRIVATE "${CMAKE_BINARY_DIR}/foobar")

If we don't re-use the same binary directory at (b) and (c) as we use for the main build, we end up building mycomp twice, which we obviously want to avoid. For cross compiling, we cannot avoid that, so in such cases we build the mycomp tool off to the side in a separate binary directory.
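The listing above ignores the exit status of the nested commands, so a failed nested configure or build would only show up later as a confusing error. One possible refinement, sketched here with hypothetical variable names (configure_result, build_result), is to capture the result of steps (b) and (c) and stop early:

```cmake
# Sketch only: same commands as steps (b) and (c), with status checks.
# Assumes workdir has been set as in step (a).
execute_process(COMMAND ${CMAKE_COMMAND} -G "${CMAKE_GENERATOR}"
                        -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE}
                        -DCMAKE_MAKE_PROGRAM=${CMAKE_MAKE_PROGRAM}
                        -DEARLY_BUILD=ON
                        "${CMAKE_SOURCE_DIR}"
                WORKING_DIRECTORY "${workdir}"
                RESULT_VARIABLE configure_result)
if(NOT configure_result EQUAL 0)
    message(FATAL_ERROR "Nested configure failed: ${configure_result}")
endif()

execute_process(COMMAND ${CMAKE_COMMAND} --build . --target mycomp
                WORKING_DIRECTORY "${workdir}"
                RESULT_VARIABLE build_result)
if(NOT build_result EQUAL 0)
    message(FATAL_ERROR "Nested build of mycomp failed: ${build_result}")
endif()
```

RESULT_VARIABLE holds either the exit code of the process or a string describing an error, so comparing against 0 covers both cases.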

I've experimented with the above approach and indeed it appears to work in the real world project that prompted the original question, at least for the Unix Makefiles, Ninja, Xcode (OS X and iOS) and Visual Studio generators. Part of the attractiveness of this approach is that it only requires a modest amount of code to be added just to the top level CMakeLists.txt file. Nevertheless, there are some observations that should be made:

  • If the compiler or linker commands for mycomp and its sources are different in any way between the nested build and the main build, the mycomp target ends up getting rebuilt a second time at (d). If there are no differences, mycomp only gets built once when not cross compiling, which is exactly what we want.
  • I see no easy way to pass exactly the same arguments to the nested invocation of CMake at (b) as were passed to the top level CMake run (basically the problem described here). Reading CMakeCache.txt isn't an option since it won't exist on the first invocation and it would not give you any new or changed arguments from the current run anyway. The best I can do is to set those CMake variables I think are potentially going to be used and which may influence the compiler and linker commands of mycomp. This can be worked around by adding more variables as I discover I need them, but that's not ideal.
  • When re-using the same binary directory, we are relying on CMake not starting to write any of its files to the binary directory until the generate stage (well, at least until after the build at (c) completes). For the generators tested, it appears we are okay, but I don't know if all generators on all platforms follow this behaviour too (and I can't test every single combination to find out!). This is the part that gives me the greatest concern. If anyone can confirm with reasoning and/or evidence that this is safe for all generators and platforms, that would be valuable (and worth an upvote if you want to address this as a separate answer).
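The "add variables as needed" workaround from the second point can be kept manageable by listing the forwarded variables in one place. This is a minimal sketch: the names forwarded_vars and nested_args are hypothetical, and the choice of variables is only an example. It does not handle variables whose values are lists (semicolons would need escaping), and when cross compiling the toolchain-related variables should not be forwarded to a host build.

```cmake
# Variables that may influence how mycomp is compiled and linked.
# Extend this list as new ones turn out to matter.
set(forwarded_vars
    CMAKE_BUILD_TYPE
    CMAKE_MAKE_PROGRAM
    CMAKE_CXX_FLAGS
    CMAKE_EXE_LINKER_FLAGS)

# Build the -D arguments for the nested CMake run.
set(nested_args -DEARLY_BUILD=ON)
foreach(var IN LISTS forwarded_vars)
    if(DEFINED ${var})
        # Quoting keeps values containing spaces as a single argument.
        list(APPEND nested_args "-D${var}=${${var}}")
    endif()
endforeach()

# Replacement for step (b), assuming workdir is set as in step (a).
execute_process(COMMAND ${CMAKE_COMMAND} -G "${CMAKE_GENERATOR}"
                        ${nested_args}
                        "${CMAKE_SOURCE_DIR}"
                WORKING_DIRECTORY "${workdir}")
```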

UPDATE: After using the above strategy on a number of real world projects with staff of varying levels of familiarity with CMake, some observations can be made.

  • Having the nested build re-use the same build directory as the main build can occasionally lead to problems. Specifically, if a user kills the CMake run after the nested build completes but before the main build does, the CMakeCache.txt file is left with EARLY_BUILD set to ON. This then makes all subsequent CMake runs act like a nested build, so the main build is essentially lost until the CMakeCache.txt file is manually removed. It is possible that an error somewhere in one of the project's CMakeLists.txt files may also lead to a similar situation (unconfirmed). Performing the nested build off to the side in its own separate build directory has worked very well though, with no such problems.

  • The nested build should probably be Release rather than Debug. If not re-using the same build directory as the main build (now what I'd recommend), we no longer care about trying to avoid compiling the same file twice, so may as well make mycomp as fast as possible.

  • Use ccache so that any costs due to rebuilding some files twice with different settings are minimised. Actually, we found using ccache typically makes the nested build very quick since it rarely changes compared to the main build.

  • The nested build probably needs to have CMAKE_BUILD_WITH_INSTALL_RPATH set to FALSE on some platforms so that any libraries mycomp needs can be found without having to set environment variables, etc.
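Taken together, the points in this update suggest a variant of steps (a) to (c) along the following lines. Treat it as a sketch under assumptions rather than a tested recipe: the host subdirectory name is reused from step (a), nested_args and CCACHE_PROGRAM are hypothetical names, and CMAKE_CXX_COMPILER_LAUNCHER requires CMake 3.4 or later and is only honoured by the Makefile and Ninja generators.

```cmake
# Always build the tool off to the side so that EARLY_BUILD never ends
# up in the main build's CMakeCache.txt.
set(workdir "${CMAKE_BINARY_DIR}/host")
file(MAKE_DIRECTORY "${workdir}")

# Release build of the tool, with the build-tree RPATH kept so that
# mycomp can find its libraries without extra environment setup.
set(nested_args
    -DEARLY_BUILD=ON
    -DCMAKE_BUILD_TYPE=Release
    -DCMAKE_BUILD_WITH_INSTALL_RPATH=FALSE)

# Use ccache if it is installed; otherwise build without it.
find_program(CCACHE_PROGRAM ccache)
if(CCACHE_PROGRAM)
    list(APPEND nested_args "-DCMAKE_CXX_COMPILER_LAUNCHER=${CCACHE_PROGRAM}")
endif()

execute_process(COMMAND ${CMAKE_COMMAND} -G "${CMAKE_GENERATOR}"
                        ${nested_args}
                        "${CMAKE_SOURCE_DIR}"
                WORKING_DIRECTORY "${workdir}")

# --config selects Release for multi-config generators and is ignored
# by single-config ones, where CMAKE_BUILD_TYPE above applies instead.
execute_process(COMMAND ${CMAKE_COMMAND} --build . --target mycomp --config Release
                WORKING_DIRECTORY "${workdir}")
```

With this variant, the hardcoded executable location in step (e) would need to point at the Release output directory rather than Debug for multi-config generators.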

Answered Sep 21 '22 at 14:09 by Craig Scott