
How to build tensorflow 1.13.1 with custom protobuf?

System Information

  • Have I written custom code: modification of bazel files
  • OS Platform and Distribution: Ubuntu 16.04
  • TensorFlow installed from: Source
  • TensorFlow version: 1.13.1 downloaded from corresponding tag (https://github.com/tensorflow/tensorflow/archive/v1.13.1.tar.gz)
  • Python version: Python 3.5.2
  • Bazel version: 0.21.0
  • Protobuf version: 3.7.0 (built from source and installed in a non-default location in the filesystem)
  • CUDA/cuDNN version: 10.0/7
  • GPU model and memory: GeForce GTX 1060 Ti

Context: by default, TensorFlow builds its own copy of protobuf. However, protobuf is also used by other libraries, so its exported symbols conflict with those provided by TensorFlow. The only good solution I see is to use a single, independent version of protobuf (i.e. built outside TensorFlow) for all libraries, including TensorFlow. So I need to build TensorFlow against an installed version of protobuf that is located somewhere in the filesystem.

Problem: how to use a custom version of protobuf, installed somewhere in the filesystem (not in the default system path), when building TensorFlow 1.13.1. More specifically, my question is: what modifications are needed in TensorFlow's Bazel files to make this possible? I am new to Bazel and really confused about what to do.

Here is what I did:

1) To disable the internal build of protobuf, I added this line to .tf_configure.bazelrc:

build --action_env TF_SYSTEM_LIBS="protobuf_archive"

This works as expected, except that the protobuf installed in my default system path is too old to parse proto3 files. That is not a real problem, since I want to use my custom protobuf, which is version 3.7.0.
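
A quick way to tell whether a given protobuf installation is recent enough is to look at its major version. The sketch below assumes that proto3 parsing needs protobuf 3 or later and that `protoc --version` prints a line of the form `libprotoc X.Y.Z`. The helper name `supports_proto3` and the version strings are hypothetical and used only for illustration.

```shell
# Hypothetical helper: succeeds if a protobuf version string has a major
# version of 3 or later (assumption: that is what proto3 parsing requires).
supports_proto3() {
    major=${1%%.*}          # keep everything before the first dot
    [ "$major" -ge 3 ]
}

# Example version strings (illustrative only).
if supports_proto3 "3.7.0"; then echo "3.7.0: can parse proto3"; fi
if ! supports_proto3 "2.6.1"; then echo "2.6.1: too old for proto3"; fi

# With a real installation, the version could be extracted like this
# (assumes protoc prints "libprotoc X.Y.Z"):
#   supports_proto3 "$(protoc --version | awk '{print $2}')"
```

Running the same check against both the system protoc and the custom one shows which of the two the build should pick up.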

2) To specify where to find protobuf, I changed the workspace.bzl file to use new_local_repository instead of tf_http_archive.

Here @PATH_TO_PROTOBUF@ is the path to the protobuf library installed in my filesystem.

    new_local_repository(
        name =  "protobuf_archive",
        path = "@PATH_TO_PROTOBUF@",
        build_file = clean_dep("//third_party/systemlibs:protobuf.BUILD"),
    )

    new_local_repository(
        name = "com_google_protobuf",
        path = "@PATH_TO_PROTOBUF@",
        system_build_file = clean_dep("//third_party/systemlibs:protobuf.BUILD"),
        system_link_files = {
            "//third_party/systemlibs:protobuf.bzl": "protobuf.bzl",
        },
    )
    new_local_repository(
        name = "com_google_protobuf_cc",
        path = "@PATH_TO_PROTOBUF@",
        system_build_file = clean_dep("//third_party/systemlibs:protobuf.BUILD"),
        system_link_files = {
            "//third_party/systemlibs:protobuf.bzl": "protobuf.bzl",
        },
    )

3) I changed the protobuf.BUILD file located in tensorflow-1.13.1/third_party/systemlibs so that its rules use my binaries:

cc_library(
    name = "protobuf",
    hdrs = HEADERS,
    linkopts = ["@PATH_TO_PROTOBUF@/lib/libprotobuf.so"],
    visibility = ["//visibility:public"],
)

cc_library(
    name = "protobuf_headers",
    hdrs = HEADERS,
    linkopts = ["@PATH_TO_PROTOBUF@/lib/libprotobuf.so"],
    visibility = ["//visibility:public"],
)

cc_library(
    name = "protoc_lib",
    linkopts = ["@PATH_TO_PROTOBUF@/lib/libprotoc.so"],
    visibility = ["//visibility:public"],
)

genrule(
    name = "protoc",
    outs = ["protoc.bin"],
    cmd = "ln -s @PATH_TO_PROTOBUF@/bin/protoc $@",
    executable = 1,
    visibility = ["//visibility:public"],
)

I thought everything would work this way, but when I ran the build I got:

ERROR: .../tensorflow-1.13.1/tensorflow/core/BUILD:2460:1: ProtoCompile tensorflow/core/lib/core/error_codes.pb.cc failed (Exit 127): protoc.bin failed: error executing command 
  (cd /home/robin/.cache/bazel/_bazel_robin/c04a70144cd329180403af87e4cbdc44/execroot/org_tensorflow && \
  exec env - \
    PATH=/bin:/usr/bin \
  bazel-out/host/genfiles/external/protobuf_archive/protoc.bin '--cpp_out=bazel-out/host/genfiles/' -I. -Iexternal/protobuf_archive -Ibazel-out/host/genfiles/external/protobuf_archive tensorflow/core/lib/core/error_codes.proto)
Execution platform: @bazel_tools//platforms:host_platform
[32 / 203] 6 actions, 5 running
    Executing genrule @protobuf_archive//:link_headers [for host]; 0s local
    ProtoCompile .../core/lib/core/error_codes.pb.cc [for host]; 0s local
    Compiling tensorflow/core/platform/default/logging.cc [for host]; 0s local
    ProtoCompile tensorflow/core/example/example.pb.cc [for host]; 0s local
    Executing genrule @local_config_cuda//cuda:cuda-include; 0s local
    [-----] Writing file external/llvm/llvm-tblgen-2.params [for host]
bazel-out/host/genfiles/external/protobuf_archive/protoc.bin: error while loading shared libraries: libprotoc.so.18: cannot open shared object file: No such file or directory

Apparently protoc fails simply because the dynamic loader cannot find libprotoc.
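
This kind of loader failure can be checked outside Bazel with `ldd`, which lists the shared libraries a binary needs and marks unresolved ones as `not found`. The following is a minimal sketch: `missing_libs` is a hypothetical helper, and `/opt/protobuf` is an assumed install prefix standing in for @PATH_TO_PROTOBUF@.

```shell
# Assumed install prefix, standing in for @PATH_TO_PROTOBUF@.
PROTOBUF_PREFIX="${PROTOBUF_PREFIX:-/opt/protobuf}"

# Hypothetical helper: print the shared libraries that cannot be resolved.
#   $1: binary to inspect
#   $2: optional library directory to use as LD_LIBRARY_PATH for the check
missing_libs() {
    env LD_LIBRARY_PATH="${2:-}" ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Without a library path, a line such as "libprotoc.so.18 => not found"
# would be expected for a protoc installed in a non-default location.
missing_libs "$PROTOBUF_PREFIX/bin/protoc"

# With the library directory supplied, the list should be empty.
missing_libs "$PROTOBUF_PREFIX/bin/protoc" "$PROTOBUF_PREFIX/lib"
```

If the second call prints nothing while the first one reports libprotoc, the binary itself is fine and only the search path seen by the build action is at fault.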

4) The solution seemed trivial to me: set LD_LIBRARY_PATH so that libprotoc.so is found automatically. Unfortunately, neither of the following approaches works:

A) Setting LD_LIBRARY_PATH directly:

export LD_LIBRARY_PATH=@PATH_TO_PROTOBUF@/lib
bazel build //tensorflow:tensorflow_cc.so

B) Setting LD_LIBRARY_PATH via .tf_configure.bazelrc:

build --action_env LD_LIBRARY_PATH="@PATH_TO_PROTOBUF@/lib"

The output is exactly the same as before, so my first observation is that LD_LIBRARY_PATH is not passed to the action (as far as I understand). This would explain the failure, because the command

exec env - \
    PATH=/bin:/usr/bin \
bazel-out/host/genfiles/external/protobuf_archive/protoc.bin '--cpp_out=bazel-out/host/genfiles/' 

does not contain an assignment like

LD_LIBRARY_PATH=@PATH_TO_PROTOBUF@/lib/ 
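
The effect of the `exec env - PATH=/bin:/usr/bin ...` wrapper can be reproduced in a plain shell. In this small sketch, `/opt/protobuf/lib` is an assumed path used only for illustration. `env -` starts the child process with an empty environment, so a variable exported in the calling shell never reaches it.

```shell
# Assumed library directory, for illustration only.
export LD_LIBRARY_PATH=/opt/protobuf/lib

# Mimic the wrapper from the failing action: "env -" clears the environment,
# then only PATH is set for the child process.
seen_by_action=$(env - PATH=/bin:/usr/bin sh -c 'echo "${LD_LIBRARY_PATH:-unset}"')

echo "calling shell: $LD_LIBRARY_PATH"
echo "inside env - : $seen_by_action"
```

The second line reports `unset`, which matches the observation that exporting LD_LIBRARY_PATH before calling bazel has no effect on the action.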

I searched for a long time without finding a solution to this problem (I tried many suggestions, but none worked). Maybe it is a limitation of the Bazel version I use? Unfortunately I cannot use a more recent version of Bazel, because TensorFlow 1.13.1 does not allow it.

So now I do not really know what to do. I suppose the solution is to make further modifications to TensorFlow's Bazel project files.

Robin Passama asked Apr 08 '19 17:04



1 Answer

I hope the following answer helps someone:

It turns out that this problem can be solved by modifying TensorFlow's Bazel files. In the file tensorflow.bzl, modify the function tf_generate_proto_text_sources this way:

def tf_generate_proto_text_sources(name, srcs_relative_dir, srcs, protodeps = [], deps = [], visibility = None):
    out_hdrs = (
        [
            p.replace(".proto", ".pb_text.h")
            for p in srcs
        ] + [p.replace(".proto", ".pb_text-impl.h") for p in srcs]
    )
    out_srcs = [p.replace(".proto", ".pb_text.cc") for p in srcs]
    native.genrule(
        name = name + "_srcs",
        srcs = srcs + protodeps + [clean_dep("//tensorflow/tools/proto_text:placeholder.txt")],
        outs = out_hdrs + out_srcs,
        visibility = visibility,
        cmd =
            "LD_LIBRARY_PATH=@CONFIG_LIBRARY_PATH@ " +
            "$(location //tensorflow/tools/proto_text:gen_proto_text_functions) " +
            "$(@D) " + srcs_relative_dir + " $(SRCS)",
        tools = [
            clean_dep("//tensorflow/tools/proto_text:gen_proto_text_functions"),
        ],
    )

    native.filegroup(
        name = name + "_hdrs",
        srcs = out_hdrs,
        visibility = visibility,
    )

    native.cc_library(
        name = name,
        srcs = out_srcs,
        hdrs = out_hdrs,
        visibility = visibility,
        deps = deps,
    )

Where @CONFIG_LIBRARY_PATH@ is an LD_LIBRARY_PATH value containing the path to the protobuf lib directory.
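
The genrule's cmd string is run by a shell, so the `LD_LIBRARY_PATH=... ` prefix is an ordinary per-command variable assignment. It is visible to the tool launched on that line and to nothing else. The sketch below illustrates that mechanism on its own; `/opt/protobuf/lib` is an assumed value standing in for @CONFIG_LIBRARY_PATH@, and `sh -c` stands in for gen_proto_text_functions.

```shell
# Assumed value standing in for @CONFIG_LIBRARY_PATH@.
PROTOBUF_LIB_DIR=/opt/protobuf/lib

# Start from a shell where the variable is not set at all.
unset LD_LIBRARY_PATH

# A "VAR=value command" prefix sets the variable for that command only.
seen_by_tool=$(LD_LIBRARY_PATH="$PROTOBUF_LIB_DIR" sh -c 'echo "$LD_LIBRARY_PATH"')

# The calling shell itself is left untouched.
seen_after=${LD_LIBRARY_PATH:-unset}

echo "tool sees:    $seen_by_tool"
echo "shell after:  $seen_after"
```

Because the assignment is part of the command string itself, it does not depend on what environment the action is started with.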

Robin Passama answered Oct 20 '22 09:10
