[TensorFlow] 2. Building TensorFlow C++

Summary: Building TensorFlow C++ is grueling and takes real determination. I got a successful build half a month ago, but only now got around to writing it up.


System environment:

  • OS: Ubuntu 14.04.6 LTS x64 (trusty)
  • RAM: 64 GB
  • GPU: NVIDIA GTX TITAN x 4
  • CUDA Toolkit: 10.0
  • cuDNN: 7.4.2

Versions installed:

  • bazel: 0.21.0
  • TensorFlow: 1.13.1

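1 Installing bazel

Luckily bazel 0.21.0 supports Ubuntu 14.04; otherwise I would have had to smash the computer, not that I could afford to replace it!

Official steps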

  1. Install the dependencies: pkg-config, zip, g++, zlib1g-dev, unzip, and python

    $ sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python
  2. Download the installer script from the bazel releases page

    $ cd /tmp
    $ wget https://github.com/bazelbuild/bazel/releases/download/0.21.0/bazel-0.21.0-installer-linux-x86_64.sh
  3. Run the script:

    $ chmod +x bazel-0.21.0-installer-linux-x86_64.sh
    $ ./bazel-0.21.0-installer-linux-x86_64.sh --user

    With --user, bazel is installed under $HOME/bin and the .bazelrc configuration file is placed under $HOME.

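  4. Set the environment variable

    If you used --user:

    $ echo '# bazel path
    export PATH="$PATH:$HOME/bin"' >> ~/.bashrc
    $ source ~/.bashrc

    The official tutorial only sets the variable for the current session, but I prefer to make it permanent in one go.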

  5. Verify

    $ bazel version
    WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
    INFO: Invocation ID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    Build label: 0.21.0
    ..................................................
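
Step 4 appends to ~/.bashrc unconditionally, so running it a second time leaves a duplicate entry. A minimal sketch of a guard follows. The helper name append_line_once is made up for illustration and is not part of bazel or its official instructions.

```shell
# Hypothetical helper (not from the bazel docs): append a line to a file
# only if that exact line is not already present.
append_line_once() {
    local line="$1" file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (commented out so that running this sketch does not edit your files):
# append_line_once 'export PATH="$PATH:$HOME/bin"' "$HOME/.bashrc"
```

The single quotes in the usage line keep $PATH and $HOME unexpanded, so they are evaluated each time ~/.bashrc is sourced rather than frozen at the moment the line is written.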

2 Environment initialization

  1. Clone TensorFlow

    $ mkdir -p ~/Programs
    $ cd ~/Programs
    $ git clone -b r1.13 https://github.com/tensorflow/tensorflow.git TensorFlow/
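  2. Run the environment initialization script, which downloads third-party dependencies (including the eigen and absl directories used in section 4) into tensorflow/contrib/makefile/downloads

    $ cd ~/Programs/TensorFlow/tensorflow/contrib/makefile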
    $ chmod +x build_all_linux.sh
    $ ./build_all_linux.sh

3 Building TensorFlow

  1. Configure the build

    $ cd ~/Programs/TensorFlow
    $ ./configure
    WARNING: Duplicate rc file: /home/ttt/Programs/tensorflow-r1.13-4/.tf_configure.bazelrc is read multiple times, most recently imported from /home/ttt/.bazelrc
    WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
    INFO: Invocation ID: a16de493-8f9e-46f8-96bc-0903f0e36c3b
    You have bazel 0.21.0 installed.
    Please specify the location of python. [Default is /usr/bin/python]: /home/ttt/miniconda3/bin/python

    Found possible Python library paths:
    /home/ttt/miniconda3/lib/python3.6/site-packages
    Please input the desired Python library path to use. Default is [/home/ttt/miniconda3/lib/python3.6/site-packages]

    Do you wish to build TensorFlow with XLA JIT support? [Y/n]: y
    XLA JIT support will be enabled for TensorFlow.

    Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
    No OpenCL SYCL support will be enabled for TensorFlow.

    Do you wish to build TensorFlow with ROCm support? [y/N]: n
    No ROCm support will be enabled for TensorFlow.

    Do you wish to build TensorFlow with CUDA support? [y/N]: y
    CUDA support will be enabled for TensorFlow.

    Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10.0]:


    Please specify the location where CUDA 10.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:


    Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]:


    Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:


    Do you wish to build TensorFlow with TensorRT support? [y/N]: n
    No TensorRT support will be enabled for TensorFlow.

    Please specify the locally installed NCCL version you want to use. [Default is to use https://github.com/nvidia/nccl]:


    Please specify a list of comma-separated Cuda compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1,6.1,6.1,6.1]:


    Do you want to use clang as CUDA compiler? [y/N]: n
    nvcc will be used as CUDA compiler.

    Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:


    Do you wish to build TensorFlow with MPI support? [y/N]: n
    No MPI support will be enabled for TensorFlow.

    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:


    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
    Not configuring the WORKSPACE for Android builds.

    Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl # Build with MKL support.
    --config=monolithic # Config for mostly static monolithic build.
    --config=gdr # Build with GDR support.
    --config=verbs # Build with libverbs support.
    --config=ngraph # Build with Intel nGraph support.
    --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
    Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws # Disable AWS S3 filesystem support.
    --config=nogcp # Disable GCP support.
    --config=nohdfs # Disable HDFS support.
    --config=noignite # Disable Apacha Ignite support.
    --config=nokafka # Disable Apache Kafka support.
    --config=nonccl # Disable NVIDIA NCCL support.
    Configuration finished
  2. Build TensorFlow C++

    • CPU version

      $ bazel build --config=opt //tensorflow:libtensorflow_cc.so
    • GPU version

      $ bazel build --config=opt --config=cuda //tensorflow:libtensorflow_cc.so
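
The two build commands above differ only in the --config=cuda flag, so a script can pick between them. The following is a rough sketch, not a recommended workflow: the function name tf_build_cmd is hypothetical, and using the presence of nvcc on PATH as a proxy for a CUDA installation is an assumption (a toolkit under /usr/local/cuda may not be on PATH). The sketch only prints the command, it does not run bazel.

```shell
# Hypothetical wrapper: print the bazel command from this section
# for the requested variant (cpu or gpu).
tf_build_cmd() {
    local target="//tensorflow:libtensorflow_cc.so"
    case "$1" in
        cpu) echo "bazel build --config=opt $target" ;;
        gpu) echo "bazel build --config=opt --config=cuda $target" ;;
        *)   echo "usage: tf_build_cmd cpu|gpu" >&2; return 1 ;;
    esac
}

# Assumption for illustration: treat "nvcc is on PATH" as "CUDA is installed".
variant=cpu
if command -v nvcc >/dev/null 2>&1; then
    variant=gpu
fi
tf_build_cmd "$variant"
```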

4 Testing the environment

  1. Create the directories and files

    $ mkdir -p ~/Projects/demo
    $ cd ~/Projects/demo
    $ mkdir build src
    $ touch CMakeLists.txt src/main.cpp
  2. Write the test code main.cpp

    #include <tensorflow/core/platform/env.h>
    #include <tensorflow/core/public/session.h>
    #include <iostream>

    using namespace std;
    using namespace tensorflow;

    int main()
    {
        // Create a session with default options; this fails if the
        // TensorFlow libraries cannot be linked or loaded correctly.
        Session* session;
        Status status = NewSession(SessionOptions(), &session);
        if (!status.ok()) {
            cout << status.ToString() << "\n";
            return 1;
        }
        cout << "Session successfully created.\n";
        delete session;  // release the session before exiting
        return 0;
    }
  3. Write CMakeLists.txt

    cmake_minimum_required(VERSION 3.15)
    project(demo)

    set(CMAKE_CXX_STANDARD 11)
    set(PROGRAMS_DIR /home/ttt/Programs)
    set(TENSORFLOW_DIR ${PROGRAMS_DIR}/TensorFlow)

    # TensorFlow sources, generated headers (*.pb.h), and third-party headers
    include_directories(${TENSORFLOW_DIR})
    include_directories(${TENSORFLOW_DIR}/bazel-genfiles)
    include_directories(${TENSORFLOW_DIR}/tensorflow/contrib/makefile/downloads/absl)
    include_directories(${TENSORFLOW_DIR}/tensorflow/contrib/makefile/downloads/eigen)

    # Directory containing libtensorflow_cc.so and libtensorflow_framework.so
    link_directories(${TENSORFLOW_DIR}/bazel-bin/tensorflow)

    add_executable(demo src/main.cpp)
    target_link_libraries(demo tensorflow_cc tensorflow_framework)
  4. Directory layout

    ├── build
    ├── CMakeLists.txt
    └── src
        └── main.cpp
  5. Build and run the test program

    $ cd ~/Projects/demo/build
    $ cmake ..
    $ make
    $ ./demo

    Output:

    ..................................................
    Session successfully created.
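
Most of the failures in the next section come down to a file missing under the TensorFlow checkout. Before running cmake, it may help to check the paths that the CMakeLists.txt above relies on. This is a sketch under assumptions: the function name check_tf_tree is hypothetical, and the file names are inferred from the include and link directories used in this post, so they may differ for other TensorFlow versions.

```shell
# Hypothetical pre-flight check: report which of the files expected by the
# CMakeLists.txt above are missing under the given TensorFlow checkout.
check_tf_tree() {
    local tf_dir="$1" missing=0 rel
    for rel in \
        bazel-bin/tensorflow/libtensorflow_cc.so \
        bazel-bin/tensorflow/libtensorflow_framework.so \
        bazel-genfiles/tensorflow/core/framework/types.pb.h \
        tensorflow/contrib/makefile/downloads/absl/absl/strings/string_view.h \
        tensorflow/contrib/makefile/downloads/eigen/unsupported/Eigen/CXX11/Tensor
    do
        if [ ! -e "$tf_dir/$rel" ]; then
            echo "missing: $tf_dir/$rel"
            missing=1
        fi
    done
    # Exit status 0 when everything was found, 1 otherwise.
    return "$missing"
}

# Usage (commented out; adjust the path to your checkout):
# check_tf_tree "$HOME/Programs/TensorFlow" && cmake ..
```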

5 Troubleshooting

  1. bazel version problem

    ERROR:/home/ttt/.cache/bazel/_bazel_ttt/12fb0bc5892f7f0d9058b186b377c0bf/external/local_config_cc/BUILD:57:1: in cc_toolchain rule @local_config_cc//:cc-compiler-k8: Error while selecting cc_toolchain: Toolchain identifier 'local' was not found, valid identifiers are [local_linux, local_darwin, local_windows]

    Fix: upgrade bazel from 0.19.1 to 0.21.0

  2. eigen version problem

    /usr/local/tensorflow/include/third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1:42:
    fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"

    Fix: update eigen to 3.3 and add it to the include directories in CMakeLists.txt

  3. abseil-cpp not found

    /usr/local/tensorflow/include/tensorflow/core/lib/core/stringpiece.h:29:38: fatal error: absl/strings/string_view.h: No such file or directory
    #include "absl/strings/string_view.h"

    Fix: download abseil-cpp and add it to the include directories in CMakeLists.txt

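  4. CMakeLists.txt include path problem

    /home/ttt/Programs/tensorflow-r1.13-3/tensorflow/core/framework/tensor_shape.h:22:48: fatal error: tensorflow/core/framework/types.pb.h: No such file or directory
    #include "tensorflow/core/framework/types.pb.h"

    Fix: include_directories(${TENSORFLOW_DIR}/bazel-genfiles). types.pb.h is generated during the bazel build, so it lives under bazel-genfiles rather than in the source tree.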
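
For the first problem in this list, the installed bazel version can be checked in a script before starting a long build. A minimal sketch, assuming the "Build label:" line shown in the verification step of section 1. The function name check_bazel_label is hypothetical, and 0.21.0 is simply the version that worked in this post, not an official requirement.

```shell
# Hypothetical check: read `bazel version` output on stdin and compare its
# "Build label" line against the expected version (default: 0.21.0).
check_bazel_label() {
    local expected="${1:-0.21.0}" label
    label="$(sed -n 's/^Build label: //p' | head -n 1)"
    if [ "$label" = "$expected" ]; then
        echo "bazel $label: matches"
    else
        echo "bazel ${label:-unknown}: expected $expected"
        return 1
    fi
}

# Usage (commented out because it needs bazel installed):
# bazel version 2>/dev/null | check_bazel_label 0.21.0
```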


6 Series

  1. [TensorFlow] 1. Installing CUDA and cuDNN
  2. [TensorFlow] 2. Building TensorFlow C++

That's all!


https://alexinst.github.io/TensorFlow/build-tensorflow-cpp/

Author: Alex

Published: 2019-11-05

Updated: 2021-06-19
