Setting up nvidia-docker

  1. Install Docker
    apt install docker.io
  2. Install nvidia-docker
    > Ubuntu 14.04/16.04/18.04, Debian Jessie/Stretch
    > Ubuntu installs docker.io by default, which isn't the latest version of Docker Engine. This implies that you will need to pin the version of nvidia-docker. See the nvidia-docker documentation for more information.
# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers

docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration

sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image

docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi

Preparing the base system

Here we use the official Ubuntu-based images published by NVIDIA. These images generally ship with matching versions of CUDA and cuDNN preinstalled, which saves us the time of installing CUDA and cuDNN ourselves.
Run the following command:
sudo docker pull nvidia/cuda:10.0-cudnn7-devel-ubuntu18.04
This image is based on Ubuntu 18.04, with CUDA 10.0 and cuDNN 7.6.3.
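The image tag itself encodes these versions in the form <cuda>-cudnn<major>-<flavor>-<os>. As a quick sanity check of that naming convention, a minimal shell sketch that splits a tag into its components (pure string handling, no Docker needed; the tag string is just the example above):

```shell
#!/bin/sh
# Split an nvidia/cuda image tag of the form
#   <cuda-version>-cudnn<major>-<flavor>-<os>
# into its components using POSIX parameter expansion.
tag="10.0-cudnn7-devel-ubuntu18.04"

cuda_version=${tag%%-*}     # strip everything after the first "-"  -> "10.0"
rest=${tag#*-}              # drop up to the first "-"              -> "cudnn7-devel-ubuntu18.04"
cudnn_version=${rest%%-*}   #                                        -> "cudnn7"
rest=${rest#*-}             #                                        -> "devel-ubuntu18.04"
flavor=${rest%%-*}          #                                        -> "devel"
os=${rest#*-}               #                                        -> "ubuntu18.04"

echo "CUDA: $cuda_version, cuDNN: $cudnn_version, flavor: $flavor, OS: $os"
```

The devel flavor matters here: unlike the runtime images, it includes the CUDA headers and nvcc needed to compile Caffe inside the container.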

Getting Caffe

There are many versions of Caffe in circulation: the official release (1.0) from Berkeley, as well as numerous forks that add many extra layers. Here we take the official version as an example and document the process of compiling and installing Caffe.

Getting Caffe from GitHub

  1. Clone the Caffe source code from GitHub (use either SSH or HTTPS):
    git clone git@github.com:BVLC/caffe.git
    git clone https://github.com/BVLC/caffe.git
  2. Install Caffe's dependencies by running the following in bash:
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev

sudo apt-get install libhdf5-serial-dev protobuf-compiler

sudo apt-get install --no-install-recommends libboost-all-dev

sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev

sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
  3. Enter the cloned caffe directory and copy Makefile.config.example to Makefile.config, then adapt it to your system environment. Taking the following environment as an example:
    • Ubuntu 18.04 LTS
    • RTX 2080 + NVIDIA driver 430.40
    • CUDA 10.0 + cuDNN 7.6.3
    • Python 3.7
      A corresponding Makefile.config example is given below:
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
# This code is taken from https://github.com/sh1r0/caffe-android-lib
# USE_HDF5 := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#   You should not set this flag if you will be reading LMDBs with any
#   possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr
# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, remove the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, remove the *_60 and *_61 lines for compatibility.
# For CUDA >= 9.0, remove the *_20 and *_21 lines for compatibility.
# (Don't just comment them out: a "#" followed by a trailing "\" makes
# Make treat the continued lines as comment too, emptying CUDA_ARCH.)
# compute_75 targets the RTX 2080 (Turing) natively.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_52,code=sm_52 \
        -gencode arch=compute_60,code=sm_60 \
        -gencode arch=compute_61,code=sm_61 \
        -gencode arch=compute_70,code=sm_70 \
        -gencode arch=compute_75,code=sm_75 \
        -gencode arch=compute_75,code=compute_75

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
# BLAS := atlas
BLAS := open
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# PYTHON_INCLUDE := /usr/include/python2.7 \
#       /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda3
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
#          $(ANACONDA_HOME)/include/python3.7m \
#         $(ANACONDA_HOME)/lib/python3.7/site-packages/numpy/core/include

# Uncomment to use Python 3 (default is Python 2)
PYTHON_LIBRARIES := boost_python3 python3.7m
# Version numbers must match the Python above (3.7 here);
# adjust the numpy path to where numpy is actually installed.
PYTHON_INCLUDE := /usr/include/python3.7m \
                 /usr/lib/python3.7/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
# ($(ANACONDA_HOME)/lib only works if ANACONDA_HOME is set above.)
PYTHON_LIB := /usr/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# NCCL acceleration switch (uncomment to build with NCCL)
# https://github.com/NVIDIA/nccl (last tested version: v1.2.3-1+cuda8.0)
# Enable this for multi-GPU training.
USE_NCCL := 1

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @
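The hand edits above can also be scripted. A minimal sketch, assuming the sed patterns match the stock BVLC Makefile.config.example (the function name is ours, and you should extend the sed calls for the Python and NCCL settings you need):

```shell
#!/bin/sh
# Sketch: script the basic Makefile.config edits instead of doing them by hand.
# Assumes the stock BVLC Makefile.config.example; patterns may need adjusting
# for forks. Call with the path to the caffe source tree.
patch_caffe_config() {
    dir=$1
    cp "$dir/Makefile.config.example" "$dir/Makefile.config"
    # Enable cuDNN and OpenCV 3, and switch BLAS to OpenBLAS.
    sed -i 's/^# *USE_CUDNN := 1/USE_CUDNN := 1/'           "$dir/Makefile.config"
    sed -i 's/^# *OPENCV_VERSION := 3/OPENCV_VERSION := 3/' "$dir/Makefile.config"
    sed -i 's/^BLAS := atlas$/BLAS := open/'                "$dir/Makefile.config"
}

# Usage (from the directory containing the caffe checkout):
#   patch_caffe_config ./caffe
```

Scripting the edits keeps the configuration reproducible, which is convenient when rebuilding the Docker image from scratch.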
  4. In addition, in the Makefile (note: not Makefile.config), replace this line
    NVCCFLAGS += -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
    with
    NVCCFLAGS += -D_FORCE_INLINES -ccbin=$(CXX) -Xcompiler -fPIC $(COMMON_FLAGS)
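This replacement is a single sed call; a sketch, assuming the line reads exactly as in the stock BVLC Makefile (the function name is ours):

```shell
#!/bin/sh
# Insert -D_FORCE_INLINES into the NVCCFLAGS line of a caffe Makefile.
# Assumes the line starts with "NVCCFLAGS += -ccbin" as in the stock Makefile.
add_force_inlines() {
    sed -i 's/^NVCCFLAGS += -ccbin/NVCCFLAGS += -D_FORCE_INLINES -ccbin/' "$1"
}

# Usage (from the caffe source tree):
#   add_force_inlines Makefile
```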
  5. Finally, back in the terminal, enter the <Caffe-DIR>/python directory and run the following to install Caffe's Python dependencies:
    sudo pip3 install -r ./require*
  6. Return to the caffe directory and start the build:
cd ..
make clean
make all -j8 # -j8 uses 8 CPU threads to speed up the build; bare -j uses all threads (not recommended)
make test -j8
make runtest -j8
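Rather than hard-coding -j8, you can match the job count to the machine. A small sketch:

```shell
#!/bin/sh
# Pick the parallel job count from the number of available CPUs
# instead of hard-coding -j8.
jobs=$(nproc)
echo "building with $jobs parallel jobs"

# Then run the build steps with that count, e.g.:
# make all -j"$jobs" && make test -j"$jobs" && make runtest -j"$jobs"
```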
  7. After the build completes, the libraries and binaries are under <Caffe-DIR>/build/: the binary tools are in the tools subdirectory and the Python library is in the python subdirectory. Add the binaries to your PATH environment variable and the Python library path to PYTHONPATH. Then start an interactive Python session; if import caffe raises no error, the installation succeeded.
    > Tip: never build in the caffe directory with the cmake tool. Once cmake runs its generate step, it overwrites the original Makefile, and the build can no longer proceed.
    > Fix:
    > git checkout . -f
    > This forcibly restores the original files; then redo step 6.
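The environment setup from step 7 might look like this in your shell profile (CAFFE_ROOT here is a placeholder; point it at your actual checkout):

```shell
#!/bin/sh
# Expose the freshly built Caffe tools and Python bindings.
# CAFFE_ROOT is a placeholder path, not a variable Caffe itself defines.
CAFFE_ROOT="$HOME/caffe"

export PATH="$CAFFE_ROOT/build/tools:$PATH"
export PYTHONPATH="$CAFFE_ROOT/python:$PYTHONPATH"

# Smoke test: should print the Caffe version without errors.
# python3 -c 'import caffe; print(caffe.__version__)'
```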

Getting a Caffe image from nvcr.io

nvcr.io is NVIDIA's own machine-learning container registry, maintained by NVIDIA. The Caffe in this image is still version 0.13, which is rather dated, but this approach requires no environment setup of your own. If that suits you, pull the image directly:
sudo docker pull nvcr.io/nvidia/caffe:19.09-py

Last modified: October 2, 2019
