Introduction

I bought a GTX 1070. This is my work log of getting TensorFlow running on the GPU.

Environment

Hardware (my home server, self-built two years ago)

CPU: Intel Xeon E3-1241V3 3.50GHz
Motherboard: ASUSTeK Intel H97 Pro
SSD: Samsung SSD840EVO 250GB
Memory: 8GB * 4

Purchased this time

ASUSTeK NVIDIA GeForce GTX1070
Why the GTX1070? It was the most I could afford on my pocket money.

Software

OS: Ubuntu Server 16.04.1 LTS

~~- NVIDIA driver: cuda-repo-ubuntu1504-7-5-local_7.5-18_amd64.deb~~ → installed with apt-get instead

cuDNN: cudnn-7.5-linux-x64-v5.1.tgz
TensorFlow: 0.11

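Notes

This is the version combination as of 2016/10/15.
CUDA was installed with apt-get.
CUDA 8 was not supported by TensorFlow 0.11.
Getting the versions of the required software to line up took some trial and error.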
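Since matching versions was the fiddly part, it can help to write the working combination down in a form that is easy to compare against. The following is only a sketch: the dictionary keys and the `mismatches` helper are made up for illustration, and "CUDA 7.5" is inferred from the cuDNN archive name rather than stated outright in this post.

```python
# Version combination reported in this post (as of 2016-10-15).
# Assumption: CUDA 7.5 is inferred from the archive name "cudnn-7.5-...".
WORKING_COMBINATION = {
    "ubuntu": "16.04.1",
    "cuda": "7.5",
    "cudnn": "5.1",
    "tensorflow": "0.11",
}


def mismatches(installed, expected=WORKING_COMBINATION):
    """Return {component: (installed, expected)} for every component that differs."""
    return {
        name: (installed.get(name), wanted)
        for name, wanted in expected.items()
        if installed.get(name) != wanted
    }


# Hypothetical example: everything matches except CUDA 8.0.
candidate = dict(WORKING_COMBINATION, cuda="8.0")
print(mismatches(candidate))
```

An empty result means the candidate setup matches the combination that worked here; it says nothing about whether other combinations would also work.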

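Sites that helped

Ubuntu14.04 + GPU + TensorFlow 環境構築 - Qiita (Ubuntu 14.04 + GPU + TensorFlow environment setup)

GPGPUマシンの更新(2) 〜 CUDA 7.5 と cuDNN 5.0RC - まんぼう日記 (Updating my GPGPU machine (2): CUDA 7.5 and cuDNN 5.0RC)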

Installing the GPU

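For the GTX1070, it is enough to check that the motherboard provides PCI Express 3.0
- With a CPU you have to pay attention to the chipset, but at first I did not know what to check for a GPU
- The card has no D-sub output, so I connected it to a TV over HDMI while working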

OS installation

Installed Ubuntu Server 16.04.1 LTS.
Once the OS is installed, the rest can be done over SSH.

Installing the software the GPU needs

First, the basics:

sudo apt-get install gcc make

NVIDIA driver

~~- Download~~
~~- NVIDIA DRIVERS Linux x64 (AMD64/EM64T) Display Driver~~
~~- Copy it to the server with scp~~
~~sudo chmod 755 NVIDIA-Linux-x86_64-367.57.run~~
~~sudo ./NVIDIA-Linux-x86_64-367.57.run~~

[Addendum]
- To run under Docker, it is better to install the driver with apt-get

$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
$ apt-cache search 'nvidia-[0-9]+'

nvidia-352 - Transitional package for nvidia-361
mate-sensors-applet-nvidia-dbg - Display readings from hardware sensors in your MATE panel (NVIDIA, dbg package)
nvidia-304 - NVIDIA legacy binary driver - version 304.132
nvidia-304-updates - Transitional package for nvidia-304
nvidia-340 - NVIDIA binary driver - version 340.98
nvidia-355 - NVIDIA binary driver - version 355.11
nvidia-358 - NVIDIA binary driver - version 358.16
nvidia-361 - NVIDIA binary driver - version 361.45.18
nvidia-364 - NVIDIA binary driver - version 364.19
nvidia-367 - NVIDIA binary driver - version 367.44
nvidia-370 - NVIDIA binary driver - version 370.28

$ sudo apt-get install nvidia-370

Then reboot.
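The driver package above was picked by eye from the apt-cache listing. As a sketch of the same selection done mechanically, the snippet below parses that listing and returns the highest-numbered real driver package, skipping transitional and unrelated packages. The `newest_driver` helper is hypothetical, and "highest number" is only a heuristic, not a recommendation for every card.

```python
import re

# Output lines copied from the apt-cache search above.
APT_CACHE_OUTPUT = """\
nvidia-352 - Transitional package for nvidia-361
mate-sensors-applet-nvidia-dbg - Display readings from hardware sensors in your MATE panel (NVIDIA, dbg package)
nvidia-304 - NVIDIA legacy binary driver - version 304.132
nvidia-304-updates - Transitional package for nvidia-304
nvidia-340 - NVIDIA binary driver - version 340.98
nvidia-355 - NVIDIA binary driver - version 355.11
nvidia-358 - NVIDIA binary driver - version 358.16
nvidia-361 - NVIDIA binary driver - version 361.45.18
nvidia-364 - NVIDIA binary driver - version 364.19
nvidia-367 - NVIDIA binary driver - version 367.44
nvidia-370 - NVIDIA binary driver - version 370.28
"""

# Matches real driver packages such as
# "nvidia-370 - NVIDIA binary driver - version 370.28".
DRIVER_LINE = re.compile(
    r"^nvidia-(\d+) - NVIDIA (?:legacy )?binary driver - version (\S+)$"
)


def newest_driver(apt_output):
    """Return (package_name, version) for the highest-numbered driver package."""
    found = []
    for line in apt_output.splitlines():
        match = DRIVER_LINE.match(line.strip())
        if match:
            found.append((int(match.group(1)), match.group(2)))
    if not found:
        return None
    series, version = max(found)
    return "nvidia-%d" % series, version


print(newest_driver(APT_CACHE_OUTPUT))
```

For the listing in this post the result is the nvidia-370 package, the same one installed above.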

CUDA

sudo apt install nvidia-cuda-toolkit

I could not get the installer downloaded from the NVIDIA site to install, so I gave up on it.

cuDNN

User registration on the NVIDIA site is required
- After registering, the download is available immediately (as of 2016/10/15)
- You have to answer a survey
cudnn-7.5-linux-x64-v5.1.tgz

tar xvzf cudnn-7.5-linux-x64-v5.1.tgz
cd cuda 
sudo cp lib64/* /usr/lib/x86_64-linux-gnu
sudo cp include/cudnn.h /usr/include
sudo chmod a+r /usr/lib/x86_64-linux-gnu/libcudnn*
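One way to double-check which cuDNN version actually ended up in /usr/include is to read the version macros out of the copied header. This is a minimal sketch, assuming the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as cuDNN 5.x headers do; the `cudnn_version` helper and the sample patch level are made up for illustration.

```python
import re


def cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if absent."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+%s\s+(\d+)" % macro, header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Hypothetical header excerpt; the patch level is invented for this example.
sample = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 3
"""
print(cudnn_version(sample))

# On the machine itself one would read the copied header instead:
# with open("/usr/include/cudnn.h") as f:
#     print(cudnn_version(f.read()))
```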

That completes the environment setup.

Trying out TensorFlow

Installation

Followed the instructions on the official site
- Download and Setup

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0rc0-cp27-none-linux_x86_64.whl
sudo apt-get install python-pip python-dev
sudo pip install --upgrade $TF_BINARY_URL
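The wheel in TF_BINARY_URL is built for one specific Python version, which is encoded in the file name (cp27 means CPython 2.7). As a sketch, the tag can be pulled out of the URL and compared with the running interpreter before installing. Both helper functions are hypothetical, and `interpreter_tag` assumes CPython.

```python
import sys


def wheel_python_tag(wheel_url):
    """Return the Python tag (e.g. 'cp27') from a wheel file name.

    Wheel names follow name-version-pythontag-abitag-platform.whl.
    """
    filename = wheel_url.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %s" % filename)
    fields = filename[:-len(".whl")].split("-")
    return fields[-3]


def interpreter_tag():
    """Tag of the running interpreter, e.g. 'cp27' (assumes CPython)."""
    return "cp%d%d" % sys.version_info[:2]


url = ("https://storage.googleapis.com/tensorflow/linux/gpu/"
       "tensorflow-0.11.0rc0-cp27-none-linux_x86_64.whl")
print("wheel tag: %s, interpreter tag: %s" % (wheel_python_tag(url), interpreter_tag()))
```

If the two tags differ, pip will refuse the wheel, so this is the kind of mismatch worth catching early.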

Running the sample code

git clone the repository somewhere convenient and cd into it

GitHub - tensorflow/tensorflow: Computation using data flow graphs for scalable machine learning

Python 2.7.12 (default, Jul  1 2016, 15:12:24) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties: 
name: GeForce GTX 1070
major: 6 minor: 1 memoryClockRate (GHz) 1.835
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.84GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
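The two things to look for in a log like the one above are the "successfully opened CUDA library" lines and the "Creating TensorFlow device" line. A small sketch that pulls both out of saved log text follows; the helper names are made up, and the patterns are written against the log format shown in this post, which may differ in other TensorFlow versions.

```python
import re

# A few lines copied from the session log above.
LOG = """\
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0)
"""


def opened_libraries(log_text):
    """Names of the CUDA libraries the log reports as opened."""
    return re.findall(r"successfully opened CUDA library (\S+) locally", log_text)


def gpu_devices(log_text):
    """(device_path, device_name) pairs for each GPU device the log reports."""
    return re.findall(
        r"Creating TensorFlow device \((/gpu:\d+)\) -> \(device: \d+, name: ([^,]+),",
        log_text,
    )


print(opened_libraries(LOG))
print(gpu_devices(LOG))
```

If libcudnn.so is missing from the first list, the cuDNN copy step is the first place to look.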

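Whoa!

Running the tutorial

Deep MNIST for Experts

The tutorial says the code at the end of the page takes about 30 minutes to run. In this environment, using the GPU, it finished in about 2-3 minutes.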

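# Excerpt from the end of the Deep MNIST for Experts tutorial (TensorFlow 0.11 API).
# x, y_, y_conv, keep_prob, mnist and sess (a tf.InteractiveSession) are defined earlier in the tutorial.
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_conv, y_))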
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.initialize_all_variables())
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
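To put the timing in perspective, the rough speedup can be worked out from the two figures mentioned: about 30 minutes quoted by the tutorial versus about 2-3 minutes here. This is only back-of-the-envelope arithmetic; the 30-minute figure is the tutorial's estimate on unspecified hardware, and the `speedup` helper is illustrative.

```python
def speedup(baseline_minutes, measured_minutes):
    """How many times faster the measured run is than the baseline."""
    return baseline_minutes / float(measured_minutes)


baseline = 30.0  # minutes, the tutorial's estimate
for measured in (2.0, 3.0):  # minutes, the range observed in this post
    print("%.0f min -> %.0fx faster" % (measured, speedup(baseline, measured)))
```

So the GPU run comes out somewhere around 10x to 15x faster than the tutorial's estimate.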

$ nvidia-smi
Sat Oct 15 21:59:52 2016       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 0000:01:00.0     Off |                  N/A |
| 38%   60C    P2   114W / 166W |   7797MiB /  8112MiB |     83%      Default |
+-------------------------------+----------------------+----------------------+
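The interesting numbers in the table above are power draw, memory use and GPU utilization. Below is a sketch of extracting them from the status row with a regular expression; the `parse_status_row` helper and its dictionary keys are made up, and the pattern is written against the layout of this particular nvidia-smi version.

```python
import re

# Status row copied from the nvidia-smi output above.
ROW = "| 38%   60C    P2   114W / 166W |   7797MiB /  8112MiB |     83%      Default |"


def parse_status_row(row):
    """Pull power, memory and utilization figures out of one nvidia-smi row."""
    match = re.search(
        r"(\d+)W\s*/\s*(\d+)W\s*\|\s*(\d+)MiB\s*/\s*(\d+)MiB\s*\|\s*(\d+)%", row
    )
    if match is None:
        return None
    power, power_cap, mem_used, mem_total, util = map(int, match.groups())
    return {
        "power_w": power,
        "power_cap_w": power_cap,
        "memory_used_mib": mem_used,
        "memory_total_mib": mem_total,
        "memory_fraction": mem_used / float(mem_total),
        "gpu_util_percent": util,
    }


status = parse_status_row(ROW)
print("memory in use: %.1f%%" % (100 * status["memory_fraction"]))
print("GPU utilization: %d%%" % status["gpu_util_percent"])
```

Note that TensorFlow by default reserves most of the GPU memory up front, so the high memory figure does not by itself mean the MNIST model needs that much.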

Looking good!