I am new with TensorFlow.
I just installed TensorFlow and to test the installation, I tried the following code and as soon as I initiate the TF Session, I am getting the Segmentation fault (core dumped) error.
bafhf@remote-server:~$ python
Python 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
/home/bafhf/anaconda3/envs/ismll/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
>>> tf.Session()
2018-05-15 12:04:15.461361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1349] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:04:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
Segmentation fault (core dumped)
My nvidia-smi is:
Tue May 15 12:12:26 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.30 Driver Version: 390.30 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 On | 00000000:04:00.0 Off | 0 |
| N/A 38C P8 26W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K80 On | 00000000:05:00.0 Off | 2 |
| N/A 31C P8 29W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
And nvcc --version is:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
Also gcc --version is:
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Following is my PATH:
/home/bafhf/bin:/home/bafhf/.local/bin:/usr/local/cuda/bin:/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib:/home/bafhf/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
and the LD_LIBRARY_PATH:
/usr/local/cuda/bin:/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
I am running this on a server and I don't have root privileges. Still I managed to install everything as per the instructions on the official website.
Edit: New observations:
Seems like the GPU is allocating memory for the process for a second and then the core segmentation dumped error is thrown:
Edit2: Changed tensorflow version
I downgraded my tensorflow version from v1.8 to v1.5. The issue still remains.
Is there any way address or debug this issue?
Core Dump/Segmentation fault is a specific kind of error caused by accessing memory that “does not belong to you.” When a piece of code tries to do read and write operation in a read only location in memory or freed block of memory, it is known as core dump. It is an error indicating memory corruption.
What Is Segmentation Fault? In a nutshell, segmentation fault refers to errors due to a process's attempts to access memory regions that it shouldn't. When the kernel detects odd memory access behaviors, it terminates the process issuing a segmentation violation signal (SIGSEGV).
A segmentation fault (aka segfault) is a common condition that causes programs to crash; they are often associated with a file named core . Segfaults are caused by a program trying to read or write an illegal memory location.
I was also facing the same issue. I have a workaround for the same you can try that.
I followed the following steps: 1. Reinstall the python 3.5 or above 2. Reinstall the Cuda and Add the Cudnn libraries to it. 3. Reinstall Tensorflow 1.8.0 GPU version.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With