How do I interpret the TensorFlow output for building and executing computational graphs on GPGPUs?
Given the following command that executes an arbitrary TensorFlow script using the Python API:
python3 tensorflow_test.py > out
The first part, stream_executor, seems like it's loading dependencies.
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
What is a NUMA node?
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I assume this is where it finds the available GPU.
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: Tesla K40c
major: 3 minor: 5 memoryClockRate (GHz) 0.745
pciBusID 0000:01:00.0
Total memory: 11.25GiB
Free memory: 11.15GiB
Some GPU initialization? What is DMA?
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:01:00.0)
Why does it log an error (the E prefix)?
E tensorflow/stream_executor/cuda/cuda_driver.cc:932] failed to allocate 11.15G (11976531968 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Great answer to what the pool_allocator
does: https://stackoverflow.com/a/35166985/4233809
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 3160 get requests, put_count=2958 evicted_count=1000 eviction_rate=0.338066 and unsatisfied allocation rate=0.412025
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 100 to 110
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1743 get requests, put_count=1970 evicted_count=1000 eviction_rate=0.507614 and unsatisfied allocation rate=0.456684
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 256 to 281
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 1986 get requests, put_count=2519 evicted_count=1000 eviction_rate=0.396983 and unsatisfied allocation rate=0.264854
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 655 to 720
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:244] PoolAllocator: After 28728 get requests, put_count=28680 evicted_count=1000 eviction_rate=0.0348675 and unsatisfied allocation rate=0.0418407
I tensorflow/core/common_runtime/gpu/pool_allocator.cc:256] Raising pool_size_limit_ from 1694 to 1863
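As a sanity check on how those numbers relate: the eviction rate is evicted_count / put_count (for the first line, 1000 / 2958 ≈ 0.338, matching the printed 0.338066), and the unsatisfied allocation rate appears to be measured against the get requests.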
About NUMA -- https://software.intel.com/en-us/articles/optimizing-applications-for-numa
Roughly speaking, if you have a dual-socket CPU, each socket has its own memory and has to access the other processor's memory through a slower QPI link. So each CPU+memory pair is a NUMA node.
Potentially you could treat two different NUMA nodes as two different devices and structure your network to optimize for different within-node/between-node bandwidth
However, I don't think there's enough wiring in TF to do this right now. The detection doesn't work either -- I just tried on a machine with 2 NUMA nodes, and it still printed the same message and initialized to 1 NUMA node.
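If you want to see where that -1 comes from, here is a minimal sketch that reads the same sysfs attribute TensorFlow consults (it assumes the pci bus id 0000:01:00.0 from the device log above; the path will differ on your machine):
# Read the NUMA node sysfs attribute for the GPU's PCI device.
# On single-node or non-NUMA kernels this typically contains -1,
# which is the value TensorFlow clamps to node zero in the log message above.
with open("/sys/bus/pci/devices/0000:01:00.0/numa_node") as f:
    print(f.read().strip())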
DMA = Direct Memory Access. You could potentially copy things from one GPU to another GPU without involving the CPU (e.g., over NVLink). NVLink integration isn't there yet.
As for the error: TensorFlow tries to allocate close to the GPU's maximum memory, so it sounds like some of your GPU memory has already been allocated to something else and the allocation failed.
You can do something like the following to avoid allocating so much memory:
import tensorflow as tf

config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.per_process_gpu_memory_fraction = 0.3  # don't hog all vRAM
config.operation_timeout_in_ms = 15000                    # terminate on long hangs
sess = tf.InteractiveSession("", config=config)
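Alternatively, if you'd rather let TensorFlow grow its allocation on demand than reserve a fixed fraction up front, a sketch using the same 1.x ConfigProto options:
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # start small and grow GPU memory usage as needed
sess = tf.Session(config=config)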
successfully opened CUDA library xxx locally
means that the library was loaded, but it does not mean that it will be used.
successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
means that your kernel does not have NUMA support. You can read about NUMA here and here.
Found device 0 with properties:
means you have 1 GPU which you can use. It lists the properties of this GPU.
failed to allocate 11.15G
the error clearly explains why this happened, but it is hard to tell why you need so much memory without looking at the code.
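To see what is already holding GPU memory before the script starts, running nvidia-smi (shipped with the NVIDIA driver, not with TensorFlow) lists per-process GPU memory usage as well as total and free memory on the device.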