TensorFlow 1.14 performance issue on RTX 3090

Tags: tensorflow, nvidia, stylegan

I am running a model written with TensorFlow 1.x on 4x RTX 3090, and training takes much longer to start than on 1x RTX 3090. Once training starts, though, it finishes sooner on 4x than on 1x. I am using CUDA 11.1 and TensorFlow 1.14 in both GPU configurations.

Secondly, with 1x RTX 2080 Ti, CUDA 10.2, and TensorFlow 1.14, training takes less time to start than on 1x RTX 3090 with CUDA 11.1 and TensorFlow 1.14. Roughly, for one of the datasets, start-up takes 5 minutes on 1x RTX 2080 Ti, 30-35 minutes on 1x RTX 3090, and 1.5 hours on 4x RTX 3090.

I'd be grateful if anyone can help me resolve this issue.

I am using Ubuntu 16.04, a Core™ i9-10980XE CPU, and 32 GB of RAM on both the 2080 Ti and 3090 machines.

EDIT: I found out that TF takes a long time to start up on Ampere-architecture GPUs, according to this, but I'm still unsure whether that is the cause here; and if it is, does any solution exist for it?

asked Oct 21 '20 11:10

Thunder

2 Answers

TensorFlow 1.x does not have binaries for CUDA 11.1, so kernels have to be compiled at start-up. On the RTX 3090 this compilation goes through PTX and the JIT compiler, which takes a long time.
A general solution is to increase the JIT cache size with "export CUDA_CACHE_MAXSIZE=2147483648" (here 2147483648 is the cache size in bytes; you can set it to any number, taking into account your space limit and what other processes use). Refer to https://www.tensorflow.org/install/gpu for clarification. With this set, start-up time will be short on subsequent runs. But even then, the binaries produced at start-up will not be compatible with CUDA 11.1.
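For a training script launched without a shell profile, the same setting can be applied from Python. This is only a minimal sketch: the helper name `configure_cuda_jit_cache` and its parameters are made up for illustration, and it assumes the variables are set before CUDA is initialized, which is safest to do before `import tensorflow`. `CUDA_CACHE_MAXSIZE`, `CUDA_CACHE_DISABLE`, and `CUDA_CACHE_PATH` are the CUDA JIT cache environment variables.

```python
import os


def configure_cuda_jit_cache(max_bytes=2 * 1024**3, cache_dir=None):
    """Set CUDA JIT cache environment variables for the current process.

    Hypothetical helper. It has to run before CUDA is initialized, so call
    it before importing TensorFlow.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")

    # Size limit of the JIT cache in bytes (2 * 1024**3 == 2147483648).
    os.environ["CUDA_CACHE_MAXSIZE"] = str(max_bytes)
    # Make sure an inherited setting has not switched caching off.
    os.environ["CUDA_CACHE_DISABLE"] = "0"
    if cache_dir is not None:
        # Optional: keep the cache in a directory of your choice.
        os.environ["CUDA_CACHE_PATH"] = os.path.abspath(cache_dir)

    names = ("CUDA_CACHE_MAXSIZE", "CUDA_CACHE_DISABLE", "CUDA_CACHE_PATH")
    return {name: os.environ[name] for name in names if name in os.environ}


settings = configure_cuda_jit_cache()
print(settings)
# import tensorflow as tf  # import only after the variables are set
```

The first run still pays the full JIT cost; the larger cache only helps later runs, as long as the cache directory persists between them.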

The best option is to migrate the code from TensorFlow 1.x to 2.x (2.4+) so that it runs on the RTX 30xx series, or to try compiling TensorFlow 1.x from source with CUDA 11.1 (I'm not sure about this).

answered Nov 07 '22 04:11

Thunder

As Thunder explained, TensorFlow 1.x is not supported on Nvidia Ampere GPUs, and it looks like it never will be. Ampere streaming multiprocessors (SM_86) are only supported from CUDA 11.1 onward (see https://forums.developer.nvidia.com/t/can-rtx-3080-support-cuda-10-1/155849/2), and TensorFlow 1.x hasn't been fully supported on newer versions of CUDA for a while now, probably for reasons similar to those described in that link. Unfortunately, TensorFlow 1.x is no longer supported or maintained; see https://github.com/tensorflow/tensorflow/issues/43629#issuecomment-700709796

However, if you have to use the StyleGAN2 model, you might have some luck with Nvidia TensorFlow, which apparently supports version 1.15 on Ampere GPUs; see https://developer.nvidia.com/blog/accelerating-tensorflow-on-a100-gpus/
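To make the version constraint concrete, here is a rough sketch of the check involved. The table and the function name `toolkit_targets_gpu` are illustrative only and not part of any library; the table covers just the cards mentioned in this thread plus the A100, using the first CUDA toolkit release that can generate native code for each compute capability.

```python
# First CUDA toolkit release able to target each compute capability.
# Illustrative table, limited to the GPUs discussed here.
MIN_CUDA_FOR_COMPUTE_CAPABILITY = {
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080 Ti
    (8, 0): (11, 0),  # Ampere, e.g. A100
    (8, 6): (11, 1),  # Ampere, e.g. RTX 3090
}


def toolkit_targets_gpu(cuda_version, compute_capability):
    """Return True if this CUDA version can build native code for the GPU."""
    minimum = MIN_CUDA_FOR_COMPUTE_CAPABILITY[compute_capability]
    # Tuples compare element by element, so (11, 1) >= (11, 0) is True.
    return cuda_version >= minimum


for cuda, cc in [((10, 2), (7, 5)), ((10, 2), (8, 6)), ((11, 1), (8, 6))]:
    print("CUDA", cuda, "-> compute capability", cc, ":",
          toolkit_targets_gpu(cuda, cc))
```

A binary built with a toolkit that fails this check cannot contain native SM_86 code, so on an RTX 3090 its kernels have to be JIT-compiled from PTX at start-up.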

answered Nov 07 '22 06:11

Dan Pavlov

