BUG: symbolic layer triggers device creation #25946

@ppwwyyxx

Description

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: n/a
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): b'v1.13.0-rc2-0-gc865ec5621' 1.13.0-rc2
  • Python version: 3.7
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: 10.0 / 7.4.2
  • GPU model and memory: GTX 960M

Describe the current behavior
The following code:

import tensorflow as tf
# Graph construction only: no tf.Session is created anywhere in this script.
a = tf.placeholder(tf.float32, [100, 100, 100, 100])
b = tf.layers.Conv2DTranspose(3, 3, data_format='channels_first')
output = b.apply(a)

prints:

2019-02-20 10:20:05.505595: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-02-20 10:20:05.578782: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-02-20 10:20:05.579477: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55fd579f65d0 executing computations on platform CUDA. Devices:
2019-02-20 10:20:05.579513: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): GeForce GTX 960M, Compute Capability 5.0
2019-02-20 10:20:05.606095: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2592000000 Hz                                
2019-02-20 10:20:05.606746: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x55fd57b39b00 executing computations on platform Host. Devices:
2019-02-20 10:20:05.606785: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>               
2019-02-20 10:20:05.607093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:                              
name: GeForce GTX 960M major: 5 minor: 0 memoryClockRate(GHz): 1.0975
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 1.92GiB
2019-02-20 10:20:05.607118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0                                
2019-02-20 10:20:05.608205: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-02-20 10:20:05.608229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0                                                        
2019-02-20 10:20:05.608240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N                                                       
2019-02-20 10:20:05.608504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1742 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, compute capability: 5.0)        

The log shows that the GPU devices are initialized, even though no session was created. This should not happen when calling purely symbolic functions.
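To check for this behavior automatically, one option is to scan captured log output for the "Created TensorFlow device" message shown above. The following is a minimal sketch; the helper name `created_devices` and the regular expression are illustrative assumptions based on the log format in this report, and that format may differ in other TensorFlow versions.

```python
import re

# Matches the gpu_device.cc message that reports a created device,
# as seen in the last line of the log in this report.
CREATED_DEVICE = re.compile(
    r"gpu_device\.cc:\d+\] Created TensorFlow device \((\S+) with (\d+) MB memory\)"
)


def created_devices(log_text):
    """Return (device_name, memory_mb) pairs for each created-device log line."""
    return [(name, int(mb)) for name, mb in CREATED_DEVICE.findall(log_text)]


# The final log line from the report, used here as sample input.
sample = (
    "2019-02-20 10:20:05.608504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] "
    "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1742 MB memory) "
    "-> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, "
    "compute capability: 5.0)"
)

print(created_devices(sample))
```

A non-empty result for a script that only builds a graph would indicate the problem described here. Capturing the log itself (for example, from the stderr of a subprocess running the script) is left out of this sketch.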

Initializing the GPU devices has many side effects.
It can lead to various kinds of failures, such as #8136 (comment). The largest side effect is that any GPU-related flags given to a tf.Session created after device initialization will not take effect.
It also makes Horovod much harder to use, because Horovod requires initializing the GPU in a specific way (with visible_device_list). If a graph containing Conv2DTranspose is created before the session (which is the standard way of using TF 1.x), Horovod will fail to initialize the session. (cc @alsrgv)
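For context, the kind of session configuration that is affected might look like the sketch below. The function names `visible_device_list_for` and `make_session_config` are hypothetical helpers introduced for illustration. `tf.ConfigProto` and `gpu_options.visible_device_list` are TF 1.x APIs, and TensorFlow is imported lazily so the formatting helper can be used without it.

```python
def visible_device_list_for(local_rank):
    """Format a process's local rank as the string used for visible_device_list."""
    if local_rank < 0:
        raise ValueError("local_rank must be non-negative")
    return str(local_rank)


def make_session_config(local_rank):
    """Build a TF 1.x session config restricted to one GPU (requires TensorFlow 1.x)."""
    import tensorflow as tf  # lazy import: only needed when a config is actually built

    config = tf.ConfigProto()
    # Restrict this process to the GPU whose index equals its local rank.
    config.gpu_options.visible_device_list = visible_device_list_for(local_rank)
    return config
```

With Horovod, this would typically be used as `tf.Session(config=make_session_config(hvd.local_rank()))`. According to this report, on the affected release the setting is applied too late if a Conv2DTranspose layer was already applied while building the graph, because the GPU devices have already been created by then.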

This bug exists for Conv2DTranspose, but not for Conv2D.
This bug exists in 1.13.0rc0. It does not exist in 1.12.0.

Metadata

Labels

  • TF 1.13 (Issues related to TF 1.13)
  • comp:keras (Keras related issues)
  • stale (This label marks the issue/pr stale - to be closed automatically if no activity)
  • stat:awaiting response (Status - Awaiting response from author)
  • type:bug (Bug)
