Python crashes when loading Bert model from pretrained


I am running into a rather obscure problem. As soon as I create my model, my GPU usage spikes and then Python crashes. This only happens when I load one of the models with 'from_pretrained'; I have had no issues with either TensorFlow or PyTorch on their own, so this behavior seems specific to transformers.

For example, the problem arises when running this line of code, right at the beginning of my script:

from transformers import TFBertForSequenceClassification
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased')

I get the following messages, which are pretty standard, but as you can see at the bottom, the process simply stops.

2021-04-16 16:16:35.330093: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-16 16:16:38.495667: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-04-16 16:16:38.519178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2021-04-16 16:16:38.519500: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-04-16 16:16:38.528695: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-04-16 16:16:38.528923: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-04-16 16:16:38.533582: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-04-16 16:16:38.535368: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-04-16 16:16:38.540093: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-04-16 16:16:38.543728: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-04-16 16:16:38.544662: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-04-16 16:16:38.544888: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0
2021-04-16 16:16:38.545436: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-04-16 16:16:38.546588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1060 computeCapability: 6.1
coreClock: 1.6705GHz coreCount: 10 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 178.99GiB/s
2021-04-16 16:16:38.547283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0
2021-04-16 16:16:39.115250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1300] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-16 16:16:39.115490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0
2021-04-16 16:16:39.115592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1319] 0: N
2021-04-16 16:16:39.115856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1446] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4634 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2021-04-16 16:16:39.419407: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-04-16 16:16:39.709427: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll

Process finished with exit code -1073741819 (0xC0000005)
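
The exit code 0xC0000005 is the Windows access-violation status, which means the process died in native code and not from a Python exception. One way to check whether the GPU path is involved is to repeat the same call with the GPU hidden. The following is only a diagnostic sketch, not a fix: the helper names `hide_gpus` and `run_cpu_only_check` are hypothetical, and it assumes that setting `CUDA_VISIBLE_DEVICES` to `-1` before TensorFlow is imported is enough to keep it off the GPU.

```python
import faulthandler
import os


def hide_gpus():
    """Hide all CUDA devices from this process; call before importing TensorFlow."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def run_cpu_only_check(model_name="bert-base-uncased"):
    """Hypothetical helper: retry the same from_pretrained call with the GPU hidden."""
    # Dump the Python traceback to stderr if the process hits a fatal error.
    faulthandler.enable()
    hide_gpus()
    # Imported only after the environment variable is set, so TensorFlow sees no GPU.
    from transformers import TFBertForSequenceClassification

    model = TFBertForSequenceClassification.from_pretrained(model_name)
    print("Loaded on CPU:", type(model).__name__)
    return model
```

Calling `run_cpu_only_check()` in a fresh interpreter would show whether the model loads on CPU. If it does, the crash is more likely tied to the CUDA/cuDNN setup than to transformers itself; if it still crashes, the traceback printed by `faulthandler` should at least point to the Python line that was executing.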

Has anyone else seen this? Is there something I am missing here? Thank you for your help.

Here are the details of my system.

transformers version: Latest
Platform: Windows
Python version: 3.7
PyTorch version (GPU?): Latest
Tensorflow version (GPU?): Latest
Using GPU in script?: Yes, GeForce GTX 1060 computeCapability: 6.1
Using distributed or parallel set-up in script?: No
Models I encountered this error on: albert, bert, xlm
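
Since "Latest" does not pin down which releases are installed, the exact versions could be collected with something like the sketch below. The function names `collect_versions` and `environment_report` are hypothetical, and the sketch assumes each package exposes a `__version__` attribute; it falls back to "unknown" or "not installed" otherwise.

```python
import importlib
import platform
import sys


def collect_versions(packages=("transformers", "tensorflow", "torch")):
    """Return a dict mapping each package name to its version string."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Not every module defines __version__, so fall back to a placeholder.
            versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


def environment_report(packages=("transformers", "tensorflow", "torch")):
    """Build a plain-text report of the interpreter, OS, and package versions."""
    lines = [
        "Python: " + sys.version.split()[0],
        "Platform: " + platform.platform(),
    ]
    for name, version in collect_versions(packages).items():
        lines.append(name + ": " + version)
    return "\n".join(lines)
```

Running `print(environment_report())` in the same environment that crashes would give concrete version numbers to compare against the CUDA 11 and cuDNN 8 libraries shown in the log.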

UPDATE: I have further narrowed the issue down to all of the TF[ModelName] pretrained models.

python
tensorflow
bert-language-model
huggingface-transformers
asked on Stack Overflow Apr 16, 2021 by Claudio Mazzoni • edited Apr 17, 2021 by Claudio Mazzoni

0 Answers

Nobody has answered this question yet.


User contributions licensed under CC BY-SA 3.0