System crashes midway through ANN training

Midway through training an ANN, the script either stops abruptly or causes a BSoD. Here is the code:

import tensorflow as tf
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense,Flatten
from keras.optimizers import adam
from keras.activations import relu,softmax

dataset = keras.datasets.fashion_mnist
(train_images,train_labels),(test_images,test_labels) = dataset.load_data()

train_images = train_images / 255.0
test_images = test_images / 255.0
class_names = [
"T-shirt/top",
"Trouser",
"Pullover",
"Dress",
"Coat",
"Sandal",
"Shirt",
"Sneaker",
"Bag",
"Ankle boot"
]

model = Sequential()
model.add(Flatten(input_shape=(28,28)))
model.add(Dense(128,activation="relu"))
model.add(Dense(10,activation="softmax"))

model.compile(optimizer="adam",loss="sparse_categorical_crossentropy",metrics=["accuracy"])

model.fit(train_images,train_labels,epochs=20,batch_size=12)

test_loss, test_acc = model.evaluate(test_images,test_labels)

print(f"test accuracy: {test_acc}")
print(f"test loss: {test_loss}")

Here is the full output when it doesn't cause a BSoD:

C:\Users\User\AppData\Local\Programs\Python\Python37\python.exe C:/Users/User/PycharmProjects/AI/tutorial.py
2020-01-26 17:53:22.098836: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-01-26 17:53:22.099178: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Using TensorFlow backend.
2020-01-26 17:53:27.826205: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-01-26 17:53:27.826375: E tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
2020-01-26 17:53:27.828951: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: DESKTOP-507H6IM
2020-01-26 17:53:27.829139: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: DESKTOP-507H6IM
2020-01-26 17:53:27.831554: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Epoch 1/20

   12/60000 [..............................] - ETA: 13:55 - loss: 2.5218 - accuracy: 0.1667
  600/60000 [..............................] - ETA: 21s - loss: 1.3435 - accuracy: 0.5400  
 1200/60000 [..............................] - ETA: 13s - loss: 1.1029 - accuracy: 0.6200
 1812/60000 [..............................] - ETA: 10s - loss: 0.9730 - accuracy: 0.6634
 2424/60000 [>.............................] - ETA: 8s - loss: 0.9054 - accuracy: 0.6852 
 3048/60000 [>.............................] - ETA: 7s - loss: 0.8507 - accuracy: 0.7008
 3636/60000 [>.............................] - ETA: 7s - loss: 0.8226 - accuracy: 0.7109
 4200/60000 [=>............................] - ETA: 6s - loss: 0.7984 - accuracy: 0.7176
 4836/60000 [=>............................] - ETA: 6s - loss: 0.7760 - accuracy: 0.7235
 5460/60000 [=>............................] - ETA: 6s - loss: 0.7497 - accuracy: 0.7313
 6060/60000 [==>...........................] - ETA: 5s - loss: 0.7253 - accuracy: 0.7421
 6636/60000 [==>...........................] - ETA: 6s - loss: 0.7191 - accuracy: 0.7455
 7056/60000 [==>...........................] - ETA: 5:52 - loss: 0.7101 - accuracy: 0.7476
 7440/60000 [==>...........................] - ETA: 5:31 - loss: 0.6963 - accuracy: 0.7519


Process finished with exit code -1073741819 (0xC0000005)
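
For reference, the two numbers in that last line are the same 32-bit value written as a signed decimal and as unsigned hex. On Windows, 0xC0000005 is the access violation status code, which means the process touched memory it was not allowed to. A minimal sketch of the conversion (the helper name `to_ntstatus` is made up for illustration):

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as an unsigned value."""
    return exit_code & 0xFFFFFFFF


code = to_ntstatus(-1073741819)
print(hex(code))  # 0xc0000005
```

This only decodes the number. It does not say which component caused the violation.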

Could it have anything to do with the TensorFlow build? I installed it using pip. If I should build my own TensorFlow using Bazel, please tell me how.

I am using PyCharm 2019.3

Windows 10

Python 3.7

TensorFlow 2.0.0

CPU: AMD Ryzen 5 3600

python
tensorflow
pycharm
asked on Stack Overflow Jan 26, 2020 by Mahdi_J

1 Answer

I have faced a similar issue before.

Your system is perfectly capable of handling this workload. (Have you overclocked your CPU? That can also cause instability.)
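
One way to gather more evidence before changing anything is Python's standard `faulthandler` module, which dumps the Python traceback when the interpreter hits a fatal error such as an access violation. The sketch below is only a diagnostic aid, not a fix. The helper name `enable_crash_traceback` and the log file name are made up for illustration.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump the tracebacks of all threads to log_path on a fatal error."""
    # The file must stay open for as long as the handler is enabled.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_crash_traceback("crash.log")` at the top of the script, before `model.fit`, should leave a traceback behind when the process dies. If the crash lands in a different place on every run, that would be more consistent with unstable hardware than with a broken install, though it is not proof either way.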

Uninstall TensorFlow and Python, and perhaps reset PyCharm (export your preferences first), then reinstall the packages. This worked for me, and it usually solves the problem.
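
After reinstalling, it may be worth confirming what actually ended up in the environment. The sketch below lists the installed versions of a few packages. The function name `installed_versions` and the package list are only illustrative, and on Python 3.7 it assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is installed.
    import importlib_metadata as metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    found = installed_versions(["tensorflow", "keras", "numpy"])
    for name, version in found.items():
        print(f"{name}: {version or 'not installed'}")
```

The "Using TensorFlow backend." line in the question's log is printed by the standalone `keras` package, so that script loads both `keras` and `tensorflow.keras`. Checking which of the two is installed, and importing from only one of them, may be worth trying alongside the reinstall.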

But I had a friend for whom this didn't work. We had to reset his Windows to resolve the issue.

Hope this helps.

answered on Stack Overflow Jan 26, 2020 by am-3

User contributions licensed under CC BY-SA 3.0