I've been struggling with managing multiple Keras models with tf.Graphs and tf.Sessions for several weeks now. In short, I'd like to have multiple models open at the same time and switch between them as needed. This includes training new models, loading them from file and making predictions.
The bottom line is: (almost) everything works fine until the program crashes with exit code 0xC0000005. No error messages are given. Let me explain.
This is how I currently manage the graphs and sessions: a context manager sets the created graph and session as the defaults and restores the previous state afterwards.
import gc
from contextlib import contextmanager

import tensorflow as tf
from keras import backend as k
from keras.layers import Dense, Dropout
from keras.models import Sequential, model_from_json

drop = 0.5  # dropout rate; a module-level constant in my code (value here is just an example)


class NeuralNetwork:
    def __init__(self):
        # Each network gets its own graph and session so that several
        # models can be open at the same time.
        self.graph = tf.Graph()
        self.session = tf.Session(graph=self.graph)
        self.model = None

    def close(self):
        self.session.close()
        del self.graph
        self.graph = None
        gc.collect()

    @contextmanager
    def _context(self):
        # Make this network's graph and session the Keras defaults for the
        # duration of the block, then switch back to the previous session.
        prev = k.get_session()
        k.set_session(self.session)
        with self.graph.as_default(), self.session.as_default():
            yield
        k.set_session(prev)

    def predict(self, x):
        with self._context():
            return self.model.predict(x)

    def fit(self, x_train, y_train, n=20, batch=256):
        with self._context():
            self.model.fit(x_train, y_train, epochs=n, batch_size=batch, verbose=0)

    def create(self, shape):
        with self._context():
            self.model = Sequential()
            self.model.add(Dense(shape[1], input_dim=shape[0], activation='relu'))
            self.model.add(Dropout(drop))
            self.model.add(Dense(shape[2], activation='sigmoid'))
            self.model.compile(loss='binary_crossentropy', optimizer='rmsprop')

    def load(self, path, sfx=''):
        with open(path / ('architecture' + sfx + '.json'), 'r') as f:
            js = f.read()
        with self._context():
            self.model = model_from_json(js)
            self.model.load_weights(path / ('weights' + sfx + '.h5'))
            self.model.compile(loss='binary_crossentropy', optimizer='rmsprop')

    def save(self, path, sfx=''):
        path.mkdir(exist_ok=True)
        with self._context():
            js = self.model.to_json()
            with open(path / ('architecture' + sfx + '.json'), 'w') as f:
                f.write(js)
            self.model.save_weights(path / ('weights' + sfx + '.h5'))
And with the above class, here's how a network is used elsewhere:
def create(self):
    x, y = [], []  # training data omitted in this snippet
    shape = (15, 30, 1)
    self.predictor = NeuralNetwork()
    self.predictor.create(shape)
    self.predictor.fit(x, y)
    self.predictor.save(path=self.path)
    self.predictor.close()

def load(self):
    self.predictor.load(path=self.path)

def predict(self, x):
    # Executed only on loaded networks, never on created networks,
    # due to program structure.
    return self.predictor.predict(x)
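For completeness, here is a minimal sketch of the kind of multi-model juggling I'm ultimately after, using the NeuralNetwork class above. The data, shapes and paths below are placeholders I made up for illustration.

import numpy as np
from pathlib import Path

shape = (15, 30, 1)
x = np.random.rand(10, shape[0])                  # 10 dummy samples
y = np.random.randint(0, 2, size=(10, shape[2]))  # dummy binary targets

net_a = NeuralNetwork()
net_a.create(shape)
net_a.fit(x, y, n=1)
net_a.save(path=Path('model_a'))

net_b = NeuralNetwork()
net_b.create(shape)
net_b.fit(x, y, n=1)

# Both networks stay open; every call switches to that network's own
# graph/session via the context manager and switches back afterwards.
print(net_a.predict(x))
print(net_b.predict(x))

net_a.close()
net_b.close()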
I have tried to articulate this problem before. To the best of my abilities, and with the help of some people, I've come up with a way to manage these resources (the context manager above, plus "closing" the network after training), but I have not come across documentation or a tutorial that describes TensorFlow or Keras resource management in any detail.
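The closest thing to a documented cleanup hook I'm aware of is keras.backend.clear_session(), which resets the global default graph and session. That only really fits a one-model-at-a-time workflow rather than the multi-model setup above, but for comparison, the pattern looks roughly like this:

from keras import backend as K
from keras.layers import Dense
from keras.models import Sequential

def train_score_and_cleanup(x, y):
    # Build and use a single model, then reset Keras' global state.
    model = Sequential()
    model.add(Dense(1, input_dim=x.shape[1], activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
    model.fit(x, y, epochs=1, verbose=0)
    preds = model.predict(x)
    K.clear_session()  # destroys the current graph/session and every model built in it
    return preds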
My goals are two-fold. If you can help me achieve either one, or even take a small step in that direction, I'd greatly appreciate it! In my experience these struggles are neither unique nor something others haven't already run into, so I must simply be lacking the proper approach.
The issue was resolved by updating all packages to their latest versions. Sadly, I did the upgrade in one go, so I can't say which package actually caused the crash, but I'm willing to bet on TensorFlow.
Here are the package versions most likely involved in producing the error and their updated versions:
tensorflow==1.8.0 -> 1.12.0
numpy==1.14.5 -> 1.15.4
scikit-learn==0.19.1 -> 0.20.0
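To confirm which versions a given environment actually picks up (and that the upgrade took effect in the interpreter that was crashing), a quick check is:

import tensorflow as tf
import numpy as np
import sklearn

# Print the versions actually imported by this interpreter.
print('tensorflow   :', tf.__version__)
print('numpy        :', np.__version__)
print('scikit-learn :', sklearn.__version__)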