Explanation of saved files
In this reading, we'll take a closer look at the files saved by the ModelCheckpoint callback, when saving weights only.
Previously, you experimented with the ModelCheckpoint callback, which can be used to save model weights during training. You looked at the saved files using the ! ls command. The saved files were the following:
-rw-r--r-- 1 aph416 staff 87B 2 Nov 17:04 checkpoint
-rw-r--r-- 1 aph416 staff 2.0K 2 Nov 17:04 checkpoint.index
-rw-r--r-- 1 aph416 staff 174K 2 Nov 17:04 checkpoint.data-00000-of-00001
So, what is each of these files?
checkpoint
This file is by far the smallest, at only 87 bytes. It's so small that we can just look at it directly. It's a human-readable file with the following text:
model_checkpoint_path: "checkpoint"
all_model_checkpoint_paths: "checkpoint"
This is metadata that indicates where the actual model data is stored.
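To make the role of this metadata concrete, here is a minimal sketch that reads the two lines shown above and recovers the checkpoint prefix. It uses only the Python standard library; the helper name parse_checkpoint_state is hypothetical and is not part of TensorFlow. In practice, TensorFlow's own tf.train.latest_checkpoint(directory) reads this file for you and returns the prefix.

```python
import re

# Contents of the `checkpoint` file, copied from the text above.
CHECKPOINT_TEXT = '''model_checkpoint_path: "checkpoint"
all_model_checkpoint_paths: "checkpoint"
'''


def parse_checkpoint_state(text):
    """Map each key to the list of quoted values found for it.

    Hypothetical helper for illustration only; it handles just the
    simple `key: "value"` lines shown in this reading.
    """
    state = {}
    for line in text.splitlines():
        match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
        if match:
            key, value = match.groups()
            state.setdefault(key, []).append(value)
    return state


state = parse_checkpoint_state(CHECKPOINT_TEXT)

# The prefix that the .index and .data-* files share.
latest_prefix = state["model_checkpoint_path"][0]
print(latest_prefix)  # checkpoint
```

The value is a file name prefix, not a complete file name: the index and data files are found by appending .index and .data-00000-of-00001 to it.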
checkpoint.index
This file tells TensorFlow which weights are stored where. When running models on distributed systems, there may be different shards, meaning the full model may have to be recomposed from multiple sources. In the last notebook, you created a single model on a single machine, so there is only one shard and all weights are stored in the same place.
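The number of shards is visible in the name of the data file from the listing: the suffix data-00000-of-00001 reads as "shard 0 of 1". The sketch below pulls those numbers out of the file name with the standard library. The function parse_shard_name is a hypothetical helper, and the five-digit pattern is an assumption based on the file name shown in this reading. To see which weights the index refers to, TensorFlow provides tf.train.list_variables(prefix), which returns the stored variable names and shapes.

```python
import re


def parse_shard_name(filename):
    """Split '<prefix>.data-XXXXX-of-YYYYY' into (prefix, index, count).

    Hypothetical helper; the pattern is inferred from the single
    file name shown in this reading.
    """
    match = re.fullmatch(r"(.+)\.data-(\d{5})-of-(\d{5})", filename)
    if match is None:
        raise ValueError(f"not a checkpoint data shard: {filename!r}")
    prefix, index, count = match.groups()
    return prefix, int(index), int(count)


prefix, index, count = parse_shard_name("checkpoint.data-00000-of-00001")
print(prefix, index, count)  # checkpoint 0 1
```

Here the count is 1, which matches the single-machine setup described above: every weight lives in the one data file.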
checkpoint.data-00000-of-00001
This file contains the actual weights from the model. It is by far the largest of the three files. Recall that the model you trained had around 14000 parameters, meaning this file takes up roughly 12 bytes per saved weight.
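The bytes-per-weight figure can be checked with a quick calculation. Both inputs are approximate (the listing rounds the file size, and the parameter count is "around 14000"), so the result is only an estimate. Whether "K" in the listing means 1024 or 1000 bytes is an assumption, so the sketch computes both.

```python
# Values taken from the listing and the text above; both are rounded.
data_file_k = 174            # reported size of checkpoint.data-00000-of-00001
approx_parameters = 14_000   # "around 14000 parameters"

# Assumption 1: "K" means 1024 bytes.
bytes_per_weight_binary = data_file_k * 1024 / approx_parameters

# Assumption 2: "K" means 1000 bytes.
bytes_per_weight_decimal = data_file_k * 1000 / approx_parameters

print(f"{bytes_per_weight_binary:.1f}")   # 12.7
print(f"{bytes_per_weight_decimal:.1f}")  # 12.4
```

Under either reading of the unit, the estimate lands between 12 and 13 bytes per parameter.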