Today we’re going to look at how to take snapshots of your FANN Artificial Neural Network (ANN) while it’s training.

But why?

Well, maybe you want to fork competing GANs to create progressively more believable deepfakes because you want some of that sweet, sweet deepfake money… um… I mean, build a best-selling Writer Bot… 😉 😛

Perhaps you want to compare how different processes, configurations or algorithms affect your ANN during training rather than waiting till the end.

Or… maybe you have a pug and a toddler who conspire to take turns crawling under your desk and pushing the power switch on the surge protector ~60 hours into a long and complex training process, and you’d rather not lose days of work… again!

OK… not a third time smarty pants! 😛

Getting Started

Before we proceed I’d like to mention that the ideas in this post build on what we covered in the Getting Started Series but you don’t strictly need to have read them to follow along.

  • Getting Started 1 – How to install FANN and set up your test environment.
  • Getting Started 2 – Understanding Neural Networks and Training & Testing a Neural Network to perform XOR.
  • Getting Started 3 – Training & Testing AND, OR, NOT, NAND, NOR, XNOR Neural Networks.

If you read through Getting Started (and it’s recommended that you do) then you should generally be familiar with the FANN function fann_train_on_file(), which lets us train an ANN on a .data file until its Mean Squared Error (MSE) matches or is less than the desired error we specify for the ANN (or until it runs out of the maximum number of epochs we allow).

That’s great for simple neural networks and proven training setups like OCR and Pathfinder, and of course you can even use it to do Machine Learning from a Database.

BUT… it can be frustrating when you are experimenting because it is an automated process.

Once you hand over your training data, FANN does its absolute best to learn what you gave it.

Provided the configuration you specified is appropriate for the problem, your ANN will learn to model your data.

So what’s the problem?

The problem is people suck at picking good settings on the first try! 😛

Using fann_train_on_file() on complex training sets can leave you waiting hours or even days to see how well your ANN does.

All the while your mind is left to ponder questions like:

  • Was that the right number of Hidden Neurons?
  • Should I have used more Hidden Layers?
  • Was One Hot really the right encoding method to go with?
  • Should I have used a different Activation Function?
  • Should I have used a different Training Algorithm?
  • Um… Imagine dozens of other deeply insightful and angst filled questions here… 😛

So what’s the solution?

What we need is a way to check how well the neural network is learning early on.

This means we have to use a different training function since fann_train_on_file() is completely automated.

FANN Training Functions

In this case we’re interested in fann_train_epoch(), since it lets us train for exactly one epoch (one epoch is where all of the training data is considered exactly once) and then returns the MSE as a float representing how much “error” the network currently has when considering the entire training data.

The goal of training the ANN is to reduce the MSE.

To continue training we simply need to wrap the fann_train_epoch() function in a loop that checks the MSE against a desired error.

Once the MSE is equal to or less than the desired error we can stop training.

The cool thing is we can now do other things besides check the MSE after each epoch… like log the MSE or take snapshots for example! 🙂
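
Here is a minimal sketch of that loop on its own, before we get to the full version. Note that train_until() and its $after_epoch hook are hypothetical names I made up for illustration and are not part of FANN. The trainer is passed in as a callable, so with FANN loaded it would wrap fann_train_epoch($ann, $train_data), while here a stand-in that simply halves the error each epoch lets the sketch run anywhere.

```php
<?php
// Hypothetical helper (not part of FANN): keeps calling $train_epoch until
// the MSE it returns is at or below $desired_error or $max_epochs is reached.
// $train_epoch is any callable that trains one epoch and returns the MSE, e.g.
//   function () use ($ann, $train_data) { return fann_train_epoch($ann, $train_data); }
// $after_epoch is an optional hook for logging or taking snapshots.
function train_until(callable $train_epoch, float $desired_error, int $max_epochs, ?callable $after_epoch = null): array
{
    $epoch = 0;
    $mse = INF; // nothing trained yet, so treat the error as "infinitely bad"

    while ($mse > $desired_error && $epoch < $max_epochs) {
        $epoch++;
        $mse = $train_epoch(); // train one epoch and get its MSE
        if ($after_epoch !== null) {
            $after_epoch($epoch, $mse); // do "other things" here
        }
    }

    return array('epochs' => $epoch, 'mse' => $mse);
}

// Stand-in trainer so the sketch runs without FANN: the error halves each epoch.
$fake_mse = 1.0;
$fake_trainer = function () use (&$fake_mse) {
    $fake_mse /= 2;
    return $fake_mse;
};

$result = train_until($fake_trainer, 0.0001, 1000, function ($epoch, $mse) {
    echo 'Epoch ' . $epoch . ' : ' . $mse . PHP_EOL;
});
```

With the stand-in trainer the loop stops after 14 epochs, which is the first time the halved error drops below 0.0001.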

 

Say Cheese

A snapshot is created by using the fann_save() function.

Saving snapshots lets us keep backups of the most recent “best” versions of the ANN, meaning the ones with the lowest MSE seen so far.

If you’re a gamer this is the equivalent of hitting the save button every time you level up rather than just once, and only after you’ve won the game, no less.

In the code I present below I use a queue to keep track of the snapshots and remove the oldest snapshots once we have more than the $max_snapshots value.
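
To show the queue idea by itself, here is a small sketch that runs without FANN. Keep in mind that remember_snapshot() is a hypothetical helper name, and the demo uses empty placeholder files where real training would call fann_save(). This version keeps the oldest snapshot at the front of the array, which is one of several reasonable ways to lay out the queue.

```php
<?php
// Hypothetical helper (not part of FANN): remembers a new snapshot file and
// deletes the oldest ones so that at most $max_snapshots remain on disk.
// Returns the updated queue, oldest first.
function remember_snapshot(array $snapshots, string $new_file, int $max_snapshots): array
{
    $snapshots[] = $new_file; // newest goes to the back of the queue

    while (count($snapshots) > $max_snapshots) {
        $oldest = array_shift($snapshots); // oldest comes off the front
        if (is_file($oldest)) {
            unlink($oldest); // remove the file from disk
        }
    }

    return $snapshots;
}

// Demo with empty placeholder files in a temp folder.
$dir = sys_get_temp_dir() . '/snapshot-demo-' . uniqid();
mkdir($dir);

$snapshots = array();
foreach (array(25, 50, 75, 100) as $epoch) {
    $file = "$dir/xor - $epoch.net";
    touch($file); // in real training this would be fann_save($ann, $file)
    $snapshots = remember_snapshot($snapshots, $file, 3);
}

echo implode(PHP_EOL, array_map('basename', $snapshots)) . PHP_EOL;
```

With a limit of 3, the epoch 25 placeholder is deleted as soon as the epoch 100 one arrives, leaving the files for epochs 50, 75 and 100.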

 

Code

This code demonstrates training XOR using fann_train_epoch() and will let you watch the training process by observing a pseudo MSE (mean squared error).

<?php
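// Create a snapshots sub-folder if it doesn't already exist
// (is_dir() is the right check here, is_file() is always false for a folder)
if(!is_dir(dirname(__FILE__) . "/snapshots")){
  mkdir(dirname(__FILE__) . "/snapshots");
}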

/*
XOR ANN Diagram:

L1 Input Layer:    (I)(I)

L2 Hidden Layer: (H) (H) (H)

L3 Output Layer:     (O)
*/

// Setup the variables to configure our ANN
$num_input = 2; // How many input neurons are there
$num_neurons_hidden = 3; // How many hidden neurons are there
$num_output = 1;  // How many output neurons are there
$num_layers = 3;  // 1 input layer + 1 hidden layer + 1 output layer
$divisor = 10000; // This is our accuracy knob. The higher this value is the smaller the desired error will be
$desired_error = 1 / $divisor; // = 0.0001 (if $divisor = 10000) the Mean Squared Error (MSE) we want
$max_epochs = 100000; //100K adjust as needed
$current_epoch = 0;   // Keep track of which epoch we're on here
$epochs_between_snapshots = 25; // Minimum number of epochs between saves, adjust as needed
$epochs_since_last_snapshot = 0; // Keep track of how many epochs since the last snapshot
$max_snapshots = 10; // How many snapshots to keep before we delete the oldest (and least accurate)
$snapshots = array(); // Keep track of the filename and path of the snapshots here
$hidden_activation = FANN_SIGMOID_SYMMETRIC; // Specify the hidden layer activation function
$output_activation = FANN_SIGMOID_SYMMETRIC; // Specify the output layer activation function
$training_algorithm = FANN_TRAIN_BATCH; // Specify the training algorithm here
$training_data = dirname(__FILE__) . "/xor.data"; // Specify the training file here


// Initialize the pseudo MSE to a number greater than the desired_error
// Technically we could just set this to 1 but I like how this makes
// it more obvious what is going on.
$psudo_MSE_result = $desired_error * $divisor; // 1
$best_mse = $psudo_MSE_result; // keep the best (lowest) MSE seen so far here


// The name of the ANN and associated files
// The naming convention I am using here is:
//
//     BaseName_input#(hidden#)output# - (hidden activation function)(output activation function)training algorithm_RANDOM###
//
//     In practice it looks something like this: xor_2(3)1 - (5)(5)1_880
//
//     The 2(3)1 are "structure/layout":
//     2 Input Neurons
//     3 Hidden Neurons
//     1 Output Neuron
//
//     The (5)(5)1 are the configuration CONSTANT values assigned to:
//     FANN_SIGMOID_SYMMETRIC = 5
//     FANN_TRAIN_BATCH = 1
//
//     The 880 is the Random Number assigned to this training event
//     and resulting ANN.
//
// Complicated? You bet! Necessary? Nope! But... it makes looking at 
// snapshots a little easier/faster for me so... ultimately you can set 
// this to whatever you want.
$ann_name = "xor_$num_input($num_neurons_hidden)$num_output - ($hidden_activation)($output_activation)$training_algorithm" . '_' . mt_rand(100,999);


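// The name of the log file we will use
// (built from dirname(__FILE__) so it lands in the same snapshots folder
// no matter which directory the script is run from)
$log_file = dirname(__FILE__) . '/snapshots/training-log_' . $ann_name . '.csv';
$log = fopen($log_file, 'w'); // Open the log file for writing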


// Create the ANN
$ann = fann_create_standard($num_layers, $num_input, $num_neurons_hidden, $num_output);

// Test that the ANN was created successfully
if ($ann) {
  echo 'Training XOR ANN... ' . PHP_EOL; 
  
  // Configure the ANN
  fann_set_activation_function_hidden($ann, $hidden_activation);
  fann_set_activation_function_output($ann, $output_activation);
  fann_set_training_algorithm ($ann , $training_algorithm);

  // Get the training data
  $train_data = fann_read_train_from_file($training_data);

  // We are trying to minimize psudo_MSE_result so check if 
  // it's greater than our desired_error, if so keep training 
  // so long as we are also under max_epochs
  while(($psudo_MSE_result > $desired_error) && ($current_epoch < $max_epochs)){
    $current_epoch++; // increment the current_epoch
    $epochs_since_last_snapshot++;  // increment epochs_since_last_snapshot
  
    // http://php.net/manual/en/function.fann-train-epoch.php
    // Train one epoch with the training data stored in data. 
    // One epoch is where all of the training data is considered 
    // exactly once.
    // This function returns the MSE as it is calculated either 
    // before or during the actual training. This is not the actual 
    // MSE after the training epoch (hence "pseudo"), but since 
    // calculating that would require going through the entire 
    // training set once more, it is more than adequate to use 
    // this value during training.
    $psudo_MSE_result = fann_train_epoch ($ann , $train_data ); // Train 1 epoch on all the data
    echo 'Epoch ' . $current_epoch . ' : ' . $psudo_MSE_result . PHP_EOL; // report
    
    // Build the log data
    $log_data = $current_epoch . ',' . $psudo_MSE_result; 
    
    // If we haven't saved the network in a while...
    // and the current network is better than the previous best network
    // as defined by the current MSE being less than the last best MSE,
    // then we need to save!
    if(($epochs_since_last_snapshot >= $epochs_between_snapshots) && ($psudo_MSE_result < $best_mse)){
      
      $best_mse = $psudo_MSE_result; // We have a new best MSE
      
      // Save Snapshot
      fann_save($ann, dirname(__FILE__) . "/snapshots/$ann_name - $current_epoch.net");
      echo 'Snapshot Taken' . PHP_EOL; // report the save
      $epochs_since_last_snapshot = 0; // reset the count
      
      $log_data .= ',1'; // Note the save at this epoch in the log data
      
      // Add the current snapshot to the front of the queue (newest first)
      array_unshift($snapshots, dirname(__FILE__) . "/snapshots/$ann_name - $current_epoch.net");
      
      // If we now have too many snapshots, remove the oldest ones
      // (at the back of the queue) until only $max_snapshots remain
      while(count($snapshots) > $max_snapshots){
        $oldest_snapshot = array_pop($snapshots); // Remove the snapshot from the queue
        if(is_file($oldest_snapshot)){
          unlink($oldest_snapshot); // Remove the file
        }
      }
    }else{
      $log_data .= ',0'; // We did not save a snapshot during this epoch
    }
    fwrite($log, $log_data . PHP_EOL); // Write epoch data to log file
  
  } // While we're training

  fclose($log); // Close log file

  echo 'Training Complete! Saving Final Network.' . PHP_EOL;
  // Save the final network
  fann_save($ann, dirname(__FILE__) . "/$ann_name.net");
  fann_destroy_train($train_data); // Free the training data
  fann_destroy($ann); // Free memory
}
echo 'All Done!' . PHP_EOL;

Results

If you run the code above (and you installed FANN) then you should get output that looks something like this. Note that I shortened the output for convenience.


Training XOR ANN... 
Epoch 1 : 0.25012367963791
Epoch 2 : 0.25040853023529
Epoch 3 : 0.25014621019363
Epoch 4 : 0.25001245737076
Epoch 5 : 0.24214675004482
...
Epoch 24 : 0.210278244316578
Epoch 25 : 0.165620195567608
Snapshot Taken
...
Epoch 10328 : 0.00010004204523284
Epoch 10329 : 0.0001000228439807
Epoch 10330 : 0.00010000389738707
Snapshot Taken
Epoch 10331 : 9.9984776170459E-5
Training Complete! Saving Final Network.
All Done!
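
And if the pug and toddler do strike mid-training, the snapshots folder is where you would go to pick things back up. Below is a rough sketch of finding the most recent snapshot, assuming the “name - EPOCH.net” filenames used above and a folder that only holds snapshots from one training run. latest_snapshot() is a hypothetical helper, while fann_create_from_file() is the FANN function for loading a saved network. The demo uses empty placeholder files so it runs without FANN.

```php
<?php
// Hypothetical helper (not part of FANN): returns the path of the snapshot
// with the highest epoch number in $dir, or null if there are none.
// Assumes filenames end in " - EPOCH.net" as in this post.
function latest_snapshot(string $dir): ?string
{
    if (!is_dir($dir)) {
        return null;
    }

    $best_file = null;
    $best_epoch = -1;

    foreach (scandir($dir) as $name) {
        // Capture the digits between the final " - " and ".net",
        // and compare them as numbers so 100 beats 75
        if (preg_match('/ - (\d+)\.net$/', $name, $m) && (int) $m[1] > $best_epoch) {
            $best_epoch = (int) $m[1];
            $best_file = $dir . '/' . $name;
        }
    }

    return $best_file;
}

// Demo with empty placeholder files so the sketch runs without FANN.
$dir = sys_get_temp_dir() . '/resume-demo-' . uniqid();
mkdir($dir);
foreach (array(25, 100, 75) as $epoch) {
    touch("$dir/xor - $epoch.net");
}

$latest = latest_snapshot($dir);
echo 'Resume from: ' . basename($latest) . PHP_EOL;

// With the FANN extension loaded and a real snapshot on disk,
// loading it back would look like this:
// $ann = fann_create_from_file($latest);
```

The epoch numbers are compared as integers on purpose, since a plain alphabetical sort would rank “75” above “100”.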

 

A .csv log file is generated during training as well; however, we’re going to talk about that next week, so remember to hit the follow button to make sure you don’t miss it!

Also, if you enjoyed this post please leave a like and share it with your friends!

And if you have questions or comments… see the comments section.

And before you go, consider supporting me…


Support Me

Your direct monetary support finances my work and goes toward helping me obtain access to better tools and equipment so I can improve the quality of my content. It also helps me eat, pay rent and, of course, we can’t forget to buy diapers for Xavier, now can we? 😛

If you feel inclined to give me money and have your name added to my Sponsors page, then visit my Patreon page and pledge $1 or more a month and you will be helping me grow.

Thank you!

And as always, feel free to use the comments to suggest a project you would like to see built or a topic you would like to hear me discuss. If it sounds interesting, it might just get featured here on my blog for everyone to enjoy.

Much Love,

~Joy