Geek Girl Joy

Artificial Intelligence, Simulations & Software

OCR 2 – The MNIST Database

I know I probably haven’t been posting as frequently as many of you would like or even at my normal quality because… well, like for many of you, this year has just sucked!

Someone I’ve known my whole life died recently, not from the virus though it didn’t help things.

She went in for a “routine” procedure where they needed to use general anesthesia and there were “complications” during the procedure. Something to do with her heart but if I’m being honest, I don’t know all the details at this time.

Also, I’m not sure how by anyone’s definition anything involving anesthesia is routine?

An ambulance was called and she was rushed to the hospital. Long story short, despite being otherwise fine when she went in, she never woke up from her coma. 😥

The hospital is/was on lock down like everyone else and so friends and family were unable to visit her before she died.

Her family intends to sue the Dr. for malpractice, personally… I think they should!

To add insult to injury, she was cremated without a funeral due to the whole pandemic social distancing BS that I’m just about ready to tell the government to go fuck itself over! 😦

I’m sorry, do my harsh words offend you? SHE DIED ALONE! That offends me!

Going forward, my advice… any procedure where they need to administer general anesthesia to you… or maybe any procedure at all… make sure it’s in a hospital or hospital adjacent (NOT A CLINIC) because those minutes waiting for an ambulance really do mean your life!

And if your doctor is like, “No worries this is routine… I’ve done this a thousand times”, maybe think carefully before putting your trust in that person.

Yes, we want doctors that are confident in their ability to treat us but make sure that it is confidence and not complacent hubris!

Further, no procedure is truly “routine” and a doctor, of all people, should know that and act accordingly!

“Primum non nocere”

~Hippocrates… (allegedly)

Regardless of the historical veracity of that quote, does the spirit of that principle still not apply?

Look, I’m not saying this to detract from the important life saving work doctors and medical workers do every day, it’s just that this is part of what’s going on in my life right now (and for many of you as well) and I’m sharing because I guess that’s what you do when you have a blog.

Additionally, less close to home, though still another terrible loss, John Horton Conway, notable math hero to geeks and nerds alike died as a result of complications from his contracting the Covid-19 virus. 😦

I’ve previously written a little about Conway’s work in my ancestor simulations series of posts.

Mysterious Game of Life Posts:

But that only scratches the surface of his work and famously Conway’s Game of Life was perhaps his least favorite but most well known work among non-mathematicians and it would both amuse and bug him if I only mentioned his game of life here so I’m not going to list his other accomplishments.

I’ll have a little chuckle off camera on his behalf. 😛

He really was a math genius and you would learn a lot of interesting, not to mention surreal… but I’ve said too much, ideas by reading about his accomplishments, which I encourage you to do!

In any case, people I know and admire need to stop dying because it’s killing me… not to mention my ratings and readership because I keep talking about it! 😛

I may have a terribly dark sense of humor at times, but going forward I demand strict adherence from all of you to the Oasis Doctrine! 😥

Oh, and speaking of pretentious art…

The OCR 2 Wallpaper

The original OCR didn’t exactly have a wallpaper but I did create an image/logo to go along with the project and its blog posts:

I made it look like an eye for exactly the reason you might think… because it looks like a non-evil HAL 9000! 😛

Also, I like the idea of depicting a robotic eye in relation to AI and neural networks because, even though I am not superstitious in any way, it carries some of the symbology of Illuminati, “The gaze of the Beholder”, “The Eye of Providence”, “The Evil Eye”, The Eye of Horus, The Eye of Ra, Eye of newt and needle… sorry. 😛

In this case, the eye of a robot invokes a sense of literal “Deus ex machina” (God from the machine) and it illustrates some people’s fears of “The Singularity” and of the possibility of an intelligence that is so much greater than our own that it calls into question our ability to even comprehend it… hmmm… is that too Lovecraftian? 😛

Anyway, because I enjoy the thought provoking symbology (maybe it’s just me), I wanted to keep the same concept of the robot eye but update it to look a little less like a simple cartoon to subtly imply it’s a more advanced version of OCR but that it still fundamentally does the same thing, which is most of the reasoning behind this wallpaper.

In any case, I hope you enjoy it.

OCR 2 Wallpaper

If you’d like the wallpaper with the feature image text here’s that version.

OCR 2 Wallpaper (with text)

So I guess having shared a few of the recent tragedies in my personal life and a couple of wallpapers, we should probably get mogating and talk about the point of today’s post!

We’re going to look at doing hand-written number (0-9) Optical Character Recognition using the MNIST database.

OCR 2 – The MNIST Dataset with PHP and FANN

I was recently contacted by a full-stack developer who wanted advice on creating his own OCR system for “stickers on internal vehicles”.

I think he means, some kind of warehouse robots?

He had seen my OCR ANN and seemingly preferred to work with PHP over Python, which if I’m being honest… I can’t exactly argue with!

PHP is C++ for the web and powers something like 80% of the websites whose server-side language is known, so it should come as no surprise to anyone (even though it does) that there are people who want to use it to build bots! 😛

But, if you would rather work with a different language there is a better than decent chance FANN has bindings for it, so you should be able to use the ANNs even if you are not using PHP.

So anyway, he gave me a dollar for my advice through Patreon and we had a brief conversation over messaging where I offered him a few suggestions and walked him through getting started.

Ultimately, because he lacks an AI/ML background and/or a sufficient familiarity with an AI/ML workflow he wasn’t very confident about proceeding so I recommended he follow my existing tutorials which should help him learn the basics of how to proceed.

Now here’s the thing, even among people who like my content and value my efforts, few people are generous enough to give me money for my advice and when they do, I genuinely appreciate it! 🙂

So, as a thank you I want to offer another (more complete) example of how to use a neural network to do OCR.

If he followed my advice, he should be fairly close to being ready for a more complete real world OCR ANN example (assuming he is still reading 😛) but if not, his loss is still your gain!

Today’s code implements OCR using the MNIST dataset. I demonstrate a basic form of pooling (though the stride is not adjustable as is), and I show convolutions using the GD image library’s image convolution function (imageconvolution()). I also include 17 demonstration kernel matrices that you can experiment with, though not all are relevant or necessary for this project.
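To make the pooling idea a little more concrete, here is a minimal sketch of 2×2 max pooling with a fixed stride of 2. It is not the code from the repo; the function name maxPool2x2() and the plain 2D array input are my own assumptions for illustration, and the real project may pool differently.

```php
<?php
/**
 * 2x2 max pooling with a fixed stride of 2 (illustrative sketch only).
 * $pixels is a 2D array of numeric values, indexed [row][column].
 * maxPool2x2() is a hypothetical helper, not a function from the repo.
 */
function maxPool2x2(array $pixels): array
{
    $pooled = [];
    $height = count($pixels);
    $width  = $height > 0 ? count($pixels[0]) : 0;

    // Step over the image two rows and two columns at a time.
    for ($y = 0; $y + 1 < $height; $y += 2) {
        $row = [];
        for ($x = 0; $x + 1 < $width; $x += 2) {
            // Keep the strongest value in each 2x2 window.
            $row[] = max(
                $pixels[$y][$x],
                $pixels[$y][$x + 1],
                $pixels[$y + 1][$x],
                $pixels[$y + 1][$x + 1]
            );
        }
        $pooled[] = $row;
    }
    return $pooled;
}

$image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 1, 0],
    [2, 3, 4, 5],
];
print_r(maxPool2x2($image)); // rows [5, 7] and [9, 5]
```

Applied to a 28×28 MNIST image, this would leave a 14×14 grid, so the network would see 196 inputs instead of 784.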

This is still very basic but everything you need to get started experimenting with OCR is here.

Having said that, in all honesty, to accomplish your goal requires building your own dataset and modifying the code I present here to meet your needs.

Neither is exactly hard, but both will require significant time and dedication to testing and refining your processes.

Obviously that’s not something I can cover in a single post or even assist you with for only a dollar, but since so few people show me the kindness and consideration you have, at a time of shrinking economies no less, I wanted to offer you this working OCR prototype to help you along your way.

Our Method

1. Download the MNIST dataset (link below, but it’s in the GitHub repo too).

2. Unpack/Export the data from the files to images and labels.

(technically we could even skip the images and go directly to a training file but I think it’s nice to have the images and labels in a human viewable format)

3. Create training and test data from images and labels.

4. Train the network.

5. Test the network.
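Step 2 above can be sketched roughly like this. The MNIST files use the IDX layout: a header of big-endian 32-bit integers (magic number 2051 for images, 2049 for labels) followed by one unsigned byte per pixel or label. The function names parseIdxImages() and parseIdxLabels() are hypothetical, and the sketch assumes the .gz download has already been decompressed and read into a string.

```php
<?php
/**
 * Parse the contents of a decompressed MNIST IDX3 image file.
 * Returns a list of images, each a flat array of 0-255 pixel values.
 * (Hypothetical helper for illustration, not the repo's code.)
 */
function parseIdxImages(string $bytes): array
{
    // Header: four big-endian 32-bit integers.
    $header = unpack('Nmagic/Ncount/Nrows/Ncols', substr($bytes, 0, 16));
    if ($header === false || $header['magic'] !== 2051) {
        throw new InvalidArgumentException('Not an IDX3 image file.');
    }
    $size   = $header['rows'] * $header['cols'];
    $images = [];
    for ($i = 0; $i < $header['count']; $i++) {
        // 'C*' unpacks unsigned bytes; array_values() re-indexes from 0.
        $images[] = array_values(unpack('C*', substr($bytes, 16 + $i * $size, $size)));
    }
    return $images;
}

/**
 * Parse the contents of a decompressed MNIST IDX1 label file.
 * Returns a flat list of digit labels (0-9).
 */
function parseIdxLabels(string $bytes): array
{
    $header = unpack('Nmagic/Ncount', substr($bytes, 0, 8));
    if ($header === false || $header['magic'] !== 2049) {
        throw new InvalidArgumentException('Not an IDX1 label file.');
    }
    return array_values(unpack('C*', substr($bytes, 8, $header['count'])));
}

// Tiny synthetic buffers standing in for the real files: two 2x2 images, three labels.
$fakeImages = pack('N4', 2051, 2, 2, 2) . "\x00\xFF\x10\x20\x01\x02\x03\x04";
$fakeLabels = pack('N2', 2049, 3) . "\x07\x00\x09";
print_r(parseIdxImages($fakeImages));
print_r(parseIdxLabels($fakeLabels));
```

With the real files you would pass in the decompressed file contents instead of the synthetic buffers, then write each pixel array out as an image if you want the human viewable copies.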

The MNIST Dataset

MNIST stands for Modified National Institute of Standards and Technology database.

And since I’m still recovering from last night’s food poisoning due to the Chicken à la Nauseam, we’re just going to use Wikipedia’s introduction to MNIST.

It’s easily as good as anything I could write and doesn’t require me to actually write it, so…

Wikipedia says:

“It’s a large database of handwritten digits that is commonly used for training various image processing systems.[1][2]”

It also says:

“It was created by “re-mixing” the samples from NIST’s original datasets. The creators felt that since NIST’s training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments.[5] Furthermore, the black and white images from NIST were normalized to fit into a 28×28 pixel bounding box and anti-aliased, which introduced grayscale levels.[5]”

Here’s 500 pseudo-random MNIST sample images:

I randomly selected 500 1’s, 3’s and 7’s and composited them into this 1337 animation. 😛

500 random 1337 MNIST images

Seriously though, today we will be training a bot to identify which hand-written number (0-9) each 28×28 px image contains and then testing the bot using images it hasn’t previously seen.

Our bot will learn using all 60K labeled training images and we’ll test it using the 10,000 labeled test images.
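If you are wondering what labeled training data looks like once it is ready for FANN, here is one possible sketch. A FANN training file starts with a header line (number of pairs, inputs per pair, outputs per pair) and then alternates input lines and output lines. The one-hot output encoding and the 0 to 1 pixel scaling are my assumptions for this example; this excerpt doesn’t show how the real project encodes them, and buildFannTrainingData() is a hypothetical name.

```php
<?php
/**
 * Build the text of a FANN training file from images and labels.
 * $images: list of flat pixel arrays (0-255); $labels: list of digits 0-9.
 * Assumption: ten outputs, one per digit (one-hot encoding).
 */
function buildFannTrainingData(array $images, array $labels): string
{
    $numInputs = count($images[0]);
    // Header line: number of pairs, inputs per pair, outputs per pair.
    $lines = [count($images) . ' ' . $numInputs . ' 10'];

    foreach ($images as $i => $pixels) {
        // Scale each pixel from 0-255 down to 0-1.
        $inputs = array_map(function ($p) {
            return round($p / 255, 4);
        }, $pixels);

        // One-hot target: a 1 in the position of the correct digit.
        $target = array_fill(0, 10, 0);
        $target[$labels[$i]] = 1;

        $lines[] = implode(' ', $inputs);
        $lines[] = implode(' ', $target);
    }
    return implode(PHP_EOL, $lines) . PHP_EOL;
}

// One made-up 4 pixel "image" labeled as the digit 3.
echo buildFannTrainingData([[0, 255, 51, 0]], [3]);
```

For the full dataset the header would read 60000 784 10, assuming you feed all 784 raw pixels of each 28×28 image to the network.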

Here’s the wiki article if you would like to learn more about the database.

MNIST WIKI: https://en.wikipedia.org/wiki/MNIST_database

And as I said above, I’ve included the database in the GitHub repo but you can download it again from the original source if you prefer.

Original MNIST Download: http://yann.lecun.com/exdb/mnist/

Continue reading “OCR 2 – The MNIST Database”

Visualizing Your FANN Neural Network

At some point you will want a diagram of your FANN neural network.

Example Diagram

Programmatically generated diagram of XOR ANN
Programmatically generated XOR ANN Stats

Reasons May Include:

  • You need artwork for your fridge or cubicle and Van Gogh’s Starry Night was mysteriously unavailable!
  • You want an illustration to help potential investors understand some of the technical aspects of how your AI startup works.
  • You’re trying to convince the good people who enjoy your work to throw gobs of cash at your Patreon. 😛

But… your exact reasons may vary! 😉

Nonetheless, read on because I’m giving you 100% free & fully functional code and explaining how it works.

I’m not even asking for your email address!

 

Continue reading “Visualizing Your FANN Neural Network”

Getting Started With Neural Networks and PHP in 2019

So it turns out that I am the world’s premier neural network developer working with PHP as my language of choice… trust me, that sounds way more glamorous than it actually is! 😛

I feel confident in saying this though because I literally wrote all the examples of neural networks in PHP.

I also added the bulk of the user contributed comments about the FANN extension over on PHP.net.

I’ve also written a bunch of PHP Neural Network Tutorials and more are on the way! 🙂

Basically, I’ve worked very hard to establish my credibility so that when I speak on the subject I do so with a modicum of authority… hopefully. 😛

Anyway, I was contacted by a reader about my OCR Neural Network this past weekend, seems they were having some difficulty in getting FANN to work on their machine:

I have tried to run this project, but i am not able to resolve the error produce by fann library. please guide me with it how to resolve it. ~Abhijit Gutal

When in doubt ask for help!

I replied to Mr. Abhijit though I got no reply.

No worries though, I’ve been meaning to re-address the issue of setting up a FANN development environment since Amazon bought C9 so now seems like a good time to do an update!

What we’ll cover in today’s post:

  • How to download and install Virtualization Software for free.
  • I walk you through the basics of how to build a Virtual Machine and install Linux on it.
  • How to install Git, PHP, FANN & the PHP extension.
  • How to test everything is set up correctly.
  • What to do next.

All that in just one post!
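As a taste of the “test everything” step, here is a small sketch that reports whether the FANN extension is visible to PHP. The helper name checkFannEnvironment() is made up for this example, and the full post may verify the setup differently; the checks themselves only use standard PHP functions.

```php
<?php
/**
 * Report whether this PHP install looks ready for FANN work.
 * (Hypothetical helper for illustration.)
 */
function checkFannEnvironment(): array
{
    return [
        'php_version'     => PHP_VERSION,
        // True when the FANN extension is compiled in or loaded via php.ini.
        'fann_extension'  => extension_loaded('fann'),
        // True when the extension's functions are actually callable.
        'fann_run_exists' => function_exists('fann_run'),
    ];
}

foreach (checkFannEnvironment() as $check => $result) {
    echo str_pad($check, 18) . var_export($result, true) . PHP_EOL;
}
```

If fann_extension comes back false, the extension is either not installed or not enabled for the PHP binary you are running (the command line and web server can use different php.ini files).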

So… Ready to get started creating your own neural networks using PHP? Continue reading “Getting Started With Neural Networks and PHP in 2019”

When Good Parts Go Bad!

Surprise!

Okay, so on Monday I published my article ‘No Posts‘ to inform everyone of some equipment trouble that befell my computer.

I notified everyone that I had to restore my computer to operational status prior to resuming my blogging activities.

Well I am now pleased to inform everyone that I believe I have resolved the problem with my computer that prevented me from working. It turns out that an electrolytic capacitor on my video card leaked and I had to desolder the bad capacitor, clean the board and replace the capacitor with an equivalent component from my home workshop. YAY!!!!

I don’t see this as a good long-term solution or repair, however, because I am aware of the concept of ‘The Bathtub Curve’, which I briefly mentioned in my post Are You Prepared For Disaster?, and as my system is already quite well used and worn, I expect to be on the latter half of the curve, where the failure rate is increasing.

Therefore, before proceeding with further posts I believe it to be prudent to take a few days off posting and projects to put my machine through its paces and confirm to my satisfaction that the machine is production ready. If everything goes well, expect me to resume posting next Monday, September the 18th.

Should we continue with the Ancestor Simulations Series or are you getting tired of this topic and prefer I blog about something else? Personally I am rather enjoying these Ancestor Simulation posts but I know a lot of you love, yes I said just ABSOLUTELY LOVE my neural networking posts! 😛

Getting Started with Neural Networks Series

Pathfinding From Scratch Using A Neural Network Series

Lets Teach AI How To Read Using PHP Series (Optical Character Recognition)

Machine Learning from a Database Series

 

These posts are some of your favorites and among my most read and reread articles.

Especially the ‘Getting Started’ series!

I’ve been looking for interesting and unique topics and problems to create Neural Networks to solve. I think when we get to the “aliens / creatures / ancestors” posts we will be implementing some Neural Networks so that they behave in intelligent ways and will hopefully display emergent behavior, but it all depends on how much time I have to dedicate to the project, how interested you guys are and how much raw processing and computation is required to produce a successful result.

For example, I have a neural network project sitting on my ‘cutting room floor’ where I attempt to do ‘text to speech’ synthesis using raw wav files as ‘training data’; however, there were too many data points and WAY TOO MUCH processing to be viable as I had implemented it. The only thing it really generated consistently was ‘white noise’; however, I may try again in the future to use ‘phonemes’ and we’ll see how it goes with a simplified methodology.

In any case, I’d like to leave you with another example of ‘good parts going bad’ and how this applies to the concept of the ‘The Bathtub Curve‘, it also further stresses the concept that you cannot plan for every failure!

Years ago I was working on an old used HP DC Series Business Desktop in a Small Form Factor.

As I recall I was performing refurb maintenance on it so that it could be redeployed to a factory environment. I had the case open and the computer sitting on my grounded work bench.

I was wearing my grounding strap and everything was all fairly routine.

I booted up HP DIAGS (or was it Hiren’s Boot CD? Can’t recall 😛) to perform a system-wide hardware ‘stress test’ (I will be doing something similar in the coming days with this computer) and then I retired to my ‘SysAdmin’ desk to respond to emails. After about 5 minutes I got up and walked over to the machine, and as I was looking (just looking) into the case at the CPU fan spinning I was instantly showered in sparks and there was a fountain of fireworks spewing from inside the computer!

With my adrenaline starting to flow and my heart beating like a 555 timer, I reached through the sparks to the surge protector and yanked the NEMA 5-15 end of the IEC C13 to NEMA 5-15 power cable, and just as quickly as the fireworks started, the calm returned.

Of course I immediately reported everything to the IT Director who insisted I determine the cause!

After a thorough examination of the machine I discovered that it was the PSU and not the CPU that had so epically failed!

After further research I discovered that the particular manufacturing batch that the PSU was a member of, used an inferior glue inside the power supply which became ‘hydrophilic‘ over time and absorbed ambient humidity in the air leading to a greater likelihood of failure over time, just as the bathtub curve generally predicts!

Fortuitously there was a recall on the PSU with the batch number we had and since this was a known issue to the manufacturer I was able to get the part replaced under warranty! πŸ™‚

There was no way we could have prevented the PSU from catastrophic failure however I was able to quickly diagnose and resolve the problem because I knew what I was doing, I had the right resources at my disposal and I had a vendor who was willing to work with me to resolve the issue to my satisfaction.

Fast forward to my recent video card failure and repair: it’s WAY out of warranty and so far from ‘state of the art’ that you could call it a cave painting! 😛

As I said I expect that this minor setback is merely a harbinger that signals future hardware failure is on the horizon for my production PC so if you would like to help me upgrade my equipment so that we can avoid additional (potentially longer) interruptions to my posting or assist me with paying the $1,200 to repair the transmission on my PT Cruiser, or simply just want to say thanks for all the cool posts and code then consider supporting me over on Patreon.

If you would like to support me anonymously using bitcoins please use the address below or scan the QR code using your bitcoin wallet app, thank you!

Please Send Here: 1GgUHjVAYqFLBWufFTpkoDEti16DSSNNG7

With that, have a great week & I hope to see you all in the next post!

Please Like, Comment below & Share this post with your friends and followers on social media.

If you would like to suggest a topic or project for an upcoming post feel free to contact me.

Much Love,

~Joy

Lets teach AI how to read using PHP IV

 

Welcome back to the fourth and final installment of the OCR tutorial series. I know you have been eagerly awaiting this post which will tie everything together and enable us to finally test the OCR ANN.

Here are the previous posts in this series:

Lets teach AI how to read using PHP

Lets teach AI how to read using PHP Part 2

Lets teach AI how to read using PHP Part 3

 

Before proceeding here is the code we will be considering:

test_ocr.php


<style>
	.blue{color:blue;}
	.green{color:green;}
	.red{color:red;}
</style>

<?php
/*
    OCR( 
	    (string) $img, 
	    (char) $expected, 
		(array) $input, 
		(array)$lookup_array, 
		(FANN neural network resource)$ann
	   );
	
	Description: 
	
	With this function I simply try to illustrate how you could test the OCR ANN. 
				
	$img is a string that should be the name to the training image you are reading from. It is
	used to output the image to the browser results.
	
	$expected is a char that you are actually testing for. eg  
	
	$input should be an array of inputs (encoded pixel data)
	
	$lookup_array should be an array of ASCII characters normalized as floating point values in increments of 0.01
	
	$ann should be a FANN neural network resource.
	
	References:   
	   global - http://php.net/language.variables.scope
	   PHP_EOL - http://php.net/manual/en/reserved.constants.php
	   fann_run() - http://php.net/manual/en/function.fann-run.php
	   floor() - http://php.net/manual/en/function.floor.php
	   count() - http://php.net/manual/en/function.count.php
*/
function OCR($img, $expected, $input, $lookup_array, $ann) {
	global $correct; // refer to the non local $correct variable
	$output = ""; 
	
	/* Display image for reference */
	$output .= "Image: <img src='images/$img'><br>" . PHP_EOL;

	// Run the ANN
	$calc_out = fann_run($ann, $input);
	
	$output .= 'Raw: ' .  $calc_out[0] . '<br>' . PHP_EOL;
	$output .= 'Trimmed: ' . floor($calc_out[0]*100)/100 . '<br>' . PHP_EOL;
	$output .= 'Decoded Symbol: ';
	
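	/* What did the ANN think it saw? */
	$matched = false; // becomes true once a lookup entry matches the trimmed output
	for($i = 0; $i < count($lookup_array); $i++) {
		if(floor($lookup_array[$i][0]*100)/100 == floor($calc_out[0]*100)/100) {
			$matched = true;
			$output .= $lookup_array[$i][1] . '<br>' . PHP_EOL;
			$output .= "Expected: $expected <br>" . PHP_EOL;
			$output .= 'Result: ';
			if($expected == $lookup_array[$i][1]){
				$output .= '<span class="green">Correct!</span>';

				++$correct;

			}else{
				$output .= '<span class="red">Incorrect!</span> <a href="train_ocr.php">Retrain OCR</a>';
			}
		}
	}
	/* If nothing matched, say so rather than leaving "Decoded Symbol" blank */
	if(!$matched) {
		$output .= 'No match<br>' . PHP_EOL;
		$output .= "Expected: $expected <br>" . PHP_EOL;
		$output .= 'Result: <span class="red">Incorrect!</span> <a href="train_ocr.php">Retrain OCR</a>';
	}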
	$output .= '<br><br>' . PHP_EOL;
	
	return $output;	
}


$total = 11; // How many images are to be tested
$correct = 0; // The count of how many images were correctly read by the ANN


/* Setup a resource that points to our ANN .net file */
$train_file = (dirname(__FILE__) . '/ocr_float.net');

/* Confirm the ANN exists */
if (!is_file($train_file))
	die('<span class="red">The file ocr_float.net has not been created!</span><a href="train_ocr.php">Train OCR</a>' . PHP_EOL);

/* Create the ANN resource */
$ocr_ann = fann_create_from_file($train_file);
if ($ocr_ann) {
	// Display the images we are testing (hard coded)
	?>
	<h1 class='blue'>OCR Test</h1>
	<strong>Testing: </strong>
	<img src='images/38.png'> <!-- G -->
	<img src='images/68.png'> <!-- e -->
	<img src='images/68.png'> <!-- e -->
	<img src='images/74.png'> <!-- k -->
	<img src='images/38.png'> <!-- G -->
	<img src='images/72.png'> <!-- i -->
	<img src='images/81.png'> <!-- r -->
	<img src='images/75.png'> <!-- l -->
	<img src='images/41.png'> <!-- J -->
	<img src='images/78.png'> <!-- o -->
	<img src='images/88.png'> <!-- y -->
	<br>
	<?php

	/* 
	    Create the lookup_array from ASCII
		https://en.wikipedia.org/wiki/ASCII
		
        start: ascii dec 33 (!) 
        stop:  ascii dec 126 (~) 
    */
	$result_lookup_array = array();
	$curr = 0.00;
	for($i = 33; $i <= 126; $i++) {
		array_push($result_lookup_array, array($curr, chr($i)));
		$curr+= 0.01;
	}
	

	// For simplicity's sake I hardcoded the values below, as there is no need to prove that we can read
	// the pixel data for each image (we already did that in generate_training_data.php). However, you
	// can use a methodology similar to what I did with GenerateTrainingData() to read pixel values
	// programmatically into an array rather than specifying them manually as I show here.
	
	$test_G = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_e = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_k = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_i = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_r = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_l = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_J = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_o = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	$test_y = array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
	
       
	/* Test OCR and buffer results in the $output variable as string data */
	$output = "";
	$output .= OCR('38.png', 'G', $test_G, $result_lookup_array, $ocr_ann);
	$output .= OCR('68.png', 'e', $test_e, $result_lookup_array, $ocr_ann);
	$output .= OCR('68.png', 'e', $test_e, $result_lookup_array, $ocr_ann);
	$output .= OCR('74.png', 'k', $test_k, $result_lookup_array, $ocr_ann);
	$output .= OCR('38.png', 'G', $test_G, $result_lookup_array, $ocr_ann);
	$output .= OCR('72.png', 'i', $test_i, $result_lookup_array, $ocr_ann);
	$output .= OCR('81.png', 'r', $test_r, $result_lookup_array, $ocr_ann);
	$output .= OCR('75.png', 'l', $test_l, $result_lookup_array, $ocr_ann);
	$output .= OCR('41.png', 'J', $test_J, $result_lookup_array, $ocr_ann);
	$output .= OCR('78.png', 'o', $test_o, $result_lookup_array, $ocr_ann);
	$output .= OCR('88.png', 'y', $test_y, $result_lookup_array, $ocr_ann);
	
    // Determine how accurate the Neural Network is
    $percent_correct = round(($correct / $total) * 100, 2 );
    
	// Output the accuracy results
	echo "<strong>Results:</strong> $correct images correctly decoded out of $total. (<span class='"; 
	
	/* Add a css style to the percentage results */
	if($percent_correct < 70){echo "red'>";}
	elseif($percent_correct < 90){echo "blue'>";}
	else{echo "green'>";}
	
	/* Close css style span and offer link to retrain */
	echo $percent_correct . "%</span>)<br>Not good enough? <a href='train_ocr.php'>Retrain OCR</a><br><br>" ;
	
	/* display detailed results */
	echo "<h2 class='blue'>Details</h2>";
	echo $output;
	
	/* Free up memory associated with the OCR ANN resource. */ 
	fann_destroy($ocr_ann);
} else {
	die("<span class='red'>Invalid file format.</span>" . PHP_EOL);
}

?>

If you run this code the following things will happen:
  1. The images we are testing will display on the page using HTML <img> elements.
  2. Our neural network will be loaded from the file we created in step 3 of this tutorial.
  3. 11 tests using the OCR function I provide will be performed.
  4. The results of the ANN will be computed then displayed as an accuracy %.
  5. Detailed results of each test result will be displayed.
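One thing worth knowing about the lookup step in OCR(): it compares floats that were built by adding 0.01 over and over, and floats don’t always land where you expect (0.29 * 100 evaluates to 28.999999999999996, which floors to 28). As an alternative sketch, and to be clear this is not what the tutorial code does, you could turn the ANN output into an integer bucket and compute the character directly. decodeSymbol() is a hypothetical helper name.

```php
<?php
/**
 * Map an ANN output in the 0.00-0.93 range to a printable ASCII character
 * ('!' = 0.00 through '~' = 0.93) using an integer bucket instead of
 * comparing floats built by repeated addition.
 * (Hypothetical alternative, not the tutorial's lookup code.)
 */
function decodeSymbol(float $value): ?string
{
    // The small epsilon guards against cases such as 0.29 * 100, which
    // evaluates to 28.999999999999996 and would otherwise floor to 28.
    $index = (int) floor($value * 100 + 1e-9);
    if ($index < 0 || $index > 93) {
        return null; // outside the trained symbol range
    }
    return chr(33 + $index);
}

var_dump(decodeSymbol(0.38));  // the bucket used for 'G' (images/38.png)
var_dump(decodeSymbol(0.684)); // trimmed to 0.68, the bucket for 'e'
var_dump(decodeSymbol(1.5));   // NULL, no matching symbol
```

The trade-off is that this only works because the symbols are laid out in ASCII order starting at ‘!’; the lookup array in the tutorial code is more flexible if you ever want a different symbol set.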

Now that we have tested the OCR ANN, let’s break it down and understand what is going on.

 

Step 1 – generate_training_images.php

In Step 1 we create our images and log file.

Step 2 – generate_training_data.php

In Step 2 we use the log file as a reference to step through each image and examine every pixel and assign it a value of 1 or 0 based on the color of the pixel. We then save our results to a new file called ocr.data.
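The pixel-to-number idea in Step 2 can be sketched like this. To keep the example self-contained it works on a plain array of grayscale values rather than reading an image file, and the threshold of 128 and the “dark pixel = 1” convention are assumptions on my part; binarizePixels() is a hypothetical name, not the function from generate_training_data.php.

```php
<?php
/**
 * Flatten a grid of grayscale values (0 = black, 255 = white) into
 * 1/0 inputs: 1 for "ink", 0 for background.
 * (Hypothetical helper; threshold and polarity are assumptions.)
 */
function binarizePixels(array $grayRows, int $threshold = 128): array
{
    $inputs = [];
    foreach ($grayRows as $row) {
        foreach ($row as $gray) {
            // Assumption: dark pixels are the glyph, light pixels are background.
            $inputs[] = $gray < $threshold ? 1 : 0;
        }
    }
    return $inputs;
}

$tinyImage = [
    [255, 0, 255],
    [255, 0, 255],
];
echo implode(' ', binarizePixels($tinyImage)) . PHP_EOL; // 0 1 0 0 1 0
```

The flat array it returns has the same shape as the hardcoded $test_G, $test_e and friends in test_ocr.php: one 0 or 1 per pixel, row after row.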

 

Step 3 – train_ocr.php

In Step 3 we train the neural network and save it as ocr_float.net.

Step 4 – test_ocr.php (This part of the tutorial)

In Step 4 we load the ANN from the file ocr_float.net and then proceed to test it. In this image I ran multiple tests and excluded the individual image details; however, in the code I provided you will get additional data about each test image.

At this point our toy OCR neural network is complete and operating as well as can be expected.

Why is this a “toy” neural network? Because it is trained to “classify” or identify images in a single one-off event rather than convolving (yes, that is a real word 😛) over the images and extracting features. Basically, what this means is that while it eventually got a ~73% correct identification rate, it’s not actually going to read text out of just any image. Not only that, we broke the cardinal rule by testing our ANN with the same data we trained it on (always test on new data, not what it was trained on), so the ANN is overfitting (“hyper fitting”) its data set.
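For what it’s worth, honoring that cardinal rule doesn’t take much code. Here is a minimal sketch of holding back part of your samples for testing; the 80/20 split, the fixed seed and the name trainTestSplit() are all assumptions for illustration, not part of this project.

```php
<?php
/**
 * Split a list of samples into a training set and a held-out test set.
 * $testFraction is the share of samples reserved for testing.
 * (Hypothetical helper for illustration.)
 */
function trainTestSplit(array $samples, float $testFraction = 0.2, int $seed = 42): array
{
    mt_srand($seed);   // fixed seed so the split is repeatable
    shuffle($samples); // shuffle() draws from the seeded generator (PHP 7.1+)

    $testCount = (int) round(count($samples) * $testFraction);
    return [
        'test'  => array_slice($samples, 0, $testCount),
        'train' => array_slice($samples, $testCount),
    ];
}

$split = trainTestSplit(range(1, 10));
echo count($split['train']) . ' training samples, '
   . count($split['test']) . ' test samples' . PHP_EOL;
```

You would train only on the ‘train’ portion and report accuracy only on the ‘test’ portion, so the percentage reflects images the ANN has never seen.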

We could probably improve accuracy by moving the letters in the images, blurring them, rotating them, changing their color and adding the ability for the ANN to work with more than black and white text, etc… but in the end it would only be a marginal improvement.
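
As a rough idea of what “moving the letters” could look like, here is a minimal sketch that shifts an encoded glyph inside its grid. shift_glyph() is a hypothetical helper and is not part of the project code; it works on rows of '0' and '1' characters rather than on real image files:

```php
<?php
// Hypothetical augmentation helper: shift a glyph inside its grid.
// $rows is a list of equal-length strings of '0' and '1' (one string per pixel row).
// Positive $dx moves the glyph right, positive $dy moves it down.
// Pixels pushed past the edge are dropped and the gap is filled with '0' (black).
function shift_glyph(array $rows, int $dx, int $dy): array
{
    $height = count($rows);
    $width  = strlen($rows[0]);
    $out    = [];

    for ($y = 0; $y < $height; $y++) {
        $srcY = $y - $dy;
        $row  = '';
        for ($x = 0; $x < $width; $x++) {
            $srcX   = $x - $dx;
            $inside = $srcY >= 0 && $srcY < $height && $srcX >= 0 && $srcX < $width;
            $row   .= $inside ? $rows[$srcY][$srcX] : '0';
        }
        $out[] = $row;
    }

    return $out;
}

$glyph = [
    '0110',
    '1001',
    '0000',
];

// Move the glyph one pixel right and one pixel down.
print_r(shift_glyph($glyph, 1, 1));
```

Each shifted copy would keep the same desired output value as the original image, so the ANN sees several positions for the same symbol.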

To use OCR in more real-world scenarios you will need to implement convolution layers, which will allow you to not only read the letters and words in an image of any size or color but also do object recognition, such as: test image 1 is a cat and test image 2 is a TI-82 calculator…
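
If “convolution” sounds mysterious, the core operation is small. Here is a minimal sketch of a single 2D convolution pass in plain PHP. This is not something FANN provides and it is not a full convolution layer; convolve2d() and the sample data are hypothetical, purely for illustration:

```php
<?php
// Slide a small kernel over a 2D grid and sum the element-wise products
// ("valid" mode: the kernel never hangs over the edge of the image).
// Like most deep learning libraries this does not flip the kernel, so
// strictly speaking it is cross-correlation.
// convolve2d() is a hypothetical name used for this sketch.
function convolve2d(array $image, array $kernel): array
{
    $ih = count($image);
    $iw = count($image[0]);
    $kh = count($kernel);
    $kw = count($kernel[0]);
    $out = [];

    for ($y = 0; $y <= $ih - $kh; $y++) {
        $row = [];
        for ($x = 0; $x <= $iw - $kw; $x++) {
            $sum = 0;
            for ($ky = 0; $ky < $kh; $ky++) {
                for ($kx = 0; $kx < $kw; $kx++) {
                    $sum += $image[$y + $ky][$x + $kx] * $kernel[$ky][$kx];
                }
            }
            $row[] = $sum;
        }
        $out[] = $row;
    }

    return $out;
}

// A vertical edge: black (0) on the left, white (1) on the right.
$image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
];

// A kernel that responds where brightness increases from left to right.
$kernel = [
    [-1, 1],
    [-1, 1],
];

print_r(convolve2d($image, $kernel));
```

The output is largest exactly where the edge is, which is the “feature extraction” idea: the kernel finds a pattern wherever it appears in the image.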

And with that, I hope you had as much fun following along with this tutorial as I had creating it! 🙂

Please support me on Patreon.

If you would like to obtain a copy of this code from GitHub you can find the complete project here: OCR on GitHub

Note: This project (all the code, the title images, as well as the infographic) is licensed under everyone’s favorite license (the MIT License), so feel free to take this code and develop it into something amazing! Please just attribute me as the author of the initial code base. Also, if you use this to create something cool, I’d love to hear about it! 🙂

As always I hope you found this project both interesting and informative. Please Like, Comment & Share this post with your friends and followers on your social media platforms and don’t forget to click the follow button over on the top right of this page to get notified when I post something new.

If you would like to suggest a topic or project for an upcoming post, feel free to contact me.

Much Love,
~Joy

Lets teach AI how to read using PHP III

 

Welcome back to the third installment of the OCR series.

In the last post I showed you how to encode the images from a set of training images into training data for the OCR Neural Network we are building.

Here are the previous posts in this series:

Lets teach AI how to read using PHP

Lets teach AI how to read using PHP Part 2

Now that the images are encoded as numbers representing the pixel color, we are ready to teach our ANN how to identify symbols in images.

train_ocr.php



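<?php

set_time_limit ( 300 ); // max run time 5 minutes (adjust as needed)

/* Network shape: 160 inputs (10 x 16 pixels), 1 hidden layer, 1 output */
$num_input = 160;
$num_output = 1;
$num_layers = 3;
$num_neurons_hidden = 107;

/* Training stops at whichever comes first: the desired error or the epoch limit */
$desired_error = 0.00001;
$max_epochs = 5000000;
$epochs_between_reports = 10;

$ann = fann_create_standard($num_layers, $num_input, $num_neurons_hidden, $num_output);

if ($ann) {
	echo 'Training OCR... '; 

	/* The symmetric sigmoid produces values between -1 and 1 */
	fann_set_activation_function_hidden($ann, FANN_SIGMOID_SYMMETRIC);
	fann_set_activation_function_output($ann, FANN_SIGMOID_SYMMETRIC);

	/* Train on ocr.data and save the network only if training succeeds */
	$filename = dirname(__FILE__) . "/ocr.data";
	if (fann_train_on_file($ann, $filename, $max_epochs, $epochs_between_reports, $desired_error)) {
		fann_save($ann, dirname(__FILE__) . "/ocr_float.net");
	} else {
		echo 'Training failed, no network was saved.<br>' . PHP_EOL;
	}

	/* Free the memory used by the network */
	fann_destroy($ann);
} else {
	die('Error: Unable to create the neural network.');
}

echo 'All Done! Now run <a href="test_ocr.php">Test OCR</a><br>' . PHP_EOL;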



If you run this code the following things will happen:
  1. A standard, fully connected, 3 layer backpropagation neural network will be created with 160 inputs, 107 hidden neurons and 1 output.
  2. The ANN will be configured to use the symmetric Sigmoid activation function (FANN_SIGMOID_SYMMETRIC) on its hidden and output layers.
  3. The ANN is trained on ocr.data, saved as ocr_float.net and then freed from memory.
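
Since the whole network has just one output, it helps to see how a single number relates to a symbol. The training data gives image $i the target value 0.01 * $i, and image 0 is ASCII 33 (!). Below is a minimal sketch of that mapping in both directions. The two helper functions are hypothetical (they are not the OCR function used in test_ocr.php), and rounding to the nearest image number is just one plausible way to decode the output:

```php
<?php
// Hypothetical helpers (not from the project code) showing how the single
// output value relates to a symbol. The training data assigns image $i the
// target 0.01 * $i, and image 0 is ASCII 33 (!), so image $i is chr(33 + $i).
function symbol_to_target(string $symbol): float
{
    return 0.01 * (ord($symbol) - 33);
}

function output_to_symbol(float $output): string
{
    $index = (int) round($output * 100); // nearest image number
    $index = max(0, min(93, $index));    // clamp to the 94 trained symbols
    return chr(33 + $index);
}

echo symbol_to_target('A') . PHP_EOL;    // 0.32
echo output_to_symbol(0.3187) . PHP_EOL; // A
```

Notice how close together the targets are: neighboring symbols are only 0.01 apart, so a small error in the output is enough to land on the wrong character.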

Now that we have trained the ANN all that is left is to test it, which I will cover in my next post.

If you would like to obtain a copy of this code from GitHub or fork this project to follow along as I release the code you can find this project here: OCR on GitHub


Much Love,
~Joy

Lets teach AI how to read using PHP II

Welcome back! I hope you enjoyed the last post in this series:

Lets teach AI how to read using PHP

We last left off by creating a set of training images for the OCR Neural Network we are building.

This time we will convert the training images we generated into training data that the OCR ANN can use to learn the letters, numbers and symbols in our training images.

This post will be on the short side as the operation is rather simple and the code is commented and referenced, so refer to the code below directly. I will cover more of the details in the next post; however, if you have any questions, comments or trouble, please leave them in the “Leave a Reply” section below and I will do my best to help out.

Prior to proceeding, you may find it helpful to refer back to the infographic for this project:

 

generate_training_data.php



<?php
/*
    GenerateTrainingData(
                            (NULL)
                        );
    
    Description: 
        
    This function converts the training images into training data, call it to create a new training data file (ocr.data).
    
    The images will be 10px wide and 16px tall.
    
    [Example: Capital A Training Image]
    
      Pixels     Encoded
    ██████████  0000000000
    ██████████  0000000000
    ██████████  0000000000
    ████  ████  0000110000
    ███    ███  0001111000
    ██  ██  ██  0011001100
    █  ████  █  0110000110
    █  ████  █  0110000110
    █  ████  █  0110000110
    █        █  0111111110
    █  ████  █  0110000110
    █  ████  █  0110000110
    █  ████  █  0110000110
    ██████████  0000000000
    ██████████  0000000000
    ██████████  0000000000

    
    [Example: Capital A Prepared for ANN ]  
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 1 0 0 1 1 0 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
    0.32
    
    This algorithm works by iterating over an image pixel by pixel, evaluating the RGB color values of each pixel and assigning a single numerical value to represent the color of the pixel.
    
    White pixels will be encoded as 1 and black pixels will be 0.
    
    Counts begin with 0.
    
    We proceed from the top left corner of the image which for descriptive purposes could be referred to as $row 0, and $col 0 (0,0) and the bottom most right pixel would be the last processed and would be (15,9). 
    
    All columns on a given row will be encoded prior to proceeding to the next row.
  
    As it is currently written the images should be in a sub-folder named 'images' and the image set should 
    be named from 0.png to the last image in the set e.g. 93.png or other if you changed the code that generates
    the training images.
    
    
    References:       
       file_exists() - http://php.net/manual/en/function.file-exists.php
       die() - http://php.net/manual/en/function.die.php
       fopen() - http://php.net/manual/en/function.fopen.php
       fgets() - http://php.net/manual/en/function.fgets.php
       PHP_EOL - http://php.net/manual/en/reserved.constants.php
       str_replace() - http://php.net/manual/en/function.str-replace.php
       array_push() - http://php.net/manual/en/function.array-push.php
       explode() - http://php.net/manual/en/function.explode.php
       feof() - http://php.net/manual/en/function.feof.php
       fclose() - http://php.net/manual/en/function.fclose.php
       getimagesize() - http://php.net/manual/en/function.getimagesize.php
       count() - http://php.net/manual/en/function.count.php
       fwrite() - http://php.net/manual/en/function.fwrite.php
       imagecreatefrompng() - http://php.net/manual/en/function.imagecreatefrompng.php
       imagecolorat() - http://php.net/manual/en/function.imagecolorat.php
       imagecolorsforindex() - http://php.net/manual/en/function.imagecolorsforindex.php
       imagedestroy() - http://php.net/manual/en/function.imagedestroy.php
       
*/
function GenerateTrainingData() {
    
    /* 
        We will use the $image_array variable to store all the image data as an array.
    
        $image_array[$i][0] = File name / ASCII number
        $image_array[$i][1] = ASCII symbol
        $image_array[$i][2] = Desired output value from the ANN as a floating point number between -1 & 1
        $image_array[$i][3] = Encoded pixel data
    */
    $image_array = array();
    
    /* If no training images don't proceed. Determined by checking for images/0.png */
    if (!file_exists('images/0.png')) {
        /* No Image so do not proceed with the encoding. Output a hyperlink to generate_training_images.php */
        die('No training images run <a href="generate_training_images.php">generate_training_images.php</a> first.');
    }
    
    /* Create an empty file resource that points to ocr.data (where the training data will be stored) */
    $trainingfile = fopen("ocr.data", "w") or die('Unable to open: ocr.data. Ending program.');
    
    /* Create a file resource that points to generate_images.log */
    $logfile = fopen("images/generate_images.log", "r") or die('Unable to open: images/generate_images.log. Ending program.');
    
    
    /*
        Use fgets() to read generate_images.log line by line into a buffer.
        
        Use str_replace() on the buffered data to remove line ending terminators.
        
        Use explode() to split the remaining buffered data using spaces ' ' as
        delimiter into $image_array
                        
        Example generate_images.log excerpt:
        
        0 !
        1 "
        2 #
        3 $
        4 %
        5 &
        6 '
        7 (
        .......
    */    
    while (($buffer = fgets($logfile, 4096)) !== false) {
        $buffer = str_replace(PHP_EOL, '', $buffer);
        array_push($image_array,  explode(' ', $buffer));
    }
    if (!feof($logfile)) {
        echo "Error: unexpected fgets() fail while reading logfile.\n";
    }
    
    /* Close the logfile */
    fclose($logfile);
    
    
    /* 
       Use getimagesize() to obtain the width and height of the training images.
       
       We could just hand code these but I wanted to demonstrate how you could 
       programmatically determine the image dimensions.
       
       $imgsize[0] = width (10)
       $imgsize[1] = height (16)
    */
    $imgsize = getimagesize('images/0.png');
    
    
    /*       
       Next we programmatically determine the number of inputs for the ANN by computing
       the area of the image in pixels. The area of a rectangle is Width * Height, therefore:
       
       10 * 16 = 160 (pixels / inputs) 
    */
    
    $num_of_inputs = $imgsize[0] * $imgsize[1]; 
    
    /* Determine how many images there are */
    $num_of_images = count($image_array);
    
    /* 
        Start writing to the training data file.        
        Write: $num_of_images $num_of_inputs 1
    */
    fwrite($trainingfile, "$num_of_images $num_of_inputs 1" . PHP_EOL);
    
    
    /* Process each image */
    for($i = 0; $i < $num_of_images; $i++){
        $curr_image = "images/$i.png";
        
        /* Load the training image into memory */
        $im = imagecreatefrompng($curr_image);
        
        /* 
            Determine the desired output value for this training image.
            
            Use array_push() to add the desired output to $image_array[$i][2]
        */
        if($i > 0){
            $output_value = 0.01 * $i;
        }else{
            $output_value = 0.00;
        }
        array_push($image_array[$i], $output_value); 
        
        
        
        /* 
            Step through the image and look at each pixel using imagecolorat().
            
            Use imagecolorsforindex() to split $rgb resource to separate $colors array.
            
            Assign the pixel a single value based on its RGB color.
            
            Concatenate all the pixel values into the $pixel_values variable.
        */
        $pixel_values = "";
        for($row = 0; $row < $imgsize[1]; $row++){
            for($col = 0; $col < $imgsize[0]; $col++){
                $rgb = imagecolorat($im, $col, $row);
                $colors = imagecolorsforindex($im, $rgb);
                
                if($colors['red'] >= 225 && $colors['green'] >= 225 && $colors['blue'] >= 225){
                    $pixel_values .= 1 . ' ';
                } else{
                    $pixel_values .= 0 . ' ';
                }
            }
        }
        /*
            Once every pixel has been scanned and encoded for use as inputs
            use array_push() to add the pixel value inputs to $image_array[$i][3]
        */
        array_push($image_array[$i], $pixel_values);
        
        /* Echo links and values for the image */
        echo "<a href='images/" . $image_array[$i][0] . ".png' target='_blank'>" . $image_array[$i][0] . "</a>.png encoded as " . $image_array[$i][2] . "<br>"  . PHP_EOL;
        echo "Pixel Data: " . $image_array[$i][3] . "<br>"  . PHP_EOL;
    
        /* Write the inputs and desired outputs to the training data file */    
        fwrite($trainingfile, $image_array[$i][3] . PHP_EOL . $image_array[$i][2] . PHP_EOL);
        
        /* Free up memory associated with the training image by destroying the resource. */ 
        imagedestroy($im);
    }
    /* All done! Close the training data file. */
    fclose($trainingfile);
}

/* Generate training data from training images. */
GenerateTrainingData();

/* In case the user wishes to review the training data file link to it. */
echo 'Training data: <a href="ocr.data">ocr.data</a><br>' . PHP_EOL;

/* Announce completion and link to next step. */
echo 'All Done! Now run <a href="train_ocr.php">Train OCR</a><br>' . PHP_EOL;



?>

If you run this code the following things will happen:
  1. Each image will be examined pixel by pixel and, based on its color, each pixel will be encoded as a 0 (black) or 1 (white). As mentioned in the previous post, I chose this encoding scheme because I wanted to use high contrast images with the text being white and the background being black; adjust as necessary.
  2. Each pattern of 1’s and 0’s will be assigned a floating point value between 0 and 1 (0.01 × the image number, so 0.00 for image 0 up to 0.93 for image 93).
  3. All the encoded patterns and their values will be saved to a training data file for later use as inputs for the ANN.
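
To make the layout of the training data file easier to picture, here is a minimal sketch that builds the same structure in memory for two tiny made-up “images”. build_training_data() and the sample arrays are hypothetical; the layout itself (a header line, then one line of inputs followed by one line of outputs per image) is what the fwrite() calls above produce:

```php
<?php
// Build the contents of a training data file in memory.
// Layout written by generate_training_data.php:
//   header line:  "<number of samples> <inputs per sample> <outputs per sample>"
//   then, for every sample, one line of inputs followed by one line of outputs.
// build_training_data() is a hypothetical name used for this sketch.
function build_training_data(array $samples): string
{
    $num_inputs = count($samples[0]['inputs']);
    $lines = [count($samples) . " $num_inputs 1"];

    foreach ($samples as $sample) {
        $lines[] = implode(' ', $sample['inputs']);
        $lines[] = (string) $sample['output'];
    }

    return implode(PHP_EOL, $lines) . PHP_EOL;
}

// Two made-up 4 pixel "images" to keep the output readable.
$samples = [
    ['inputs' => [0, 1, 1, 0], 'output' => 0.00],
    ['inputs' => [1, 0, 0, 1], 'output' => 0.01],
];

echo build_training_data($samples);
```

For the real data set the header line would read 94 160 1: 94 images, 160 inputs per image and 1 output.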

Now comes the real fun! In my next post we will train the OCR Neural Network and take it for a test drive! 😛

If you would like to obtain a copy of this code from GitHub or fork this project to follow along as I release the code you can find this project here: OCR on GitHub


Much Love,
~Joy

Lets teach AI how to read using PHP

OCR is a practical example of Optical Character Recognition using FANN. While this example is limited and does make mistakes, the concepts illustrated by OCR can be applied to a more robust stacked network that uses feature extraction and convolution layers to recognize text of any font in any size image.

At the end of this series of tutorials you will be able to build Neural Networks using PHP that can read characters from images! I will be giving you actual working code!

As mentioned, this will be a series of posts so that I don’t overwhelm you guys with too much information all at once and so I don’t have to sit here and type ad infinitum (infinite recursion).

😛

OCR (Optical Character Recognition) isn’t exactly a new subject but, surprisingly, it’s something that few computer scientists have actual experience building! Further, any examples you see are descriptions at best that frequently “devolve” into a math lesson that ultimately glosses over practical application and important implementation details!

Don’t get me wrong, I love math but that isn’t required to start learning. The FANN Library will act as an abstraction layer so we can focus on our data and objectives and not complex differential equations.

Additionally, all the code will be thoroughly documented, intentionally simplified and referenced so that even a student with minimal experience can benefit. I happen to believe that Neural Networks are complex enough already, and the more people who know how to build and deploy these systems, the faster we can find solutions to the most horrible problems we face as a species (currently incurable diseases, famine, wars over resources, the global energy crisis… need I go on?).

So, I am going to provide you with the tools and basic knowledge of how to build POWERFUL artificial intelligence systems and deploy them to cloud servers, not too shabby eh? 😉

Not so you can go get rich building games and businesses (which you could easily do and that’s fine) but it is my very real hope that at least some of you can apply these tools to help make the world a better place, start in your own community today!

In a very surreal way I am reminded of a lyric from the song “I’d Love to Change the World” by Ten Years After:

I’d love to change the world, But I don’t know what to do, So I’ll leave it up to you-ooo-ooo

It’s rather haunting how true that actually is, isn’t it?

If you do happen to start a business using these tools and techniques or you simply appreciate the content that I create please support me on Patreon and please share this project with your followers, friends and coworkers on social media.

Now, to begin you will need an environment to build your ANN (Artificial Neural Network). Rather than reproduce the steps here, you can follow this tutorial I wrote to get set up: Getting started with Neural Networks using the FANN library, PHP and C9.io

Go follow the setup process and come back, I’ll wait. 😉

So, in order for a Neural Network to be useful it needs a “problem space” (basically the thing we want it to learn or do) and a “training set” (examples of data similar to the data the ANN will encounter in the “problem space” but which already has a known value or solution). In this case, because we are teaching the ANN to read characters from images, we need a set of images with alpha-numeric characters in them.

Further, before getting started we have to make sure that we don’t violate the licensing of any training set we use, so to keep things simple for this tutorial I will show you how to generate your own basic training data programmatically.

Because the training set will be generated programmatically rather than by hand, we are excluding handwritten characters from this example ANN. With additional training sets and convolution layers you could add that functionality to this neural network but, again, I want this example to be understandable by everyone.

So with that being said, let’s begin by generating the training set of images!

We can use the PHP GD library, which is a library used to manipulate images.

The GD Lib is almost always installed (compiled into PHP) by your host already, and you have probably used it in the past (even if you didn’t realize it) for your other projects, so I won’t cover it in too much detail. Suffice it to say it’s pretty easy to get access to.

The images in our simple training set will be 10 pixels wide and 16 pixels high.

(10*16 = 160 pixels per image)

And in this example I will use only black and white for simplicity, white will be 1 and black will be 0. It’s completely arbitrary and you could reverse the colors if you wanted but white text on black seems to be high contrast so I decided to use that.

It’s also worth noting that fundamentally there is no reason why you can’t use floating point values to represent the entire color spectrum or just grayscale.

Switching to a float would allow you to encode more information as a gradient however this will increase complexity of your ANN and you may require more hidden neurons, layers or both in addition to increased training epochs. In reality you would likely be MUCH better off adding convolutions and building a more robust ANN.
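
For the curious, here is a minimal sketch of what a floating point grayscale encoding could look like. This is not used anywhere in this project; encode_pixel_gray() is a hypothetical function, and averaging the three channels is an assumption I made to keep it simple (weighted brightness formulas also exist):

```php
<?php
// Hypothetical alternative to the 0/1 encoding: map a pixel to a float
// between 0.0 (black) and 1.0 (white) by averaging its RGB channels.
// A plain average is an assumption made for simplicity.
function encode_pixel_gray(int $red, int $green, int $blue): float
{
    return round(($red + $green + $blue) / (3 * 255), 4);
}

echo encode_pixel_gray(0, 0, 0) . PHP_EOL;       // 0
echo encode_pixel_gray(255, 255, 255) . PHP_EOL; // 1
echo encode_pixel_gray(128, 128, 128) . PHP_EOL; // 0.502
```

Black and white still land on 0 and 1, so the encoding used in this tutorial is simply the two extreme ends of this scale.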

This ANN does not use convolution layers and will make a few mistakes from time to time. My point is that this is a simplified, stripped down, easy to understand example; however, with a little work you could build this into a very robust OCR ANN.

So now let’s look at a sample training image as well as dive into the code to generate them!

generate_training_images.php



<?php

/*
    NewImage( 
	          (int)$char, 
	          (mixed)$curr_image, 
			  (int)$x , 
			  (int)$y, 
			  (file resource)$logfile
			);
	
	Description: 
	
	This function actually creates the images. 
	
	It is written with an iterative process in mind where more than a single training image (batch creation) would be generated however a single image can be generated if you prefer.
	
	I have created the GenerateTrainingImages() function that will call this function for you so you *could*
	ignore this function but its the one that does the real work. 
		
	$char is the ASCII code of the character to draw, chr() is used to convert the number passed to an ASCII character.
	
	$curr_image is a mixed type variable used to name the image that is being generated. In this
	case (i.e how I used it in GenerateTrainingImages()) I have used it as a number so it's easy to 
	iterate over the image files using a simple for loop to programmatically create the images we
    need however you could create an array of strings or chars and use those instead however that is
	not quite as simple as the implementation shown here.
	
	$x & $y are direct pass-through variables for the x & y coordinate placement variables as defined
	in the imagestring() function documentation: http://php.net/manual/en/function.imagestring.php
    
	The main reason why I made them "pass-through" rather than hard coding them in the NewImage()
    function is simply that you may want to vary or stagger the placement of the character within the images.
	
	$logfile is a file resource variable that points to the file that will log results of generating
	the training images. You may want to use this file in later steps or for reference. Please note that 
	NewImage() does not open its own access to the file so you will need to open the file resource yourself
	prior to calling NewImage().
	
	
	Example NewImage() Usage:
	
	// Create $logfile resource
	$logfile = fopen("images/generate_images.log", "w") or die('Error: Unable to open: images/generate_images.log. Ending program.');
	
	// This will create a b/w image of an exclamation mark(!) named 0.png
	NewImage(33, 0, 1, 0, $logfile);
	
	// Close log file 
    fclose($logfile);
	
	
	References:	   
	   chr() - http://php.net/manual/en/function.chr.php
	   imagecreate() - http://us.php.net/manual/en/function.imagecreate.php
	   imagecolorallocate() - http://us.php.net/manual/en/function.imagecolorallocate.php
	   imagestring() - http://us.php.net/manual/en/function.imagestring.php
	   imagepng() - http://us.php.net/manual/en/function.imagepng.php
	   imagedestroy() - http://us.php.net/manual/en/function.imagedestroy.php
	   fwrite() - http://us.php.net/manual/en/function.fwrite.php
*/
function NewImage($char, $curr_image, $x , $y, $logfile){
    /* Size the images */
    $width = 10; // px
    $height= 16; // px
	
	/* Set the filename */
    $file_name = $curr_image . '.png';


    /* Create the image resource 
	
	   Note: The @ operator is used to suppress any errors generated by php expressions. 
	         http://us.php.net/manual/en/language.operators.errorcontrol.php
			 
	         I use it to suppress any errors from @imagecreate() and instead cast my own
			 error message by adding an "or die()" to the @imagecreate() statement.
	*/
    $image = @imagecreate($width, $height) or die("Error: Unable to Initialize Image Stream");
	
	
	/* Add colors to the image resource	
	   
	   Note: colors are defined by the RGB color model
	   https://en.wikipedia.org/wiki/RGB_color_model

	   RGB Black: (0, 0, 0)
	   RGB White: (255, 255, 255)
	*/
    $background_color = imagecolorallocate($image, 0, 0, 0);
    $text_color = imagecolorallocate($image, 255, 255, 255);
	
	/* Add $char to the image resource */
    imagestring($image, 5, $x, $y,  chr($char), $text_color);
	
    /* Draw the image buffer stream to the file */
    imagepng($image, './images/' . $file_name);
	
	/* Free the memory associated with the $image resource by using imagedestroy() */
    imagedestroy($image); 

    /* write to log file */ 
    fwrite($logfile, $curr_image . ' ' . chr($char). PHP_EOL);
	
	/* echo results and link to file for review */
	echo "<a href='images/$curr_image.png' target='_blank'>" . $curr_image . ".png</a> - " . chr($char) . " ...complete.<br>" . PHP_EOL;
}



/*
	GenerateTrainingImages(
	                        (NULL)
	                      );
	
	Description:
	
	This function manages the creation of the training images. Call this function to create a new training set. 

	
	Example GenerateTrainingImages() Usage:
	
	GenerateTrainingImages();
	
	
	References:
	file_exists() - http://us.php.net/manual/en/function.file-exists.php
	mkdir() - http://us.php.net/manual/en/function.mkdir.php
	fopen() - http://us.php.net/manual/en/function.fopen.php
	fclose() - http://us.php.net/manual/en/function.fclose.php
	
*/
function GenerateTrainingImages() { 

    /* Check if the "images" folder was already created */
    if (file_exists('images')) {
        echo "The images folder already exists, no changes to the images folder were made!<br>" . PHP_EOL;
    }
    else{ /* There is no "images" folder */
        echo "The images folder does not exist, creating one... ";
		
		/* try to create one with the correct folder permissions */
        if (!mkdir("images", 0755, true)) {
            die('fail! Ending program.'); /* Failed to create the folder */
        }else{
            echo 'success!<br>' . PHP_EOL; /* Successfully created the folder */
        }
    }

    /* Create log file resource to log the results of generating the training images */
    $logfile = fopen("images/generate_images.log", "w") or die('Unable to open: images/generate_images.log. Ending program.');

    /* Current number of generated images */    
    $curr_image = 0;

	
    /* 
        Training images set is defined in ASCII 
		https://en.wikipedia.org/wiki/ASCII
		
        start: ascii dec 33 (!) 
        stop:  ascii dec 126 (~) 
    */
	
	$start = 33;
	$stop = 126;
	$total = $stop - $start + 1; /* Add 1 because the range includes both $start and $stop */
	echo "Starting batch creation of " .  $total . " images.<br><br>" . PHP_EOL;
	
    for($i = $start; $i <= $stop; $i++) {
        NewImage($i, $curr_image, 1, 0, $logfile);
        $curr_image++;
    }
	echo "<br>Batch complete.<br>" . PHP_EOL;
	echo "Log: <a href='images/generate_images.log' target='_blank'>generate_images.log</a><br>" . PHP_EOL;
    
	
    /* Close log file */
    fclose($logfile);
}

/* Kick the tires and light the fires! */
GenerateTrainingImages();

/* Announce completion and link to next step */
echo 'All Done! Now run <a href="generate_training_data.php">generate_training_data.php</a><br>' . PHP_EOL;



It looks like a lot of code, I know! If you are feeling overwhelmed delete my comments and you will see that the actual code is quite short and very simple.

If you run this code the following things will happen:
  1. A sub-folder called ‘images’ will be created for you where you ran generate_training_images.php.
  2. 94 training images will be generated for you and saved in the images sub-folder.
  3. A log file named ‘generate_images.log’ will be created.
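
If you want to double check the numbering before running anything, here is a minimal sketch that reproduces the image number to symbol mapping without creating any files. log_lines() is a hypothetical helper, but the range (ASCII 33 to 126) and the line format match what NewImage() writes to the log:

```php
<?php
// Reproduce the image-number-to-symbol mapping without creating any files.
// log_lines() is a hypothetical helper; the default range matches
// generate_training_images.php (ASCII 33 "!" through ASCII 126 "~").
function log_lines(int $start = 33, int $stop = 126): array
{
    $lines = [];
    $curr_image = 0;
    for ($code = $start; $code <= $stop; $code++) {
        $lines[] = $curr_image . ' ' . chr($code);
        $curr_image++;
    }
    return $lines;
}

$lines = log_lines();
echo count($lines) . PHP_EOL; // 94
echo $lines[0] . PHP_EOL;     // 0 !
echo $lines[32] . PHP_EOL;    // 32 A
echo $lines[93] . PHP_EOL;    // 93 ~
```

So image 32.png is the capital A, which is the image number you will see again when the images are encoded in the next post.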

At this point we are ready to use these images to create the training set that the Neural Network will use. We’ll cover that in the next post in this series.

If you would like to obtain a copy of this code from GitHub or fork this project to follow along as I release the code you can find this project here: OCR on GitHub

If you have any questions, comments or trouble, please leave them in the comments below and I will do my best to help out.


Much Love,
~Joy
