What if we could teach our writer bot (see Bot Generated Stories & Bot Generated Stories II) how to understand the meaning of a sentence? Would that improve the bot’s ability to understand what was said?

Well, conceptually it’s not that different from what we do with programming languages and if you ask me, it sure seems like a good place to start with Natural Languages!

Natural Language

A “Natural Language” is any language humans use (though I guess we would include alien languages should we ever meet some 😛 ) that evolved over time through use rather than by design.

One of the first things you might notice when you contrast a natural language, like English, with artificially designed languages that are used for specific purposes (like programming a computer) is that natural languages have much more complexity and variation.

Further, natural languages tend to be far more “general purpose” than even the most capable artificial general purpose programming languages.

For example, just think of all the ways you can say you love Ice Cream or that the room temperature is hot.

Now, if you are a programmer, contrast that with how many ways there are to create a loop in a program.

The answer is more than a few (off the top of my head I can think of for, foreach, while, do while, goto, and recursive functions), but certainly many times fewer than the number of word combinations you could use to describe how green something is!
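To make that list concrete, here is a rough sketch of the same tiny job (collect the numbers 0 up to, but not including, $n) written with each of those loop styles. The function names are made up for this illustration, and I’m assuming $n is at least 1 so the foreach and do while versions behave like the others.

```php
<?php
// Six hypothetical helpers that all return [0, 1, ..., $n - 1] (assumes $n >= 1).
function countWithFor(int $n): array {
    $out = [];
    for ($i = 0; $i < $n; $i++) { $out[] = $i; }
    return $out;
}

function countWithForeach(int $n): array {
    $out = [];
    foreach (range(0, $n - 1) as $i) { $out[] = $i; }
    return $out;
}

function countWithWhile(int $n): array {
    $out = [];
    $i = 0;
    while ($i < $n) { $out[] = $i; $i++; }
    return $out;
}

function countWithDoWhile(int $n): array {
    $out = [];
    $i = 0;
    do { $out[] = $i; $i++; } while ($i < $n); // body always runs at least once
    return $out;
}

function countWithGoto(int $n): array {
    $out = [];
    $i = 0;
    top:
    if ($i < $n) {
        $out[] = $i;
        $i++;
        goto top; // jump back to the label to repeat
    }
    return $out;
}

function countWithRecursion(int $n, int $i = 0): array {
    if ($i >= $n) {
        return [];
    }
    return array_merge([$i], countWithRecursion($n, $i + 1));
}

// Every style produces the same list.
$styles = ['countWithFor', 'countWithForeach', 'countWithWhile',
           'countWithDoWhile', 'countWithGoto', 'countWithRecursion'];
foreach ($styles as $fn) {
    echo $fn, ': ', json_encode($fn(3)), PHP_EOL;
}
```

Six spellings, one meaning, and that is about as varied as a programming language gets.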

How then, do computers understand what programmers say?

Click The Infographic for Full Size

Write Code

First, a programmer must write some valid code, like this for example:

$pi = 3.1415926535898;
for($i = 0; $i < $pi; $i++){
    echo $i . PHP_EOL;
}
echo 'PI is equal to: ' . ($pi + PI()) / 2;

Result:

0
1
2
3
PI is equal to: 3.1415926535898

The computer can understand what this code means and does exactly what it was asked to do, but how?

Lexical Analysis

Lexical Analysis occurs in the early stages of the Compilation/Interpretation process, where the source code or script for a program is scanned by a program called a lexer. The lexer tries to find the smallest chunk of “whole” information, called a “Lexeme”, and assigns it a “type” or “tag” that denotes its specific purpose or function.
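If you want to peek at a real lexer, PHP exposes its own through the tokenizer extension. This is only a small sketch: token_get_all() and token_name() are real PHP functions, but the "character" label and the choice to skip whitespace and the opening tag are mine, and PHP’s tag names (T_VARIABLE, T_DNUMBER and so on) are different from the ones I use in this post.

```php
<?php
// Ask PHP's own lexer to break one line of code into tagged lexemes.
$source = '<?php $pi = 3.1415926535898;';

$pairs = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token)) {
        // Most tokens arrive as [id, text, line number].
        [$id, $text] = $token;
        if ($id === T_WHITESPACE || $id === T_OPEN_TAG) {
            continue; // not interesting for this illustration
        }
        $pairs[] = [token_name($id), $text];
    } else {
        // Single characters such as "=" and ";" arrive as plain strings.
        // "character" is my own label for them, not a PHP token name.
        $pairs[] = ['character', $token];
    }
}

echo json_encode($pairs), PHP_EOL;
```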

Lexed Code

You might be wondering what lexed code looks like. If we lex the example code from above we get a list that would be something like this if we represent it as JSON:

[
	["identifier","$pi"],
	["operator-equals","="],
	["literal-float","3.1415926535898"],
	["separator-terminator",";"],
	["keyword-for","for"],
	["separator-open-parentheses","("],
	["identifier","$i"],
	["operator-equals","="],
	["literal-integer","0"],
	["separator-terminator",";"],
	["identifier","$i"],
	["operator-less-than","<"],
	["identifier","$pi"],
	["separator-terminator",";"],
	["identifier","$i"],
	["operator-increment","++"],
	["separator-close-parentheses",")"],
	["separator-open-curl","{"],
	["keyword-echo","echo"],
	["identifier","$i"],
	["operator-concatenate","."],
	["keyword-end-of-line","PHP_EOL"],
	["separator-terminator",";"],
	["separator-close-curl","}"],
	["keyword-echo","echo"],
	["literal-string","PI is equal to: "],
	["operator-concatenate","."],
	["separator-open-parentheses","("],
	["identifier","$pi"],
	["operator-plus","+"],
	["keyword-pi","PI"],
	["separator-open-parentheses","("],
	["separator-close-parentheses",")"],
	["separator-close-parentheses",")"],
	["operator-divide","/"],
	["literal-integer","2"],
	["separator-terminator",";"]
]

What we’ve just done is give each lexeme a tag that unambiguously states its intended role or function.
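For the curious, here is a minimal sketch of how a lexer could produce that kind of list. It is a toy, not how PHP actually does it: the lex() function and its pattern table are hypothetical, it only knows the handful of lexemes used in the example code, and I tag "{" as separator-open-curl so it pairs up with separator-close-curl.

```php
<?php
// A toy lexer: at each position, try the patterns in order and keep the first match.
function lex(string $code): array {
    // Tag => regex. The "A" modifier anchors each match at the current offset.
    // Order matters: floats before integers, "++" before "+".
    $patterns = [
        'keyword-for'                 => '/for\b/A',
        'keyword-echo'                => '/echo\b/A',
        'keyword-end-of-line'         => '/PHP_EOL\b/A',
        'keyword-pi'                  => '/PI\b/A',
        'identifier'                  => '/\$[A-Za-z_]\w*/A',
        'literal-float'               => '/\d+\.\d+/A',
        'literal-integer'             => '/\d+/A',
        'literal-string'              => "/'[^']*'/A",
        'operator-increment'          => '/\+\+/A',
        'operator-plus'               => '/\+/A',
        'operator-equals'             => '/=/A',
        'operator-less-than'          => '/</A',
        'operator-concatenate'        => '/\./A',
        'operator-divide'             => '#/#A',
        'separator-terminator'        => '/;/A',
        'separator-open-parentheses'  => '/\(/A',
        'separator-close-parentheses' => '/\)/A',
        'separator-open-curl'         => '/\{/A',
        'separator-close-curl'        => '/\}/A',
    ];

    $tokens = [];
    $pos = 0;
    $length = strlen($code);
    while ($pos < $length) {
        // Whitespace separates lexemes but is not one itself.
        if (preg_match('/\s+/A', $code, $m, 0, $pos)) {
            $pos += strlen($m[0]);
            continue;
        }
        foreach ($patterns as $tag => $regex) {
            if (preg_match($regex, $code, $m, 0, $pos)) {
                // Drop the surrounding quotes from string literals.
                $lexeme = $tag === 'literal-string' ? substr($m[0], 1, -1) : $m[0];
                $tokens[] = [$tag, $lexeme];
                $pos += strlen($m[0]);
                continue 2; // back to the while loop for the next lexeme
            }
        }
        throw new RuntimeException("Unexpected character at offset $pos");
    }
    return $tokens;
}

echo json_encode(lex('$pi = 3.1415926535898;')), PHP_EOL;
```

Trying the patterns in a fixed order is the simplest way I know to settle ties like "+" versus "++". A real lexer is usually a generated state machine rather than a pile of regular expressions.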

Semantic Analysis

Next comes Parsing. Strictly speaking, parsing is Syntax Analysis and Semantic Analysis is a separate step that follows it, but I’m treating them as one stage here. Together they check the code for mistakes and establish a hierarchy of relationships and meaning so the code can be evaluated using the rules of the language.

Semantic Hierarchy

Parsing groups the expressions into a tree hierarchy that makes it explicitly clear to the computer what we want it to do.

Here is the code above parsed and represented as JSON:

[
   {
      "tags":[
         "identifier",
         "operator-equals",
         "literal-float"
      ],
      "lexemes":[
         "$pi",
         "=",
         "3.1415926535898"
      ],
      "child-expressions":[

      ]
   },
   {
      "tags":[
         "keyword-for"
      ],
      "lexemes":[
         "for"
      ],
      "child-expressions":[
         {
            "tags":[
               "identifier",
               "operator-equals",
               "literal-integer"
            ],
            "lexemes":[
               "$i",
               "=",
               "0"
            ],
            "child-expressions":[

            ]
         },
         {
            "tags":[
               "identifier",
               "operator-less-than",
               "identifier"
            ],
            "lexemes":[
               "$i",
               "<",
               "$pi"
            ],
            "child-expressions":[
               {
                  "tags":[
                     "keyword-echo",
                     "identifier",
                     "operator-concatenate",
                     "keyword-end-of-line"
                  ],
                  "lexemes":[
                     "echo",
                     "$i",
                     ".",
                     "PHP_EOL"
                  ],
                  "child-expressions":[

                  ]
               }
            ]
         },
         {
            "tags":[
               "identifier",
               "operator-increment"
            ],
            "lexemes":[
               "$i",
               "++"
            ],
            "child-expressions":[

            ]
         }
      ]
   },
   {
      "tags":[
         "keyword-echo",
         "literal-string",
         "operator-concatenate"
      ],
      "lexemes":[
         "echo",
         "PI is equal to: ",
         "."
      ],
      "child-expressions":[
         {
            "tags":[
               "operator-divide",
               "literal-integer"
            ],
            "lexemes":[
               "\/",
               "2"
            ],
            "child-expressions":[
               {
                  "tags":[
                     "identifier",
                     "operator-plus",
                     "keyword-pi"
                  ],
                  "lexemes":[
                     "$pi",
                     "+",
                     "PI"
                  ],
                  "child-expressions":[

                  ]
               }
            ]
         }
      ]
   }
]
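To give a feel for how a parser arrives at a hierarchy like this, here is a small recursive descent sketch that handles just the ($pi + PI) / 2 part. Treat it as an illustration under my own assumptions: the three parse functions and the tiny grammar are hypothetical, and the tree it builds uses the more conventional layout of putting each operator on top of its two operands, which is a little different from the JSON above.

```php
<?php
// Hypothetical mini grammar, for illustration only:
//   expression := term   ( "+" term )*
//   term       := factor ( "/" factor )*
//   factor     := operand | "(" expression ")"

function parseExpression(array $tokens, int &$pos): array {
    $left = parseTerm($tokens, $pos);
    while (isset($tokens[$pos]) && $tokens[$pos][0] === 'operator-plus') {
        [$tag, $lexeme] = $tokens[$pos++];
        $right = parseTerm($tokens, $pos);
        $left = ['tag' => $tag, 'lexeme' => $lexeme, 'child-expressions' => [$left, $right]];
    }
    return $left;
}

function parseTerm(array $tokens, int &$pos): array {
    $left = parseFactor($tokens, $pos);
    while (isset($tokens[$pos]) && $tokens[$pos][0] === 'operator-divide') {
        [$tag, $lexeme] = $tokens[$pos++];
        $right = parseFactor($tokens, $pos);
        $left = ['tag' => $tag, 'lexeme' => $lexeme, 'child-expressions' => [$left, $right]];
    }
    return $left;
}

function parseFactor(array $tokens, int &$pos): array {
    if (!isset($tokens[$pos])) {
        throw new RuntimeException('Unexpected end of input');
    }
    [$tag, $lexeme] = $tokens[$pos++];
    if ($tag === 'separator-open-parentheses') {
        // Parentheses only group; they leave no node of their own in the tree.
        $inner = parseExpression($tokens, $pos);
        if (($tokens[$pos][0] ?? null) !== 'separator-close-parentheses') {
            throw new RuntimeException('Expected a closing parenthesis');
        }
        $pos++;
        return $inner;
    }
    $operands = ['identifier', 'literal-integer', 'literal-float', 'keyword-pi'];
    if (!in_array($tag, $operands, true)) {
        throw new RuntimeException("Unexpected lexeme: $lexeme");
    }
    return ['tag' => $tag, 'lexeme' => $lexeme, 'child-expressions' => []];
}

// The tagged lexemes for ($pi + PI) / 2.
$tokens = [
    ['separator-open-parentheses', '('],
    ['identifier', '$pi'],
    ['operator-plus', '+'],
    ['keyword-pi', 'PI'],
    ['separator-close-parentheses', ')'],
    ['operator-divide', '/'],
    ['literal-integer', '2'],
];
$pos = 0;
$tree = parseExpression($tokens, $pos);
echo json_encode($tree, JSON_PRETTY_PRINT), PHP_EOL;
```

The division ends up at the top of the tree with the addition nested beneath it, which is the parser’s way of saying “add first, then divide”.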

Code Evaluation

Once the code has been analysed it can be Evaluated.
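Evaluating a tree mostly means walking it from the bottom up: work out the children, then apply the operator that sits above them. Here is a minimal sketch for the ($pi + PI) / 2 expression. The evaluate() function and the hand-written $tree are assumptions made for this example, and it only knows about + and /.

```php
<?php
// Walk an expression tree bottom-up and return its numeric value.
function evaluate(array $node, array $variables): float {
    if ($node['child-expressions'] === []) {
        // A leaf is a constant, a variable, or a number literal.
        $lexeme = $node['lexeme'];
        if ($lexeme === 'PI') {
            return pi();
        }
        if ($lexeme[0] === '$') {
            return $variables[$lexeme];
        }
        return (float) $lexeme;
    }

    // Evaluate both children first, then combine them with this node's operator.
    [$left, $right] = array_map(
        fn(array $child) => evaluate($child, $variables),
        $node['child-expressions']
    );
    switch ($node['lexeme']) {
        case '+':
            return $left + $right;
        case '/':
            return $left / $right;
    }
    throw new RuntimeException('Unknown operator: ' . $node['lexeme']);
}

// Hand-built tree for ($pi + PI) / 2, with the operator above its operands.
$tree = [
    'lexeme' => '/',
    'child-expressions' => [
        ['lexeme' => '+', 'child-expressions' => [
            ['lexeme' => '$pi', 'child-expressions' => []],
            ['lexeme' => 'PI', 'child-expressions' => []],
        ]],
        ['lexeme' => '2', 'child-expressions' => []],
    ],
];

echo 'PI is equal to: ' . evaluate($tree, ['$pi' => 3.1415926535898]), PHP_EOL;
```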

And since you’re curious, here’s what this code does:

  1. Declares a variable named $pi and sets its value to the number Pi, rounded to 13 decimal places.
  2. A for loop is initialized.
  3. Expression 1 in the for loop is evaluated only once, before the loop begins. It declares a variable named $i and sets it to 0.
  4. Before each iteration of the for loop, Expression 2 is evaluated and, if the result is logically TRUE, the code inside the loop runs. The expression compares the values of $i and $pi: if $i is less than $pi, the loop runs.
  5. Expression 3 is the third and final expression in the for loop’s header. It runs after each iteration of the loop and increments the value of $i by 1 using the ++ increment operator.
  6. Each iteration of the loop echoes the value of $i to the screen.
  7. Once the for loop terminates, the computer takes the value of $pi and adds it to the value returned by the PHP function PI() (which returns an approximation of Pi, the same value as the M_PI constant).
  8. The resulting sum is then divided by 2, giving us Pi again (at least to the 13 decimal places PHP displays here).
  9. This value is then concatenated with the string “PI is equal to: ” and the whole string is echoed to the screen.
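If the order of the three for loop expressions in steps 3 to 6 feels slippery, it can help to rewrite the loop as the while loop it behaves like. This is just a sketch: runLoop() is a name I made up, and it collects the output in a string instead of echoing directly so the result is easy to check. The rewrite holds as long as the loop body doesn’t use continue.

```php
<?php
// The example's for loop, spelled out as an equivalent while loop.
function runLoop(float $pi): string {
    $output = '';
    $i = 0;                        // Expression 1: runs once, before the loop
    while ($i < $pi) {             // Expression 2: checked before every iteration
        $output .= $i . PHP_EOL;   // the loop body
        $i++;                      // Expression 3: runs after every iteration
    }
    return $output;
}

$pi = 3.1415926535898;
echo runLoop($pi);
echo 'PI is equal to: ' . ($pi + pi()) / 2;
```

The loop stops as soon as $i reaches 4, because 4 is not less than $pi, which is why the output ends at 3.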

This code does nothing of real value, but it’s short enough for us to lex by hand and long enough to give interesting results with nested child expressions (see the infographic hierarchy above).

How Does This Apply To Writer Bot?

Well, I guess the next question to ask is… can we lex a natural language? We’ll talk about that in my next post so remember to like, and follow!

Also, don’t forget to share this post with someone you think would find it interesting and leave your thoughts in the comments.

And before you go, consider helping me grow…


Help Me Grow

Your financial support allows me to dedicate the time & effort required to develop all the projects & posts I create and publish here on my blog.

If you would also like to financially support my work and add your name on my Sponsors page then visit my Patreon page and pledge $1 or more a month.

As always, feel free to suggest a project you would like to see built, or a topic you would like to hear me discuss, in the comments. If it sounds interesting, it might just get featured here on my blog for everyone to enjoy.

Much Love,

~Joy