Geek Girl Joy

Artificial Intelligence, Simulations & Software

Month

September 2018

Can a Bot Understand a Sentence?

What if we could teach our writer bot (see Bot Generated Stories & Bot Generated Stories II) how to understand the meaning of a sentence? Would that improve the bot’s ability to understand what was said?

Well, conceptually it’s not that different from what we do with programming languages and if you ask me, it sure seems like a good place to start with Natural Languages!

Natural Language

A “Natural Language” is any language humans use (though I guess we would include alien languages should we ever meet some 😛 ) that evolved over time through use rather than by design.

One of the first things you might notice when you contrast a natural language, like English, with artificially designed languages that are used for specific purposes (like programming a computer) is that natural languages have much more complexity and variation.

Further, natural languages tend to be far more “general purpose” than even the most capable artificial general purpose programming languages.

For example, just think of all the ways you can say you love Ice Cream or that the room temperature is hot.

Now if you are a programmer, contrast that with how many ways there are to create a loop in a program.

The answer is more than a few (off the top of my head I can think of for, foreach, while, do while, goto and recursive functions) but certainly many times fewer than the number of possible combinations of words you could use to describe how green something is!
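To make that list concrete, here is a minimal sketch of those six loop forms in PHP, each one collecting the numbers 0, 1 and 2. The helper name count_recursive() is something I made up for illustration.

```php
<?php
// Six ways to produce 0, 1, 2 in PHP.
// count_recursive() is a hypothetical helper, not a built-in function.
$limit = 3;

$for = [];
for ($i = 0; $i < $limit; $i++) { $for[] = $i; }

$foreach = [];
foreach (range(0, $limit - 1) as $n) { $foreach[] = $n; }

$while = [];
$i = 0;
while ($i < $limit) { $while[] = $i; $i++; }

$doWhile = [];
$i = 0;
do { $doWhile[] = $i; $i++; } while ($i < $limit);

$goto = [];
$i = 0;
top:
if ($i < $limit) { $goto[] = $i; $i++; goto top; }

function count_recursive(int $i, int $limit): array
{
    if ($i >= $limit) { return []; }
    return array_merge([$i], count_recursive($i + 1, $limit));
}
$recursive = count_recursive(0, $limit);

echo implode(' ', $for) . PHP_EOL;
```

Six forms, one result. Now try listing every English sentence that means "count to two".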

How then, do computers understand what programmers say?

Click The Infographic for Full Size

Write Code

First, a programmer must write some valid code, like this for example:

$pi = 3.1415926535898;
for($i = 0; $i < $pi; $i++){
    echo $i . PHP_EOL;
}
echo 'PI is equal to: ' . ($pi + PI()) / 2;

Result:

0
1
2
3
PI is equal to: 3.1415926535898

The computer can understand what this code means and does exactly what it was asked to do, but how?

Lexical Analysis

Lexical Analysis occurs in the early stages of the compilation or interpretation process. The source code or script for a program is scanned by a program called a lexer, which tries to find the smallest chunks of “whole” information, called “lexemes”, and assigns each one a “type” or “tag” that denotes its specific purpose or function.

Lexed Code

You might be wondering what lexed code looks like. If we lex the example code from above we get a list that would be something like this if we represent it as JSON:

[
	["identifier","$pi"],
	["operator-equals","="],
	["literal-float","3.1415926535898"],
	["separator-terminator",";"],
	["keyword-for","for"],
	["separator-open-parentheses","("],
	["identifier","$i"],
	["operator-equals","="],
	["literal-integer","0"],
	["separator-terminator",";"],
	["identifier","$i"],
	["operator-less-than","<"],
	["identifier","$pi"],
	["separator-terminator",";"],
	["identifier","$i"],
	["operator-increment","++"],
	["separator-close-parentheses",")"],
	["separator-open-curl","{"],
	["keyword-echo","echo"],
	["identifier","$i"],
	["operator-concatenate","."],
	["keyword-end-of-line","PHP_EOL"],
	["separator-terminator",";"],
	["separator-close-curl","}"],
	["keyword-echo","echo"],
	["literal-string","PI is equal to: "],
	["operator-concatenate","."],
	["separator-open-parentheses","("],
	["identifier","$pi"],
	["operator-plus","+"],
	["keyword-pi","PI"],
	["separator-close-parentheses",")"],
	["operator-divide","/"],
	["literal-integer","2"],
	["separator-terminator",";"]
]

What we’ve just done is give each lexeme a tag that is unambiguous as to what its intended role or function is.
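If you would rather not lex by hand, the same idea can be sketched in a few lines of PHP. This is only a toy, assuming a small table of regular expressions is enough for our example program. The function name lex() and the rule table are my own invention, the tag names follow the list above, and string lexemes keep their quotes here.

```php
<?php
// A minimal hand-rolled lexer sketch. Tag names follow the article's list;
// lex() and the $rules table are assumptions made for illustration.
function lex(string $source): array
{
    // Order matters: longer or more specific patterns come first.
    $rules = [
        'whitespace'                  => '/^\s+/',
        'literal-float'               => '/^\d+\.\d+/',
        'literal-integer'             => '/^\d+/',
        'literal-string'              => "/^'[^']*'/",
        'identifier'                  => '/^\$[A-Za-z_]\w*/',
        'keyword-for'                 => '/^for\b/',
        'keyword-echo'                => '/^echo\b/',
        'keyword-end-of-line'         => '/^PHP_EOL\b/',
        'keyword-pi'                  => '/^PI\b/i',
        'operator-increment'          => '/^\+\+/',
        'operator-plus'               => '/^\+/',
        'operator-equals'             => '/^=/',
        'operator-less-than'          => '/^</',
        'operator-concatenate'        => '/^\./',
        'operator-divide'             => '/^\//',
        'separator-terminator'        => '/^;/',
        'separator-open-parentheses'  => '/^\(/',
        'separator-close-parentheses' => '/^\)/',
        'separator-open-curl'         => '/^\{/',
        'separator-close-curl'        => '/^\}/',
    ];
    $tokens = [];
    $rest = $source;
    while ($rest !== '') {
        foreach ($rules as $tag => $pattern) {
            if (preg_match($pattern, $rest, $match)) {
                if ($tag !== 'whitespace') {
                    $tokens[] = [$tag, $match[0]]; // [tag, lexeme]
                }
                $rest = (string) substr($rest, strlen($match[0]));
                continue 2; // go back to the while loop with the remaining text
            }
        }
        throw new RuntimeException('Unexpected character: ' . $rest[0]);
    }
    return $tokens;
}

echo json_encode(lex('$pi = 3.1415926535898;')) . PHP_EOL;
```

A real lexer handles far more (comments, double quoted strings, every operator) but the loop is the same: find the next lexeme, tag it, move on.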

Semantic Analysis

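Then Semantic Analysis (which I’m lumping together with Parsing here, although strictly speaking parsing is syntactic analysis and the semantic checks come after it) checks the code to ensure that there are no mistakes and establishes a hierarchy of relationships and meaning so the code can be evaluated using the rules of the language.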

Semantic Hierarchy

Parsing groups the expressions into a tree hierarchy that makes it explicitly clear to the computer what we want it to do.

Here is the code above parsed and represented as JSON:

[
   {
      "tags":[
         "identifier",
         "operator-equals",
         "literal-float"
      ],
      "lexemes":[
         "$pi",
         "=",
         "3.1415926535898"
      ],
      "child-expressions":[

      ]
   },
   {
      "tags":[
         "keyword-for"
      ],
      "lexemes":[
         "for"
      ],
      "child-expressions":[
         {
            "tags":[
               "identifier",
               "operator-equals",
               "literal-integer"
            ],
            "lexemes":[
               "$i",
               "=",
               "0"
            ],
            "child-expressions":[

            ]
         },
         {
            "tags":[
               "identifier",
               "operator-less-than",
               "identifier"
            ],
            "lexemes":[
               "$i",
               "<",
               "$pi"
            ],
            "child-expressions":[
               {
                  "tags":[
                     "keyword-echo",
                     "identifier",
                     "operator-concatenate",
                     "keyword-end-of-line"
                  ],
                  "lexemes":[
                     "echo",
                     "$i",
                     ".",
                     "PHP_EOL"
                  ],
                  "child-expressions":[

                  ]
               }
            ]
         },
         {
            "tags":[
               "identifier",
               "operator-increment"
            ],
            "lexemes":[
               "$i",
               "++"
            ],
            "child-expressions":[

            ]
         }
      ]
   },
   {
      "tags":[
         "keyword-echo",
         "literal-string",
         "operator-concatenate"
      ],
      "lexemes":[
         "echo",
         "PI is equal to: ",
         "."
      ],
      "child-expressions":[
         {
            "tags":[
               "operator-divide",
               "literal-integer"
            ],
            "lexemes":[
               "\/",
               "2"
            ],
            "child-expressions":[
               {
                  "tags":[
                     "identifier",
                     "operator-plus",
                     "keyword-pi"
                  ],
                  "lexemes":[
                     "$pi",
                     "+",
                     "PI"
                  ],
                  "child-expressions":[

                  ]
               }
            ]
         }
      ]
   }
]
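To give a feel for how a tree like this gets built, here is a tiny recursive-descent parser sketch for just the arithmetic in the last line, ($pi + PI) / 2. It is not the parser PHP uses and it does not produce the exact JSON shape above. The node layout ('op', 'left', 'right') and all the function names are assumptions for illustration.

```php
<?php
// A tiny recursive-descent parser sketch for arithmetic like ($pi + PI) / 2.
// Node shape and function names are hypothetical, chosen for illustration.

// expression := term (('+' | '-') term)*
function parse_expression(array $tokens, int &$pos)
{
    $node = parse_term($tokens, $pos);
    while ($pos < count($tokens) && in_array($tokens[$pos], ['+', '-'], true)) {
        $op = $tokens[$pos++];
        $node = ['op' => $op, 'left' => $node, 'right' => parse_term($tokens, $pos)];
    }
    return $node;
}

// term := factor (('*' | '/') factor)*
function parse_term(array $tokens, int &$pos)
{
    $node = parse_factor($tokens, $pos);
    while ($pos < count($tokens) && in_array($tokens[$pos], ['*', '/'], true)) {
        $op = $tokens[$pos++];
        $node = ['op' => $op, 'left' => $node, 'right' => parse_factor($tokens, $pos)];
    }
    return $node;
}

// factor := '(' expression ')' | operand
function parse_factor(array $tokens, int &$pos)
{
    if ($pos >= count($tokens)) {
        throw new RuntimeException('Unexpected end of input');
    }
    $token = $tokens[$pos++];
    if ($token === '(') {
        $node = parse_expression($tokens, $pos);
        if (($tokens[$pos] ?? null) !== ')') {
            throw new RuntimeException('Expected a closing parenthesis');
        }
        $pos++;
        return $node;
    }
    return $token; // a leaf: an identifier, a constant name or a number
}

$tokens = ['(', '$pi', '+', 'PI', ')', '/', '2'];
$pos = 0;
$tree = parse_expression($tokens, $pos);
echo json_encode($tree, JSON_PRETTY_PRINT) . PHP_EOL;
```

Just like in the hand-made hierarchy, the addition ends up nested inside the division, because the parentheses say it has to happen first.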

Code Evaluation

Once the code has been analysed it can be Evaluated.

And since you’re curious, here’s what this code does:

  1. Declares a variable named $pi and sets its value to the number Pi, written out to 13 decimal places.
  2. A for loop is initialized.
  3. Expression 1 in the for loop is evaluated only once, before the loop begins, and it declares a variable named $i with the value 0.
  4. Before each pass of the for loop, Expression 2 is evaluated as a Boolean expression and if the result is logically TRUE then the code inside the loop runs. The expression compares the values of $i and $pi: if $i is less than $pi then the loop runs.
  5. Expression 3 is the third and final expression in the for loop header. It runs after each iteration of the loop. The value of $i is incremented by 1 using the ++ increment operator.
  6. Each iteration of the loop echoes the value of $i to the screen.
  7. Once the for loop terminates the computer takes the value of $pi and adds it to the value returned by the PHP language function PI() (PHP also provides the same value as the constant M_PI).
  8. That resulting sum is then divided by 2, giving us Pi again (to the precision PHP prints by default).
  9. This value is then concatenated with the string “PI is equal to: ” and the whole string value is then echoed to the screen.

This code does nothing of real value but it’s sufficiently short for us to lex by hand and long enough that it provides interesting results with nested child expressions, see the infographic hierarchy above.
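For the curious, the evaluation step can be sketched too. The snippet below walks a small expression tree for ($pi + PI) / 2 from the leaves up. The tree shape and the evaluate() helper are my own assumptions for illustration and not what the PHP engine does internally.

```php
<?php
// Sketch of the evaluation step: walk a parsed expression tree bottom-up.
// The tree shape and evaluate() are hypothetical, for illustration only.
function evaluate($node, array $variables): float
{
    if (is_array($node)) {
        // Evaluate both children first, then apply this node's operator.
        $left  = evaluate($node['left'], $variables);
        $right = evaluate($node['right'], $variables);
        switch ($node['op']) {
            case '+': return $left + $right;
            case '-': return $left - $right;
            case '*': return $left * $right;
            case '/': return $left / $right;
        }
        throw new RuntimeException('Unknown operator: ' . $node['op']);
    }
    if (is_string($node) && array_key_exists($node, $variables)) {
        return (float) $variables[$node]; // a named value such as $pi
    }
    if (is_numeric($node)) {
        return (float) $node; // a literal such as 2
    }
    throw new RuntimeException('Unknown operand: ' . $node);
}

// ($pi + PI) / 2, the expression from the last line of the example program.
$tree = [
    'op'    => '/',
    'left'  => ['op' => '+', 'left' => '$pi', 'right' => 'PI'],
    'right' => 2,
];
$variables = ['$pi' => 3.1415926535898, 'PI' => pi()];

echo 'PI is equal to: ' . evaluate($tree, $variables) . PHP_EOL;
```

The deepest expression (the addition) is computed first and its result feeds the division above it, which is exactly why the hierarchy matters.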

How Does This Apply To Writer Bot?

Well, I guess the next question to ask is… can we lex a natural language? We’ll talk about that in my next post so remember to like, and follow!

Also, don’t forget to share this post with someone you think would find it interesting and leave your thoughts in the comments.

And before you go, consider helping me grow…


Help Me Grow

Your financial support allows me to dedicate the time & effort required to develop all the projects & posts I create and publish here on my blog.

If you would also like to financially support my work and add your name on my Sponsors page then visit my Patreon page and pledge $1 or more a month.

As always, feel free to suggest a project you would like to see built or a topic you would like to hear me discuss in the comments and if it sounds interesting it might just get featured here on my blog for everyone to enjoy.

 

 

Much Love,

~Joy

Why Rule Based Story Generation Sucks

Welcome back, today we’re going to talk about why rule based story generation sucks and I’ll also present some code that you can use to create your very own rule based story generator!

But before we get started I’d like to draw your attention to the title header image I used in my last article Rule Based Story Generation which you should also read because it helps add context to this article… anyway, back to the image. It presents a set of scrabble blocks that spell out:

“LETS GO ON ADVENTURES”.

I added over the image my usual title text in Pacifico font and placed it as though to subtly imply it too might simply be a randomly selected block of text just slotted in wherever it would fit.

Combined with the new “title rule” it reads:

“LETS GO ON ‘Rule Based Story Generation’ ADVENTURES”.

Perhaps it may be too subtle, I know! 😛

In any case, I like how the Scrabble tiles’ capitalization is broken up by the script.

It almost illustrates the sort of mishmash process we’re using to build sentences that are sometimes almost elegant though, much of the time, rigid and repetitive.

It’s important to understand that developing an artificial intelligence that can write as well as a human is still an open (unsolved) area of research, which makes it a wonderfully exciting challenge for us to work on and we’re really only just getting started! 😉

Of course this isn’t as far as we can go (see Rule Based Story Generation & A Halloween Tale) and I am currently working on how we go even farther than I already have… but we’ll get there.

 

A Flawed Stepping Stone

Rule based story generation is an important, yet flawed, stepping stone in helping us understand the problem of building a writer bot that will write best sellers!

Despite the flaws with using rules to create stories, we may in the future partially rely on “bot written” rules in a layered generative system so none of this is necessarily wrong, just woefully incomplete, especially by my standards as I like to publish fully functioning prototypes… more or less. 😉

Before I present the code however let’s briefly look at why rules suck!

 

Reasons Rule Based Generation Sucks

Here’s a non-exhaustive list of reasons why rule based story generation sucks:

  • Some rules create Ambiguity.
  • Lack of correlation between Independent Clauses.
  • Complete lack of Dependent Clauses that will need to correlate with their associated independent clauses.
  • Run-On Sentences are WAY easy to generate!
  • Random, or hard coded Verb Tense & Aspect is used.
  • Forced, improper or unusual grammar.
  • Random events, and no cohesive growth of ideas over time (lack of Narrative).
  • No Emergence means that all meaningful possibilities are input by a person… manually! 😦
  • Placeholder for anything that I may have forgotten to include here.

If we intend to build the “writer bot” AI described in Bot Generated Stories & Bot Generated Stories II we will have to find ways of mitigating or eliminating the issues in this list!

We could be proactive at trying to resolve some of these issues with the rule based system but most of our efforts will boil down to writing additional rules (if this, then that) preconditions and really… nobody has time for that!

Besides, if a person is writing the rules… even if the specific details selected for a sentence are random or possibly even stochastic (random but based on probability), wouldn’t you still say that the person who wrote the rules, kinda sorta wrote the text too?

I mean, even if there’s an asterisk next to the writer’s name and in itty-bitty tiny text somewhere it says (*written with the aid of a bot) or vice versa, it’s still hard to say that whoever wrote the rules and made the lists didn’t write the resulting text… to some extent, right?

If you agree… disagree? Feel free to present your argument in the comments!

Ultimately for the reasons listed above (and a fair amount of testing) I am confident that hand written rule based story generation is not the way to go!
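On the “random” versus “stochastic” distinction mentioned above, here is a small sketch of the difference. The first pick treats every event as equally likely, the second is weighted by a probability. The function pick_weighted() and the weights are made up for illustration and are not part of my generator.

```php
<?php
// Sketch: uniform choice vs. a probability-weighted ("stochastic") choice.
// pick_weighted() and the example weights are hypothetical.
function pick_weighted(array $weights): string
{
    $total = array_sum($weights);
    $roll = random_int(1, $total); // assumes positive integer weights
    foreach ($weights as $item => $weight) {
        $roll -= $weight;
        if ($roll <= 0) {
            return (string) $item;
        }
    }
    throw new LogicException('Weights must be positive integers');
}

$events = [
    'a bird built a nest' => 6,
    'a robbery happened'  => 3,
    'aliens attacked'     => 1,
];

echo array_rand($events) . PHP_EOL;    // every event is equally likely
echo pick_weighted($events) . PHP_EOL; // aliens attack about 1 time in 10
```

Either way, notice who typed in the events and the weights: a person. That is the whole problem.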

 

New Rules

In addition to a new rule (name action.) I am including a little something special in the code below.

I call them Rule 8 & Rule 9 because they were written in that order, but what makes them different from Rules 1 – 7 (which I wrote) is that they were effectively written by a bot.

What I mean when I say the bot “wrote the rule” is that the pattern used in the rules was extracted by a learning bot (pattern matching algorithm/process).

Here are examples of the new rules:

Rule 7

Axton went fishing.
Briana mopped the floor.
Kenny felt pride.
Ryann played sports.
Chaim felt content.
Alaina road a horse.
Elian setup a tent.
Brian had fun.
Meadow heard a noise.
Jewel learned a new skill.

 

Rule 8 – Bot Generated Rule

Freya handled Along the ship!
Bethany dipped Inside the dollar!
Kyla appointed Regarding the scenery!
Aryanna filed yieldingly the honoree.
Madeline demanded of the pannier!
Kailey repaid there the courage.
Finley came With the button.
Sawyer criticised owing to majestically icicle!
Armani included again down canopy!
Genevieve snapped Behind the computer!

 

Rule 9 – Bot Generated Rule

Ulises besides Maxim approved the scraper Near.
Nova that is Louisa ringed the scraper Down.
Alec besides Killian eased the cope Outside.
Sylas consequently Zain beat safely exit Above.
Conrad yet Alfredo owed the definition Within.
Danica that is Jackson paid the sweater fully.
Hugh and Kori substituted the pitching heavily.
Julissa but Colton separated the lie Down.
Liberty but Barbara reformed the lamp kissingly.
Zion yet Rosemary ascertained true fat under.
Neither Desiree nor Nadia filed the protocol Ahead of.
Neither Rudy nor Rowan aided the weedkiller commonly.
Grace indeed Jad caused the beast best.
Jaelynn besides Maddux cheered the panda Against.
Ari yet Ayla elected the seaside without.
Blakely moreover Karsyn stimulated jealousy shadow owlishly.
Prince further Lennon exhibited the worm Except.
Clay thus Rohan embraced the tsunami each.
Sabrina but Avery stressed far paste Excluding.
Gregory so Dallas engaged new egghead clearly.
Neither Lydia nor Walter escaped naturally margin previously.
Dylan namely Elaina kept suspiciously shed oddly.
Neither Jedidiah nor Karsyn devised the bathhouse kookily.
Kareem so River pointed wetly yoga Ahead of.
Ansley accordingly Alessandro laughed the brood By.
Omar otherwise Sofia obtained the clipper Per.
Walker so August summoned the tile yeah.
Remy moreover Cody raised the handball loyally.
Aadhya so Adelynn allocated the fear Amidst.
Mohamed likewise Hudson inspected the hyphenation Like.

While that does make generating rules easier and it also can aid in resolving some of the issues with generating stories using rules, it really only amounts to an interesting “babble” generator at best, though perhaps it could be coupled with several other systems in layers to create something closer to a story?

Perhaps through the use of “rewrite rules” that could fix the verb tenses and pronouns?

Here are the results of 15 randomly selected rules:

Leanna Odom is a woman but to the south of a
sports stadium, a bird built a nest for in the
zoo, a guy road a bike nor next to a city jail, a
car got a flat tire for Ricky Solis is very garish
and outside a farm , robots attacked yet Armani
Lowery is very attentive but Jeffrey and Zaniyah
permitted the extent Excluding. nor Azariah seemed
fatally the route! but Maddux proved hopefully
monthly wasp! nor Blaine wrote a poem. and inside
an abandoned ghost town, a book was written and
Neither Seth nor Callan behaved the steeple In
addition to. but Lucille ate a nice meal. and
Lorelei meditated.

Code

Below is the code for Generate.php, the main program file. It uses Functions.php as well as some text files, and you can find all the files you need for this project for free on my GitHub: RuleBasedStoryGeneration on GitHub

<?php
// include all the functions
include('Functions.php');
// set up the parts of speech array
// functions will globally point to this variable
$pos = LoadPartsOfSpeech();
$number_of_sentences = 30; // how many sentences/rules are generated/used
$story = ''; // the string we concatenate rules on to
// for whatever number you set $number_of_sentences to...
foreach(range(1, $number_of_sentences, 1) as $number){
    
    $rule_subject = random_int(1, 3);
    
    // randomly determine the type of rule to use,
    // randomly select the rule, compute its result and concatenate with 
    // the existing $story
    if($rule_subject == 1){ // action or event
        
        $rule_subject = random_int(1, 4);
        
        if($rule_subject <= 3){
             $story .= Rule(1); // event
        }
        elseif($rule_subject == 4){
             $story .= Rule(7); // action
        }
    }
    elseif($rule_subject == 2){ // people related
        $rule_subject = random_int(1, 6);
        
        if($rule_subject == 1){
             $story .= Rule(2);
        }
        elseif($rule_subject == 2){
             $story .= Rule(3);
        }
        elseif($rule_subject == 3){
             $story .= Rule(4);
        }
        elseif($rule_subject == 4){
             $story .= Rule(5);
        }
        elseif($rule_subject == 5){
             $story .= Rule(6);
        }
        elseif($rule_subject == 6){
             $story .= Rule(7);
        }
    }
    elseif($rule_subject == 3){ // bot generated
        $rule_subject = random_int(1, 2);
        
        if($rule_subject == 1){
             $story .= Rule(8);
        }
        elseif($rule_subject == 2){
             $story .= Rule(9);
        }
    }
    
        
    // if this is not the last sentence/rule concatenate a conjunction
    if($number <= ($number_of_sentences - 1)){
        $story .= $pos['space'] . Get($pos['conjunctions']['pure']) . $pos['space'];
    }
}
// after the loop wrap the text at 50 chars and output the story
echo wordwrap($story, 50, PHP_EOL);
/*
 * Example Output
 * 
Jayleen ended By the lip! so Jada called a family
member. nor Aidan Lester is gifted and Emma Walton
is very clumsy and Grey proceeded widely literally
runner. or Santana Norman is a man yet Nico
Bartlett is very pitiful yet Aliana Browning is
rich and Rowan introduced Past the colloquia. but
Holly built a robot. so Morgan Dorsey is a person
for London fooled Against the cappelletti. but
Neither Emory nor Angel angered the order angrily.
or Hezekiah Beasley is very panicky and Leighton
did almost vivaciously author. so Foster Justice
is a man but Rory Parker is a beautiful person so
Reagan Rivera is a person but Kai Zamora is clever
nor beyond a newspaper company, dinosaur bones
were uncovered yet beyond a houseboat, a bird
built a nest so Kyle Goff is a man or on the
mountains, a new species of insect was identified
nor Galilea Mckinney is very worried or Gunner Orr
is very guilty but Otto Gaines is a small man nor
Gia Hendrix is powerful and Robert Mcdaniel is a
beautiful man so to the south of a newspaper
company, science breakthroughs were made so
Carmelo Rodgers is very witty
 
   
 */

 

Please remember to like, share and follow!


Much Love,

~Joy

Rule Based Story Generation

In my last post Bot Generated Stories II  I left off posing the question:

How then did I get my bot to generate A Halloween Tale?

~GeekGirlJoy

The short and most direct answer I can give without getting into all the nitty-gritty design & engineering details (we’ll do that in another post) is that I built a Markov Model (a math based bot) then I “trained” my bot on some texts to “teach” it what “good” patterns of text look like.

After my bot “read” approximately 350K words (the 3 books plus a few smaller texts: some of my work and a few other general text sources to fill out the vocabulary), I gave my bot a “seed word” that it would use to start the generative process.

What’s really going on “under the hood” is a lot like the “predictive text” feature many of you use every day when you send a text on your phone.

It’s that row of words that appears above or below your text while you type, suggesting the next possible word based on what you last said, what it was trained on and your personal speech patterns it learned while it watched you send texts in the past.

Well… my writer bot is sort of a “souped up”, “home-brewed” version of that… just built with generating a story in mind rather than predicting the next most likely word someone sending a text would need.
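To make the predictive text comparison a little less hand-wavy, here is a bare-bones, first-order, word-level Markov chain in PHP. To be clear, this is NOT my writer bot, just a toy sketch of the general idea. The function names train() and generate() and the tiny training text are made up for illustration.

```php
<?php
// A first-order, word-level Markov chain sketch.
// Function names and the toy training text are hypothetical.
function train(string $text): array
{
    $words = preg_split('/\s+/', trim($text));
    $model = [];
    for ($i = 0; $i < count($words) - 1; $i++) {
        // Record every word that was seen following $words[$i].
        $model[$words[$i]][] = $words[$i + 1];
    }
    return $model;
}

function generate(array $model, string $seed, int $maxWords): string
{
    $word = $seed;
    $output = [$word];
    // Stop at the length limit or when the current word has no known follower.
    while (count($output) < $maxWords && !empty($model[$word])) {
        // Duplicates in the follower list make frequent followers more likely.
        $word = $model[$word][array_rand($model[$word])];
        $output[] = $word;
    }
    return implode(' ', $output);
}

$model = train('the cat sat on the mat and the cat ran');
echo generate($model, 'the', 8) . PHP_EOL;
```

Given the seed word “the”, the model can only ever follow it with “cat” or “mat”, and “cat” is twice as likely because it followed “the” twice in the training text.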

The thing is, we’re not going to talk about that bot today. 😛

Xavier and I have been sick and I’m just not in the mood to do a math & code heavy post today, instead we’re going to talk about rule based story generation. 😉

Rule Based Story Generation

One simple and seemingly effective way to generate a story that absolutely works (at least to some extent) is to use “rules” to select elements from groups or lists and assemble the parts into a story.

The idea is pretty simple actually: if you select or write good rules, they will generate unique (enough) sentences that, when combined, form a story.

For example, let’s say I create a generative “rule” like this:

Rule: proximity location, event.

Seems simple enough but this rule can actually generate quite a few mildly interesting sentences, like this one for example:

in a railroad station, aliens attacked.

Result: (proximity: in) (location: a railroad station), (event: aliens attacked).

Not bad huh? You’d totally read that story right? 😛
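In code, a rule like this is little more than three random picks glued together. Here is a minimal sketch, assuming short placeholder lists. The function names and the array layout are made up for illustration.

```php
<?php
// Sketch of the rule "proximity location, event."
// The short lists and the function names are placeholders for illustration.
function pick(array $list): string
{
    return $list[array_rand($list)];
}

function rule_proximity_location_event(array $pos): string
{
    return ucfirst(pick($pos['proximity'])) . ' ' . pick($pos['location'])
        . ', ' . pick($pos['event']) . '.';
}

$pos = [
    'proximity' => ['in', 'next to', 'beyond', 'to the west of'],
    'location'  => ['a railroad station', 'a village', 'a cave'],
    'event'     => ['aliens attacked', 'a robbery happened', 'a child giggled'],
];

echo rule_proximity_location_event($pos) . PHP_EOL;
```

With only 4 proximities, 3 locations and 3 events this sketch can already produce 4 × 3 × 3 = 36 different sentences.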

Here are a few more results using this rule so you can see what it can do. Note that the rule pattern never changes (proximity location, event.) but because different values are randomly selected each sentence is different and the general “tone” of the sentence changes a bit as well:

Below a dank old bar, science breakthroughs were made.

I like that one 😛

Next to a village, a robbery happened.

To the west of a cave, a bird built a nest.

On the deck of a giant spaceship, a nuclear bomb was detonated.

Eat your heart out Kubrick! 😉 😛

Beyond the mountains, a child giggled.

Notice that all three parts of the rule (proximity location, event.) can affect the tone and meaning of the generated result.

What if the rule had generated:

“On the deck of a giant spaceship, a child giggled.”

That is a vastly different result than the one in the examples above, yet perhaps it is the same story with only seconds separating both events? Maybe…

“On the deck of a giant spaceship, a child giggled. The hordes of aliens were defeated. In the distance a voice yells “Dinner’s ready!”, a toy spaceship falls to the floor as little feet scurry unseen through the house.”

What determines, in the reader’s mind, what is actually going on is the context of what was said before this sentence and what will be said after. There are those cases where not saying something is saying something too… but dammit I can’t model that! 😛

Now, let’s look at how the proximity can change the meaning.

Here’s the proximity list I used with this rule:

in
inside
outside
near
on
around
above
below
next to
close to
far from
to the north of
to the south of
to the east of
to the west of
beyond

Each ‘proximity’ by itself seems pretty self-explanatory in its meaning but when combined with a location the meaning can change. For example, it seems fairly natural to discuss something being ‘beyond’ something else, like “the fence is beyond the water tower”, but let’s say that you have an ambiguous ‘location’ like Space?

1930s & ’40s pulp sci-fi aside… what does it mean to be “Beyond Space”? 😛

Clearly we’ve run into one of the limitations of rule based story generation, of which there seem to be many… but in this case I’m referring to unintended ambiguity.

At best a rule would reduce ambiguity and at worst it could inject significant ambiguity into a sentence. Ambiguity in this case should be understood as lack of clarity or a variance in what the reader is supposed to understand is occurring and what they believe is occurring.

Limitations aside, this type of rule based generative system is surprisingly effective at crafting those direct and matter of fact type statements.

The type of problem you could write an “If This Then That” sort of rule for… hmmm.

 

A Few More Rules

Here are a few more rules to help you get a feel for how this whole “rule” thing works:

Rule: name is very positive_personality_trait
&
Rule: name is very negative_personality_trait

See if you can tell which is which in this list:

Channing Lynn is very faithful
Jerome Puckett is very defeated
Arturo Thomas is very nice
Damon Gregory is very grumpy
Calvin Weeks is very repulsive
Joaquin Hicks is very gentle
Amanda Calhoun is very thoughtless
Matthias Welch is very polite
Carter Camacho is very scary
Jay Dyer is very happy
Harper Buckley is very helpless
Trenton Bauer is very kind
Kane Owen is very lazy
Lauryn Vasquez is very obedient
Aleah Gilmore is very angry
Ameer Cortez is very brave
Kase Wolfe is very worried

This rule is static and can be improved by having fewer “hard coded” elements.

Instead of the result always containing the word “very” you might instead have a gradient of words, selected at random (or based on some precondition), that modify the meaning or intensity of the trait, i.e. mildly, extremely, slightly, not particularly, etc… which could lead to interesting combinations. We could call the gradient of terms, oh I don’t know… adverbs. 😛

Technically though, adverbs in general are too broad of a category to treat as simple building blocks in a rule like this but you could build a list of adverbs that would apply in this case and replace the word ‘very’ with a selection from that list which would result in more variation in the personality trait descriptions.
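Swapping “very” for a pick from a short list could look something like this. It is only a sketch: the list of intensity words, the traits and the function name are assumptions for illustration.

```php
<?php
// Sketch: replace the hard coded "very" with a pick from a short list of
// intensity adverbs. The lists and the function name are illustrative only.
function rule_name_is_trait(string $name, array $intensifiers, array $traits): string
{
    $intensity = $intensifiers[array_rand($intensifiers)];
    $trait = $traits[array_rand($traits)];
    return "$name is $intensity $trait";
}

$intensifiers = ['mildly', 'slightly', 'very', 'extremely', 'not particularly'];
$traits = ['faithful', 'grumpy', 'polite', 'lazy'];

echo rule_name_is_trait('Kane Owen', $intensifiers, $traits) . PHP_EOL;
```

Same rule, but now Kane Owen can be “mildly lazy” or “not particularly polite” instead of always “very” something.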

Let’s look at another rule.

Rule: name is adjective_condition

Annalee Sargent is shy
Hugh Oconnor is helpful
Tessa Rojas is gifted
Cristian Castaneda is inexpensive
Heavenly Patel is vast
Gibson Hines is unimportant
Alora Bush is alive
Leona Estes is mushy

I don’t know about you but…

I’ve always found that “mushy” people are very positive_social_anecdote! 😛

Are you starting to see how the rules work? 😉

Much like the rule above that could be improved by replacing the “hard coded” adverb (very) with a gradient that is selected at random (or based on some precondition) the verb ‘is’ in this rule could be replaced with a gradient of verb tenses i.e. is, was, will be, etc…

Now, if you want to get more complicated… you could even build a system that uses or implements preconditions as I mentioned above.

An example of a precondition I gave above was verb tense, to determine whether something has happened, is happening or will happen… which would then modify the related rules that follow… but it’s also possible to build preconditions that modify rules directly from properties that are innate to your characters, settings, objects in the locations, the weather, the time of day etc…

For example consider the Rule: name is a gender

This rule must be able to determine the gender of the name given to it in order for the rule to work. In this case, the gendered name would act as a precondition that modifies the result of the rule.

Reyna Dunlap is a woman
Nikolai Cummings is a man
Emerald Lynch is a woman
Lucas Woodward is a man
Bailey Ramsey is a woman
Matias Miller is a man
Tinley Hansen is a woman
Mckenzie Davidson is a woman

It’s also possible however for a name to be gender neutral, like Jamie for example, and the rule cannot simply break if the name is both male & female, or neither in the case of a new or non-typical name. That level of abstraction (care and detail given to each rule so as to prevent a rule from breaking) has to extend to all rules in all cases, which is why using rules to write stories is impractical.
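Here is a sketch of what “not breaking” might look like for the gender rule: look the first name up, and fall back to a neutral noun when the name is gender neutral or unknown. The tiny lookup table and the function name are assumptions for illustration.

```php
<?php
// Sketch of a precondition: look the first name up before applying the rule
// "name is a gender", and fall back to "person" rather than breaking.
// The lookup table and the function name are hypothetical.
function rule_name_is_a_gender(string $fullName, array $genders): string
{
    $firstName = explode(' ', $fullName)[0];
    $gender = $genders[$firstName] ?? 'unknown';
    $nouns = ['female' => 'woman', 'male' => 'man'];
    $noun = $nouns[$gender] ?? 'person'; // neutral or unseen names
    return "$fullName is a $noun";
}

$genders = ['Reyna' => 'female', 'Lucas' => 'male', 'Jamie' => 'neutral'];

echo rule_name_is_a_gender('Reyna Dunlap', $genders) . PHP_EOL; // Reyna Dunlap is a woman
echo rule_name_is_a_gender('Jamie Weeks', $genders) . PHP_EOL;  // Jamie Weeks is a person
echo rule_name_is_a_gender('Zyx Quill', $genders) . PHP_EOL;    // Zyx Quill is a person
```

That is one fallback for one rule. Now multiply that care by every rule and every edge case.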

Related to the last rule is this Rule: name is a adjectives_appearance gender

Mallory Joseph is a clean woman
Talon Vazquez is a bald man
Kody Maxwell is a magnificent man
Meredith Strickland is a stocky woman
Jaliyah Haynes is a plump woman
Brian Leblanc is a ugly man
Collins Warren is a scruffy woman
Tenley Robbins is a chubby woman
Brantley Mcpherson is a chubby man
Killian Sawyer is a fit man

Here again you see the rule must identify the gender of the name given to it… but what’s more important is that I used the present tense ‘is’ when it’s just as valid grammatically to say that “Killian Sawyer was a fit man”. In fact, even if it is grammatically correct to say he “is fit”, he might not even be alive any longer, in which case ‘was’ would be logically correct. On its own, though, the past tense implies that he is no longer fit rather than that he is dead, so additional information would be required by both the reader and the system to determine that Killian was dead. The point still stands.

Using preconditions on all aspects of the story such as the characters, places, things etc. could enable the system to properly determine if it should say something is, was, will be, maybe, never was, never will be etc… it could examine the established who, what, when, where, why & how and use that information to determine what rule to use next which would progress the story further.
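As a rough sketch of that idea, the snippet below keeps a tiny bit of state about a character and lets it pick the verb tense. The field names (born_at, died_at), the notion of a numeric “story time” and the function name are all assumptions made for illustration.

```php
<?php
// Sketch: story state as a precondition. A character's state decides which
// verb tense the rule uses. Field names and the function are hypothetical.
function rule_name_is_a_description(array $character, int $storyTime): string
{
    if ($character['died_at'] !== null && $character['died_at'] <= $storyTime) {
        $verb = 'was';     // the character is no longer alive at this point
    } elseif ($character['born_at'] > $storyTime) {
        $verb = 'will be'; // the character has not entered the story yet
    } else {
        $verb = 'is';
    }
    return "{$character['name']} $verb a {$character['description']}";
}

$killian = [
    'name'        => 'Killian Sawyer',
    'description' => 'fit man',
    'born_at'     => 0,
    'died_at'     => 5,
];

echo rule_name_is_a_description($killian, 3) . PHP_EOL; // Killian Sawyer is a fit man
echo rule_name_is_a_description($killian, 7) . PHP_EOL; // Killian Sawyer was a fit man
```

It works, but only because somebody wrote down when Killian was born and when he died, and every other rule would need the same kind of bookkeeping.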

It’s easy to imagine how some rules would only be applicable if certain conditions had occurred or were “scheduled” to occur later in the story. Some rules might naturally form branching chains or tree hierarchies within the “flow” of the rules.

This implies if not requires some form of higher order logic, state machines and the use of formal logic programming with special care and attention given to the semantic meaning or value of each part of each rule…

Well nobody said it was going to be easy! 😛

These Are Not The Droids You Are Looking For

Sounds too easy right? Well… you’re probably right.

I mean sure you can do this in theory and next week I will provide a crude “proof of concept” example with code demonstrating the rules I used here today, but even if you create a bunch of rules it’s not like “earth shatteringly good” and you can’t just write them and you’re done, there is a lot of trial and error to get this sort of thing just right.

Personally I’ve never even seen a fully functional implementation of this type of thing… sure, I’ve seen template based stories that use placeholders, but nothing as dynamic as I believe would be required to make a rule based system work as described.

Again I am not talking about simply randomly selecting rule after rule… I mean sure you could do that but you won’t get anything really useful out of it.

To do this right, your system would have to select the rules that correctly describe past, present & future events based on where in the story the rule is used. It can’t simply swap out the names, objects or descriptions in your story without concern for the context in which the rule is being applied.

To do rule based story generation right means that you get different stories each time the system runs, not cookie cutter templatized tales. You can’t simply write a story template, fill it with placeholders, then select and assemble “story legos” and get a good story.

Though at least hypothetically it could work if you wrote enough rules and built a more robust system that keeps track of the story state, the characters and their motivations, the objects, where they are, what they are etc… of course this is tedious and ultimately still amounts to you doing an awful lot of work that looks like writing a story.

I do believe (at least in theory) a rule based story generative system as described here could work but you would be limited to the manually written rules in the system (or are you? 😉 ) and how well the state machine used the rules.

Further, it’s debatable whether a rule based story generation system, even if it worked, could actually be good enough to be the “best seller” writer bot that we’re looking for.

To me, the major limiting factor appears to be hand writing, refining and testing the rules.

Suggest A Rule

As I said, I will present the code for these rules in my next post, but I’d like to ask you to suggest a rule in the comments for this post. I will try to include as many of them as possible in the code and I will give the suggester credit for the rule.

Please remember to like, share & follow!


Your financial support allows me to dedicate the time & effort required to develop all the projects & posts I create and publish here on my blog.

If you would also like to financially support my work and add your name on my Sponsors page then visit my Patreon page and pledge $1 or more a month.

As always, feel free to suggest a project you would like to see built or a topic you would like to hear me discuss in the comments, and if it sounds interesting it might just get featured here on my blog for everyone to enjoy.

Much Love,

~Joy

Bot Generated Stories II

In my last post, Bot Generated Stories, I left off describing how text based story generation can give way to a sort of “holodeck” virtual reality where you don’t just read a story but can explore an entire simulated world built around giving you a narrative experience unique to your preferences and choices.

The first step is to build a “writer bot” that isn’t quite as good as a human writer (but capable nonetheless) so that it can work alongside a human and aid in the writing process. This would allow the bot to rely on the human to determine what is “interesting” while the bot offers suggestions when a sort of “say something” button is pushed, though my friend Oliver suggests the phrase “gimme some magic”. 😛

As described, this bot would act as a “digital muse” of sorts, offering suggestions along the way with a human selecting and writing the details from a set of possibilities, while allowing the human author to throw out the bot’s suggestions and take the story in completely different directions than what the bot generated.

In many ways my “writer bot” is far from this goal because it fails when it comes to generating sentences that have actual meaning and correlation with the desired topic between clauses but I will talk about this in more depth in another article.

What my bot is good at is generating sentences that are better than random, and I can illustrate this quite simply.

My First Attempt: Yule’s Heuristic’s

My first experiments used a bot with random word selection from a very large word list to produce content.

It’s important to note that I did not expect good results; I just needed something to compare all future attempts against, and random selection seemed like the worst way to do it. If any of my bots along the way produced content even slightly better than random, it would be a step in the right direction.

My initial methodology was basically just to pull words at random from the built in Linux dictionary and throw in the occasional period or comma (no commas shown in example below) to create sentences and clauses. I then concatenated those pseudo sentences and randomly added a break to create paragraphs.
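For the curious, that methodology can be sketched roughly like this. This is a reconstruction for illustration, not the original bot: the function names, the sentence lengths and the one-in-three paragraph break odds are all assumptions, commas are left out, and a tiny inline word list stands in for the Linux dictionary.

```php
<?php
// Hypothetical sketch of random "pseudo sentences" and paragraphs.
// Sentence lengths and paragraph-break odds are guesses, not the original values.

function randomSentence(array $words, int $minWords = 7, int $maxWords = 13): string
{
    $count = random_int($minWords, $maxWords);
    $picked = [];
    for ($i = 0; $i < $count; $i++) {
        $picked[] = $words[array_rand($words)]; // any word, no grammar at all
    }
    return ucfirst(implode(' ', $picked)) . '.';
}

function randomParagraphs(array $words, int $sentences): string
{
    $text = '';
    for ($i = 0; $i < $sentences; $i++) {
        if ($i > 0) {
            // Roughly one time in three, start a new paragraph; otherwise keep going.
            $text .= (random_int(1, 3) === 1) ? PHP_EOL . PHP_EOL : ' ';
        }
        $text .= randomSentence($words);
    }
    return $text;
}

// Stand-in word list so the example runs anywhere.
$words = ['harbinger', 'diodes', 'tutoring', 'repairman', 'slice', 'okra', 'wagons', 'glance'];

// To use a system dictionary instead (the path is an assumption about your machine):
// $words = file('/usr/share/dict/words', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

echo randomParagraphs($words, 6) . PHP_EOL;
```

The output changes on every run, and like the real thing, none of it means anything.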

Also, mostly for my own amusement I programmatically generated a “contents” section with chapter titles and page numbers that line up, though outside of those “rules” the following was pseudo randomly generated.
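Getting the page numbers to line up is mostly a padding exercise. Here is one way it could be done; the tocLine() name and the 80 character default width are assumptions for this sketch, and strlen() counts bytes, so it only lines up properly for plain ASCII titles.

```php
<?php
// Hypothetical sketch: pad a chapter title with dots so the page numbers align.
// The default width of 80 is a guess, not necessarily what the original bot used.

function tocLine(string $title, int $page, int $width = 80): string
{
    $pageText = ' ' . $page;
    // str_pad() fills the gap between the title and the page number with dots.
    // If the title is already too long, str_pad() returns it unchanged.
    return str_pad($title, $width - strlen($pageText), '.') . $pageText;
}

echo tocLine('Chapter 4: Penes Unknown Mileposts', 64) . PHP_EOL;
echo tocLine("Chapter 5: Cupboard Exult Tower's", 78) . PHP_EOL;
echo tocLine('Chapter 7: Defensiveness Fielder Input Kilometers', 112) . PHP_EOL;
```

Because the page number's own length is subtracted from the width before padding, two and three digit page numbers still end in the same column.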

Note this is the first output I generated when I first began working on creating a “writer bot” (it’s terrible – though some of it is amusing):

Yule's Heuristic's
GeekGirlJoy

Table of Contents

Chapter 1: Rebroadcast's Sulkies Borough Whitewashes Swim........................ 3
Chapter 2: Culottes Mutability's Corroborations Moet's Competent................ 29
Chapter 3: Guesting Unicycles Neckerchieves Studious Oviduct.................... 52
Chapter 4: Penes Unknown Mileposts.............................................. 64
Chapter 5: Cupboard Exult Tower's............................................... 78
Chapter 6: Letha Bookmarking Kmart's Concentrate................................ 92
Chapter 7: Defensiveness Fielder Input Kilometers.............................. 112
Chapter 8: Bugging Outperforms Assault's....................................... 126
Chapter 9: Meany Conviviality's Unintelligent Plods Yards...................... 146
Chapter 10: Coal's Euphemism Union's Heterosexuality's......................... 166
Chapter 11: Ill Atrocious Inputting Moderator.................................. 180
Chapter 12: Marrieds Weissmuller Surrendering.................................. 200
Chapter 13: Convene Asylum's Dustiness's Permeated............................. 211
Chapter 14: Methodist's Prosecuted Jewelers.................................... 222
Chapter 15: Remoteness's Goblin Freeholder's Sixth's........................... 231
Chapter 16: Provo Peafowls Offensiveness's Bonsai's............................ 244
Chapter 17: Personal's Diastolic Questioning................................... 256
Chapter 18: Agitates Contingency's Gastronomy's Lineup's Gallic................ 266
Chapter 19: Garbling Poked Pithiest Depp Specialists........................... 289
Chapter 20: Lit Condolences Webb Levying Laurel................................ 302


Chapter 1: Rebroadcast's Sulkies Borough Whitewashes Swim

Delawarean Paraná harbinger diodes tutoring repairman slice posits blamer. Classicist boor's betting Markham chunky Monroe's wasting authorize abductors glance's vatting. Installments skateboarded stein lining's goodbye interstice critiqued onslaught's mute's failing's.

Hell's Egyptian's Battle compensates handsomest rookeries droves taxidermist's spaciousness's expunged majors standstill culpability. Viscus's absorbents mutability's Whitsunday's Matthew Socrates nitrate's dwarfism opulence's diffuse budgerigars silenter perversity's. Quart Nescafe newspaperwoman's guest sidestroke outdistancing scald workingmen waggle overlapping.

Flagons crochet's compunction duties Elysée objecting mace headrest's chlorinating enraptured softly enmeshed Bessemer's. Prays bracelets reamed Lagos's moisture particular's foisting. Okra Virginia's granddaughter's kronor individualism's sightseers haziest wagons sandiest Appaloosa's overcasting Fisher Minos's.

Cogwheel's nutritionist Ares erogenous inconsistent gummiest sachem lien connection skivvies successor secretion. Eisenhower supernatural Freemasonry's rostrum Rudolph's causeway's ocean's. Exhortations quibbles recounted innocent intermissions academician's hardwoods lard hindsight's austerest dabbled scalawags.

Despicable topography's narration's glaze's homograph's molehill's doyens zoo malnutrition's neutralization's. Hunts Schultz footnote Kroc's proton processioning inadvertence's Mars dialed Noemi prithee Sheena Parrish. Rightist's departure's padre Joule mangled roughneck's jazziest affiliate spinoffs cops scrubbing deter.

Consenting Kevin's budgies Balinese's neat warthogs plumb scapegoating bombardiers Burke hoppers. Storyteller's rationals lethargy jitterbug's poorhouse threats pipeline extolled jogs grandee. Vastness alluvial's bloodstain experimentation cigaret Karenina's Orval's thereto banishment abattoir's Chrysler abducts Yolanda's.

...

And it goes on like that for 312 pages of mind numbing random goodness. 😛

That bot had a huge vocabulary but clearly using randomly selected words is terrible!

Not a single sentence was coherent! Which is actually what I expected, so Yule’s Heuristic’s was a success in that it failed as planned, though I had to go beyond random text if I wanted anything close to resembling a story.

How then did I get my bot to generate A Halloween Tale?

We’ll talk about that and more in my next post Rule Based Story Generation.

Much Love,

~Joy
