Building a Word-Based Generator II: Using the "Page of Generators" standard PHP Code.

by Steven Savage

This document will walk you through the actual creation of the "Space Phenomena" generator. The idea is to create a generator that produces interesting ideas for unusual space phenomena - maybe not scientific ones, but ones suited to Space Opera stories a la Star Trek.

This is what I call a "Mid-Range" generator. It's neither simple (since we're going to deal with some complex names) nor hideously complex (there won't be any truly complicated dependencies and rules). It's a good example of how to create something useful that doesn't involve microscopic rule management.

(I'm not against microscopic rule management, but there are times it gets old and times you don't need it).

We'll be using my standard basic word generator code. This code is suited for combining words based on classification and in certain relevant orders. It does not involve self-aware data - but if we wanted more than a name, it would.

Sadly, I can't guarantee anything about this document or code or what it will do - if you use any of it, you're responsible for the results or lack of the same. With so many different browsers, servers, versions of PHP, etc., I can't make any guarantees.

Now, let's get going.

PART 1: Example Data

So, let's take some example phenomena from real life and science fiction:

Temporal Rift - We've seen enough of these in "Voyager" to last a lifetime.
Black Hole - The Old Classic.
Unstable Wormhole - Delta Quadrant, anyone?
Subspace Disturbance - Always interfering with communications.
Stellar Nursery - Those places stars come from.
Neutron Star - It's a real phenomenon.

NOTE: I find that to make a good generator, you need a minimum of five to ten examples to analyze, depending on complexity, and you need a representative sample. This will usually give you a basic starting point - but be prepared to alter and expand what you learn.

Analyzing these, we find a very common structure - what I usually call Descriptor/Object and Actor. The Descriptor/Object is something that describes or is associated with the Actor, and the Actor is the "centerpiece" of the term or combination.

Descriptor/Objects:
Temporal
Black
Unstable
Subspace
Stellar
Neutron

Actors:
Rift
Hole
Wormhole
Disturbance
Nursery
Star
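
The split above can be sketched in a few lines. This is only an illustration in Python, not the Page of Generators PHP code; the names EXAMPLES, descriptors and actors are made up here, and it assumes every example is exactly two words.

```python
# The six example phenomena from above, each assumed to be "Descriptor Actor".
EXAMPLES = [
    "Temporal Rift",
    "Black Hole",
    "Unstable Wormhole",
    "Subspace Disturbance",
    "Stellar Nursery",
    "Neutron Star",
]

descriptors = []
actors = []
for phrase in EXAMPLES:
    # First word describes, second word is the centerpiece.
    descriptor, actor = phrase.split(" ")
    descriptors.append(descriptor)
    actors.append(actor)

print("Descriptor/Objects:", descriptors)
print("Actors:", actors)
```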

But, if we look, we see some interesting things:

1) The word "Unstable" could describe another descriptor, as in "Unstable Black Wormhole." So some descriptors can describe descriptors. Let's call them Metadescriptors.

2) The word "Nursery" could qualify a regular descriptor-actor pairing, as in "Black Hole Nursery." So you have an "Actor" that modifies an existing Actor. Let's call this a "Metaactor."

NOTE: It's important when analyzing your data to be aware that some classifications of words can actually be split in two. This kind of deep analysis allows you to make richer generators.

NOTE: Don't be afraid to revise your ideas while designing. You rarely get it right the first or even the second time.

So, let's look at our new classifications:

Metadescriptors:
Unstable

Descriptors:
Black
Neutron
Stellar
Subspace
Temporal

Actors:
Disturbance
Hole
Rift
Star
Wormhole

Metaactors:
Nursery
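
One way to hold these classifications in code is a mapping from category name to word list. The sketch below is in Python purely for illustration and is not the Page of Generators PHP code; VOCABULARY and categories_of are hypothetical names.

```python
# Hypothetical layout: category name -> the words filed under it above.
VOCABULARY = {
    "metadescriptor": ["Unstable"],
    "descriptor": ["Black", "Neutron", "Stellar", "Subspace", "Temporal"],
    "actor": ["Disturbance", "Hole", "Rift", "Star", "Wormhole"],
    "metaactor": ["Nursery"],
}


def categories_of(word):
    """Return every category a word is filed under."""
    return [name for name, words in VOCABULARY.items() if word in words]


print(categories_of("Nursery"))
print(categories_of("Unstable"))
```

Keeping each category as its own list makes it easy to add words later without touching anything else.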

PART 2: Patterns of Data and Testing Them

It's not too hard to look at our examples and find patterns the data can appear in:

Metadescriptor-Actor
Descriptor-Actor
Descriptor-Metaactor
Metadescriptor-Descriptor-Actor
Metadescriptor-Actor-Metaactor
Descriptor-Actor-Metaactor
Metadescriptor-Descriptor-Actor-Metaactor

So let's pick words at random and see what we get:

Unstable Rift
Temporal Hole
Subspace Nursery
Unstable Stellar Wormhole
Unstable Disturbance Nursery
Stellar Star Nursery
Unstable Neutron Rift Nursery
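
The same "pick a pattern, then pick a word for each slot" exercise can be done by a program instead of mentally. What follows is a minimal sketch in Python, not the actual Page of Generators PHP code; VOCABULARY, PATTERNS and generate are hypothetical names, and it assumes every pattern is equally likely.

```python
import random

# Word lists taken from the classifications above.
VOCABULARY = {
    "metadescriptor": ["Unstable"],
    "descriptor": ["Black", "Neutron", "Stellar", "Subspace", "Temporal"],
    "actor": ["Disturbance", "Hole", "Rift", "Star", "Wormhole"],
    "metaactor": ["Nursery"],
}

# The seven patterns listed above, one category name per slot.
PATTERNS = [
    ["metadescriptor", "actor"],
    ["descriptor", "actor"],
    ["descriptor", "metaactor"],
    ["metadescriptor", "descriptor", "actor"],
    ["metadescriptor", "actor", "metaactor"],
    ["descriptor", "actor", "metaactor"],
    ["metadescriptor", "descriptor", "actor", "metaactor"],
]


def generate(rng):
    """Pick a pattern at random, then one word for each slot in it."""
    pattern = rng.choice(PATTERNS)
    return " ".join(rng.choice(VOCABULARY[slot]) for slot in pattern)


rng = random.Random()  # use random.Random(1) or another seed for repeatable runs
for _ in range(5):
    print(generate(rng))
```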

Now that we've created some examples via "mental randomization," let's look them over.

Most of these are decent, but the Metaactor "Nursery" makes them sound kind of lame. However, the basic idea of a Metaactor works - so let's come up with another Metaactor or two and see if they sound better.

Well, a Stellar Nursery is a place stars are born. So Metaactors basically represent groupings, relations, etc. So let's add two more Metaactors - Nexus and Confluence. Sounds science-fictiony.

And let's pop these names in place of Nursery in our above examples:

Subspace Nexus
Unstable Disturbance Nexus
Unstable Disturbance Confluence
Unstable Neutron Rift Nexus
Unstable Neutron Rift Confluence
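
Swapping a new Metaactor into old examples is easy to automate if you want to try many candidates. This is a rough Python sketch for illustration only; swap_last_word is a hypothetical helper, and it assumes the Metaactor is always the last word of the phrase.

```python
# The Nursery examples from the earlier list, and the two new candidates.
NURSERY_EXAMPLES = [
    "Subspace Nursery",
    "Unstable Disturbance Nursery",
    "Stellar Star Nursery",
    "Unstable Neutron Rift Nursery",
]
NEW_METAACTORS = ["Nexus", "Confluence"]


def swap_last_word(phrase, replacement):
    """Replace the final word of a phrase (assumed to be the Metaactor)."""
    head, _, _ = phrase.rpartition(" ")
    return f"{head} {replacement}"


for phrase in NURSERY_EXAMPLES:
    for metaactor in NEW_METAACTORS:
        print(swap_last_word(phrase, metaactor))
```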

OK, these sound better. So the idea of the Metaactor classification works; it's just that the word that inspired it, "Nursery," sounds kind of lame. But let's keep it in the vocabulary. That keeps the vocabulary complete, and it may still work.

NOTE: If you come up with a concept that sounds bad with the words that led you to develop it, see if it gets any better by adding similar words. It could be your limited data set.

NOTE: Some words are questionable in their usefulness. My basic rule is that unless it appears that a word will never be useful, keep it in, especially if it was in your original sources. You can always remove it later.

PART 3: Re-evaluate data

Now, we've seen our ideas in action. We've got a plan. We've got a structure that works reasonably well. Is there anything we may be forgetting?

Well, one hears about things like "Protostars." Maybe we need a "prefix" we can put in front of Actors like Proto, Quasi, etc.

Let's add "Proto" to the front of the Actor in the above examples that use Actors:

Unstable Proto-Rift
Temporal Proto-Hole
Unstable Stellar Proto-Wormhole
Unstable Proto-Disturbance Nursery
Stellar Proto-Star Nursery
Unstable Neutron Proto-Rift Nursery
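
A prefix differs from the other word types in one way: it is joined to the following word with a hyphen rather than a space. Here is a minimal sketch of that joining rule, in Python for illustration only; render and the (category, word) pairs are hypothetical, not part of the PHP code.

```python
def render(parts):
    """Join (category, word) pairs into a phrase.

    A prefix is hyphenated onto the next word; everything else is
    separated by spaces. A prefix with nothing after it is dropped.
    """
    words = []
    prefix = ""
    for category, word in parts:
        if category == "prefix":
            prefix = word + "-"  # hold it until the next word arrives
        else:
            words.append(prefix + word)
            prefix = ""
    return " ".join(words)


print(render([("metadescriptor", "Unstable"), ("prefix", "Proto"), ("actor", "Rift")]))
print(render([("prefix", "Proto"), ("actor", "Star")]))
```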

Looking it over, this should be another option - prefixes, just to make things "extra cooler." So now we've got a whole new section of data. However, as a prefix is technically a kind of descriptor, it probably should replace descriptors, not work with them.

Our kinds of data include:

Metadescriptors
Descriptors
Prefixes
Actors
Metaactors

Our combinations of data now include:

Actor-Metaactor
Descriptor-Actor
Descriptor-Metaactor
Descriptor-Actor-Metaactor
Metadescriptor-Actor
Metadescriptor-Actor-Metaactor
Metadescriptor-Descriptor-Actor
Metadescriptor-Descriptor-Actor-Metaactor
Metadescriptor-Prefix-Actor
Metadescriptor-Prefix-Actor-Metaactor
Prefix-Actor
Prefix-Actor-Metaactor

That's twelve possible combinations of words. With a good-sized vocabulary, that many combinations will produce a lot of different results.
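
To get a feel for how quickly the mixes add up, you can multiply the category sizes for each combination and total them. The Python sketch below is illustrative only. It uses the small vocabulary gathered so far (one Metadescriptor, five Descriptors, the two Prefixes "Proto" and "Quasi", five Actors, and three Metaactors), and it reads the first and last entries of the list as Actor-Metaactor and Prefix-Actor-Metaactor, which is an assumption. SIZES, PATTERNS and count_outputs are hypothetical names.

```python
from math import prod

# Number of words in each category so far (see the lead-in for the assumptions).
SIZES = {
    "metadescriptor": 1,
    "descriptor": 5,
    "prefix": 2,
    "actor": 5,
    "metaactor": 3,
}

# The twelve combinations; first and last are read as ending in a Metaactor.
PATTERNS = [
    "actor-metaactor",
    "descriptor-actor",
    "descriptor-metaactor",
    "descriptor-actor-metaactor",
    "metadescriptor-actor",
    "metadescriptor-actor-metaactor",
    "metadescriptor-descriptor-actor",
    "metadescriptor-descriptor-actor-metaactor",
    "metadescriptor-prefix-actor",
    "metadescriptor-prefix-actor-metaactor",
    "prefix-actor",
    "prefix-actor-metaactor",
]


def count_outputs(patterns, sizes):
    """Total word choices: for each pattern, multiply its category sizes."""
    return sum(prod(sizes[slot] for slot in p.split("-")) for p in patterns)


print(count_outputs(PATTERNS, SIZES))  # 330
```

Even with only sixteen words, that is 330 possible results, and every word added to a category multiplies through each combination that uses it.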

So, now what? Well, we develop it.

PART 4: Test Run

Using the Basic Word generator code (well, actually, a copy of it from another project), I changed the files and entered some basic data - both what we had above and a few more words. Here's an example of what was generated. I put an asterisk (*) by the ones I particularly liked and felt inspired by.

*Artificial Planetary Cluster
*Dyna-Storm Nursery
*Galactic Vortex Node
Induced Hole Nexus
Quasi-Storm System
Rotating White Planet
Stable Un-Cloud Vortex
Stellar Star
*Unstable Antimatter Wormhole
*Vortex System

Looking these over, the worst one is "Stellar Star" (sort of redundant, isn't it?). There are five good ones out of ten, and the rest aren't egregious (though "Rotating White Planet" sounds like some kind of Generic World). That's actually not bad - a fifty percent "cool" rate is pretty good for a first run with a limited vocabulary.

But let's do one more run:

*Anti-Vortex
*Cloud System
*Galactic Stellar Cloud Vortex
Induced Disturbance
Non-Vortex
*Rotating Galactic Cluster System
*Rotating Planet Node
*Stable Black String
*System Nexus
*Unstable Cluster System

This one has eight decent ones out of ten. There's nothing as bad as our friend the Stellar Star, though we do see a pattern here - our worst problem isn't wordiness (which is often a problem in generators like these), but generic-sounding terms popping up. The complex things actually manage to sound interesting ("Captain, it's the first Rotating Galactic Cluster System ever encountered! Before now they were only theoretical!"). Our potential flaw seems to be generic and poorly-defined phenomena.

I generated several more examples, and found that this pattern held - on average half or more were good, and the problems were generic-sounding terms and the occasional odd match.

So the question arises: can we do anything about this?

In this case, the generic phenomena seem to happen when we have a simple description of an actor. Rather ironically, this was the exact pattern of many of the phenomena we analyzed to create our base data and pattern.

Largely, this can't be avoided, I've found. This is a very basic pattern of data, so removing it could skew the results towards being more complex, and thus potentially too complex, and it would eliminate a simple and effective combination of data. So we should hold onto it.

NOTE: Test your generator a few times to make sure that it's producing useful results. See if there are any patterns that give you ideas to improve things.

NOTE: In combinations of data, in general, removing even one possibility can have wide repercussions. Removing one word may have little effect, but removing a combination of data is removing ways words can be joined.

So, what's next?

Simple, finish.

PART 5: Finishing Up

At this point, I just began fleshing out vocabulary and adding it into the program.

Usually I do one of two things:

1) List all the words then break them into categories.

2) Generate the categories separately.

As you add data, run the generator - I usually give it a few runs for every category I add to. This is a good final test, just in case.

My major finding as I added data was that some things could be Metadescriptors OR Descriptors, and that some could be either Actors OR Metaactors. Thus I made sure they were tagged as both. However, I had to think some of them over very carefully to make sure they fit.
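
One simple way to tag a word as belonging to more than one category is to store the categories per word and build each category's list from that. The Python sketch below is only an illustration, not how the PHP code stores its data. The tags shown are guesses based on where the words turned up in the test runs, and TAGS and words_for are hypothetical names.

```python
# Hypothetical tagging: word -> every category it may fill.
TAGS = {
    "Galactic": {"metadescriptor", "descriptor"},
    "Unstable": {"metadescriptor"},
    "Vortex": {"actor", "metaactor"},
    "System": {"actor", "metaactor"},
    "Nexus": {"metaactor"},
}


def words_for(category):
    """List every word tagged with the given category, alphabetically."""
    return sorted(word for word, cats in TAGS.items() if category in cats)


print(words_for("actor"))      # ['System', 'Vortex']
print(words_for("metaactor"))  # ['Nexus', 'System', 'Vortex']
```

With this layout each word is entered once, so a word that fits two categories can't drift out of sync between two separate lists.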

When you're happy, it's done. Release it to the world!