DATA AND DATA ABSTRACTION
A question I often get is simply, "How do I get my data for generators?" Though it's discussed specifically for other generators, I wanted to write a general essay to help people out.
STEP ONE: Determining the kind of generator
When building a generator, at first glance, it often seems that there are two parts: the data, and the patterns it can appear in.
This is true in the case of relatively simple generators - examples on this site include the Magic Item Generator, the Space Phenomena Generator, and so forth. These basically match up words in certain categories in certain orders.
However, in the case of more complex entities, the data and the pattern merge. There are not so much set patterns of data as relations of data, where certain kinds of data require certain patterns, and these patterns interact. In short, data and pattern are inseparable.
This more complex kind of generator would be more like my Bookspinner, the Gundam Maker, etc. I've also called these "madlib" generators or "Text Connection" generators. These generators are defined by the fact that there is no simple list of patterns, that data may often reference itself, and that there is a kind of "cascade" effect where various patterns interact.
It is extremely important to determine if your generator will be simple (essentially putting words in pre-determined patterns) or have complex relations of data.
An example may be a generator to make non-human species. This may have a simple pattern like the following:
Large Humanoid with Non-functional wings
Muscular Centaur-form with tigerlike features.
Here we have a basic pattern - Descriptor, form, extra feature. That's not particularly complex.
But if you wanted to have details - the color of the wings, or whether the centaur-form had four legs or six - then you'd have a "Text Connection" generator, where you'd start with a random seed, and depending on what is chosen, one choice may cause another, and another, etc.
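As a rough sketch of that cascade idea, here is how one choice might trigger another in Python. The table layout, the colors, and the function names are all made up for illustration, not taken from any of the generators mentioned above; the point is only that the follow-up data lives inside the data itself.

```python
import random

# Hypothetical data: each entry maps a choice to the follow-up
# questions that choice triggers (an empty dict means "no follow-ups").
FORMS = {
    "Humanoid": {},
    "Centaur-form": {"legs": ["four legs", "six legs"]},
}
EXTRAS = {
    "antennae": {},
    "wings": {"color": ["red", "black", "iridescent"]},
}

def pick_with_details(table, rng):
    """Pick one key from the table, then resolve every follow-up it triggers."""
    choice = rng.choice(sorted(table))
    details = [rng.choice(options) for options in table[choice].values()]
    return choice, details

def make_species(rng):
    """Build a description where later choices depend on earlier ones."""
    form, form_details = pick_with_details(FORMS, rng)
    extra, extra_details = pick_with_details(EXTRAS, rng)
    text = form
    if form_details:
        text += " (" + ", ".join(form_details) + ")"
    text += " with "
    if extra_details:
        text += " ".join(extra_details) + " "
    return text + extra

print(make_species(random.Random()))
```

A run might give something like "Centaur-form (six legs) with black wings", while a Humanoid with antennae gets no extra details at all, because nothing in the data asks for them.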
Determining the complexity of your generator is very important. Analyze the data that inspired it carefully - and be careful of overreaching OR underreaching.
SUMMARY: First, determine how your generator will work - simple association of words, or complex relations - in other words, is the data separate from pattern or are the patterns in the data?
STEP TWO: Finding basic data and patterns
Now that you've determined what kind of generator you're going to build, it's time to break down your source material to find data and patterns.
Simply put, gather some examples of what you want to make, come up with some of your own, write them down, and see what they tell you about possible patterns and source data.
This is an odd mix of art and science. I can give examples, but I can't give a system. I can give framework, but not specifics. It's something you learn, like drawing or riding a bike.
What I do is, essentially, try and find the patterns to the data, and then find data from my source ideas and flesh it out a bit, then see if, by hand, I can make my data and patterns work together.
To give an example, let's go with the idea of alien species as given above.
We've found a basic pattern - some general descriptor of the form, a specific descriptor of the form, and extras the form may have. It'd be like:
<DESCRIPTOR><BASIC FORM><EXTRA>
OK, some examples:
Descriptors: Large, small, muscular.
Basic Form: Humanoid, Centaur-form
Extra: wings, antennae, compound eyes.
So we can already see some variants:
Large Humanoid with Wings
Small Centaur-form with compound eyes
Small humanoid with antennae
Of course this is obnoxiously simplistic. And it's not nearly as complex as some generators could get. But you get the idea - analyze, write, and play.
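To play with this outside of pen and paper, a minimal Python sketch of the simple pattern could look like the following. The word lists are the ones above; the function name and the use of "with" as the joining word are just assumptions based on the sample output.

```python
import random

# The example vocabulary from above.
DESCRIPTORS = ["Large", "Small", "Muscular"]
BASIC_FORMS = ["Humanoid", "Centaur-form"]
EXTRAS = ["wings", "antennae", "compound eyes"]

def make_species(rng=random):
    """Fill <DESCRIPTOR><BASIC FORM><EXTRA> with one word from each list."""
    descriptor = rng.choice(DESCRIPTORS)
    form = rng.choice(BASIC_FORMS)
    extra = rng.choice(EXTRAS)
    return f"{descriptor} {form} with {extra}"

# Print a few random results to eyeball.
for _ in range(3):
    print(make_species())
```

Here the data and the pattern stay completely separate: you can add words to any list without touching the function.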
STEP THREE: Push it a bit
After you get a pattern, push it a bit. Try adding some more vocabulary, some more variants. For instance, perhaps our alien generator could have several patterns beyond <DESCRIPTOR><BASIC FORM><EXTRA>, such as:
Of course, if this were a more complex generator with, say, the color of the wings, there'd be even more here.
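One way to experiment with several patterns at once is to store the patterns as data too and pick one at random. In this sketch only the first pattern comes from this essay; the other two are invented purely for illustration, as are the names DATA, PATTERNS, and fill.

```python
import random

DATA = {
    "DESCRIPTOR": ["Large", "Small", "Muscular"],
    "BASIC FORM": ["Humanoid", "Centaur-form"],
    "EXTRA": ["wings", "antennae", "compound eyes"],
}

# Each pattern is a list of parts. A part that names a DATA category is
# replaced by a random word from it; anything else is kept literally.
PATTERNS = [
    ["DESCRIPTOR", "BASIC FORM", "with", "EXTRA"],  # the essay's pattern
    ["BASIC FORM", "with", "EXTRA"],                # hypothetical variant
    ["DESCRIPTOR", "BASIC FORM"],                   # hypothetical variant
]

def fill(pattern, rng=random):
    """Replace category names in a pattern with random words from DATA."""
    return " ".join(
        rng.choice(DATA[part]) if part in DATA else part for part in pattern
    )

def make_species(rng=random):
    """Pick a random pattern, then fill it in."""
    return fill(rng.choice(PATTERNS), rng)

print(make_species())
```

Adding a new variant is then just one more line in PATTERNS, which makes it cheap to try ideas and throw away the ones that don't read well.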
STEP FOUR: Test-code
Once you have some data and patterns (or data-that-is-patterns), put it into code. Try it out. There's only so much you can do in your head, so it's time to start the Generator.
Put in your existing data and patterns and give it a test run. Add a bit more, try out some ideas, some data, see what happens.
This is very important, as it lets you quickly test data and patterns to see if they work. You can also pass around a link to people to see if it works for them.
Most importantly, you can test the patterns for or in your data here. You may not have a huge stock of data, but you may have enough to test whether it can be put into the proper orders and contexts. If you're doing a more complex generator, test out as many possible relations as you can expect to see.
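While the data is still small, one way to test it is to skip the randomness and print every possible combination, so awkward pairings show up right away. This is only a sketch using the sample alien vocabulary from earlier; the function name is hypothetical.

```python
import itertools

DESCRIPTORS = ["Large", "Small", "Muscular"]
BASIC_FORMS = ["Humanoid", "Centaur-form"]
EXTRAS = ["wings", "antennae", "compound eyes"]

def all_species():
    """Return every result the simple pattern can produce, in a fixed order."""
    return [
        f"{descriptor} {form} with {extra}"
        for descriptor, form, extra in itertools.product(
            DESCRIPTORS, BASIC_FORMS, EXTRAS
        )
    ]

results = all_species()
print(len(results), "combinations")
for line in results:
    print(line)
```

With three descriptors, two forms, and three extras that is 18 lines to read through. Once the lists grow, the full listing gets too long to read by hand, and random sampling becomes the more practical test.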
STEP FIVE: Stock it
Once you have everything working, then you can stock your data. This is one of the most complex areas, and brings up several questions, but the major one is:
Where do I get data?
Simply, find as many sources as possible - online sources are invaluable (such as www.thesaurus.com). Check your personal library. A used or overstock book store may have cheap books on your subject.
One thing you may have to confront is that you may NOT know enough about your subject - i.e., you need "metadata" to figure things out. Be sure you know, or find out, enough about your subject to work on a generator for it - that knowledge, or the process of gaining it, will help you put in basic data and flesh it out.
I hope this essay helped you on your generator work.