‘Old School’ Procedural Text Generators

Submitter: Jason Boyd, Toronto Metropolitan U

——————————————————

The experiment:

The first topic in the course “Narrative in a Digital Age” is “Procedural Text Generators.” In this class, we discuss (among other things) Hugh Kenner and Joseph O’Rourke’s “A Travesty Generator for Micros” (BYTE, Nov. 1984), which uses Markov chaining to generate text based on a letter-sequence or word analysis of one or more input texts. As the post “Markov and You” from the Coding Horror blog points out: “What’s amazing to me about Markov chains is how unbelievably simple they are. […] Given an adequate input corpus, they work almost uncannily well, a testament to the broad power of rudimentary statistical inference” (Atwood).
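The core technique is small enough to sketch in a few lines of Python. This word-level version (the function names `build_chain` and `generate` are mine, not from the BYTE article, which also describes a letter-level variant) builds a table mapping each word to the words that follow it in the corpus, then random-walks that table:

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each sequence of `order` words to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=20, seed=None):
    """Random-walk the chain: start from a random key, then repeatedly
    append a randomly chosen successor of the current key."""
    rng = random.Random(seed)
    key = rng.choice(list(chain))
    out = list(key)
    for _ in range(length - len(key)):
        successors = chain.get(tuple(out[-len(key):]))
        if not successors:  # dead end: this key only occurs at the text's end
            break
        out.append(rng.choice(successors))
    return " ".join(out)

corpus = "the cat sat on the mat the cat ran on the rug"
print(generate(build_chain(corpus), length=8, seed=1))
```

Because frequent transitions appear more often in the successor lists, a uniform random choice automatically reproduces the corpus’s transition probabilities, which is why, as Atwood notes, so simple a mechanism can read “almost uncannily well” on a rich input text.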

Students experiment with two simple online text generators that use Markov chaining, the Gibberish Generator (Enevoldsen) and charNG (both of which have explanatory documentation). We also look at two ‘Twitterbots’ that use Kate Compton’s Tracery, which allows users to create word arrays and grammars to generate new text, in conjunction with v buckenham’s Cheap Bots Done Quick!, now sadly defunct thanks to the change in Twitter’s ownership: Fairy Fables (@Fairy_Fables) and a strange voyage (@str_voyage). A quick perusal of the former’s feed makes evident that it is an automated generator (it is crudely ‘formulaic’), but the latter’s feed shows how hard it can be to detect whether a text (or tweet) is algorithmically generated.
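Tracery itself is a JavaScript library, but the idea it implements, a generative grammar whose rules are expanded recursively, can be sketched in Python. The grammar below is invented for illustration (it is not taken from either bot); as in Tracery, a rule name wrapped in `#…#` is replaced by a randomly chosen expansion of that rule:

```python
import random
import re

# A toy grammar in Tracery's style: each rule name maps to a list of
# possible expansions, and every "#name#" tag inside an expansion is
# itself replaced by a random option for that rule.
GRAMMAR = {
    "origin": ["The #creature# #verb# the #place#."],
    "creature": ["dragon", "fairy", "troll"],
    "verb": ["guards", "haunts", "abandons"],
    "place": ["castle", "forest", "well"],
}

def expand(rule, grammar, rng):
    """Pick a random expansion for `rule`, then recursively expand its tags."""
    text = rng.choice(grammar[rule])
    return re.sub(r"#(\w+)#", lambda m: expand(m.group(1), grammar, rng), text)

print(expand("origin", GRAMMAR, random.Random(0)))
```

The contrast with the Markov approach is instructive for students: here the author supplies the structure explicitly (hence the crudely ‘formulaic’ feel of a bot like Fairy Fables), whereas a Markov chain infers its structure statistically from an input corpus.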

Results:

Although I have not to date used ChatGPT in this course, if I did, what I cover in this class would be helpful in demystifying the basic procedures that ChatGPT uses. I think students would quickly see that this material has ‘real world’ and timely significance, rather than being merely abstractly ‘interesting,’ given the anxieties swirling around AI that have been generated by the accessibility of tools like ChatGPT, and the tendency for people to be overawed by ChatGPT’s output because they do not understand its fairly simple process for text generation. Markov chaining also connects to stylometry (the analysis, and theoretically the replication, of individual writing styles), and it would be interesting to explore more fully how generators that use Markov chaining could ‘impersonate’ the writing ‘voice’ of particular writers. Such impersonation has serious implications for creativity, intellectual property, and fraud, issues which ChatGPT also raises.

Relevant resources:

Contact:

  • Email: jason[DOT]boyd[AT]torontomu[DOT]ca
  • Twitter: @jasonaboyd
