July 18, 2012

Just to prove my theory...

In my last entry I made a Shakespeare generator.

I complemented it with a Strindberg generator:


Som det!
Fiskar utanför ett ord.
Att följa mig från vita skjortbröstet och vila; fyllde i sin levnads mål, ingen annan sak till fånga och jag är bittert ibland med god dag?
Varpå ett tyst och lämnade Arvid Falk tittar jag säger oss, men nu svarade Sellén.
Ty hans brustna moln av utlänningen, tyst att han!
Låt mig!
Och vad har lossnat här, är gudfruktig!
Varför han så finns välgörenhet också!
Hennes nåd skulle komma hit utan ser jag tala, i Byggmästarbanken.
När di kommer visst inte känner sig böjd att Sellén, här har du opp.
Nu har du fått slag.
Men han uttalade alltid; den tog upp.
Stod Ygberg såg huru skulle fosterlandets namn; befann sig på vår begravningsbön.
Konglig sekter o den ena lakanet blivit snillen, att han sin överrock med en stol skjutas på en gumma, Karlen super ju värden höjde första revbenet.
Nåå!
Denne sköt hela sin motståndares argumenter.

And a Goethe generator:


Benutz es hinkte, höchster Herr ich schnell quillt, dem ich weiß, so wird nicht, folge ziehn die Gift?
Als sonst kamt ihr anständig seh' ich, der Geist verwirrt genug, Echo!
Zwar ist sie zu schauen aus millionen kehlen.
Da steigen meine Widersacher drängen, ich bei den erdball abgeründet, dem seeisch heitern feste Unzugängliche Mauer spaltet sich ganz ausgewachsen schauen, die Füße tragen; aber hüte dich auf!
Fackeln, die Kraft, Hiebei darfst du nichts an das Proszenium.
Nun fort, uns im dienen, wir?
Du scheinest mir doch gar selten aber flammt's empor und solch einen!
Begnügen sollt' ich allen Seiten hundert Hügeln unterbrochner Fläche fließt Peneios: auf!
Und Land, sie sich ergetzten, wir sehn!
Der flammend sich die thronen hehr, Umstellend ruhigen Friedensweiher.
Durchlauchtigster, angebaut.
Die Neigung, gemein!
Den Mittelpunkt.
Der schönsten einer längst gewohnt der Ernte zu dem heiligen liebeshort.
Ein Schnippchen schlagen deine macht erst sah sie genießen wollten, ein Marmorblock als er sich selbst außer mir den klügsten Mann, schlachterzogne junge Brut?
Ich nicht benommen; Fuß nimmt jeder, Chiron könntest du's doch gewandter sein, entflammte stroh.
Sind gewohnt, hier ja mein, wie das Papiergespenst der Treppe, so sehr es am besten frommt; über der einzig vom meinem Geiste.

I was considering a Bukowski generator, but I foresaw the results...

Lightning storms in Sweden as I write this.

July 8, 2012

The Shakespeare Generator


A fool thinks himself to be wise, but a wise man knows himself to be a fool.

I didn't write that. Shakespeare did. Wish I had though.

For those of us with an ounce less talent but a will to cut corners, I decided to make a Shakespeare generator.

So what is the Shakespeare algorithm?

I don't know, but I can show you my take on the subject.

The easiest way would of course be to just shuffle the words. From the quote above you could get stuff like:
"A think a to himself."
Or
"be knows a to a"

Definitely words from the Shakespeare vocabulary, but still crappy. You will not find any memorable quotes there unless you randomize a lot. A lot like in this.

If I were a linguist I would try the other end of the spectrum. I would analyze each word and implement the grammatical rules for when and how to inflect it.

For me that sounds too tiresome. I want quick results.

So I constructed a graph out of Macbeth.

The basic fact is that Macbeth is made out of sentences and the sentences are made out of words.
Every sentence has a start and an end, and the end is marked by an ending sign such as "?", "." or "!". To make it easier for me I decided that ",", ";" and ":" also end a sentence. After all, it is word groups rather than pure sentences that I am after.
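
A minimal sketch of how that splitting could look. The class name WordGroupSplitter, the lowercasing of words and the choice to drop trailing text that has no ending sign are my assumptions for illustration, not necessarily what the original program did.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class WordGroupSplitter
{
    // Signs that close a word group: the sentence enders plus , ; and :
    private static readonly HashSet<char> EndingSigns =
        new HashSet<char> { '.', '?', '!', ',', ';', ':' };

    // Returns one token list per word group. The ending sign is kept as the
    // last token of its group. Text after the final sign is dropped (assumption).
    public static List<List<string>> Split(string text)
    {
        var groups = new List<List<string>>();
        var current = new List<string>();
        var word = new StringBuilder();

        foreach (char ch in text)
        {
            bool isSign = EndingSigns.Contains(ch);
            if (isSign || char.IsWhiteSpace(ch))
            {
                // A sign or whitespace finishes the word being collected.
                if (word.Length > 0)
                {
                    current.Add(word.ToString().ToLowerInvariant());
                    word.Clear();
                }
                // A sign also closes the group, unless the group has no words.
                if (isSign && current.Count > 0)
                {
                    current.Add(ch.ToString());
                    groups.Add(current);
                    current = new List<string>();
                }
            }
            else
            {
                word.Append(ch);
            }
        }
        return groups;
    }
}
```

Fed the quote at the top of this post, Split returns two groups: one starting with "a" and ending with ",", and one starting with "but" and ending with ".".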

So the graph starts with a start node, and that node is connected to all words that start a sentence. In the example quote those are "a" and "but".


And then I just continued adding nodes, with an edge from each word to the word that follows it. E.g. "A" -> "Fool", "Fool" -> "Thinks" etc.

When a word came to an ending sign, I added a link to a node of that sign, such as "Wise" -> "," and "Fool" -> "."

When a word came up that already had a node, I didn't add a new node, but just a new edge, so the "Wise" node is connected both from "Be" and from "A". 

When the same word pair came up a second time, such as the double connection between "A" and "Fool", I added one to the weight of the edge, ending up with a weighted graph.

I did this for the complete Macbeth, which means that a word such as "A" is connected to 119 other words, with different weights depending on how often those words show up after it.
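
The construction above can be sketched roughly like this. It is only an illustration: the nested dictionary layout, the WordGraph class and the "<start>" marker are names I made up, and the input is assumed to be word groups that each end with their ending sign.

```csharp
using System.Collections.Generic;

public static class WordGraph
{
    // Hypothetical marker for the start node.
    public const string Start = "<start>";

    // Result: edges[from][to] = how many times "to" directly followed "from".
    // A node exists implicitly as soon as it appears as a key or a target.
    public static Dictionary<string, Dictionary<string, int>> Build(
        IEnumerable<IEnumerable<string>> groups)
    {
        var edges = new Dictionary<string, Dictionary<string, int>>();
        foreach (var group in groups)
        {
            // Every group begins at the start node.
            string previous = Start;
            foreach (string token in group)
            {
                Dictionary<string, int> targets;
                if (!edges.TryGetValue(previous, out targets))
                {
                    targets = new Dictionary<string, int>();
                    edges[previous] = targets;
                }

                // New pair: weight becomes 1. Known pair: weight goes up by one.
                int weight;
                targets.TryGetValue(token, out weight); // stays 0 if absent
                targets[token] = weight + 1;

                previous = token;
            }
        }
        return edges;
    }
}
```

For the example quote this gives the start node two targets ("a" and "but"), and the edge from "a" to "fool" gets weight 2.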

When my graph is ready I can randomize a trip through it from the start node to an end node, where each step is chosen at random based on the weights of the edges. Wise huh? This weighted randomization also ensures that words usually have the correct inflection and that every word fits with the word before it.
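
Here is a rough sketch of such a trip, assuming the graph is stored as nested dictionaries (edges[from][to] = weight) and that the ending signs have no outgoing edges. GraphWalker, PickWeighted and Walk are illustrative names, not the original code.

```csharp
using System;
using System.Collections.Generic;

public static class GraphWalker
{
    private static readonly HashSet<string> EndingSigns =
        new HashSet<string> { ".", "?", "!", ",", ";", ":" };

    // Picks one target with probability proportional to its edge weight.
    public static string PickWeighted(Dictionary<string, int> targets, Random random)
    {
        int total = 0;
        foreach (int weight in targets.Values) total += weight;

        int roll = random.Next(total); // 0 <= roll < total
        foreach (var pair in targets)
        {
            if (roll < pair.Value) return pair.Key;
            roll -= pair.Value;
        }
        throw new InvalidOperationException("Targets must have a positive total weight.");
    }

    // Walks from the start node until an ending sign is picked.
    public static string Walk(
        Dictionary<string, Dictionary<string, int>> edges, string start, Random random)
    {
        var words = new List<string>();
        string current = start;
        while (true)
        {
            Dictionary<string, int> targets;
            if (!edges.TryGetValue(current, out targets)) break; // dead end

            current = PickWeighted(targets, random);
            if (EndingSigns.Contains(current))
            {
                // Glue the sign directly onto the last word.
                return string.Join(" ", words) + current;
            }
            words.Add(current);
        }
        return string.Join(" ", words);
    }
}
```

An edge with weight 2 is twice as likely to be followed as an edge with weight 1 from the same node, which is what makes common word pairs show up more often.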

So some generated sentences:


Mingle with you make us.
Come to the subject of our chimneys were a ring the gate make such welcome.
Whey-face?
Every man.
Madam, 
and blind-worm's sting, 
I' the honest men our offices and 'tis the instruments, 
at peace?


"Come to the subject of our chimneys". Poetry!

The next step will be to make a graph consisting of all of Shakespeare's works.

Am I wise or not? (or just a simple fool)

PS. The graph was made using this tool. Recommended!

July 5, 2012

Generate word-like sentences

Randomizing data is fun, so I decided to make a program that automatically fills objects with data.
I started out with totally random data, but I realized that 'lhghhjfdr' is not that fun to put as the value of a string, so I have been producing generators.

Generators that can do the Lorem ipsum, generate random names, superhero names, street names, titles, cities... and generating random stuff started to be fun again.

One thing I wanted to do was to generate word-like words: strings that have no meaning but are easy to read and pronounce. Pseudo-random text.
I found this and it gave me a basic algorithm. I spiced it up with this to give each letter a statistical weight.

The random word is made of a random number of parts that are concatenated. The parts are either v, cv or cvc, where c stands for a consonant and v stands for a vowel. Which part type I choose is of course random.

The probability of each letter follows the letter frequencies of the English language:


private void LoadLetters()
{
    // Weights are approximate English letter frequencies in tenths of a
    // percent, e.g. 'e' at 127 is about 12.7%. 'y' is treated as a consonant.
    Vowels = new List<WeightedLetter>();
    Consonants = new List<WeightedLetter>();

    Vowels.Add(new WeightedLetter() { Letter = 'a', Weight = 81 });
    Vowels.Add(new WeightedLetter() { Letter = 'e', Weight = 127 });
    Vowels.Add(new WeightedLetter() { Letter = 'i', Weight = 70 });
    Vowels.Add(new WeightedLetter() { Letter = 'o', Weight = 75 });
    Vowels.Add(new WeightedLetter() { Letter = 'u', Weight = 28 });

    Consonants.Add(new WeightedLetter() { Letter = 'b', Weight = 15 });
    Consonants.Add(new WeightedLetter() { Letter = 'c', Weight = 28 });
    Consonants.Add(new WeightedLetter() { Letter = 'd', Weight = 43 });
    Consonants.Add(new WeightedLetter() { Letter = 'f', Weight = 23 });
    Consonants.Add(new WeightedLetter() { Letter = 'g', Weight = 20 });
    Consonants.Add(new WeightedLetter() { Letter = 'h', Weight = 60 });
    Consonants.Add(new WeightedLetter() { Letter = 'j', Weight = 2 });
    Consonants.Add(new WeightedLetter() { Letter = 'k', Weight = 7 });
    Consonants.Add(new WeightedLetter() { Letter = 'l', Weight = 40 });
    Consonants.Add(new WeightedLetter() { Letter = 'm', Weight = 25 });
    Consonants.Add(new WeightedLetter() { Letter = 'n', Weight = 67 });
    Consonants.Add(new WeightedLetter() { Letter = 'p', Weight = 19 });
    Consonants.Add(new WeightedLetter() { Letter = 'q', Weight = 1 });
    Consonants.Add(new WeightedLetter() { Letter = 'r', Weight = 60 });
    Consonants.Add(new WeightedLetter() { Letter = 's', Weight = 63 });
    Consonants.Add(new WeightedLetter() { Letter = 't', Weight = 90 });
    Consonants.Add(new WeightedLetter() { Letter = 'v', Weight = 10 });
    Consonants.Add(new WeightedLetter() { Letter = 'w', Weight = 24 });
    Consonants.Add(new WeightedLetter() { Letter = 'x', Weight = 2 });
    Consonants.Add(new WeightedLetter() { Letter = 'y', Weight = 20 });
    Consonants.Add(new WeightedLetter() { Letter = 'z', Weight = 1 });
}
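
The WeightedLetter type and the GetWeightedLetter lookup are not shown in this post, so the following is only a guess at how they might look. It assumes the lookup returns a one-character string (the parts are glued together with +), and it takes the Random as a parameter instead of reading it from GeneratorData. The WeightedPick class name is made up.

```csharp
using System;
using System.Collections.Generic;

public class WeightedLetter
{
    public char Letter { get; set; }
    public int Weight { get; set; }
}

public static class WeightedPick
{
    // Returns one letter, chosen with probability Weight / (sum of all weights).
    public static string GetWeightedLetter(IList<WeightedLetter> letters, Random random)
    {
        int total = 0;
        foreach (var item in letters) total += item.Weight;

        // Roll a number in [0, total) and find the letter whose slice contains it.
        int roll = random.Next(total);
        foreach (var item in letters)
        {
            if (roll < item.Weight) return item.Letter.ToString();
            roll -= item.Weight;
        }
        throw new InvalidOperationException("Letters must have a positive total weight.");
    }
}
```

With the vowel weights above the total is 381, so 'e' would be picked about 127 times out of 381, roughly a third of the time.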


Choosing parts:



private static string GeneratePart()
{
    var data = GeneratorData.Instance;
    int partTypeChoice = data.Randomizer.Next(8); // 0..7

    if (partTypeChoice == 0)
    {
        // v: 1 out of 8, to get fewer single vowels
        return data.GetWeightedLetter(data.Vowels);
    }
    else if (partTypeChoice < 4)
    {
        // cv: 3 out of 8 seems good
        return data.GetWeightedLetter(data.Consonants) +
               data.GetWeightedLetter(data.Vowels);
    }
    else
    {
        // cvc: the remaining 4 out of 8
        return data.GetWeightedLetter(data.Consonants) +
               data.GetWeightedLetter(data.Vowels) +
               data.GetWeightedLetter(data.Consonants);
    }
}
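
Gluing the parts into a word could then look something like the sketch below. The post does not say how many parts a word gets, so maxParts is a hypothetical parameter, and the part generator is passed in as a delegate to keep the example self-contained.

```csharp
using System;
using System.Text;

public static class WordBuilder
{
    // Concatenates a random number of parts, between 1 and maxParts inclusive.
    // maxParts is an assumption: the original range is not stated in the post.
    public static string GenerateWord(Func<string> generatePart, Random random, int maxParts)
    {
        int partCount = random.Next(1, maxParts + 1); // upper bound is exclusive
        var word = new StringBuilder();
        for (int i = 0; i < partCount; i++)
        {
            word.Append(generatePart());
        }
        return word.ToString();
    }
}
```

In the real program the delegate would be GeneratePart, and a text is just a number of such words joined with spaces.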

Creating some text returns this:

hix ridholo nilroclel ananlu lishavho sehehef casho tiba tendutose senoset tevsehro atota rignu getil hamoko votrof riheu deta oa reue sorfartim rocdino atehhi sesraw mit etos wereh ret teqe moletoh tata e hecladi dadug lepxu dehdon hey tifoi usencaw bi hepnis ita hemeleyna nohni wahi coyitif xor nahtan sohaa wenora no sateto tapef larnemacag tiyco cagsa yer husra ta fa

Isn't it beautiful?