I have implemented a simple Genetic Algorithm to generate short story based on Aesop fables. Here are the parameters I'm using:
Mutation: Single word swap mutation with tested rate with 0.01.
Crossover: Swap the story sentences at given point. rate - 0.7
Selection: Roulette wheel selection - https://stackoverflow.com/a/5315710/536474
Fitness function: 3 different function. highest score of each is 1.0. so total highest fitness score is 3.0.
Population size: Since I'm using 86 Aesop fables, I tested population size with 50.
Initial population: All 86 fable sentence orders are shuffled in order to make complete nonsense. And my goal is to generate something meaningful(at least at certain level) from these structure lost fables.
Stop Condition: 3000 generations. And the results are below:

However, this still did not produce a favorable result. I was expecting the plot that goes up over the generations. Any ideas to why my GA performing worse result?
Update: As all of you suggested, I've employed elitism by 10% of current generation copied to next generation. Result still remains the same: 

Probably I should use tournament selection.
All of the above responses are great and I'd look into them. I'll add my thoughts.
Mutation
Your mutation rate seems fine although with Genetic Algorithms mutation rate can cause a lot of issues if it's not right. I'd make sure you test a lot of other values to be sure.
With mutation I'd maybe use two types of mutation. One that replaces words with other from your dictionary, and one that swaps two words within a sentence. This would encourage diversifying the population as a whole, and shuffling words.
Crossover
I don't know exactly how you've implemented this but one-point crossover doesn't seem like it'll be that effective in this situation. I'd try to implement an n-point crossover, which will do a much better job of shuffling your sentences. Again, I'm not sure how it's implemented but just swapping may not be the best solution. For example, if a word is at the first point, is there ever any way for it to move to another position, or will it always be the first word if it's chosen by selection?
If word order is important for your chosen problem simple crossover may not be ideal.
Selection
Again, this seems fine but I'd make sure you test other options. In the past I've found rank based roulette selection to be a lot more successful.
Fitness
This is always the most important thing to consider in any genetic algorithm and with the complexity of problem you have I'd make doubly sure it works. Have you tested that it works with 'known' problems?
Population Size
Your value seems small but I have seen genetic algorithms work successfully with small populations. Again though, I'd experiment with much larger populations to see if your results are any better.
The most popular suggestion so far is to implement elitism and I'd definitely recommend it. It doesn't have to be much, even just the best couple of chromosome every generation (although as with everything else I'd try different values).
Another sometimes useful operator to implement is culling. Destroy a portion of your weakest chromosomes, or one that are similar to others (or both) and replace them with new chromosomes. This should help to stop your population going 'stale', which, from your graph looks like it might be happening. Mutation only does so much to diversify the population.
You may be losing the best combinations, you should keep the best of each generation without crossing(elite). Also, your function seems to be quite stable, try other types of mutations, that should improve.
Drop 5% to 10% of your population to be elite, so that you don't lose the best you have.
Make sure your selection process is well set up, if bad candidates are passing through very often it'll ruin your evolution. You might also be stuck in a local optimum, you might need to introduce other stuff into your genome, otherwise you wont move far.
Moving sentences and words around will not probably get you very far, introducing new sentences or words might be interesting.
If you think of story as a point x,y and your evaluation function as f(x,y), and you're trying to find the max for f(x,y), but your mutation and cross-over are limited to x -> y, y ->y, it makes sense that you wont move far. Granted, in your problem there is a lot more variables, but without introducing something new, I don't think you can avoid locality.
As @GettnDer said, elitism might help a lot.
What I would suggest is to use different selection strategy. The roulette wheel selection has one big problem: imagine that the best indidivual's fitness is e.g. 90% of the sum of all fitnesses. Then the roulette wheel is not likely to select the other individuals (see e.g. here). The selction strategy I like the most is the tournament selection. It is much more robust to big differences in fitness values and the selection pressure can be controlled very easily.
Novelty Search
I would also give a try to Novelty Search. It's relatively new approach in evolutionary computation, where you don't do the selection based on the actual fitness but rather based on novelty which is supposed to be some metric of how an individual is different in its behaviour from the others (but you still compute the fitness to catch the good ones). Of special interest might be combinations of classical fitness-driven algorithms and novelty-driven ones, like the this one by J.-B. Mouret.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With