
Algorithms to identify Markov generated content?

Markov chains are an (almost standard) way to generate random gibberish that looks intelligent to the untrained eye. How would you go about distinguishing Markov-generated text from human-written text?

It would be awesome if the resources you point to are Python friendly.

asked Jul 26 '09 by agiliq


People also ask

What is Markov chain algorithm?

A Markov chain is a systematic method for generating a sequence of random variables in which the current value depends probabilistically on the value of the prior variable. Specifically, the choice of the next variable depends only on the last variable in the chain.
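For illustration, here is a minimal word-level Markov chain text generator in Python; the tiny corpus and the order-1 (bigram) model are assumptions chosen only to keep the sketch short:

```python
# A minimal sketch of a word-level Markov chain text generator.
# The corpus and the order-1 model are illustrative assumptions.
import random
from collections import defaultdict

corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Build transition lists: word -> words observed to follow it.
transitions = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    transitions[current].append(nxt)

def generate(start, length=8):
    """Walk the chain: each next word depends only on the current word."""
    word, output = start, [start]
    for _ in range(length - 1):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the cat slept on the mat and the"
```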

What is Markov model in NLP?

For NLP, a Markov chain can be used to generate a sequence of words that form a complete sentence, or a hidden Markov model can be used for named-entity recognition and tagging parts of speech. For machine learning, Markov decision processes are used to represent reward in reinforcement learning.

What are hidden Markov models and where this model is used in AI?

A Hidden Markov Model (HMM) is a statistical model that is also used in machine learning. It describes the evolution of observable events that depend on internal factors which are not directly observable.
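As a small sketch of that idea, here is a toy HMM in Python where hidden states (weather) drive observable events (activities); all of the probabilities and labels are assumptions made up for the example:

```python
# Toy HMM: hidden states are not observed, only the emitted activities are.
states = ["Rainy", "Sunny"]               # hidden states
start_p = {"Rainy": 0.6, "Sunny": 0.4}    # initial state probabilities
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def likelihood(obs_seq):
    """Forward algorithm: probability of seeing the observation sequence."""
    forward = {s: start_p[s] * emit_p[s][obs_seq[0]] for s in states}
    for obs in obs_seq[1:]:
        forward = {s: sum(forward[prev] * trans_p[prev][s] for prev in states)
                      * emit_p[s][obs]
                   for s in states}
    return sum(forward.values())

print(likelihood(["walk", "shop", "clean"]))  # ~0.0336
```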

How can you tell if a chain is Markov?

To determine if a Markov chain is regular, we examine its transition matrix T and its powers T^n. If we find any power n for which T^n has only positive entries (no zero entries), then we know the Markov chain is regular and is guaranteed to reach a state of equilibrium in the long run.
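A short NumPy sketch of that check (the example transition matrix is an assumption chosen for illustration):

```python
# Check regularity: does some power of T have strictly positive entries?
import numpy as np

T = np.array([[0.0, 1.0],
              [0.5, 0.5]])  # rows sum to 1

def is_regular(T, max_power=50):
    """Return True if some power T^n has no zero entries."""
    P = np.eye(len(T))
    for _ in range(max_power):
        P = P @ T
        if np.all(P > 0):
            return True
    return False

print(is_regular(T))  # True: T^2 already has only positive entries
```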


1 Answer

One simple approach would be to have a large group of humans read the input text for you and see if it makes sense. I'm only half-joking; this is a tricky problem.

I believe this to be a hard problem, because Markov-chain-generated text will have many of the same properties as real human text in terms of word frequencies and the simple, local relationships between the ordering of words.

The differences between real text and text generated by a Markov chain are in higher-level rules of grammar and in semantic meaning, which are hard to encode programmatically. The other problem is that Markov chains are good enough at generating text that they sometimes come up with grammatically and semantically correct statements.
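To make that point concrete, here is a rough sketch of the kind of local statistic an order-1 Markov generator tends to preserve (bigram patterns) versus a longer-range one it tends to drift on (trigram patterns). This is only an illustration of the idea, not a method the answer endorses, and the example texts are made up:

```python
# Compare how many of a candidate text's n-grams also occur in a reference
# text. A low-order Markov generator trained on the reference will tend to
# match bigrams well but drift on trigrams; human text behaves differently.
def ngrams(words, n):
    return list(zip(*(words[i:] for i in range(n))))

def overlap(candidate, reference, n):
    """Fraction of the candidate's n-grams that appear in the reference."""
    cand = ngrams(candidate.split(), n)
    ref = set(ngrams(reference.split(), n))
    return sum(g in ref for g in cand) / max(len(cand), 1)

reference = "the human will is free and nature is a chain of causes"
candidate = "the human will is a chain of causes and nature is free"

print(overlap(candidate, reference, 2))  # bigram overlap (high, ~0.91)
print(overlap(candidate, reference, 3))  # trigram overlap (lower, ~0.60)
```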

As an example, here's an aphorism from the kantmachine:

Today, he would feel convinced that the human will is free; to-morrow, considering the indissoluble chain of nature, he would look on freedom as a mere illusion and declare nature to be all-in-all.

While this string was written by a computer program, it's hard to say that a human would never say this.

I think that unless you can give us more specific details about the computer- and human-generated text that expose more obvious differences, it will be difficult to solve this with a program.

answered Sep 30 '22 by James Thompson