I am looking for a way to test a particular string to determine if it contains code.
For instance, I would like to pass a string such as "body{font-weight: bold;}" and determine that it is CSS.
I would like to do it for:
HTML, CSS, JavaScript, Ruby, C, C++, C#
I am guessing that it would be a regex of some sort, but I am pretty stumped!
You need some kind of classifier that uses a heuristic/statistical approach. The accuracy will be better if the input string is longer (e.g. it's hard to say what language a lone "=" belongs to).
Here's an example of a classifier that uses Bayesian methods: http://www.rubyinside.com/sourceclassifier-identifying-programming-languages-quickly-1431.html
The highlight.js script does language detection in JavaScript. Take a look at the source.
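To give a rough idea of what a Bayesian classifier does here, below is a minimal naive Bayes sketch in Python. It is not the code from the linked project; the class name `NaiveBayesLanguageGuesser`, the tokenizer, and the tiny `SAMPLES` training set are all made up for illustration, and it assumes a uniform prior over languages. A usable classifier would be trained on many real source files per language.

```python
import math
import re
from collections import Counter, defaultdict

# Words and single punctuation characters become tokens; digits are ignored.
TOKEN_RE = re.compile(r"[A-Za-z_]+|[^\sA-Za-z_0-9]")


def tokenize(text):
    return TOKEN_RE.findall(text)


class NaiveBayesLanguageGuesser:
    """Hypothetical toy classifier: multinomial naive Bayes over tokens."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # language -> token counts
        self.vocabulary = set()

    def train(self, language, sample):
        tokens = tokenize(sample)
        self.token_counts[language].update(tokens)
        self.vocabulary.update(tokens)

    def scores(self, text):
        tokens = tokenize(text)
        vocab_size = len(self.vocabulary)
        result = {}
        for language, counts in self.token_counts.items():
            total = sum(counts.values())
            # Sum of log-likelihoods with add-one (Laplace) smoothing,
            # so unseen tokens do not zero out the whole score.
            result[language] = sum(
                math.log((counts[tok] + 1) / (total + vocab_size))
                for tok in tokens
            )
        return result

    def guess(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Made-up training snippets, far too small for real use.
SAMPLES = {
    "CSS": "body { font-weight: bold; } a:hover { color: red; margin: 0 auto; } "
           ".nav { font-size: 12px; }",
    "JavaScript": "function add(a, b) { return a + b; } "
                  "var x = document.getElementById('y'); console.log(x);",
    "Ruby": "def add(a, b)\n  a + b\nend\nputs add(1, 2)\n"
            "class Foo\n  attr_reader :bar\nend",
}

guesser = NaiveBayesLanguageGuesser()
for language, sample in SAMPLES.items():
    guesser.train(language, sample)

print(guesser.guess("body{font-weight: bold;}"))
```

With so little training data the guess is only as good as the token overlap, which is the same reason short inputs are hard to classify: fewer tokens means less evidence to separate the languages.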