I can't find anything other than closed-source web applications. Are there any active projects? I'd be interested in using the software in something I'm developing and getting involved.
Since you're assuming two categories, almost any classifier will probably do ok. Some suggestions:
As an earlier commenter said, start from a known sample of text (and there should be plenty... newspaper corpora might be good), then train and classify on some reasonable attributes (maybe presence / absence of words or word pairs).
This one should be (comparatively) easy.
If you're using Python, even something as simple as the Natural Language Toolkit (cf: nltk.org) and their book should get you a long way there.
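To make the "presence / absence of words or word pairs" idea concrete, here is a minimal sketch of a naive Bayes classifier in plain Python (standard library only, not NLTK's own API). The class name `PresenceNaiveBayes`, the helper `features`, the labels `"A"` and `"B"`, and the toy sentences are all made up for illustration; real training data would come from a labeled corpus. As a simplification, it scores only the features that are present in the text being classified.

```python
import math
import re
from collections import Counter, defaultdict


def features(text):
    """Return the set of lowercase words and adjacent word pairs found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    pairs = {f"{a} {b}" for a, b in zip(words, words[1:])}
    return set(words) | pairs


class PresenceNaiveBayes:
    """Hypothetical two-or-more-class classifier over presence features."""

    def __init__(self):
        self.doc_counts = Counter()                  # label -> number of training texts
        self.feature_counts = defaultdict(Counter)   # label -> feature -> texts containing it

    def train(self, labeled_texts):
        for text, label in labeled_texts:
            self.doc_counts[label] += 1
            self.feature_counts[label].update(features(text))

    def classify(self, text):
        total = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Start from the log prior of the label.
            score = math.log(n_docs / total)
            for feat in features(text):
                # Laplace-smoothed estimate of P(feature present | label).
                seen = self.feature_counts[label][feat]
                score += math.log((seen + 1) / (n_docs + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented training data; the labels stand in for the two categories.
training = [
    ("we watched the football match last night", "A"),
    ("the football season starts soon", "A"),
    ("i baked a lemon cake this morning", "B"),
    ("the cake recipe needs more lemon", "B"),
]

clf = PresenceNaiveBayes()
clf.train(training)
print(clf.classify("another football match"))
```

With a real corpus you would hold back part of the labeled data to measure accuracy before trusting the output.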
Here's another website that claims to do this: GenderAnalyzer. However, it relies on another website, uClassify.com, which is down as I write this. They have a contact link at the bottom for questions.
It sounds like an academic outfit: "In our lab it seems to works pretty well".
There are applications like "The Gender Genie" which work with a reasonable degree of success, particularly on longer texts: http://bookblog.net/gender/genie.php
It doesn't need to be entirely successful. I would have huge amounts of data to deal with, and it's mostly just for fun.
If anyone knows of anything, please do share.
Richard