Given, say, a recipe (a list of ingredients, steps, etc.) in free-text form, how could I parse it so that I can pull out the ingredients (e.g. quantity, unit of measurement, ingredient name) using PHP?
Assume that the free text is somewhat formatted.
To do it 'properly', you need to define some sort of grammar, and then maybe use an LALR parser, or tools such as yacc, bison (parser generators) or Lex (a lexer generator), to build a parser. Assuming you don't want to do that, it's strpos() ftw!
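As a rough illustration of the quick-and-dirty route, here is a minimal sketch that parses one ingredient per line. It uses preg_match() rather than strpos(), since a single pattern can pull out quantity, unit and name together. The function name parseIngredientLine, the unit list, and the assumed "<quantity> <unit> <name>" line layout are all assumptions made for this example, not something given in the question.

```php
<?php
// Hypothetical helper: parses one ingredient line such as "2 cups flour".
// Assumes the line starts with a number, optionally followed by a known unit.
function parseIngredientLine(string $line): ?array
{
    // Illustrative unit list only; extend it for real recipes.
    $units = ['cups?', 'tbsp', 'tsp', 'g', 'kg', 'ml', 'l', 'oz', 'lbs?'];

    $pattern = '/^\s*(?<quantity>\d+(?:[.\/]\d+)?)\s*'          // 2, 1.5, 1/2
             . '(?:(?<unit>' . implode('|', $units) . ')\b\.?\s+)?' // optional unit
             . '(?<name>.+?)\s*$/i';                              // rest of the line

    if (!preg_match($pattern, $line, $m)) {
        return null; // the line does not look like an ingredient
    }

    return [
        'quantity' => $m['quantity'],
        'unit'     => $m['unit'] !== '' ? strtolower($m['unit']) : null,
        'name'     => $m['name'],
    ];
}
```

Calling parseIngredientLine('1/2 tsp salt') should give quantity "1/2", unit "tsp" and name "salt", while a line with no leading number (a cooking step, say) returns null. Anything less regular than this layout is where a real grammar starts to pay off.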
There is OpenNLP in Java for named entity extraction, which can fetch what you are looking for. See: http://opennlp.sourceforge.net/models-1.5/
Then you can use a PHP-Java connector to get the results into PHP.
There's a very similar question for Java. In short, you need dictionaries (of, say, ingredients) and a regex-like language over terms (annotations). You can do it in Java and invoke it from PHP via a web service, or you can try to re-implement it in PHP (note that in the second case you may see a significant slowdown).
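To make the "dictionaries plus a regex over annotations" idea concrete, here is a small PHP sketch. Each token is tagged from a dictionary with a one-letter annotation, and an ordinary regex is then run over the tag string instead of the raw text. The dictionaries, the tag letters, and the function names tagToken and extractIngredients are all invented for this example; a real system would need far larger dictionaries.

```php
<?php
// Illustrative dictionaries (assumptions for this sketch).
const UNITS = ['cup', 'cups', 'tbsp', 'tsp', 'g', 'kg', 'ml'];
const INGREDIENTS = ['flour', 'sugar', 'butter', 'eggs', 'olive', 'oil', 'salt'];

// Annotate a token: Q = quantity, U = unit, I = ingredient word, O = other.
function tagToken(string $token): string
{
    $t = strtolower($token);
    if (preg_match('/^\d+(?:[.\/]\d+)?$/', $t)) {
        return 'Q';
    }
    if (in_array($t, UNITS, true)) {
        return 'U';
    }
    if (in_array($t, INGREDIENTS, true)) {
        return 'I';
    }
    return 'O';
}

// Find every "quantity, optional unit, one or more ingredient words" run.
function extractIngredients(string $text): array
{
    // Split on whitespace, then drop trailing punctuation from each token.
    $tokens = array_map(
        fn($t) => rtrim($t, '.,;:!?'),
        preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY)
    );
    // One tag character per token, so regex offsets are token indexes.
    $tags = implode('', array_map('tagToken', $tokens));

    preg_match_all('/Q(U?)(I+)/', $tags, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $found = [];
    foreach ($matches as $m) {
        $start = $m[0][1];            // index of the quantity token
        $hasUnit = $m[1][0] !== '';   // did the optional U tag match?
        $nameTokens = array_slice($tokens, $m[2][1], strlen($m[2][0]));
        $found[] = [
            'quantity' => $tokens[$start],
            'unit'     => $hasUnit ? strtolower($tokens[$start + 1]) : null,
            'name'     => strtolower(implode(' ', $nameTokens)),
        ];
    }
    return $found;
}
```

For the text "Mix 2 cups flour with 100 g butter and 3 eggs." the tag string is OQUIOQUIOQI, and the pattern picks out three ingredients: 2 cups flour, 100 g butter, and 3 eggs (with no unit). The appeal of this design is that the pattern stays tiny while the dictionaries carry the domain knowledge.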