Does anyone know of a Java library that will let me parse .PO files? I simply want to create a Map of IDs and Values so I can load them into a database.
Outside strings, blank lines and comments may be used freely. Comments start at the beginning of a line with '#' and extend until the end of the PO file line. Comments written by translators should have the initial '#' immediately followed by some white space.
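The comment rule quoted above can be sketched in a few lines of Java. This is only an illustration: the class and method names (PoComments, isComment, isTranslatorComment) are made up for this example and are not part of any existing library.

```java
// Hypothetical helper, not part of any library: classifies single PO lines.
final class PoComments {
    private PoComments() {}

    /** A PO comment has '#' in the first column and runs to the end of the line. */
    static boolean isComment(String line) {
        return line.startsWith("#");
    }

    /** Translator comments are '#' immediately followed by white space. */
    static boolean isTranslatorComment(String line) {
        return line.length() >= 2
                && line.charAt(0) == '#'
                && Character.isWhitespace(line.charAt(1));
    }
}
```

With this, "# checked by Ana" counts as a translator comment, while "#: src/main.c:12" or "#, fuzzy" are comments of another kind, because '#' is not followed by white space.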
A .PO file is a portable object file, which is text-based. These types of files are commonly used in software development. The .PO file may be referenced by Java programs, GNU gettext, or other software programs as a properties file.
I searched the Internet and couldn't find an existing library, either. If you use Scala, it's quite easy to write a parser yourself, thanks to its parser combinator feature.
Call PoParser.parsePo("po file content"). The result is a list of Translation.
I have made this code into a library (it can be used from any JVM language, including Java, of course!): https://github.com/ngocdaothanh/scaposer
    import scala.util.parsing.combinator.JavaTokenParsers

    trait Translation

    case class SingularTranslation(
      msgctxto: Option[String],
      msgid:    String,
      msgstr:   String) extends Translation

    case class PluralTranslation(
      msgctxto:    Option[String],
      msgid:       String,
      msgidPlural: String,
      msgstrNs:    Map[Int, String]) extends Translation

    // http://www.gnu.org/software/hello/manual/gettext/PO-Files.html
    object PoParser extends JavaTokenParsers {
      // Removes the first and last quote (") character of strings
      // and concats them.
      private def unquoted(quoteds: List[String]): String =
        quoteds.foldLeft("") { (acc, quoted) =>
          acc + quoted.substring(1, quoted.length - 1)
        }

      // Scala regex is single line by default
      private def comment = rep(regex("^#.*".r))

      private def msgctxt = "msgctxt" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgid = "msgid" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgidPlural = "msgid_plural" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgstr = "msgstr" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgstrN = "msgstr[" ~ wholeNumber ~ "]" ~ rep(stringLiteral) ^^ {
        case _ ~ number ~ _ ~ quoteds => (number.toInt, unquoted(quoteds))
      }

      private def singular =
        (opt(comment) ~ opt(msgctxt) ~
         opt(comment) ~ msgid ~
         opt(comment) ~ msgstr ~ opt(comment)) ^^ {
          case _ ~ ctxto ~ _ ~ id ~ _ ~ s ~ _ =>
            SingularTranslation(ctxto, id, s)
        }

      private def plural =
        (opt(comment) ~ opt(msgctxt) ~
         opt(comment) ~ msgid ~
         opt(comment) ~ msgidPlural ~
         opt(comment) ~ rep(msgstrN) ~ opt(comment)) ^^ {
          case _ ~ ctxto ~ _ ~ id ~ _ ~ idp ~ _ ~ tuple2s ~ _ =>
            PluralTranslation(ctxto, id, idp, tuple2s.toMap)
        }

      private def exp = rep(singular | plural)

      def parsePo(po: String): List[Translation] = {
        val parseRet = parseAll(exp, po)
        if (parseRet.successful) parseRet.get else Nil
      }
    }
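If all you need is the Map of IDs and values from the question, and you would rather stay in plain Java with no dependencies, a line-based reader may be enough. The following is a minimal sketch, not a full PO parser. It makes several assumptions: the class name SimplePoReader is made up for this example, msgctxt and plural entries (msgid_plural, msgstr[N]) are skipped, the header entry with the empty msgid is dropped, and only the \n, \t, \" and \\ escapes are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: reads singular msgid/msgstr pairs from PO text.
final class SimplePoReader {
    private SimplePoReader() {}

    static Map<String, String> parse(String po) {
        Map<String, String> result = new LinkedHashMap<>();
        StringBuilder id = null;     // msgid of the current entry, if any
        StringBuilder str = null;    // msgstr of the current entry, if any
        StringBuilder target = null; // where continuation strings are appended

        for (String raw : po.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank lines and comments carry no translation data
            }
            if (line.startsWith("msgid ")) {
                store(result, id, str); // a new entry starts, flush the old one
                id = new StringBuilder(unquote(line.substring(6)));
                str = null;
                target = id;
            } else if (line.startsWith("msgstr ")) {
                str = new StringBuilder(unquote(line.substring(7)));
                target = str;
            } else if (line.startsWith("\"") && target != null) {
                target.append(unquote(line)); // multi-line string continuation
            } else {
                target = null; // msgctxt, msgid_plural, msgstr[N]: not handled here
            }
        }
        store(result, id, str);
        return result;
    }

    // Keeps an entry only if it has both parts and a non-empty msgid.
    private static void store(Map<String, String> result,
                              StringBuilder id, StringBuilder str) {
        if (id != null && str != null && id.length() > 0) {
            result.put(id.toString(), str.toString());
        }
    }

    // Strips the surrounding quotes and resolves simple backslash escapes.
    private static String unquote(String s) {
        String t = s.trim();
        if (t.length() >= 2 && t.startsWith("\"") && t.endsWith("\"")) {
            t = t.substring(1, t.length() - 1);
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < t.length(); i++) {
            char c = t.charAt(i);
            if (c == '\\' && i + 1 < t.length()) {
                char next = t.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 't': out.append('\t'); break;
                    default:  out.append(next); // covers \" and \\
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Call SimplePoReader.parse(content) with the text of the file and iterate over the returned map to fill your database table. Because the map is keyed by msgid alone, two entries that differ only in msgctxt would overwrite each other, so use the Scala parser above (or the library) if your files rely on contexts or plurals.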