
Is there a Java library for parsing gettext PO files? [closed]


Does anyone know of a Java library that can parse gettext .po files? I simply want to build a Map of message IDs to translated values so I can load them into a database.

Mike Sickler asked Jan 08 '11 19:01




1 Answer

I searched the Internet and couldn't find an existing library, either. If you use Scala, it's quite easy to write a parser yourself, thanks to its parser combinator feature.

Call PoParser.parsePo("po file content"). The result is a list of Translation objects, or an empty list if the input cannot be parsed.

I have turned this code into a library that can be used from any JVM language, including Java: https://github.com/ngocdaothanh/scaposer

    import scala.util.parsing.combinator.JavaTokenParsers

    trait Translation

    case class SingularTranslation(
      msgctxto: Option[String],
      msgid:    String,
      msgstr:   String) extends Translation

    case class PluralTranslation(
      msgctxto:    Option[String],
      msgid:       String,
      msgidPlural: String,
      msgstrNs:    Map[Int, String]) extends Translation

    // http://www.gnu.org/software/hello/manual/gettext/PO-Files.html
    object PoParser extends JavaTokenParsers {
      // Removes the first and last quote (") character of strings
      // and concats them.
      private def unquoted(quoteds: List[String]): String =
        quoteds.foldLeft("") { (acc, quoted) =>
          acc + quoted.substring(1, quoted.length - 1)
        }

      // Scala regex is single line by default
      private def comment = rep(regex("^#.*".r))

      private def msgctxt = "msgctxt" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgid = "msgid" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgidPlural = "msgid_plural" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgstr = "msgstr" ~ rep(stringLiteral) ^^ {
        case _ ~ quoteds => unquoted(quoteds)
      }

      private def msgstrN = "msgstr[" ~ wholeNumber ~ "]" ~ rep(stringLiteral) ^^ {
        case _ ~ number ~ _ ~ quoteds => (number.toInt, unquoted(quoteds))
      }

      private def singular =
        (opt(comment) ~ opt(msgctxt) ~
         opt(comment) ~ msgid ~
         opt(comment) ~ msgstr ~ opt(comment)) ^^ {
        case _ ~ ctxto ~ _ ~ id ~ _ ~ s ~ _ =>
          SingularTranslation(ctxto, id, s)
      }

      private def plural =
        (opt(comment) ~ opt(msgctxt) ~
         opt(comment) ~ msgid ~
         opt(comment) ~ msgidPlural ~
         opt(comment) ~ rep(msgstrN) ~ opt(comment)) ^^ {
        case _ ~ ctxto ~ _ ~ id ~ _ ~ idp ~ _ ~ tuple2s ~ _ =>
          PluralTranslation(ctxto, id, idp, tuple2s.toMap)
      }

      private def exp = rep(singular | plural)

      def parsePo(po: String): List[Translation] = {
        val parseRet = parseAll(exp, po)
        if (parseRet.successful) parseRet.get else Nil
      }
    }
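
To get from the parser's result to the Map of IDs and values that the question asks for, something along these lines might work. This is only a sketch: `PoToMap` and `toIdValueMap` are hypothetical names that are not part of the code above or of scaposer, and the three type definitions are repeated only so the snippet compiles on its own. It assumes you want a single value per msgid, so it keeps just plural form 0, ignores msgctxt, and drops the header entry (the one whose msgid is empty).

```scala
// Type definitions repeated from the answer so this compiles standalone.
trait Translation

case class SingularTranslation(
  msgctxto: Option[String],
  msgid:    String,
  msgstr:   String) extends Translation

case class PluralTranslation(
  msgctxto:    Option[String],
  msgid:       String,
  msgidPlural: String,
  msgstrNs:    Map[Int, String]) extends Translation

// Hypothetical helper, not part of the answer's code or of scaposer.
object PoToMap {
  def toIdValueMap(translations: List[Translation]): Map[String, String] =
    translations.collect {
      // Skip the PO header entry, which has an empty msgid.
      case SingularTranslation(_, id, str) if id.nonEmpty =>
        id -> str
      // Assumption: only plural form 0 is kept; the other forms are discarded.
      case PluralTranslation(_, id, _, strs) if id.nonEmpty =>
        id -> strs.getOrElse(0, "")
    }.toMap
}

object PoToMapDemo {
  def main(args: Array[String]): Unit = {
    // In real use this list would come from PoParser.parsePo(poContent).
    val translations: List[Translation] = List(
      SingularTranslation(None, "", "Content-Type: text/plain; charset=UTF-8"),
      SingularTranslation(None, "Hello", "Bonjour"),
      PluralTranslation(None, "file", "files", Map(0 -> "fichier", 1 -> "fichiers"))
    )
    // Contains Hello -> Bonjour and file -> fichier; the header entry is dropped.
    println(PoToMap.toIdValueMap(translations))
  }
}
```

If two entries share a msgid but differ in msgctxt, the later one silently overwrites the earlier one here; keying the Map on a (msgctxt, msgid) pair would avoid that. From Java, a Scala Map would also need converting (for example with Scala's Java collection converters) before use.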
Ngoc Dao answered Oct 13 '22 01:10