Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

JPA Database structure for internationalisation

Tags:

java

jpa

I am trying to get a JPA implementation of a simple approach to internationalisation. I want to have a table of translated strings that I can reference in multiple fields in multiple tables. So all text occurrences in all tables will be replaced by a reference to the translated strings table. In combination with a language id, this would give a unique row in the translated strings table for that particular field. For example, consider a schema that has entities Course and Module as follows :-

Course int course_id, int name, int description

Module int module_id, int name

The course.name, course.description and module.name are all referencing the id field of the translated strings table :-

TranslatedString int id, String lang, String content

That all seems simple enough. I get one table for all strings that could be internationalised and that table is used across all the other tables.

How might I do this in JPA, using eclipselink 2.4?

I've looked at embedded ElementCollection, ala this... JPA 2.0: Mapping a Map - it isn't exactly what i'm after cos it looks like it is relating the translated strings table to the pk of the owning table. This means I can only have one translatable string field per entity (unless I add new join columns into the translatable strings table, which defeats the point, its the opposite of what I am trying to do). I'm also not clear on how this would work across entites, presumably the id of each entity would have to use a database wide sequence to ensure uniqueness of the translatable strings table.

BTW, I tried the example as laid out in that link and it didn't work for me - as soon as the entity had a localizedString map added, persisting it caused the client side to bomb but no obvious error on the server side and nothing persisted in the DB :S

I been around the houses on this about 9 hours so far, I've looked at this Internationalization with Hibernate which appears to be trying to do the same thing as the link above (without the table definitions it hard to see what he achieved). Any help would be gratefully achieved at this point...

Edit 1 - re AMS anwser below, I'm not sure that really addresses the issue. In his example it leaves the storing of the description text to some other process. The idea of this type of approach is that the entity object takes the text and locale and this (somehow!) ends up in the translatable strings table. In the first link I gave, the guy is attempting to do this by using an embedded map, which I feel is the right approach. His way though has two issues - one it doesn't seem to work! and two if it did work, it is storing the FK in the embedded table instead of the other way round (I think, I can't get it to run so I can't see exactly how it persists). I suspect the correct approach ends up with a map reference in place of each text that needs translating (the map being locale->content), but I can't see how to do this in a way that allows for multiple maps in one entity (without having corresponding multiple columns in the translatable strings table)...

like image 845
IrishDubGuy Avatar asked Nov 17 '12 00:11

IrishDubGuy


2 Answers

(I'm Henno who replied to hwellman's blog.) My initial approach was very similar to your approach and it does the job. It meets the requirement that any field from any entity can reference a localized String Map with a general database table that does not have to reference other more concrete tables. Indeed I also use it for multiple fields in our Product entity (name, description, details). I also had the "problem" that JPA generated a table with only a primary key column and a table for the values that referenced this id. With OpenJPA I had no need for a dummy column:

public class StringI18N {

    @OneToMany(mappedBy = "parent", cascade = ALL, fetch = EAGER, orphanRemoval = true)
    @MapKey(name = "locale")
    private Map<Locale, StringI18NSingleValue> strings = new HashMap<Locale, StringI18NSingleValue();
...

OpenJPA simply stores Locale as a String. Because we don't really need an extra entity StringI18NSingleValue so I think your mapping using @ElementCollection is a bit more elegant.

There is an issue you have to be aware of though: do you allow sharing a Localised with multiple entities, and how do you prevent orphaned Localised entities when the owning entity is removed? Simply using cascade all is not sufficient. I decided to see a Localised as much as possible as a "value object" and not allow it to be shared with other entities so that we don't have to think about multiple references to the same Localised and we can safely use orphan removal. So my Localised fields are mapped like:

@OneToOne(cascade = ALL, orphanRemoval = true)

Depending on my use case I also use fetch = EAGER/LAZY and optional = false or true. When using optional = false I use @JoinColumn(nullable=false) so OpenJPA generates a not null constraint on the join column.

Whenever I do need to copy a Localized to another entity, I do not use the same reference but I create a new Localized instance with the same contents and no id yet. Otherwise you may get hard to debug problems where changinIf you don't do this you are still sharing an instance with multiple entities and you may get surprising bugs where changing a Localised String can change another String at another entity.

So far so good, however in practice I found that OpenJPA has N+1 select problems when selecting entities that contain one or more Localized Strings. It does not efficiently fetch an element collection (I reported this as https://issues.apache.org/jira/browse/OPENJPA-1920). That problem is probably solved by using a Map<Locale, StringI18NSingleValue>. However OpenJPA can also not efficiently fetch structures of the form A 1..1 B 1..* C which is also what happens here (I reported this as https://issues.apache.org/jira/browse/OPENJPA-2296). This can seriously affect the performance of your application.

Other JPA providers may have similar N+1 select problems. If the performance of fetching Category is of concern to you, I would check whether or not the number of queries used for fetching Category depends on the number of entities. I know that with Hibernate you can force batch fetching or subselect to solve these kind of problems. I also know EclipseLink has similar features that may or may not work.

Out of desperation to solve this performance issue I actually had to accept living with a design I don't really like: I simply added a String field for each language I had to support to the Localised. For us this is possible because we currently only need to support a few languages. This resulted in only one (denormalized) Localised table. JPA can then efficiently join the Localised table in queries, but this will not scale well for many languages and does not support an arbitrary number of languages. For maintainability I kept the external interface of Localised the same and only changed the implementation from a Map to a field-per-language so that we may easily switch back in the future.

like image 112
Henno Vermeulen Avatar answered Oct 13 '22 18:10

Henno Vermeulen


OK, I think I have it. It looks like a simplified version of the first link in my question will work, just using a ManyToOne relationship to a Localised entity (with a different joinColumn for each text element in your main entity) and a simple ElementCollection for the Map within that Localised entity. I coded a slightly different example than my question, with just one entity (Category), having two text elements that need multiple entries for each locale (name and description).

Note this was done against Eclipselink 2.4 going to MySQL.

Two notes about this approach - as you can see in the first link, using ElementCollection forces a separate table to be created, which results in two tables for the translatable strings - one just holds the ID (Locaised) that is the FK in the main one (Localised_strings) that holds all the Map info. The name Localised_strings is the automatic/default name - you can use another one with the @CollectionTable annotation. Overall, this isn't ideal from a DB point of view but not the end of the world.

Second is that, at least for my combination of Eclipselink and MySQL, persisting to a single (auto generated) column table gives an error :( So i've added in a dummy column w a default value in the entity, this is purely to overcome that issue.

import java.io.Serializable;
import java.lang.Long;
import java.lang.String;
import java.util.HashMap;
import java.util.Map;

import javax.persistence.*;


@Entity

public class Category implements Serializable {

@GeneratedValue(strategy = GenerationType.IDENTITY)
@Id
private Long id;
@ManyToOne(cascade=CascadeType.ALL)
@JoinColumn(name="NAME_ID")
private Localised nameStrings = new Localised();

@ManyToOne(cascade=CascadeType.ALL)
@JoinColumn(name="DESCRIPTION_ID")
private Localised descriptionStrings = new Localised();

private static final long serialVersionUID = 1L;

public Category() {

    super();
}  

public Category(String locale, String name, String description){
    this.nameStrings.addString(locale, name);
    this.descriptionStrings.addString(locale, description);
}
public Long getId() {
    return this.id;
}

public void setId(Long id) {
    this.id = id;
}   

public String getName(String locale) {
    return this.nameStrings.getString(locale);
}

public void setName(String locale, String name) {
    this.nameStrings.addString(locale, name);
}
public String getDescription(String locale) {
    return this.descriptionStrings.getString(locale);
}

public void setDescription(String locale, String description) {
    this.descriptionStrings.addString(locale, description);
}

}




import java.util.HashMap;
import java.util.Map;

import javax.persistence.ElementCollection;
import javax.persistence.Embeddable;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Localised {

    @Id @GeneratedValue(strategy=GenerationType.IDENTITY)
    private int id;
    private int dummy = 0;
    @ElementCollection
    private Map<String,String> strings = new HashMap<String, String>();

    //private String locale;    
    //private String text;

    public Localised() {}

    public Localised(Map<String, String> map) {
        this.strings = map;
    }

    public void addString(String locale, String text) {
        strings.put(locale, text);
    }

    public String getString(String locale) {
        String returnValue = strings.get(locale);
        return (returnValue != null ? returnValue : null);
    }

}

So these generate tables as follows :-

CREATE TABLE LOCALISED (ID INTEGER AUTO_INCREMENT NOT NULL, DUMMY INTEGER, PRIMARY KEY (ID))
CREATE TABLE CATEGORY (ID BIGINT AUTO_INCREMENT NOT NULL, DESCRIPTION_ID INTEGER, NAME_ID INTEGER, PRIMARY KEY (ID))
CREATE TABLE Localised_STRINGS (Localised_ID INTEGER, STRINGS VARCHAR(255), STRINGS_KEY VARCHAR(255))
ALTER TABLE CATEGORY ADD CONSTRAINT FK_CATEGORY_DESCRIPTION_ID FOREIGN KEY (DESCRIPTION_ID) REFERENCES LOCALISED (ID)
ALTER TABLE CATEGORY ADD CONSTRAINT FK_CATEGORY_NAME_ID FOREIGN KEY (NAME_ID) REFERENCES LOCALISED (ID)
ALTER TABLE Localised_STRINGS ADD CONSTRAINT FK_Localised_STRINGS_Localised_ID FOREIGN KEY (Localised_ID) REFERENCES LOCALISED (ID)

A Main to test it...

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.persistence.Query;

public class Main {
  static EntityManagerFactory emf = Persistence.createEntityManagerFactory("javaNetPU");
  static EntityManager em = emf.createEntityManager();

  public static void main(String[] a) throws Exception {
    em.getTransaction().begin();


    Category category = new Category();

    em.persist(category);

    category.setName("EN", "Business");
    category.setDescription("EN", "This is the business category");


    category.setName("FR", "La Business");
    category.setDescription("FR", "Ici es la Business");

    em.flush();

    System.out.println(category.getDescription("EN"));
    System.out.println(category.getName("FR"));


    Category c2 = new Category();
    em.persist(c2);

    c2.setDescription("EN", "Second Description");
    c2.setName("EN", "Second Name");

    c2.setDescription("DE", "Zwei  Description");
    c2.setName("DE", "Zwei  Name");

    em.flush();


    //em.remove(category);


    em.getTransaction().commit();
    em.close();
    emf.close();

  }
}

This produces output :-

This is the business category
La Business

and the following table entries :-

Category
"ID"    "DESCRIPTION_ID"    "NAME_ID"
"1"         "1"                 "2"
"2"         "3"                 "4"

Localised
"ID"    "DUMMY"
"1"         "0"
"2"         "0"
"3"         "0"
"4"         "0"

Localised_strings

"Localised_ID"  "STRINGS"                        "STRINGS_KEY"
"1"                 "Ici es la Business"                 "FR"
"1"                 "This is the business category"      "EN"
"2"                 "La Business"                        "FR"
"2"                 "Business"                       "EN"
"3"                 "Second Description"                 "EN"
"3"                 "Zwei  Description"              "DE"
"4"                 "Second Name"                        "EN"
"4"                 "Zwei  Name"                         "DE"

Uncommenting the em.remove correctly deletes both the Category and it's associated Locaised/Localised_strings entries.

Hope that all helps someone in the future.

like image 35
IrishDubGuy Avatar answered Oct 13 '22 18:10

IrishDubGuy