Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow chunk response in Play 2.2

In my play-framework-based web application users can download all the rows of different database tables in csv or json format. Tables are relatively large (100k+ rows) and I am trying to stream back the result using chunking in Play 2.2.

However the problem is although println statements shows that the rows get written to the Chunks.Out object, they do not show up in the client side! If I limit the rows getting sent back it will work, but it also has a big delay in the beginning which gets bigger if I try to send back all the rows and causes a time-out or the server runs out of memory.

I use Ebean ORM and the tables are indexed and querying from psql doesn't take much time. Does anyone have any idea what might be the problem?

I appreciate your help a lot!

Here is the code for one of the controllers:

@SecureSocial.UserAwareAction
public static Result showEpex() {

    User user = getUser();
    if(user == null || user.getRole() == null)
        return ok(views.html.profile.render(user, Application.NOT_CONFIRMED_MSG));

    DynamicForm form = DynamicForm.form().bindFromRequest();
    final UserRequest req = UserRequest.getRequest(form);

    if(req.getFormat().equalsIgnoreCase("html")) {
        Page<EpexEntry> page = EpexEntry.page(req.getStart(), req.getFinish(), req.getPage());
        return ok(views.html.epex.render(page, req));
    }

    // otherwise chunk result and send back
    final ResultStreamer<EpexEntry> streamer = new ResultStreamer<EpexEntry>();
    Chunks<String> chunks = new StringChunks() {
            @Override
            public void onReady(play.mvc.Results.Chunks.Out<String> out) {

                Page<EpexEntry> page = EpexEntry.page(req.getStart(), req.getFinish(), 0);
                ResultStreamer<EpexEntry> streamer = new ResultStreamer<EpexEntry>();
                streamer.stream(out, page, req);
            }
    };
    return ok(chunks).as("text/plain");
}

And the streamer:

public class ResultStreamer<T extends Entry> {

private static ALogger logger = Logger.of(ResultStreamer.class);

public void stream(Out<String> out, Page<T> page, UserRequest req) {

    if(req.getFormat().equalsIgnoreCase("json")) {
        JsonContext context = Ebean.createJsonContext();
        out.write("[\n");
        for(T e: page.getList())
            out.write(context.toJsonString(e) + ", ");
        while(page.hasNext()) {
            page = page.next();
            for(T e: page.getList())
                out.write(context.toJsonString(e) + ", ");
        }
        out.write("]\n");
        out.close();
    } else if(req.getFormat().equalsIgnoreCase("csv")) {
        for(T e: page.getList())
            out.write(e.toCsv(CSV_SEPARATOR) + "\n");
        while(page.hasNext()) {
            page = page.next();
            for(T e: page.getList())
                out.write(e.toCsv(CSV_SEPARATOR) + "\n");
        }
        out.close();
    }else {
        out.write("Invalid format! Only CSV, JSON and HTML can be generated!");
        out.close();
    }
}


public static final String CSV_SEPARATOR = ";";
} 

And the model:

@Entity
@Table(name="epex")
public class EpexEntry extends Model implements Entry {

    @Id
    @Column(columnDefinition = "pg-uuid")
    private UUID id;
    private DateTime start;
    private DateTime finish;
    private String contract;
    private String market;
    private Double low;
    private Double high;
    private Double last;
    @Column(name="weight_avg")
    private Double weightAverage;
    private Double index;
    private Double buyVol;
    private Double sellVol;

    private static final String START_COL = "start";
    private static final String FINISH_COL = "finish";
    private static final String CONTRACT_COL = "contract";
    private static final String MARKET_COL = "market";
    private static final String ORDER_BY = MARKET_COL + "," + CONTRACT_COL + "," + START_COL;

    public static final int PAGE_SIZE = 100;

    public static final String HOURLY_CONTRACT = "hourly";
    public static final String MIN15_CONTRACT = "15min";

    public static final String FRANCE_MARKET = "france";
    public static final String GER_AUS_MARKET = "germany/austria";
    public static final String SWISS_MARKET = "switzerland";

    public static Finder<UUID, EpexEntry> find = 
            new Finder(UUID.class, EpexEntry.class);

    public EpexEntry() {
    }

    public EpexEntry(UUID id, DateTime start, DateTime finish, String contract,
            String market, Double low, Double high, Double last,
            Double weightAverage, Double index, Double buyVol, Double sellVol) {
        this.id = id;
        this.start = start;
        this.finish = finish;
        this.contract = contract;
        this.market = market;
        this.low = low;
        this.high = high;
        this.last = last;
        this.weightAverage = weightAverage;
        this.index = index;
        this.buyVol = buyVol;
        this.sellVol = sellVol;
    }

    public static Page<EpexEntry> page(DateTime from, DateTime to, int page) {

        if(from == null && to == null)
            return find.order(ORDER_BY).findPagingList(PAGE_SIZE).getPage(page);
        ExpressionList<EpexEntry> exp = find.where();
        if(from != null)
            exp = exp.ge(START_COL, from);
        if(to != null)
            exp = exp.le(FINISH_COL, to.plusHours(24));
        return exp.order(ORDER_BY).findPagingList(PAGE_SIZE).getPage(page);
    }

    @Override
    public String toCsv(String s) {
        return id + s + start + s + finish + s + contract + 
                s + market + s + low + s + high + s + 
                last + s + weightAverage + s + 
                index + s + buyVol + s + sellVol;   
    }
like image 824
p00ya00 Avatar asked Oct 09 '13 22:10

p00ya00


1 Answers

1. Most of browsers wait for 1-5 kb of data before showing any results. You can check if Play Framework actually sends data with command curl http://localhost:9000.

2. You create streamer twice, remove first final ResultStreamer<EpexEntry> streamer = new ResultStreamer<EpexEntry>();

3. - You use Page class for retrieving large data set - this is incorrect. Actually you do one big initial request and then one request per iteration. This is SLOW. Use simple findIterate().

add this to EpexEntry (feel free to change it as you need)

public static QueryIterator<EpexEntry> all() {
    return find.order(ORDER_BY).findIterate();
}

your new stream method implementation:

public void stream(Out<String> out, QueryIterator<T> iterator, UserRequest req) {

    if(req.getFormat().equalsIgnoreCase("json")) {
        JsonContext context = Ebean.createJsonContext();
        out.write("[\n");
        while (iterator.hasNext()) {
            out.write(context.toJsonString(iterator.next()) + ", ");
        }
        iterator.close(); // its important to close iterator
        out.write("]\n");
        out.close();
    } else // csv implementation here

And your onReady method:

            QueryIterator<EpexEntry> iterator = EpexEntry.all();
            ResultStreamer<EpexEntry> streamer = new ResultStreamer<EpexEntry>();
            streamer.stream(new BuffOut(out, 10000), iterator, req); // notice buffering here

4. Another problem is - you call Out<String>.write() too often. Call of write() means that server needs to send new chunk of data to client immediately. Every call of Out<String>.write() have significant overhead.

Overhead appears because server needs to wrap response into chunked result - 6-7 bytes for each message Chunked response Format. Since you send small messages, overhead is significant. Also, server needs to wrap your reply in TCP packet which size will be far less from optimal. And, server needs to perform some internal action to send an chunk, this is also require some resources. As result, download bandwidth will be far from optimal.

Here is simple test: send 10000 lines of text TEST0 to TEST9999 in chunks. This takes 3 sec on my computer in average. But with buffering this takes 65 ms. Also, download sizes are 136 kb and 87.5 kb.

Example with buffering:

Controller

public class Application extends Controller {
    public static Result showEpex() {
        Chunks<String> chunks = new StringChunks() {
            @Override
            public void onReady(play.mvc.Results.Chunks.Out<String> out) {
                new ResultStreamer().stream(out);
            }
        };
        return ok(chunks).as("text/plain");
    }
}

new BuffOut class. It's dumb, I know

public class BuffOut {
    private StringBuilder sb;
    private Out<String> dst;

    public BuffOut(Out<String> dst, int bufSize) {
        this.dst = dst;
        this.sb = new StringBuilder(bufSize);
    }

    public void write(String data) {
        if ((sb.length() + data.length()) > sb.capacity()) {
            dst.write(sb.toString());
            sb.setLength(0);
        }
        sb.append(data);
    }

    public void close() {
        if (sb.length() > 0)
            dst.write(sb.toString());
        dst.close();
    }
}

This implementation have 3 second download time and 136 kb size

public class ResultStreamer {
    public void stream(Out<String> out) {
    for (int i = 0; i < 10000; i++) {
            out.write("TEST" + i + "\n");
        }
        out.close();
    }
}

This implementation have 65 ms download time and 87.5 kb size

public class ResultStreamer {
    public void stream(Out<String> out) {
        BuffOut out2 = new BuffOut(out, 1000);
        for (int i = 0; i < 10000; i++) {
            out2.write("TEST" + i + "\n");
        }
        out2.close();
    }
}
like image 183
Viktor Aseev Avatar answered Oct 30 '22 13:10

Viktor Aseev