TypeScript compilation speed - trying to work around it, but stuck at merging

Tags:

typescript

I have been using TypeScript over the last 3 months to create very complex CRUD applications. The compile-time safety offered by TypeScript has provided significant speedups in my work - capturing errors at compile-time is a godsend, compared to seeing them manifest as exceptions and misbehaviour at run-time.

There's a catch, though.

I have to deal with hundreds of tables, so I am using a custom-built code generator that starts from the DB schema and automatically generates lots of TypeScript files. As long as the schema is small, this works perfectly - but for a very large schema containing hundreds of tables, the compilation time of tsc is becoming an issue. I am seeing compilation times of 15 minutes for a set of 400 files, as well as the dreaded compilation error "CALL_AND_RETRY_2 Allocation failed" (that is, out of memory).

So far, I have been using tsc in a Makefile, invoking it with the "tsc --out ..." syntax, which generates a single .js from all my .ts files. I therefore thought that I could solve this problem by doing the build in an incremental fashion: compiling each .ts on its own (that is, passing only one .ts file to tsc at a time) and, in the end, concatenating all the generated .js files into a single one. This indeed appeared to work - only the changed files need to be recompiled during normal development (only the initial compilation passes through all of them, and therefore takes a lot of time).

But it turned out that this, too, has a problem: in order to make each .ts "standalone-compile-able", I had to add all the relevant dependencies on top - that is, lines like

/// <reference path=...

... on top of each .ts file.

And it turns out that because of these references, the generated .js files contain identical sections, that are repeated across many of them... So when I concatenate the .js files, I get multiple definitions for the same functions, and worse, global scope statements (var global = new ...) repeated!

I therefore need a way to somehow intelligently "merge" the generated .js files, to avoid seeing replicated function definitions...

Is there some way to do that merge in a smart manner, avoiding repetitions? Or maybe some other way to accelerate the compilation?

Any suggestions most welcome... The tsc compilation speed is 30-100x slower than normal compilers - it is really a blocking point now.

UPDATE, 2 days later

Basarat (see his answer below) helped me apply his solution in my project. It turns out that even though his solution works perfectly with small and average sized projects, with mine I got the dreaded "FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory" error - which is the same error I get when I use "tsc --out ...".

In the end, my Makefile-based solution is the only thing that worked - doing it like this:

%.js:   %.ts
    @UPTODATE=0 ;                                                          \
    if [ -f "$<".md5 ] ; then                                              \
            md5sum -c "$<".md5 >/dev/null 2>&1 && {                        \
                    UPTODATE=1 ;                                           \
            } ;                                                            \
    fi ;                                                                   \
    if [ $$UPTODATE -eq 0 ] ; then                                         \
            echo Compiling $<  ;                                           \
            tsc --sourcemap --sourceRoot /RevExp/src/ --target ES5 $< || { \
                    rm $@ "$<".md5 ;                                       \
                    exit 1 ;                                               \
            } ;                                                            \
            md5sum "$<" > "$<".md5 ;                                       \
    fi

...which does two things: it uses MD5 checksums to figure out when to actually perform a compilation, and it does the compilation in a "standalone" manner (i.e. without tsc's "--out" option).
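
For anyone who would rather drive this from a Node script than from Makefile shell, here is a minimal sketch of the same checksum idea in TypeScript. The function names (needsCompile, markCompiled, clearStamp) are made up for illustration, and unlike md5sum the stamp file here holds only the hex digest, so the two stamp formats are not interchangeable.

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync, rmSync, writeFileSync } from "node:fs";

// MD5 hex digest of a file's bytes.
function md5Of(path: string): string {
  return createHash("md5").update(readFileSync(path)).digest("hex");
}

// True when no checksum was stored yet, or the stored one differs
// from the checksum of the current source.
function needsCompile(source: string): boolean {
  const stamp = source + ".md5";
  if (!existsSync(stamp)) return true;
  return readFileSync(stamp, "utf8").trim() !== md5Of(source);
}

// Call after a successful compile to remember the source's checksum.
function markCompiled(source: string): void {
  writeFileSync(source + ".md5", md5Of(source));
}

// Call after a failed compile so that the next run tries again
// (the Makefile rule does the same with rm).
function clearStamp(source: string): void {
  rmSync(source + ".md5", { force: true });
}
```

A build script would call needsCompile before invoking tsc on a file, then markCompiled on success or clearStamp on failure.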

In the actual target rule, I used to merge the generated .js files... but this left me without working .map files (for debugging) - so I now generate direct includes in the index.html:

${WEBFOLDER}/index.html:        $(patsubst %.ts,%.js,${CONTROLLERS_SOURCES}) ${WEBFOLDER}/index.html.template
    cat ${WEBFOLDER}/index.html.template > $@ || exit 1
    REV=$$(cat revision) ;                                                                                      \
    for i in $(patsubst %.ts,%.js,${CONTROLLERS_SOURCES}) ; do                                                  \
        BASE=$$(basename $$i) ;                                                                                 \
        echo "      <script type='text/javascript' src='js/$${BASE}?rev=$$REV'></script>" >> $@ ;              \
    done || exit 1
    cat RevExp/templates/index.html.parallel.footer >> $@ || exit 1
    cp $(patsubst %.ts,%.js,${CONTROLLERS_SOURCES}) ${WEBFOLDER}/js/ || exit 1
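
The loop in that rule can also be written as a small function. This is only a sketch of the same idea in TypeScript: scriptTags is a hypothetical name, and it assumes the list of .js files is already in the order in which the scripts must load, since nothing is concatenated any more.

```typescript
// Build one <script> tag per compiled file, mirroring the shell loop:
// only the base name is kept, and the revision is appended as a
// cache-busting query parameter.
function scriptTags(jsFiles: string[], revision: string): string {
  return jsFiles
    .map((file) => {
      const base = file.split("/").pop() || file; // same idea as basename
      return "      <script type='text/javascript' src='js/" + base + "?rev=" + revision + "'></script>";
    })
    .join("\n");
}
```

The returned string would go between the template and the footer of index.html.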

I will leave the question open for future contributions...

ttsiodras asked Oct 30 '13 09:10



2 Answers

I have a grunt plugin that can manage your TypeScript project: https://github.com/basarat/grunt-ts

I see compile times of 6 seconds for around 250 files. Here is a video tutorial with grunt-ts in use: http://www.youtube.com/watch?v=0-6vT7xgE4Y&hd=1

basarat answered Oct 27 '22 05:10



I have reached 300+ *.ts files in my project, and I am stuck with the 0.9.1.1 compiler (newer versions implement an incompatible version of TypeScript, and I would have to perform a huge refactoring to make the compiler happy), with compile times of ~25 sec. I use tsc app.ts --out app.js.

I get similar timing for tsc app.ts that is, when not using --out app.js but instead generating lots of small js files.

However, if I compile a single file which does not have too many dependencies, I get times of ~5 sec. (I ran tsc on each *.ts file separately to measure this. There were several outliers which took more than 10 seconds, but most of the files compile quickly, as they are near the bottom of the hierarchy of dependencies.) So my first idea would be to create a system which:

  1. watches for new and modified *.ts files
  2. performs a tsc foo.ts on the modified file
  3. concatenates all *.js files preserving the order of dependencies

You could implement the 1st step by comparing the result of ls -ltc --full-time $(find ts -type f -name '*.ts') every second, or with something more advanced like inotify. The 3rd step is not that hard, as tsc preserves the ///reference annotations in the js files, so you could search for them and perform a simple topological sort (linear in the number of files plus references).
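
To make the 3rd step concrete, here is a rough sketch in TypeScript. Both function names are hypothetical, the regular expression only covers the double-quoted ///<reference path="..."/> form used in this post, and the graph is assumed to use already-resolved file paths as keys.

```typescript
// Extract the paths named in ///<reference path="..."/> lines of one file.
function referencedPaths(source: string): string[] {
  const pattern = /^\/\/\/\s*<reference\s+path="([^"]+)"\s*\/>/gm;
  const paths: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    paths.push(match[1]);
  }
  return paths;
}

// Depth-first topological sort. deps maps a file to the files it
// references; in the result every file comes after its references.
// Throws if the references form a cycle.
function dependencyOrder(deps: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state: Record<string, "visiting" | "done"> = Object.create(null);
  const visit = (file: string): void => {
    if (state[file] === "done") return;
    if (state[file] === "visiting") {
      throw new Error("circular reference involving " + file);
    }
    state[file] = "visiting";
    for (const dep of deps[file] || []) visit(dep);
    state[file] = "done";
    order.push(file);
  };
  for (const file of Object.keys(deps)) visit(file);
  return order;
}
```

Concatenating the .js files in the order returned by dependencyOrder puts each file after everything it references.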

However, I think we could improve the second step by using the tsc -d option to create declaration files. In the 2nd step, tsc will not only create foo.js, but will also investigate and compile all the dependencies of foo.ts, which is a waste of time. On the other hand, if foo.ts referenced only *.d.ts files (which in turn would have no dependencies, or at least a very limited number of them), the recompilation of foo.ts could be faster.

In order for this to work, you have to find and replace all ///reference so that they point to bar.d.ts, not bar.ts.

find . -name '*.ts' |
xargs sed -i -r 's/(\/\/\/<reference path=".*([^d]|[^\.]d))\.ts"\/>/\1.d.ts"\/>/g'

should perform the necessary change.
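
If sed is not at hand, the same rewrite can be sketched in TypeScript. The function name toDeclarationReferences is made up, and the code assumes double-quoted paths as in the examples here; references that already end in .d.ts are left alone.

```typescript
// Rewrite ///<reference path="x.ts"/> to ///<reference path="x.d.ts"/>,
// leaving references that already point at a .d.ts file untouched.
function toDeclarationReferences(source: string): string {
  return source.replace(
    /(\/\/\/\s*<reference\s+path=")([^"]+)("\s*\/>)/g,
    (_whole: string, open: string, path: string, close: string): string => {
      const alreadyDeclaration = /\.d\.ts$/.test(path);
      const isTypeScript = /\.ts$/.test(path);
      if (alreadyDeclaration || !isTypeScript) return open + path + close;
      return open + path.replace(/\.ts$/, ".d.ts") + close;
    }
  );
}
```

Only the reference lines are touched; a string such as "x.ts" elsewhere in the file is not rewritten.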

You also need to generate all the *.d.ts files for the first time. It's a bit like a chicken and egg problem, as you need these files to perform any compilation. The good news is that this should work if you compile files in the topological order of references.

So we need to build a topologically sorted list of files. There is a tsort program which performs this task, provided you have a list of edges. I can find all the dependencies with the following grep

grep --include='*.ts' --exclude='*.d.ts' --exclude-dir=.svn -o '///<reference path.*"/>' -R .

The only problem is that the output contains the references verbatim, for example:

./entities/school_classes.ts:///<reference path="../common/app_backbone.d.ts"/>

which means we have to resolve relative paths to some canonical form. One more detail is that we actually depend on *.ts not the *.d.ts for the purpose of sorting. This simple parse.sh script takes care of that:

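#!/bin/bash
# Reads grep output lines of the form  file:///<reference path="..."/>
# and prints "file dependency" pairs for tsort.
here=$(pwd)
while read -r line
do
   a=${line/:*/}        # the referencing file (everything before the first colon)
   t=${line/\"\/>/}     # drop the trailing "/>
   b=${t/*\"/}          # the referenced path (everything after the last quote)
   # resolve the referenced directory relative to the referencing file
   c=$(cd "$(dirname "$a")" && cd "$(dirname "$b")" && pwd)
   d=$(basename "$b")
   B="$c/$d"
   B=${B/"$here"/.}     # make the path relative to the project root again
   B=${B/.d.ts/.ts}     # sort on the *.ts source, not on its declaration file
   echo "$a $B"
done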
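
For comparison, the path handling above can be sketched in TypeScript without touching the file system. The names resolveReference and edgeLine are hypothetical, and the sketch assumes forward-slash relative paths that stay inside the project root.

```typescript
// Resolve a reference path relative to the file that contains it and
// return it relative to the project root, with a leading "./".
function resolveReference(fromFile: string, referencePath: string): string {
  const parts = fromFile.split("/").filter((p) => p !== "" && p !== ".");
  parts.pop(); // drop the file name, keep its directory
  for (const segment of referencePath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) {
        throw new Error(referencePath + " leaves the project root");
      }
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return "./" + parts.join("/");
}

// One "file dependency" line for tsort, sorting on the *.ts source
// rather than on its *.d.ts declaration file.
function edgeLine(fromFile: string, referencePath: string): string {
  const target = resolveReference(fromFile, referencePath).replace(/\.d\.ts$/, ".ts");
  return fromFile + " " + target;
}
```

With the example line above, edgeLine("./entities/school_classes.ts", "../common/app_backbone.d.ts") gives "./entities/school_classes.ts ./common/app_backbone.ts".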

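Putting it all together with tsort generates a list of files in the correct order. Note that tsort prints each file before the files it references, so tac reverses the list to compile the dependencies first:

grep --include='*.ts' --exclude='*.d.ts' --exclude-dir=.svn -o '///<reference path.*"/>' -R . |
./parse.sh |
tsort |
tac |
xargs -n1 tsc -d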

Now, this will probably fail, simply because if the project was never compiled this way before it probably does not have the dependencies defined precisely enough to avoid problems. Also if you use some global variables (like var myApp;) throughout your app (myApp.doSomething()) you may need to declare it in a *.d.ts file and reference it. Now, you might think that this would create a circular dependency (app requires module x, while module x requires app), but recall that we now only depend on *.d.ts files. So there is no circularity now (this is somewhat similar to the way it works in C or C++, where one depends only on header files).

Once you have all the missing references fixed and all *.d.ts and *.ts files compiled, you can start watching for changes and recompile only the changed files. But beware: if you change something in foo.ts, you may also need to recompile the files that require foo.ts. Not because that is necessary to update them - actually they should not change at all during recompilation. Rather, this is needed for validation - all users of foo.d.ts should check whether the new interface of foo is compatible with the way they use it! So, you might want to recompile all users of foo.d.ts whenever foo.d.ts changes. This might be much more tricky and time consuming, but should happen rarely (only if you change the shape of foo). Another option would be to simply rebuild everything (in topological order) in such cases.
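
Finding "all users of foo" is a walk over the reversed reference graph. Below is a sketch in TypeScript; filesToRecheck is a hypothetical name, and it takes the conservative route of following users of users as well, which may recheck more files than strictly needed when their own *.d.ts output did not change.

```typescript
// deps maps a file to the files it references. Returns every file that
// references `changed` directly or through other files.
function filesToRecheck(deps: Record<string, string[]>, changed: string): string[] {
  // Invert the graph: dependency -> files that reference it.
  const users: Record<string, string[]> = Object.create(null);
  for (const file of Object.keys(deps)) {
    for (const dep of deps[file]) {
      (users[dep] = users[dep] || []).push(file);
    }
  }
  // Breadth-first walk from the changed file through its users.
  const seen: Record<string, boolean> = Object.create(null);
  const queue: string[] = [changed];
  const result: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const user of users[current] || []) {
      if (!seen[user]) {
        seen[user] = true;
        result.push(user);
        queue.push(user);
      }
    }
  }
  return result;
}
```

Each returned file would then be passed to tsc again, ideally in topological order.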

I'm in the middle of implementing this approach, so I'll update my answer once I'm done (or fail).

UPDATE: So, I've managed to implement all this, using tsort and GNU make to make my life easier with dependency resolution. The problem is that it was actually slower than the original tsc --out app.js app.ts. The reason is that the 0.9.1.1 compiler has a big overhead for every single compilation - even for a file as simple as

class A{
}

time tsc test.ts yields above 3 seconds. Now, if you have to recompile a single file, this is fine. But once you realize that you have to recompile all files that depend on it (mainly to perform type checking) and then files that depend on them etc., you need to perform something like 5 to 10 such compilations. So even though each compilation step is very fast (3 sec << 25 sec) the overall experience is worse (~50 sec!).

The main benefit of this exercise for me was that I had to fix many bugs and missing dependencies to make it work :)

qbolec answered Oct 27 '22 04:10
