
Why is a fresh install of Haskell-Stack and GHC so large/big?


When doing a fresh install of Haskell Stack through the install script from here:

wget -qO- https://get.haskellstack.org/ | sh

Followed by:

stack setup

you end up with a $HOME/.stack/ directory of about 1.5 GB (from a download of just over 120 MB). If you then run:

stack update

the size increases to 2.5 GB.

I am used to Java, which is usually considered large (it covers pretty much everything and keeps deprecated alternatives for backwards compatibility). As a comparison, though: an IDE including a JDK, a standalone JDK, and the JDK source together probably come to around 1.5 GB.

Haskell, on the other hand, is said to be a "small, beautiful" language (from what I have heard and read; this probably refers mostly to its syntax and semantics, but still), so it seems strange to me that it is this large.

  1. Why is it so big (is it related to this question?)?
  2. Is this size normal or have I installed something extra?
  3. If there are several (4?, 5?) flavors of everything, then can I remove all but one?
  4. Is some of the data cache or temporary files that can be removed?
  5. The largest directories are: .stack/programs/x86_64-linux/ghc-tinfo6-nopie-8.2.2/lib/ghc-8.2.2 (1.3 GB) and .stack/indices/Hackage (980 MB). I assume the first one holds installed packages (and is related to stack setup) and the second is some index over the Hackage package archive (and is related to stack update)? Can these be reduced (as in 3 above, or by fetching the needed Hackage information online)?
Klorax asked Feb 12 '18 21:02




1 Answer

As you can probably see by inspection, it is a combination of:

  • three flavors (static, dynamic, and profiled) of the GHC runtime (about 400 megs total) and the core GHC libraries (another 700 megs total) plus 100 megs of interface files and another 200 megs of documentation and 120 megs of compressed source (1.5 gigs total, all under programs/x86_64-linux/ghc-8.2.2* or similar)
  • two identical copies of the uncompressed Hackage index 00-index.tar and 01-index.tar, each containing the .cabal file for every version of every package ever published in the Hackage database, each about 457 megs, plus a few other files to bring the total up to 1.0 gigs

The first of these is installed when you run stack setup; the second when you run stack update.
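
If you want to check the breakdown on your own machine, something like the following should do. It is only a sketch: it assumes the default Stack root of ~/.stack (or whatever the STACK_ROOT environment variable points to), and stack_usage is just a name I made up for the helper.

```shell
#!/bin/sh
# Assumption: the Stack root is ~/.stack unless STACK_ROOT overrides it.
stack_root="${STACK_ROOT:-$HOME/.stack}"

# Print the disk usage (in KiB) of every entry directly under the
# directory given as $1, sorted so the largest entry comes last.
stack_usage() {
    du -sk "$1"/* 2>/dev/null | sort -n
}

stack_usage "$stack_root"
```

To drill down, point the helper at a subdirectory instead, for example stack_usage "$stack_root/programs/x86_64-linux".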

To answer your questions:

  1. It's so big because clearly no one has made any effort to make it smaller, as evidenced by the whole 00-index.tar, 00-index.tar.gz, and 01-index.tar situation.
  2. That's a normal size for a minimum install.
  3. You can remove the profile versions (the *_p.a files) if you never want to compile a program with profiling. I haven't tested this extensively, but it seems to work. I guess this'll save you around 800 megs. You can also remove the static versions (all *.a files) if you only want to dynamically link programs (i.e., using ghc -dynamic). Again, I haven't tested this extensively, but it seems to work. Removing the dynamic versions would be very difficult -- you'd have to find a way to remove only those *.so files that GHC itself doesn't need, and anything you did remove would no longer be loadable in the interpreter.
  4. Several things are cached and you can remove them. For example, you can remove 00-index.tar and 00-index.tar.gz (saving about half a gigabyte), and Stack seems to run fine. It'll recreate them the next time you run stack update, though. I don't think this is documented anywhere, so it'll be a lot of trial and error determining what can be safely removed.
  5. I think this question has already been covered above.
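
Before deleting anything along the lines of points 3 and 4, it may be worth measuring how much you would actually get back. Here is a minimal dry-run sketch; it deletes nothing. It assumes the default Stack root (~/.stack, or STACK_ROOT if set), sum_kib is a hypothetical helper name, and the file patterns are simply the ones mentioned above.

```shell
#!/bin/sh
# Dry run only: report how much space the files take, remove nothing.
# Assumption: the Stack root is ~/.stack unless STACK_ROOT overrides it.
stack_root="${STACK_ROOT:-$HOME/.stack}"

# Sum the disk usage (in KiB) of all regular files under directory $1
# whose name matches the shell pattern $2. Prints 0 if nothing matches.
sum_kib() {
    find "$1" -type f -name "$2" -exec du -k {} + 2>/dev/null |
        awk '{ total += $1 } END { print total + 0 }'
}

echo "profiling archives (*_p.a): $(sum_kib "$stack_root/programs" '*_p.a') KiB"
echo "old index files (00-index.tar*): $(sum_kib "$stack_root/indices" '00-index.tar*') KiB"
```

If the numbers look worth it, the same find expressions can be turned into a removal step, but as noted above none of this is documented, so keep a backup of anything you delete.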

Apropos of nothing, the other day I saw a good deal on some 3-terabyte drives, and in my excitement I ordered two before realizing I didn't really have anything to put on them. It kind of puts a few gigabytes in perspective, doesn't it?

I guess I wouldn't expend a lot of effort trying to trim down your .stack directory, at least on a beefy desktop machine. If you're working on a laptop with a relatively small SSD, think about maybe putting your .stack directory on a filesystem that supports transparent compression (e.g., Btrfs), if you think it's likely to get out of hand.

K. A. Buhr answered Oct 21 '22 00:10