
How to extend GHC's Thread State Object

I'd like to add two extra fields of type StgWord32 to the thread state object (TSO). Based on the information I found on the GHC wiki and from reading the source code, I have extended the struct in /includes/rts/storage/TSO.h and updated the program that generates the field offsets (producing DerivedConstants.h). The compiler, the RTS, and a simple application all recompile, but at the end of execution (in hs_exit_) the garbage collector complains:

 internal error: scavenge_stack: weird activation record found on stack: 45

I guess it has to do with Cmm and/or the STG implementation details (the offsets are generated because the structs are not visible at the Cmm level; correct me if I'm wrong). Is the order of the fields significant? Have I missed a file that should be changed?

I use a debug build of the compiler and RTS and a rather dated GHC 6.12.3 on a 64-bit architecture. Any hints to relevant documentation, and comments on the differences between GHC 6 and 7 regarding TSO handling, are welcome too.
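To illustrate the offset generation mentioned above, here is a minimal sketch in plain C. The structs `MockTSOBefore` and `MockTSOAfter`, the field names `extra_a` and `extra_b`, and the `EMIT_OFFSET` macro are all hypothetical stand-ins, not the real StgTSO layout or the real generator. It only shows the general idea: a small C program prints `#define` lines from `offsetof`, and inserting fields shifts the offsets of everything that follows them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, heavily simplified stand-in for the struct before the change. */
typedef struct {
    void    *link;
    uint16_t what_next;
    uint16_t why_blocked;
    uint32_t flags;
    uint32_t id;
    uint32_t stack_size;
    void    *sp;
} MockTSOBefore;

/* Same struct with two extra 32-bit fields inserted before sp (assumed names). */
typedef struct {
    void    *link;
    uint16_t what_next;
    uint16_t why_blocked;
    uint32_t flags;
    uint32_t id;
    uint32_t stack_size;
    uint32_t extra_a;
    uint32_t extra_b;
    void    *sp;
} MockTSOAfter;

/* Print one generated constant, in the spirit of a derived-constants header. */
#define EMIT_OFFSET(out, type, field) \
    fprintf((out), "#define OFFSET_%s_%s %zu\n", #type, #field, \
            offsetof(type, field))

/* Emit the offsets that code without access to the struct would rely on. */
void emit_offsets(FILE *out)
{
    assert(out != NULL);
    EMIT_OFFSET(out, MockTSOAfter, id);
    EMIT_OFFSET(out, MockTSOAfter, extra_a);
    EMIT_OFFSET(out, MockTSOAfter, extra_b);
    EMIT_OFFSET(out, MockTSOAfter, sp);
}
```

Fields declared before the insertion point keep their offsets, while `sp` moves. Any object file still compiled against the old constants would then read the wrong slot, which is one reason a stale partial build can misbehave even when the sources are consistent.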

jev asked Apr 13 '15 14:04



2 Answers

The error you are getting comes from ghc/rts/sm/Scav.c, specifically the default case at line 1917:

 default:
    barf("scavenge_stack: weird activation record found on stack: %d", (int)(info->i.type));

It looks like you also need to modify ClosureTypes.h, which you can find in ghc/includes/rts/storage. This file seems to define the different kinds of headers that can appear in a heap object. I've also run into some strange bootstrapping errors: if I try to rebuild using the stage-1 compiler, I get the error you mentioned, but if I do a clean build, it compiles just fine.

Matt answered Nov 07 '22 05:11



A workaround that turned out to be good enough for me was to introduce a separate data structure in each Capability that holds the additional information for each lightweight thread. I used a HashTable (see rts/Hash.h and rts/Hash.c) mapping from thread id to the custom info struct. The entries are added when the threads are created from sparks (in scheduleActivateSpark).

Timing the creation, insertion, lookup and destruction of the entries and the table showed negligible overhead for small programs. The main overhead results from the actual use of the information, which should ideally be kept outside the innermost scheduler loop. For the THREADED_RTS build, one needs to ensure that Capabilities don't access tables other than their own (or use a mutex if such access is required, which is a potential source of additional overhead).
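The shape of that workaround can be sketched roughly as follows. This is a self-contained illustration, not the code I used: it does not call rts/Hash.h, and `ThreadInfo`, `ThreadInfoTable`, the `table_*` functions, and the fixed bucket count are hypothetical names and choices made up for the example. The assumption is that one such table lives in each Capability and is only touched by its owner, so no locking is shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical payload: the two extra 32-bit values per lightweight thread. */
typedef struct {
    uint32_t extra_a;
    uint32_t extra_b;
} ThreadInfo;

typedef struct Entry {
    uint64_t      thread_id;
    ThreadInfo    info;
    struct Entry *next;   /* chaining for ids that share a bucket */
} Entry;

#define N_BUCKETS 64

/* One table per Capability; only the owning Capability accesses it. */
typedef struct {
    Entry *buckets[N_BUCKETS];
} ThreadInfoTable;

static size_t bucket_of(uint64_t id) { return (size_t)(id % N_BUCKETS); }

ThreadInfoTable *table_new(void)
{
    return calloc(1, sizeof(ThreadInfoTable));
}

/* Add a zeroed entry for a new thread; the caller must not insert an id twice. */
ThreadInfo *table_insert(ThreadInfoTable *t, uint64_t id)
{
    assert(t != NULL);
    Entry *e = calloc(1, sizeof *e);
    if (e == NULL) return NULL;
    e->thread_id = id;
    e->next = t->buckets[bucket_of(id)];
    t->buckets[bucket_of(id)] = e;
    return &e->info;
}

/* Return the info for a thread id, or NULL if there is none. */
ThreadInfo *table_lookup(ThreadInfoTable *t, uint64_t id)
{
    assert(t != NULL);
    for (Entry *e = t->buckets[bucket_of(id)]; e != NULL; e = e->next)
        if (e->thread_id == id) return &e->info;
    return NULL;
}

/* Drop the entry when the thread finishes; returns 1 if something was removed. */
int table_remove(ThreadInfoTable *t, uint64_t id)
{
    assert(t != NULL);
    for (Entry **pp = &t->buckets[bucket_of(id)]; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->thread_id == id) {
            Entry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Free all remaining entries and the table itself. */
void table_free(ThreadInfoTable *t)
{
    if (t == NULL) return;
    for (size_t b = 0; b < N_BUCKETS; b++) {
        Entry *e = t->buckets[b];
        while (e != NULL) {
            Entry *next = e->next;
            free(e);
            e = next;
        }
    }
    free(t);
}
```

In this arrangement the insert would happen where the thread is created from a spark and the remove where the thread completes. If another Capability ever needs to read the table, each `table_*` call would have to be wrapped in a lock, which is where the extra overhead mentioned above comes from.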

jev answered Nov 07 '22 04:11
