Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Need multi-threading memory manager

I will have to create a multi-threading project soon I have seen experiments ( delphitools.info/2011/10/13/memory-manager-investigations ) showing that the default Delphi memory manager has problems with multi-threading.

enter image description here

So, I have found this SynScaleMM. Anybody can give some feedback on it or on a similar memory manager?

Thanks

like image 798
Server Overflow Avatar asked May 20 '11 13:05

Server Overflow


People also ask

Why is multi threading needed?

What Is Multithreading Used For? The main reason for incorporating threads into an application is to improve its performance. Performance can be expressed in multiple ways: A web server will utilize multiple threads to simultaneous process requests for data at the same time.

Does multithreading increase memory usage?

Multi-threading gets around requiring additional memory because it relies on a shared memory between threads. Shared memory removes the additional memory overhead but still incurs the penalty of increased context switching.

Does multithreading improve CPU performance?

The ultimate goal of multithreading is to increase the computing speed of a computer and thus also its performance. To this end, we try to optimize CPU usage. Rather than sticking with a process for a long time, even when it's waiting on data for example, the system quickly changes to the next task.


2 Answers

Our SynScaleMM is still experimental.

EDIT: Take a look at the more stable ScaleMM2 and the brand new SAPMM. But my remarks below are still worth following: the less allocation you do, the better you scale!

But it worked as expected in a multi-threaded server environment. Scaling is much better than FastMM4, for some critical tests.

But the Memory Manager is perhaps not the bigger bottleneck in Multi-Threaded applications. FastMM4 could work well, if you don't stress it.

Here are some (not dogmatic, just from experiment and knowledge of low-level Delphi RTL) advice if you want to write FAST multi-threaded application in Delphi:

  • Always use const for string or dynamic array parameters like in MyFunc(const aString: String) to avoid allocating a temporary string per each call;
  • Avoid using string concatenation (s := s+'Blabla'+IntToStr(i)) , but rely on a buffered writing such as TStringBuilder available in latest versions of Delphi;
  • TStringBuilder is not perfect either: for instance, it will create a lot of temporary strings for appending some numerical data, and will use the awfully slow SysUtils.IntToStr() function when you add some integer value - I had to rewrite a lot of low-level functions to avoid most string allocation in our TTextWriter class as defined in SynCommons.pas;
  • Don't abuse on critical sections, let them be as small as possible, but rely on some atomic modifiers if you need some concurrent access - see e.g. InterlockedIncrement / InterlockedExchangeAdd;
  • InterlockedExchange (from SysUtils.pas) is a good way of updating a buffer or a shared object. You create an updated version of of some content in your thread, then you exchange a shared pointer to the data (e.g. a TObject instance) in one low-level CPU operation. It will notify the change to the other threads, with very good multi-thread scaling. You'll have to take care of the data integrity, but it works very well in practice.
  • Don't share data between threads, but rather make your own private copy or rely on some read-only buffers (the RCU pattern is the better for scaling);
  • Don't use indexed access to string characters, but rely on some optimized functions like PosEx() for instance;
  • Don't mix AnsiString/UnicodeString kind of variables/functions, and check the generated asm code via Alt-F2 to track any hidden unwanted conversion (e.g. call UStrFromPCharLen);
  • Rather use var parameters in a procedure instead of function returning a string (a function returning a string will add an UStrAsg/LStrAsg call which has a LOCK which will flush all CPU cores);
  • If you can, for your data or text parsing, use pointers and some static stack-allocated buffers instead of temporary strings or dynamic arrays;
  • Don't create a TMemoryStream each time you need one, but rely on a private instance in your class, already sized in enough memory, in which you will write data using Position to retrieve the end of data and not changing its Size (which will be the memory block allocated by the MM);
  • Limit the number of class instances you create: try to reuse the same instance, and if you can, use some record/object pointers on already allocated memory buffers, mapping the data without copying it into temporary memory;
  • Always use test-driven development, with dedicated multi-threaded test, trying to reach the worse-case limit (increase number of threads, data content, add some incoherent data, pause at random, try to stress network or disk access, benchmark with timing on real data...);
  • Never trust your instinct, but use accurate timing on real data and process.

I tried to follow those rules in our Open Source framework, and if you take a look at our code, you'll find out a lot of real-world sample code.

like image 61
Arnaud Bouchez Avatar answered Sep 19 '22 14:09

Arnaud Bouchez


If your app can accommodate GPL licensed code, then I'd recommend Hoard. You'll have to write your own wrapper to it but that is very easy. In my tests, I found nothing that matched this code. If your code cannot accommodate the GPL then you can obtain a commercial licence of Hoard, for a significant fee.

Even if you can't use Hoard in an external release of your code you could compare its performance with that of FastMM to determine whether or not your app has problems with heap allocation scalability.

I have also found that the memory allocators in the versions of msvcrt.dll distributed with Windows Vista and later scale quite well under thread contention, certainly much better than FastMM does. I use these routines via the following Delphi MM.

unit msvcrtMM;

interface

implementation

type
  size_t = Cardinal;

const
  msvcrtDLL = 'msvcrt.dll';

function malloc(Size: size_t): Pointer; cdecl; external msvcrtDLL;
function realloc(P: Pointer; Size: size_t): Pointer; cdecl; external msvcrtDLL;
procedure free(P: Pointer); cdecl; external msvcrtDLL;

function GetMem(Size: Integer): Pointer;
begin
  Result := malloc(size);
end;

function FreeMem(P: Pointer): Integer;
begin
  free(P);
  Result := 0;
end;

function ReallocMem(P: Pointer; Size: Integer): Pointer;
begin
  Result := realloc(P, Size);
end;

function AllocMem(Size: Cardinal): Pointer;
begin
  Result := GetMem(Size);
  if Assigned(Result) then begin
    FillChar(Result^, Size, 0);
  end;
end;

function RegisterUnregisterExpectedMemoryLeak(P: Pointer): Boolean;
begin
  Result := False;
end;

const
  MemoryManager: TMemoryManagerEx = (
    GetMem: GetMem;
    FreeMem: FreeMem;
    ReallocMem: ReallocMem;
    AllocMem: AllocMem;
    RegisterExpectedMemoryLeak: RegisterUnregisterExpectedMemoryLeak;
    UnregisterExpectedMemoryLeak: RegisterUnregisterExpectedMemoryLeak
  );

initialization
  SetMemoryManager(MemoryManager);

end.

It is worth pointing out that your app has to be hammering the heap allocator quite hard before thread contention in FastMM becomes a hindrance to performance. Typically in my experience this happens when your app does a lot of string processing.

My main piece of advice for anyone suffering from thread contention on heap allocation is to re-work the code to avoid hitting the heap. Not only do you avoid the contention, but you also avoid the expense of heap allocation – a classic twofer!

like image 34
David Heffernan Avatar answered Sep 21 '22 14:09

David Heffernan