
ICU u_fgetfile incompatible with runtime in release builds for VS2012

I'm trying to pass the handle returned from u_fgetfile into fseek/fread functions.

When I link my application against the debug runtime libraries (/MTd, /MDd) there is no crash, but if I link against the release versions (/MT, /MD), this simple code crashes:

#include <stdio.h>
#include "unicode/ustdio.h"

int main()
{
    UFILE* file;
    file = u_fopen("C:\\test.txt","r",NULL,"UTF-8");
    fseek(u_fgetfile(file),3,SEEK_SET);
}

This happens both with the official builds of ICU and with custom builds I make with Visual Studio 2012 (whether ICU itself is built in debug or release doesn't matter).

The only thing I have found out is that there seems to be some mismatch in the FILE structure, but I really don't know.

Edit:

As part of adding a bounty to this question, here's a fully functional VS2012 project containing both reproducer program (same as the code posted above) and icu with source and binaries. Get it here: http://goo.gl/urTuU

monoceres asked Jul 08 '13 21:07



1 Answer

It seems to me like the issue is within _lock_file where it says:

    /*
     * The way the FILE (pointed to by pf) is locked depends on whether
     * it is part of _iob[] or not
     */
    if ( (pf >= _iob) && (pf <= (&_iob[_IOB_ENTRIES-1])) )
    {
        /*
         * FILE lies in _iob[] so the lock lies in _locktable[].
         */
        _lock( _STREAM_LOCKS + (int)(pf - _iob) );
        /* We set _IOLOCKED to indicate we locked the stream */
        pf->_flag |= _IOLOCKED;
    }
    else
        /*
         * Not part of _iob[]. Therefore, *pf is a _FILEX and the
         * lock field of the struct is an initialized critical
         * section.
         */
        EnterCriticalSection( &(((_FILEX *)pf)->lock) );

A "normal" FILE* will enter the top branch, the pointer returned from u_fgetfile will enter the bottom branch. Here it is assumed that it is a _FILEX*, which is most likely simply not correct.

As the code shows, the runtime checks whether the file pointer pf lies within _iob. In the debugger, however, we can clearly see that the pointer returned from u_fgetfile is far outside of it (at least in the release build).

Given that u_fgetfile just returns a FILE* that was stored within the UFILE structure, we can inspect finit_owner in ufile.c to see how the FILE* ends up in our structure in the first place. After reading that code, I must assume that in a release build, two separate instances of the _iob array exist in the CRT, but in the debug build, only a single instance exists.
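The effect of two separate stream tables can be modelled with a small stand-alone sketch. It does not use the real CRT internals: fake_file, fake_crt, TABLE_ENTRIES and takes_table_branch are hypothetical names made up for illustration, and the only thing taken from the quoted _lock_file code is the shape of the range test.

```c
#include <stdint.h>

#define TABLE_ENTRIES 20  /* stand-in for _IOB_ENTRIES; the value is illustrative */

/* Stand-in for FILE; the real layout is internal to the CRT. */
typedef struct { int flag; } fake_file;

/* One stream table per CRT instance. */
typedef struct { fake_file iob[TABLE_ENTRIES]; } fake_crt;

/* Two independent "CRT instances": one for the application, one for the library. */
fake_crt app_crt;
fake_crt lib_crt;

/*
 * Mirrors the range test quoted from _lock_file. Returns 1 when pf lies in
 * crt's own table (the _lock() branch) and 0 when the real CRT would treat
 * pf as a _FILEX and call EnterCriticalSection on it.
 * Addresses are compared as integers to keep the sketch well defined.
 */
int takes_table_branch(const fake_crt *crt, const fake_file *pf)
{
    uintptr_t p     = (uintptr_t)pf;
    uintptr_t first = (uintptr_t)&crt->iob[0];
    uintptr_t last  = (uintptr_t)&crt->iob[TABLE_ENTRIES - 1];
    return p >= first && p <= last;
}
```

Here takes_table_branch(&app_crt, &app_crt.iob[3]) yields 1, while takes_table_branch(&app_crt, &lib_crt.iob[3]) yields 0: a stream that lives in the library's table falls through to the _FILEX branch when the application's runtime tries to lock it, even though it is not a _FILEX there.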

To get around this problem, you're going to want to make sure that the FILE* is created by your main application (and therefore by its own CRT instance) rather than inside ICU. To do that, you can utilize u_finit, like so:

FILE* filePointer = fopen("test.txt","r");
UFILE* file = u_finit(filePointer,NULL,"UTF-8");

fseek(filePointer,3,SEEK_SET); // <- won't crash

Regarding your issue that came up after this, it seems to me like the underlying problem is simply sharing a FILE* between libraries, which fails because they have separate storage areas for FILE*. I find this somewhat confusing, but I don't have the necessary understanding of the involved components (and the style of the Windows C runtime code isn't helping either).

So, if the FILE* is allocated in ICU, then you can't lock it in your main application and vice versa (and trying to read or seek will always involve locking).

Unless there is a very obvious solution to this problem, which I'm missing, I would recommend emulating the behavior of u_fgets() (or whatever else you'll need) in your main application.
From what I can tell, u_fgets() just calls fread() to read data from the file and then uses ucnv_toUnicode(), with the converter stored in the UFILE (which you can retrieve with u_fgetConverter()), to convert the read data into a UChar*.

One way that seems to work is linking ICU statically. I don't know if that is an option for you, but it seems to resolve the issue on my end.

I downloaded the latest release of ICU (51.2) and compiled it with this helpful script. I then linked the project against the libraries in icu-release-static-win32-vs2012 (link with sicuuc.lib, sicuio.lib, sicudt.lib, sicuin.lib).

Now u_fgets() no longer causes an access violation. Of course, now my .exe is almost 23 MB big.

Oliver Salzburg answered Oct 06 '22 02:10
