
Passing a Character Array from VBA to Fortran DLL through a Type is corrupting the other Type members

Believe it or not, that title is about as short as I could make it and still describe the problem I'm having!

So here's the scenario: I'm calling a Fortran DLL from VBA. The DLL takes a user-defined type (a derived type, in Fortran terms) as an argument and copies it back to the caller for validation.

The type has an array of fixed-length characters and some run-of-the-mill integers.

I've noticed some funny behavior in the members defined after this character array. I'll go over it below, right after I describe my boiled-down test setup:


The Fortran Side:

Here's the subroutine the DLL exports:

SUBROUTINE characterArrayTest (simpleTypeIn, simpleTypeOut)

   use simpleTypeDefinition

!GCC$ ATTRIBUTES STDCALL :: characterArrayTest

   type(simpleType), INTENT(IN)  :: simpleTypeIn
   type(simpleType), INTENT(OUT) :: simpleTypeOut

   ! Copy the input straight back to the caller so VBA can validate it
   simpleTypeOut = simpleTypeIn

END SUBROUTINE characterArrayTest

And here is the simpleTypeDefinition module file:

Module simpleTypeDefinition


  Type simpleType

     character (len=1)  :: CharacterArray(1) 
     !The length of the array is one here, but modified in tests
     
     integer   (kind=2) :: FirstInteger

     integer   (kind=2) :: SecondInteger

     integer   (kind=2) :: ThirdInteger

  End Type simpleType

  
End Module simpleTypeDefinition

The compilation step:

 gfortran -c simpleTypeDefinition.f90 characterArrayTest.f90
 gfortran -shared -static -o characterArrayTest.dll characterArrayTest.o

Note: This is the 32-bit version of gfortran, as I'm using the 32-bit version of Excel.


The VBA Side:

First, the mirrored simpleType and declare statements:

Type simpleType

    CharacterArray(0) As String * 1  
    'The length of the array is one here, but modified in tests
    
    FirstInteger As Integer
    
    SecondInteger As Integer
    
    ThirdInteger As Integer
    
End Type

Declare Sub characterArrayTest Lib "characterArrayTest.dll" _
Alias "characterarraytest_@8" _
(simpleTypeIn As simpleType, simpleTypeOut As simpleType)
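
A side note on the Alias string, since it looks cryptic: as far as I understand it, gfortran lowercases an external procedure name and appends an underscore, and the STDCALL attribute then appends "@" plus the number of argument bytes. Both arguments are passed by reference, so on a 32-bit build that is two 4-byte pointers. Here is a rough Python sketch of that naming rule; the helper name gfortran_stdcall_name is made up for illustration and the rule itself is my assumption, so check the actual export table of your DLL rather than trusting this.

```python
def gfortran_stdcall_name(procedure_name, arg_count, pointer_size=4):
    """Guess the exported symbol for a stdcall external procedure.

    Assumption: lowercase name + "_" + "@" + total bytes of by-reference arguments.
    """
    return f"{procedure_name.lower()}_@{arg_count * pointer_size}"


print(gfortran_stdcall_name("characterArrayTest", 2))
```

With two arguments this gives the same string used in the Alias clause above.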

Next, the calling code:

Dim simpleTypeIn As simpleType
Dim simpleTypeOut As simpleType

simpleTypeIn.CharacterArray(0) = "A"
'simpleTypeIn.CharacterArray(1) = "B"
'simpleTypeIn.CharacterArray(2) = "C"
'simpleTypeIn.CharacterArray(3) = "D"

simpleTypeIn.FirstInteger = 1
simpleTypeIn.SecondInteger = 2
simpleTypeIn.ThirdInteger = 3

Call Module4.characterArrayTest(simpleTypeIn, simpleTypeOut)

The Strange, Buggy Behavior:

Now that we're past the setup, I can describe what's happening:

(I'm playing around with the length of the character array, while leaving the length of the individual characters set to one. I match the character array parameters on both sides in all cases.)


Test case: CharacterArray length = 1

For this first case, everything works great. I pass simpleTypeIn and simpleTypeOut from VBA, the Fortran DLL accepts them and copies simpleTypeIn to simpleTypeOut, and after the call simpleTypeOut comes back to VBA with identical CharacterArray, FirstInteger, and so forth.


Test case: CharacterArray length = 2

This is where things get interesting.

Before the call, simpleTypeIn was as defined. Right after the call, simpleTypeIn.ThirdInteger had changed from 3 to 65! Even weirder, 65 is the ASCII value for the character A, which is simpleTypeIn.CharacterArray(0).

I tested this relationship by changing "A" to "(", which has an ASCII value of 40, and sure enough, simpleTypeIn.ThirdInteger changed to 40. Weird.

In any case, one would expect that simpleTypeOut would be a copy of whatever weird thing simpleTypeIn has been morphed to, but not so! simpleTypeOut was a copy of simpleTypeIn except for simpleTypeOut.ThirdInteger, which was 16961!


Test case: CharacterArray length = 3

This case was identical to case 2, oddly enough.


Test case: CharacterArray length = 4

In this equally weird case, after the call simpleTypeIn.SecondInteger changed from 2 to 65, and simpleTypeIn.ThirdInteger changed from 3 to 66, which is the ASCII value for B.

Not to be outdone, simpleTypeOut.SecondInteger came out as 16961 and simpleTypeOut.ThirdInteger was 17475. The other values copied successfully. (I uncommented the B, C, and D character assignments to match the array size.)
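
The odd values in these test cases are easier to read as bytes than as numbers. Here is a minimal Python sketch that splits each 16-bit value into its two bytes; the helper name int16_to_chars is made up for illustration, and it assumes the little-endian byte order of x86.

```python
def int16_to_chars(value):
    """Split a 16-bit integer into its two bytes, low byte first (little-endian)."""
    return value.to_bytes(2, "little").decode("ascii")


for corrupted in (65, 66, 16961, 17475):
    # repr() makes the zero byte visible as \x00
    print(corrupted, repr(int16_to_chars(corrupted)))
```

Read this way, 16961 is the pair "A", "B" and 17475 is the pair "C", "D" packed into a single Integer, which suggests character bytes are landing where the integers are supposed to be.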


Observations:

This weird corruption seems to be linear with respect to the number of bytes in the character array. I did some testing with individual characters of length 2 instead of 1 (I'll catalogue it on Monday if anyone wants), and there the corruption happened as soon as the array had a size of 1, rather than waiting until the size was 2. It also didn't "skip" additional corruption at an array size of 3 the way the length-1 characters did.


This is easily a hall of fame bug for me; I'm sure you can imagine how much concentrated fun this was to isolate in a large-scale program with a ton of Type attributes. If anyone has any ideas it'd be greatly appreciated!

If I don't get back to you right away it's because I'm calling it a day, but I'll try to monitor my inbox.

Andres Salas asked Sep 22 '18 00:09



2 Answers

(This answer is based on an understanding of Fortran, but not VBA)

In this case, and in most cases, Fortran won't automatically resize arrays for you. When you reference the second element of the character array (with simpleTypeIn.CharacterArray(1) = "B"), that element doesn't exist and it isn't created.

Instead, the code attempts to set whatever memory would be at the location of the second element of the character array, if it were to exist. In this case, that memory appears to be used to store the integers instead.

You can see the same thing happening if you forget about VBA entirely. Here is a sample code entirely in Fortran to demonstrate similar behavior:

enet-mach5% cat main.f90 
! ===== Module of types
module types_m
   implicit none

   type simple_t
      character(len=1) :: CharacterArray(1) 
      integer :: int1, int2, int3
   end type simple_t
end module types_m


! ===== Module of subroutines
module subroutines_m
   use types_m, only : simple_t
   implicit none
contains

! -- Subroutine to modify first character, this should work
subroutine sub1(s)
   type(simple_t), intent(INOUT) :: s

   s%CharacterArray(1) = 'A'
end subroutine sub1

! -- Subroutine to modify first and other (nonexistent) characters, should fail
subroutine sub2(s)
   type(simple_t), intent(INOUT) :: s

   s%CharacterArray(1) = 'B'
   s%CharacterArray(2:8) = 'C'
end subroutine sub2

end module subroutines_m


! ===== Main program, drives test
program main
   use types_m, only : simple_t
   use subroutines_m, only : sub1, sub2
   implicit none

   type(simple_t) :: s

   ! -- Set values to known
   s%int1 = 1
   s%int2 = 2
   s%int3 = 3
   s%CharacterArray(1) = 'X'

   ! -- Write out values of s
   write(*,*) 'Before calling any subs:'
   write(*,*) 's character: "', s%CharacterArray, '"'
   write(*,*) 's integers: ', s%int1, s%int2, s%int3

   ! -- Call first subroutine, should be fine
   call sub1(s)

   write(*,*) 'After calling sub1:'
   write(*,*) 's character: "', s%CharacterArray, '"'
   write(*,*) 's integers: ', s%int1, s%int2, s%int3

   ! -- Call second subroutine, should overflow character array and corrupt
   call sub2(s)

   write(*,*) 'After calling sub2:'
   write(*,*) 's character: "', s%CharacterArray, '"'
   write(*,*) 's integers: ', s%int1, s%int2, s%int3

   write(*,*) 'complete'

end program main

In this case, I've put both modules and the main routine in the same file. Typically, one would keep them in separate files but it's ok for this example. I also had to set 8 elements of CharacterArray to manifest an error, but the exact sizing depends on the system, compiler, and optimization settings. Running this on my machine yields:

enet-mach5% gfortran --version
GNU Fortran (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
Copyright (C) 2013 Free Software Foundation, Inc.

GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING

enet-mach5% gfortran main.f90 && ./a.out
main.f90:31.20:

   s%CharacterArray(2:8) = 'C'
                    1
Warning: Lower array reference at (1) is out of bounds (2 > 1) in dimension 1
 Before calling any subs:
 s character: "X"
 s integers:            1           2           3
 After calling sub1:
 s character: "A"
 s integers:            1           2           3
 After calling sub2:
 s character: "B"
 s integers:   1128481603           2           3
 complete
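
The corrupted int1 in that last run is not random. Here is a rough Python sketch of the same overflow on a flat byte buffer. The layout is an assumption on my part (one character byte, three padding bytes, then three 4-byte integers, little-endian), and simulate_sub2 is just a name for this example.

```python
import struct

# Assumed layout: 1 character byte, 3 padding bytes, three 4-byte integers.
LAYOUT = "<c3x3i"


def simulate_sub2():
    """Mimic sub2: write 'B' to element 1 and 'C' to the nonexistent elements 2..8."""
    buf = bytearray(struct.pack(LAYOUT, b"A", 1, 2, 3))
    buf[0:1] = b"B"       # s%CharacterArray(1) = 'B'
    buf[1:8] = b"C" * 7   # s%CharacterArray(2:8) = 'C', runs past the array
    return struct.unpack(LAYOUT, buf)


print(simulate_sub2())
```

Under that assumed layout the first three stray bytes fall into padding and the next four overwrite int1 with four copies of "C" (0x43434343), while int2 and int3 are left alone. That would also explain why fewer out-of-bounds elements did not show any visible damage.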

Gfortran is smart enough to flag a compile-time warning that s%CharacterArray(2) is out of bounds. You can see the character array isn't resized, and the value of int1 is corrupted instead. If I compile with more run-time checking, I get a full error instead:

enet-mach5% gfortran -fcheck=all main.f90 && ./a.out
main.f90:31.20:

   s%CharacterArray(2:8) = 'C'
                    1
Warning: Lower array reference at (1) is out of bounds (2 > 1) in dimension 1
 Before calling any subs:
 s character: "X"
 s integers:            1           2           3
 After calling sub1:
 s character: "A"
 s integers:            1           2           3
At line 31 of file main.f90
Fortran runtime error: Index '2' of dimension 1 of array 's' outside of expected range (1:1)
Ross answered Sep 28 '22 04:09



Looks like I'm (Edit: not) collecting my own bounty today!

The root of this problem is that VBA takes 2 bytes per character while Fortran expects 1 byte per character. The memory garbling is caused by the character array taking up more space in memory than Fortran expects. The way to send 1-byte characters over to Fortran is as follows:


Type Definition:

Type simpleType

    CharacterArray(3) As Byte

    FirstInteger As Integer

    SecondInteger As Integer

    ThirdInteger As Integer

End Type

Conversion from VBA character to Byte values:

Dim tempByte() As Byte

tempByte = StrConv("A", vbFromUnicode)
simpleTypeIn.CharacterArray(0) = tempByte(0)

tempByte = StrConv("B", vbFromUnicode)
simpleTypeIn.CharacterArray(1) = tempByte(0)

tempByte = StrConv("C", vbFromUnicode)
simpleTypeIn.CharacterArray(2) = tempByte(0)

tempByte = StrConv("D", vbFromUnicode)
simpleTypeIn.CharacterArray(3) = tempByte(0)

This code successfully passes along the characters given to the StrConv function. I tested that they arrived as the proper ASCII characters in the Fortran DLL, and they did! The integers are also no longer passed back incorrectly. A hall of fame bug has been stamped out.
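
To illustrate the size mismatch, here is a rough Python sketch using ctypes. It assumes natural alignment on both sides and models the 2-byte character with c_uint16; make_struct and first_integer_offset are names invented for this example, not anything from VBA or gfortran. It only shows where the integers would start under those assumptions and does not try to reproduce how VBA actually marshals the type.

```python
import ctypes


def make_struct(char_type, n_chars):
    """Build a struct with n_chars characters followed by three 16-bit integers."""
    class SimpleType(ctypes.Structure):
        _fields_ = [
            ("CharacterArray", char_type * n_chars),
            ("FirstInteger", ctypes.c_int16),
            ("SecondInteger", ctypes.c_int16),
            ("ThirdInteger", ctypes.c_int16),
        ]
    return SimpleType


def first_integer_offset(char_type, n_chars):
    """Byte offset at which FirstInteger starts for the given character width."""
    return make_struct(char_type, n_chars).FirstInteger.offset


for n in (1, 2, 3, 4):
    one_byte = first_integer_offset(ctypes.c_char, n)    # 1-byte character, as Fortran expects
    two_byte = first_integer_offset(ctypes.c_uint16, n)  # stand-in for a 2-byte character
    print(n, one_byte, two_byte)
```

Under these assumptions the two offsets only agree when the array has a single element, which lines up with length 1 being the only case that worked in the question. With a Byte array on the VBA side, both layouts use 1-byte elements and the integers start at the same offset.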

Andres Salas answered Sep 28 '22 02:09
