
python/numpy generated binary file to be read by C

I am creating a binary file in Python, random_from_python_int.dat, containing a 5*7 integer matrix, and then reading this file from C. Somehow I cannot get the correct numbers. Here is my Python code to generate the matrix:

import numpy as np
np.random.seed(10)
filename = "random_from_python_int.dat"
fileobj = open(filename, mode='wb')
b = np.random.randint(100, size=(5,7))
b.tofile(fileobj)
fileobj.close()

This generates the matrix:

[ [  9 15 64 28 89 93 29]
  [  8 73  0 40 36 16 11]
  [ 54 88 62 33 72 78 49]
  [ 51 54 77 69 13 25 13]
  [ 92 86 30 30 89 12 65] ]

But when I read it with the C code below:

#include <stdio.h>
#include <math.h>
int main()
{
  /* later changed 'double' to 'int', but that still had issues */
  double randn[5][7];

  char buff[256];
  FILE *latfile;

  sprintf(buff,"%s","random_from_python_int.dat");
  latfile=fopen(buff,"r");
  fread(&(randn[0][0]),sizeof(int),35,latfile);
  fclose(latfile);
  printf("\n %d     %d     %d     %d     %d     %d     %d",randn[0][0],randn[0][1],randn[0][2],randn[0][3],randn[0][4],randn[0][5],randn[0][6]);
  printf("\n %d     %d     %d     %d     %d     %d     %d",randn[1][0],randn[1][1],randn[1][2],randn[1][3],randn[1][4],randn[1][5],randn[1][6]);
  printf("\n %d     %d     %d     %d     %d     %d     %d",randn[2][0],randn[2][1],randn[2][2],randn[2][3],randn[2][4],randn[2][5],randn[2][6]);
  printf("\n %d     %d     %d     %d     %d     %d     %d",randn[3][0],randn[3][1],randn[3][2],randn[3][3],randn[3][4],randn[3][5],randn[3][6]);
  printf("\n %d     %d     %d     %d     %d     %d     %d\n",randn[4][0],randn[4][1],randn[4][2],randn[4][3],randn[4][4],randn[4][5],randn[4][6]);
}

It gives me (spacing adjusted to avoid scrolling on the Stack Overflow site):

      28      15         64      93         29 -163754450   9
      40      73          0      16         11 -163754450   8
      33      88         62      17         91 -163754450  54
     256       0 1830354560       0    4196011 -163754450 119
 4197424 4197493 1826683808 4196128 2084711472 -163754450  12

I am not sure what is wrong. I have tried writing a float matrix in Python and reading it as double in C, and that works fine. But this integer matrix just does not work.

harmony asked Oct 25 '25 13:10



1 Answer

As @tdube writes, the quick summary of your issue is: your numpy implementation writes 64-bit integers, while your C code reads 32-bit integers.

For more details, read on.

When you write and read integers as two's complement binary data, you need to make certain that the following three integer properties are the same for both the producer and the consumer of the binary data: integer size, integer endianness, integer signedness.

The signedness is signed for both numpy and C, so we have a match here.

The endianness is not an issue here because both numpy and the C program are on the same machine, and thus you probably have the same endianness (regardless of what endianness it might actually be).
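If the producer and consumer ever do end up on different machines, the effect of an endianness mismatch can be illustrated with a small sketch. It uses only the standard struct module, and the value 9 is simply the first entry of the matrix in the question; nothing here is taken from the original programs.

```python
import struct
import sys

value = 9  # first entry of the matrix in the question, used for illustration

little = struct.pack("<i", value)  # 32-bit signed integer, little-endian bytes
big = struct.pack(">i", value)     # 32-bit signed integer, big-endian bytes

# Interpreting little-endian bytes as big-endian yields a different number.
misread = struct.unpack(">i", little)[0]

print("native byte order:", sys.byteorder)
print(little.hex(), big.hex(), misread)
```

On a single machine both numpy and the C program use the native byte order, which is why this does not matter for the case at hand.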

However, the size is the issue.

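By default, numpy.random.randint uses np.int as its dtype. The documentation does not specify the size of np.int; it is platform dependent and turns out to be 64 bits on your system. (Newer NumPy releases have removed the np.int alias, but the default integer type remains platform dependent.)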
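To see what a size mismatch does to the data, here is a minimal sketch using only the standard struct module. It assumes a little-endian machine (hence the "<" prefix) and packs the first row of the matrix from the question as 64-bit integers, then reinterprets the same bytes as 32-bit integers.

```python
import struct

row = [9, 15, 64, 28, 89, 93, 29]  # first row of the matrix in the question

# Seven 64-bit little-endian signed integers: 7 * 8 = 56 bytes.
raw = struct.pack("<7q", *row)

# The same 56 bytes interpreted as fourteen 32-bit signed integers.
as_int32 = list(struct.unpack("<14i", raw))

print(len(raw))
print(as_int32)
```

Every value is followed by a zero, which is the upper half of the 64-bit integer, so a reader expecting 35 32-bit values only gets through the first half of the matrix. The output shown in the question looks different again because that program also stored the values in a double array and printed them with %d.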

The numpy scalars reference lists a few integer types (remarkably not including np.int), of which three combinations are interesting for robustly interfacing with programs outside of numpy:

 # | numpy    | C
---+----------+---------
 1 | np.int32 | int32_t
 2 | np.int64 | int64_t
 3 | np.intc  | int

If you only ever interface your numpy-based software with the same C environment that was used to build numpy, using the (np.intc, int) pair of types (case 3) looks safe.
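One way to check the sizes involved is sketched below. It compares the item size of np.intc with ctypes.c_int, which reflects the C compiler used to build the running Python interpreter; treating that as a stand-in for "your C environment" is an assumption and may not hold for a separately compiled program.

```python
import ctypes
import numpy as np

intc_size = np.dtype(np.intc).itemsize    # bytes in numpy's C-int type
c_int_size = ctypes.sizeof(ctypes.c_int)  # bytes in the C int of this Python build

int32_size = np.dtype(np.int32).itemsize  # fixed by definition
int64_size = np.dtype(np.int64).itemsize  # fixed by definition

print(intc_size, c_int_size, int32_size, int64_size)
```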

However, I would strongly prefer one of the explicitly sized types (cases 1 and 2) for the following reasons:

  • It is absolutely obvious what size the integer is in both numpy and C.

  • You can thus use your numpy-generated output to interface with a program compiled in a different C environment, which may have a differently sized int.

  • You can even use your numpy-generated output to interface with a program written in a completely different language, or compiled for and running on a completely different machine. You have to consider endianness in the different-machine scenario, though.
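Applied to the code in the question, case 1 could look like the following sketch. The file name, the temporary directory, and the choice of "<i4" (little-endian 32-bit) are assumptions made for illustration; the read-back with np.fromfile is only there to verify the file contents.

```python
import os
import tempfile
import numpy as np

# Request 32-bit integers explicitly instead of relying on the default dtype.
matrix = np.random.randint(100, size=(5, 7), dtype=np.int32)

# "<i4" = little-endian, 4-byte signed integer, so the byte layout is fully pinned down.
path = os.path.join(tempfile.mkdtemp(), "random_from_python_int32.dat")
matrix.astype("<i4").tofile(path)

file_size = os.path.getsize(path)  # 5 * 7 values * 4 bytes each
read_back = np.fromfile(path, dtype="<i4").reshape(5, 7)

print(file_size)
print(np.array_equal(matrix, read_back))
```

On the C side, the matching reader would declare the array as int32_t randn[5][7] (from stdint.h), pass sizeof(int32_t) to fread, and open the file in binary mode ("rb").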

ndim answered Oct 28 '25 02:10



