C++: How to read a lot of data from formatted text files into program?

I'm writing a CFD solver for specific fluid problems. So far, the mesh is regenerated on every simulation run, and the program has to be recompiled whenever the geometry or fluid properties change.

For small problems with a low number of cells, this works just fine. But for cases with over 1 million cells, where fluid properties need to be changed very often, it is quite inefficient.

Obviously, we need to store simulation setup data in a config file, and geometry information in a formatted mesh file.

  1. Simulation.config file
% Dimension: 2D or 3D
N_Dimension= 2
% Number of fluid phases
N_Phases=  1
% Fluid density (kg/m3)
Density_Phase1= 1000.0
Density_Phase2= 1.0
% Kinematic viscosity (m^2/s)
Viscosity_Phase1=  1e-6
Viscosity_Phase2=  1.48e-05
...
  2. Geometry.mesh file
% Dimension: 2D or 3D
N_Dimension= 2
% Points (index: x, y, z)
N_Points= 100
x0 y0
x1 y1
...
x99 y99
% Faces (Lines in 2D: P1->p2)
N_Faces= 55
0 2
3 4
...
% Cells (polygons in 2D: Cell-Type and Points clock-wise). 6: triangle; 9: quad
N_Cells= 20
9 0 1 6 20
9 1 3 4 7
...
% Boundary Faces (index)
Left_Faces= 4
0
1
2
3
Bottom_Faces= 6
7
8
9
10
11
12
...

It's easy to write config and mesh information to formatted text files. The problem is how to read this data into the program efficiently. Is there an easy-to-use C++ library for this job?

KOF asked Jun 28 '19 18:06



1 Answer

As a first iteration, just to get something tolerable, take @JosmarBarbosa's suggestion and use an established format for your kind of data, one which probably also has free, open-source libraries for you to use. One example is OpenMesh, developed at RWTH Aachen. It supports:

  • Representation of arbitrary polygonal (the general case) and pure triangle meshes (providing more efficient, specialized algorithms)
  • Explicit representation of vertices, halfedges, edges and faces.
  • Fast neighborhood access, especially the one-ring neighborhood (see below).
  • [Customization]

But if you really need to speed up your mesh data reading, consider doing the following:

  1. Separate the limited-size metadata from the larger, unlimited-size mesh data.
  2. Place the limited-size metadata in a separate file and read it whichever way you like; it doesn't matter.
  3. Arrange the mesh data as several arrays of fixed-size elements or fixed-size structures (e.g. cells, faces, points, etc.).
  4. Store each of the fixed-width arrays of mesh data in its own file, as raw binary, without streaming individual values anywhere: just read or write the array as-is, directly. Here's an example of how a read would look. You'll know the appropriate size of the read either by looking at the file size or from the metadata.
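Steps 3 and 4 above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the `Point2D` struct and the `write_array` / `read_array` helpers are hypothetical names introduced here, and the sketch assumes the file is written and read on machines with the same endianness and struct layout. The element count is derived from the file size, as suggested in step 4.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical fixed-size record for one 2D mesh point.
struct Point2D {
    double x;
    double y;
};

// Writes the whole array to a binary file in a single call.
template <typename T>
void write_array(const std::string& path, const std::vector<T>& data) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw binary I/O needs a trivially copyable element type");
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open " + path + " for writing");
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(T)));
    if (!out) throw std::runtime_error("write failed: " + path);
}

// Reads the whole file into a vector in a single call.
// The number of elements is deduced from the file size.
template <typename T>
std::vector<T> read_array(const std::string& path) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw binary I/O needs a trivially copyable element type");
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // open at end
    if (!in) throw std::runtime_error("cannot open " + path + " for reading");
    const std::streamoff bytes = in.tellg();
    if (bytes < 0) throw std::runtime_error("cannot determine size of " + path);
    const std::size_t size = static_cast<std::size_t>(bytes);
    if (size % sizeof(T) != 0)
        throw std::runtime_error("file size is not a multiple of the element size");
    std::vector<T> data(size / sizeof(T));
    in.seekg(0, std::ios::beg);
    in.read(reinterpret_cast<char*>(data.data()),
            static_cast<std::streamsize>(size));
    if (!in) throw std::runtime_error("read failed: " + path);
    assert(static_cast<std::size_t>(in.gcount()) == size);
    return data;
}
```

With these helpers, a call such as `read_array<Point2D>("points.bin")` (the file name is only an example) would load all points with one read instead of one formatted extraction per value.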

Finally, you could avoid explicit reads altogether and use memory-mapping for each of the data files. See the question "Fastest technique to read a file into memory?".
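As a rough sketch of the memory-mapping idea, assuming a POSIX system (Linux, macOS, and similar): the `MappedFile` class below is a hypothetical wrapper written for this illustration around the standard `open`, `fstat`, `mmap` and `munmap` calls. It maps a file read-only and exposes it as an array of fixed-size elements; on Windows a different API would be needed.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a read-only memory mapping of a whole file.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("cannot open " + path);
        struct stat st;
        if (::fstat(fd, &st) != 0) {
            ::close(fd);
            throw std::runtime_error("cannot stat " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);
        if (size_ > 0) {  // mmap rejects zero-length mappings
            void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                ::close(fd);
                throw std::runtime_error("cannot map " + path);
            }
            data_ = p;
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }

    ~MappedFile() {
        if (data_ != nullptr) ::munmap(data_, size_);
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    // Views the mapped bytes as an array of T (no copy is made).
    template <typename T>
    const T* as() const { return static_cast<const T*>(data_); }

    // Number of whole T elements in the file.
    template <typename T>
    std::size_t count() const { return size_ / sizeof(T); }

private:
    void* data_ = nullptr;
    std::size_t size_ = 0;
};
```

The operating system then pages the data in on demand, so only the parts of the mesh that are actually touched get read from disk.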

Notes/caveats:

  • If you write and read binary data on systems with different memory layout of certain values (e.g. little-endian vs big-endian) - you'll need to shuffle the bytes around in memory. See also this SO question about endianness.
  • It might not be worth it to optimize the reading speed as much as possible. You should consider Amdahl's law, and only optimize it to a point where it's no longer a significant fraction of your overall execution time. It's better to lose a few percentage points of execution time, but get human-readable data files which can be used with other tools supporting an established format.
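For the endianness caveat, the byte shuffling could look like the sketch below. It assumes, purely for illustration, that the files are always stored little-endian; the helper names `host_is_little_endian`, `byte_swapped` and `little_endian_to_host` are hypothetical. For arrays of structures, the swap would have to be applied to each field separately rather than to the structure as a whole.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Returns true when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Returns a copy of value with its bytes in reverse order.
template <typename T>
T byte_swapped(T value) {
    static_assert(std::is_arithmetic<T>::value,
                  "swap scalar fields one by one, not whole structures");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));
    std::reverse(bytes, bytes + sizeof(T));
    std::memcpy(&value, bytes, sizeof(T));
    return value;
}

// Converts an array that was read from a little-endian file (an assumption
// of this sketch) into host byte order, in place. On a little-endian host
// this does nothing.
template <typename T>
void little_endian_to_host(T* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    if (host_is_little_endian()) return;
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = byte_swapped(data[i]);
    }
}
```

On the common case where writer and reader share the same byte order, the conversion is skipped entirely, so it costs nothing on the fast path.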
einpoklum answered Sep 21 '22 17:09