
Best method of Textfile Parsing in C#?

Tags: c#, fileparse

I want to parse a config file sorta thing, like so:

[KEY:Value]     
    [SUBKEY:SubValue]

Now I started with a StreamReader, converting lines into character arrays, when I figured there's gotta be a better way. So I ask you, humble reader, to help me.

One restriction is that it has to work in a Linux/Mono environment (1.2.6 to be exact). I don't have the latest 2.0 release (of Mono), so try to restrict language features to C# 2.0 or C# 1.0.

Bernard asked Aug 17 '08 22:08



2 Answers

I considered it, but I'm not going to use XML. I am going to be writing this stuff by hand, and hand editing XML makes my brain hurt. :')

Have you looked at YAML?

You get the benefits of XML without all the pain and suffering. It's used extensively in the Ruby community for things like config files, pre-prepared database data, etc.

Here's an example:

customer:
  name: Orion
  age: 26
  addresses:
    - type: Work
      number: 12
      street: Bob Street
    - type: Home
      number: 15
      street: Secret Road

There appears to be a C# library here, which I haven't used personally, but YAML is pretty simple, so "how hard can it be?" :-)

I'd say it's preferable to inventing your own ad-hoc format (and dealing with parser bugs).

Orion Edwards answered Oct 13 '22 05:10


I was looking at almost this exact problem the other day: this article on string tokenizing is exactly what you need. You'll want to define your tokens as something like:

@"(?<level>\s) | " +
@"(?<term>[^:\s]) | " +
@"(?<separator>:)"

The article does a pretty good job of explaining it. From there you just start eating up tokens as you see fit.

Protip: For an LL(1) parser (read: easy), tokens cannot share a prefix. If you have abc as a token, you cannot have ace as a token.

Note: The article's examples are missing the | characters; just throw them in.
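For what it's worth, here is a rough sketch of how the named-group idea above could be wired up for the [KEY:Value] format from the question. It is not the article's code. The Token and ConfigTokenizer names are made up for illustration, and a few things are my own assumptions: the brackets get their own tokens, term matches a whole run of characters rather than one, level only matches indentation at the start of a line, and RegexOptions.IgnorePatternWhitespace is passed so the spaces around | in the pattern are ignored. It sticks to C# 2.0 features (generics, no LINQ or var), but I haven't tried it on Mono 1.2.6.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical token type for this sketch; not taken from the article.
public struct Token
{
    public string Kind;   // "level", "open", "close", "separator" or "term"
    public string Text;   // the matched text

    public Token(string kind, string text)
    {
        Kind = kind;
        Text = text;
    }
}

public static class ConfigTokenizer
{
    // One named group per token kind. IgnorePatternWhitespace makes the
    // regex engine skip the spaces around '|' (whitespace inside a
    // character class such as [ \t] is still taken literally).
    private static readonly Regex TokenPattern = new Regex(
        @"(?<level>^[ \t]+) | " +       // leading indentation only (assumption)
        @"(?<open>\[) | " +             // assumption: brackets are tokens
        @"(?<close>\]) | " +
        @"(?<separator>:) | " +
        @"(?<term>[^:\[\]\s]+)",        // a run of non-delimiter characters
        RegexOptions.IgnorePatternWhitespace);

    private static readonly string[] Kinds =
        { "level", "open", "close", "separator", "term" };

    // Tokenizes a single line; text that matches no group is skipped.
    public static List<Token> Tokenize(string line)
    {
        List<Token> tokens = new List<Token>();
        foreach (Match m in TokenPattern.Matches(line))
        {
            // Find which named group produced this match.
            foreach (string kind in Kinds)
            {
                if (m.Groups[kind].Success)
                {
                    tokens.Add(new Token(kind, m.Value));
                    break;
                }
            }
        }
        return tokens;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (Token t in ConfigTokenizer.Tokenize("    [SUBKEY:SubValue]"))
        {
            Console.WriteLine("{0}: '{1}'", t.Kind, t.Text);
        }
    }
}
```

For the indented line in Main this should yield six tokens in order: level, open, term, separator, term, close. The length of the level token gives you the nesting depth, and the parser proper just consumes open, term, separator, term, close. One caveat of this sketch: a value containing spaces comes back as several term tokens, so you would need to join them or loosen the term pattern.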

eplawless answered Oct 13 '22 04:10