I Need a Human Readable, Yet Parse-able Document Format

I'm working on one of those projects where there are a million better ways to accomplish what I need but I have no choice and I have to do it this way. Here it is:

There is a web form. When the user fills it out and hits submit, a human-readable text file is created from the form data. It looks like this:

field_1: value for field one

field_2: value for field two
more data for field two (field two has a newline in it!)

field3: some more data

My problem is this: I need to parse this text file back into the web form so that the user can edit it.

How could I accomplish this in a foolproof way? A database is not an option; I have to use these text files.

My Questions:

  • Is there a foolproof way to do this using the format in the example above?
  • What human-readable format would work better? (In other words, I can change the format.)
  • Human readable means that a non programmer could read it and know what is what.

This project uses PHP.

UPDATE

By human readable I mean that anyone could read the text and not be overwhelmed by it, including your grandmother.

joshwbrick asked Apr 07 '10 21:04


2 Answers

I Need a Human Readable, Yet Parse-able Document Format

This is what YAML was designed to be. You can read more about it on their site or on Wikipedia.

To quote Wikipedia:

YAML syntax was designed to be easily mapped to data types common to most high-level languages: list, hash, and scalar. Its familiar indented outline and lean appearance makes it especially suited for tasks where humans are likely to view or edit data structures, such as configuration files, dumping during debugging, and document headers

The advantage over XML is that it doesn't use tags which might confuse users. And I think it's cleaner than INI (which was also mentioned) because it simply uses colons instead of equals signs, semicolons and quotes.

Sample YAML looks like:

invoice: 34843
date   : 2001-01-23
bill-to: &id001
    given  : Chris
    family : Dumars
    address:
        lines: |
            458 Walkman Dr.
            Suite #292
        city    : Royal Oak
        state   : MI
        postal  : 48046
ship-to: *id001
product:
    - sku         : BL394D
      quantity    : 4
      description : Basketball
      price       : 450.00
    - sku         : BL4438H
      quantity    : 1
      description : Super Hoop
      price       : 2392.00
tax  : 251.42
total: 4443.52
comments: >
    Late afternoon is best.
    Backup contact is Nancy
    Billsmer @ 338-4338.
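
For the flat form in the question, only a small corner of YAML is needed: one level of keys with text values. The sketch below is a hand-rolled illustration of that subset in plain PHP, not a general YAML implementation. The function names `form_to_yaml` and `yaml_to_form` are made up for this example, and it assumes field names contain only letters, digits, `_` and `-`, and that values are plain text whose surrounding whitespace can be trimmed. A real project would more likely rely on a YAML library such as the PECL yaml extension or Symfony's Yaml component.

```php
<?php
// Hypothetical helpers for a flat subset of YAML: one level of keys,
// string values only. An illustration, not a general YAML parser.

function form_to_yaml(array $fields): string
{
    $out = '';
    foreach ($fields as $key => $value) {
        if (!preg_match('/^[A-Za-z0-9_-]+$/', (string) $key)) {
            throw new InvalidArgumentException("Unsupported key: $key");
        }
        // Normalize line endings and drop surrounding whitespace (assumption).
        $value = trim(str_replace(["\r\n", "\r"], "\n", (string) $value));
        if (strpos($value, "\n") !== false) {
            // Multi-line text: literal block scalar, indented four spaces.
            $out .= "$key: |-\n";
            foreach (explode("\n", $value) as $line) {
                $out .= ($line === '' ? '' : '    ' . $line) . "\n";
            }
        } else {
            // Single line: double-quoted so ':' or '#' in the text stay literal.
            $escaped = strtr($value, ['\\' => '\\\\', '"' => '\\"']);
            $out .= $key . ': "' . $escaped . "\"\n";
        }
        $out .= "\n"; // blank line between fields for readability
    }
    return $out;
}

function yaml_to_form(string $text): array
{
    $lines = explode("\n", str_replace(["\r\n", "\r"], "\n", $text));
    $count = count($lines);
    $fields = [];
    for ($i = 0; $i < $count; $i++) {
        $line = $lines[$i];
        if (trim($line) === '') {
            continue; // blank separator between fields
        }
        if (preg_match('/^([A-Za-z0-9_-]+):\s*\|-\s*$/', $line, $m)) {
            // Collect the indented (or blank) lines that follow the key.
            $block = [];
            while ($i + 1 < $count
                && ($lines[$i + 1] === '' || strncmp($lines[$i + 1], '    ', 4) === 0)) {
                $i++;
                $block[] = (string) substr($lines[$i], 4);
            }
            $fields[$m[1]] = rtrim(implode("\n", $block), "\n");
        } elseif (preg_match('/^([A-Za-z0-9_-]+):\s*"(.*)"\s*$/', $line, $m)) {
            // Undo the two escapes the writer produces.
            $fields[$m[1]] = strtr($m[2], ['\\\\' => '\\', '\\"' => '"']);
        } else {
            throw new UnexpectedValueException('Cannot parse line ' . ($i + 1) . ": $line");
        }
    }
    return $fields;
}
```

With the question's data, single-line fields are written as `field_1: "value for field one"`, and the two-line field becomes `field_2: |-` followed by its lines indented four spaces. Quoting every single-line value costs a little readability but keeps text such as `yes` or `a: b` from being reinterpreted by a stricter YAML reader.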
Josh answered Sep 18 '22 15:09


I'd say either use

  • INI files or
  • YAML or
  • Markdown or
  • Textile

or just about any lightweight markup language you deem appropriate.
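
If the `label: value` layout from the question has to stay as it is, one way to reduce the ambiguity is to treat a line as the start of a field only when its label is one of the form's known field names, and to treat every other line as a continuation of the current field. The sketch below assumes that approach. `parse_form_text` and its `$knownFields` parameter are hypothetical names, and a continuation line that itself begins with a known label and a colon would still be misread, so this narrows the problem rather than eliminating it.

```php
<?php
// Hypothetical helper: parse the "label: value" layout from the question.
// A line starts a new field only if its label is listed in $knownFields;
// any other line is appended to the field currently being read.

function parse_form_text(string $text, array $knownFields): array
{
    $fields = [];
    $current = null;
    $lines = explode("\n", str_replace(["\r\n", "\r"], "\n", $text));
    foreach ($lines as $line) {
        if (preg_match('/^([A-Za-z0-9_]+):(.*)$/', $line, $m)
            && in_array($m[1], $knownFields, true)) {
            $current = $m[1];
            $fields[$current] = $m[2];
        } elseif ($current !== null) {
            $fields[$current] .= "\n" . $line;
        }
        // Lines before the first known label are ignored.
    }
    // Blank separator lines were appended as continuations; trim them off.
    return array_map('trim', $fields);
}
```

Because the form already knows its own field names, the list passed as `$knownFields` can come straight from the form definition.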

Gordon answered Sep 20 '22 15:09