Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

PHP library for creating/manipulating fixed-width text files

We have a web application that does time-tracking, payroll, and HR. As a result, we have to write a lot of fixed-width data files for export into other systems (state tax filings, ACH files, etc). Does anyone know of a good library for this where you can define the record types/structures, and then act on them in an OOP paradigm?

The idea would be a class that you hand specifications, and then work with an instance of said specification. IE:

$icesa_file = new FixedWidthFile();
$icesa_file->setSpecification('icesa.xml');
$icesa_file->addEmployer( $some_data_structure );

Where icesa.xml is a file that contains the spec, although you could just use OOP calls to define it yourself:

$specification = new FixedWidthFileSpecification('ICESA');
$specification->addRecordType(
    $record_type_name = 'Employer',
    $record_fields = array(
         array('Field Name', Width, Vailditation Type, options)
         )
     );

EDIT: I'm not looking for advice on how to write such a library--I just wanted to know if one already existed. Thank you!!

like image 752
Bryan Agee Avatar asked Apr 25 '11 22:04

Bryan Agee


People also ask

How do I create a fixed width file?

To configure a fixed-width file format, you specify the number of columns and the width, name, and datatype for each column. You can also set advanced fixed-width format properties. For example, you can specify how to handle null characters or specify the default date format for each column.

Can a text file be fixed width?

Data in a fixed-width text file is arranged in rows and columns, with one entry per row. Each column has a fixed width, specified in characters, which determines the maximum amount of data it can contain. No delimiters are used to separate the fields in the file.

What is fixed width textfield?

A fixed-width file (or flat file database) is a text file where the number and widths of fields is fixed. Data in each field is padded using a padding character (default is the SPACE (\u0020) character).


1 Answers

I don't know of a library that does exactly what you want, but it should be rather straight-forward to roll your own classes that handle this. Assuming that you are mainly interested in writing data in these formats, I would use the following approach:

(1) Write a lightweight formatter class for fixed width strings. It must support user defined record types and should be flexible with regard to allowed formats

(2) Instantiate this class for every file format you use and add required record types

(3) Use this formatter to format your data

As you suggested, you could define the record types in XML and load this XML file in step (2). I don't know how experienced you are with XML, but in my experience XML formats often causes a lot of headaches (probably due to my own incompetence regarding XML). If you are going to use these classes only in your PHP program, there's not much to gain from defining your format in XML. Using XML is a good option if you will need to use the file format definitions in many other applications as well.

To illustrate my ideas, here is how I think you would use this suggested formatter class:

<?php
include 'FixedWidthFormatter.php' // contains the FixedWidthFormatter class
include 'icesa-format-declaration.php' // contains $icesaFormatter
$file = fopen("icesafile.txt", "w");

fputs ($file, $icesaFormatter->formatRecord( 'A-RECORD', array( 
    'year' => 2011, 
    'tein' => '12-3456789-P',
    'tname'=> 'Willie Nelson'
)));
// output: A2011123456789UTAX     Willie Nelson                                     

// etc...

fclose ($file);
?>

The file icesa-format-declaration.php could contain the declaration of the format somehow like this:

<?php
$icesaFormatter = new FixedWidthFormatter();
$icesaFormatter->addRecordType( 'A-RECORD', array(
    // the first field is the record identifier
    // for A records, this is simply the character A
    'record-identifier' => array(
        'value' => 'A',  // constant string
        'length' => 1 // not strictly necessary
                      // used for error checking
    ),
    // the year is a 4 digit field
    // it can simply be formatted printf style
    // sourceField defines which key from the input array is used
    'year' =>  array(
        'format' => '% -4d',  // 4 characters, left justified, space padded
        'length' => 4,
        'sourceField' => 'year'
    ),
    // the EIN is a more complicated field
    // we must strip hyphens and suffixes, so we define
    // a closure that performs this formatting
    'transmitter-ein' => array(
        'formatter'=> function($EIN){
            $cleanedEIN =  preg_replace('/\D+/','',$EIN); // remove anything that's not a digit
            return sprintf('% -9d', $cleanedEIN); // left justified and padded with blanks
        },
        'length' => 9,
        'sourceField' => 'tein'
    ),
    'tax-entity-code' => array(
        'value' => 'UTAX',  // constant string
        'length' => 4
    ),
    'blanks' => array(
        'value' => '     ',  // constant string
        'length' => 5
    ),
    'transmitter-name' =>  array(
        'format' => '% -50s',  // 50 characters, left justified, space padded
        'length' => 50,
        'sourceField' => 'tname'
    ),
    // etc. etc.
));
?>

Then you only need the FixedWidthFormatter class itself, which could look like this:

<?php

class FixedWidthFormatter {

    var $recordTypes = array();

    function addRecordType( $recordTypeName, $recordTypeDeclaration ){
        // perform some checking to make sure that $recordTypeDeclaration is valid
        $this->recordTypes[$recordTypeName] = $recordTypeDeclaration;
    }

    function formatRecord( $type, $data ) {
        if (!array_key_exists($type, $this->recordTypes)) {
            trigger_error("Undefinded record type: '$type'");
            return "";
        }
        $output = '';
        $typeDeclaration = $this->recordTypes[$type];
        foreach($typeDeclaration as $fieldName => $fieldDeclaration) {
            // there are three possible field variants:
            //  - constant fields
            //  - fields formatted with printf
            //  - fields formatted with a custom function/closure
            if (array_key_exists('value',$fieldDeclaration)) {
                $value = $fieldDeclaration['value'];
            } else if (array_key_exists('format',$fieldDeclaration)) {
                $value = sprintf($fieldDeclaration['format'], $data[$fieldDeclaration['sourceField']]);
            } else if (array_key_exists('formatter',$fieldDeclaration)) {
                $value = $fieldDeclaration['formatter']($data[$fieldDeclaration['sourceField']]);
            } else {
                trigger_error("Invalid field declaration for field '$fieldName' record type '$type'");
                return '';
            }

            // check if the formatted value has the right length
            if (strlen($value)!=$fieldDeclaration['length']) {
                trigger_error("The formatted value '$value' for field '$fieldName' record type '$type' is not of correct length ({$fieldDeclaration['length']}).");
                return '';
            }
            $output .= $value;
        }
        return $output . "\n";
    }
}


?>

If you need read support as well, the Formatter class could be extended to allow reading as well, but this might be beyond the scope of this answer.

like image 160
Jakob Egger Avatar answered Sep 26 '22 14:09

Jakob Egger