
Cobol String Delimited By Trailing SPACES

WORKING-STORAGE SECTION.
01  FIRST-STRING      PIC X(15) VALUE SPACES.
01  SECOND-STRING     PIC X(15) VALUE SPACES.
01  OUTPUT-STRING     PIC X(31) VALUE SPACES.

If FIRST-STRING = 'JON SNOW, ' and SECOND-STRING = 'KNOWS NOTHING. ', I want to get:

OUTPUT-STRING = 'JON SNOW, KNOWS NOTHING.         '

When I try :

String FIRST-STRING DELIMITED BY SPACES
       ' ' DELIMITED BY SIZE
       SECOND-STRING DELIMITED BY SIZE
       INTO OUTPUT-STRING

I get 'JON KNOWS NOTHING. '

And when I try:

String FIRST-STRING DELIMITED BY SIZE
       SECOND-STRING DELIMITED BY SIZE
       INTO OUTPUT-STRING

I get 'JON SNOW,      KNOWS NOTHING.  '

I have found a tweak, which consists of String FIRST-STRING DELIMITED BY '  ' (two spaces). But there is no guarantee that my FIRST-STRING doesn't contain two consecutive spaces, which would result in losing part of it.

raz_user asked Dec 02 '22 16:12


1 Answer

Firstly, kudos, as many would go with the delimited-by-two-spaces and not be at all concerned at the possible consequences. Note that if the data is followed by one trailing space only, you also get "unexpected" output. Note also that your field definition for OUTPUT-STRING is one byte short, as you are inserting a space to separate the data. With both fields entirely filled with data, you will lose the final byte of SECOND-STRING.

COBOL is a language of fixed-length fields (except when they are variable). This means there are no "standard" delimiters, so any character or value can appear at any position in a field. Further, the default padding character, used where the source is shorter than the target field, is space, which is also a perfectly normal separator for words.

In your case, and in the many similar ones, you need to know the length of the actual data part of your field (excluding the trailing blanks).

A very common way to do this is as suggested by @user4341206 in their answer, https://stackoverflow.com/a/31938039/1927206.

Under the 1985 COBOL Standard, INSPECT can be used to count leading spaces, but cannot be used to count trailing spaces. FUNCTION REVERSE can be used first to turn trailing spaces into leading spaces, so that INSPECT can count them.

Once you know the number of trailing blanks, you can use the LENGTH OF special-register or FUNCTION LENGTH to determine the length of the fixed-length field (both are (or can be, depending on compiler) evaluated at compile time). The difference between the length of the field and the number of trailing blanks gives you the length of the data.
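The two steps above (reverse and count, then subtract from the field length) might be sketched like this. It is only an illustration: the program name and the W- fields are made up for the example, the sample value is taken from the question, and the `*>` comment lines assume a compiler that accepts COBOL 2002 style comments.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRAILLEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIRST-STRING          PIC X(15) VALUE 'JON SNOW,'.
      *> hypothetical work fields, not part of the question
       01  W-TRAILING-BLANKS     PIC 9(4) COMP VALUE ZERO.
       01  W-DATA-LENGTH         PIC 9(4) COMP VALUE ZERO.
       PROCEDURE DIVISION.
      *> INSPECT adds to the tally, so it must be reset first
           MOVE ZERO TO W-TRAILING-BLANKS
      *> trailing blanks of the field are leading blanks of its reverse
           INSPECT FUNCTION REVERSE ( FIRST-STRING )
               TALLYING W-TRAILING-BLANKS FOR LEADING SPACES
      *> field length minus trailing blanks gives the data length
           COMPUTE W-DATA-LENGTH =
               FUNCTION LENGTH ( FIRST-STRING ) - W-TRAILING-BLANKS
           DISPLAY 'DATA LENGTH: ' W-DATA-LENGTH
           GOBACK.
```

With the sample value the data length works out to nine. For an entirely blank field it works out to zero, which matters for the reference-modification discussed further down.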

Once you have the length of the data, bear in mind that it may be zero (if the field can be entirely blank, which depends on the possibilities for the data) and that it may be the same as the length of the field (if there are no trailing blanks at all).

Be aware that if you have a lot of data, you may prefer a simple loop from the end of the field to count the trailing blanks over reversing your field and using INSPECT (which is probably a run-time routine).
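A minimal sketch of such a loop follows, again with invented names (TRAILOOP, W-DATA-LENGTH) and the sample value from the question. It relies on the left-to-right evaluation of the combined condition, so the reference-modification is not evaluated once the length has reached zero.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRAILOOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIRST-STRING          PIC X(15) VALUE 'JON SNOW,'.
      *> hypothetical work field, not part of the question
       01  W-DATA-LENGTH         PIC 9(4) COMP VALUE ZERO.
       PROCEDURE DIVISION.
      *> start at the last byte of the field
           COMPUTE W-DATA-LENGTH = FUNCTION LENGTH ( FIRST-STRING )
      *> step backwards until a non-blank byte is found, or until
      *> the field turns out to be entirely blank (length zero)
           PERFORM UNTIL W-DATA-LENGTH = ZERO
                      OR FIRST-STRING ( W-DATA-LENGTH : 1 )
                             NOT = SPACE
               SUBTRACT 1 FROM W-DATA-LENGTH
           END-PERFORM
           DISPLAY 'DATA LENGTH: ' W-DATA-LENGTH
           GOBACK.
```

The loop leaves the data length directly in W-DATA-LENGTH, so no separate subtraction from the field length is needed.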

Note that compilers like AcuCOBOL (now part of Micro Focus's COBOL offerings) have a Language Extension which offers TRAILING as an option for INSPECT. Note also that even the 2014 COBOL Standard does not have TRAILING as an option for INSPECT.

Either way, with the length of the data you are done. Sort of.

You can use reference-modification within the STRING statement:

String FIRST-STRING ( 1 : length-field-you-define ) DELIMITED BY SIZE
       ' ' DELIMITED BY SIZE
       SECOND-STRING DELIMITED BY SIZE
   INTO OUTPUT-STRING

Note, you should be able to remove BY SIZE, as SIZE is the default, but it does make it clearer to the human reader.

You could also use MOVE with reference-modification on the target field:

MOVE FIRST-STRING            TO OUTPUT-STRING  
                                 ( 1 : length-field-you-define )
MOVE SPACE                   TO OUTPUT-STRING  
                                 ( length-field-you-define + 1 : 1 )
MOVE SECOND-STRING           TO OUTPUT-STRING  
                                 ( length-field-you-define + 2 :  )

There is a specific issue with reference-modification (mentioned on the other answer), which is that your length field should not be zero.

The evaluation of length shall result in a positive nonzero integer.

Length in this context is the second item, after the :, in the reference-modification notation. In this case it means length-field-you-define must not be zero, which is what the calculation would give if FIRST-STRING is entirely space.

The potential problem is with this:

MOVE FIRST-STRING            TO OUTPUT-STRING  
                                 ( 1 : length-field-you-define )

Therefore, depending on your data (if FIRST-STRING may be entirely blank), you have to "protect" against that.

    IF FIRST-STRING EQUAL TO SPACE
        PERFORM                  COPY-SECOND-STRING-ONLY
    ELSE
        PERFORM                  CONCATENATE-FIRST-AND-SECOND
    END-IF
    ...
COPY-SECOND-STRING-ONLY.
    MOVE SECOND-STRING           TO OUTPUT-STRING
    .
CONCATENATE-FIRST-AND-SECOND.
    *> calculate length-field-you-define (the data length of FIRST-STRING)
    MOVE FIRST-STRING            TO OUTPUT-STRING  
                                    ( 1 : length-field-you-define )
    MOVE SPACE                   TO OUTPUT-STRING  
                                    ( length-field-you-define + 1 : 1 )
    MOVE SECOND-STRING           TO OUTPUT-STRING  
                                    ( length-field-you-define + 2 :  )
    .

If you use a reference-modification with a length of zero, the result is undefined, although it may "work" with your compiler.

The solutions with STRING and the variable-length fields won't "fail", because the compiler outside of reference-modification is happy with zero length items.

However, the same "protection" should be employed for three reasons: without it you'll insert a leading blank (the "separator"); you'll make your code explicit, so people won't have to ask themselves "what happens when the first field is blank"; you'll save on processing.

In this way also your program "describes your data" better. Along with "know your data" as a necessity for accurate program design, the more your program describes the data, the more difficult it is to create errors of commission or omission, the easier it is to understand, and the easier it is to change when, as happens, the structure of the data changes.

You could also look at STRING using the WITH POINTER option. Firstly, MOVE FIRST-STRING to OUTPUT-STRING (which will also set the unused bytes in OUTPUT-STRING to space). Then set a pointer field to length-field-you-define plus two (one for the intervening space, and one more because the pointer names the position of the next byte to be written) and use that in the STRING for the WITH POINTER.

Although that is perfectly valid, if you use it, it is an occasion for a comment, as many people who regularly use STRING have no idea of the use of WITH POINTER, so help them out.
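A sketch of the WITH POINTER approach, under some assumptions: the data length is hard-coded to nine here rather than calculated, and STRPTR, W-DATA-LENGTH and W-STRING-POINTER are names made up for the example. The pointer holds the position in OUTPUT-STRING of the first byte STRING will write, so the sketch sets it to the data length plus two, which leaves the byte after the data as the separating space.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRPTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIRST-STRING          PIC X(15) VALUE 'JON SNOW,'.
       01  SECOND-STRING         PIC X(15) VALUE 'KNOWS NOTHING.'.
       01  OUTPUT-STRING         PIC X(31) VALUE SPACES.
      *> assumption: data length of FIRST-STRING already known (9)
       01  W-DATA-LENGTH         PIC 9(4) COMP VALUE 9.
       01  W-STRING-POINTER      PIC 9(4) COMP VALUE ZERO.
       PROCEDURE DIVISION.
      *> the MOVE copies the data and blanks the rest of the output
           MOVE FIRST-STRING TO OUTPUT-STRING
      *> WITH POINTER: STRING starts writing at this position in
      *> OUTPUT-STRING, one byte past the separating space
           COMPUTE W-STRING-POINTER = W-DATA-LENGTH + 2
           STRING SECOND-STRING DELIMITED BY SIZE
               INTO OUTPUT-STRING
               WITH POINTER W-STRING-POINTER
           END-STRING
           DISPLAY '[' OUTPUT-STRING ']'
           GOBACK.
```

After the STRING, the pointer field has been advanced past the last byte written, which is useful if more data is to be appended later.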

A further possibility is to use variable-length fields.

Unfortunately, not all COBOL compilers make this easy. A "Complex ODO", which this would require in its purest form, is non-Standard, but is an IBM Extension to the language.

LINKAGE SECTION.
01  L-MAPPING-OF-OUTPUT-STRING.
    05  L-MOOS-FIRST-STRING.
        10  FILLER OCCURS 0 TO 15 TIMES
            DEPENDING ON length-field-you-define.
            15  FILLER                          PIC X.
    05  L-MOOS-SEPARATOR-SPACE                  PIC X.
    05  L-MOOS-SECOND-STRING                    PIC X(15).

    ...
    SET ADDRESS OF L-MAPPING-OF-OUTPUT-STRING
                             TO ADDRESS OF 
                                 OUTPUT-STRING  
    MOVE FIRST-STRING        TO L-MOOS-FIRST-STRING
    MOVE SPACE               TO L-MOOS-SEPARATOR-SPACE
    MOVE SECOND-STRING       TO L-MOOS-SECOND-STRING

If you have lots of data, the fastest way is the reference-modification-only suggestion. My opinion of reference-modification is that it tends to obfuscate, because people tend to use it in an obfuscatory (and unnecessary) way.

My preference is for the last, where the PROCEDURE DIVISION code is very simple: you find the length of the data in the first field; you just do three plain MOVEs.

Perhaps you can try each, to become more aware of the possibilities for future situations.

Bill Woodger answered Dec 05 '22 05:12