
Working with strings in C

Tags:

c

Could someone please help me understand these lines of code in the program below? According to the writer, the program takes the string "hello world" and contains a function that reverses it to "world hello". My question is: what does this code do?

char * p_divs = divs; //what does divs do
    char tmp;
    while(tmp = *p_divs++)
        if (tmp == c) return 1;

Also, this code in the void function:

*dest = '\0';//what does this pointer do?
    int source_len = strlen(source); //what is source
    if (source_len == 0) return;
    char * p_source = source + source_len - 1;
    char * p_dest = dest;
    while(p_source >= source){
        while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;

This is the main program:

#include <stdio.h>
#include <string.h>

int inDiv(char c, char * divs){
    char * p_divs = divs;
    char tmp;
    while(tmp = *p_divs++)
        if (tmp == c) return 1;
    return 0;
}

void reverse(char * source, char * dest, char * divs){
    *dest = '\0';
    int source_len = strlen(source);
    if (source_len == 0) return;
    char * p_source = source + source_len - 1;
    char * p_dest = dest;
    while(p_source >= source){
        while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;
        if (p_source < source) break;
        char * w_end = p_source;
        while((p_source >= source) && (!inDiv(*p_source, divs))) p_source--;
        char * w_beg = p_source + 1;
        for(char * p = w_beg; p <= w_end; p++) *p_dest++ = *p;
        *p_dest++ = ' ';
    }
    *p_dest = '\0';
}

#define MAS_SIZE 100

int main(){
    char source[MAS_SIZE], dest[MAS_SIZE], divs[MAS_SIZE];
    printf("String          : "); gets(source);
    printf("Dividers        : "); gets(divs);
    reverse(source, dest, divs);
    printf("Reversed string : %s", dest);
    return 0;  
}
kryticrecte asked Feb 20 '23 19:02


1 Answer

Here, inDiv can be called to search for the character c in the string divs, for example:

inDiv('x', "is there an x character in here somewhere?") will return 1
inDiv('x', "ahhh... not this time") will return 0

Working through it:

int inDiv(char c, char * divs)
{
    char * p_divs = divs;    // remember which character we're considering
    char tmp;
    while(tmp = *p_divs++)   // copy that character into tmp, and move p_divs to the next character
                             // but if tmp is then 0/false, break out of the while loop
         if (tmp == c) return 1;  // if tmp is the character we're searching for, return "1" meaning found
    return 0;   // must be here because tmp == 0 indicating end-of-string - return "0" meaning not-found
}
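
As a rough sketch of the same search written a little more explicitly, the two functions below should behave like inDiv. The names in_div and in_div_strchr are made up for this illustration and are not part of the original program. The second variant leans on the standard strchr(), which also matches the terminating NUL, so that case is excluded by hand to keep the results the same.

```c
#include <string.h>

/* Same search as inDiv, with the end-of-string test spelled out. */
int in_div(char c, const char *divs)
{
    for (const char *p = divs; *p != '\0'; p++) {
        if (*p == c)
            return 1;   /* c is one of the divider characters */
    }
    return 0;           /* reached the terminating NUL without a match */
}

/* Variant built on strchr(). strchr() treats the terminator as part of
   the string, so c == '\0' is rejected first to match inDiv. */
int in_div_strchr(char c, const char *divs)
{
    return c != '\0' && strchr(divs, c) != NULL;
}
```

Calling either one with 'x' and the two example strings above gives 1 and 0 respectively.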

We can infer things about reverse by looking at the call site:

int main()
{
    char source[MAS_SIZE], dest[MAS_SIZE], divs[MAS_SIZE];
    printf("String          : ");
    gets(source);
    printf("Dividers        : ");
    gets(divs);
    reverse(source, dest, divs);
    printf("Reversed string : %s", dest);

We can see gets() called to read from standard input into the character arrays source and divs; those inputs are then passed to reverse(). The way dest is printed, it's clearly meant to be a destination for the reversal of the string in source. At this stage, there's no insight into the relevance of divs.

Let's look at the source...

void reverse(char * source, char * dest, char * divs)
{
    *dest = '\0'; //what does this pointer do?
    int source_len = strlen(source); //what is source
    if (source_len == 0) return;
    char* p_source = source + source_len - 1;
    char* p_dest = dest;
    while(p_source >= source)
    {
        while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;

Here, *dest = '\0' writes a NUL character into the character array dest. NUL is the normal sentinel value marking the end-of-string position, so putting it at the first character *dest means the destination starts out as an empty string. We know source is the textual input that we'll be reversing; strlen() sets source_len to the number of characters in it. If there are no characters, the function returns, as there's no work to do and the output is already terminated with NUL. Otherwise, a new pointer p_source is created and initialised to source + source_len - 1, which means it's pointing at the last non-NUL character in source. p_dest points at the NUL character at the start of the destination buffer.
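
To make those two idioms concrete, here is a small sketch. The helper names last_char and clear_string are hypothetical and only exist for this illustration. It isolates "point at the last character" and "make the buffer an empty string" from the rest of reverse().

```c
#include <string.h>

/* Returns a pointer to the last character before the terminating NUL,
   or NULL for an empty string, where no such character exists. */
const char *last_char(const char *s)
{
    size_t len = strlen(s);
    return len == 0 ? NULL : s + len - 1;
}

/* Turns buf into an empty string by writing the terminator at index 0.
   Anything after index 0 is left in place but is no longer "seen". */
void clear_string(char *buf)
{
    *buf = '\0';
}
```

For "hello", last_char points at the 'o'. This is also why reverse() has to return early for an empty source: source + 0 - 1 would already be a pointer to before the start of the buffer.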

Then the loop says: while (p_source >= source) - for this to do anything p_source must initially be >= source - that makes sense as p_source points at the last character and source is the first character address in the buffer; the comparison implies we'll be moving one or both towards the other until they would cross over - doing some work each time. Which brings us to:

while((p_source >= source) && (inDiv(*p_source, divs))) p_source--;

This is the same test we've just seen, but this time we only move p_source backwards towards the start of the string while inDiv(*p_source, divs) is also true, meaning the character at *p_source is one of the characters in the divs string. In other words: move backwards until you've gone past the start of the string or until you find a character that's not one of the "divider" characters. Note that the "gone past the start" case has undefined behaviour, as Michael Burr points out in comments: decrementing p_source below source forms a pointer to before the start of the array, and the test really might not work if the string happens to be allocated at address 0 (even if relative to some specific data segment), as the pointer could go from 0 to something like FFFFFFFF hex without ever seeming to be less than 0.

Here we get some real insight into what the code's doing... dividing the input into "words" separated by any of a set of characters in the divs input, then writing them in reverse order with space delimiters into the destination buffer. That's getting ahead of ourselves a bit - but let's track it through:

The next line is...

if (p_source < source) break;

...which means that if the divider-skipping loop backed past the front of the source string, we break out of the outer loop (looking ahead, we see the code then just puts a new NUL on the end of the already-generated output and returns). That only happens when everything left at the front of the string is divider characters, so no word is lost: if we'd been backing through the "hello" in "hello world", p_source would still be pointing at the 'o' at this point, the break would not be taken, and the code below would copy "hello" to the output.

Otherwise:

char* w_end = p_source;  // remember where the non-divider character "word" ends

// move backwards until there are no more characters (p_source < source) or you find a divider character
while((p_source >= source) && (!inDiv(*p_source, divs))) p_source--;

// either way that loop exited, the "word" begins at p_source + 1
char * w_beg = p_source + 1;

// append the word between w_beg and w_end to the destination buffer
for(char* p = w_beg; p <= w_end; p++) *p_dest++ = *p;

// also add a space...
*p_dest++ = ' ';

This keeps happening for each "word" in the input, then the final line adds a NUL terminator to the destination.

*p_dest = '\0';

Now, you said:

according [to] the writer it writes a string of hello world then there is a function in it that also reverses the string to world hello

Well, given inputs "hello world" and divider characters including a space (but none of the other characters in the input), the output would be "world hello " (note the space at the end), which matches that description apart from the trailing space.

For what it's worth - this code isn't that bad... it's pretty normal for C handling of ASCIIZ buffers, though the assumptions about the length of the input are dangerous (gets() cannot limit how much it reads) and it leaves a trailing space on the output....
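
On the input-length point, one common way to bound the read is fgets() in place of gets(). The helper below is only a sketch; the name read_line and its return convention are assumptions made for this example, not something from the original program.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into buf, storing at most size - 1 characters,
   and strips the trailing newline that fgets() keeps.
   Returns 1 on success, 0 on end-of-file or read error. */
int read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* cut the string at the newline, if any */
    return 1;
}
```

In main() this could stand in for the two gets() calls, for example read_line(source, sizeof source, stdin). A line longer than the buffer is split across successive reads in this sketch rather than overflowing the array.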

** How to fix the undefined behaviour **

Regarding the undefined behaviour - the smallest change to address that is to change the loops so they terminate when at the start of the buffer, and have the next line explicitly check why it terminated and work out what behaviour is required. That will be a bit ugly, but isn't rocket science....
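
One possible shape for that change is sketched below, under a few assumptions of my own: the names reverse_words and is_divider, the extra dest_size parameter, and the decision not to emit a trailing space are not in the original code. It counts the characters still to be processed with an index that stops at 0, so no pointer is ever moved in front of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if c is one of the characters in divs, 0 otherwise. */
static int is_divider(char c, const char *divs)
{
    return c != '\0' && strchr(divs, c) != NULL;
}

/* Writes the words of source into dest in reverse order, separated by
   single spaces. At most dest_size - 1 characters are stored, so longer
   results are truncated rather than overflowing dest. */
void reverse_words(const char *source, char *dest, size_t dest_size,
                   const char *divs)
{
    size_t out = 0;                 /* next free position in dest */
    size_t end = strlen(source);    /* characters not yet processed */

    if (dest_size == 0)
        return;

    while (end > 0) {
        /* Scanning right to left, skip dividers that follow the word. */
        while (end > 0 && is_divider(source[end - 1], divs))
            end--;
        if (end == 0)
            break;                  /* only dividers were left */

        /* Walk back to the first character of the word. */
        size_t beg = end;
        while (beg > 0 && !is_divider(source[beg - 1], divs))
            beg--;

        /* Put a space between words, but not after the last one. */
        if (out > 0 && out + 1 < dest_size)
            dest[out++] = ' ';
        for (size_t i = beg; i < end && out + 1 < dest_size; i++)
            dest[out++] = source[i];

        end = beg;
    }
    dest[out] = '\0';
}
```

With source "hello world" and divs " ", this leaves "world hello" in dest. The word [beg, end) is always described by indices that stay within 0..strlen(source), which is what removes the out-of-range pointer.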

Tony Delroy answered Mar 05 '23 06:03