
How do I remove the last few characters from a string in C?

Tags:

c

I know I can use substr() to get the first n characters of a string. However, I want to remove the last few characters. Is it valid to use -2 or -3 as the end position in C, the way I can in Python?

TheRookierLearner asked Jan 27 '13 06:01


4 Answers

You can simply place a null termination character right where you want the string to end like so:

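#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[] = "I am a string";
    size_t len = strlen(s);     /* strlen() returns size_t, not int */
    s[len - 3] = '\0';          /* terminate three characters early */
    printf("%s\n", s);
    return 0;
}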

This would give you:

I am a str

Lefteris answered Nov 03 '22 00:11


C is not like Python; string indices are not "smart". Saying str[-3] quite literally means "the character three bytes before the start"; accessing this memory is undefined behaviour.

If you want to get the last few characters of a string as another string, it suffices to get a pointer to the first character you want:

char *endstr = str + (strlen(str) - 3); // get last 3 characters of the string

If you want to delete the last few characters, it suffices to set the kth-from-the-end character to a null (\0):

str[strlen(str)-3] = 0; // delete last three characters
nneonneo answered Nov 03 '22 00:11


Here's a possible implementation of a substr() function, including test code. Note that the test code does not push the boundaries — buffer length shorter than requested string or buffer length of zero.

#include <string.h>

extern void substr(char *buffer, size_t buflen, char const *source, int len);

/*
** Given substr(buffer, sizeof(buffer), "string", len), then the output
** in buffer for different values of len is:
** For positive values of len:
** 0    ""
** 1    "s"
** 2    "st"
** ...
** 6    "string"
** 7    "string"
** ...
** For negative values of len:
** -1   "g"
** -2   "ng"
** ...
** -6   "string"
** -7   "string"
** ...
** Subject to buffer being long enough.
** If buffer is too short, the empty string is set (unless buflen is 0,
** in which case, everything is left untouched).
*/
void substr(char *buffer, size_t buflen, char const *source, int len)
{
    size_t srclen = strlen(source);
    size_t nbytes = 0;
    size_t offset = 0;
    size_t sublen;

    if (buflen == 0)    /* Can't write anything anywhere */
        return;
    if (len > 0)
    {
        sublen = len;
        nbytes = (sublen > srclen) ? srclen : sublen;
        offset = 0;
    }
    else if (len < 0)
    {
        sublen = -len;
        nbytes = (sublen > srclen) ? srclen : sublen;
        offset = srclen - nbytes;
    }
    if (nbytes >= buflen)
        nbytes = 0;
    if (nbytes > 0)
        memmove(buffer, source + offset, nbytes);
    buffer[nbytes] = '\0';
}

#ifdef TEST

#include <stdio.h>

struct test_case
{
    const char *source;
    int         length;
    const char *result;
};

static struct test_case tests[] =
{
    {   "string",  0, ""            },
    {   "string", +1, "s"           },
    {   "string", +2, "st"          },
    {   "string", +3, "str"         },
    {   "string", +4, "stri"        },
    {   "string", +5, "strin"       },
    {   "string", +6, "string"      },
    {   "string", +7, "string"      },
    {   "string", -1, "g"           },
    {   "string", -2, "ng"          },
    {   "string", -3, "ing"         },
    {   "string", -4, "ring"        },
    {   "string", -5, "tring"       },
    {   "string", -6, "string"      },
    {   "string", -7, "string"      },
};
enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };

int main(void)
{
    int pass = 0;
    int fail = 0;

    for (int i = 0; i < NUM_TESTS; i++)
    {
        char buffer[20];
        substr(buffer, sizeof(buffer), tests[i].source, tests[i].length);
        if (strcmp(buffer, tests[i].result) == 0)
        {
            printf("== PASS == %2d: substr(buffer, %zu, \"%s\", %d) = \"%s\"\n",
                   i, sizeof(buffer), tests[i].source, tests[i].length, buffer);
            pass++;
        }
        else
        {
            printf("!! FAIL !! %2d: substr(buffer, %zu, \"%s\", %d) wanted \"%s\" actual \"%s\"\n",
                   i, sizeof(buffer), tests[i].source, tests[i].length, tests[i].result, buffer);
            fail++;
        }
    }
    if (fail == 0)
    {
        printf("== PASS == %d tests passed\n", NUM_TESTS);
        return(0);
    }
    else
    {
        printf("!! FAIL !! %d tests out of %d failed\n", fail, NUM_TESTS);
        return(1);
    }
}

#endif /* TEST */

The function declaration should be in an appropriate header. The variable sublen helps the code compile cleanly under:

gcc -O3 -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
        -Wold-style-definition -Werror -DTEST substr.c -o substr 

Test output:

== PASS ==  0: substr(buffer, 20, "string", 0) = ""
== PASS ==  1: substr(buffer, 20, "string", 1) = "s"
== PASS ==  2: substr(buffer, 20, "string", 2) = "st"
== PASS ==  3: substr(buffer, 20, "string", 3) = "str"
== PASS ==  4: substr(buffer, 20, "string", 4) = "stri"
== PASS ==  5: substr(buffer, 20, "string", 5) = "strin"
== PASS ==  6: substr(buffer, 20, "string", 6) = "string"
== PASS ==  7: substr(buffer, 20, "string", 7) = "string"
== PASS ==  8: substr(buffer, 20, "string", -1) = "g"
== PASS ==  9: substr(buffer, 20, "string", -2) = "ng"
== PASS == 10: substr(buffer, 20, "string", -3) = "ing"
== PASS == 11: substr(buffer, 20, "string", -4) = "ring"
== PASS == 12: substr(buffer, 20, "string", -5) = "tring"
== PASS == 13: substr(buffer, 20, "string", -6) = "string"
== PASS == 14: substr(buffer, 20, "string", -7) = "string"
== PASS == 15 tests passed

In a comment to another answer, cool_sops asks:

Why wouldn't this work: memcpy(new_string, old_string, strlen(old_string) - 3; &new_string[strlen(old_string) - 3] = '\0' Assuming new_string and old_string both are char pointers and strlen(old_string) > 3?

Assuming you remove the &, insert the missing ) and ;, the pointers point at valid non-overlapping locations, and the length condition is satisfied, then that should be OK for copying all but the last 3 characters from the old string into the new string, as you could test by embedding it into some test code. It doesn't attempt to deal with copying the last three characters of the old string which is what the question primarily seemed to ask about.

#include <string.h>
#include <stdio.h>
int main(void)
{
    char new_string[32] = "XXXXXXXXXXXXXXXX";
    char old_string[] = "string";
    memcpy(new_string, old_string, strlen(old_string) - 3);
    new_string[strlen(old_string) - 3] = '\0';
    printf("<<%s>> <<%s>>\n", old_string, new_string);
    return(0);
}

Output:

<<string>> <<str>>

However, beware of tricky coincidences: I chose a sample old string that is 6 characters long, so 'length - 3' happens to equal 3 too. To get the last N characters, you need code more like:

#include <assert.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    int  N = 3;
    char new_string[32] = "XXXXXXXXXXXXXXXX";
    char old_string[] = "dandelion";
    int  sublen = strlen(old_string) - N;

    assert(sublen >= 0);    /* a string of exactly N characters is fine too */
    memcpy(new_string, old_string + sublen, N);
    new_string[N] = '\0';
    printf("<<%s>> <<%s>>\n", old_string, new_string);
    return(0);
}

Output:

<<dandelion>> <<ion>>

Note, writing little programs like this is good practice, and can be educational. Writing lots of code is one way to get better at writing code.

The only trap to be aware of is that if you are testing 'undefined behaviour', you simply get the response from a single compiler, but other compilers may generate code that behaves differently. This code is not exercising undefined behaviour, so it's fine. Identifying undefined behaviour is tricky, so you can partially ignore this commentary, but make sure you compile with the stringent warning options on your compiler that you can stomach — they help identify undefined behaviour.

I have a supply of sample programs that I keep (under source control) in a directory called vignettes; they are little cameos of programs that illustrate a technique that I can refer to if I think I might need it again in the future. They're complete; they work; (they're more complex than these specific examples, but I've been programming in C longer than you have;) but they are toys — useful toys.

Jonathan Leffler answered Nov 02 '22 22:11


No, you have to use strlen() to find the last characters. The calls below assume the substr(start, length) form mentioned in the question; standard C does not provide such a function.

substr(strlen(str)-3,3);

Remember strings are 0-based, so the last 3 characters start at index strlen(str)-3.

So the general technique is

substr(strlen(str)-n,n);

(of course the string has to be at least n characters long)

If you want to remove the last 3, use this:

substr(0,strlen(str)-3);

Or in general

substr(0,strlen(str)-n);
Hogan answered Nov 03 '22 00:11