I know I can use substr() to get the first n characters of a string. However, I want to remove the last few characters. Is it valid to use -2 or -3 as the end position in C, the way I can in Python?
You can simply place a null termination character right where you want the string to end like so:
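#include <stdio.h>
#include <string.h>

int main(void)
{
    char s[] = "I am a string";
    size_t len = strlen(s);
    s[len - 3] = '\0';   /* end the string three characters early */
    printf("%s\n", s);
    return 0;
}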
This would give you:
"I am a str"
C is not like Python; string indices are not "smart". Saying str[-3] quite literally means "the character three bytes before the start"; accessing this memory is undefined behaviour.
If you want to get the last few characters of a string as another string, it suffices to get a pointer to the first character you want:
char *endstr = str + (strlen(str) - 3); // get last 3 characters of the string
If you want to delete the last few characters, it suffices to set the kth-from-the-end character to a null (\0):
str[strlen(str)-3] = 0; // delete last three characters
Here's a possible implementation of a substr() function, including test code. Note that the test code does not push the boundaries: buffer length shorter than requested string or buffer length of zero.
#include <string.h>
extern void substr(char *buffer, size_t buflen, char const *source, int len);
/*
** Given substr(buffer, sizeof(buffer), "string", len), then the output
** in buffer for different values of len is:
** For positive values of len:
** 0 ""
** 1 "s"
** 2 "st"
** ...
** 6 "string"
** 7 "string"
** ...
** For negative values of len:
** -1 "g"
** -2 "ng"
** ...
** -6 "string"
** -7 "string"
** ...
** Subject to buffer being long enough.
** If buffer is too short, the empty string is set (unless buflen is 0,
** in which case, everything is left untouched).
*/
void substr(char *buffer, size_t buflen, char const *source, int len)
{
size_t srclen = strlen(source);
size_t nbytes = 0;
size_t offset = 0;
size_t sublen;
if (buflen == 0) /* Can't write anything anywhere */
return;
if (len > 0)
{
sublen = len;
nbytes = (sublen > srclen) ? srclen : sublen;
offset = 0;
}
else if (len < 0)
{
sublen = -len;
nbytes = (sublen > srclen) ? srclen : sublen;
offset = srclen - nbytes;
}
if (nbytes >= buflen)
nbytes = 0;
if (nbytes > 0)
memmove(buffer, source + offset, nbytes);
buffer[nbytes] = '\0';
}
#ifdef TEST
#include <stdio.h>
struct test_case
{
const char *source;
int length;
const char *result;
};
static struct test_case tests[] =
{
{ "string", 0, "" },
{ "string", +1, "s" },
{ "string", +2, "st" },
{ "string", +3, "str" },
{ "string", +4, "stri" },
{ "string", +5, "strin" },
{ "string", +6, "string" },
{ "string", +7, "string" },
{ "string", -1, "g" },
{ "string", -2, "ng" },
{ "string", -3, "ing" },
{ "string", -4, "ring" },
{ "string", -5, "tring" },
{ "string", -6, "string" },
{ "string", -7, "string" },
};
enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };
int main(void)
{
int pass = 0;
int fail = 0;
for (int i = 0; i < NUM_TESTS; i++)
{
char buffer[20];
substr(buffer, sizeof(buffer), tests[i].source, tests[i].length);
if (strcmp(buffer, tests[i].result) == 0)
{
printf("== PASS == %2d: substr(buffer, %zu, \"%s\", %d) = \"%s\"\n",
i, sizeof(buffer), tests[i].source, tests[i].length, buffer);
pass++;
}
else
{
printf("!! FAIL !! %2d: substr(buffer, %zu, \"%s\", %d) wanted \"%s\" actual \"%s\"\n",
i, sizeof(buffer), tests[i].source, tests[i].length, tests[i].result, buffer);
fail++;
}
}
if (fail == 0)
{
printf("== PASS == %d tests passed\n", NUM_TESTS);
return(0);
}
else
{
printf("!! FAIL !! %d tests out of %d failed\n", fail, NUM_TESTS);
return(1);
}
}
#endif /* TEST */
The function declaration should be in an appropriate header. The variable sublen helps the code compile cleanly under:
gcc -O3 -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
-Wold-style-definition -Werror -DTEST substr.c -o substr
Test output:
== PASS == 0: substr(buffer, 20, "string", 0) = ""
== PASS == 1: substr(buffer, 20, "string", 1) = "s"
== PASS == 2: substr(buffer, 20, "string", 2) = "st"
== PASS == 3: substr(buffer, 20, "string", 3) = "str"
== PASS == 4: substr(buffer, 20, "string", 4) = "stri"
== PASS == 5: substr(buffer, 20, "string", 5) = "strin"
== PASS == 6: substr(buffer, 20, "string", 6) = "string"
== PASS == 7: substr(buffer, 20, "string", 7) = "string"
== PASS == 8: substr(buffer, 20, "string", -1) = "g"
== PASS == 9: substr(buffer, 20, "string", -2) = "ng"
== PASS == 10: substr(buffer, 20, "string", -3) = "ing"
== PASS == 11: substr(buffer, 20, "string", -4) = "ring"
== PASS == 12: substr(buffer, 20, "string", -5) = "tring"
== PASS == 13: substr(buffer, 20, "string", -6) = "string"
== PASS == 14: substr(buffer, 20, "string", -7) = "string"
== PASS == 15 tests passed
In a comment to another answer, cool_sops asks:
Why wouldn't this work:
memcpy(new_string, old_string, strlen(old_string) - 3; &new_string[strlen(old_string) - 3] = '\0'
Assuming new_string and old_string are both char pointers and strlen(old_string) > 3?
Assuming you remove the &, insert the missing ) and ;, the pointers point at valid non-overlapping locations, and the length condition is satisfied, then that should be OK for copying all but the last 3 characters from the old string into the new string, as you could test by embedding it into some test code. It doesn't attempt to deal with copying the last three characters of the old string, which is what the question primarily seemed to ask about.
#include <string.h>
#include <stdio.h>
int main(void)
{
char new_string[32] = "XXXXXXXXXXXXXXXX";
char old_string[] = "string";
memcpy(new_string, old_string, strlen(old_string) - 3);
new_string[strlen(old_string) - 3] = '\0';
printf("<<%s>> <<%s>>\n", old_string, new_string);
return(0);
}
Output:
<<string>> <<str>>
However, beware of tricky coincidences; I chose a sample old string that is 6 characters long, so 'length - 3' is also equal to 3. To get the last N characters, you need code more like:
#include <assert.h>
#include <string.h>
#include <stdio.h>
int main(void)
{
int N = 3;
char new_string[32] = "XXXXXXXXXXXXXXXX";
char old_string[] = "dandelion";
int sublen = strlen(old_string) - N;
assert(sublen > 0);
memcpy(new_string, old_string + sublen, N);
new_string[N] = '\0';
printf("<<%s>> <<%s>>\n", old_string, new_string);
return(0);
}
Output:
<<dandelion>> <<ion>>
Note, writing little programs like this is good practice, and can be educational. Writing lots of code is one way to get better at writing code.
The only trap to be aware of is that if you are testing 'undefined behaviour', you simply get the response from a single compiler, but other compilers may generate code that behaves differently. This code is not exercising undefined behaviour, so it's fine. Identifying undefined behaviour is tricky, so you can partially ignore this commentary, but make sure you compile with the most stringent warning options on your compiler that you can stomach; they help identify undefined behaviour.
I have a supply of sample programs that I keep (under source control) in a directory called vignettes; they are little cameos of programs that illustrate a technique that I can refer to if I think I might need it again in the future. They're complete; they work; (they're more complex than these specific examples, but I've been programming in C longer than you have;) but they are toys, albeit useful toys.
No, you have to use strlen() like this to get the last characters (assuming substr() takes a 0-based start index and a length):
substr(strlen(str)-3,3);
Remember that strings are 0-based, so the last 3 characters start at index strlen(str)-3.
So the general technique is
substr(strlen(str)-n,n);
(of course, the string has to be at least n characters long)
If you want to remove the last 3, use this:
substr(0,strlen(str)-3);
Or in general
substr(0,strlen(str)-n);