Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to trim a std::string?

I'm currently using the following code to right-trim all the std::strings in my programs:

std::string s;
s.erase(s.find_last_not_of(" \n\r\t")+1);

It works fine, but I wonder if there are some end-cases where it might fail?

Of course, answers with elegant alternatives and also left-trim solution are welcome.

like image 999
Milan Babuškov Avatar asked Oct 19 '08 19:10

Milan Babuškov


People also ask

How do you trim a std::string in C++?

The trimming string means removing whitespaces from left and right part of the string. To trim the C++ string, we will use the boost string library. In that library, there are two different methods called trim_left() and trim_right(). To trim string completely, we can use both of them.

How do you trim a string?

trim() The trim() method removes whitespace from both ends of a string and returns a new string, without modifying the original string. Whitespace in this context is all the whitespace characters (space, tab, no-break space, etc.) and all the line terminator characters (LF, CR, etc.).

How do I remove extra spaces from a string in C++?

To remove redundant white spaces from a given string, we form a new string by concatenating only the relevant characters. We skip the multiple white spaces using the while loop and add only the single last occurrence of the multiple white spaces.

What trim () in string does?

Trim() Removes all leading and trailing white-space characters from the current string.


25 Answers

EDIT Since c++17, some parts of the standard library were removed. Fortunately, starting with c++11, we have lambdas which are a superior solution.

#include <algorithm> 
#include <cctype>
#include <locale>

// trim from start (in place)
static inline void ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), [](unsigned char ch) {
        return !std::isspace(ch);
    }));
}

// trim from end (in place)
static inline void rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(), [](unsigned char ch) {
        return !std::isspace(ch);
    }).base(), s.end());
}

// trim from both ends (in place)
static inline void trim(std::string &s) {
    ltrim(s);
    rtrim(s);
}

// trim from start (copying)
static inline std::string ltrim_copy(std::string s) {
    ltrim(s);
    return s;
}

// trim from end (copying)
static inline std::string rtrim_copy(std::string s) {
    rtrim(s);
    return s;
}

// trim from both ends (copying)
static inline std::string trim_copy(std::string s) {
    trim(s);
    return s;
}

Thanks to https://stackoverflow.com/a/44973498/524503 for bringing up the modern solution.

Original answer:

I tend to use one of these 3 for my trimming needs:

#include <algorithm> 
#include <functional> 
#include <cctype>
#include <locale>

// trim from start
static inline std::string &ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(),
            std::not1(std::ptr_fun<int, int>(std::isspace))));
    return s;
}

// trim from end
static inline std::string &rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(),
            std::not1(std::ptr_fun<int, int>(std::isspace))).base(), s.end());
    return s;
}

// trim from both ends
static inline std::string &trim(std::string &s) {
    return ltrim(rtrim(s));
}

They are fairly self-explanatory and work very well.

EDIT: BTW, I have std::ptr_fun in there to help disambiguate std::isspace because there is actually a second definition which supports locales. This could have been a cast just the same, but I tend to like this better.

EDIT: To address some comments about accepting a parameter by reference, modifying and returning it. I Agree. An implementation that I would likely prefer would be two sets of functions, one for in place and one which makes a copy. A better set of examples would be:

#include <algorithm> 
#include <functional> 
#include <cctype>
#include <locale>

// trim from start (in place)
static inline void ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(),
            std::not1(std::ptr_fun<int, int>(std::isspace))));
}

// trim from end (in place)
static inline void rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(),
            std::not1(std::ptr_fun<int, int>(std::isspace))).base(), s.end());
}

// trim from both ends (in place)
static inline void trim(std::string &s) {
    ltrim(s);
    rtrim(s);
}

// trim from start (copying)
static inline std::string ltrim_copy(std::string s) {
    ltrim(s);
    return s;
}

// trim from end (copying)
static inline std::string rtrim_copy(std::string s) {
    rtrim(s);
    return s;
}

// trim from both ends (copying)
static inline std::string trim_copy(std::string s) {
    trim(s);
    return s;
}

I am keeping the original answer above though for context and in the interest of keeping the high voted answer still available.

like image 79
Evan Teran Avatar answered Oct 04 '22 22:10

Evan Teran


Using Boost's string algorithms would be easiest:

#include <boost/algorithm/string.hpp>

std::string str("hello world! ");
boost::trim_right(str);

str is now "hello world!". There's also trim_left and trim, which trims both sides.


If you add _copy suffix to any of above function names e.g. trim_copy, the function will return a trimmed copy of the string instead of modifying it through a reference.

If you add _if suffix to any of above function names e.g. trim_copy_if, you can trim all characters satisfying your custom predicate, as opposed to just whitespaces.

like image 39
Leon Timmermans Avatar answered Oct 04 '22 22:10

Leon Timmermans


What you are doing is fine and robust. I have used the same method for a long time and I have yet to find a faster method:

const char* ws = " \t\n\r\f\v";

// trim from end of string (right)
inline std::string& rtrim(std::string& s, const char* t = ws)
{
    s.erase(s.find_last_not_of(t) + 1);
    return s;
}

// trim from beginning of string (left)
inline std::string& ltrim(std::string& s, const char* t = ws)
{
    s.erase(0, s.find_first_not_of(t));
    return s;
}

// trim from both ends of string (right then left)
inline std::string& trim(std::string& s, const char* t = ws)
{
    return ltrim(rtrim(s, t), t);
}

By supplying the characters to be trimmed you have the flexibility to trim non-whitespace characters and the efficiency to trim only the characters you want trimmed.

like image 39
Galik Avatar answered Oct 04 '22 22:10

Galik


Use the following code to right trim (trailing) spaces and tab characters from std::strings (ideone):

// trim trailing spaces
size_t endpos = str.find_last_not_of(" \t");
size_t startpos = str.find_first_not_of(" \t");
if( std::string::npos != endpos )
{
    str = str.substr( 0, endpos+1 );
    str = str.substr( startpos );
}
else {
    str.erase(std::remove(std::begin(str), std::end(str), ' '), std::end(str));
}

And just to balance things out, I'll include the left trim code too (ideone):

// trim leading spaces
size_t startpos = str.find_first_not_of(" \t");
if( string::npos != startpos )
{
    str = str.substr( startpos );
}
like image 32
Bill the Lizard Avatar answered Oct 04 '22 22:10

Bill the Lizard


Try this, it works for me.

inline std::string trim(std::string& str)
{
    str.erase(str.find_last_not_of(' ')+1);         //suffixing spaces
    str.erase(0, str.find_first_not_of(' '));       //prefixing spaces
    return str;
}
like image 25
user818330 Avatar answered Oct 04 '22 21:10

user818330


Bit late to the party, but never mind. Now C++11 is here, we have lambdas and auto variables. So my version, which also handles all-whitespace and empty strings, is:

#include <cctype>
#include <string>
#include <algorithm>

inline std::string trim(const std::string &s)
{
   auto wsfront=std::find_if_not(s.begin(),s.end(),[](int c){return std::isspace(c);});
   auto wsback=std::find_if_not(s.rbegin(),s.rend(),[](int c){return std::isspace(c);}).base();
   return (wsback<=wsfront ? std::string() : std::string(wsfront,wsback));
}

We could make a reverse iterator from wsfront and use that as the termination condition in the second find_if_not but that's only useful in the case of an all-whitespace string, and gcc 4.8 at least isn't smart enough to infer the type of the reverse iterator (std::string::const_reverse_iterator) with auto. I don't know how expensive constructing a reverse iterator is, so YMMV here. With this alteration, the code looks like this:

inline std::string trim(const std::string &s)
{
   auto  wsfront=std::find_if_not(s.begin(),s.end(),[](int c){return std::isspace(c);});
   return std::string(wsfront,std::find_if_not(s.rbegin(),std::string::const_reverse_iterator(wsfront),[](int c){return std::isspace(c);}).base());
}
like image 45
David G Avatar answered Oct 04 '22 22:10

David G


http://ideone.com/nFVtEo

std::string trim(const std::string &s)
{
    std::string::const_iterator it = s.begin();
    while (it != s.end() && isspace(*it))
        it++;

    std::string::const_reverse_iterator rit = s.rbegin();
    while (rit.base() != it && isspace(*rit))
        rit++;

    return std::string(it, rit.base());
}
like image 21
Pushkoff Avatar answered Oct 04 '22 22:10

Pushkoff


I like tzaman's solution, the only problem with it is that it doesn't trim a string containing only spaces.

To correct that 1 flaw, add a str.clear() in between the 2 trimmer lines

std::stringstream trimmer;
trimmer << str;
str.clear();
trimmer >> str;
like image 21
Michaël Schoonbrood Avatar answered Oct 04 '22 22:10

Michaël Schoonbrood


With C++17 you can use basic_string_view::remove_prefix and basic_string_view::remove_suffix:

std::string_view trim(std::string_view s)
{
    s.remove_prefix(std::min(s.find_first_not_of(" \t\r\v\n"), s.size()));
    s.remove_suffix(std::min(s.size() - s.find_last_not_of(" \t\r\v\n") - 1, s.size()));

    return s;
}

A nice alternative:

std::string_view ltrim(std::string_view s)
{
    s.remove_prefix(std::distance(s.cbegin(), std::find_if(s.cbegin(), s.cend(),
         [](int c) {return !std::isspace(c);})));

    return s;
}

std::string_view rtrim(std::string_view s)
{
    s.remove_suffix(std::distance(s.crbegin(), std::find_if(s.crbegin(), s.crend(),
        [](int c) {return !std::isspace(c);})));

    return s;
}

std::string_view trim(std::string_view s)
{
    return ltrim(rtrim(s));
}
like image 34
Phidelux Avatar answered Oct 04 '22 21:10

Phidelux


In the case of an empty string, your code assumes that adding 1 to string::npos gives 0. string::npos is of type string::size_type, which is unsigned. Thus, you are relying on the overflow behaviour of addition.

like image 26
Greg Hewgill Avatar answered Oct 04 '22 21:10

Greg Hewgill


Hacked off of Cplusplus.com

std::string choppa(const std::string &t, const std::string &ws)
{
    std::string str = t;
    size_t found;
    found = str.find_last_not_of(ws);
    if (found != std::string::npos)
        str.erase(found+1);
    else
        str.clear();            // str is all whitespace

    return str;
}

This works for the null case as well. :-)

like image 26
Paul Nathan Avatar answered Oct 04 '22 20:10

Paul Nathan


s.erase(0, s.find_first_not_of(" \n\r\t"));                                                                                               
s.erase(s.find_last_not_of(" \n\r\t")+1);   
like image 37
freeboy1015 Avatar answered Oct 04 '22 22:10

freeboy1015


My solution based on the answer by @Bill the Lizard.

Note that these functions will return the empty string if the input string contains nothing but whitespace.

const std::string StringUtils::WHITESPACE = " \n\r\t";

std::string StringUtils::Trim(const std::string& s)
{
    return TrimRight(TrimLeft(s));
}

std::string StringUtils::TrimLeft(const std::string& s)
{
    size_t startpos = s.find_first_not_of(StringUtils::WHITESPACE);
    return (startpos == std::string::npos) ? "" : s.substr(startpos);
}

std::string StringUtils::TrimRight(const std::string& s)
{
    size_t endpos = s.find_last_not_of(StringUtils::WHITESPACE);
    return (endpos == std::string::npos) ? "" : s.substr(0, endpos+1);
}
like image 21
DavidRR Avatar answered Oct 04 '22 20:10

DavidRR


With C++11 also came a regular expression module, which of course can be used to trim leading or trailing spaces.

Maybe something like this:

std::string ltrim(const std::string& s)
{
    static const std::regex lws{"^[[:space:]]*", std::regex_constants::extended};
    return std::regex_replace(s, lws, "");
}

std::string rtrim(const std::string& s)
{
    static const std::regex tws{"[[:space:]]*$", std::regex_constants::extended};
    return std::regex_replace(s, tws, "");
}

std::string trim(const std::string& s)
{
    return ltrim(rtrim(s));
}
like image 24
Some programmer dude Avatar answered Oct 04 '22 20:10

Some programmer dude


My answer is an improvement upon the top answer for this post that trims control characters as well as spaces (0-32 and 127 on the ASCII table).

std::isgraph determines if a character has a graphical representation, so you can use this to alter Evan's answer to remove any character that doesn't have a graphical representation from either side of a string. The result is a much more elegant solution:

#include <algorithm>
#include <functional>
#include <string>

/**
 * @brief Left Trim
 *
 * Trims whitespace from the left end of the provided std::string
 *
 * @param[out] s The std::string to trim
 *
 * @return The modified std::string&
 */
std::string& ltrim(std::string& s) {
  s.erase(s.begin(), std::find_if(s.begin(), s.end(),
    std::ptr_fun<int, int>(std::isgraph)));
  return s;
}

/**
 * @brief Right Trim
 *
 * Trims whitespace from the right end of the provided std::string
 *
 * @param[out] s The std::string to trim
 *
 * @return The modified std::string&
 */
std::string& rtrim(std::string& s) {
  s.erase(std::find_if(s.rbegin(), s.rend(),
    std::ptr_fun<int, int>(std::isgraph)).base(), s.end());
  return s;
}

/**
 * @brief Trim
 *
 * Trims whitespace from both ends of the provided std::string
 *
 * @param[out] s The std::string to trim
 *
 * @return The modified std::string&
 */
std::string& trim(std::string& s) {
  return ltrim(rtrim(s));
}

Note: Alternatively you should be able to use std::iswgraph if you need support for wide characters, but you will also have to edit this code to enable std::wstring manipulation, which is something that I haven't tested (see the reference page for std::basic_string to explore this option).

like image 40
Clay Freeman Avatar answered Oct 04 '22 21:10

Clay Freeman


This is what I use. Just keep removing space from the front, and then, if there's anything left, do the same from the back.

void trim(string& s) {
    while(s.compare(0,1," ")==0)
        s.erase(s.begin()); // remove leading whitespaces
    while(s.size()>0 && s.compare(s.size()-1,1," ")==0)
        s.erase(s.end()-1); // remove trailing whitespaces
}
like image 32
synaptik Avatar answered Oct 04 '22 20:10

synaptik


An elegant way of doing it can be like

std::string & trim(std::string & str)
{
   return ltrim(rtrim(str));
}

And the supportive functions are implemented as:

std::string & ltrim(std::string & str)
{
  auto it =  std::find_if( str.begin() , str.end() , [](char ch){ return !std::isspace<char>(ch , std::locale::classic() ) ; } );
  str.erase( str.begin() , it);
  return str;   
}

std::string & rtrim(std::string & str)
{
  auto it =  std::find_if( str.rbegin() , str.rend() , [](char ch){ return !std::isspace<char>(ch , std::locale::classic() ) ; } );
  str.erase( it.base() , str.end() );
  return str;   
}

And once you've all these in place, you can write this as well:

std::string trim_copy(std::string const & str)
{
   auto s = str;
   return ltrim(rtrim(s));
}
like image 44
jha-G Avatar answered Oct 04 '22 21:10

jha-G


I guess if you start asking for the "best way" to trim a string, I'd say a good implementation would be one that:

  1. Doesn't allocate temporary strings
  2. Has overloads for in-place trim and copy trim
  3. Can be easily customized to accept different validation sequences / logic

Obviously there are too many different ways to approach this and it definitely depends on what you actually need. However, the C standard library still has some very useful functions in <string.h>, like memchr. There's a reason why C is still regarded as the best language for IO - its stdlib is pure efficiency.

inline const char* trim_start(const char* str)
{
    while (memchr(" \t\n\r", *str, 4))  ++str;
    return str;
}
inline const char* trim_end(const char* end)
{
    while (memchr(" \t\n\r", end[-1], 4)) --end;
    return end;
}
inline std::string trim(const char* buffer, int len) // trim a buffer (input?)
{
    return std::string(trim_start(buffer), trim_end(buffer + len));
}
inline void trim_inplace(std::string& str)
{
    str.assign(trim_start(str.c_str()),
        trim_end(str.c_str() + str.length()));
}

int main()
{
    char str [] = "\t \nhello\r \t \n";

    string trimmed = trim(str, strlen(str));
    cout << "'" << trimmed << "'" << endl;

    system("pause");
    return 0;
}
like image 41
Jorma Rebane Avatar answered Oct 04 '22 20:10

Jorma Rebane


For what it's worth, here is a trim implementation with an eye towards performance. It's much quicker than many other trim routines I've seen around. Instead of using iterators and std::finds, it uses raw c strings and indices. It optimizes the following special cases: size 0 string (do nothing), string with no whitespace to trim (do nothing), string with only trailing whitespace to trim (just resize the string), string that's entirely whitespace (just clear the string). And finally, in the worst case (string with leading whitespace), it does its best to perform an efficient copy construction, performing only 1 copy and then moving that copy in place of the original string.

void TrimString(std::string & str)
{ 
    if(str.empty())
        return;

    const auto pStr = str.c_str();

    size_t front = 0;
    while(front < str.length() && std::isspace(int(pStr[front]))) {++front;}

    size_t back = str.length();
    while(back > front && std::isspace(int(pStr[back-1]))) {--back;}

    if(0 == front)
    {
        if(back < str.length())
        {
            str.resize(back - front);
        }
    }
    else if(back <= front)
    {
        str.clear();
    }
    else
    {
        str = std::move(std::string(str.begin()+front, str.begin()+back));
    }
}
like image 36
mbgda Avatar answered Oct 04 '22 20:10

mbgda


Here is a solution for trim with regex

#include <string>
#include <regex>

string trim(string str){
    return regex_replace(str, regex("(^[ ]+)|([ ]+$)"),"");
}
like image 36
Sadidul Islam Avatar answered Oct 04 '22 20:10

Sadidul Islam


Trim C++11 implementation:

static void trim(std::string &s) {
     s.erase(s.begin(), std::find_if_not(s.begin(), s.end(), [](char c){ return std::isspace(c); }));
     s.erase(std::find_if_not(s.rbegin(), s.rend(), [](char c){ return std::isspace(c); }).base(), s.end());
}
like image 22
GutiMac Avatar answered Oct 04 '22 22:10

GutiMac


str.erase(0, str.find_first_not_of("\t\n\v\f\r ")); // left trim
str.erase(str.find_last_not_of("\t\n\v\f\r ") + 1); // right trim

Try it online!

like image 43
Arty Avatar answered Oct 04 '22 20:10

Arty


Contributing my solution to the noise. trim defaults to creating a new string and returning the modified one while trim_in_place modifies the string passed to it. The trim function supports c++11 move semantics.

#include <string>

// modifies input string, returns input

std::string& trim_left_in_place(std::string& str) {
    size_t i = 0;
    while(i < str.size() && isspace(str[i])) { ++i; };
    return str.erase(0, i);
}

std::string& trim_right_in_place(std::string& str) {
    size_t i = str.size();
    while(i > 0 && isspace(str[i - 1])) { --i; };
    return str.erase(i, str.size());
}

std::string& trim_in_place(std::string& str) {
    return trim_left_in_place(trim_right_in_place(str));
}

// returns newly created strings

std::string trim_right(std::string str) {
    return trim_right_in_place(str);
}

std::string trim_left(std::string str) {
    return trim_left_in_place(str);
}

std::string trim(std::string str) {
    return trim_left_in_place(trim_right_in_place(str));
}

#include <cassert>

int main() {

    std::string s1(" \t\r\n  ");
    std::string s2("  \r\nc");
    std::string s3("c \t");
    std::string s4("  \rc ");

    assert(trim(s1) == "");
    assert(trim(s2) == "c");
    assert(trim(s3) == "c");
    assert(trim(s4) == "c");

    assert(s1 == " \t\r\n  ");
    assert(s2 == "  \r\nc");
    assert(s3 == "c \t");
    assert(s4 == "  \rc ");

    assert(trim_in_place(s1) == "");
    assert(trim_in_place(s2) == "c");
    assert(trim_in_place(s3) == "c");
    assert(trim_in_place(s4) == "c");

    assert(s1 == "");
    assert(s2 == "c");
    assert(s3 == "c");
    assert(s4 == "c");  
}
like image 5
vmrob Avatar answered Oct 04 '22 21:10

vmrob


This can be done more simply in C++11 due to the addition of back() and pop_back().

while ( !s.empty() && isspace(s.back()) ) s.pop_back();
like image 5
Brent Bradburn Avatar answered Oct 04 '22 21:10

Brent Bradburn


I'm not sure if your environment is the same, but in mine, the empty string case will cause the program to abort. I would either wrap that erase call with an if(!s.empty()) or use Boost as already mentioned.

like image 3
Steve Avatar answered Oct 04 '22 21:10

Steve