I wanted to ask about a specific point made in Effective C++.
It says:
A destructor should be made virtual if a class needs to act like a polymorphic class. It further adds that since
std::string
does not have a virtual destructor, one should never derive from it. Alsostd::string
is not even designed to be a base class, forget polymorphic base class.
I do not understand what specifically is required in a class to be eligible for being a base class (not a polymorphic one)?
Is the only reason that I should not derive from std::string
class is it does not have a virtual destructor? For reusability purpose a base class can be defined and multiple derived class can inherit from it. So what makes std::string
not even eligible as a base class?
Also, if there is a base class purely defined for reusability purpose and there are many derived types, is there any way to prevent client from doing Base* p = new Derived()
because the classes are not meant to be used polymorphically?
Following are the properties which a derived class doesn't inherit from its parent class : 1) The base class's constructors and destructor. 2) The base class's friend functions. 3) Overloaded operators of the base class.
The std::string class manages the underlying storage for you, storing your strings in a contiguous manner. You can get access to this underlying buffer using the c_str() member function, which will return a pointer to null-terminated char array. This allows std::string to interoperate with C-string APIs.
std::string class in C++ C++ has in its definition a way to represent a sequence of characters as an object of the class. This class is called std:: string. String class stores the characters as a sequence of bytes with the functionality of allowing access to the single-byte character.
The derived class inherits all members and member functions of a base class. The derived class can have more functionality with respect to the Base class and can easily access the Base class. A Derived class is also called a child class or subclass.
To offer the counter side to the general advice (which is sound when there are no particular verbosity/productivity issues evident)...
There is at least one scenario where public derivation from bases without virtual destructors can be a good decision:
This may sound quite restrictive, but there are plenty of cases in real world programs matching this scenario.
Programming is about compromises. Before you write a more conceptually "correct" program:
If the potential problems involve usage of the objects that you just can't imagine anyone attempting given your insights into their accessibility, scope and nature of usage in the program, or you can generate compile-time errors for dangerous use (e.g. an assertion that derived class size matches the base's, which would prevent adding new data members), then anything else may be premature over-engineering. Take the easy win in clean, intuitive, concise design and code.
Say you have a class D publicly derived from B. With no effort, the operations on B are possible on D (with the exception of construction, but even if there are a lot of constructors you can often provide effective forwarding by having one template for each distinct number of constructor arguments: e.g. template <typename T1, typename T2> D(const T1& x1, const T2& t2) : B(t1, t2) { }
. Better generalised solution in C++0x variadic templates.)
Further, if B changes then by default D exposes those changes - staying in sync - but someone may need to review extended functionality introduced in D to see if it remains valid, and the client usage.
Rephrasing this: there is reduced explicit coupling between base and derived class, but increased coupling between base and client.
This is often NOT what you want, but sometimes it is ideal, and other times a non issue (see next paragraph). Changes to the base force more client code changes in places distributed throughout the code base, and sometimes the people changing the base may not even have access to the client code to review or update it correspondingly. Sometimes it is better though: if you as the derived class provider - the "man in the middle" - want base class changes to feed through to clients, and you generally want clients to be able - sometimes forced - to update their code when the base class changes without you needing to be constantly involved, then public derivation may be ideal. This is common when your class is not so much an independent entity in its own right, but a thin value-add to the base.
Other times the base class interface is so stable that the coupling may be deemed a non issue. This is especially true of classes like Standard containers.
Summarily, public derivation is a quick way to get or approximate the ideal, familiar base class interface for the derived class - in a way that's concise and self-evidently correct to both the maintainer and client coder - with additional functionality available as member functions (which IMHO - which obviously differs with Sutter, Alexandrescu etc - can aid usability, readability and assist productivity-enhancing tools including IDEs)
Item 35 of C++ Coding Standards lists issues with the scenario of deriving from std::string
. As scenarios go, it's good that it illustrates the burden of exposing a large but useful API, but both good and bad as the base API is remarkably stable - being part of the Standard Library. A stable base is a common situation, but no more common than a volatile one and a good analysis should relate to both cases. While considering the book's list of issues, I'll specifically contrast the issues' applicability to the cases of say:
a) class Issue_Id : public std::string { ...handy stuff... };
<-- public derivation, our controversial usage
b) class Issue_Id : public string_with_virtual_destructor { ...handy stuff... };
<- safer OO derivation
c) class Issue_Id { public: ...handy stuff... private: std::string id_; };
<-- a compositional approach
d) using std::string
everywhere, with freestanding support functions
(Hopefully we can agree the composition is acceptable practice, as it provides encapsulation, type safety as well as a potentially enriched API over and above that of std::string
.)
So, say you're writing some new code and start thinking about the conceptual entities in an OO sense. Maybe in a bug tracking system (I'm thinking of JIRA), one of them is say an Issue_Id. Data content is textual - consisting of an alphabetic project id, a hyphen, and an incrementing issue number: e.g. "MYAPP-1234". Issue ids can be stored in a std::string
, and there will be lots of fiddly little text searches and manipulation operations needed on issue ids - a large subset of those already provided on std::string
and a few more for good measure (e.g. getting the project id component, providing the next possible issue id (MYAPP-1235)).
On to Sutter and Alexandrescu's list of issues...
Nonmember functions work well within existing code that already manipulates
string
s. If instead you supply asuper_string
, you force changes through your code base to change types and function signatures tosuper_string
.
The fundamental mistake with this claim (and most of the ones below) is that it promotes the convenience of using only a few types, ignoring the benefits of type safety. It's expressing a preference for d) above, rather than insight into c) or b) as alternatives to a). The art of programming involves balancing the pros and cons of distinct types to achieve reasonable reuse, performance, convenience and safety. The paragraphs below elaborate on this.
Using public derivation, the existing code can implicitly access the base class string
as a string
, and continue to behave as it always has. There's no specific reason to think that the existing code would want to use any additional functionality from super_string
(in our case Issue_Id)... in fact it's often lower-level support code pre-existing the application for which you're creating the super_string
, and therefore oblivious to the needs provided for by the extended functions. For example, say there's a non-member function to_upper(std::string&, std::string::size_type from, std::string::size_type to)
- it could still be applied to an Issue_Id
.
So, unless the non-member support function is being cleaned up or extended at the deliberate cost of tightly coupling it to the new code, then it needn't be touched. If it is being overhauled to support issue ids (for example, using the insight into the data content format to upper-case only leading alpha characters), then it's probably a good thing to ensure it really is being passed an Issue_Id
by creating an overload ala to_upper(Issue_Id&)
and sticking to either the derivation or compositional approaches allowing type safety. Whether super_string
or composition is used makes no difference to effort or maintainability. A to_upper_leading_alpha_only(std::string&)
reusable free-standing support function isn't likely to be of much use - I can't recall the last time I wanted such a function.
The impulse to use std::string
everywhere isn't qualitatively different to accepting all your arguments as containers of variants or void*
s so you don't have to change your interfaces to accept arbitrary data, but it makes for error prone implementation and less self-documenting and compiler-verifiable code.
Interface functions that take a string now need to: a) stay away from
super_string
's added functionality (unuseful); b) copy their argument to a super_string (wasteful); or c) cast the string reference to a super_string reference (awkward and potentially illegal).
This seems to be revisiting the first point - old code that needs to be refactored to use the new functionality, albeit this time client code rather than support code. If the function wants to start treating its argument as an entity for which the new operations are relevant, then it should start taking its arguments as that type and the clients should generate them and accept them using that type. The exact same issues exists for composition. Otherwise, c)
can be practical and safe if the guidelines I list below are followed, though it is ugly.
super_string's member functions don't have any more access to string's internals than nonmember functions because string probably doesn't have protected members (remember, it wasn't meant to be derived from in the first place)
True, but sometimes that's a good thing. A lot of base classes have no protected data. The public string
interface is all that's needed to manipulate the contents, and useful functionality (e.g. get_project_id()
postulated above) can be elegantly expressed in terms of those operations. Conceptually, many times I've derived from Standard containers, I've wanted not to extend or customise their functionality along the existing lines - they're already "perfect" containers - rather I've wanted to add another dimension of behaviour that's specific to my application, and requires no private access. It's because they're already good containers that they're good to reuse.
If
super_string
hides some ofstring
's functions (and redefining a nonvirtual function in a derived class is not overriding, it's just hiding), that could cause widespread confusion in code that manipulatesstring
s that started their life converted automatically fromsuper_string
s.
True for composition too - and more likely to happen as the code doesn't default to passing things through and hence staying in sync, and also true in some situations with run-time polymorphic hierarchies as well. Samed named functions that behave differently in classes that initial appear interchangeable - just nasty. This is effectively the usual caution for correct OO programming, and again not a sufficient reason to abandon the benefits in type safety etc..
What if
super_string
wants to inherit fromstring
to add more state [explanation of slicing]
Agreed - not a good situation, and somewhere I personally tend to draw the line as it often moves the problems of deletion through a pointer to base from the realm of theory to the very practical - destructors aren't invoked for additional members. Still, slicing can often do what's wanted - given the approach of deriving super_string
not to change its inherited functionality, but to add another "dimension" of application-specific functionality....
Admittedly, it's tedious to have to write passthrough functions for the member functions you want to keep, but such an implementation is vastly better and safer than using public or nonpublic inheritance.
Well, certainly agree about the tedium....
std::string
you can't intercept a few functions and expect your objects to remain uppercase: any code that accesses them via a std::string&
or ...*
can use std::string
's original function implementations to change the value)Such derivation is not without issues so don't consider it unless the end result justifies the means. That said, I flatly reject any claim that this can't be used safely and appropriately in particular cases - it's just a matter of where to draw the line.
I do sometimes derive from std::map<>
, std::vector<>
, std::string
etc - I've never been burnt by the slicing or delete-via-base-class-pointer issues, and I've saved a lot of time and energy for more important things. I don't store such objects in heterogeneous polymorphic containers. But, you need to consider whether all the programmers using the object are aware of the issues and likely to program accordingly. I personally like to write my code to use heap and run-time polymorphism only when needed, while some people (due to Java backgrounds, their prefered approach to managing recompilation dependencies or switching between runtime behaviours, testing facilities etc.) use them habitually and therefore need to be more concerned about safe operations via base class pointers.
I think this statement reflects the confusion here (emphasis mine):
I do not understand what specifically is required in a class to be eligible for being a base clas (not a polymorphic one)?
In idiomatic C++, there are two uses for deriving from a class:
boost::iterator_facade
-- which show up when the CRTP is in use.There is absolutely no reason to publicly derive a class in C++ if you're not trying to do something polymorphic. The language comes with free functions as a standard feature of the language, and free functions are what you should be using here.
Think of it this way -- do you really want to force clients of your code to convert to using some proprietary string class simply because you want to tack on a few methods? Because unlike in Java or C# (or most similar object oriented languages), when you derive a class in C++ most users of the base class need to know about that kind of a change. In Java/C#, classes are usually accessed through references, which are similar to C++'s pointers. Therefore, there's a level of indirection involved which decouples the clients of your class, allowing you to substitute a derived class without other clients knowing.
However, in C++, classes are value types -- unlike in most other OO languages. The easiest way to see this is what's known as the slicing problem. Basically, consider:
int StringToNumber(std::string copyMeByValue) { std::istringstream converter(copyMeByValue); int result; if (converter >> result) { return result; } throw std::logic_error("That is not a number."); }
If you pass your own string to this method, the copy constructor for std::string
will be called to make a copy, not the copy constructor for your derived object -- no matter what child class of std::string
is passed. This can lead to inconsistency between your methods and anything attached to the string. The function StringToNumber
cannot simply take whatever your derived object is and copy that, simply because your derived object probably has a different size than a std::string
-- but this function was compiled to reserve only the space for a std::string
in automatic storage. In Java and C# this is not a problem because the only thing like automatic storage involved are reference types, and the references are always the same size. Not so in C++.
Long story short -- don't use inheritance to tack on methods in C++. That's not idiomatic and results in problems with the language. Use non-friend, non-member functions where possible, followed by composition. Don't use inheritance unless you're template metaprogramming or want polymorphic behavior. For more information, see Scott Meyers' Effective C++ Item 23: Prefer non-member non-friend functions to member functions.
EDIT: Here's a more complete example showing the slicing problem. You can see it's output on codepad.org
#include <ostream> #include <iomanip> struct Base { int aMemberForASize; Base() { std::cout << "Constructing a base." << std::endl; } Base(const Base&) { std::cout << "Copying a base." << std::endl; } ~Base() { std::cout << "Destroying a base." << std::endl; } }; struct Derived : public Base { int aMemberThatMakesMeBiggerThanBase; Derived() { std::cout << "Constructing a derived." << std::endl; } Derived(const Derived&) : Base() { std::cout << "Copying a derived." << std::endl; } ~Derived() { std::cout << "Destroying a derived." << std::endl; } }; int SomeThirdPartyMethod(Base /* SomeBase */) { return 42; } int main() { Derived derivedObject; { //Scope to show the copy behavior of copying a derived. Derived aCopy(derivedObject); } SomeThirdPartyMethod(derivedObject); }
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With