GotW-ish Solution: The ‘clonable’ pattern

This is the solution to GotW-ish: The ‘clonable’ pattern.

In summary, a distinguished ISO C++ committee expert emailed me to ask:

[To avoid slicing], for each derived class, [I could] write something like

class D: public B {
public:
   shared_ptr<B> clone() const {
       return make_shared<D>(*this);   // not make_shared<B>
   }
   // ...
};

and then I can write

shared_ptr<B> b1 = /* as before */;
shared_ptr<B> b2 = b1->clone();

and b2 will now point to a shared_ptr<B> that is bound to an object with the same dynamic type as *b1.

However, this technique requires me to insert a member function into every class derived from B, with ugly bugs resulting from failure to do so.

So my question is whether there is some way of accomplishing this automatically that I’ve missed?

Let’s take a look.
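To make the failure mode in the question concrete, here is a minimal sketch of the hand-written pattern in which one derived class omits its clone() override. The class `Forgetful` and the helpers `clone_preserves_type` and `run_example` are hypothetical names used only for illustration; they are not part of the original question.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

class B {
public:
    virtual ~B() = default;
    virtual std::shared_ptr<B> clone() const {
        return std::make_shared<B>(*this);
    }
};

class D : public B {
public:
    std::shared_ptr<B> clone() const override {
        return std::make_shared<D>(*this);   // not make_shared<B>
    }
};

// Hypothetical class whose author forgot to write clone().
class Forgetful : public D {
};

// Returns true if cloning `original` yields an object of the same dynamic type.
bool clone_preserves_type(const B& original) {
    std::shared_ptr<B> copy = original.clone();
    const B& copied = *copy;
    return typeid(copied) == typeid(original);
}

void run_example() {
    assert(clone_preserves_type(D{}));           // D wrote clone(): fine
    assert(!clone_preserves_type(Forgetful{}));  // inherits D::clone(), so the copy is only a D
}
```

Nothing in the language flags `Forgetful`: it compiles cleanly and quietly produces a sliced copy, which is the kind of "ugly bug" the question is worried about.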

JG Question

  1. Describe as many approaches as you can think of that could let us semi- or fully-automate this pattern, over just writing it by hand every time as recommended in C++ Core Guidelines #C.130. What are each approach’s advantages and drawbacks?

There are two basic approaches in today’s C++: the Curiously Recurring Template Pattern (a.k.a. "CRTP"), and macros (a.k.a. "ick").

But first let’s consider a class of alternatives that is similar, even though it doesn’t answer this specific question or achieve the basic goal.

Nonintrusive solutions

There are nonintrusive solutions such as using type erasure, which don’t require the class to actually have a clone function. One example currently in the standardization proposal pipeline is P0201: A polymorphic value-type for C++. P0201 leads to code like this:

// The class hierarchy is unaffected

class B {
};

class C : public B {
};

class D : public C {
};

// Wrappers enable writing code that's similar to the question...

polymorphic_value<B> b1(D{});           // similar to the target use case (braces avoid declaring a function)
polymorphic_value<B> b2 = b1;
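For readers wondering how a wrapper can copy the right derived type without any clone function, here is a rough sketch of the type-erasure idea. The name `cloning_ptr` is hypothetical and this is not P0201's design or implementation; it only assumes that the base class has a virtual destructor, and it remembers how to copy the concrete type at the moment the wrapper is constructed.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical wrapper, loosely in the spirit of polymorphic_value.
// Assumes Base has a virtual destructor.
template <typename Base>
class cloning_ptr {
public:
    // Capture the concrete type Derived once, at construction.
    template <typename Derived,
              typename = typename std::enable_if<std::is_base_of<Base, Derived>::value>::type>
    explicit cloning_ptr(Derived value)
        : ptr_(std::make_unique<Derived>(std::move(value))),
          copier_([](const Base& b) -> std::unique_ptr<Base> {
              // Safe: the object was created as a Derived by this constructor.
              return std::make_unique<Derived>(static_cast<const Derived&>(b));
          }) {}

    // Copying goes through the remembered copier, so the full object is copied.
    cloning_ptr(const cloning_ptr& other)
        : ptr_(other.copier_(*other.ptr_)), copier_(other.copier_) {}

    Base& operator*() const { return *ptr_; }
    Base* operator->() const { return ptr_.get(); }

private:
    std::unique_ptr<Base> ptr_;
    std::unique_ptr<Base> (*copier_)(const Base&);   // type-erased "how to copy"
};

// The hierarchy has no clone() member at all.
class B {
public:
    virtual ~B() = default;
    virtual char id() const { return 'B'; }
};

class D : public B {
public:
    char id() const override { return 'D'; }
};

void run_example() {
    cloning_ptr<B> b1(D{});
    cloning_ptr<B> b2 = b1;        // deep copy, no clone() needed
    assert(b2->id() == 'D');       // dynamic type preserved
    assert(&*b1 != &*b2);          // and it really is a separate object
}
```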

The nonintrusive approaches are interesting too, but they don’t satisfy this particular question about how to automate the intrusive clone pattern. They also generally don’t satisfy the original motivation of the question which was to prevent slicing, because with nonintrusive approaches users can still create objects of the types directly and slice them:

D d;
B b = d;                                // oops, still works

Only an intrusive solution can make the copy constructor nonpublic or suppressed as part of automating the clone pattern, and all of the intrusive solutions can be extended to do this, with varying degrees of robustness and usability.

So, how can we automate the pattern in the question?

CRTP: The Curiously Recurring Template Pattern

Since C++98, the main recommended method is to use a variation of CRTP, the Curiously Recurring Template Pattern. The idea is that we instantiate a base class with our own type, and the base class provides the boilerplate we want. CRTP leads to code like this (live example — note that all the live examples use reflection to show the code that gets generated):

// User code: using the library to write our own types (many times)

class B : public clonable_base<B> {
};

class C : public clonable<B,B,C> {
};

class D : public clonable<B,C,D> {
};

shared_ptr<B> b1 = make_shared<D>();    // target use case works
shared_ptr<B> b2 = b1->clone();

The implementation typically looks something like this:

// Library code: implementing the CRTP helpers (once)

template <typename Self>
class clonable_base {
public:
    virtual std::unique_ptr<Self> clone() const {
        return std::make_unique<Self>(static_cast<const Self&>(*this));
    }
};

template <typename Base, typename Intermediate, typename Self>
class clonable : public Intermediate {
public:
    std::unique_ptr<Base> clone() const override {
        return std::make_unique<Self>(static_cast<const Self&>(*this));
    }
};

Advantages include:

  • It’s standard C++: Works on all compilers.
  • It semi-automates the pattern.
  • It’s extensible: It can be directly extended to require nonpublic copying.

Drawbacks include:

  • It’s incomplete and repetitive: It requires cooperation from the code that uses it to supply the right types. It also violates the DRY principle (don’t repeat yourself). If we have to repeat the types, we can get them wrong, and I did make that mistake while writing the samples.
  • It makes it harder to diagnose mistakes: If the supplied types are wrong, the error messages can be subtle. For example, as I was writing the live example, sometimes I wrote the template arguments incorrectly (because cut-and-paste), and it took me longer than I’d like to admit to diagnose the bug because the error message was related to the static_cast downcast inside the clonable implementation which wasn’t the root cause.
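One way to soften the diagnostics drawback, sketched here as an assumption rather than as part of the live example, is to add static_assert checks to the helper so that the first reported error names the likely mistake. The messages, the `id()` test function, the virtual destructor in `clonable_base`, and `run_example` are all additions for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

template <typename Self>
class clonable_base {
public:
    virtual ~clonable_base() = default;   // added so deleting through a base pointer is safe
    virtual std::unique_ptr<Self> clone() const {
        return std::make_unique<Self>(static_cast<const Self&>(*this));
    }
};

template <typename Base, typename Intermediate, typename Self>
class clonable : public Intermediate {
public:
    std::unique_ptr<Base> clone() const override {
        // Hypothetical extra checks: point directly at wrong template arguments.
        static_assert(std::is_base_of<clonable<Base, Intermediate, Self>, Self>::value,
            "clonable<Base,Intermediate,Self>: Self must be the class that derives from this helper");
        static_assert(std::is_base_of<Base, Intermediate>::value,
            "clonable<Base,Intermediate,Self>: Intermediate must be Base or derive from it");
        return std::make_unique<Self>(static_cast<const Self&>(*this));
    }
};

class B : public clonable_base<B> {
public:
    virtual char id() const { return 'B'; }
};

class C : public clonable<B,B,C> {
public:
    char id() const override { return 'C'; }
};

class D : public clonable<B,C,D> {
public:
    char id() const override { return 'D'; }
};

void run_example() {
    std::shared_ptr<B> b1 = std::make_shared<D>();
    std::shared_ptr<B> b2 = b1->clone();   // unique_ptr<B> converts to shared_ptr<B>
    assert(b2->id() == 'D');
}
```

The first check also appears to cover a quieter cut-and-paste error: if D were written as `clonable<B,C,C>`, the static_cast would be a valid upcast to C, so without the assertion I would expect the code to compile and clone a D into a sliced C.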

Macros

And there are, well, macros. They lead to code like this (live example):

// User code: using the macros to write our own types (many times)

class B {
    CLONABLE_BASE(B);
};

class C : public B {
    CLONABLE(B);
};

class D : public C {
    CLONABLE(B);
};

shared_ptr<B> b1 = make_shared<D>();    // target use case works
shared_ptr<B> b2 = b1->clone();

The implementation typically looks something like this:

// Library code: implementing the macros (once)

#define CLONABLE_BASE(Base) \
    virtual std::unique_ptr<Base> clone() const { \
        return std::unique_ptr<Base>(new Base(*this)); \
    }

#define CLONABLE(Base) \
    std::unique_ptr<Base> clone() const override { \
        using Self = std::remove_cv_t<std::remove_reference_t<decltype(*this)>>; \
        return std::unique_ptr<Self>(new Self(*this));  \
    }

Advantages include:

  • It’s standard C++: Works on all compilers.
  • It semi-automates the pattern: Though less so than CRTP did.
  • It’s extensible: It can be directly extended to require nonpublic copying.
  • It’s easier than CRTP to diagnose mistakes, if you have a modern compiler: If the supplied types are wrong, the error messages are more obvious, at least with a compiler that has good diagnostics for macros.

Drawbacks include:

  • It’s brittle: Macros are outside the language and can also alter other code in the same file. We hates macroses. Sneaky little macroses. Wicked. Tricksy. False.
  • It’s incomplete and repetitive: Like CRTP, we have to supply information and repeat things, but a little less than with CRTP.

Summary so far

You can find more examples and variations of these proposed by a number of people in the original post’s comments and on the Reddit thread.

Both CRTP and macros have drawbacks. And perhaps the most fundamental is this point from the original question (emphasis added):

However, [writing clone manually] requires me to insert a member function into every class derived from B, with ugly bugs resulting from failure to do so.

Can we do better?

Guru Question

  1. Show a working Godbolt.org link that shows how class authors can write as close as possible to this code with the minimum possible additional boilerplate code:

class B {
};

class C : public B {
};

class D : public C {
};

and that still permits the class’ users to write exactly the following:

shared_ptr<B> b1 = make_shared<D>();
shared_ptr<B> b2 = b1->clone();

Reflection and metaclasses: Basic "starter" solution

Future compile-time reflection will give us an opportunity to do better. The following is based on the active reflection proposals currently in the standardization proposal pipeline, and the syntactic sugar of writing a compile-time consteval metaclass function I am proposing in P0707. Note that draft C++20 already contains part of the first round of reflection-related work to land in the standard: consteval functions that are guaranteed to run at compile time, which came from the reflection work and are designed specifically to be used to manipulate reflection information.

The idea is that we use reflection to actually look at the class and compute and generate what we need. Three common things it lets us do are to express:

  • Requirements: We can check for mistakes in the users’ code, and report them with clean and readable compile-time diagnostics.
  • Defaults: We can apply defaults, such as to make member functions public by default.
  • Generated functions: We can generate functions, such as clone.

Let’s start with a simple direct example that just answers the immediate question, and leads to code like this (live example):

// User code: using the library to write our own types (many times)

class(clonable) B {
};

class(clonable) C : public B {
};

class(clonable) D : public C {
};

shared_ptr<B> b1 = make_shared<D>();    // target use case works
shared_ptr<B> b2 = b1->clone();

The implementation is a compile-time consteval function that takes the reflection of the class and inspects it:

consteval void clonable(meta::info source) {
    using namespace meta;

    // 1. Repeat bases and members

    for (auto mem : base_spec_range(source)) -> mem;
    for (auto mem : member_range(source)) -> mem;

    // 2. Now apply the clonable-specific default/requirements/generations:

    auto clone_type = type_of(source);          // if no base has a clone() we'll use our own type
    bool base_has_clone = false;                // remember whether we found a base clone already

    // For each base class...
    for (auto mem : base_spec_range(source)) {  
        // Compute clone() return type: Traverse this base class's member
        //  functions to find any clone() and remember its return type.
        //  If more than one is found, make sure the return types agree.
        for (auto base_mem : member_fn_range(mem)) {
            if (strcmp(name_of(base_mem), "clone") == 0) {
                compiler.require(!base_has_clone || clone_type == return_type_of(base_mem),
                    "incompatible clone() types found: if more than one base class introduces "
                    "a clone() function, they must have the same return type");
                clone_type = return_type_of(base_mem);
                base_has_clone = true;
            }
        }
    }

    // Apply generated function: provide polymorphic clone() function using computed clone_type
    if (base_has_clone) {   // then inject a virtual overrider
        -> __fragment struct Z {
            typename(clone_type) clone() const override {
                return std::unique_ptr<Z>(new Z(*this));  // invoke nonpublic copy ctor
            }
        };
    }
    else {                  // else inject a new virtual function
        -> __fragment struct Z {
            virtual std::unique_ptr<Z> clone() const {
                return std::unique_ptr<Z>(new Z(*this));  // invoke nonpublic copy ctor
            }
        };
    }
};

Advantages include:

  • It fully automates the pattern.
  • It’s extensible: It can be directly extended to require nonpublic copying. (See next section.)
  • It’s complete and nonrepetitive: The code that uses clonable only has to say that one word. It doesn’t have to supply the right types or repeat names; reflection lets the metafunction discover and compute exactly what it needs, accurately every time.
  • It’s easy to diagnose mistakes: We can’t make the mistakes we made with CRTP and macros, plus we get as many additional new high-quality diagnostics we might want. In this example, we already get a clear compile-time error if we create a class hierarchy that introduces clone() twice with two different types.

Drawbacks include:

  • It’s not yet standard C++: The reflection proposals are progressing but not yet ready to be adopted.

But wait… all of the solutions so far are flawed

It turns out that by focusing on clone and showing empty-class examples, we have missed a set of usability and correctness problems. Fortunately, we will solve those too in just a moment.

Consider this slightly more complete example of the above code to show what it’s like to write a non-empty class, and a print test function that lets us make sure we really are doing a deep clone:

class(clonable) B {
public:
    virtual void print() const { std::cout << "B"; }
private:
    int bdata;
};

class(clonable) C : public B {
public:
    void print() const override { std::cout << "C"; }
private:
    int cdata;
};

class(clonable) D : public C {
public:
    void print() const override { std::cout << "D"; }
private:
    int ddata;
};

This "works" fine. But did you notice it has pitfalls?

Take a moment to think about it: If you encountered this code in a code review, would you approve it?


OK, for starters, all of these classes are polymorphic, but all of them have public non-virtual destructors and public copy constructors and copy assignment operators. That’s not good. Remember one of the problems of a nonintrusive solution was that it doesn’t actually prevent slicing because you can still write this:

D d;
B b = d;                                // oops, still works

So what we should actually be writing using all of the solutions so far is something like this:

class(clonable) B {
public:
    virtual void print() const { std::cout << "B"; }
    virtual ~B() noexcept { }
    B() = default;
protected:
    B(const B &) = default;
    B& operator=(const B&) = delete;
private:
    int bdata;
};

class(clonable) C : public B {
public:
    void print() const override { std::cout << "C"; }
    ~C() noexcept override { }
    C() = default;
protected:
    C(const C &) = default;
    C& operator=(const C&) = delete;
private:
    int cdata;
};

class(clonable) D : public C {
public:
    void print() const override { std::cout << "D"; }
    ~D() noexcept override { }
    D() = default;
protected:
    D(const D &) = default;
    D& operator=(const D&) = delete;
private:
    int ddata;
};

That’s a lot of boilerplate.

In fact, it turns out that even though the original question was about the boilerplate code of the clone function, most of the boilerplate is in other functions assumed and needed by the clone pattern that weren’t even mentioned in the original question, but come up as soon as you try to use the proposed patterns in even simple real code.
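The effect of that extra boilerplate can be checked in today’s C++. The following is a sketch that writes clone() by hand instead of using a metaclass; the int-taking constructors, the `sum()` function, and `run_example` are assumptions added so that the deep copy can be observed.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

class B {
public:
    explicit B(int b = 0) : bdata(b) { }
    virtual ~B() noexcept { }
    virtual std::unique_ptr<B> clone() const {
        return std::unique_ptr<B>(new B(*this));   // a member may call the protected copy ctor
    }
    virtual int sum() const { return bdata; }
protected:
    B(const B&) = default;
    B& operator=(const B&) = delete;
private:
    int bdata;
};

class D : public B {
public:
    D(int b, int d) : B(b), ddata(d) { }
    std::unique_ptr<B> clone() const override {
        return std::unique_ptr<B>(new D(*this));
    }
    int sum() const override { return B::sum() + ddata; }
protected:
    D(const D&) = default;
    D& operator=(const D&) = delete;
private:
    int ddata;
};

// Outside the hierarchy, copying (and therefore slicing) no longer compiles.
static_assert(!std::is_copy_constructible<D>::value, "D must not be publicly copyable");
static_assert(!std::is_constructible<B, const D&>::value, "B b = d; must be rejected");
static_assert(!std::is_copy_assignable<B>::value, "assignment must be unavailable");

void run_example() {
    std::shared_ptr<B> b1 = std::make_shared<D>(1, 2);
    std::shared_ptr<B> b2 = b1->clone();
    assert(b2->sum() == 3);                          // D's data was copied too
    assert(dynamic_cast<D*>(b2.get()) != nullptr);   // and the copy is still a D
}
```

Note that clone() uses `new` directly rather than make_unique: make_unique would invoke the copy constructor from outside the class, where a protected constructor is not accessible.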

Metaclasses: Fuller "real" solution

Fortunately, as I hinted above, we can do even better. The metaclass function can take care of all of this for us:

  • Apply default accessibilities and qualifiers: Make base classes and member functions public by default, data members private by default, and the destructor virtual by default.
  • Apply requirements: Check and enforce that a polymorphic type doesn’t have a public copy/move constructor, doesn’t have assignment operators, and that the destructor is either public and virtual or protected and nonvirtual. Note that these are accurate compile-time errors, the best kind.
  • Generate functions: Generate a public virtual destructor if the user doesn’t provide one. Generate a protected copy constructor if the user doesn’t provide one. Generate a default constructor if all bases and members are default constructible.

Now the same user code is:

  • Simple and clean. As far as I can tell, it literally could not be significantly shorter: we have encapsulated a whole set of opt-in defaults, requirements, and generated functions under the single library name "clonable" that a class author can opt into by uttering that single Word of Power.
  • Correct by default.
  • Great error messages if the user writes a mistake.

Live example

class(clonable) B {
    virtual void print() const { std::cout << "B"; }
    int bdata;
};

class(clonable) C : B {
    void print() const override { std::cout << "C"; }
    int cdata;
};

class(clonable) D : C {
    void print() const override { std::cout << "D"; }
    int ddata;
};

That’s it. (And, I’ll add: This is "simplifying C++.")

How did we do it?

In my consteval library, I added the following polymorphic metaclass function, which is invoked by clonable (i.e., a clonable is-a polymorphic). I made it a separate function for just the usual good code factoring reasons: polymorphic offers nicely reusable behavior even for non-clonable types. Here is the code, in addition to the above clonable which adds the computed clone at the end. Remember, we only need to write the following library code once, and then class authors can enjoy the above simplicity forever:

// Library code: implementing the metaclass functions (once)

consteval void polymorphic(meta::info source) {
    using namespace meta;

    // For each base class...
    bool base_has_virtual_dtor = false;
    for (auto mem : base_spec_range(source)) {

        // Remember whether we found a virtual destructor in a base class
        for (auto base_mem : member_fn_range(mem))
            if (is_destructor(base_mem) && is_virtual(base_mem)) {
                base_has_virtual_dtor = true;
                break;
            }

        // Apply default: base classes are public by default
        if (has_default_access(mem))
            make_public(mem);

        // And inject it
        -> mem;
    }

    // For each data member...
    for (auto mem : data_member_range(source)) {

        // Apply default: data is private by default
        if (has_default_access(mem))
            make_private(mem);

        // Apply requirement: and the programmer must not have made it explicitly public
        compiler.require(!is_public(mem),
            "polymorphic classes' data members must be nonpublic");

        // And inject it
        -> mem;
    }

    // Remember whether the user declared these SMFs we will otherwise generate
    bool has_dtor         = false;
    bool has_default_ctor = false;
    bool has_copy_ctor    = false;

    // For each member function...
    for (auto mem : member_fn_range(source)) {
        has_default_ctor |= is_default_constructor(mem);

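        // If this is a copy or move constructor...
        // (test this member on its own: has_copy_ctor stays true once set)
        bool is_copy_ctor = is_copy_constructor(mem);
        has_copy_ctor |= is_copy_ctor;
        if (is_copy_ctor || is_move_constructor(mem)) {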
            // Apply default: copy/move construction is protected by default in polymorphic types
            if (has_default_access(mem))
                make_protected(mem);

            // Apply requirement: and the programmer must not have made it explicitly public
            compiler.require(!is_public(mem),
                "polymorphic classes' copy/move constructors must be nonpublic");
        }

        // Apply requirement: polymorphic types must not have assignment
        compiler.require(!is_copy_assignment_operator(mem) && !is_move_assignment_operator(mem),
            "polymorphic classes must not have assignment operators");

        // Apply default: other functions are public by default
        if (has_default_access(mem))
            make_public(mem);

        // Apply requirement: polymorphic class destructors must be
        // either public and virtual, or protected and nonvirtual
        if (is_destructor(mem)) {
            has_dtor = true;
            compiler.require((is_protected(mem) && !is_virtual(mem)) ||
                             (is_public(mem) && is_virtual(mem)),
                "polymorphic classes' destructors must be public and virtual, or protected and nonvirtual");
        }

        // And inject it
        -> mem;
    }

    // Apply generated function: provide default for destructor if the user did not
    if (!has_dtor) {
        if (base_has_virtual_dtor)
            -> __fragment class Z { public: ~Z() noexcept override { } };
        else
            -> __fragment class Z { public: virtual ~Z() noexcept { } };
    }

    // Apply generated function: provide defaults for constructors if the user did not
    if (!has_default_ctor)
         -> __fragment class Z { public: Z() =default; };
    if (!has_copy_ctor)
         -> __fragment class Z { protected: Z(const Z&) =default; };

}
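Without reflection, only part of this can be approximated in standard C++. As a rough sketch, the hypothetical trait `follows_polymorphic_rules` below checks some of the same requirements after the fact, using only what type traits can observe from outside the class. It cannot apply defaults or generate functions, and it cannot tell a protected member apart from a private or deleted one.

```cpp
#include <type_traits>

// Hypothetical trait: an after-the-fact approximation of some of the
// requirements that the polymorphic metaclass function enforces.
template <typename T>
struct follows_polymorphic_rules {
    // Public and virtual, or not publicly destructible and nonvirtual.
    static constexpr bool dtor_ok =
        std::has_virtual_destructor<T>::value == std::is_destructible<T>::value;

    // No publicly usable copy or move construction.
    static constexpr bool no_public_copy_or_move =
        !std::is_copy_constructible<T>::value && !std::is_move_constructible<T>::value;

    // No usable assignment operators.
    static constexpr bool no_assignment =
        !std::is_copy_assignable<T>::value && !std::is_move_assignable<T>::value;

    static constexpr bool value = dtor_ok && no_public_copy_or_move && no_assignment;
};

class Good {
public:
    Good() = default;
    virtual ~Good() noexcept { }
protected:
    Good(const Good&) = default;
    Good& operator=(const Good&) = delete;
};

class PublicCopy {                 // virtual destructor, but publicly copyable
public:
    virtual ~PublicCopy() = default;
};

class NonVirtualDtor {             // copying is restricted, but the public destructor is nonvirtual
public:
    NonVirtualDtor() = default;
protected:
    NonVirtualDtor(const NonVirtualDtor&) = default;
    NonVirtualDtor& operator=(const NonVirtualDtor&) = delete;
};

static_assert(follows_polymorphic_rules<Good>::value, "Good follows the rules");
static_assert(!follows_polymorphic_rules<PublicCopy>::value, "public copying is rejected");
static_assert(!follows_polymorphic_rules<NonVirtualDtor>::dtor_ok, "public nonvirtual destructor is rejected");
```

The contrast with the metaclass version is the point: a check like this has to be requested for each class and can only say yes or no, whereas the metaclass function applies the defaults and generates the missing functions for us.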

11 thoughts on “GotW-ish Solution: The ‘clonable’ pattern”

  1. @HenrikVallgren: It is only an experimental implementation for now. The link is in P0707 if you want to build it yourself locally and try it outside of Godbolt.

    @Freekjan: Yes, allowing a metaclass function to apply to derived classes is an extension we’re considering. Combine that with @Marzo’s extension for example…

  2. Wow, Herb!

    This could fix the one ugly hack in a class hierarchy of mine. Is it available in a compiler yet?

  3. While this is clearly a step in the right direction, programmers new to metaclasses would still be used to typing

    class E: public D {
    public:
        void print() const override { std::cout << "E"; }
    private:
        int edata;
    };
    

    when modifying the inheritance hierarchy. I did not have the time to fully look into the consequences, but would it be possible to make metaclasses somehow inheritable, such that, by default, E would also become clonable, just because its superclasses are also clonable? At least in this case, it would make for the true answer to the guru question where a bit of library code and a small modification to only the base class would make the entire type hierarchy do the right thing.

  4. Forgot to use code block in my previous comment and now it looks weird and gibberish :(

    What was lost in the formatting is that it would be nice if the following cloning was possible:

    B* b; C* c; D* d; // just here to indicate type of b, c, d
    unique_ptr<B> = b->clone();
    unique_ptr<C> = c->clone();
    unique_ptr<D> = d->clone();
    

    But we can’t have that can we – It’s a bummer that the smart pointers can’t be used as covariant return types, like raw pointers can. Is there anything that could be done to address that?

  5. This would be perfect if B::clone returned unique_ptr, C::clone returned unique_ptr and D::clone returned unique_ptr, but we can’t have that can we – It’s a bummer that the smart pointers can’t be used as covariant return types, like raw pointers can. Is there anything that could be done to address that?

  6. @Paul: Fixed, thanks!

    @Joerg: Fixed, thanks!

    @Nick asked on the original thread: “I haven’t tried it yet but can’t the macro versions lose a parameter by using decltype(*this) for the current class?” You are right, and I had meant to add that to the macro solution but forgot… now added, thanks! Note that you have to spell it a little more cumbersomely as std::remove_cv_t<std::remove_reference_t>… in C++20 it’ll be slightly better, std::remove_cvref_t.

  7. GotW-ish Solution: The ‘clonable’ patter -> pattern
    Missed ‘n’.

  8. Beautiful!

    Just a minor correction: I think it should be

    meta::compiler.require(!base_has_clone || clone_type == return_type_of(base_mem),
    

    (with type_of(base_mem) it goes wrong if you actually have two bases introducing the same clone method)
