3

Are the following move operations legitimate, i.e. free of UB? Instead of two swaps, I want to implement a single swap of the 16 bytes of MyString.

#include <cstddef>
#include <cstring>

class MyString {
    std::size_t size_;
    char* str_;
    
    MyString(const char* str, std::size_t len): size_(len), str_(size_ ? new char[size_+1] : nullptr) {
      if (size_) {
        std::memcpy(str_, str, len+1); // copies the terminating '\0' as well
      } 
    }

    // Swaps the object representations of *l and *r byte by byte.
    static void bitwise_swap(MyString* l, MyString* r) {
        constexpr std::size_t size = sizeof(MyString);
        alignas (MyString)  char tmp[size]; // is alignas even necessary? 
        std::memcpy(&tmp, l, size);
        std::memcpy(l, r, size);
        std::memcpy(r, &tmp, size);
    }
    
public:    
    MyString(const char* str = ""): MyString(str, std::strlen(str)) {
    }
    
    MyString(MyString&& s) noexcept : size_(0), str_(nullptr) {
        bitwise_swap(this, &s);
    }
    
    MyString& operator=(MyString&& s) {
        bitwise_swap(this, &s); 
        return *this;
    }
    
    ~MyString() {
      delete[] str_;
    }
};

Note that MyString is clearly not trivially copyable. However I'm always swapping the internal bit representation of one MyString object with another.

Also, I'm fine with operator= swapping with its parameter. If an xvalue is received (i.e. an lvalue that was passed through std::move), I'm fine with it carrying my "garbage". So please compare with these versions:

MyString(MyString&& s) noexcept : size_(0), str_(nullptr) {
   std::swap(size_, s.size_);
   std::swap(str_, s.str_);
}

And:

MyString& operator=(MyString&& s) {
   std::swap(size_, s.size_);
   std::swap(str_, s.str_);
   return *this;
}

Assuming this code is valid: some would (rightfully) say this is a premature optimization, but can anyone claim that one version is usually more efficient than the other, or that typical compilers perform this kind of optimization themselves, i.e. turn the two swaps of 8 bytes into one swap of 16 bytes?

This version is defined according to the standard: https://compiler-explorer.com/z/xanvffGbG But isn't the straightforward code also valid? Can anyone quote the standard to prove it is valid or that it is not?

2 Answers

2

Note that MyString is clearly not trivially copyable. However I'm always swapping the internal bit representation of one MyString object with another.

std::memcpy on a non-trivially-copyable type is UB. Specifically [basic.types.general]/2 and [basic.types.general]/3 specify the behavior of copying underlying bytes from an object to a character array and back to an object of the same type only for trivially-copyable types.
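The type property in question can be checked at compile time. The following is only a sketch: `Owning` is a hypothetical cut-down stand-in for MyString with the same two members and a user-provided destructor and move constructor, and it assumes C++17 for the `_v` traits.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for MyString: same members, user-provided
// destructor and move constructor.
class Owning {
    std::size_t size_ = 0;
    char* str_ = nullptr;
public:
    Owning() = default;
    Owning(Owning&& o) noexcept : size_(o.size_), str_(o.str_) {
        o.size_ = 0;
        o.str_ = nullptr;
    }
    ~Owning() { delete[] str_; }
};

// A user-provided destructor (or move constructor) is non-trivial, so the
// type falls outside what [basic.types.general]/2 and /3 specify.
static_assert(!std::is_trivially_copyable_v<Owning>);

// The members themselves are scalar types and therefore trivially copyable.
static_assert(std::is_trivially_copyable_v<std::size_t>);
static_assert(std::is_trivially_copyable_v<char*>);
```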

Even if the type were trivially copyable, the behavior would not be defined in general. The mentioned paragraphs additionally require that neither object is potentially-overlapping, meaning, for example, that they must not be base class subobjects.

For an example why the last part is important: A derived class might reuse tail padding of the base class and in that case memcpy would incorrectly overwrite the tail padding of the derived object.
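A sketch of how tail padding reuse can be observed follows. `Base` and `Derived` are hypothetical types, and whether the padding is actually reused depends on the ABI: it is typical for GCC and Clang under the Itanium C++ ABI, while other implementations may lay the classes out differently.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical types. The user-provided constructor keeps Base trivially
// copyable, but on Itanium-ABI compilers it makes Base eligible for
// tail-padding reuse when it is used as a base class.
struct Base {
    Base() {}
    int i;
    char c;   // typically followed by 3 bytes of tail padding
};

struct Derived : Base {
    char d;   // may be placed inside Base's tail padding
};

static_assert(std::is_trivially_copyable_v<Base>);

// True when Derived::d lives inside the sizeof(Base) bytes of its base
// class subobject. In that case copying sizeof(Base) bytes into the Base
// part of a Derived object would also overwrite d.
// The value is implementation-specific.
constexpr bool tail_padding_reused = sizeof(Derived) == sizeof(Base);
```

If `tail_padding_reused` is true on a given compiler, a `memcpy` of `sizeof(Base)` bytes into the base class subobject of a `Derived` would clobber `d`, which is the situation the potentially-overlapping rule excludes.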

Swapping the individual members one-by-one on the other hand is fine. (That would be the case even if you use memcpy, because they are trivially-copyable and not potentially-overlapping).
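A minimal sketch of the member-wise variant with memcpy, assuming C++17. The helper name `memcpy_swap` is hypothetical; the static_assert only checks trivial copyability, so making sure the arguments are not potentially-overlapping remains the caller's job.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: swaps two objects through a byte buffer, but only
// compiles for trivially copyable types.
template <class T>
void memcpy_swap(T& a, T& b) noexcept {
    static_assert(std::is_trivially_copyable_v<T>,
                  "byte-wise copy is only specified for trivially copyable types");
    if (&a == &b) return;  // memcpy requires non-overlapping source and destination
    alignas(T) unsigned char tmp[sizeof(T)];
    std::memcpy(tmp, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, tmp, sizeof(T));
}
```

Inside MyString, the move operations would then call `memcpy_swap(size_, s.size_)` and `memcpy_swap(str_, s.str_)`. Calling `memcpy_swap` on two MyString objects directly fails to compile, which is the point of the static_assert.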

7
  • Thanks! As per the links [basic.types.general]/2 and [basic.types.general]/3, memcpy of trivially copyable types is defined. But I don't see anything claiming that memcpy of the data described here is UB.
    – dwto
    Commented Nov 20 at 8:16
  • @dwto Nothing describes the behavior for other types, so it is undefined. Given that the paragraphs mention the type properties explicitly, it is also unlikely to be an unintended omission. Commented Nov 20 at 14:50
  • @dwto Also remember that the standard doesn't even require the storage occupied by non-trivially-copyable and non-standard-layout types to be contiguous, nor does it define any relation between object representations and values for non-trivially-copyable types. Commented Nov 20 at 14:52
  • To my knowledge the standard requires: 1. each object of a type to be of the same size in bytes. 2. all members to appear in their natural order. 3. padding and a vtable may be part of an object. 4. all objects of type T must have the same layout. Given all of the above, and since MyString has no virtual functions and hence no vtable, I still don't see a rule that prevents the MyString code above. Please correct me if I'm wrong.
    – dwto
    Commented Nov 20 at 20:09
  • @dwto 1. Each object type has a specific size which corresponds to the size of storage occupied by a complete object of the type. This doesn't apply in general for other objects though. In general objects are not contiguous in storage. Consider virtual bases for an example in practice; 2. That applies to members, but not base classes; 3. yes; 4. I don't think that is actually guaranteed by the standard. For example offsetof is only conditionally-supported for non-standard-layout types. I don't see any guarantee of this for non-trivially-copyable and non-standard-layout types. Commented Nov 20 at 20:25
1

Note that MyString is clearly not trivially copyable. However I'm always swapping the internal bit representation of one MyString object with another.

Your memcpy is equivalent to copying the two trivially-copyable data members (and only them - there should be no padding on any common system as size_t and pointers are both basically counts of bytes in the address space and should have the same bit width, there's no base class or vtable pointer...), so it has defined behaviour.

That said, I'd hazard that memcpy is (an ugly hack and) very unlikely to be faster. But, why trust our gut instinct when you can try it for the compilers/version/optimisation-flags you care about on Compiler Explorer - https://godbolt.org/ - and see for yourself?
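As a starting point for such a comparison, here is a sketch of two free functions that could be pasted into Compiler Explorer. `Rep`, `swap_members` and `swap_bytes` are hypothetical names; `Rep` is a trivially copyable aggregate with the same two members as MyString, so both variants are well-defined for it when the arguments are distinct objects that are not base class subobjects. No particular codegen result is claimed here.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical trivially copyable aggregate with the same two members
// as MyString.
struct Rep {
    std::size_t size;
    char* str;
};

// Variant A: two member-wise swaps.
void swap_members(Rep& a, Rep& b) noexcept {
    std::swap(a.size, b.size);
    std::swap(a.str, b.str);
}

// Variant B: one swap over the whole object representation.
// Assumes a and b are distinct, non-overlapping objects.
void swap_bytes(Rep& a, Rep& b) noexcept {
    alignas(Rep) unsigned char tmp[sizeof(Rep)];
    std::memcpy(tmp, &a, sizeof(Rep));
    std::memcpy(&a, &b, sizeof(Rep));
    std::memcpy(&b, tmp, sizeof(Rep));
}
```

Compiling both with the optimisation level you care about (for example -O2) and comparing the generated assembly shows whether the compiler treats them differently.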
