Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ Tuple vs Struct

Tags:

c++

struct

tuples

Is there is any difference between using a std::tuple and a data-only struct?

typedef std::tuple<int, double, bool> foo_t;

struct bar_t {
    int id;
    double value;
    bool dirty;
}

From what I have found online, I found that there are two major differences: the struct is more readable, while the tuple has many generic functions that can be used. Should there be any significant performance difference? Also, is the data layout compatible with each other (interchangeably casted)?

like image 620
Alex Koay Avatar asked May 01 '11 23:05

Alex Koay


People also ask

What is the difference between a tuple and a struct?

So, use tuples when you want to return two or more arbitrary pieces of values from a function, but prefer structs when you have some fixed data you want to send or receive multiple times.

Is tuple faster than struct?

Now our struct is slightly faster than that of a tuple now (around 3% with clang and less than 1% with gcc), however, we do need to write our customized swap function for all of our structs.

Is tuple a struct?

Since tuples are structs there is no much point in making them immutable. The readonliness of a struct is a property of the whole variable. If a tuple variable is assignable, it can be changed to contain any value regardless of the readonliness of individual fields.

Is tuple a C++?

Tuples in C++ A tuple is an object that can hold a number of elements. The elements can be of different data types. The elements of tuples are initialized as arguments in order in which they will be accessed.


11 Answers

We have a similar discussion about tuple and struct and I write some simple benchmarks with the help from one of my colleague to identify the differences in term of performance between tuple and struct. We first start with a default struct and a tuple.

struct StructData {
    int X;
    int Y;
    double Cost;
    std::string Label;

    bool operator==(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) == std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }

    bool operator<(const StructData &rhs) {
        return X < rhs.X || (X == rhs.X && (Y < rhs.Y || (Y == rhs.Y && (Cost < rhs.Cost || (Cost == rhs.Cost && Label < rhs.Label)))));
    }
};

using TupleData = std::tuple<int, int, double, std::string>;

We then use Celero to compare the performance of our simple struct and tuple. Below is the benchmark code and performance results collected using gcc-4.9.2 and clang-4.0.0:

std::vector<StructData> test_struct_data(const size_t N) {
    std::vector<StructData> data(N);
    std::transform(data.begin(), data.end(), data.begin(), [N](auto item) {
        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<> dis(0, N);
        item.X = dis(gen);
        item.Y = dis(gen);
        item.Cost = item.X * item.Y;
        item.Label = std::to_string(item.Cost);
        return item;
    });
    return data;
}

std::vector<TupleData> test_tuple_data(const std::vector<StructData> &input) {
    std::vector<TupleData> data(input.size());
    std::transform(input.cbegin(), input.cend(), data.begin(),
                   [](auto item) { return std::tie(item.X, item.Y, item.Cost, item.Label); });
    return data;
}

constexpr int NumberOfSamples = 10;
constexpr int NumberOfIterations = 5;
constexpr size_t N = 1000000;
auto const sdata = test_struct_data(N);
auto const tdata = test_tuple_data(sdata);

CELERO_MAIN

BASELINE(Sort, struct, NumberOfSamples, NumberOfIterations) {
    std::vector<StructData> data(sdata.begin(), sdata.end());
    std::sort(data.begin(), data.end());
    // print(data);

}

BENCHMARK(Sort, tuple, NumberOfSamples, NumberOfIterations) {
    std::vector<TupleData> data(tdata.begin(), tdata.end());
    std::sort(data.begin(), data.end());
    // print(data);
}

Performance results collected with clang-4.0.0

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    196663.40000 |            5.08 | 
Sort            | tuple           | Null            |              10 |               5 |         0.92471 |    181857.20000 |            5.50 | 
Complete.

And performance results collected using gcc-4.9.2

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    219096.00000 |            4.56 | 
Sort            | tuple           | Null            |              10 |               5 |         0.91463 |    200391.80000 |            4.99 | 
Complete.

From the above results we can clearly see that

  • Tuple is faster than a default struct

  • Binary produce by clang has higher performance that that of gcc. clang-vs-gcc is not the purpose of this discussion so I won't dive into the detail.

We all know that writing a == or < or > operator for every single struct definition will be a painful and buggy task. Let replace our custom comparator using std::tie and rerun our benchmark.

bool operator<(const StructData &rhs) {
    return std::tie(X,Y,Cost, Label) < std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
}

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    200508.20000 |            4.99 | 
Sort            | tuple           | Null            |              10 |               5 |         0.90033 |    180523.80000 |            5.54 | 
Complete.

Now we can see that using std::tie makes our code more elegant and it is harder to make mistake, however, we will loose about 1% performance. I will stay with the std::tie solution for now since I also receive a warning about comparing floating point numbers with the customized comparator.

Until now we have not has any solution to make our struct code run faster yet. Let take a look at the swap function and rewrite it to see if we can gain any performance:

struct StructData {
    int X;
    int Y;
    double Cost;
    std::string Label;

    bool operator==(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) == std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }

    void swap(StructData & other)
    {
        std::swap(X, other.X);
        std::swap(Y, other.Y);
        std::swap(Cost, other.Cost);
        std::swap(Label, other.Label);
    }  

    bool operator<(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) < std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }
};

Performance results collected using clang-4.0.0

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    176308.80000 |            5.67 | 
Sort            | tuple           | Null            |              10 |               5 |         1.02699 |    181067.60000 |            5.52 | 
Complete.

And the performance results collected using gcc-4.9.2

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    198844.80000 |            5.03 | 
Sort            | tuple           | Null            |              10 |               5 |         1.00601 |    200039.80000 |            5.00 | 
Complete.

Now our struct is slightly faster than that of a tuple now (around 3% with clang and less than 1% with gcc), however, we do need to write our customized swap function for all of our structs.

like image 158
hungptit Avatar answered Oct 02 '22 20:10

hungptit


Well, here's a benchmark that doesn't construct a bunch of tuples inside the struct operator==(). Turns out there's a pretty significant performance impact from using tuple, as one would expect given that there's no performance impact at all from using PODs. (The address resolver finds the value in the instruction pipeline before the logic unit ever even sees it.)

Common results from running this on my machine with VS2015CE using the default 'Release' settings:

Structs took 0.0814905 seconds.
Tuples took 0.282463 seconds.

Please monkey with it until you're satisfied.

#include <iostream>
#include <string>
#include <tuple>
#include <vector>
#include <random>
#include <chrono>
#include <algorithm>

class Timer {
public:
  Timer() { reset(); }
  void reset() { start = now(); }

  double getElapsedSeconds() {
    std::chrono::duration<double> seconds = now() - start;
    return seconds.count();
  }

private:
  static std::chrono::time_point<std::chrono::high_resolution_clock> now() {
    return std::chrono::high_resolution_clock::now();
  }

  std::chrono::time_point<std::chrono::high_resolution_clock> start;

};

struct ST {
  int X;
  int Y;
  double Cost;
  std::string Label;

  bool operator==(const ST &rhs) {
    return
      (X == rhs.X) &&
      (Y == rhs.Y) &&
      (Cost == rhs.Cost) &&
      (Label == rhs.Label);
  }

  bool operator<(const ST &rhs) {
    if(X > rhs.X) { return false; }
    if(Y > rhs.Y) { return false; }
    if(Cost > rhs.Cost) { return false; }
    if(Label >= rhs.Label) { return false; }
    return true;
  }
};

using TP = std::tuple<int, int, double, std::string>;

std::pair<std::vector<ST>, std::vector<TP>> generate() {
  std::mt19937 mt(std::random_device{}());
  std::uniform_int_distribution<int> dist;

  constexpr size_t SZ = 1000000;

  std::pair<std::vector<ST>, std::vector<TP>> p;
  auto& s = p.first;
  auto& d = p.second;
  s.reserve(SZ);
  d.reserve(SZ);

  for(size_t i = 0; i < SZ; i++) {
    s.emplace_back();
    auto& sb = s.back();
    sb.X = dist(mt);
    sb.Y = dist(mt);
    sb.Cost = sb.X * sb.Y;
    sb.Label = std::to_string(sb.Cost);

    d.emplace_back(std::tie(sb.X, sb.Y, sb.Cost, sb.Label));
  }

  return p;
}

int main() {
  Timer timer;

  auto p = generate();
  auto& structs = p.first;
  auto& tuples = p.second;

  timer.reset();
  std::sort(structs.begin(), structs.end());
  double stSecs = timer.getElapsedSeconds();

  timer.reset();
  std::sort(tuples.begin(), tuples.end());
  double tpSecs = timer.getElapsedSeconds();

  std::cout << "Structs took " << stSecs << " seconds.\nTuples took " << tpSecs << " seconds.\n";

  std::cin.get();
}
like image 41
Khatharr Avatar answered Oct 02 '22 19:10

Khatharr


If you're using several different tuples in your code you can get away with condensing the number of functors you are using. I say this because I've often used the following forms of functors:

template<int N>
struct tuple_less{
    template<typename Tuple>
    bool operator()(const Tuple& aLeft, const Tuple& aRight) const{
        typedef typename boost::tuples::element<N, Tuple>::type value_type;
        BOOST_CONCEPT_REQUIRES((boost::LessThanComparable<value_type>));

        return boost::tuples::get<N>(aLeft) < boost::tuples::get<N>(aRight);
    }
};

This might seem like overkill but for each place within the struct I'd have to make a whole new functor object using a struct but for a tuple, I just change N. Better than that, I can do this for every single tuple as opposed to creating a whole new functor for each struct and for each member variable. If I have N structs with M member variables that NxM functors I would need to create (worse case scenario) that can be condensed down to one little bit of code.

Naturally, if you're going to go with the Tuple way, you're also going to need to create Enums for working with them:

typedef boost::tuples::tuple<double,double,double> JackPot;
enum JackPotIndex{
    MAX_POT,
    CURRENT_POT,
    MIN_POT
};

and boom, you're code is completely readable:

double guessWhatThisIs = boost::tuples::get<CURRENT_POT>(someJackPotTuple);

because it describes itself when you want to get the items contained within it.

like image 28
wheaties Avatar answered Oct 02 '22 18:10

wheaties


Tuple has built in default(for == and != it compares every element, for <.<=... compares first, if same compares second...) comparators: http://en.cppreference.com/w/cpp/utility/tuple/operator_cmp

edit: as noted in the comment C++20 spaceship operator gives you a way to specify this functionality with one(ugly, but still just one) line of code.

like image 45
NoSenseEtAl Avatar answered Oct 02 '22 20:10

NoSenseEtAl


Also, is the data layout compatible with each other (interchangeably casted)?

Oddly I can't see a direct response to this part of the question.

The answer is: no. Or at least not reliably, as the layout of the tuple is unspecified.

Firstly, your struct is a Standard Layout Type. The ordering, padding and alignment of the members are well-defined by a combination of the standard and your platform ABI.

If a tuple was a standard layout type, and we knew the fields were laid out in the order the types are specified, we might have some confidence it would match the struct.

The tuple is normally implemented using inheritance, in one of two ways: the old Loki/Modern C++ Design recursive style, or the newer variadic style. Neither is a Standard Layout type, because both violate the following conditions:

  1. (prior to C++14)

    • has no base classes with non-static data members, or

    • has no non-static data members in the most derived class and at most one base class with non-static data members

  2. (for C++14 and later)

    • Has all non-static data members and bit-fields declared in the same class (either all in the derived or all in some base)

since each leaf base class contains a single tuple element (NB. a single-element tuple probably is a standard layout type, albeit not a very useful one). So, we know the standard does not guarantee the tuple has the same padding or alignment as the struct.

Additionally, it's worth noting that the older recursive-style tuple will generally lay out the data members in reverse order.

Anecdotally, it has sometimes worked in practice for some compilers and combinations of field types in the past (in one case, using recursive tuples, after reversing the field order). It definitely doesn't work reliably (across compilers, versions etc.) now, and was never guaranteed in the first place.

like image 27
Useless Avatar answered Oct 02 '22 19:10

Useless


Well, a POD struct can often be (ab)used in low-level contiguous chunk reading and serializing. A tuple might be more optimized in certain situations and support more functions, as you said.

Use whatever is more appropriate for the situation, there ain't no general preference. I think (but I haven't benchmarked it) that performance differences won't be significant. The data layout is most likely not compatible and implementation specific.

like image 34
orlp Avatar answered Oct 02 '22 19:10

orlp


Judging by other answers, performance considerations are minimal at best.

So it really should come down to practicality, readability and maintainability. And struct is generally better because it creates types which are easier to read and to understand.

Sometimes, a std::tuple (or even std::pair) might be necessary to deal with code in a highly generic way. For example, some operations related to variadic parameter packs would be impossible without something like std::tuple. std::tie is a great example of when std::tuple can improve code (prior to C++20).

But anywhere you can use a struct, you probably should use a struct. It will bestow semantic meaning on the elements of your type. That's invaluable in understanding and using the type. In turn, this can help avoid silly mistakes:

// hard to get wrong; easy to understand
cat.arms = 0;
cat.legs = 4;

// easy to get wrong; hard to understand
std::get<0>(cat) = 0;
std::get<1>(cat) = 4;
like image 21
John McFarlane Avatar answered Oct 02 '22 19:10

John McFarlane


As far as the "generic function" go, Boost.Fusion deserves some love... and especially BOOST_FUSION_ADAPT_STRUCT.

Ripping from the page: ABRACADBRA

namespace demo
{
    struct employee
    {
        std::string name;
        int age;
    };
}

// demo::employee is now a Fusion sequence
BOOST_FUSION_ADAPT_STRUCT(
    demo::employee
    (std::string, name)
    (int, age))

This means that all Fusion algorithms are now applicable to the struct demo::employee.


EDIT: Regarding the performance difference or layout compatibility, tuple's layout is implementation defined so not compatible (and thus you should not cast between either representation) and in general I would expect no difference performance-wise (at least in Release) thanks to the inlining of get<N>.

like image 31
Matthieu M. Avatar answered Oct 02 '22 19:10

Matthieu M.


My experience is that over time functionality starts to creep up on types (like POD structs) which used to be pure data holders. Things like certain modifications which shouldn't require inside knowledge of the data, maintaining invariants etc.

That is a good thing; it's the foundation of object orientation. It is the reason why C with classes was invented. Using pure data collections like tuples are not open to such logical extension; structs are. That's why I would almost always opt for structs.

Related is that like all "open data objects", tuples violate the information hiding paradigm. You cannot change that later without throwing out the tuple wholesale. With a struct, you can move gradually towards access functions.

Another issue is type safety and self-documenting code. If your function receives an object of type inbound_telegram or location_3D it's clear; if it receives a unsigned char * or tuple<double, double, double> it is not: the telegram could be outbound, and the tuple could be a translation instead of a location, or perhaps the minimum temperature readings from the long weekend. Yes, you can typedef to make intentions clear but that does not actually prevent you from passing temperatures.

These issues tend to become important in projects which exceed a certain size; the disadvantages of tuples and advantages of elaborate classes become not visible and indeed are an overhead in small projects. Starting with proper classes even for inconspicuous little data aggregates pays late dividends.

Of course one viable strategy would be to use a pure data holder as the underlying data provider for a class wrapper which provides operations on that data.

like image 24
Peter - Reinstate Monica Avatar answered Oct 02 '22 19:10

Peter - Reinstate Monica


Don't worry about speed or layout, that's nano-optimisation, and depends on the compiler, and there's never enough difference to influence your decision.

You use a struct for things that meaningfully belong together to form a whole.

You use a tuple for things that are together coincidentally. You can use a tuple spontaneously in your code.

like image 42
gnasher729 Avatar answered Oct 02 '22 18:10

gnasher729


There shouldn't be a performance difference (even an insignificant one). At least in the normal case, they will result in the same memory layout. Nonetheless, casting between them probably isn't required to work (though I'd guess there's a pretty fair chance it normally will).

like image 42
Jerry Coffin Avatar answered Oct 02 '22 18:10

Jerry Coffin