1
\$\begingroup\$

Given a Timeit method which runs for n times the provided function, which approach between multithreading and multiprocessing should be better to speed up the execution of all iterations and avoid the loop? Multithreading would be memory safe? What would be a pseudo code of the implementation? Thanks in advance

    typedef std::chrono::milliseconds milliseconds;

    struct Result {

        milliseconds m_min;
        milliseconds m_max;
        milliseconds m_avg;
        milliseconds m_sum;

        Result(milliseconds min, milliseconds max, milliseconds avg,
            milliseconds sum)
            : m_min(min), m_max(max), m_avg(avg), m_sum(sum) {};

    };

    template<typename Function, typename ... Args>
    Result Timeit(uint16_t iters, bool verbose, Function& f, Args&& ... args) noexcept(false) {

        if (iters < 1) throw std::runtime_error("Iters must be >= 1.");

        typedef std::chrono::time_point<std::chrono::high_resolution_clock> time_point;
        typedef std::vector<std::chrono::milliseconds> Clocks;

        Clocks clocks;
        clocks.reserve(iters);

        for (int it = 0; it < iters; it++) {
            time_point t0 = std::chrono::high_resolution_clock::now();
            f(std::forward<Args>(args)...);
            time_point t1 = std::chrono::high_resolution_clock::now();
            clocks.emplace_back(
                std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count());
        }

        milliseconds max = *std::max_element(clocks.begin(), clocks.end());
        milliseconds min = *std::min_element(clocks.begin(), clocks.end());
        milliseconds sum = std::accumulate(clocks.begin(), clocks.end(), milliseconds(0));
        milliseconds avg = sum / clocks.size();

        if (verbose) {
            std::cout << "=======================================================================\n";
            std::cout << "=========================== SHOW TIMEIT RESULTS =======================\n";
            std::cout << "=======================================================================\n\n";

            std::cout << "\tIterations completed\t: " << iters << "\n";
            std::cout << "\tTotal milliseconds\t: " << sum << "\n";
            std::cout << "\tAverage milliseconds\t: " << avg << "\n";
            std::cout << "\tMaxima milliseconds\t: " << max << "\n";
            std::cout << "\tMinima milliseconds\t: " << min << "\n\n\n";
        }

        return Result(min, max, avg, sum);
    }
\$\endgroup\$
2
  • \$\begingroup\$ Welcome to Code Review! Is the code presented here code that you wrote or maintain? \$\endgroup\$ Commented Sep 27, 2022 at 17:13
  • \$\begingroup\$ It is a casual code that I wrote extending the functionalities of the classic Timer class. Also, is inspired by the Pythonic timeit. \$\endgroup\$
    – Liiqquid
    Commented Sep 27, 2022 at 17:40

1 Answer 1

1
\$\begingroup\$

Prefer using over typedef

using declarations are a bit easier to read than typedefs, especially when it comes to making aliases for function pointers or arrays.

Don't create type aliases unnecessarily; Clocks is used only once, so nothing was gained by making that an alias. Also, often you can avoid having to write a type explicitly by using auto.

Naming things

clocks sounds like it is storing clocks... but it is actually storing durations, so durations would be a better name.

Result is a very generic name, which could cause conflicts: consider that some other function might want to return a result in a struct. To solve this, either make sure the name is unique, like TimeitResult, or put everything in a namespace, which itself should have a unique enough name.

Store durations in the clock's own resolution

By using std::chrono::milliseconds to store durations, you are losing precision unnecessarily. Consider the case where one invocation of f() completes in less than one millisecond. It's better to not specify a fixed time resolution at all, but store time in the clock's native resolution:

using clock = std::chrono::high_resolution_clock;
using duration = clock::duration;

struct Result {
    duration min;
    duration max;
    duration avg;
    duration sum;
};

Only convert to milliseconds when you actually need to print the values:

std::vector<duration> durations;
...
auto t0 = clock::now();
f(std::forward<Args(args)...);
auto t1 = clock::now();
durations.emplace_back(t1 - t0);
...
std::cout << "\tTotal milliseconds\t: "
          << std::chrono::duration_cast<std::chrono::milliseconds>(sum).count() << "\n";

Note that it might be better to use std::chrono::steady_clock than std::chrono::high_resolution_clock; it's not guaranteed that the latter is monotonic.

Don't pass functions by (const) reference

You are passing f by reference, however this will prevent you from writing:

Timeit(1, true, []{});

Passing by const reference would solve this case, but would be problematic if the function passed in requires it to be mutable. So pass functions either by value or by rvalue reference.

Use std::minmax_element()

If you need both the minimum and maximum of a range, use std::minmax_element(). Since your code requires C++20, you can use std::ranges::minmax_element() and structured bindings to simply write:

auto [min, max] = std::ranges::minmax_element(clocks);

Don't let Timeit() print the results

Your Timeit() both performs timing measurements and handles printing the results to std::cout. This complicates the function, and it's inflexible: what if you want to print the results to a file instead? What if you want to print it to both std::cout and a file?

The solution is to not let Timeit() handle printing of the results at all. Instead, consider creating a friend ostream& operator<<() that will format a Result. If you want the "verbose" behaviour, the caller can then simply do:

std::cout << Timeit(...);

Other things to think about

Why is iters a 16-bit integer? What if you have a very fast function that you want to run a million times? I suggest you use a std::size_t instead for the number of iterations. Also make sure that the iteration variable it has the same type as iters.

If you don't know how fast a function is going to be, it's hard to come up with a value of iters that is large enough to get good statistics but not too large that it takes forever to run. However, you could let Timeit() figure out how many times to run the functions, for example by running it once and checking the time it took, and then calculating how many times it needs to run it so it runs for a small number of seconds.

The first time you run a function it might run quite slow because the caches are cold, the branch predictors have not seen anything to predict yet, and there are many other things that can take longer the first time. In fact, you might need to run the function many times before everything is warmed up. So ideally Timeit() should first do a dummy loop before doing the actual loop of iters iterations.

Multithreading

Given a Timeit method which runs for n times the provided function, which approach between multithreading and multiprocessing should be better to speed up the execution of all iterations and avoid the loop?

It's hard to say what the best approach would be. It would heavily depend on whether the provided function is meant to run in parallel in the same process or in separate processes.

Multithreading would be memory safe?

That again depends on the provided function. If it's not thread-safe, you obviously cannot run it in several threads at once.

What would be a pseudo code of the implementation?

That's not how it works here on Code Review; you provide a fully working implementation and we review it. And again, it depends on the provided function what the best way to run it is so you get realistic results.

However, if the function is thread-safe, then probably your best option would be to implement a thread pool, and then schedule iters executions of the provided function on the thread pool.

\$\endgroup\$

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.