I love PLINQ. In all of my test cases on various parallelization patterns, it performs well and consistently. But I recently ran into a question that bothers me a little: what is the functional difference between these two examples? What, if anything, keeps the PLINQ example from being similar or equivalent to the anti-pattern shown in the second example?

PLINQ:

public int PLINQSum()
{
    return Enumerable.Range(0, N)
        .AsParallel()
        .Select((x) => x + 1)
        .Sum();
}

Sync over async:

public int AsyncSum()
{
    var tasks = Enumerable.Range(0, N)
        .Select((x) => Task.Run(() => x + 1));

    return Task.WhenAll(tasks).Result.Sum();
}
  • One observed difference in behavior: in my tests, where N is the number of operations to complete and M is the size of each operation, as N increases and M decreases, PLINQ remains at or below the synchronous runtime, whereas the async example above becomes several orders of magnitude slower than the synchronous test.
    – RTD
    Commented Feb 7, 2022 at 3:47
  • See this answer. It discusses the difference between Parallel.ForEach and Task.Run within foreach, but it applies equally to PLINQ. Commented Feb 7, 2022 at 9:03
  • The other difference in your case is that as soon as you nest Task.Run in a Select, you are then working with Task<int> objects and not int values, so you cannot easily chain additional PLINQ operations (e.g. Where) without waiting for those tasks to complete. Commented Feb 7, 2022 at 9:05

1 Answer

The AsyncSum method is not an example of Sync over Async. It is an example of using the Task.Run method with the intention of parallelizing a calculation. You might think that Task = async, but that is not the case. The Task class was introduced with .NET Framework 4.0 in 2010, as part of the Task Parallel Library, two years before the advent of async/await with .NET Framework 4.5 in 2012.

What is Sync over Async: We use this term to describe a situation where an asynchronous API is invoked and then waited on synchronously, causing a thread to be blocked until the asynchronous operation completes. It is implied that the asynchronous API has a truly asynchronous implementation, meaning that it uses no thread while the operation is in flight. Most, but not all, of the asynchronous APIs that are built into the .NET platform have truly asynchronous implementations.
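To make the definition concrete, here is a minimal sketch of what Sync over Async actually looks like. The class and method names are hypothetical, and Task.Delay is used only as a stand-in for a truly asynchronous operation (it is timer-based and holds no thread while pending).

```csharp
using System;
using System.Threading.Tasks;

public static class SyncOverAsyncDemo
{
    // Hypothetical asynchronous API. Task.Delay stands in for real I/O:
    // no thread is consumed while the delay is pending.
    public static async Task<int> GetValueAsync()
    {
        await Task.Delay(100);
        return 42;
    }

    // Sync over Async: the calling thread is blocked on .Result until the
    // operation completes. With a UI SynchronizationContext this can deadlock.
    public static int GetValueBlocking()
    {
        return GetValueAsync().Result;
    }

    // Async all the way: the caller's thread is released while waiting.
    public static async Task<int> GetValueNonBlockingAsync()
    {
        return await GetValueAsync();
    }
}
```

Both variants return the same value. The difference is only whether a thread sits idle while the operation is in flight.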

The two examples in your question are technically different, but not because one of them is Sync over Async. Neither of them is. Both are parallelizing a synchronous operation (the mathematical addition x + 1) that cannot be performed without utilizing the CPU. And when we use the CPU, we use a thread.

Characterizing the AsyncSum method as an anti-pattern might be fair, but not because it is Sync over Async. You might want to call it an anti-pattern because:

  1. It allocates and schedules a Task for each number in the sequence, incurring a gigantic overhead compared to the tiny computational work that has to be performed.
  2. It saturates the ThreadPool for the whole duration of the parallel operation.
  3. It forces the ThreadPool to create additional threads, resulting in oversubscription (more threads than CPUs). This results in the operating system having more work to do (switching between threads).
  4. It behaves badly in case of exceptions. Instead of stopping the operation as soon as possible after an error has occurred, it will invariably invoke the lambda for all elements in the sequence. As a result you'll have to wait longer until you observe the error, and in the end you might observe a huge number of errors.
  5. It doesn't utilize the current thread. The current thread is blocked doing nothing, while all the work is done by ThreadPool threads. In comparison, PLINQ utilizes the current thread as one of its worker threads. This is something that you could also do manually, by creating some of the tasks with the Task constructor (instead of Task.Run), and then using the RunSynchronously method to run them on the current thread, while the rest of the tasks are scheduled on the ThreadPool.
var task1 = new Task<int>(() => 1 + 1); // Cold task
var task2 = Task.Run(() => 2 + 2); // Hot task scheduled on the ThreadPool
task1.RunSynchronously(); // Run the cold task on the current thread
int sum = Task.WhenAll(task1, task2).Result.Sum(); // Wait for both tasks, then add the results
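
Point 4 in the list above can be observed with a small sketch. The class and method names here are hypothetical, and the failure on the first element is contrived for illustration. The lambda still runs for every element, because Task.WhenAll completes only after all of the tasks have completed.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllFailureDemo
{
    // Returns how many times the lambda was invoked, although the
    // lambda for the first element (x == 0) throws.
    public static int CountInvocations(int n)
    {
        int invoked = 0;

        Task<int>[] tasks = Enumerable.Range(0, n)
            .Select(x => Task.Run(() =>
            {
                Interlocked.Increment(ref invoked);
                if (x == 0) throw new InvalidOperationException("Contrived failure");
                return x + 1;
            }))
            .ToArray(); // Materialize, so each task is created exactly once

        try
        {
            // Blocks until ALL tasks have completed, then surfaces the error.
            Task.WhenAll(tasks).Wait();
        }
        catch (AggregateException)
        {
            // Expected: the failure is observed only after every task has run.
        }

        return Volatile.Read(ref invoked);
    }
}
```

Calling WhenAllFailureDemo.CountInvocations(50) returns 50: the early failure does not prevent the remaining 49 lambdas from being scheduled and executed.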

The name AsyncSum itself is inappropriate, since there is nothing asynchronous happening inside this method. A better name could be WhenAll_TaskRun_Sum.

  • While I agree with some of your assessment here, according to the Microsoft documentation on Task.Run, the action passed to the method is the work to be executed "asynchronously", sans the "async" keyword. The best definition of the sync over async anti-pattern is "when you’re using a blocking wait on an async method, instead of awaiting the results asynchronously. This wastes the thread, causes unresponsiveness (if called from the UI), and exposes you to potential deadlocks." makolyte.com/fixing-the-sync-over-async-antipattern
    – RTD
    Commented Feb 7, 2022 at 15:50
  • Although, you're right that PLINQ is apparently re-using the originating thread via under-the-hood optimization, and this isn't occurring in the latter example.
    – RTD
    Commented Feb 7, 2022 at 15:55
  • @RTD you are right that the docs describe the Task.Run action parameter as "The work to execute asynchronously". But if you look at the docs for the Thread constructor, the start parameter is described as: "A delegate that represents the methods to be invoked when this thread begins executing". My point is that the term "asynchronous" is used selectively in the docs, even for APIs that do similar things. Both Task.Run(action) and new Thread(action).Start() invoke an action on another thread. Would you say that starting a thread is an asynchronous operation? Or that it's an anti-pattern? Commented Feb 7, 2022 at 16:37
  • My understanding is that there is some overlap between "async" and "parallel", but they are not equivalent, as an asynchronous task may or may not be scheduled on the originating thread, depending on the task scheduler and on whether that thread is a background thread from the thread pool. In the provided example that might not be the case: it can be a foreground thread in, say, a console application. But if for some unfortunate reason it were ASP.NET servicing a request, the thread would be from the thread pool, so it becomes sync over async, blocking a worker thread from re-use.
    – RTD
    Commented Feb 7, 2022 at 17:04