Closures: Performance implications


News category: Programming
Source: dev.to

Performance implications with closures capture

Originally posted on https://www.celsojr.com/post/closures-performance-implications

If you are writing resource-constrained, mission-critical code and want to avoid the default closure capture, you can simply avoid closures by using an imperative code style and/or traditional functions, because composition is not primarily intended for better performance. I don't recommend this if your struct is big or if you're not really sure what you're doing: the capture the compiler generates for you is one of the most efficient ways to do closure capture in many different scenarios.

Alternatively, you can create your own helper class or struct to do the capture for you. But believe me, depending on the size of your application and the APIs you're working with, it will be VERY hard to take care of all your closures by hand, and also VERY difficult to beat the efficiency of the compiler's default closure capture mechanism. But let's give it a try. Let's look at the following code example:

using System;
using System.Runtime.CompilerServices;

static class ClosureCompare
{
    private static int n = 0;
    private delegate void AddDelegate(int n);

    private readonly static DisplayStruct adder = new DisplayStruct(ref n);
    private readonly static AddDelegate invoke = adder.Add;

    static void Main()
    {
       invoke(1);
       Console.WriteLine(adder.GetValue()); // Output the result: 1
    }
}

readonly unsafe struct DisplayStruct
{
    private readonly int* num;

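    public DisplayStruct(ref int initialValue)
    {
        // NOTE: the pointer outlives the fixed block, so it is only valid while
        // the target is never moved by the GC. Here the target is a static
        // field; that it stays put is an assumption, not a language guarantee.
        fixed (int* ptr = &initialValue)
        {
            num = ptr;
        }
    }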

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Add(int n)
    {
        *num += n;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public int GetValue()
    {
        return *num;
    }
}

Note that I'm working with a raw pointer so that I can mutate the value through a readonly struct, which is why this code has to be compiled in an unsafe context. I know this code doesn't look very convenient, but otherwise, believe me, it won't be worth doing your own closure capture, because the compiler will do it better than you. If you check the lowered C# code now, you will see that the <>c__DisplayClass0_0 helper class that the compiler used to generate is gone.
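For comparison, the default capture that the struct above replaces can be sketched by hand. This is only an approximation of what the compiler generates: CounterDisplayClass and DefaultCaptureSketch are hypothetical names chosen for illustration, and the real generated class has an unspeakable name such as <>c__DisplayClass0_0.

```csharp
using System;

// Hypothetical, hand-written stand-in for the compiler-generated display class.
sealed class CounterDisplayClass
{
    public int n;                              // the captured local is hoisted to the heap
    public void Add(int value) => n += value;  // the lambda body becomes an instance method
}

static class DefaultCaptureSketch
{
    // What you write: a lambda that captures the local n.
    public static int WithLambda()
    {
        int n = 0;
        Action<int> add = value => n += value;
        add(1);
        return n;
    }

    // Roughly what the compiler turns it into: the local lives in a heap object
    // and the delegate points at a method of that object.
    public static int WithDisplayClass()
    {
        var closure = new CounterDisplayClass();
        closure.n = 0;
        Action<int> add = closure.Add;
        add(1);
        return closure.n;
    }

    static void Main()
    {
        Console.WriteLine(WithLambda());       // 1
        Console.WriteLine(WithDisplayClass()); // 1
    }
}
```

Both methods allocate a helper object and a delegate, which is the cost the pointer-based struct is trying to avoid.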

On the other hand, if you DO NOT want to use closures at all, you can still have a specialized struct do the dirty work for you, like so:

using System;
using System.Runtime.CompilerServices;

static class ClosureCompare
{
    static void Main()
    {
       int num = 0;

       DisplayStruct adder = new DisplayStruct(ref num);

       adder.Add(1); // Perform addition operation

       Console.WriteLine(adder.GetResult()[0]); // Output the result: 1
    }
}

public readonly ref struct DisplayStruct
{
    // Single-element span that refers to the caller's variable
    private readonly Span<int> num;

    public DisplayStruct(ref int initialValue)
    {
        // Span<T>(ref T) constructor, public since .NET 7
        num = new Span<int>(ref initialValue);
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Add(int n)
    {
        if (num.Length > 0)
            num[0] += n;
        else
            throw new InvalidOperationException("Span is empty.");
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public Span<int> GetResult()
    {
        return num;
    }
}

Note the use of the ref modifier on this new struct. It both allows you to use a Span<int> as a struct field and prevents people from using this struct with the most common closures, because a ref struct variable, like a ref local, cannot be used inside an anonymous method, lambda expression, or query expression, as per the official documentation. You can try this code yourself; at least on my machine it was the fastest one on the latest LTS runtime. And of course, it's always recommended to run your own benchmarks when necessary, because the results below can vary from one machine to another:

// * Summary *

BenchmarkDotNet v0.13.7, Windows 11 (10.0.22631.3447)
AMD Ryzen 5 1600, 1 CPU, 12 logical and 6 physical cores
.NET SDK 8.0.204
  [Host]   : .NET 8.0.4 (8.0.424.16909), X64 RyuJIT AVX2
  .NET 7.0 : .NET 7.0.18 (7.0.1824.16914), X64 RyuJIT AVX2
  .NET 8.0 : .NET 8.0.4 (8.0.424.16909), X64 RyuJIT AVX2


|                Method |  Runtime |        Mean |     Error |    StdDev |      Median | Ratio | Rank |   Gen0 | Allocated |
|---------------------- |--------- |------------:|----------:|----------:|------------:|------:|-----:|-------:|----------:|
|  CustomClosureCapture | .NET 7.0 |   3.2424 ns | 0.0994 ns | 0.1104 ns |   3.2463 ns |  0.16 |    1 |      - |         - |
|             NoClosure | .NET 7.0 |  12.3134 ns | 0.2658 ns | 0.2356 ns |  12.2293 ns |  0.60 |    2 |      - |         - |
| DefaultClosureCapture | .NET 7.0 |  20.4729 ns | 0.4359 ns | 0.9840 ns |  20.2206 ns |  1.00 |    3 | 0.0017 |      88 B |
|                       |          |             |           |           |             |       |      |        |           |
|             NoClosure | .NET 8.0 |   0.0454 ns | 0.0299 ns | 0.0456 ns |   0.0337 ns | 0.002 |    1 |      - |         - |
|  CustomClosureCapture | .NET 8.0 |   3.1839 ns | 0.0956 ns | 0.1243 ns |   3.1440 ns | 0.149 |    2 |      - |         - |
| DefaultClosureCapture | .NET 8.0 |  21.5585 ns | 0.4636 ns | 0.4961 ns |  21.5625 ns | 1.000 |    3 | 0.0014 |      88 B |

How to avoid surprises with closures

It is better to understand how closure capture works in C# in order to avoid surprises. We already know that a closure is something that runs in a different scope or environment, whether it is a function or an expression. And a different scope or environment can also mean a different thread. That is when we should start to be more aware of how things work. Let's take a look at this code:

using System;
using System.Threading;

static class ClosureCompare
{
    static void Main()
    {
       int[] arr = [1, 2, 3, 4, 5];

       foreach(int n in arr)
       {
           ThreadPool.QueueUserWorkItem(_ => Console.Write(n));
       }

       // Wait a bit for the Thread Pool threads to do their work
       // as we are not joining the threads together again
       Thread.Sleep(2000);
    }
}

This code should run smoothly and, on most machines, two seconds should be enough for all the work items to be scheduled and to complete on time. It should output something like 12345 to the console, but not always in the same order, because the scheduling is not controlled by our code and depends on the availability of threads. So far so good, huh? And what about the next code snippet, now using a for loop?

using System;
using System.Threading;

static class ClosureCompare
{
    static void Main()
    {
       int[] arr = [1, 2, 3, 4, 5];

       for (int i = 0; i < arr.Length; i++)
       {
           // Without a per-iteration copy of i, its value is usually already 5
           // when the callback runs, leading to an out of range exception
           // thrown on a thread pool thread
           ThreadPool.QueueUserWorkItem(_ => Console.Write(arr[i]));
       }

       // Wait a bit for the Thread Pool threads to do their work
       // as we are not joining the threads together again
       Thread.Sleep(2000);
    }
}

But why do we get this error if we have correctly limited the for loop to the size of the array?

i < arr.Length

Well, we will see that the way these loops work is a little different. Microsoft changed the way the foreach loop works in version 5.0 of the language. According to the C# language specification [1], "The placement of v inside the while loop is important for how it is captured by any anonymous function occurring in the embedded_statement."

And if you take a look at the lowered C# code, you will see that the foreach loop is still capturing the variable by reference. But now the compiler creates a new instance of the helper class, holding a copy of the array item, for each iteration. Can you spot the difference in the code snippet below?

private static void Main()
{
    <>c__DisplayClass0_0 <>c__DisplayClass0_ = new <>c__DisplayClass0_0();
    int[] array = new int[5];
    RuntimeHelpers.InitializeArray(array, (RuntimeFieldHandle));
    <>c__DisplayClass0_.arr = array;
    int[] arr = <>c__DisplayClass0_.arr;
    int num = 0;
    // FOREACH LOOP TRANSLATED
    while (num < arr.Length)
    {
        <>c__DisplayClass0_1 <>c__DisplayClass0_2 = new <>c__DisplayClass0_1();
        <>c__DisplayClass0_2.n = arr[num]; // TAKING A COPY OF THE ARRAY ITEM HERE
        ThreadPool.QueueUserWorkItem(new WaitCallback(<>c__DisplayClass0_2.<Main>b__0));
        num++;
    }
    <>c__DisplayClass0_2 <>c__DisplayClass0_3 = new <>c__DisplayClass0_2();
    <>c__DisplayClass0_3.CS$<>8__locals1 = <>c__DisplayClass0_;
    <>c__DisplayClass0_3.i = 0;
    // FOR LOOP TRANSLATED
    while (<>c__DisplayClass0_3.i < <>c__DisplayClass0_3.CS$<>8__locals1.arr.Length)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(<>c__DisplayClass0_3.<Main>b__1));
        <>c__DisplayClass0_3.i++;
    }
    Thread.Sleep(2000);
}

It is interesting how both loops are translated into the same kind of while loop and yet still behave differently. The key point is to know how they work, because this is not as much of a problem as it may look. And the way the for loop works can possibly be more performant, if not degraded by JIT [2] compilation.

So, what's happening with this for loop is that, when it is translated into a while loop, every callback gets a reference to the same variable i of the helper class. This way, when the variable is incremented for the last time from 4 to 5, just before the final loop check, the callbacks that hold a reference to that same variable will try to access index 5 of an array of size 5, whose last valid index is 4.

And the value of the variable will almost always be 5, because by this point the loop has already incremented it 5 times. The loop runs much faster, in a matter of nanoseconds or even less, than the work needed to queue the callbacks on the thread pool, including, but not limited to, the scheduling of their execution.

To make it work as expected, we just need to copy the current value of the loop variable into a local declared inside the loop body, so that each closure captures its own copy instead of the shared variable, as shown in the code snippet below:
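The shared-variable behavior can also be reproduced without any threads, which removes the timing from the picture. The following is a minimal sketch, not code from the benchmark; LoopCaptureSketch and its method names are hypothetical. Each loop stores lambdas in a list and only invokes them after the loop has finished.

```csharp
using System;
using System.Collections.Generic;

static class LoopCaptureSketch
{
    public static int[] CaptureInFor()
    {
        var readers = new List<Func<int>>();
        for (int i = 0; i < 5; i++)
        {
            readers.Add(() => i); // every lambda shares the single variable i
        }
        // The loop is over, so each reader sees the final value of i.
        return readers.ConvertAll(r => r()).ToArray();
    }

    public static int[] CaptureInForeach()
    {
        int[] arr = { 1, 2, 3, 4, 5 };
        var readers = new List<Func<int>>();
        foreach (int n in arr)
        {
            readers.Add(() => n); // a fresh n per iteration (C# 5 and later)
        }
        return readers.ConvertAll(r => r()).ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureInFor()));     // 5,5,5,5,5
        Console.WriteLine(string.Join(",", CaptureInForeach())); // 1,2,3,4,5
    }
}
```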

using System;
using System.Threading;

static class ClosureCompare
{
    static void Main()
    {
       int[] arr = [1, 2, 3, 4, 5];

       for (int i = 0; i < arr.Length; i++)
       {
           // MAKING A COPY OF THE CURRENT INCREMENT
           // AND CAPTURING THE CLOSURE BY VALUE, INSTEAD OF REFERENCE
           int closureCapture = i;
           ThreadPool.QueueUserWorkItem(_ => Console.Write(arr[closureCapture]));
       }

       // Wait a bit for the Thread Pool threads to do their work
       // as we are not joining the threads together again
       Thread.Sleep(2000);
    }
}
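The Thread.Sleep call in these examples is only a guess at how long the work takes. One possible alternative, sketched here under the assumption that blocking the main thread is acceptable, is a CountdownEvent that each work item signals when it finishes. WaitForWorkSketch and Run are hypothetical names, and the results are collected in a ConcurrentBag instead of being written to the console so that they can be inspected afterwards.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

static class WaitForWorkSketch
{
    public static int[] Run()
    {
        int[] arr = { 1, 2, 3, 4, 5 };
        var seen = new ConcurrentBag<int>();

        // One signal is expected per queued work item.
        using var done = new CountdownEvent(arr.Length);

        for (int i = 0; i < arr.Length; i++)
        {
            int index = i; // per-iteration copy, as in the fix above
            ThreadPool.QueueUserWorkItem(_ =>
            {
                seen.Add(arr[index]);
                done.Signal();
            });
        }

        done.Wait(); // blocks until Signal has been called arr.Length times

        // Completion order is not deterministic, so sort before returning.
        return seen.OrderBy(x => x).ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // 1,2,3,4,5
    }
}
```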

And this is not a "problem" reserved for loops only. It can also happen with timers. And worse than that, it can happen the opposite way. Timers are low-level APIs and are not very commonly used because there are higher-level abstractions, such as BackgroundWorker [3], which offer more flexible APIs and a better experience. But let's check the simplified example below with a low-level Timer:

using System;
using System.Timers;
using System.Threading.Tasks;

using Timer = System.Timers.Timer;

class Program
{
    static int count = 0;
    static int[] items = [1, 2, 3];
    static TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();

    static async Task Main()
    {
        var timer = new Timer() { Interval = 100 };

        timer.Elapsed += (sender, e) => CronJob(sender, e,
            count); // CAPTURING COUNT BY VALUE

        timer.Enabled = true;
        await tcs.Task;

        timer.Stop();
        timer.Dispose();

        Console.WriteLine("Timer stopped.");
    }

    private static void CronJob(object? source, ElapsedEventArgs e, int count)
    {
        Console.WriteLine("Item: {0}", items[count]);

        count++;

        if (count == items.Length)
        {
            tcs.SetResult(true);
        }
    }
}

In this example, the variable count is passed to the CronJob function by value. The increment therefore only affects the local copy inside that function, the static field stays at 0, and the timer keeps firing forever.

To make this code work as expected, we just need the small ref keyword, which works the same way the & sign works in PHP, as we saw in the first blog post of this closures series. Please check the updated code below:

using System;
using System.Timers;
using System.Threading.Tasks;

using Timer = System.Timers.Timer;

class Program
{
    static int count = 0;
    static int[] items = [1, 2, 3];
    static TaskCompletionSource<bool> tcs = new TaskCompletionSource<bool>();

    static async Task Main()
    {
        var timer = new Timer() { Interval = 100 };

        timer.Elapsed += (sender, e) => CronJob(sender, e,
            ref count); // CAPTURING COUNT NOW BY REFERENCE

        timer.Enabled = true;
        await tcs.Task;

        timer.Stop();
        timer.Dispose();

        Console.WriteLine("Timer stopped.");
    }

    private static void CronJob(object? source, ElapsedEventArgs e,
        ref int count) // CAPTURING COUNT NOW BY REFERENCE
    {
        Console.WriteLine("Item: {0}", items[count]);

        count++;

        if (count == items.Length)
        {
            tcs.SetResult(true);
        }
    }
}
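The difference between the two timer versions comes down to how the argument is passed, and it can be isolated from the timer entirely. The sketch below is an illustration with hypothetical names (RefParameterSketch, IncrementCopy, IncrementInPlace) and uses a captured local instead of a static field.

```csharp
using System;

static class RefParameterSketch
{
    // Receives a copy: the caller's variable is untouched.
    static void IncrementCopy(int value) => value++;

    // Receives a reference: the caller's variable is modified.
    static void IncrementInPlace(ref int value) => value++;

    public static (int afterCopy, int afterRef) Run()
    {
        int count = 0;
        Action byValue = () => IncrementCopy(count);
        Action byRef = () => IncrementInPlace(ref count);

        byValue();
        byValue();
        int afterCopy = count; // still 0

        byRef();
        byRef();
        int afterRef = count;  // now 2

        return (afterCopy, afterRef);
    }

    static void Main()
    {
        var (afterCopy, afterRef) = Run();
        Console.WriteLine(afterCopy); // 0
        Console.WriteLine(afterRef);  // 2
    }
}
```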

Of course, using low-level APIs in the development of enterprise applications is not recommended unless it is really necessary. Low-level code is more error-prone and, among other things, tends to hurt readability. The previous code example was just simulating the problems that can arise when we don't really know how things work. With this, I hope to have helped more people understand a little more about scope, closures and composition, and also how to take advantage of them. Happy coding!

Disclaimer

It's worth noting that I'm not a Microsoft employee. All opinions in this blog post are my own. The information displayed here is not endorsed by Microsoft, the .NET Foundation or any of their partners. This is not a sponsored post. All rights reserved.

  1. C# language specification, Microsoft Learn. Retrieved April 25, 2024.
  2. Managed execution process, Microsoft Learn. Retrieved April 25, 2024.
  3. BackgroundWorker Class, Microsoft Learn. Retrieved April 25, 2024.