8 Techniques to Avoid GC Pressure and Improve Performance in C#

In a .NET application, memory and performance are very much linked. Poor memory management can hurt performance in many ways. One such effect is called GC Pressure or Memory Pressure.

GC Pressure (garbage collector pressure) is when the GC doesn’t keep up with memory deallocations. When the GC is pressured, it will spend more time garbage collecting, and these collections will come more frequently. When your app spends more time garbage collecting, it spends less time executing code, thus directly hurting performance.

If you’re not familiar with garbage collector fundamentals, I suggest reading this article first.

This article will show 8 techniques to minimize GC pressure, and by doing so, improve performance.

1. Set initial capacity for dynamic collections

.NET provides a lot of great collection types like List<T>, Dictionary<TKey, TValue>, and HashSet<T>. All those collections have dynamic size capacity. That means they automatically expand in size as you add more items.

While this functionality is very convenient, it’s not great for memory management. Whenever the collection reaches its size limit, it will allocate a new larger memory buffer (usually an array double in size). That means an additional allocation and deallocation.

Check out this benchmark:

[Benchmark]
public void ListDynamicCapacity()
{
    List<int> list = new List<int>();
    for (int i = 0; i < Size; i++)
    {
        list.Add(i);
    }
}

[Benchmark]
public void ListPlannedCapacity()
{
    List<int> list = new List<int>(Size);
    for (int i = 0; i < Size; i++)
    {
        list.Add(i);
    }
}

I’m using BenchmarkDotNet here with [Host]: .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT

In the first method, the List collection started with default capacity and expanded in size. In the second benchmark, I set the initial capacity to the number of items it's going to have.

For 1000 items, the results were:

Method               Mean      Error      StdDev
ListDynamicCapacity  3.415 us  0.0687 us  0.1240 us
ListPlannedCapacity  2.422 us  0.0219 us  0.0183 us

By setting the capacity, we cut execution time by roughly 30%. In practice, the improvement is probably even greater because BenchmarkDotNet performs GC collections before and after each benchmark run, so part of the collection cost is not reflected in these numbers.

I performed another benchmark for Dictionary and HashSet, with similar results:

Method                     Mean       Error      StdDev
DictionaryDynamicCapacity  36.693 us  0.7505 us  1.4637 us
DictionaryPlannedCapacity  17.500 us  0.3325 us  0.3696 us
HashSetDynamicCapacity     28.080 us  0.4264 us  0.3780 us
HashSetPlannedCapacity     16.533 us  0.3285 us  0.3374 us
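
To see the buffer growth described in this section directly, here is a minimal sketch that counts how often a List<int> replaces its backing array while items are added. The class and method names (CapacityGrowthDemo, CountResizes) are made up for illustration, and the exact growth policy is an implementation detail of the runtime, so treat the resize count as indicative only.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityGrowthDemo
{
    // Adds `count` items and returns how many times the list's capacity changed,
    // i.e. how many times a new backing array had to be allocated.
    public static int CountResizes(List<int> list, int count)
    {
        int resizes = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }
        return resizes;
    }
}
```

Calling CountResizes(new List<int>(1000), 1000) should report no resizes, while CountResizes(new List<int>(), 1000) reports several, each one an extra allocation plus a discarded buffer for the GC to clean up.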

2. Use ArrayPool for short-lived large arrays

Allocation of arrays and the inevitable de-allocation can be quite costly. Performing these allocations in high frequency will cause GC pressure and hurt performance. An elegant solution is the System.Buffers.ArrayPool<T> class, found in the System.Buffers NuGet package.

The idea is pretty similar to the ThreadPool. A shared buffer for arrays is allocated, which you can reuse without actually allocating and de-allocating memory. The basic usage is by calling ArrayPool<T>.Shared.Rent(size). This returns a regular array, which you can use any way you please. When finished, call ArrayPool<int>.Shared.Return(array) to return the buffer back to the shared pool.

Here’s a benchmark showing this:

[Benchmark]
public void RegularArray()
{
    int[] array = new int[ArraySize];
}

[Benchmark]
public void SharedArrayPool()
{
    var pool = ArrayPool<int>.Shared;
    int[] array = pool.Rent(ArraySize);
    pool.Return(array);
}

For 100 integers the results are:

Method           Mean      Error      StdDev
RegularArray     41.23 ns  0.8544 ns  2.236 ns
SharedArrayPool  47.42 ns  0.9781 ns  1.087 ns

Pretty similar, but when running for 1,000 integers:

Method           Mean       Error     StdDev
RegularArray     404.53 ns  8.074 ns  18.872 ns
SharedArrayPool  51.71 ns   1.354 ns  1.505 ns

As you can imagine, the ArrayPool allocation time stays the same, whereas regular allocation time increases as the size grows.

Much like the ThreadPool with threads, the ArrayPool should be used for short-lived large arrays. For more info on the ArrayPool, read Adam Sitnik's excellent blog post.
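
Outside of a benchmark, the Rent/Return pair is usually wrapped so that the buffer goes back to the pool even if the work in between throws. The following is a minimal sketch of that pattern; ArrayPoolDemo and SumOfSquares are hypothetical names chosen for illustration. One detail worth knowing is that Rent may hand back an array longer than the requested size, so the code tracks the requested count itself instead of relying on buffer.Length.

```csharp
using System;
using System.Buffers;

public static class ArrayPoolDemo
{
    // Computes 0^2 + 1^2 + ... + (count-1)^2 using a rented scratch buffer.
    public static long SumOfSquares(int count)
    {
        int[] buffer = ArrayPool<int>.Shared.Rent(count);
        try
        {
            // The rented array can be larger than `count`, so iterate up to `count` only.
            for (int i = 0; i < count; i++)
            {
                buffer[i] = i * i;
            }

            long sum = 0;
            for (int i = 0; i < count; i++)
            {
                sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Always hand the buffer back, even if the code above throws.
            ArrayPool<int>.Shared.Return(buffer);
        }
    }
}
```

After Return is called, the array must not be touched again, because the pool may hand it to another caller.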

3. Use Structs instead of Classes (sometimes)

Structs have several benefits when it comes to deallocation:

  • When structs are not part of a class, they are allocated on the Stack and don’t require garbage collection at all (stack unwinding).
  • Structs are stored on the heap when they are part of a class (or any reference-type). In that case, they are stored inline and are deallocated when the containing type is deallocated. Inline means the struct's data is stored as-is, as opposed to a reference type, where a pointer is stored to another location on the heap with the actual data. This is especially meaningful in collections, where a collection of structs is much cheaper to de-allocate because it's just one buffer of memory.
  • Structs take less memory than a reference type because they don’t have an ObjectHeader and a MethodTable.

In most cases, you will want to use classes. Use structs when all of the following are true (full guidelines from Microsoft):

  • The struct size is less than or equal to 16 bytes (e.g., 4 integers). Above that size, classes are more efficient than structs.
  • The struct is short-lived.
  • The struct is immutable.
  • The struct will not have to be boxed frequently.

In addition, structs are passed by value. So when you pass a struct as a method parameter, it is copied entirely. Copying is expensive and can hurt performance instead of improving it.
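
As an illustration of a struct that fits these guidelines, here is a small sketch of an immutable two-integer value type (8 bytes of data). Point2D and Offset are hypothetical names used only for this example. The readonly modifier (available since C# 7.2) lets the compiler enforce immutability.

```csharp
public readonly struct Point2D
{
    public Point2D(int x, int y)
    {
        X = x;
        Y = y;
    }

    public int X { get; }
    public int Y { get; }

    // Instead of mutating, return a new value. The struct is small, so the copy is cheap.
    public Point2D Offset(int dx, int dy) => new Point2D(X + dx, Y + dy);
}
```

Because the type is small and immutable, the by-value copies made when passing it around stay cheap and cannot cause surprises from modifying a copy by mistake.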

Here’s a benchmark that shows how efficient allocating structs can be:

class VectorClass
{
    public int X { get; set; }
    public int Y { get; set; }
}

struct VectorStruct
{
    public int X { get; set; }
    public int Y { get; set; }
}

private const int ITEMS = 10000;

[Benchmark]
public void WithClass()
{
    VectorClass[] vectors = new VectorClass[ITEMS];
    for (int i = 0; i < ITEMS; i++)
    {
        vectors[i] = new VectorClass();
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

[Benchmark]
public void WithStruct()
{
    VectorStruct[] vectors = new VectorStruct[ITEMS];
    // At this point all the vectors instances are already allocated with default values
    for (int i = 0; i < ITEMS; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

Result:

Method      Mean      Error      StdDev
WithClass   77.97 us  1.5528 us  2.6785 us
WithStruct  12.97 us  0.2564 us  0.6094 us

As you can see, the struct allocation is about 6 times faster than the class allocation.

4. Avoid Finalizers

Finalizers in C# are very expensive for several reasons:

  • Any object with a finalizer is automatically promoted a generation by the garbage collector. This means it can't be garbage collected in Gen 0, which is the fastest generation.
  • The finalizer is placed in a Finalizer Queue, handled by a single dedicated thread. This can cause problems if some finalizer runs for a long time or throws an exception.

To prove how terrible finalizers can be for performance, consider the following benchmark:

class Simple
{
    public int X { get; set; }
}

class SimpleWithFinalizer
{
    ~SimpleWithFinalizer() { }
    public int X { get; set; }
}

private int ITEMS = 100000;
private static Simple _instance1;
private static SimpleWithFinalizer _instance2;

[Benchmark]
public void AllocateSimple()
{
    for (int i = 0; i < ITEMS; i++)
    {
        _instance1 = new Simple();
    }
}

[Benchmark]
public void AllocateSimpleWithFinalizer()
{
    for (int i = 0; i < ITEMS; i++)
    {
        _instance2 = new SimpleWithFinalizer();
    }
}

The result for 100,000 items is:

Method                       Mean          Error        StdDev
AllocateSimple               409.9 us      9.063 us     17.24 us
AllocateSimpleWithFinalizer  128,796.8 us  2,520.871 us 2,588.75 us

The measuring unit 'us' stands for microseconds. 1000 us = 1 millisecond.

As you can see, there's a ratio of roughly 1:314 in favor of classes without finalizers.

Sometimes, finalizers are unavoidable. For example, they are often used in the Dispose Pattern. In such cases, make sure to suppress the finalizer when it's no longer required, like this:

public void Dispose()
{
    Dispose(true); // the actual dispose functionality
    GC.SuppressFinalize(this); // now, the finalizer won't be called
}
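
For context, here is a minimal sketch of how that Dispose method typically sits inside the full Dispose Pattern. The ResourceHolder class and its IsDisposed property are hypothetical and exist only for illustration. The comments mark where real cleanup code would go.

```csharp
using System;

public class ResourceHolder : IDisposable
{
    private bool _disposed;

    // Exposed only so the example can be observed from outside.
    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(true);             // the actual dispose functionality
        GC.SuppressFinalize(this); // the finalizer is no longer needed
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return; // calling Dispose more than once is safe
        }

        if (disposing)
        {
            // Release managed resources here (only when called from Dispose()).
        }

        // Release unmanaged resources here (in both paths).
        _disposed = true;
    }

    // Safety net: runs only if Dispose() was never called.
    ~ResourceHolder()
    {
        Dispose(false);
    }
}
```

When callers dispose the object properly, for instance with a using statement, the finalizer is suppressed and the object avoids the extra cost described above.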

5. Use StackAlloc for short-lived array allocations

The stackalloc keyword in C# allows for very fast allocation and deallocation of a block of memory on the stack. It works only with unmanaged types. That is, classes won't work, but primitives and structs that contain no reference-type fields are supported. Here's an example benchmark:

struct VectorStruct
{
    public int X { get; set; }
    public int Y { get; set; }
}

[Benchmark]
public void WithNew()
{
    VectorStruct[] vectors = new VectorStruct[5];
    for (int i = 0; i < 5; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

[Benchmark]
public unsafe void WithStackAlloc() // Note that unsafe context is required
{
    VectorStruct* vectors = stackalloc VectorStruct[5];
    for (int i = 0; i < 5; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

[Benchmark]
public void WithStackAllocSpan() // When using Span, no need for unsafe context
{
    Span<VectorStruct> vectors = stackalloc VectorStruct[5];
    for (int i = 0; i < 5; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

The results are:

Method              Mean       Error      StdDev
WithNew             10.372 ns  0.1531 ns  0.1432 ns
WithStackAlloc      5.704 ns   0.0938 ns  0.0831 ns
WithStackAllocSpan  5.742 ns   0.0965 ns  0.1021 ns

stackalloc is about twice as fast as regular instantiation. When increasing the number of items from 5 to 100, the difference is even greater – 82ns : 36ns.

Prefer Span<T> over a raw pointer, since it doesn't require an unsafe context.

Learn more about stackalloc here.
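
Since stack space is limited, a common precaution is to use stackalloc only for small buffers and fall back to a regular heap array otherwise. Below is a minimal sketch of that idea. The names StackAllocDemo, MaxStackInts, and SumOfSquares are hypothetical, and the 128-element threshold is an arbitrary assumption, not a recommendation from the benchmark above.

```csharp
using System;

public static class StackAllocDemo
{
    // Arbitrary illustrative limit: 128 ints = 512 bytes on the stack.
    private const int MaxStackInts = 128;

    // Computes 0^2 + 1^2 + ... + (count-1)^2. Assumes count >= 0.
    public static long SumOfSquares(int count)
    {
        // Small buffers live on the stack; larger ones fall back to the heap.
        Span<int> buffer = count <= MaxStackInts
            ? stackalloc int[count]
            : new int[count];

        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i * i;
        }

        long sum = 0;
        for (int i = 0; i < buffer.Length; i++)
        {
            sum += buffer[i];
        }
        return sum;
    }
}
```

Because both branches produce a Span<int>, the rest of the method is identical regardless of where the memory came from.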

6. Use StringBuilder, but not always

Strings are immutable. As such, they cannot change. Any concatenation like str1 = str1 + str2 will allocate a new object. To prevent these new allocations and improve performance, the StringBuilder class was created.

I recently wrote a blog post on StringBuilder performance and found out that things were not as simple as they might seem. Here’s the summary of my research:

  • Regular concatenations are more efficient than StringBuilder for a small number of concatenations. Depending on string sizes, using StringBuilder becomes more efficient with over 10-15 concatenations.
  • StringBuilder can be optimized by setting its initial capacity.
  • StringBuilder can be optimized by reusing the same instance. This can make a difference for very frequent usages like logging.

For more information, read the full article: Challenging the C# StringBuilder Performance
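
The last two points in the list above (initial capacity and instance reuse) can be sketched as follows. LineFormatter and its Format method are hypothetical names, and the capacity of 256 is an arbitrary guess at a typical line length. The sketch is not thread-safe, since a single StringBuilder is shared across calls.

```csharp
using System.Text;

public sealed class LineFormatter
{
    // One builder, created once with a planned capacity and reused for every call.
    private readonly StringBuilder _builder = new StringBuilder(capacity: 256);

    public string Format(string level, string message)
    {
        _builder.Clear(); // reset the length so the same instance can be reused
        _builder.Append('[').Append(level).Append("] ").Append(message);
        return _builder.ToString();
    }
}
```

For example, new LineFormatter().Format("INFO", "started") produces "[INFO] started", and repeated calls reuse the same builder instead of allocating a new one each time.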

7. Use String Interning in very specific cases

About 60% of the human body is water. Similarly, about 70% of a .NET application is strings. This makes optimizing strings one of the most important aspects of memory management.

The .NET runtime has a hidden optimization. For literal strings with the same value, it uses the same reference. For example, consider the following code:

string a = "Table";
string b = "Table";

It seems like a and b will be allocated to 2 different objects. But the CLR will allocate just 1 object, which both a and b will reference. This optimization is called String Interning. There are 2 positive side effects to this:

  1. You save memory by using just 1 object.
  2. It's cheaper to compare between the strings. A comparison first checks for reference equality. Since both a and b reference the same object, the comparison will return true without actually checking the string contents.

This optimization is done just for string literals, for example, when you write something like string myString = "Something". It's not done for strings that are calculated at runtime. The reason is that string interning is expensive. When interning a new string, the runtime has to look for an identical string in memory to find a match. This is obviously expensive and just not done.

As it happens, you can perform string interning manually. This is done with the string.Intern(string) method. And you can check if a string is already interned with string.IsInterned(string). In very specific cases, you can use this for optimization. Here's one example:

private string s1 = "Hello";
private string s2 = " World";

[Benchmark]
public void WithoutInterning()
{
    string s1 = GetNonLiteral();
    string s2 = GetNonLiteral();
    for (int i = 0; i < Size; i++)
    {
        bool x = s1.Equals(s2);
    }
}

[Benchmark]
public void WithInterning()
{
    string s1 = string.Intern(GetNonLiteral());
    string s2 = string.Intern(GetNonLiteral());
    for (int i = 0; i < Size; i++)
    {
        bool x = s1.Equals(s2);
    }
}

private string GetNonLiteral()
{
    return s1 + s2;
}

For 100 items this benchmark will return:

Method            Mean      Error     StdDev     Median
WithoutInterning  198.3 ns  3.986 ns  10.776 ns  201.5 ns
WithInterning     424.4 ns  8.426 ns  8.653 ns   421.0 ns

And for 10,000 items:

Method            Mean      Error      StdDev
WithoutInterning  68.06 us  0.6225 us  0.5198 us
WithInterning     16.11 us  0.3288 us  0.3075 us

As you can see, this can be very effective when the number of comparisons is much larger than the number of intern operations. These cases are very rare. If you do consider interning, do some benchmarking to make sure you are actually optimizing anything.

Note that an interned string will never be garbage collected. It might make more sense to create a local string-pool of your own. You can see Jon Skeet’s answer on StackOverflow where he explains this point further and even shows an implementation example.
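
To make the reference-sharing behavior from this section concrete, here is a small sketch that checks reference equality for literals, for a string built at run time, and for the same string after string.Intern. The InterningDemo class and its method names are hypothetical. It assumes the default runtime behavior where string literals are interned.

```csharp
using System;

public static class InterningDemo
{
    // Two identical literals are interned, so both variables point to one object.
    public static bool LiteralsShareReference()
    {
        string a = "Table";
        string b = "Table";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is a separate object, even with identical contents.
    public static bool RuntimeStringSharesReference()
    {
        string literal = "Table";
        string built = new string(new[] { 'T', 'a', 'b', 'l', 'e' });
        return object.ReferenceEquals(literal, built);
    }

    // string.Intern returns the pooled instance, which is the same object as the literal.
    public static bool InternedStringSharesReference()
    {
        string literal = "Table";
        string built = new string(new[] { 'T', 'a', 'b', 'l', 'e' });
        return object.ReferenceEquals(literal, string.Intern(built));
    }
}
```

Under that assumption, the first and third methods return true and the second returns false, which is why comparisons between interned strings can short-circuit on the reference check.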

8. Avoid memory leaks

Memory leaks are a constant troublemaker in any big application. Besides the obvious danger of an eventual out-of-memory exception, memory leaks also cause GC Pressure and performance issues. Here’s how:

  • With a memory leak, objects remain referenced, even when they are effectively unused. While referenced, the garbage collector will keep promoting them to higher generations instead of collecting them. These promotions are expensive and add work for the GC.
  • Memory leaks cause more memory to be in use. This means you will run out of free space quicker, causing the GC to do more frequent collections.

Memory leaks are a huge subject. Here are 2 resources you can take advantage of to learn more:

  • 8 Ways You can Cause Memory Leaks in .NET
  • Find, Fix, and Avoid Memory Leaks in C# .NET: 8 Best Practices

Summary

I hope you got value from the mentioned tips and tricks. You probably noticed that all of the above optimizations make use of one or more of these core concepts:

  • Allocations should be avoided if possible.
  • Reusing memory is better than allocating new memory.
  • Allocating on the Stack is faster than allocating on the Heap.

These are not the only concepts in performance optimizations, but probably the most important ones when it comes to GC pressure.

Happy coding.
