C#: Spinlocks, How Useful Are They?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/1456225/
Asked by
How often do you find yourself actually using spinlocks in your code? How common is it to come across a situation where using a busy loop actually outperforms the usage of locks?
Personally, when I write some sort of code that requires thread safety, I tend to benchmark it with different synchronization primitives, and as far as it goes, it seems like using locks gives better performance than using spinlocks. No matter for how little time I actually hold the lock, the amount of contention I receive when using spinlocks is far greater than the amount I get from using locks (of course, I run my tests on a multiprocessor machine).
I realize that it's more likely to come across a spinlock in "low-level" code, but I'm interested to know whether you find it useful in even a more high-level kind of programming?
Accepted answer by jalf
It depends on what you're doing. In general application code, you'll want to avoid spinlocks.
In low-level stuff where you'll only hold the lock for a couple of instructions, and latency is important, a spinlock may be a better solution than a lock. But those cases are rare, especially in the kind of applications where C# is typically used.
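To make "hold the lock for a couple of instructions" concrete, here is a minimal sketch of a test-and-set spinlock. It is written in C++ rather than C# purely for illustration; the names `SpinLock` and `count_with_spinlock` are made up for this example and do not come from the answer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Minimal test-and-set spinlock (hypothetical name, illustration only).
class SpinLock {
public:
    void lock() {
        // test_and_set returns the previous value; spin while it was already set.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Starts `threads` workers that each increment a shared counter `iterations`
// times, holding the lock only for the single increment.
long count_with_spinlock(int threads, int iterations) {
    assert(threads >= 0 && iterations >= 0);
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // the whole critical section
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling `count_with_spinlock(4, 10000)` should return 40000 if the lock is doing its job. Whether this beats an ordinary lock is exactly the kind of thing you would have to measure.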
Answer by Remus Rusanu
My 2c: If your updates satisfy some access criteria then they are good spinlock candidates:
- fast, i.e. you will have time to acquire the spinlock, perform the updates and release the spinlock in a single thread quantum, so that you don't get preempted while holding the spinlock
- localized: all the data you update is preferably in one single page that is already loaded; you do not want a TLB miss while you are holding the spinlock, and you definitely don't want a page fault that triggers a swap read!
- atomic: you do not need any other lock to perform the operation, i.e. never wait for locks under a spinlock.
For anything that has any potential to yield, you should use a notified lock structure (events, mutex, semaphores etc).
Answer by nos
You hardly ever need to use spinlocks in application code; if anything, you should avoid them.
I can't think of any reason to use a spinlock in C# code running on a normal OS. Busy locks are mostly a waste at the application level: the spinning can cause you to use the entire CPU timeslice, whereas a lock will immediately cause a context switch if needed.
High-performance code where the number of threads equals the number of processors/cores might benefit in some cases, but if you need performance optimization at that level, you're likely making a next-gen 3D game, working on an embedded OS with poor synchronization primitives, creating an OS/driver, or in any case not using C#.
Answer by T.E.D.
For my realtime work, particularly with device drivers, I've used them a fair bit. It turns out that (when I last timed this) waiting for a sync object like a semaphore tied to a hardware interrupt chews up at least 20 microseconds, no matter how long it actually takes for the interrupt to occur. A single check of a memory-mapped hardware register, followed by a check of RDTSC (to allow for a time-out so you don't lock up the machine), is in the high nanosecond range (basically down in the noise). For hardware-level handshaking that shouldn't take much time at all, it is really tough to beat a spinlock.
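The poll-with-time-out pattern described above can be sketched roughly as follows. This is only an illustration in portable C++: a `std::atomic<bool>` stands in for the memory-mapped hardware register, and `std::chrono::steady_clock` stands in for RDTSC. Both substitutions, and the function name `spin_until_set`, are assumptions made for this example and not the driver code the answer refers to.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

// Polls `ready` until it becomes true or `timeout` elapses.
// Returns true if the flag was observed set, false on time-out.
bool spin_until_set(const std::atomic<bool>& ready,
                    std::chrono::microseconds timeout) {
    assert(timeout.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!ready.load(std::memory_order_acquire)) {
        // Check the clock on every pass so a stuck device cannot hang us.
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
    }
    return true;
}
```

The flag is checked before the clock, so a flag that is already set returns immediately without paying for a clock read.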
Answer by Reed Copsey
In C#, "Spin locks" have been, in my experience, almost always worse than taking a lock - it's a rare occurrence where spin locks will outperform a lock.
However, that's not always the case. .NET 4 is adding a System.Threading.SpinLock structure. This provides benefits in situations where a lock is held for a very short time and is grabbed repeatedly. From the MSDN docs on Data Structures for Parallel Programming:
In scenarios where the wait for the lock is expected to be short, SpinLock offers better performance than other forms of locking.
Spin locks can outperform other locking mechanisms in cases where you're doing something like locking through a tree: if you only hold the lock on each node for a very, very short period of time, they can outperform a traditional lock. I ran into this at one point in a rendering engine with a multithreaded scene update, where spin locks profiled out to outperform locking with Monitor.Enter.
Answer by Ben
Please note the following points:
- Most mutex implementations spin for a little while before the thread is actually unscheduled. Because of this, it is hard to compare these mutexes with pure spinlocks.
- Several threads spinning "as fast as possible" on the same spinlock will consume all the bandwidth and drastically decrease your program's efficiency. You need to add a tiny "sleeping" time by adding a no-op in your spinning loop.
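As a rough illustration of the second point, the sketch below waits on a plain read while the lock is held and backs off periodically. It is C++ for illustration only; it uses `std::this_thread::yield()` as a portable stand-in for a CPU-level no-op or pause instruction, which is a swapped-in technique and not what the answer literally describes. The names `BackoffSpinLock`, `kSpinsBeforeYield` and `count_with_backoff`, and the threshold of 64, are arbitrary assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class BackoffSpinLock {
public:
    void lock() {
        int spins = 0;
        for (;;) {
            // Attempt to acquire; exchange returns the previous state.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // While held by someone else, wait on a read-only load so the
            // waiters do not keep writing to the shared cache line.
            while (locked_.load(std::memory_order_relaxed)) {
                if (++spins >= kSpinsBeforeYield) {
                    std::this_thread::yield();  // back off instead of hammering
                    spins = 0;
                }
            }
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    static constexpr int kSpinsBeforeYield = 64;  // arbitrary, tune by measuring
    std::atomic<bool> locked_{false};
};

// Same counting workload as a plain spinlock test, for checking correctness.
long count_with_backoff(int threads, int iterations) {
    assert(threads >= 0 && iterations >= 0);
    BackoffSpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Spinning briefly and then giving up the CPU is also, loosely, what the first point says many mutex implementations already do internally, which is part of why the comparison is murky.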
Answer by Joren
If you have performance-critical code and you have determined that it needs to be faster than it currently is and you have determined that the critical factor is the lock speed, then it'd be a good idea to try a spinlock. In other cases, why bother? Normal locks are easier to use correctly.
Answer by Jon Harrop
I used spin locks for the stop-the-world phase of the garbage collector in my HLVM project because they are easy and that is a toy VM. However, spin locks can be counter-productive in that context:
One of the perf bugs in the Glasgow Haskell Compiler's garbage collector is so annoying that it has a name, the "last core slowdown". This is a direct consequence of their inappropriate use of spinlocks in their GC and is exacerbated on Linux due to its scheduler but, in fact, the effect can be observed whenever other programs are competing for CPU time.
The effect is clear in the second graph here and can be seen affecting more than just the last core here, where the Haskell program sees performance degradation beyond only 5 cores.
Answer by dsimcha
One use case for spin locks is when you expect very low contention but are going to have a lot of locks. If you don't need support for recursive locking, a spinlock can be implemented in a single byte, and if contention is very low then the CPU cycle waste is negligible.
For a practical use case, I often have arrays of thousands of elements, where updates to different elements of the array can safely happen in parallel. The odds of two threads trying to update the same element at the same time are very small (low contention) but I need one lock for every element (I'm going to have a lot of them). In these cases, I usually allocate an array of ubytes of the same size as the array I'm updating in parallel and implement spinlocks inline as (in the D programming language):
while(!atomicCasUbyte(spinLocks[i], 0, 1)) {}
myArray[i] = newVal;
atomicSetUbyte(spinLocks[i], 0);
On the other hand, if I had to use regular locks, I would have to allocate an array of pointers to Objects, and then allocate a Mutex object for each element of this array. In scenarios such as the one described above, this is just plain wasteful.
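For readers who don't use D, roughly the same one-byte-lock-per-element idea might look like this in C++. This is a sketch under assumptions: `LockedArray` and `parallel_add` are names invented here, the update is an addition rather than an assignment so that the result can be checked, and `std::atomic<std::uint8_t>` is typically, but not guaranteed to be, one byte in size.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// An array of values with one byte-sized spinlock per element (0 = free, 1 = held).
struct LockedArray {
    explicit LockedArray(std::size_t n) : values(n, 0), locks(n) {
        for (auto& l : locks) l.store(0, std::memory_order_relaxed);
    }

    void add(std::size_t i, long delta) {
        assert(i < values.size());
        std::uint8_t expected = 0;
        // Spin until we swap 0 -> 1 on this element's lock byte.
        while (!locks[i].compare_exchange_weak(expected, 1,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed)) {
            expected = 0;  // a failed CAS overwrote it with the observed value
        }
        values[i] += delta;
        locks[i].store(0, std::memory_order_release);
    }

    std::vector<long> values;
    std::vector<std::atomic<std::uint8_t>> locks;
};

// Each worker adds 1 to the elements in round-robin order; returns the final values.
std::vector<long> parallel_add(std::size_t size, int threads, int iterations) {
    LockedArray arr(size);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&arr, size, iterations] {
            for (int i = 0; i < iterations; ++i) {
                arr.add(static_cast<std::size_t>(i) % size, 1);
            }
        });
    }
    for (auto& w : workers) w.join();
    return arr.values;
}
```

With 8 elements, 4 threads and 8000 iterations per thread, every element should end up at 4000. The lock storage is one small atomic per element, with no separately allocated mutex objects.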
Answer by Rahul Gupta
Always keep these points in your mind while using spinlocks:
- Fast user mode execution.
- Synchronizes threads within a single process, or multiple processes if in shared memory.
- Does not return until the object is owned.
- Does not support recursion.
- Consumes 100% of CPU while "waiting".
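The "does not return until owned" and "does not support recursion" points combine badly: a thread that tries to take a simple spinlock it already holds will spin forever. The sketch below shows this with a non-blocking attempt so that it can actually terminate. It is illustrative C++, and `TrySpinLock` and `second_acquire_fails` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

class TrySpinLock {
public:
    // Single attempt; returns true if the lock was acquired.
    bool try_lock() { return !locked_.exchange(true, std::memory_order_acquire); }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};  // no owner or recursion count is tracked
};

// Returns true if a second acquisition attempt by the same thread is rejected.
bool second_acquire_fails() {
    TrySpinLock lock;
    const bool first = lock.try_lock();
    assert(first);
    // The lock has no idea who holds it, so the owner is refused as well.
    // A blocking lock() at this point would spin forever: a self-deadlock.
    const bool second = lock.try_lock();
    lock.unlock();
    return first && !second;
}
```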
I have personally seen so many deadlocks just because someone thought it would be a good idea to use a spinlock.
Be very, very careful while using spinlocks (I can't emphasize this enough).