When should you use a spinlock instead of a mutex?

Updated: 2023-10-26 15:12:52

I think both are doing the same job; how do you decide which one to use for synchronization?

The Theory

In theory, when a thread tries to lock a mutex and it does not succeed, because the mutex is already locked, it will go to sleep, immediately allowing another thread to run. It will continue to sleep until being woken up, which will be the case once the mutex is being unlocked by whatever thread was holding the lock before. When a thread tries to lock a spinlock and it does not succeed, it will continuously re-try locking it, until it finally succeeds; thus it will not allow another thread to take its place (however, the operating system will forcefully switch to another thread, once the CPU runtime quantum of the current thread has been exceeded, of course).
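
To make the difference concrete, here is a minimal sketch in C (using C11 atomics and POSIX threads; the function names are just illustrative, not a standard API):

```c
#include <stdatomic.h>
#include <pthread.h>

/* Spinlock (illustrative): a thread that fails to acquire it keeps the CPU
 * busy retrying until it finally succeeds. */
static atomic_flag spin = ATOMIC_FLAG_INIT;

static void with_spinlock(void) {
    while (atomic_flag_test_and_set_explicit(&spin, memory_order_acquire)) {
        /* busy-wait: keep retrying, never go to sleep */
    }
    /* ... critical section ... */
    atomic_flag_clear_explicit(&spin, memory_order_release);
}

/* Mutex: a thread that fails to acquire it is put to sleep by the kernel
 * and woken up again when the holder unlocks. */
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

static void with_mutex(void) {
    pthread_mutex_lock(&mtx);    /* may sleep if already locked */
    /* ... critical section ... */
    pthread_mutex_unlock(&mtx);  /* wakes a sleeping waiter, if any */
}
```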

The Problem

The problem with mutexes is that putting threads to sleep and waking them up again are both rather expensive operations; they need quite a lot of CPU instructions and thus also take some time. If the mutex was only locked for a very short amount of time, the time spent putting a thread to sleep and waking it up again might far exceed the time the thread actually slept, and it might even exceed the time the thread would have wasted by constantly polling a spinlock. On the other hand, polling a spinlock constantly wastes CPU time, and if the lock is held for a longer amount of time, this wastes a lot more CPU time than it would have cost to let the thread sleep instead.

The Solution

Using spinlocks on a single-core/single-CPU system usually makes no sense, since as long as the spinlock polling is blocking the only available CPU core, no other thread can run, and since no other thread can run, the lock won't be unlocked either. In other words, a spinlock only wastes CPU time on those systems, for no real benefit. If the thread was put to sleep instead, another thread could have run at once, possibly unlocking the lock and then allowing the first thread to continue processing once it woke up again.

On multi-core/multi-CPU systems, with plenty of locks that are held for a very short amount of time only, the time wasted constantly putting threads to sleep and waking them up again might decrease runtime performance noticeably. When spinlocks are used instead, threads get the chance to take advantage of their full runtime quantum (always blocking only for a very short time period, but then immediately continuing their work), leading to much higher processing throughput.

The Practice

Since very often programmers cannot know in advance if mutexes or spinlocks will be better (e.g. because the number of CPU cores of the target architecture is unknown), nor can operating systems know if a certain piece of code has been optimized for single-core or multi-core environments, most systems don't strictly distinguish between mutexes and spinlocks. In fact, most modern operating systems have hybrid mutexes and hybrid spinlocks. What does that actually mean?

A hybrid mutex behaves like a spinlock at first on a multi-core system. If a thread cannot lock the mutex, it won't be put to sleep immediately; since the mutex might get unlocked pretty soon, the mutex will first behave exactly like a spinlock. Only if the lock has still not been obtained after a certain amount of time (or number of retries, or any other measure) is the thread really put to sleep. If the same code runs on a system with only a single core, the mutex will not spin, though, since, as explained above, that would not be beneficial.
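
A rough sketch of that idea, assuming a POSIX environment (the spin limit here is arbitrary, and real hybrid mutexes, such as glibc's adaptive mutex type PTHREAD_MUTEX_ADAPTIVE_NP or futex-based locks, are considerably more sophisticated):

```c
#include <pthread.h>

#define SPIN_TRIES 1000  /* arbitrary bound, just for illustration */

static void hybrid_mutex_lock(pthread_mutex_t *m) {
    /* Phase 1: behave like a spinlock, retrying without sleeping. */
    for (int i = 0; i < SPIN_TRIES; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return;                    /* acquired while spinning */
    }
    /* Phase 2: still not acquired, so fall back to the blocking call,
     * which puts the thread to sleep until the mutex is unlocked. */
    pthread_mutex_lock(m);
}
```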

A hybrid spinlock behaves like a normal spinlock at first, but to avoid wasting too much CPU time, it may have a back-off strategy. It will usually not put the thread to sleep (since you don't want that to happen when using a spinlock), but it may decide to stop the thread (either immediately or after a certain amount of time) and allow another thread to run, thus increasing chances that the spinlock is unlocked (a pure thread switch is usually less expensive than one that involves putting a thread to sleep and waking it up again later on, though not by far).
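
A hybrid spinlock with such a back-off could be sketched like this (again only illustrative; sched_yield() gives up the rest of the time slice but keeps the thread runnable, it does not put it to sleep):

```c
#include <stdatomic.h>
#include <sched.h>

static void hybrid_spin_lock(atomic_flag *lock) {
    int tries = 0;
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
        if (++tries > 100) {   /* arbitrary back-off threshold */
            sched_yield();     /* let another thread run; we stay runnable */
            tries = 0;
        }
    }
}
```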

Summary

If in doubt, use mutexes; they are usually the better choice, and most modern systems will allow them to spin for a very short amount of time if this seems beneficial. Using spinlocks can sometimes improve performance, but only under certain conditions, and the fact that you are in doubt rather tells me that you are not currently working on any project where a spinlock might be beneficial. You might consider using your own "lock object" that can use either a spinlock or a mutex internally (e.g. this behavior could be configurable when creating such an object): initially use mutexes everywhere, and if you think that using a spinlock somewhere might really help, give it a try and compare the results (e.g. using a profiler), but be sure to test both cases, a single-core and a multi-core system, before you jump to conclusions (and possibly different operating systems, if your code will be cross-platform).
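
Such a configurable "lock object" could look roughly like this (a hypothetical sketch; the any_lock_t type and its functions are made up for this example):

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock object, configured at creation time to use either a
 * spinlock or a mutex internally. */
typedef struct {
    bool use_spinlock;
    pthread_mutex_t mtx;
    atomic_flag flag;
} any_lock_t;

static void any_lock_init(any_lock_t *l, bool use_spinlock) {
    l->use_spinlock = use_spinlock;
    pthread_mutex_init(&l->mtx, NULL);
    atomic_flag_clear(&l->flag);
}

static void any_lock_lock(any_lock_t *l) {
    if (l->use_spinlock) {
        while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire))
            ;                         /* spin */
    } else {
        pthread_mutex_lock(&l->mtx);  /* may sleep */
    }
}

static void any_lock_unlock(any_lock_t *l) {
    if (l->use_spinlock)
        atomic_flag_clear_explicit(&l->flag, memory_order_release);
    else
        pthread_mutex_unlock(&l->mtx);
}
```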

Update: A Warning for iOS

Actually not iOS specific, but iOS is the platform where most developers may face this problem: if your system has a thread scheduler that does not guarantee that every thread, no matter how low its priority may be, will eventually get a chance to run, then spinlocks can lead to permanent deadlocks. The iOS scheduler distinguishes different classes of threads, and threads in a lower class will only run if no thread in a higher class wants to run as well. There is no back-off strategy for this, so if you permanently have high-class threads available, low-class threads will never get any CPU time and thus never get any chance to perform any work.

The problem arises as follows: your code obtains a spinlock in a low-prio class thread, and while it is in the middle of that lock, its time quantum is exceeded and the thread stops running. The only way this spinlock can be released again is if that low-prio class thread gets CPU time again, but this is not guaranteed to happen. You may have a couple of high-prio class threads that constantly want to run, and the task scheduler will always prioritize those. One of them may run across the spinlock and try to obtain it, which isn't possible of course, and the system will make it yield. The problem is: a thread that yielded is immediately available for running again! Since it has a higher priority than the thread holding the lock, the thread holding the lock has no chance to get CPU runtime; either some other high-prio thread will get the runtime, or the one that just yielded.

Why does this problem not occur with mutexes? When the high-prio thread cannot obtain the mutex, it won't yield; it may spin a bit but will eventually be sent to sleep. A sleeping thread is not available for running until it is woken up by an event, e.g. the mutex it has been waiting for being unlocked. Apple is aware of that problem and has deprecated OSSpinLock as a result. The new lock is called os_unfair_lock. This lock avoids the situation mentioned above, as it is aware of the different thread priority classes. If you are sure that using spinlocks is a good idea in your iOS project, use that one. Stay away from OSSpinLock! And under no circumstances implement your own spinlocks in iOS! If in doubt, use a mutex! macOS is not affected by this issue, as it has a different thread scheduler that won't allow any thread (even low-prio threads) to "run dry" of CPU time; still, the same situation can arise there and will then lead to very poor performance, so OSSpinLock is deprecated on macOS as well.
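
For reference, basic usage of os_unfair_lock from <os/lock.h> looks like this (assuming you target iOS 10 / macOS 10.12 or later, where the API is available; "unfair" means waiters are not guaranteed any FIFO ordering):

```c
#include <os/lock.h>

static os_unfair_lock lock = OS_UNFAIR_LOCK_INIT;

static void critical_work(void) {
    os_unfair_lock_lock(&lock);   /* waiters don't starve the owner the way
                                     OSSpinLock waiters could */
    /* ... keep the critical section short ... */
    os_unfair_lock_unlock(&lock);
}
```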