Cache coherence protocols must enforce two rules:
- Write propagation: Writes eventually become visible to all processors.
- Write serialization: Writes to the same location are serialized (all processors see them in the same order).
How to ensure write propagation?
- Write-invalidate protocols: Invalidate all other cached copies before performing the write.
- Write-update protocols: Update all other cached copies after performing the write.
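The difference between the two strategies above can be sketched with a toy model in which each cache is a plain dictionary. This is only an illustration, not a real protocol: the function names and the dictionary representation are assumptions made for the example.

```python
# Toy model: each cache is a dict mapping address -> value.
# All names here are hypothetical and chosen only for illustration.

def write_invalidate(caches, writer, addr, value):
    """Invalidate every other cached copy, then perform the write."""
    for i, cache in enumerate(caches):
        if i != writer:
            cache.pop(addr, None)   # other copies become invalid (removed)
    caches[writer][addr] = value


def write_update(caches, writer, addr, value):
    """Perform the write, then push the new value to every other copy."""
    caches[writer][addr] = value
    for i, cache in enumerate(caches):
        if i != writer and addr in cache:
            cache[addr] = value     # other copies are refreshed in place


# Caches 0 and 1 both hold address 0x10; cache 2 does not.
inv = [{0x10: 1}, {0x10: 1}, {}]
write_invalidate(inv, writer=0, addr=0x10, value=2)

upd = [{0x10: 1}, {0x10: 1}, {}]
write_update(upd, writer=0, addr=0x10, value=2)
```

After the invalidating write only the writer still holds the address; after the updating write both original holders see the new value, and the cache that never held the address is left untouched.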
How to ensure write serialization?
- Snooping-based protocols: All caches observe each other’s actions through a shared bus.
- Directory-based protocols: A coherence directory tracks contents of private caches and serializes requests.
Snooping-Based Coherence
- Many processors run in parallel. Their caches are connected to one another through a shared bus, which in turn connects to main memory.
- On a cache hit, the cache returns the data. On a cache miss, the data is fetched from main memory.
- A snoopy cache watches (snoops on) the bus to keep all processors’ views of memory coherent, so each cache has to listen to both its processor and the shared bus.
How to achieve?
- The bus provides a serialization point: transactions are broadcast and totally ordered.
- Each cache controller “snoops” all bus transactions.
- The controller updates the state of the cache in response to processor and snoop events, and generates bus transactions.
- A snoopy protocol is a finite state machine (FSM), described by a state-transition diagram and its actions.
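A minimal sketch of the snooping mechanism described above, under simplifying assumptions: the bus handles one transaction at a time, and each controller simply records what it snoops. The class names `SnoopBus` and `CacheController` and the transaction names `BusRd` and `BusWr` are hypothetical.

```python
class SnoopBus:
    """Shared bus: one transaction at a time, seen by every attached cache."""

    def __init__(self):
        self.controllers = []
        self.log = []            # the single global order of bus transactions

    def attach(self, ctrl):
        self.controllers.append(ctrl)

    def broadcast(self, sender, kind, addr):
        self.log.append((sender.name, kind, addr))
        for ctrl in self.controllers:
            if ctrl is not sender:
                ctrl.snoop(kind, addr)


class CacheController:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.seen = []           # transactions observed by snooping
        bus.attach(self)

    def processor_request(self, kind, addr):
        # A processor event that needs the bus generates a bus transaction.
        self.bus.broadcast(self, kind, addr)

    def snoop(self, kind, addr):
        # A real controller would update its line state here.
        self.seen.append((kind, addr))


bus = SnoopBus()
c0, c1, c2 = (CacheController(f"C{i}", bus) for i in range(3))
c0.processor_request("BusRd", 0x40)
c1.processor_request("BusWr", 0x40)
```

Because every transaction passes through `bus.log` in one order, every controller that observes both transactions observes them in that same order, which is what makes the bus a serialization point.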
A Simple Protocol: Valid/Invalid (VI)
- VI drawbacks: every write updates main memory, and every write requires broadcast & snoop.
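The cost of VI can be made concrete with a small counter-based sketch. The notes only list the drawbacks, so the details here are assumptions: a write-through, write-allocate design in which a missing dictionary entry stands for the Invalid state. The class name `VISystem` is hypothetical.

```python
class VISystem:
    """Toy write-through Valid/Invalid system with n caches (assumed design)."""

    def __init__(self, n):
        self.memory = {}
        self.caches = [{} for _ in range(n)]   # address present == Valid
        self.bus_transactions = 0
        self.memory_writes = 0

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                  # Invalid: miss, go to memory
            self.bus_transactions += 1
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        self.bus_transactions += 1             # every write is broadcast
        for i, cache in enumerate(self.caches):
            if i != cpu:
                cache.pop(addr, None)          # snoopers go Valid -> Invalid
        self.caches[cpu][addr] = value
        self.memory[addr] = value              # write-through to memory
        self.memory_writes += 1


vi = VISystem(2)
for v in range(1, 4):
    vi.write(0, 0x80, v)       # three writes to data only CPU 0 uses
```

Even though no other CPU ever touches address `0x80`, the three writes cost three bus transactions and three memory writes in this model.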
Maintaining Coherence
- In a coherent memory, all loads and stores can be placed in a global order.
- However, multiple copies of an address in various caches can violate this property: loads and stores can no longer be placed in a single global order, so the caches become inconsistent.
This property can be ensured if:
- Only one cache at a time has write permission for an address.
- No cache can have a stale copy of the data after a write to the address has been performed (write-invalidate).
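The two conditions above can be written as a small check over the copies of a single address. This is a sketch under an assumed representation: each cached copy is a `(can_write, value)` pair, and `latest` is the value of the most recent write. The function name `is_coherent` is hypothetical.

```python
def is_coherent(copies, latest):
    """Check the two conditions for one address.

    copies: list of (can_write, value) pairs, one per cache holding the address.
    latest: the value produced by the most recent write to that address.
    """
    writers = sum(1 for can_write, _ in copies if can_write)
    single_writer = writers <= 1                       # condition 1
    no_stale_copy = all(value == latest for _, value in copies)  # condition 2
    return single_writer and no_stale_copy


ok_one_writer = is_coherent([(True, 5)], latest=5)
ok_two_readers = is_coherent([(False, 5), (False, 5)], latest=5)
bad_stale = is_coherent([(True, 5), (False, 4)], latest=5)
bad_two_writers = is_coherent([(True, 5), (True, 5)], latest=5)
```

The first two configurations satisfy both conditions; the third fails because one cache still holds the old value, and the fourth fails because two caches hold write permission at once.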
Modified/Shared/Invalid (MSI) Protocol
MSI refines the VI protocol by splitting the valid state into a shared (read-only) state and a modified (read-write) state.
Each line in each cache maintains an MSI state:
- I (Invalid): the cache doesn’t contain a valid copy of the address, for example because another processor has modified the block.
- S (Shared): the cache has the address but so may other caches; hence it can only be read. The data has not been modified.
- M (Modified): only this cache has the address; hence it can be read and written. Any other cache that had this address got invalidated.
- VI drawbacks: every write updates main memory, and every write requires broadcast & snoop.
- MSI: allows write-back caches and satisfies writes locally.
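The MSI states can be expressed as a transition table for a single cache line. The event and action names (`PrRd`, `PrWr`, `BusRd`, `BusRdX`, `Flush`) follow common textbook usage and are assumptions, since these notes do not name them; treat the table as a sketch rather than a complete protocol specification.

```python
# (state, event) -> (next_state, bus_action)
# PrRd/PrWr come from the local processor; BusRd/BusRdX are snooped from the bus.
MSI = {
    ("I", "PrRd"):   ("S", "BusRd"),    # read miss: fetch a shared copy
    ("I", "PrWr"):   ("M", "BusRdX"),   # write miss: fetch an exclusive copy
    ("I", "BusRd"):  ("I", None),
    ("I", "BusRdX"): ("I", None),
    ("S", "PrRd"):   ("S", None),       # read hit
    ("S", "PrWr"):   ("M", "BusRdX"),   # must invalidate the other sharers
    ("S", "BusRd"):  ("S", None),
    ("S", "BusRdX"): ("I", None),       # another cache wants to write
    ("M", "PrRd"):   ("M", None),       # read hit
    ("M", "PrWr"):   ("M", None),       # write satisfied locally
    ("M", "BusRd"):  ("S", "Flush"),    # supply dirty data, keep a shared copy
    ("M", "BusRdX"): ("I", "Flush"),    # supply dirty data, give up the line
}


def run(state, events):
    """Apply events in order; return the final state and the bus actions issued."""
    actions = []
    for event in events:
        state, action = MSI[(state, event)]
        if action is not None:
            actions.append(action)
    return state, actions


final_state, bus_actions = run("I", ["PrRd", "PrWr", "PrWr", "BusRd"])
```

In this run the second write generates no bus action at all, which is the "satisfies writes locally" property that VI lacks.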
MSI Optimizations: Exclusive State
- Observation: read-modify-write sequences on private data are common.
- What’s the problem with MSI? Every read-modify-write of private data needs 2 bus transactions: one to read the line into S, and another to upgrade it to M.
- Solution: E state (exclusive, clean).
- If there are no other sharers, a read acquires the line in E instead of S.
- Writes silently cause E -> M (exclusive, dirty).
MESI: An Enhanced MSI protocol
- Increased performance for private read-write data: updating data held in only one cache no longer wastes bus bandwidth.
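The saving can be counted directly. The sketch below assumes the line starts Invalid in the local cache and that one read is followed by one write; the function name `rmw_bus_transactions` and its parameters are hypothetical.

```python
def rmw_bus_transactions(protocol, other_sharers=False):
    """Count bus transactions for one read followed by one write to a line
    that starts Invalid in the local cache."""
    transactions = 1                       # the read miss always uses the bus
    if protocol == "MESI" and not other_sharers:
        state = "E"                        # exclusive, clean
    else:
        state = "S"                        # MSI, or MESI with other sharers
    if state == "S":
        transactions += 1                  # S -> M must be announced on the bus
    # If state == "E", the write upgrades E -> M silently.
    return transactions


msi_private = rmw_bus_transactions("MSI")
mesi_private = rmw_bus_transactions("MESI")
mesi_shared = rmw_bus_transactions("MESI", other_sharers=True)
```

For private data MESI needs one bus transaction where MSI needs two; when other caches share the line, MESI behaves like MSI.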
Directory-Based Coherence
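- Motivation: The snoopy bus is often the performance bottleneck, and it becomes congested as the number of processors grows. With n CPUs, the bus must support n times the bandwidth, and every CPU must process the messages of all other CPUs, so about n^2 messages are handled in total.
- Route all coherence transactions through a directory. Other processors consult the directory to determine whether the memory it manages holds the data block they need.
- Tracks contents of private caches -> no broadcasts. Requests are sent only to the caches that are involved, and each directory maintains only the memory assigned to it.
- Serves as the ordering point for conflicting requests -> unordered networks can be used. Because the directory decides the order of conflicting requests, the interconnect itself does not have to deliver messages in a total order the way a bus does.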
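A minimal sketch of the directory idea above, assuming a single directory that keeps a set of sharers per address and sends invalidations only to those sharers. The class name `Directory`, its fields, and the cache count of 64 are hypothetical choices for the example.

```python
N_CACHES = 64


class Directory:
    """Tracks, per address, which private caches hold a copy."""

    def __init__(self):
        self.sharers = {}        # addr -> set of cache ids holding the line
        self.messages = 0        # point-to-point invalidations sent
        self.order = []          # order in which requests were serialized

    def read(self, cache_id, addr):
        self.order.append(("read", cache_id, addr))
        self.sharers.setdefault(addr, set()).add(cache_id)

    def write(self, cache_id, addr):
        self.order.append(("write", cache_id, addr))
        holders = self.sharers.get(addr, set())
        # Invalidate only the caches that actually hold the line.
        self.messages += len(holders - {cache_id})
        self.sharers[addr] = {cache_id}


directory = Directory()
directory.read(1, 0x100)
directory.read(2, 0x100)
directory.write(0, 0x100)

# A broadcast scheme would have to notify every other cache instead.
broadcast_messages = N_CACHES - 1
```

Here the write costs two invalidation messages instead of the 63 a broadcast would need, and `directory.order` records the single order in which the conflicting requests were handled.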
Copyright notice: This is an original article by suki570, licensed under CC 4.0 BY-SA. Reposts must include a link to the original source and this notice.