Tuesday, May 21, 2013

Note_Book_Multi-Core_CacheHierarchies-Ch1

Shared and Private Caches
    • Shared LLC


      • No duplicate copies in L2.
      • Multiple copies may exist in the L1s.
      • Coherence must be maintained among the L2 and the L1s.
      • On a bus, coherence is maintained with a snooping protocol.
      • With a write-back policy:
        • Read
          • A core puts a request on the bus.
          • The other cores look up their L1s upon receiving the message.
          • If no core has the block in the modified state:
            • The L2 cache responds after the L1 snoops have finished.
        • Write
          • Copies of the block being written are invalidated in the other L1s during the snoop.
      • Write-through and write-update protocols may lead to a heavy traffic load on the bus.
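
The read and write steps above can be sketched as a toy simulation. This is a minimal illustration, assuming an invalidation-based protocol with MSI-style states and write-back L1s in front of one shared L2; the names `SnoopBus`, `read`, `write`, and `state` are hypothetical and introduced only for this example.

```python
# Toy model: private write-back L1s snooping a bus in front of one shared L2.
# Assumes MSI-style states; all names here are illustrative.
M, S, I = "M", "S", "I"

class SnoopBus:
    def __init__(self, n_cores):
        # One dict per core: block address -> state (absent means Invalid).
        self.l1 = [dict() for _ in range(n_cores)]
        self.traffic = 0  # number of bus transactions

    def state(self, core, addr):
        return self.l1[core].get(addr, I)

    def read(self, core, addr):
        """Return who supplied the data: 'L1 hit', 'owner L1', or 'L2'."""
        if self.state(core, addr) != I:
            return "L1 hit"
        self.traffic += 1  # read request placed on the bus
        supplier = "L2"    # default: the shared L2 responds after the snoop
        for other, cache in enumerate(self.l1):
            if other != core and cache.get(addr) == M:
                cache[addr] = S          # owner writes back and downgrades
                supplier = "owner L1"
        self.l1[core][addr] = S
        return supplier

    def write(self, core, addr):
        if self.state(core, addr) == M:
            return  # already the exclusive owner, no bus traffic
        self.traffic += 1  # invalidation broadcast on the bus
        for other, cache in enumerate(self.l1):
            if other != core:
                cache.pop(addr, None)    # other copies are invalidated
        self.l1[core][addr] = M

bus = SnoopBus(n_cores=2)
first = bus.read(0, 0x40)    # no L1 holds the block: L2 responds
bus.write(1, 0x40)           # core 1 takes ownership, core 0 is invalidated
second = bus.read(0, 0x40)   # core 1 holds it Modified: the owner supplies it
```

In this trace the first read is served by the L2, the write invalidates core 0's copy, and the second read is served by the owning L1, for three bus transactions in total.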


      • For large-scale systems, a scalable interconnect and a directory protocol are needed.
      • Advantages of shared cache
        • Storage is dynamically allocated among cores
        • Only a single copy of the data needs to be maintained
          • better space utilization
          • better hit rates
        • Coherence misses of the private caches can be resolved in the shared cache
      • Disadvantages of shared cache
        • Miss rates may increase because cores interfere with each other, leading to poor QoS
        • Unnecessary coherence overhead for non-shared data
        • Long access time
      • The L2 cache here is inclusive
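
The interference drawback listed above can be illustrated with a small LRU simulation. This is a sketch under simplifying assumptions: fully associative caches, LRU replacement, no data shared between the cores, and a made-up two-core access trace. `LRUCache` and `run` are names introduced here for illustration only.

```python
from collections import OrderedDict

class LRUCache:
    """Fully associative LRU cache holding `capacity` blocks (illustrative)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, addr):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if addr in self.blocks:
            self.blocks.move_to_end(addr)  # mark as most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[addr] = None
        return False

def run(trace, caches):
    """Replay (core, addr) pairs; caches[core] is the cache that core uses."""
    hits = {core: 0 for core in caches}
    for core, addr in trace:
        # Tag the address with the core so the two cores never share a block.
        hits[core] += caches[core].access((core, addr))
    return hits

# Hypothetical trace: core 0 loops over 4 blocks, while core 1 streams
# through new blocks, issuing 4 accesses for each access of core 0.
trace = []
stream = 0
for i in range(40):
    trace.append((0, i % 4))
    for _ in range(4):
        trace.append((1, stream))
        stream += 1

shared = LRUCache(8)                      # one 8-block cache for both cores
shared_hits = run(trace, {0: shared, 1: shared})
private_hits = run(trace, {0: LRUCache(4), 1: LRUCache(4)})  # 4 blocks each
```

With the same total capacity, core 0 gets no hits in the shared cache because the streaming core keeps evicting its blocks, while with a private 4-block cache it hits on every access after the first four.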


    • Private LLC
      • Both L1 and L2 are private
      • Coherence is maintained among the L2 caches with a directory-based protocol across a scalable non-broadcast interconnect
      • The private L2 is looked up before a request is placed on the bus.
      • Snooping takes a considerable amount of time
      • Accessing a large private L2 takes a long time
      • Advantages of private LLC
        • Threads on different cores won't affect each other
        • Size of L2 is small
          • Shorter access time when no coherence action is needed.
          • Non-shared data benefits.
      • Disadvantages of private LLC
        • Shared data is replicated, which wastes space
        • Static allocation may confine a data block to one core's cache
      • Simple coherence on bus
        • On an L2 miss
        • Place a snoop request on the bus
        • If no cache responds to the snoop
          • Forward the request to the next level
      • For coherence in directory-based protocol
        • L2 miss -> send the request to the directory (which may be centralized or distributed)
        • The directory keeps track of all private L2 caches
        • The directory needs (number of L2s) * (number of ways) tag comparisons per lookup
        • If the block is found in another L2 -> messages are exchanged between the cores and the directory
        • With a directory, replication is not required for shared data
      • Workload Analysis
        • More than half of the cache blocks are shared
        • For such workloads, a shared cache should be better than private caches
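
The directory steps listed above (under "For coherence in directory-based protocol") can be sketched as follows. This is a minimal model, assuming a directory that mirrors the tags of every private L2, which is one way to arrive at the (number of L2s) * (number of ways) comparison count. The names `DuplicateTagDirectory`, `fill`, and `lookup`, the FIFO eviction, and the cache sizes are all assumptions made for the example.

```python
class DuplicateTagDirectory:
    """Mirrors the tags of every private L2 (illustrative, not a real design)."""
    def __init__(self, n_l2s, n_sets, n_ways):
        self.n_sets = n_sets
        self.n_ways = n_ways
        # tags[l2][set] is a list of at most n_ways tags.
        self.tags = [[[] for _ in range(n_sets)] for _ in range(n_l2s)]

    def _split(self, block_addr):
        # Low bits select the set, the remaining bits form the tag.
        return block_addr % self.n_sets, block_addr // self.n_sets

    def fill(self, l2, block_addr):
        """Record that private L2 number `l2` now holds the block."""
        set_idx, tag = self._split(block_addr)
        ways = self.tags[l2][set_idx]
        if tag not in ways:
            if len(ways) == self.n_ways:
                ways.pop(0)  # simple FIFO eviction, an assumption
            ways.append(tag)

    def lookup(self, requester, block_addr):
        """On an L2 miss, return (sharers, number of tag comparisons)."""
        set_idx, tag = self._split(block_addr)
        sharers = [l2 for l2, sets in enumerate(self.tags)
                   if l2 != requester and tag in sets[set_idx]]
        # The tag is compared against every way of the set in every L2.
        comparisons = len(self.tags) * self.n_ways
        return sharers, comparisons

directory = DuplicateTagDirectory(n_l2s=8, n_sets=1024, n_ways=4)
directory.fill(l2=3, block_addr=0x1234)
sharers, comparisons = directory.lookup(requester=0, block_addr=0x1234)
missing, _ = directory.lookup(requester=0, block_addr=0x9999)
```

Here `sharers` names the L2 that must be contacted, and an empty `missing` list corresponds to forwarding the request to the next level. With 8 L2s of 4 ways each, a single lookup implies 32 tag comparisons.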
Centralized and Distributed Shared Cache

    • Centralized Cache


