HW 4 solutions --- ECE 366

1. AMAT = hit time + miss rate * miss penalty
        = 1 + 0.05 * 20 = 2 cc's = 4 nsecs (i.e., a 2-nsec clock cycle).

2. AMAT = 1.2 + 0.03 * 20 = 1.8 cc's = 3.6 nsecs, so the AMAT improves (3.6 nsecs vs. 4 nsecs).

3. k-bit address, cache size = S = 2**s bytes, block size = B = 2**b bytes, A-way set associative (S.A.), A = 2**a.

   (a) # sets = # blocks / A
       # blocks = cache size / block size = S/B
       # sets = S/(B*A) = 2**(s - b - a)

   (b) # of index bits = log_2(# sets) = s - b - a

   (c) Total # of bits to implement the cache:
       Data store = 8*S bits
       Tag store = (# tag bits per block + 1) * (# blocks)
       Note: the extra 1 bit is the valid bit per block.
       # tag bits per block = # bits in block number - # of index bits
       # bits in block number = k - log_2(block size in bytes) = k - b
       Note: memory is byte addressable and the block size is given in bytes.
       # tag bits per block = k - b - (s - b - a) = k - s + a
       Total # of bits = # bits in data store + # bits in tag store
                       = 8*S + (k - s + a + 1) * S/B
                       = 8*S + (S/B) * (k - log_2 S + log_2 A + 1)

4. (a) DM cache; each block has 1 word, so word address = block #. The mappings below are (word address) mod 16, i.e., a 16-block cache. Each entry reads "word address -> cache block #".

       Replaced block #    -       -       -       -       4       1       -       8       -       -
       Block #             1->1    4->4    8->8    5->5    20->4   17->1   19->3   56->8   9->9    11->11
       Class.              cmp     cmp     cmp     cmp     cmp     cmp     cmp     cmp     cmp     cmp

       Replaced block #    17      11      -       33      -       1
       Block #             33->1   43->11  5->5    1->1    9->9    17->1
       Class.              cmp     cmp     h       conf    h       conf

       cmp = compulsory miss, conf = conflict miss, h = hit, "-" = no block replaced.

   (b) Since there are 4 words per block, remove the 2 least-significant (LS) bits of the word address to get the block #. E.g., we can assume a 6-bit address, since the largest word address is 56. Thus address 1 = 000001 => block # = 0000 (block 0); address 20 = 010100 => block # = 0101 = 5; address 56 = 111000 => block # = 1110 = 14, etc. Equivalently, block # = (word address)/4, rounded down to the nearest integer. With these derived block #s, set up a block access pattern for the above word addresses and proceed as in (a), except that the cache is now a fully associative (FA) cache with only 4 blocks.
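
The arithmetic in problems 1 through 4 can be checked with a short Python sketch. It is only an illustration and not part of the original solutions. All function names are hypothetical. The 2-nsec clock cycle is inferred from 2 cc's = 4 nsecs, and the 16-block size of the DM cache is inferred from mappings such as 20->4 and 17->1. LRU replacement for the FA cache in 4(b) is an assumption, since no replacement policy is stated. The parameters passed to cache_bits are made-up example values.

```python
from collections import OrderedDict

# Word-address sequence used in problem 4.
WORD_ADDRS = [1, 4, 8, 5, 20, 17, 19, 56, 9, 11, 33, 43, 5, 1, 9, 17]


def amat(hit_time_cc, miss_rate, miss_penalty_cc, cycle_ns=2.0):
    """Return AMAT as (clock cycles, nsecs). cycle_ns=2.0 is inferred."""
    cycles = hit_time_cc + miss_rate * miss_penalty_cc
    return cycles, cycles * cycle_ns


def cache_bits(k, S, B, A):
    """Total bits (data + tag + valid) for a k-bit byte address, an S-byte
    cache, B-byte blocks and A ways. S, B, A must be powers of two."""
    s, b, a = S.bit_length() - 1, B.bit_length() - 1, A.bit_length() - 1
    index_bits = s - b - a
    tag_bits = (k - b) - index_bits        # equals k - s + a
    return 8 * S + (tag_bits + 1) * (S // B)


def direct_mapped(word_addrs, num_blocks=16):
    """Problem 4(a): one-word blocks, index = address mod num_blocks.
    Returns (address, index, evicted address or None, label) per access.
    Simplification: every non-compulsory miss is labeled 'conf',
    matching the labels used in 4(a)."""
    cache, seen, trace = {}, set(), []
    for addr in word_addrs:
        idx = addr % num_blocks
        resident = cache.get(idx)
        if resident == addr:
            label, evicted = "h", None
        else:
            label = "conf" if addr in seen else "cmp"
            evicted = resident
            cache[idx] = addr
        seen.add(addr)
        trace.append((addr, idx, evicted, label))
    return trace


def fully_associative_lru(word_addrs, num_blocks=4, words_per_block=4):
    """Problem 4(b): block # = floor(address / 4). LRU is an ASSUMPTION.
    Returns (address, block #, evicted block # or None, hit) per access."""
    cache, trace = OrderedDict(), []       # least recently used entry first
    for addr in word_addrs:
        blk = addr // words_per_block
        hit = blk in cache
        evicted = None
        if hit:
            cache.move_to_end(blk)         # mark as most recently used
        else:
            if len(cache) == num_blocks:
                evicted, _ = cache.popitem(last=False)
            cache[blk] = True
        trace.append((addr, blk, evicted, hit))
    return trace


if __name__ == "__main__":
    print("Problem 1:", amat(1, 0.05, 20))
    print("Problem 2:", amat(1.2, 0.03, 20))
    # Hypothetical example: 32-bit address, 16 KB cache, 16-byte blocks, 4-way.
    print("Problem 3(c) example:", cache_bits(32, 16 * 1024, 16, 4))
    for row in direct_mapped(WORD_ADDRS):
        print("4(a):", row)
    for row in fully_associative_lru(WORD_ADDRS):
        print("4(b):", row)
```

Each row printed for 4(a) can be compared against the Block #, Replaced block # and Class. entries above. The 4(b) rows depend on the assumed LRU policy, so a different policy would change which blocks are evicted.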