# (Knowledge for development) KIBABII UNIVERSITY (KIBU)

## UNIVERSITY EXAMINATIONS 2020/2021 ACADEMIC YEAR

# END OF SEMESTER EXAMINATIONS YEAR FOUR SEMESTER TWO EXAMINATIONS

# FOR THE DEGREE OF (COMPUTER SCIENCE)

**COURSE CODE**: CSC 457E

**COURSE TITLE**: ADVANCED MICROPROCESSOR ARCHITECTURE

**DATE**: 28/09/2021 **TIME**: 09.00 A.M - 11.00 A.M

#### **INSTRUCTIONS TO CANDIDATES**

ANSWER QUESTIONS ONE AND ANY OTHER TWO.

#### QUESTION ONE (COMPULSORY) [30 MARKS]

- (a) State the meaning of the following terms applicable in microprocessor architecture:
  - (i) Instruction Level Parallelism [1 mark]
  - (ii) Quantum computing [1 mark]
- (b) A computer has a 256 KByte, 4-way set associative, write-back data cache with a block size of 32 bytes. The processor sends 32-bit addresses to the cache controller. Each cache tag directory entry contains, in addition to the address tag, 2 valid bits, 1 modified bit and 1 replacement bit.
  - (i) Compute the number of bits in the tag field of an address. [4 marks]
  - (ii) What is the size of the cache tag directory? [2 marks]
- (c) Even though pipelining speeds up the execution of instructions, it poses potential problems. State three architectural problems and a possible solution for each of these problems. [6 marks]
- (d) Memory hierarchy in multiprocessors poses two fundamental problems: coherency and consistency. Briefly explain each of these problems. [4 marks]
- (e) There are three classes of multiprocessors, according to the way each CPU sees main memory. State and briefly explain each class of multiprocessors.
[6 marks]
- (f)
  - (i) State any three issues that need to be considered in the design of superscalar processors. [3 marks]
  - (ii) List three limitations of superscalar processors. [3 marks]

#### QUESTION TWO [20 MARKS]

- (a) Define the following terms applicable in memory management in processors:
  - (i) Paging [1 mark]
  - (ii) Segmentation [1 mark]
- (b) Consider a machine with 64 MB physical memory and a 32-bit virtual address space. If the page size is 4KB, what is the approximate size of the page table? [4 marks]
- (c)
  - (i) State-of-the-art microprocessors achieve high performance by executing multiple instructions per cycle. In an out-of-order engine, state the role of the instruction scheduler. [2 marks]
  - (ii) Memory ambiguity is a problem that prevents the scheduler from advancing loads ahead of preceding stores, since a load's dependency on a store is defined through their memory addresses. Briefly state how the speculative memory disambiguation technique deals with this problem. [2 marks]
- (d) One of the common shared memory multiprocessor systems is Non-Uniform Memory Access (NUMA). Briefly describe the differences between the following two types of NUMA systems:
  - (i) Non-Caching NUMA (NC-NUMA) [2 marks]
  - (ii) Cache-Coherent NUMA (CC-NUMA) [2 marks]
- (e) In memory dependence prediction, each collision history table (CHT) entry may contain the following fields. Briefly describe each of the fields in the CHT:
  - (i) Tag [2 marks]
  - (ii) Collision predictor [2 marks]
  - (iii) Collision distance [2 marks]

#### QUESTION THREE [20 MARKS]

- (a)
  - (i) Thrashing is a state in which the system spends most of its time swapping process pieces rather than executing instructions. State how this problem can be avoided.
[2 marks]
  - (ii) State two support requirements for virtual memory to be practical and effective. [4 marks]
- (b) Memory management has two fundamental characteristics which, when present, mean that it is not necessary for all of the pages or segments of a process to be in main memory during execution.
  - (i) Enumerate the two fundamental characteristics present in memory management. [4 marks]
  - (ii) State the implication of the presence of the two characteristics on memory management. [2 marks]
- (c) Compute the Average Memory Access Time (AMAT) for a processor with a 2 ns clock cycle time, a miss rate of 0.04 misses per instruction, a miss penalty of 25 clock cycles, and a cache access time (including hit detection) of 1 clock cycle. Also, assume that the read and write miss penalties are the same and ignore other write stalls. [4 marks]
- (d) The ability of cache memory to improve a computer's performance relies on the concept of locality of reference. State two key types of locality for cache and briefly explain the difference. [4 marks]

#### QUESTION FOUR [20 MARKS]

- (a) State the meaning of the following terms applicable in cache organization:
  - (i) Miss penalty [1 mark]
  - (ii) Cache miss [1 mark]
  - (iii) Hit ratio
- (b)
  - (i) The very long instruction word (VLIW) is one of the approaches for improving the throughput of an instruction pipeline. State how this improvement is achieved using this approach. [3 marks]
  - (ii) VLIW and superpipelined processors are both methods of improving the throughput of an instruction pipeline. Briefly state the difference between the two processor designs. [3 marks]
- (c) An 8KB direct-mapped write-back cache is organized as multiple blocks, each of size 32 bytes. The processor generates 32-bit addresses. The cache controller maintains the tag information for each cache block, comprising the following: 1 valid bit; 1 modified bit; and as many bits as the minimum needed to identify the memory block mapped in the cache.
What is the total size of memory needed at the cache controller to store metadata (tags) for the cache? [6 marks]
- (d)
  - (i) Instruction issue policy is one of the superscalar processor design issues. It defines the order in which instructions are fetched, executed, and update registers and memory values (order of completion). List the three standard categories of instruction issue policies.
  - (ii) In superscalar processor design, output dependencies and anti-dependencies occur because register contents may not reflect the correct ordering from the program, which can require a pipeline stall. State the solution to this problem. [2 marks]

#### QUESTION FIVE [20 MARKS]

- (a) Define the following terms applicable in parallel processor architectures:
  - (i) Concurrency [1 mark]
  - (ii) Scalability [1 mark]
  - (iii) Parallel overhead [1 mark]
- (b)
  - (i) Briefly state the basic idea of Amdahl's law in relation to performance limitations of parallel architectures. [2 marks]
  - (ii) If the proportion of the program that can be parallelized (P) is 0.95, applying Amdahl's law, calculate the maximum speed-up that can be achieved with N processors as the number of processors tends to infinity.
  - (iii) High values of P correspond to embarrassingly parallel problems. Briefly explain what is meant by an embarrassingly parallel problem. [2 marks]
- (c) The performance profile of a given system/application depends on numerous factors. State three factors critical to performance profiling. [3 marks]
- (d) There are a number of quantum computing models distinguished by the basic elements into which the computation is decomposed.
List the four main models and briefly describe each model. [4 marks]
- (e) Suppose you are designing a small, 16KB, 2-way set-associative L1 cache and a large, 32MB, 32-way set-associative L3 cache for the next processor your company will build. Which one of the following design decisions would you make and why? Justify your choice.
  - (i) Access L1 tag store and data store: in parallel OR in series [2 marks]
  - (ii) Access L3 tag store and data store: in parallel OR in series [2 marks]