MIPS Architecture, the Foundation of Modern Microprocessors

Fabian Vargas
Catholic University - PUCRS, Brazil

Friday, August 17, 9:00 to 12:20, Room L3, 1st floor

vargas@pucrs.br, vargas@computer.org

Fabian Vargas obtained his Ph.D. degree in Microelectronics from the Institut National Polytechnique de Grenoble (INPG), France, in 1995. In 2005, he spent a sabbatical year at the Technical University of Lisbon (INESC-ID / IST), Portugal. He is currently Associate Professor at the Catholic University (PUCRS) in Porto Alegre, Brazil. His main research domains are HW-SW co-design and test of systems-on-chip (SoC) for critical applications; system-level design and methodologies for radiation and electromagnetic compatibility; and on-chip sensor design for reliability and aging binning. Among other activities, Prof. Vargas has served as Technical Committee Member or Guest Editor for many IEEE-sponsored conferences and journals. He holds 6 Brazilian and international patents, has co-authored a book, and has published over 200 refereed papers. Prof. Vargas has been an associate researcher of the Brazilian National Science Foundation since 1996. He co-founded the IEEE Latin American Test Technology Technical Council (LA-TTTC) in 1997 and the IEEE Latin American Test Workshop (LATW) in 2000. Prof. Vargas received the Meritorious Service Award of the IEEE Computer Society for chairing the IEEE Latin American Regional TTTC Group and the LATW for several years. He is a Golden Core Member of the IEEE Computer Society.

Presentation

• MIPS (originally an acronym for Microprocessor without Interlocked Pipeline Stages) is a reduced instruction set computer (RISC) architecture developed by MIPS Technologies.

• MIPS implementations are primarily used in embedded systems such as Windows CE devices, routers, residential gateways, and video game consoles such as the Sony PlayStation 2 and PlayStation Portable. MIPS implementations were also used by Digital Equipment Corporation, NEC, Pyramid Technology, Siemens Nixdorf, Tandem Computers and others during the late 1980s and 1990s.

• In the mid to late 1990s, it was estimated that one in three RISC microprocessors produced was a MIPS implementation.

• Given the strong market success of the MIPS architecture over the last decades, this course presents an overview of the architecture and briefly describes the main features involved in the design of its internal functional blocks, namely the data and control paths, the pipeline, and the memory system.

RISC pioneer

The Goal

• In 1981, a team led by John L. Hennessy at Stanford University started work on what would become the first MIPS processor.

• The basic concept was to increase performance through the use of deep instruction pipelines. Pipelining as a basic technique was already well known, but had not been developed to its full potential.

The Solution

• One major barrier to pipelining was that some instructions, such as division, take longer to complete, so the CPU has to wait before passing the next instruction into the pipeline.

• One solution to this problem is to use a series of interlocks that allow stages to indicate that they are busy, pausing the stages upstream.

• Hennessy's team viewed these interlocks as a major performance barrier: they had to communicate with all the modules in the CPU, which takes time and appeared to limit the clock speed.

• A major aspect of the MIPS design was to fit every sub-phase of every instruction, including cache access, into one cycle, thereby removing any need for interlocking and permitting single-cycle throughput.

• Although this design eliminated some useful instructions (multiply and divide), it was felt that the overall performance of the system would be dramatically improved because the chips could run at much higher clock rates.
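The cost of a long-latency instruction in an interlocked pipeline can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a description of any real MIPS implementation: the five-stage pipeline, the position of the execute stage, and the 8-cycle division latency are all hypothetical values chosen for illustration, and `total_cycles` is a made-up helper name.

```python
def total_cycles(ex_latencies, stages=5, ex_index=2):
    """Cycles needed to run a sequence of instructions on a toy in-order pipeline.

    ex_latencies[i] is the number of cycles instruction i occupies the
    execute stage. Every other stage is assumed to take exactly one cycle.
    An instruction that finds the execute stage busy waits upstream, which
    is the effect an interlock has on the stages behind a slow instruction.
    """
    ex_free = 0   # first cycle (0-based) in which the execute stage is free
    finish = 0
    for i, latency in enumerate(ex_latencies):
        earliest = i + ex_index          # arrival at execute with no stalls
        start = max(earliest, ex_free)   # wait if an older instruction is still there
        ex_free = start + latency
        # Remaining stages after execute take one cycle each.
        finish = ex_free + (stages - 1 - ex_index)
    return finish


# Five single-cycle instructions: the pipeline fills once, then retires one per cycle.
print(total_cycles([1, 1, 1, 1, 1]))   # 9

# The same sequence with one hypothetical 8-cycle divide in the middle.
print(total_cycles([1, 1, 8, 1, 1]))   # 16
```

In this model every extra cycle spent in the execute stage is paid by all the instructions behind it, which is why fitting every instruction into a single cycle removes the stalls altogether.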

Other Improvements (1/4)

The other difference between the MIPS design and the competing Berkeley RISC involved the handling of subroutine calls.

• Berkeley RISC used a technique called register windows to improve the performance of these very common operations (but in turn, this limited the maximum depth of multi-level calls). Each subroutine call required its own set of registers, which in turn required more real estate on the CPU and more complexity in its design.

• Hennessy, however, felt that a careful compiler could find free registers without resorting to a hardware implementation (using “liveness analysis” techniques), and that simply increasing the number of registers would not only make this simple, but also increase the performance of all tasks.

Other Improvements (2/4)

• In computer engineering, the use of register windows is a technique to improve the performance of a particularly common operation, the procedure call. This was one of the main design features of the original Berkeley RISC design, which would later be commercialized as the SPARC, AMD 29000, and Intel i960.

• Rendering the registers invisible can be implemented efficiently; the CPU recognizes the movement from one part of the program to another during a procedure call. A call begins with one of a small number of instructions (the prologue) and ends with one of a similarly small set (the epilogue). In the Berkeley design, these calls would cause a new set of registers to be "swapped in" at that point, or marked as "dead" (or "reusable") when the call ends.

• In the Berkeley RISC design, only 8 registers out of a total of 64 are visible to a program at any time. The complete set of registers is known as the register file, and any particular set of eight as a window. The file allows up to eight procedure calls to have their own register sets. As long as the program does not call down chains more than eight calls deep, the registers never have to be spilled, i.e. saved out to main memory or cache, which is a slow process compared to register access. For many programs, a chain of six is as deep as the program will go.
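The window mechanism described above can be sketched in a few lines of Python. This is a simplified model built only from the numbers on the slide (64 registers, windows of 8, spilling once more than eight procedures are active); it assumes non-overlapping windows, and the class and method names are hypothetical, not taken from any real design.

```python
class RegisterWindowFile:
    """Toy register file with non-overlapping windows (a simplification)."""

    def __init__(self, total_registers=64, window_size=8):
        self.window_size = window_size
        self.num_windows = total_registers // window_size
        self.registers = [0] * total_registers
        self.depth = 0        # 0 = outermost procedure
        self.saved = []       # windows spilled to "memory", most recent last
        self.spills = 0

    def _base(self):
        # Windows are reused cyclically once the call chain is deep enough.
        return (self.depth % self.num_windows) * self.window_size

    def call(self):
        self.depth += 1
        if self.depth >= self.num_windows:
            # The window about to be reused still belongs to an older caller,
            # so its contents must be saved first (the slow path).
            base = self._base()
            self.saved.append(self.registers[base:base + self.window_size])
            self.spills += 1

    def ret(self):
        if self.depth >= self.num_windows:
            # Give the reused window back to the caller it was taken from.
            base = self._base()
            self.registers[base:base + self.window_size] = self.saved.pop()
        self.depth -= 1

    def read(self, reg):
        return self.registers[self._base() + reg]

    def write(self, reg, value):
        self.registers[self._base() + reg] = value


rf = RegisterWindowFile()
rf.write(0, 42)            # value held by the outermost procedure
for _ in range(7):
    rf.call()              # eight procedures active: every one has its own window
print(rf.spills)           # 0
rf.call()                  # a ninth active procedure forces a spill
print(rf.spills)           # 1
```

A call or return normally only changes which window is visible; memory traffic appears only when the call chain outgrows the register file.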

Other Improvements (3/4)

• In compiler theory, live variable analysis (or simply liveness analysis) is a classic data flow analysis performed by compilers to calculate, for each program point, the variables that may be read before their next write, that is, the variables that are live at the exit from each program point.

• Stated simply: a variable is live if it holds a value that may be needed in the future.

• Example:

    L1: b := 3;
    L2: c := 5;
    L3: a := f(b + c);
    goto L1;

• The set of live variables on entry to L3 is {b, c} because both are used in the addition, and thereby in the call to f and the assignment to a. But the set of live variables at the exit from L1 is only {b}, since variable c is overwritten at L2 before it is read.

• The value of variable a is never used, so the variable is never live. The never-live assignment to a can therefore be eliminated, but because f may be stateful, there is insufficient information to eliminate the entirety of L3.
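The live sets quoted for this example can be reproduced with a minimal Python sketch of backward liveness analysis. The tuple encoding of the program, the label `goto` for the final jump, and the function name `liveness` are assumptions made for illustration; a real compiler works on its own intermediate representation.

```python
# Each statement: (label, variables defined, variables used, successor labels).
PROGRAM = [
    ("L1", {"b"}, set(), ["L2"]),          # b := 3
    ("L2", {"c"}, set(), ["L3"]),          # c := 5
    ("L3", {"a"}, {"b", "c"}, ["goto"]),   # a := f(b + c)
    ("goto", set(), set(), ["L1"]),        # goto L1
]


def liveness(program):
    """Iterate the standard dataflow equations until nothing changes.

    live_out[s] = union of live_in over the successors of s
    live_in[s]  = uses[s] | (live_out[s] - defs[s])
    """
    live_in = {label: set() for label, _, _, _ in program}
    live_out = {label: set() for label, _, _, _ in program}
    changed = True
    while changed:
        changed = False
        # Visiting statements in reverse order converges faster for a backward analysis.
        for label, defs, uses, succs in reversed(program):
            out = set().union(*(live_in[s] for s in succs))
            inn = uses | (out - defs)
            if out != live_out[label] or inn != live_in[label]:
                live_out[label], live_in[label] = out, inn
                changed = True
    return live_in, live_out


live_in, live_out = liveness(PROGRAM)
print(sorted(live_in["L3"]))    # ['b', 'c']
print(sorted(live_out["L1"]))   # ['b']
```

Variable a appears in no live set at all, which is what allows a compiler to reuse its register immediately.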

Other Improvements (4/4)

• In other ways the MIPS design was very much a typical RISC design.

• To save bits in the instruction word, RISC designs reduce the number of instructions to encode. The MIPS design uses 6 bits of the 32-bit word for the basic opcode. The rest may contain a single 26-bit jump address, or up to four 5-bit fields specifying up to three registers (in a bank of 32 registers) plus a shift value, combined with another 6 bits of opcode. Another format, among several, specifies two registers combined with a 16-bit immediate value.

• This allowed MIPS to load the instruction and the data it needed in a single cycle, whereas non-RISC designs required separate cycles to load the opcode and the data.

• This was one of the major performance improvements that RISC offered. However, modern non-RISC designs achieve this speed by other means (such as queues in the CPU).
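The three field layouts described above can be made concrete with a short Python decoder. It is a sketch under assumptions: the field names (rs, rt, rd, shamt, funct) and the opcode values used to pick a format (0 for the register format, 2 and 3 for jumps) follow the usual MIPS32 conventions rather than anything stated on the slide, and every other opcode is treated as the immediate format, which is a simplification.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    opcode = (word >> 26) & 0x3F           # top 6 bits: basic opcode
    if opcode == 0:
        # Register format: three 5-bit registers, a 5-bit shift amount,
        # and 6 more bits of opcode (the function code).
        return {
            "format": "R",
            "opcode": opcode,
            "rs": (word >> 21) & 0x1F,
            "rt": (word >> 16) & 0x1F,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    if opcode in (2, 3):
        # Jump format: a single 26-bit jump address.
        return {"format": "J", "opcode": opcode, "target": word & 0x03FFFFFF}
    # Immediate format: two registers and a 16-bit immediate value.
    return {
        "format": "I",
        "opcode": opcode,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


# 0x012A4020 is the usual encoding of "add $t0, $t1, $t2".
print(decode(0x012A4020))
# 0x21280005 is the usual encoding of "addi $t0, $t1, 5".
print(decode(0x21280005))
```

Because every field sits at a fixed bit position in a fixed-width word, decoding needs nothing more than shifts and masks, and the hardware can extract all fields in parallel.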

The Evolution (1/3)

• MIPS released its first design, the R2000, in 1985, and improved it as the R3000 in 1988. These 32-bit CPUs formed the basis of the company through the 1980s, used primarily in SGI's series of workstations and later in Digital Equipment Corporation DECstation workstations and servers.

• In 1991 MIPS released the first 64-bit microprocessor, the R4000. The R4000 has an advanced TLB in which each entry contains not just the virtual address but also the virtual address space ID. Such a buffer eliminates the major performance problems of microkernels [11], which are slow on competing architectures (Pentium, PowerPC, Alpha) because the TLB must be flushed on their frequent context switches.

• However, MIPS had financial difficulties while bringing it to market. The design was so important to SGI, at the time one of MIPS' few major customers, that SGI bought the company outright in 1992 in order to guarantee the design would not be lost. As a subsidiary of SGI, the company became known as MIPS Technologies.
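The benefit of tagging TLB entries with an address space ID, as described for the R4000 above, can be shown with a small Python sketch. It is a toy model under assumptions: the TLB has unlimited capacity, an untagged TLB is flushed on every change of address space, and the function name and the access trace are hypothetical.

```python
def count_tlb_misses(trace, tagged):
    """Count TLB misses for a trace of (address_space_id, virtual_page) accesses.

    tagged=True  : entries are keyed by (address space ID, page), so entries
                   from different address spaces can coexist.
    tagged=False : entries are keyed by page only, so the TLB must be
                   flushed whenever the running address space changes.
    """
    tlb = set()
    current_asid = None
    misses = 0
    for asid, page in trace:
        if asid != current_asid:
            if not tagged:
                tlb.clear()            # context switch: flush everything
            current_asid = asid
        key = (asid, page) if tagged else page
        if key not in tlb:
            misses += 1                # translation must be fetched from the page table
            tlb.add(key)
    return misses


# Two address spaces that keep handing control to each other,
# as a client and a server do under a microkernel.
trace = [(1, 0), (1, 1), (2, 0), (2, 1)] * 3

print(count_tlb_misses(trace, tagged=True))    # 4
print(count_tlb_misses(trace, tagged=False))   # 12
```

With tagged entries only the first touch of each page misses; without tags every context switch throws away translations that are needed again moments later.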

The Evolution (2/3)

• The R8000 (1994) was the first superscalar MIPS design, able to execute two integer or floating-point instructions and two memory instructions per cycle. The design was spread over six chips: an integer unit (with 16 KB instruction and 16 KB data caches), a floating-point unit, three full-custom secondary cache tag RAMs (two for secondary cache accesses, one for bus snooping), and a cache controller ASIC. The design had two fully pipelined double-precision multiply-add units, which could stream data from the 4 MB off-chip secondary cache.

• In 1995, the R10000 was released. This processor was a single-chip design, ran at a faster clock speed than the R8000, and had larger 32 KB primary instruction and data caches. It was also superscalar, but its major innovation was out-of-order execution.

• The R12000 used a 0.25 micrometre process to shrink the chip and achieve higher clock rates. The revised R14000 allowed higher clock rates, with additional support for DDR SRAM in the off-chip cache. Later iterations, named the R16000 and the R16000A, feature higher clock speeds and are manufactured on smaller process geometries than their predecessors.

The Evolution (3/3)

• In recent years most of the technology used in the various MIPS generations has been offered as IP-cores (building-blocks) for embedded processor designs.

• Both 32-bit and 64-bit basic cores are offered, known as the 4K and 5K, respectively. These cores can be mixed with add-in units such as FPUs, SIMD systems, various input/output devices, etc.

• MIPS cores have been commercially successful, and are now used in many consumer and industrial applications. MIPS cores can be found in newer Cisco, Linksys, and MikroTik RouterBOARD routers, cable modems and ADSL modems, smartcards, laser printer engines, set-top boxes, robots, handheld computers, and the Sony PlayStation 2 and PlayStation Portable.

• In cellphone/PDA applications, MIPS has been largely unable to displace the incumbent, competing ARM architecture.