Instruction Set Architectures

Posted on 2024-02-06 Edited on 2025-01-10 In Computer Science

Sources:

John L. Hennessy & David A. Patterson. (2019). Chapter 2. Memory Hierarchy Design. Computer Architecture: A Quantitative Approach (6th ed., pp. 78-148). Elsevier Inc.
RISC-V instruction format
ECS201A: ISAs and Machine Representation

Instruction set architectures

An instruction set architecture (ISA) is a contract between hardware and software.

An ISA specifies the instruction formats, virtual memory, the number and size of registers, exceptions, and so on.

Architecture, microarchitecture and technology

Architecture: In the context of computer architecture, the term architecture refers to the interface as seen by the programmer. The ISA belongs to the architecture.
Microarchitecture: The hardware implementation of the architecture (specifically, the ISA). The study of microarchitecture includes topics like pipelining, instruction-level parallelism (ILP), out-of-order execution, speculative execution, branch prediction and caching.
Technology: In the context of computer architecture, the term technology refers to transistors and process technology.

The cache coherence protocol (the implementation of how the caches are kept transparent to the programmer), such as MESI, is part of the microarchitecture. It is primarily implemented in hardware.

The memory consistency model (the details of how loads and stores are ordered in a program) is part of the architecture.

Features of ISAs

Nearly all ISAs today are classified as general-purpose register architectures, where the operands are either registers or memory locations. The 80x86 has 16 general-purpose registers and 16 that can hold floating-point data, while RISC-V has 32 general-purpose and 32 floating-point registers.

Virtually all ISAs use byte addressing to access memory operands.

ISAs can be classified from the following perspectives:

Whether instructions can access memory directly:
1. register-memory ISAs: one example is the 80x86, which can access memory as part of many instructions
2. load-store ISAs: two examples are ARMv8 and RISC-V, which can access memory only with load or store instructions.
  - All ISAs announced since 1985 are load-store.
Whether memory accesses must be aligned: Some architectures, like ARMv8, require that objects be aligned. An access to an object of size \(s\) bytes at byte address \(A\) is aligned if \(A \bmod s = 0\).

The 80x86 and RISC-V do not require alignment, but accesses are generally faster if operands are aligned.
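The alignment condition above is a single modulo test. The following is a minimal sketch of it; the function names `is_aligned` and `align_up` are made up for illustration and do not come from any ISA manual.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if an access of `size` bytes at byte address `address` is aligned."""
    if size <= 0:
        raise ValueError("size must be positive")
    return address % size == 0


def align_up(address: int, size: int) -> int:
    """Round `address` up to the next multiple of `size` (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return (address + size - 1) // size * size


if __name__ == "__main__":
    # An 8-byte access at 0x1004 is misaligned; a 4-byte access there is aligned.
    print(is_aligned(0x1004, 8), is_aligned(0x1004, 4))
    print(hex(align_up(0x1001, 4)))
```

On an ISA that requires alignment, the misaligned case would raise an exception; on the 80x86 and RISC-V it would typically just be slower.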
Addressing modes: In addition to specifying registers and constant operands, addressing modes specify the address of a memory object. RISC-V addressing modes are:
1. Register,
2. Immediate (for constants), and
3. Displacement, where a constant offset is added to a register to form the memory address.
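The 80x86 supports those three modes, plus three variations of displacement: no register (absolute), two registers (based indexed with displacement), and two registers where one register is multiplied by the size of the operand in bytes (based with scaled index and displacement).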
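To make the displacement family of addressing modes concrete, here is a rough sketch of how each mode could compute an effective address. The function names and the 64-bit wrap-around are assumptions chosen for illustration; the scaled-index variant is the usual 80x86 form in which the index register is multiplied by the operand size.

```python
MASK64 = (1 << 64) - 1  # assume 64-bit addresses that wrap around


def displacement(base: int, offset: int) -> int:
    """RISC-V style: register + constant offset."""
    return (base + offset) & MASK64


def absolute(offset: int) -> int:
    """80x86 variant with no register: the offset is the address."""
    return offset & MASK64


def based_indexed(base: int, index: int, offset: int = 0) -> int:
    """80x86 variant with two registers plus a displacement."""
    return (base + index + offset) & MASK64


def based_scaled_indexed(base: int, index: int, scale: int, offset: int = 0) -> int:
    """80x86 variant where the index register is scaled by the operand size."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + offset) & MASK64


if __name__ == "__main__":
    # Address of element 3 in an array of 8-byte values that starts 16 bytes after 0x1000.
    print(hex(based_scaled_indexed(0x1000, 3, 8, 16)))
```

A load-store ISA like RISC-V would need separate shift and add instructions to build the scaled address before the load.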
Formats: There are two basic choices: fixed length and variable length.
1. All ARMv8 and RISC-V instructions are 32 bits long, which simplifies instruction decoding. See Figure 1.7.
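2. The 80x86 encoding is variable length, ranging from 1 to 15 bytes (the cited textbook says 18, but the architectural limit in the Intel and AMD manuals is 15 bytes).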
Operations: The general categories of operations are
1. data transfer,
2. arithmetic logical,
3. control (discussed next), and
4. floating point.
The integer RISC-V ISA and the floating-point ISA are summarized in the two instruction tables later in this post.

The 80x86 has a much richer and larger set of operations.

RISC

Reduced instruction set computing (RISC)
1. Small number of instructions.
2. Load/store architecture.
3. Operating on two operands.
4. Greatly simplifies implementation, which allows for a higher clock frequency.

CISC

Complex instruction set computing (CISC)
- Many instructions, some of them complex; the hardware breaks these into sub-operations (microcode).

RISC-V

RISC-V is a simple and easy-to-pipeline instruction set architecture, and it is representative of the RISC architectures being used in 2017.

Registers

RISC-V registers, names, usage, and calling conventions:

Register	Name	Use	Saver
x0	zero	The constant value 0	N.A.
x1	ra	Return address	Caller
x2	sp	Stack pointer	Callee
x3	gp	Global pointer	-
x4	tp	Thread pointer	-
x5-x7	t0-t2	Temporaries	Caller
x8	s0 / fp	Saved register/frame pointer	Callee
x9	s1	Saved register	Callee
x10-x11	a0-a1	Function arguments/return values	Caller
x12-x17	a2-a7	Function arguments	Caller
x18-x27	s2-s11	Saved registers	Callee
x28-x31	t3-t6	Temporaries	Caller
f0-f7	ft0-ft7	FP temporaries	Caller
f8-f9	fs0-fs1	FP saved registers	Callee
f10-f11	fa0-fa1	FP function arguments/return values	Caller
f12-f17	fa2-fa7	FP function arguments	Caller
f18-f27	fs2-fs11	FP saved registers	Callee
f28-f31	ft8-ft11	FP temporaries	Caller

RISC-V has 32 general-purpose registers (x0-x31), and 32 floating-point registers (f0-f31) that can hold either a 32-bit single-precision number or a 64-bit double-precision number.

The registers that are preserved across a procedure call are labeled "Callee" saved.
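The integer half of the register table can be captured in a small lookup structure, for example when writing a disassembler or checking which registers a function must preserve. This is only a sketch built from the table above; the names `ABI_NAMES`, `abi_name`, and `saver` are hypothetical.

```python
# ABI names for x0..x31, in the order given by the register table.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp"]
    + [f"t{i}" for i in range(0, 3)]    # x5-x7
    + ["s0", "s1"]                      # x8-x9 (s0 doubles as fp)
    + [f"a{i}" for i in range(0, 8)]    # x10-x17
    + [f"s{i}" for i in range(2, 12)]   # x18-x27
    + [f"t{i}" for i in range(3, 7)]    # x28-x31
)

CALLEE_SAVED = {"sp"} | {f"s{i}" for i in range(0, 12)}
NO_SAVER = {"zero", "gp", "tp"}  # rows marked "N.A." or "-" in the table


def abi_name(number: int) -> str:
    """Map an integer register number (0..31) to its ABI name."""
    return ABI_NAMES[number]


def saver(number: int):
    """Return 'callee', 'caller', or None for registers with no saver."""
    name = abi_name(number)
    if name in NO_SAVER:
        return None
    return "callee" if name in CALLEE_SAVED else "caller"


if __name__ == "__main__":
    for n in (1, 2, 8, 10, 31):
        print(f"x{n}", abi_name(n), saver(n))
```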

Instruction formats

All instructions are 32 bits long.

All registers are referenced by 5 bits.
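Because every instruction is 32 bits and every register specifier is 5 bits, pulling fields out of an instruction word comes down to shifts and masks. The sketch below decodes a register-to-register instruction. The bit positions are those of the RISC-V base encoding, which this post does not list explicitly, and the helper names are illustrative.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_r_type(word: int) -> dict:
    """Split a register-to-register instruction into its fields."""
    return {
        "opcode": bits(word, 6, 0),
        "rd": bits(word, 11, 7),      # 5-bit destination register
        "funct3": bits(word, 14, 12),
        "rs1": bits(word, 19, 15),    # 5-bit first source register
        "rs2": bits(word, 24, 20),    # 5-bit second source register
        "funct7": bits(word, 31, 25),
    }


if __name__ == "__main__":
    # 0x002081B3 encodes `add x3, x1, x2`.
    print(decode_r_type(0x002081B3))
```

Since five bits can name 32 registers, the field width matches the 32 integer and 32 floating-point registers described earlier.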

Types:

The R format is for integer register-to-register operations, such as ADD, SUB, and so on.
The I format is for loads and immediate operations, such as LD and ADDI.
The B format is for branches.
- The immediate bits are stored out of order (scrambled) in the instruction word. When decoding a B-type instruction, the CPU must reassemble them into imm[12:1] and sign-extend the result. Bit 0 is not stored because branch targets are always a multiple of 2 bytes.
The J format is for jump and link.
- The J-type immediate is signed and also scrambled, so the decoder must first reassemble the bits into imm[20:1] (again with an implicit zero in bit 0) and then sign-extend the result.
The S format is for stores. Having a separate format for stores allows the three register specifiers (rd, rs1, rs2) to always be in the same location in all formats.
The U format is for the wide immediate instructions (LUI, AUIPC).
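The reassembly of the scrambled B-type and J-type immediates can be sketched as follows. The bit placement follows the RISC-V base encoding (not spelled out in this post), and the helper names are invented for this example.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def sign_extend(value: int, width: int) -> int:
    """Interpret the low `width` bits of `value` as a two's-complement number."""
    sign = 1 << (width - 1)
    value &= (1 << width) - 1
    return (value & (sign - 1)) - (value & sign)


def b_imm(word: int) -> int:
    """Reassemble the 13-bit branch offset; bit 0 is implicitly zero."""
    imm = (
        (bits(word, 31, 31) << 12)   # imm[12]
        | (bits(word, 7, 7) << 11)   # imm[11]
        | (bits(word, 30, 25) << 5)  # imm[10:5]
        | (bits(word, 11, 8) << 1)   # imm[4:1]
    )
    return sign_extend(imm, 13)


def j_imm(word: int) -> int:
    """Reassemble the 21-bit jump offset; bit 0 is implicitly zero."""
    imm = (
        (bits(word, 31, 31) << 20)    # imm[20]
        | (bits(word, 19, 12) << 12)  # imm[19:12]
        | (bits(word, 20, 20) << 11)  # imm[11]
        | (bits(word, 30, 21) << 1)   # imm[10:1]
    )
    return sign_extend(imm, 21)


if __name__ == "__main__":
    print(b_imm(0xFE000EE3))  # beq x0, x0, -4
    print(j_imm(0xFFDFF0EF))  # jal x1, -4
```

In both formats the sign bit of the immediate always sits in bit 31 of the instruction, so sign extension can start before the rest of the decode finishes.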

Instruction types

RISC-V has a base set of instructions (RV64I) and offers optional extensions: multiply-divide (RVM), single-precision floating point (RVF), double-precision floating point (RVD).

Integer instruction types

This table includes RV64I and RVM:

Instruction type/opcode	Instruction meaning
Data transfers	Move data between registers and memory, or between the integer and FP or special registers; only memory address mode is 12-bit displacement + contents of a GPR
`lb, lbu, sb`	Load byte, load byte unsigned, store byte (to/from integer registers)
`lh, lhu, sh`	Load half word, load half word unsigned, store half word (to/from integer registers)
`lw, lwu, sw`	Load word, load word unsigned, store word (to/from integer registers)
`ld, sd`	Load double word, store double word (to/from integer registers)
`flw, fld, fsw, fsd`	Load SP float, load DP float, store SP float, store DP float
`fmv._.x, fmv.x._`	Copy from/to integer register to/from floating-point register; "_" = S for single-precision, D for double-precision
`csrrw, csrrwi, csrrs, csrrsi, csrrc, csrrci`	Read counters and write status registers, which include counters: clock cycles, time, instructions retired
Arithmetic/logical	Operations on integer or logical data in GPRs
`add, addi, addw, addiw`	Add, add immediate (all immediates are 12 bits), add 32-bits only & sign-extend to 64 bits, add immediate 32-bits only
`sub, subw`	Subtract, subtract 32-bits only
`mul, mulw, mulh, mulhsu, mulhu`	Multiply, multiply 32-bits only, multiply upper half, multiply upper half signed-unsigned, multiply upper half unsigned
`div, divu, rem, remu`	Divide, divide unsigned, remainder, remainder unsigned
`divw, divuw, remw, remuw`	Divide and remainder: as previously, but divide only lower 32-bits, producing 32-bit sign-extended result
`and, andi`	And, and immediate
`or, ori, xor, xori`	Or, or immediate, exclusive or, exclusive or immediate
`lui`	Load upper immediate; loads bits 31-12 of register with immediate, then sign-extends
`auipc`	Adds immediate in bits 31-12 with zeros in lower bits to PC; used with JALR to transfer control to any 32-bit address
`sll, slli, srl, srli, sra, srai`	Shifts: shift left logical, right logical, right arithmetic; both variable and immediate forms
`sllw, slliw, srlw, srliw, sraw, sraiw`	Shifts: as previously, but shift lower 32-bits, producing 32-bit sign-extended result
`slt, slti, sltu, sltiu`	Set less than, set less than immediate, signed and unsigned
Control	Conditional branches and jumps; PC-relative or through register
`beq, bne, blt, bge, bltu, bgeu`	Branch GPR equal/not equal; less than; greater than or equal, signed and unsigned
`jal, jalr`	Jump and link: save PC+4, target is PC-relative (JAL) or a register (JALR); if x0 is specified as the destination register, it acts as a simple jump
`ecall`	Make a request to the supporting execution environment, which is usually an OS
`ebreak`	Used by debuggers to cause control to be transferred back to a debugging environment
`fence, fence.i`	Synchronize threads to guarantee ordering of memory accesses; synchronize instructions and data for stores to instruction memory
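Since all immediates are 12 bits and `lui` supplies bits 31-12, a 32-bit constant is usually built with a `lui` followed by an `addi`. Because `addi` sign-extends its immediate, the upper part has to be adjusted when bit 11 of the constant is set. The following sketch models that split on a 64-bit register; the function names are hypothetical and the model covers only these two instructions.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def sign_extend(value: int, width: int) -> int:
    """Interpret the low `width` bits of `value` as a two's-complement number."""
    sign = 1 << (width - 1)
    value &= (1 << width) - 1
    return (value & (sign - 1)) - (value & sign)


def split_hi_lo(value: int):
    """Split a 32-bit constant into (hi20, lo12) for a lui + addi pair."""
    value &= MASK32
    lo12 = sign_extend(value, 12)             # addi will sign-extend this part
    hi20 = ((value - lo12) >> 12) & 0xFFFFF   # compensate for a negative lo12
    return hi20, lo12


def lui(hi20: int) -> int:
    """Place hi20 in bits 31-12, zero the low bits, sign-extend to 64 bits."""
    return sign_extend(hi20 << 12, 32) & MASK64


def addi(reg: int, imm12: int) -> int:
    """Add a sign-extended 12-bit immediate to a 64-bit register value."""
    return (reg + sign_extend(imm12, 12)) & MASK64


def load_constant(value: int) -> int:
    """Model `lui rd, hi20` followed by `addi rd, rd, lo12`."""
    hi20, lo12 = split_hi_lo(value)
    return addi(lui(hi20), lo12)


if __name__ == "__main__":
    hi20, lo12 = split_hi_lo(0x12345FFF)
    print(hex(hi20), lo12, hex(load_constant(0x12345FFF)))
```

In this 64-bit model only the low 32 bits are guaranteed to match the requested constant, because `lui` sign-extends its result.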

JAL (Jump and Link): The `jal` instruction performs an unconditional jump like a plain `j`, but it additionally stores the return address (PC+4) in a register.

This lets a caller jump to a subroutine and lets the subroutine return to the caller by using the stored return address.
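A call and return built from `jal` and `jalr` can be modeled in a few lines. This is a toy sketch, not a full simulator: the register file is a plain list, the function names are invented, and offsets are assumed to be already decoded. Clearing the lowest bit of the `jalr` target follows the RISC-V base specification.

```python
def jal(regs: list, pc: int, rd: int, offset: int) -> int:
    """Jump to pc + offset and store the return address (pc + 4) in rd."""
    if rd != 0:            # x0 is hard-wired to zero, so writes are dropped
        regs[rd] = pc + 4
    return pc + offset


def jalr(regs: list, pc: int, rd: int, rs1: int, offset: int) -> int:
    """Jump to regs[rs1] + offset and store the return address in rd."""
    target = (regs[rs1] + offset) & ~1  # compute first: rd may equal rs1
    if rd != 0:
        regs[rd] = pc + 4
    return target


if __name__ == "__main__":
    regs = [0] * 32
    pc = 0x1000
    pc = jal(regs, pc, 1, 0x200)   # call: ra (x1) = 0x1004, pc = 0x1200
    print(hex(pc), hex(regs[1]))
    pc = jalr(regs, pc, 0, 1, 0)   # return: jump to ra, discard the link
    print(hex(pc))
```

Using x0 as the destination turns either instruction into a plain jump, which is how the return in the example works.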

Floating-point instruction types

This table includes RVF and RVD.

Instruction type/opcode	Instruction meaning
Floating point	FP operations on DP and SP formats
`fadd.d, fadd.s`	Add DP, SP numbers
`fsub.d, fsub.s`	Subtract DP, SP numbers
`fmul.d, fmul.s`	Multiply DP, SP floating point
`fmadd.d, fmadd.s, fnmadd.d, fnmadd.s`	Multiply-add DP, SP numbers; negative multiply-add DP, SP numbers
`fmsub.d, fmsub.s, fnmsub.d, fnmsub.s`	Multiply-sub DP, SP numbers; negative multiply-sub DP, SP numbers
`fdiv.d, fdiv.s`	Divide DP, SP floating point
`fsqrt.d, fsqrt.s`	Square root DP, SP floating point
`fmax.d, fmax.s, fmin.d, fmin.s`	Maximum and minimum DP, SP floating point
Convert instructions	Convert instructions: FCVT.x.y converts to type x from type y (destination type first, so FCVT.S.D converts DP to SP), where x and y are L (64-bit integer), W (32-bit integer), D (DP), or S (SP). Integers can be unsigned (U)
`feq._, flt._, fle._`	Floating-point compare between floating-point registers and record the Boolean result in integer register; "_" = S for single-precision, D for double-precision
`fclass.d, fclass.s`	Writes to integer register a 10-bit mask that indicates the class of the floating-point number (-∞, +∞, -0, +0, NaN, ...)
`fsgnj._, fsgnjn._, fsgnjx._`	Sign-injection instructions that change only the sign bit: copy sign bit from other source, the opposite of sign bit of other source, XOR of the 2 sign bits
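The sign-injection instructions are easiest to follow at the bit level. The sketch below models the double-precision variants on Python floats by reinterpreting them as 64-bit patterns with the standard `struct` module. The function names mirror the mnemonics but are otherwise made up for this example.

```python
import struct

SIGN = 1 << 63        # sign bit of an IEEE 754 double
MAGNITUDE = SIGN - 1  # exponent and fraction bits


def f64_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit pattern."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_f64(b: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def fsgnj_d(a: float, b: float) -> float:
    """Magnitude of a with the sign bit of b."""
    return bits_f64((f64_bits(a) & MAGNITUDE) | (f64_bits(b) & SIGN))


def fsgnjn_d(a: float, b: float) -> float:
    """Magnitude of a with the opposite of the sign bit of b."""
    return bits_f64((f64_bits(a) & MAGNITUDE) | (~f64_bits(b) & SIGN))


def fsgnjx_d(a: float, b: float) -> float:
    """Magnitude of a with the XOR of both sign bits."""
    return bits_f64(f64_bits(a) ^ (f64_bits(b) & SIGN))


if __name__ == "__main__":
    x = -1.5
    print(fsgnj_d(x, x))    # copy of x
    print(fsgnjn_d(x, x))   # negation of x
    print(fsgnjx_d(x, x))   # absolute value of x
```

Passing the same register as both sources, as in the usage above, gives a move, a negation, and an absolute value without needing dedicated instructions.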