Instruction execution on the Mill CPU: Working at Mach 3

A perennial objection to wide-issue CPU architectures such as VLIWs and the Mill is that there is insufficient instruction level parallelism (ILP) in programs to make effective use of the available functional width. While software pipelining can reveal large quantities of ILP in loops, in open (non-loop) code studies have calculated maximum ILP in the order of two instructions per cycle (IPC), well below the capacity of even conventional VLIWs never mind super-wide architectures such as high-end Mills. The problem is that the program instructions tend to form chains connected by data dependencies, precluding executing them in parallel.

This talk addresses the ILP issue, describing how the Mill is able to achieve much higher IPC even when the nominal ILP is relatively low. The Mill is able to execute as many as six chained dependent operations in a single cycle; open code IPC numbers typically exceed nominal ILP by a factor of three. The talk will show in detail how this is achieved, and why we chose "Mach 3" as the name of the mechanism. In the course of the explanation, the talk will also introduce other operations for which the semantics of Mill execution differs from that of conventional CPUs.

Speaker: Ivan Godard, Out-of-the-box Computing

Wednesday, 02/05/14

04:15 PM

Contact:

Website: Click to Visit

Cost:

Free

Save this Event:

iCalendar
Google Calendar
Yahoo! Calendar
Windows Live Calendar

Gates Computer Science Building

Stanford University
NEC Auditorium (B3)
Stanford, CA 94305