# NoRC Project Meeting Report

#### Wei Song 23-Mar-2009

Advanced Processor Technology Group The School of Computer Science



# Content

- Avoid deadlocks in the Dynamic Link Allocation Routers
- Delay measurements of asynchronous channels



# The original design






#### The possible deadlocks








The University of Manchester

#### Deadlock avoidance in fault-free NoCs

- Restrict loops
  - Constrain the number of request lines that share a physical channel
- Separate the forward (request) and backward (ack) channels



#### The new router




# Proof of deadlock freedom

- The forward paths form an SDM network
  - The routing algorithm has loops
  - The maximal loop is bounded by the number of requests, which equals the number of channels
- The backward paths form a wormhole network
  - The routing algorithm has loops
  - The frame length is 1

- The maximum number of frames in a single router equals the number of requests
- Deadlock-free when the input buffer is large enough to hold that many frames
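
The buffer condition in the last two bullets can be written as a simple check. This is a minimal sketch rather than the router's actual logic: the function name and both parameters are hypothetical, and it assumes the buffer depth is counted in frames of length 1, as stated for the backward network.

```python
def backward_buffer_is_sufficient(num_requests: int, buffer_depth: int) -> bool:
    """Return True if an input buffer can hold every frame a router may receive.

    Assumption (taken from the bullets above): backward frames have length 1
    and a router holds at most `num_requests` of them at once, so a buffer
    with at least `num_requests` slots can always accept an incoming frame.
    """
    if num_requests < 0 or buffer_depth < 0:
        raise ValueError("counts must be non-negative")
    return buffer_depth >= num_requests
```

For example, with 4 request lines on a channel, a 4-slot buffer satisfies the condition and a 3-slot buffer does not.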

# Benefits of this modification

- Deadlock-free in any fault-free NoC.
- Supports at most (N-1)\*2 requests on a physical channel.
- Reduces the complexity of the router and network interface designs.


#### Some simple results
















#### Purpose

- Measure the latency of different asynchronous channels in sub-50 nm technology
- Show that serial channels are faster than parallel channels, and quantify by how much



- Current ANoC designs use parallel channels
  - MANGO bundled data
  - QNoC bundled data
    - synchronized 4-phase channels
  - ANoC


### Asynchronous Channels






## Channels in NoCs






#### Measurement procedure

- Cell library: Nangate 45nm Open Source Cell Library
- 32-bit parallel channel; four 8-bit serial channels
- Tool flow
  - Verilog netlist
  - Design Compiler (DC)
  - SoC Encounter
  - Calibre LVS/xRC -> HSpice netlist
  - NanoSim


## Loop Delay and Bit Energy




# Comparison in NoCs

- Compare traditional wormhole routers with routers that have 4 sub-channels
- Channel length is set to 1 mm, in line with the 45 nm technology
- The XY routing algorithm is used
- The target address is encoded in 8 bits
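
XY routing with an 8-bit target address can be sketched as follows. The slides do not say how the 8 bits are split or how ports are named, so the 4-bit X / 4-bit Y layout, the port names, and the direction convention (Y increasing toward "north") are assumptions made purely for illustration.

```python
def decode_address(addr: int) -> tuple[int, int]:
    """Split an 8-bit address into (x, y).

    Assumed layout: high nibble is x, low nibble is y.
    """
    if not 0 <= addr <= 0xFF:
        raise ValueError("address must fit in 8 bits")
    return (addr >> 4) & 0xF, addr & 0xF


def xy_route(current: int, target: int) -> str:
    """Pick the output port for one hop: correct X first, then Y."""
    cx, cy = decode_address(current)
    tx, ty = decode_address(target)
    if tx > cx:
        return "east"
    if tx < cx:
        return "west"
    if ty > cy:
        return "north"
    if ty < cy:
        return "south"
    return "local"  # the frame has arrived at its target router
```

A frame at router `0x11` heading for `0x33` first moves east until the X coordinates match, and only then turns north.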


#### The traditional wormhole router



#### Router with serial channels



# Frame Length 4-16 Bytes



# Frame Length 96-128 Bytes



# Frame Length 240-255 Bytes



# Throughput with different frame lengths

| Frame length (bytes) | Parallel   | Serial     |
|----------------------|------------|------------|
| 4-16                 | 0.25 GByte | 0.27 GByte |
| 96-128               | 0.50 GByte | 0.81 GByte |
| 240-255              | 0.45 GByte | 0.88 GByte |
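
The relative gain of serial over parallel channels follows directly from the table. The snippet below only divides the reported figures; the dictionary layout and the function name are illustrative and not part of the original measurements.

```python
# Throughput figures copied from the table above (units as reported: GByte).
# Each entry maps a frame-length range in bytes to (parallel, serial).
THROUGHPUT = {
    "4-16": (0.25, 0.27),
    "96-128": (0.50, 0.81),
    "240-255": (0.45, 0.88),
}


def serial_speedup(parallel: float, serial: float) -> float:
    """Ratio of serial-channel throughput to parallel-channel throughput."""
    return serial / parallel


for frame_length, (parallel, serial) in THROUGHPUT.items():
    print(f"{frame_length:>7} bytes: {serial_speedup(parallel, serial):.2f}x")
```

The ratio grows with frame length: about 1.08x for 4-16 byte frames, 1.62x for 96-128 byte frames, and 1.96x for 240-255 byte frames.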




# Conclusion

- By constraining the number of request lines on each physical channel and separating the forward and backward channels, the DyLAR router is made deadlock-free.
- Realistic layout, HSpice simulation, and NoC simulation show that dividing a parallel channel into several serial channels can improve throughput and reduce frame latency.