The bufferless router has emerged as an interesting option for cost-efficient network-on-chip (NoC) design. However, it works well only under low network load, because deflections occur more frequently as the injection rate increases. In this paper, we propose a load balancing bufferless deflection router (LBBDR) for NoC that relieves the effect of deflection in bufferless NoC. The proposed LBBDR employs a balance toggle identifier in the source router to control whether a flit is initially routed in the X or the Y direction. Based on this choice, the flit is subsequently routed through the network according to XY or YX routing. When two or more flits contend for the same desired output port, a priority policy called nearer-first resolves the output port allocation contention. Simulation results show that, compared with the reported bufferless routing schemes, the proposed LBBDR improves the flit deflection rate, average packet latency, and throughput by up to 13%, 10%, and 6%, respectively. Its layout area and power consumption are 12% and 7% lower, respectively, than those of the reported schemes.
With the shrinking of transistor sizes, more intellectual property (IP) cores are being integrated onto a single chip to implement more complicated system functions [1]. As the number of IP cores in a system-on-chip (SoC) continues to scale, bus-based communication and crossbar interconnection architectures often become the performance bottleneck because of growing bandwidth requirements and unpredictable wire delay [2]. To address the communication demands of future multicore systems, NoC has emerged as a solution owing to its considerable advantages, such as reusability, scalability, and parallelism in the communication infrastructure [3].
The routing policy is among the most important considerations in NoC design [4]. It has a significant impact on performance criteria such as the average latency and system throughput of the network [5]. A typical wormhole or virtual channel router for NoC commonly comprises input/output ports, buffers, routing logic, and a crossbar switch connecting input ports to output ports. The use of buffers in the router can improve bandwidth efficiency and prevent packets from being dropped. However, buffers lead to high hardware implementation overhead and increase design complexity. As mentioned in Reference , buffers in the router are responsible for 46% of router power consumption and 30% of router area. By eliminating buffers, bufferless routing emerges as a potential solution for cost-efficient NoC design.
The basic principle of bufferless routing for NoC is that all packets arriving at a router must immediately be forwarded to an adjacent router. As there are no buffers, output port contention in the router is mostly resolved by deflecting packets, an approach called bufferless deflection routing [8]. In bufferless deflection routing, deflections occur more frequently at a high network load because more packets may contend for the same output port. Each deflection sends a flit further from its destination, which incurs extra latency and power consumption. In addition, a high deflection rate may severely degrade performance because packets can become involved in livelock [9]. Thus, reducing deflection is key to improving the performance of bufferless deflection routing.
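The forwarding rule described above can be illustrated with a minimal sketch. It is not taken from any of the cited routers: the port names, the function `allocate_ports`, and the assumption that flits are already sorted by priority are all hypothetical, and the local ejection port is omitted for brevity. The sketch only shows that every incoming flit leaves in the same cycle, and that a flit losing its desired port is deflected to whatever port remains free.

```python
from typing import List, Optional

# Hypothetical port names for a 2D mesh router (local ejection port omitted).
PORTS = ["N", "E", "S", "W"]


def allocate_ports(desired: List[str]) -> List[Optional[str]]:
    """Assign one output port to every incoming flit in the same cycle.

    desired[i] is the port flit i would like to take. Flits are assumed to be
    listed in priority order, highest priority first (an assumption of this
    sketch, not a policy from the paper).
    """
    if len(desired) > len(PORTS):
        raise ValueError("a router cannot receive more flits than it has ports")

    free = list(PORTS)
    granted: List[Optional[str]] = [None] * len(desired)
    losers: List[int] = []

    # First pass: grant the desired port when it is still free.
    for i, port in enumerate(desired):
        if port in free:
            free.remove(port)
            granted[i] = port
        else:
            losers.append(i)

    # Second pass: no buffers exist, so each loser is deflected to a free port.
    for i in losers:
        granted[i] = free.pop(0)

    return granted


# Two flits want "E": the second one is deflected to the first free port.
print(allocate_ports(["E", "E", "N"]))
```

Because a mesh router has at least as many output ports as input ports, a free port always exists for a deflected flit, which is why no flit needs to be stored or dropped in this sketch.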
Two kinds of schemes are mostly discussed to address this issue for the bufferless deflection router. One reduces packet deflections by adding a few buffers, such as the approach mentioned in Reference . However, this strategy weakens the primary cost-efficiency advantage of the bufferless router. The other reduces deflections by using a priority strategy or control mechanism. Based on this idea, a location-based priority and a distributed source-throttling congestion control mechanism were proposed in References [11,12], respectively, to reduce deflection and relieve congestion for a bufferless deflection router. The experimental results showed that throughput can be improved by 12% compared with the baseline bufferless deflection router, as mentioned in Reference . To analyze the causes of deflections, three deflection models were constructed and a low-deflection bufferless router (LDBR) was proposed in Reference , which adopted a multi-channel network interface and control mechanism to address contention. However, the routers above achieve their performance improvements through complex mechanisms.
In this paper, we propose a load balancing bufferless deflection router called LBBDR for NoC. To explain the working mechanism, two typical bufferless deflection routers are introduced. Their common deficiency, design complexity, is analyzed; it conflicts with the primary advantages of the bufferless router. Building on current research results, the proposed LBBDR combines load balancing of the initial flit routing direction with an improved priority policy to simplify the design of the deflection router. To evaluate the effectiveness of LBBDR, its deflection routing performance and hardware overhead are assessed using a SystemC simulation platform and Synopsys Design Compiler with the TSMC 28 nm HPM technology. The results illustrate its superiority under low-to-medium network load compared with the reported schemes.
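As a rough illustration of the two ideas named above, the following sketch alternates the initial routing direction at the source and picks a winner between two contending flits. It is a simplified reading of the text, not the LBBDR implementation: the class `SourceToggle`, the functions `desired_port` and `nearer_first`, the coordinate convention (Y grows toward "N"), and the interpretation of nearer-first as "fewer remaining hops wins" are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple

Coord = Tuple[int, int]  # (x, y) position in a 2D mesh; Y grows toward "N" (assumption)


@dataclass
class Flit:
    dst: Coord
    x_first: bool  # True: XY routing, False: YX routing


class SourceToggle:
    """Hypothetical balance toggle: alternates XY and YX for injected flits."""

    def __init__(self) -> None:
        self.x_first = True

    def tag(self, dst: Coord) -> Flit:
        flit = Flit(dst, self.x_first)
        self.x_first = not self.x_first  # flip for the next injected flit
        return flit


def desired_port(cur: Coord, flit: Flit) -> str:
    """Return the productive output port under XY or YX dimension-order routing."""
    dx, dy = flit.dst[0] - cur[0], flit.dst[1] - cur[1]
    if dx == 0 and dy == 0:
        return "LOCAL"
    # Move in X if it is the preferred dimension, or if Y is already resolved.
    go_x = dx != 0 and (flit.x_first or dy == 0)
    if go_x:
        return "E" if dx > 0 else "W"
    return "N" if dy > 0 else "S"


def hops_left(cur: Coord, flit: Flit) -> int:
    return abs(flit.dst[0] - cur[0]) + abs(flit.dst[1] - cur[1])


def nearer_first(cur: Coord, a: Flit, b: Flit) -> Flit:
    """Assumed nearer-first policy: the flit closer to its destination wins."""
    return a if hops_left(cur, a) <= hops_left(cur, b) else b


toggle = SourceToggle()
first, second = toggle.tag((2, 2)), toggle.tag((2, 2))
print(desired_port((0, 0), first), desired_port((0, 0), second))
```

In this sketch, two consecutive flits injected toward the same destination leave the source on different ports, which is the load balancing effect the toggle is meant to provide.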
The remainder of this paper is organized as follows. Section 2 describes two typical bufferless deflection routers to explain their working mechanisms and the deficiencies of current research results. Section 3 presents the architecture of the proposed LBBDR and its algorithm. Section 4 compares output port contention in the proposed LBBDR with that in the reported bufferless deflection routers to illustrate the superiority of the former. Section 5 presents the simulation results. Section 6 draws the conclusions of this paper.