Cycle-level simulators are cycle-accurate models whose performance estimates are highly accurate compared to the real hardware. Such simulators require a lot of communication between the modules composing the simulator, which slows down the simulation.
Transaction-level simulators are mostly based on functional models and focus on the communication, which is the real bottleneck of CMP architectures. TLM simulators may be less accurate than their cycle-level counterparts, but they run much faster.
To illustrate the difference between these two levels of abstraction, let's consider, at both CLM and TLM level, a Level-2 cache performing a write back to the DRAM memory. The cache line is 256 bits wide, and the bus between the L2 cache and the DRAM is only 64 bits wide.
At cycle level, the cache line is split into four 64-bit packets that each fit on the bus, and each packet is routed separately.
The cache behaves as illustrated in the figure below:
- First, at the beginning of the cycle, the cache sends the packet to its output port.
- Second, it receives an accept signal corresponding to this data.
- Third, it sends an enable signal as a handshake for the accept signal.
- Last, at the end of the cycle, it discards the packet if it managed to send it properly (without contention).
This amounts to 4*4=16 communications for the four packets, assuming there is no contention (so no packet needs to be re-sent).
However, at cycle level a signal carries a value every cycle, even when there is nothing to communicate: in that case an explicit "nothing" value is sent. So during the same cycles the DRAM sends 4 "nothing" values to the cache module, for a total of 28 signal communications.
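The per-packet handshake described above can be sketched in a few lines of Python. This is only an illustration, not code from any simulator: the function name and the step labels are hypothetical, and the sketch assumes one packet per cycle with no contention. It counts only the four handshake steps per packet and leaves out the idle "nothing" signals coming back from the DRAM.

```python
LINE_BITS = 256  # cache line size, from the example in the text
BUS_BITS = 64    # L2-to-DRAM bus width, from the example in the text

# Hypothetical labels for the four per-packet steps listed in the text.
STEPS = ("send_data", "receive_accept", "send_enable", "drop_packet")


def cycle_level_write_back(line_bits=LINE_BITS, bus_bits=BUS_BITS):
    """Return the (cycle, step) events of a contention-free write back."""
    packets = -(-line_bits // bus_bits)  # ceiling division: packets per line
    events = []
    for cycle in range(packets):         # assumption: one packet per cycle
        for step in STEPS:
            events.append((cycle, step))
    return events


events = cycle_level_write_back()
print(len(events))  # 16
```

Each packet costs the simulator four separate module-to-module exchanges, which is where the cycle-level overhead comes from.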
At transaction level, sending the whole cache line requires only one communication, as shown in the figure below.
However, at this level the simulator cannot accurately model the contention of each packet. It usually relies on averages, or simply assumes that 4 full bus cycles are required to send the whole line.
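For comparison, a transaction-level view of the same write back might look like the sketch below. Again, this is a minimal illustration under stated assumptions: `tlm_write_back` is a hypothetical name, and the latency is the fixed "full bus cycles" estimate mentioned above, with no contention modeled.

```python
import math


def tlm_write_back(line_bits=256, bus_bits=64):
    """Model the write back as a single transaction.

    Returns (communications, estimated_bus_cycles).
    """
    communications = 1  # the whole line is passed in one call
    # Fixed latency estimate: the number of full bus cycles needed for the
    # line. Per-packet contention is not modeled at this level.
    bus_cycles = math.ceil(line_bits / bus_bits)
    return communications, bus_cycles


print(tlm_write_back())  # (1, 4)
```

The simulator performs one exchange instead of one per handshake step, at the price of replacing the per-packet timing with an estimate.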
Different techniques exist to speed up the simulation. We divide them into two categories:
Simulator-level techniques produce more efficient simulators by optimizing the simulation engine. These techniques usually rely on reducing the number of required communications, or on increasing the parallelism in the simulation engine.
Simulation-level techniques reduce the overall simulation workload, usually by simulating only a subset of the whole workload. Such techniques can drastically reduce the simulation time, but they reduce the accuracy of the overall simulation.
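The trade-off of simulation-level techniques can be illustrated with a simple periodic-sampling sketch. This is not a specific published technique: `sampled_total`, the `simulate` callback, and the per-interval cycle counts are all hypothetical, and the only assumption is that the workload can be cut into intervals that are simulated independently.

```python
def sampled_total(simulate, n_intervals, period):
    """Estimate total cycles by simulating one interval out of `period`."""
    sampled = range(0, n_intervals, period)    # intervals actually simulated
    cycles = [simulate(i) for i in sampled]
    mean = sum(cycles) / len(cycles)           # average cycles per interval
    return mean * n_intervals                  # extrapolate to the whole run


# Hypothetical cycle counts for 8 intervals; interval 3 is an outlier.
costs = [100, 120, 110, 300, 105, 115, 100, 125]

exact = sum(costs)                                               # 1075
estimate = sampled_total(lambda i: costs[i], len(costs), period=4)
print(exact, estimate)  # 1075 820.0
```

Only two of the eight intervals are simulated, so the run is roughly four times shorter, but the estimate misses the expensive interval entirely and underestimates the total.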
- Fast SystemC engine: a faster SystemC engine dedicated to cycle-level simulation.