Signal granularity
Data signal granularity is defined by the amount of data sent between connected ports. Signal granularity can vary from bit information (lowest granularity) to word information (opcodes, addresses, ...) to complex type granularity with structured information at the highest level.
Signal granularity in UNISIM
Within the UNISIM framework, the user is free to tune the level of granularity of signal data types. However, the lower the granularity, the more communications the simulator needs and the slower it runs.
It is of course possible to set up as many port pairs as there are wires in your architecture, but there is no point in separating the information carried by wires that are tied together. For instance, when the register file sends register values to the ALUs, there is no need to choose a lower granularity than one signal message per register value.
The following subsections present several levels of granularity for data signals.
Minimalistic data signal types
The idea behind the minimalistic data signal type scheme is that modules communicate by sending just the required information, as stated on the architecture block diagram. The figure below shows such a minimalistic scheme, where each module receives the appropriate information.
The modules of the figure above should behave as follows:
- The fetch module sends the fetched instruction to the register file, and may receive branch addresses corresponding to branch instructions.
- The register file module receives the full opcode of the instruction, extracts the register indexes (or immediate value) from this opcode, and provides on its output ports the corresponding register values Ra and Rb to be used by the ALU as operands.
- The ALU module sends either the computed branch address to the fetcher, the computed memory address to the data memory, or the computed value back to the register file for write-back.
- The data memory module sends back the memory value, for read requests only.
But some information is missing: how does the ALU know which operation to perform? How does the data memory know whether the received address corresponds to a read or a write?
To answer this, those modules should also receive the opcode corresponding to the incoming signal, which means adding new ports to communicate this information. Doing so, however, increases the number of communications. To avoid the resulting slowdown, it is worthwhile to send several pieces of information together, as shown in Figure \ref{fig:togesigcom}.
In the simulator represented by Figure \ref{fig:togesigcom}, we grouped all the information that each module was sending at the same time to the same target, reducing the overall number of communications. For instance, the register file module now sends both operands together with the opcode containing the operation to be performed by the ALU.
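As a rough illustration of this grouping, the sketch below bundles both operands and the operation into one message type, so that the register file sends a single message per instruction to the ALU. The names `AluInput`, `Op` and `execute`, as well as the two-operation set, are hypothetical and chosen only for this example; they are not UNISIM types.

```cpp
#include <cstdint>

// Hypothetical operation set; a real simulator would decode it from the opcode.
enum class Op { Add, Sub };

// One grouped message from the register file to the ALU:
// both operands and the operation travel together on a single port.
struct AluInput {
    Op op;
    std::uint32_t ra_val;
    std::uint32_t rb_val;
};

// The ALU learns which operation to perform from the message itself,
// without needing an extra port for the opcode.
std::uint32_t execute(const AluInput& in) {
    switch (in.op) {
        case Op::Add: return in.ra_val + in.rb_val;
        case Op::Sub: return in.ra_val - in.rb_val;
    }
    return 0; // not reached for valid operations
}
```

With separate ports, the same exchange would take three messages (Ra, Rb and the opcode) instead of one.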
However, there is a pitfall with reusability. Imagine you want to insert a new module between two existing modules in Figure \ref{fig:togesigcom}. To do so, the port types of the new module have to match the port types of the modules you are inserting it between. It will therefore not be easy to move this new module along the pipeline, as you will need to update its interface each time. This may become a serious issue for pluggability.
The homogeneous data type scheme presented below standardizes all the data types to optimize module pluggability.
Homogeneous data signal type
The homogeneous data type scheme proposes to send the same message type across the whole simulator, as shown in Figure \ref{fig:homosigcom}, providing the same interface to every module.
This new message type is built as a container of all the values that may be sent. For instance, in the simulator of Figure \ref{fig:homosigcom}, the value instr represents a container of the different pieces of information sent within the simulator: \{ operation, Ra_idx, Rb_idx, Rd_idx, Ra_val, Rb_val, Rd_val, branch@, mem@, Rd’ \}.
The fields of this container are filled as the instruction moves along the pipeline:
- The fetch module sets the operation and the register indexes Ra_idx, Rb_idx and Rd_idx by decoding the instruction.
- The register file module reads the values corresponding to the register indexes from the register bank, and sets the fields Ra_val and Rb_val.
- The ALU module computes the operation given by the operation field on the values Ra_val and Rb_val. Then, depending on the operation, the result is stored in one of Rd_val, branch@ or mem@, and the instruction is sent to the appropriate port.
- The data memory module uses the operation field to either perform a write at address mem@, or perform a read at this address and store the result in the Rd’ field.
- Finally, the register file module performs the write-back with either Rd or Rd’, depending on the incoming port.
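The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Instr` layout, the `Operation` enumerators, the 32-entry register bank and the stage functions are hypothetical names, and every operation is simplified to an addition of the two operands.

```cpp
#include <array>
#include <cstdint>

// Hypothetical operation kinds (assumption, not the UNISIM definition).
enum class Operation { Add, Load, Store, Branch };

// The single message type sent between all modules.
struct Instr {
    Operation operation{};
    std::uint32_t Ra_idx = 0, Rb_idx = 0, Rd_idx = 0;
    std::uint32_t Ra_val = 0, Rb_val = 0, Rd_val = 0;
    std::uint32_t branch_addr = 0; // "branch@" in the text
    std::uint32_t mem_addr = 0;    // "mem@" in the text
    std::uint32_t Rd_mem = 0;      // "Rd'" in the text (value read from memory)
};

// Assumed register bank size for the example.
using RegisterBank = std::array<std::uint32_t, 32>;

// Register file stage: fills Ra_val and Rb_val from the register indexes.
Instr read_registers(Instr instr, const RegisterBank& regs) {
    instr.Ra_val = regs[instr.Ra_idx];
    instr.Rb_val = regs[instr.Rb_idx];
    return instr;
}

// ALU stage: stores the result in the field that matches the operation.
// Simplification: the result is always Ra_val + Rb_val.
Instr alu(Instr instr) {
    const std::uint32_t result = instr.Ra_val + instr.Rb_val;
    switch (instr.operation) {
        case Operation::Add:    instr.Rd_val = result;      break;
        case Operation::Branch: instr.branch_addr = result; break;
        case Operation::Load:
        case Operation::Store:  instr.mem_addr = result;    break;
    }
    return instr;
}
```

Because every stage takes and returns the same `Instr` type, a new stage with the same signature can be placed between any two of them without touching their interfaces.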
It is now very easy to plug a new module between two existing modules, as the input and output port data types have been standardized. If the new module requires new information, that information is added to the instr container.
The homogeneous data type scheme is also very useful when debugging, as all the information corresponding to the current instruction is available in the instr data type. There is also no real impact on performance, as discussed in the "C++ implementation" section.
Heterogeneous data signal type
In larger systems, a single homogeneous data type may become awkward. Imagine you now want a much more complex memory hierarchy associated with your processor. Your simulator no longer has to handle only instructions moving along the pipeline, but also memory requests moving along the memory hierarchy.
It would be awkward to carry all the information corresponding to memory requests through the pipeline when only a few instructions are actually loads and stores. We therefore decided to use two different data types for such a system: one for instructions and one for memory requests.
Figure \ref{fig:hetesigcom} corresponds to such a system, with the modules of the processor pipeline on the first row and the modules of the memory hierarchy on the second row. For the purpose of this example, the system has been kept simple, with only one CPU and therefore a very simple memory bus.
As it may be useful for debugging purposes to know which instruction a memory request corresponds to, the instruction can be stored as a field of the memory request. However, the memory request does not modify the instruction. The actual update of a read instruction when the corresponding data is received is done by the load/store queue module.
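A minimal sketch of this split, under stated assumptions: `Instruction`, `MemoryRequest`, `MemCommand` and `complete_load` are hypothetical names, and the instruction is attached to the request through a pointer to const so that the memory hierarchy can inspect it but not change it. The actual UNISIM classes may be organized differently.

```cpp
#include <cstdint>

// Message type used along the processor pipeline (reduced to two fields).
struct Instruction {
    std::uint32_t opcode = 0;
    std::uint32_t Rd_val = 0;
};

enum class MemCommand { Read, Write };

// Message type used along the memory hierarchy.
struct MemoryRequest {
    MemCommand command;
    std::uint32_t address;
    std::uint32_t data;
    const Instruction* origin; // read-only link to the instruction, for debugging
};

// Hypothetical load/store queue step: the queue, not the memory request,
// updates the instruction once the read data has come back.
void complete_load(Instruction& instr, const MemoryRequest& reply) {
    if (reply.command == MemCommand::Read && reply.origin == &instr) {
        instr.Rd_val = reply.data;
    }
}
```

Keeping `origin` const makes the rule from the text explicit in the type: code that only holds a `MemoryRequest` cannot write to the instruction through it.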
C++ implementation
Using C++ classes
The signal containers presented in the previous subsections map naturally onto C++ objects, as shown in Figure \ref{fig:sigclass}, with the container fields represented as members of the class.
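#include <cstdint>

// Decoded operation kind; the enumerators depend on the simulated ISA.
enum operation_t { /* ... */ };

class instruction
{ public:
  uint32_t opcode;                        // raw instruction word
  operation_t operation;                  // decoded operation
  uint32_t Ra_index, Rb_index, Rd_index;  // register indexes
  uint32_t mem_address, branch_address;   // computed addresses
};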
We have defined in the repository a C++ class for the instruction and a C++ class for the memory request. Those signal data types are detailed in the following chapters.
Performance concern
The minimalistic data signal type scheme uses plain integers as signal data types, whereas the homogeneous and heterogeneous data type schemes use C++ objects carrying a lot of information. Sending this information from one module to another is done by copying it from the sender output port to the target input port, so copying an object may be slower than passing an integer.
However, the object-based schemes have two main benefits: they make modules more pluggable, which facilitates their reuse, and they make debugging much easier by providing additional information for understanding misbehaviors.
It is quite possible to use the object scheme with performance as good as the minimalistic scheme by changing the signal data types from objects to pointers or references to those objects. To do so, we provide a Pointer template class which behaves like a classic C++ pointer but takes care of garbage collection.
Replacing every object message with a Pointer to the object should be done at release time to build a faster production simulator, as passing objects by value is much safer and easier to understand when debugging.
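The interface of the UNISIM Pointer class is not shown here, so the sketch below uses the standard `std::shared_ptr` as a stand-in to illustrate the idea: the message becomes a small reference-counted handle, and the object is freed automatically when the last handle goes away. `Instruction`, `InstructionPtr`, `make_instruction` and `alu_stage` are hypothetical names for this example only.

```cpp
#include <cstdint>
#include <memory>

struct Instruction {
    std::uint32_t opcode = 0;
    std::uint32_t Ra_val = 0, Rb_val = 0, Rd_val = 0;
};

// Stand-in for the Pointer class (assumption): a reference-counted handle.
// Sending it between ports copies only the handle, not the Instruction.
using InstructionPtr = std::shared_ptr<Instruction>;

// Creates one instruction object on the heap and returns a handle to it.
InstructionPtr make_instruction(std::uint32_t ra, std::uint32_t rb) {
    auto instr = std::make_shared<Instruction>();
    instr->Ra_val = ra;
    instr->Rb_val = rb;
    return instr;
}

// A stage working through the handle modifies the single shared object.
void alu_stage(const InstructionPtr& instr) {
    instr->Rd_val = instr->Ra_val + instr->Rb_val;
}
```

The trade-off mentioned above shows up directly: every holder of a handle sees the modifications made by any stage, which is fast but harder to reason about than independent copies.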
Other signals
Module communication in UNISIM involves not only a data signal, but also an accept signal and an enable signal. Those two signals are used to embed control into communication, which allows the centralized control to be distributed among modules.
However, the accept and enable signals are far less complex than the data signal, as they do not carry a data type set by the user. Both are in fact boolean signals, indicating respectively whether the data should be accepted or rejected, and whether its use should be enabled or prevented.
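One way to picture these three signals is sketched below. It is an assumption-laden illustration, not the UNISIM port API: the `Connection` and `Receiver` types are hypothetical, and the sketch assumes that the receiver drives accept, the sender drives enable, and the data is used only when both are true.

```cpp
#include <cstdint>

// Hypothetical view of one connection between two ports.
struct Connection {
    std::uint32_t data = 0; // user-defined data signal (an integer here)
    bool accept = false;    // assumed to be driven by the receiver
    bool enable = false;    // assumed to be driven by the sender
};

// Assumed rule: the data takes effect only if accepted and enabled.
bool transfer_takes_effect(const Connection& c) {
    return c.accept && c.enable;
}

// Hypothetical receiving module that rejects data while it is busy.
struct Receiver {
    bool busy = false;
    std::uint32_t latched = 0;

    // Control decision made locally by the module, not by a central controller.
    void answer(Connection& c) const { c.accept = !busy; }

    // Latch the data only when the transfer takes effect.
    void commit(const Connection& c) {
        if (transfer_takes_effect(c)) {
            latched = c.data;
        }
    }
};
```

Each module decides locally whether to accept, which is the sense in which control is distributed among modules rather than centralized.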



