The next line from that song is, "Does anybody really care?". Well, the SPARC-V9 Standard Reference Model doesn't care.

First, I wanted to make the memory initialization monitor part of the view. After this refactoring, I got FSS to compile on 2009-12-19.

At this point, I wanted to find out if the monitor flags the read of uninitialized address 0 as an error (it should). On the way to answering that question, I noted that the memory view in the FSS GUI disagreed with the reference model about the contents of address 0. I'd come back to that.

Yes, the monitor flagged the read of uninitialized address 0 as an error. I wanted to commit to CVS, but the server was down (see Solaris Installation Hell; jesus is also the CVS server), so I'd continue working out of the sandbox for the time being.

So I returned to the data discrepancy. The answer is simply that the reference model has its own memory. The memory view in the FSS GUI displays the memory attached to the design under verification (DUV). The two memories are instances of the AddressSpace class. This class returns random data from uninitialized addresses when read.
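The disagreement is easy to reproduce in a minimal sketch. The class below is a hypothetical stand-in for AddressSpace, not the FSS source; the byte-wide cells, the seeded random generator, and all names are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical stand-in for the AddressSpace class (not the FSS source). */
class SketchAddressSpace {
    private final Map<Long, Byte> cells = new HashMap<>();
    private final Random random;

    SketchAddressSpace(long seed) {
        this.random = new Random(seed);
    }

    /** Stores a value, which also marks the address as initialized. */
    void write(long address, byte value) {
        cells.put(address, value);
    }

    /** Returns the stored value, or random data if the address was never written. */
    byte read(long address) {
        Byte stored = cells.get(address);
        if (stored != null) {
            return stored;
        }
        return (byte) random.nextInt(256); // assumption: a fresh random byte per read
    }

    boolean isInitialized(long address) {
        return cells.containsKey(address);
    }
}
```

With one instance attached to the DUV and a second instance inside the reference model, a read of uninitialized address 0 draws independent random data from each, so the two views need not agree.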

Now that I had confirmed that the monitor was able to recognize reads of uninitialized addresses, it was time to quiet the monitor down. The monitor needed to be disabled before power-on reset (POR), because a microprocessor in a random state may read from an uninitialized memory address, and that is not a concern.

Disabling the monitor before POR was accomplished by 2009-12-23.
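A minimal sketch of such a monitor, assuming it is told about every read and write and is switched on at POR. The class and method names here are hypothetical, not the ones in FSS.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical memory initialization monitor (names are illustrative only). */
class SketchInitializationMonitor {
    private final Set<Long> initialized = new HashSet<>();
    private final List<Long> errors = new ArrayList<>();
    private boolean enabled = false; // assumption: off until power-on reset

    void setEnabled(boolean enabled) {
        this.enabled = enabled;
    }

    /** Writes are always recorded, so memory prepared before POR counts as initialized. */
    void observeWrite(long address) {
        initialized.add(address);
    }

    /** A read of a never-written address is an error, but only while enabled. */
    void observeRead(long address) {
        if (enabled && !initialized.contains(address)) {
            errors.add(address);
        }
    }

    List<Long> errors() {
        return errors;
    }
}
```

Reads observed before `setEnabled(true)` are ignored, which is what keeps the monitor quiet while the DUV is still in a random state.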

While memory reads from a microprocessor in a random state are not a concern, writes certainly are. There is a sequence difference between the FSS environment and a real system. At power-on, a real system runs a bootstrap loader that tells the microprocessor how to find the boot device and then the operating system (from the boot device), which gets loaded into RAM. In FSS, memory is prepared before time begins to elapse. That memory includes the bootstrap code, but this code is very simple. The bootstrap loader in FSS tells the microprocessor how to find the application (in this case a test program) directly. In a real system, the operating system would load the application into RAM. In FSS, the user (in interactive mode) loads the application into simulated memory as part of the aforementioned preparation. The bootstrap loader exists in both cases, but in FSS it is simplified.

So we don't want the prepared memory contents disrupted by the random state of the DUV as simulation begins before POR.

I modified the AddressSpace class so that, when disabled (e.g. before reset), it ignores writes from the DUV. I would commit that too once jesus was returned to service.

With all that behind me, it was time to get back to work on mainstream issues leading to FSS Version_0-007 release. And that meant developing the SPARC-V9 Standard Reference Model to the point that it could produce the expected result of a load-add-store sequence.

Given such a test executable, FSS identified several illegal instructions. Also, the memory view at the end of the test displayed more data than expected (it displays only addresses with activity). I theorized that I was seeing reads of uninitialized addresses. And this test case revealed a column width issue in the memory view (i.e. a GUI issue).

The problem was that I hadn't allowed for the width of the vertical scroll bar, which appears when there is enough data that it is needed. The solution was to explicitly set the scroll pane's preferred size. That's another commit for the queue.
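In Swing terms, the fix amounts to something like the following sketch. It assumes the memory view is a JTable inside a JScrollPane, which the text above does not confirm; the class and method names are hypothetical.

```java
import java.awt.Dimension;
import javax.swing.JScrollPane;
import javax.swing.JTable;

/** Hypothetical helper; assumes the memory view is a JTable in a JScrollPane. */
final class MemoryViewSizing {
    /** Width needed so the columns stay fully visible when the scroll bar appears. */
    static int widthWithScrollBar(int contentWidth, int scrollBarWidth) {
        return contentWidth + scrollBarWidth;
    }

    /** Wraps the table and sets the scroll pane's preferred size explicitly. */
    static JScrollPane wrap(JTable table, int visibleHeight) {
        JScrollPane scrollPane = new JScrollPane(table);
        int barWidth = scrollPane.getVerticalScrollBar().getPreferredSize().width;
        int width = widthWithScrollBar(table.getPreferredSize().width, barWidth);
        scrollPane.setPreferredSize(new Dimension(width, visibleHeight));
        return scrollPane;
    }
}
```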

Back to the illegal instruction problem, I predicted that that was going to be interesting to debug.

The FSS GUI presents a tool tip with the illegal instruction indicator. The contents of the tool tip aren't very useful. Correct behavior would be a trap to an operating system. I wanted to postpone implementation of correct behavior, but I needed to keep future implementation in mind as I did what was needed at the time. It's not ideal to depend on the DUV to help debug a verification environment problem. There are multiple issues.

Issue 1: Verify that the DUV behaves correctly in the presence of an illegal instruction.
Issue 2: Debug FSS when the verification environment presents an illegal instruction that it shouldn't have.
Issue 3: Debug RTPG when it creates stimulus that it shouldn't have.

I reasoned as follows:

Issue 1 is trap verification and what I would like to postpone.

Issue 2 could involve Java debugging, but might benefit from a more useful tool tip. I'd really like to know about the first illegal instruction.

Issue 3 is probably not the case this time. The failure was observed in interactive mode, where the user provides the test program for execution. If I trust load_add_store.s, the Solaris assembler and the Solaris link editor, and the elfdump of a.out looks good, then I should look elsewhere for the root cause.

I considered stopping the simulation at the first illegal instruction, but doing so would interfere with Issue 1.

I decided to focus on Issue 2. The reference model was correctly following the instruction stream up to the first control transfer instruction (not yet implemented). Eventually, I would modify Sputnik to support traps. For Issue 2, however, it is wise to avoid modifying Sputnik. FSS's waveform viewer was the ideal tool for debugging Issue 2.

Starting with cycle 12, all instructions were illegal.

And I was getting an assertion error because of a nonzero instruction field, which came as no surprise because instructions appeared to be junk. But when I attack Issue 1, I might have to remove this assertion or at least move the check downstream so that I know the instruction is legal before checking the field.

The instruction causing the first "illegal instruction" exception was actually not yet implemented in Sputnik: STFSR (deprecated) or STXFSR. How did that get into the instruction register (IR)?

I made another simulation run to capture additional IR signals, and noted that the instruction causing the first "illegal instruction" exception had changed to Bicc, suggesting that it was random.

The address in the program counter (PC) was strange. It wasn't aligned and it was nowhere near the test program. The address happened to match the contents of address 30. These contents are the address to which I jump at the end of the bootstrap loader. These contents are supposed to be the address of the test program.

That was 2009-12-31, so I took my wife out with another couple and I sang karaoke.

Soon I discovered, with the help of jdb, that the Loader class was failing to initialize address 30. Remember that I had recently modified the AddressSpace class so that it ignores writes before reset? That was the problem.

So I changed that to enable writes initially. This allows memory initialization before simulation. As soon as that initialization is complete (and still before simulation starts), I disable writes, protecting memory from the random state of the DUV before reset. This fixed the problem with address 30.

At this point, I was relying on the code that I previously put in place to enable writes at reset. But, thinking that I might have neglected to call that code, I temporarily placed an assertion that would “fail” on such a call. The assertion never fired. Additional evidence was that the sum was never stored (recall that this is a load-add-store test). I concluded that the call was, in fact, missing.

After adding the call to the enable-writes code, there were no obvious failures. That's yet another commit for the queue.
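The resulting write-enable life cycle can be sketched as follows. This is a hypothetical illustration, not the AddressSpace source; the class names, the 64-bit cells, and the addresses and values other than address 30 are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical address space whose writes can be gated (not the FSS source). */
class GatedAddressSpace {
    private final Map<Long, Long> cells = new HashMap<>();
    private boolean writesEnabled = true; // enabled initially so memory can be prepared

    void setWritesEnabled(boolean enabled) {
        writesEnabled = enabled;
    }

    /** Silently ignores the write while writes are disabled. */
    void write(long address, long value) {
        if (writesEnabled) {
            cells.put(address, value);
        }
    }

    /** Returns null for an address that has never been written. */
    Long read(long address) {
        return cells.get(address);
    }
}

class GatedAddressSpaceDemo {
    static GatedAddressSpace run() {
        GatedAddressSpace memory = new GatedAddressSpace();
        memory.write(30L, 0x4000L);      // loader stores the jump target (value is made up)
        memory.setWritesEnabled(false);  // initialization done, simulation not yet started
        memory.write(30L, 0xDEADL);      // DUV in a random state before reset: ignored
        memory.setWritesEnabled(true);   // the call that was missing: enable writes at reset
        memory.write(0x5000L, 7L);       // the test program can now store its sum
        return memory;
    }
}
```

Leaving out the second `setWritesEnabled(true)` reproduces the symptom described above: address 30 is intact, but the sum is never stored.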

The major task that I'm currently working on toward Version_0-007 release is step 8 of 9 (run automated verification on ADDTestCases) from the list I shared with you a year ago. This can be regarded as regression testing of the verification environment, with expected failure due to the new dependency on the SPARC-V9 Standard Reference Model, which is under development.

Getting interactive mode right will be easier to do first. Will interactive mode use the reference model? I haven't committed to that, but it's the path I'm on. Anyway, I know that the reference model has no capability to run load_add_store. That's what I need to work on. To begin with, the reference model needs to be able to follow a control transfer instruction (e.g. to the test program).

On 2010-01-06, I noted that I was causing some change in the instruction flow (in response to JMPL), but that it was not the correct change. The reference model was indicating a fetch of address 30. That address contains the jump target address. What happened to the indirection?

I decided that I needed to move the assignment of the program counter register contents to the address bus from the clock method to the propagate method (see Behavioral Modeling in JHDL). This was to be a continuous assignment. But I made the assignment conditional in a way that suggests that I was not thinking clearly that day. The reference model was still fetching from address 30.

As I write this, I'm reading some nonsense from my notes, so I'll skip over some details. Somehow, I managed to stop the instruction fetch from address 30 and persuade the reference model to try to follow the jump. I was so happy that I thought I was ready to commit again.

I wrote that the reference model tried to follow the jump because the register file hadn't yet been initialized. Therefore, the jump address was being calculated from bad data. The reference model would need to understand at least some of the preceding instructions.
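To see why the register file matters, here is a minimal sketch of how a JMPL target is computed from the SPARC-V9 instruction format: r[rs1] plus either r[rs2] or the sign-extended simm13 field. The class name and the plain array standing in for the register file are hypothetical; this is not the reference model's code.

```java
/** Hypothetical helper showing the SPARC-V9 JMPL target calculation. */
final class JmplSketch {
    /**
     * @param instruction 32-bit JMPL instruction word
     * @param r           register file stand-in; the caller keeps r[0] at zero
     * @return r[rs1] + r[rs2], or r[rs1] + sign_ext(simm13) when the i bit is set
     */
    static long jumpTarget(int instruction, long[] r) {
        int rs1 = (instruction >>> 14) & 0x1F;               // bits 18:14
        boolean immediate = ((instruction >>> 13) & 1) == 1; // bit 13 (i)
        long operand2;
        if (immediate) {
            operand2 = (instruction << 19) >> 19;            // sign-extend bits 12:0
        } else {
            operand2 = r[instruction & 0x1F];                // rs2 in bits 4:0
        }
        return r[rs1] + operand2;
    }
}
```

For example, with r[15] holding 0x4000, the instruction word 0x81C3E008 (jmpl %o7 + 8, %g0) yields a target of 0x4008. If r[rs1] has never been loaded, the target is computed from whatever happens to be in the array, which is the bad-data situation described above.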

Fast-forward a few weeks. On 2010-02-01, I had a chat with Bimal Mishra (he's a good listener). I had come to confirm my suspicion that the reference model needed to be more abstract. I explained that Sputnik is a von Neumann architecture and that FSS currently only supports that architecture. It will have to become more flexible. But the immediate problem is that the reference model needs to fetch instructions every "cycle", and sometimes needs to access data as well. Sputnik has a clever way of dealing with this von Neumann bottleneck, but the reference model can't be concerned with such details. I would need to remove the concept of a "clock cycle" from the reference model's vocabulary. In other words, the reference model doesn't care what time it is.

In general, the DUV may take multiple clock cycles to execute an instruction, or it may be capable of multiple instructions per clock cycle. The reference model needs to pace itself in either case.

Dr. Mishra wanted to understand how the Harvard architecture is possible.

It's relatively easy to make the reference model independent of the clock signal. More difficult (down the road) will be to generalize the number of memory buses (at least to support one or two).

From pic24micro.com, "The Von Neumann architecture has a single storage structure to hold both instructions and data. The CPU can be either reading an instruction or reading/writing data from/to the memory because instructions and data use the same bus system. A memory interface unit is responsible for arbitrating access to the memory space between reading instructions and passing data back and forth among the processor and its internal registers. It might seem that the memory interface is a bottle neck between the processor and the memory space. In many Von Neumann architectures this is not the case because the time required to execute an instruction is normally used to fetch the next instructions."

On 2010-02-08, I renamed
com.alanfeldstein.sparc.fss.model.Sparcv9StandardReferenceModel.clock
to
com.alanfeldstein.sparc.fss.model.Sparcv9StandardReferenceModel.executeInstruction

This produced a runtime failure, which was fixed by removing the call to
byucc.jhdl.base.Structural.clockMethodIsDisabled

If there's no clock method, it makes no sense to enable or disable it.

On 2010-02-10, I simulated load-add-store with no apparent failure. The reference model was quiet, of course, because no one was calling executeInstruction.

I'm considering creating a monitor that watches the instructions that are fetched by the DUV. This information would then be relayed to the test bench (perhaps through a scoreboard), which calls executeInstruction in the reference model.

But that strategy won't work in general because some implementations don't commit all fetched instructions.
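For the record, the monitor idea can be sketched as below. Everything here is hypothetical: the interface, the class names, and in particular the assumption that executeInstruction takes the fetched instruction word as a parameter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical view of the reference model; the parameter is an assumption. */
interface SketchReferenceModel {
    void executeInstruction(int instruction);
}

/** Records what it was asked to execute, standing in for a real model. */
class RecordingReferenceModel implements SketchReferenceModel {
    final List<Integer> executed = new ArrayList<>();

    @Override
    public void executeInstruction(int instruction) {
        executed.add(instruction);
    }
}

/** The test bench paces the reference model; no clock is involved. */
class SketchTestBench {
    private final SketchReferenceModel referenceModel;

    SketchTestBench(SketchReferenceModel referenceModel) {
        this.referenceModel = referenceModel;
    }

    void instructionFetched(int instruction) {
        referenceModel.executeInstruction(instruction);
    }
}

/** Watches the DUV's instruction fetches and relays them to the test bench. */
class SketchFetchMonitor {
    private final SketchTestBench testBench;

    SketchFetchMonitor(SketchTestBench testBench) {
        this.testBench = testBench;
    }

    void onFetch(int instruction) {
        testBench.instructionFetched(instruction);
    }
}
```

The sketch has exactly the weakness just noted: every fetch drives the reference model, so a DUV that fetches an instruction without committing it would push the model out of step.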

What does the Software Requirements Specification say about all this?