How to make a non-stop microcomputer

A microcomputer is easily disrupted, and the problem cannot be avoided entirely, even with careful attention to protection.

Taking this as a given, let us consider what can be done to make the system more robust. If the microcomputer may sometimes be stopped or disrupted, it should restart as soon as possible. There is only one reliable way to restart: a reset. Without a reset, some failures cannot be handled.

Watch-dog

Almost all microcomputers are equipped with a watch-dog timer that can generate a system reset. The watch-dog is a timer which, when a preset time interval expires, resets the CPU, peripherals, and other parts of the system. The programmer initializes the watch-dog timer with a time value, then periodically (before the interval has expired) re-initializes it so that the interval starts over. If the timer is not re-initialized in time, the CPU is presumed to be malfunctioning, and the watch-dog timer, if enabled, resets the microcomputer and initializes the system.
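The behaviour described above can be illustrated with a small software model. This is only a sketch: real watch-dog timers are controlled through device-specific registers, and the names used here (watchdog_t, wdt_init, wdt_kick, wdt_tick) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a watch-dog timer (not a real device API). */
typedef struct {
    uint32_t reload;    /* programmed interval, in ticks */
    uint32_t remaining; /* ticks left before the reset fires */
    bool     enabled;   /* a disabled watch-dog never resets */
    unsigned resets;    /* number of system resets generated so far */
} watchdog_t;

/* Initialize the timer with a time value and enable it. */
void wdt_init(watchdog_t *w, uint32_t interval_ticks) {
    assert(interval_ticks > 0);
    w->reload = interval_ticks;
    w->remaining = interval_ticks;
    w->enabled = true;
    w->resets = 0;
}

/* Re-initialize the timer so that the interval starts over. */
void wdt_kick(watchdog_t *w) {
    w->remaining = w->reload;
}

/* Advance time by one tick; returns true if the interval expired
   and a system reset was generated. */
bool wdt_tick(watchdog_t *w) {
    if (!w->enabled) {
        return false;
    }
    if (--w->remaining == 0) {
        w->resets++;
        w->remaining = w->reload; /* counting starts again after the reset */
        return true;
    }
    return false;
}
```

In this model, an application that calls wdt_kick at least once per interval never sees a reset, while one that stops calling it is reset when the interval expires.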

However, there are many cases in which the watch-dog timer keeps being re-initialized, and so never resets the microcomputer, even though the microcomputer is malfunctioning. Because the re-initialization is placed in the code by the programmer, some failure cases can be missed, depending on his or her skill and care. Clearly, a programmer cannot predict every possible problem or be certain that every failure case is handled.

Furthermore, in some types of systems, a system reset generated by the watch-dog could cause a different kind of accident in certain peripherals, such as motor drivers, for example through a sudden motor stop. Valuable information held in the system’s RAM would also be lost.

New Technology

FUJIMI High Resilience technology is a solution to the above symptoms and problems.

[Block diagram of a FUJIMI-enhanced microcomputer]

Basic implementation in a microcomputer

A FUJIMI microcomputer differs from a conventional one in an important way: it has two separate reset signals, (1) total system reset and (2) CPU-only reset. Total system reset is common to all microcomputers. CPU-only reset is a unique feature of a FUJIMI-enhanced microcomputer. In addition, one highest-priority interrupt (NMI) is provided alongside the microcomputer’s existing interrupts.

A FUJIMI-enhanced microcomputer also includes a dedicated FUJIMI timer, which can be set to a time interval. Each time the timer expires, it first generates the highest-priority interrupt (NMI) and then, after another programmable time interval, resets the CPU only.

Importantly, this CPU-only reset must not affect any other part of the microcomputer or system.

Above is a drawing (block diagram) of such a microcomputer.

The part shown in green is the dedicated FUJIMI timer, which can generate the highest priority (NMI) interrupt and CPU-only reset in sequence.

The total system reset is the conventional reset and is also present.
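The timer sequence described above (NMI first, then CPU-only reset after a second programmable interval) can be modelled as a simple tick counter. The sketch below is an illustration only: the type and function names are hypothetical, and it assumes the timer is free-running, so that a new cycle starts as soon as the CPU-only reset has been issued.

```c
#include <stdint.h>

/* Events the modelled FUJIMI timer can raise on a given tick. */
typedef enum { EV_NONE, EV_NMI, EV_CPU_RESET } ft_event_t;

/* Hypothetical model of the dedicated timer (not the actual hardware). */
typedef struct {
    uint32_t period;      /* ticks from cycle start to the NMI */
    uint32_t reset_delay; /* ticks from the NMI to the CPU-only reset */
    uint32_t count;       /* ticks elapsed in the current cycle */
} fujimi_timer_t;

/* Both intervals are assumed to be greater than zero. */
void ft_init(fujimi_timer_t *t, uint32_t period, uint32_t reset_delay) {
    t->period = period;
    t->reset_delay = reset_delay;
    t->count = 0;
}

/* Advance one tick and report which signal, if any, is generated.
   Only the CPU is affected by EV_CPU_RESET; peripherals and RAM are
   outside this model and are left untouched. */
ft_event_t ft_tick(fujimi_timer_t *t) {
    t->count++;
    if (t->count == t->period) {
        return EV_NMI;
    }
    if (t->count == t->period + t->reset_delay) {
        t->count = 0; /* assumed: the next cycle starts immediately */
        return EV_CPU_RESET;
    }
    return EV_NONE;
}
```

With a period of 3 ticks and a reset delay of 2 ticks, this model raises the NMI on tick 3 and the CPU-only reset on tick 5, then repeats.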

FUJIMI software

A FUJIMI-based system built on the above hardware also needs suitable software to support its operation.

After initialization, the FUJIMI timer is set to a period of time and the user application begins. When the period elapses, the FUJIMI timer generates the highest-priority interrupt (NMI). The time between this interrupt and the CPU-only reset is programmable but limited, so the NMI routine must be certain to complete within this interval, with adequate margin.

Furthermore, the highest-priority interrupt (NMI) may be generated spuriously by noise. The interrupt routine should therefore be written to tolerate overlapping interrupts. Normally, the routine can detect the second interrupt and discard or ignore it.
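One possible way to detect an overlapping NMI is sketched below. It is an assumption-laden illustration, not the FUJIMI implementation: the address range of the NMI routine is made up, and the rule used here (an invocation is an overlap only if the in-progress flag is set and the saved program counter points inside the NMI routine) is one plausible reading of the check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical address range occupied by the NMI routine itself. */
#define NMI_ROUTINE_START 0x8000u
#define NMI_ROUTINE_END   0x80FFu

typedef enum { NMI_NORMAL, NMI_OVERLAP } nmi_kind_t;

/* Classify an NMI invocation from the in-progress flag and the
   program counter that the CPU saved on the stack. */
nmi_kind_t nmi_classify(bool processing_flag, uint32_t saved_pc) {
    bool pc_in_routine = saved_pc >= NMI_ROUTINE_START &&
                         saved_pc <= NMI_ROUTINE_END;
    /* Assumed rule: both indicators must agree. A stale flag alone
       does not cause later interrupts to be discarded. */
    if (processing_flag && pc_in_routine) {
        return NMI_OVERLAP;
    }
    return NMI_NORMAL;
}
```

A routine using this check would return immediately when the result is NMI_OVERLAP, leaving the first invocation to finish its work.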

Software Protocol

  1. First, the NMI interrupt routine must check whether this invocation is a normal one or an overlapping (spurious) one. If the interrupt occurs while the first interrupt routine is still running, the second invocation should exit as soon as possible. This determination can be made by examining a flag and the value of the program counter (PC) saved on the stack.
  2. Second, this routine must check whether the application software is executing normally. This can be done by checking the saved PC value and the stack pointer value. By this check, over 90% of software violations can be detected.
  3. Once it is confirmed that there was no software violation in the running application software when the interrupt occurred, this interrupt routine sets a Processing flag, then moves all register contents to the stack. Once all CPU information has been pushed onto the stack, all of the stack contents are copied to two dedicated RAM areas. This guards against a violation occurring after this step.
  4. After all of the stack is copied, this routine sets a Save-OK flag. Then, it enters a wait mode, waiting for the CPU core reset to occur.
  5. When the CPU core reset is issued, the reset routine does the following:
    1. Check whether RAM contents are initialized and flags in RAM are set. If not, this means the total system has just powered up, or RAM contents have been corrupted by noise. In this case, the routine should initialize the total system, including peripherals and parameters in RAM.
    2. If RAM contents seem to be okay, then, examine the “Save-OK” flag. If this flag is valid, it means that the system is executing normally. After any optional tasks, the routine can return from this interrupt based on the saved CPU information.
    3. If the “Save-OK” flag is not valid, this means that the interrupt routine found some problem or the interrupt routine did not run. In these cases, the system is malfunctioning. Once the system condition is identified as abnormal, the routine can implement a predefined recovery method.
    4. In a FUJIMI system, the most appropriate recovery action can be implemented in software and thus, a wide choice of strategies is available. This is why the FUJIMI system is called a “highly resilient system”.
  6. Options: If the reset routine has determined that the system is running normally, the programmer can optionally perform traditional “Real Time Interrupt” functions. The FUJIMI interrupt-reset happens with a regular period, similarly to traditional Real Time Interrupts, and thus the two can be merged. Furthermore, in the FUJIMI standard, this reset routine can restore the peripheral settings on each reset. This way, any error condition in the peripherals can be cleared within a shorter time.

Below are flow charts illustrating the FUJIMI protocols described above.
[Flow charts of the FUJIMI protocols]
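Steps 2 to 5 of the software protocol can also be sketched in C. This is a simplified model under stated assumptions, not the FUJIMI reference code: the CPU context is reduced to a PC, a stack pointer, and four registers; the address ranges, the magic value that marks RAM as initialized, and all names are hypothetical; and comparing the two saved copies on reset is an assumed use of the two dedicated RAM areas.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants, chosen only for this illustration. */
#define RAM_MAGIC  0x46554A49u /* marks retained RAM as initialized */
#define APP_START  0x1000u     /* valid range for the application PC */
#define APP_END    0x1FFFu
#define STACK_LOW  0x2000u     /* valid range for the stack pointer */
#define STACK_HIGH 0x20FFu

typedef struct { uint32_t pc, sp, regs[4]; } cpu_context_t;

/* RAM that survives a CPU-only reset. */
typedef struct {
    uint32_t      magic;
    bool          processing; /* "Processing" flag (step 3) */
    bool          save_ok;    /* "Save-OK" flag (step 4) */
    cpu_context_t copy_a;     /* two dedicated save areas */
    cpu_context_t copy_b;
} retained_ram_t;

typedef enum { ACT_COLD_INIT, ACT_RESUME, ACT_RECOVER } reset_action_t;

/* Step 2: check the saved PC and stack pointer. */
static bool context_plausible(const cpu_context_t *c) {
    return c->pc >= APP_START && c->pc <= APP_END &&
           c->sp >= STACK_LOW && c->sp <= STACK_HIGH;
}

/* Steps 2 to 4, run from the NMI routine. */
void nmi_save(retained_ram_t *ram, const cpu_context_t *ctx) {
    ram->save_ok = false;
    if (!context_plausible(ctx)) {
        return; /* Save-OK stays clear, so the reset routine recovers */
    }
    ram->processing = true;
    ram->copy_a = *ctx;
    ram->copy_b = *ctx;
    ram->save_ok = true;
    /* A real routine would now wait for the CPU-only reset. */
}

/* Step 5, run after the CPU-only reset. */
reset_action_t on_cpu_reset(retained_ram_t *ram, cpu_context_t *out) {
    if (ram->magic != RAM_MAGIC) {
        /* Power-up or corrupted RAM: initialize everything. */
        memset(ram, 0, sizeof *ram);
        ram->magic = RAM_MAGIC;
        return ACT_COLD_INIT;
    }
    bool ok = ram->save_ok &&
              memcmp(&ram->copy_a, &ram->copy_b, sizeof ram->copy_a) == 0;
    /* Clear the flags so that a missing NMI is detected next time. */
    ram->save_ok = false;
    ram->processing = false;
    if (!ok) {
        return ACT_RECOVER; /* run the predefined recovery method */
    }
    *out = ram->copy_a;     /* context to resume the application from */
    return ACT_RESUME;
}
```

In this model, a reset that follows a successful nmi_save resumes the application, while a reset with no preceding save, or with an implausible saved context, selects the recovery path. The choice of recovery action is left to the software, as described in step 5.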