Analysis and solutions for 6 common failures in DCS maintenance

DCS maintenance covers many complicated items, so when a failure occurs it is easy to be left without a clue. Troubleshooting of common faults can start from the following 6 directions.

1. Communication network failure

Communication network faults generally occur on the node bus or the local bus, or are caused by address identification errors.

Node bus failure

The transmission medium of the node bus is generally coaxial cable. Some systems use token passing, while others use a contention bus with carrier sense multiple access and collision detection to send and receive signals. Whichever method is used, a break in any part of the bus trunk causes communication failures for all stations on the bus and their subordinate devices.

At present, the usual way to guard against such failures is a dual redundant configuration, so that the failure of one bus does not affect the whole system. This does not fundamentally prevent the failure, however, and once one bus has failed, careless handling during the repair can very easily bring down the other bus, with very serious consequences. The effective approach is to prevent poor contact or open circuits on the bus in the first place.

A more successful design is a node bus arrangement in which the coaxial cable is connected at the back of the communication module rather than at the front. When a communication module fails while the system is running, this avoids accidentally touching the coaxial cable and breaking the network connection. The coaxial cable is also left untouched at all times except during dedicated inspections, which prevents its plug from loosening through repeated plugging and unplugging and so reduces the chance of failure. In addition, a coaxial cable inspection and replacement management system should be established, so that the cable is replaced or treated before rising contact resistance begins to affect communication.
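One way to act before contact resistance affects communication is to trend the periodic inspection readings. The sketch below is only an illustration: it assumes equally spaced inspections, a roughly linear rise, and a hypothetical limit value; the function name and the figures are not taken from any particular DCS.

```python
def inspections_until_limit(readings_ohm, limit_ohm):
    """Estimate how many inspection intervals remain before the limit is reached.

    Fits a least-squares line through equally spaced readings and extrapolates.
    Returns 0.0 if the latest reading is already at or above the limit, and
    None if the trend is flat or falling.
    """
    n = len(readings_ohm)
    if n < 2:
        raise ValueError("need at least two readings")
    if readings_ohm[-1] >= limit_ohm:
        return 0.0

    mean_x = (n - 1) / 2
    mean_y = sum(readings_ohm) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings_ohm))
    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing = (limit_ohm - intercept) / slope   # inspection index where the line hits the limit
    return max(crossing - (n - 1), 0.0)


# Hypothetical readings from four consecutive inspections and a hypothetical limit.
remaining = inspections_until_limit([0.010, 0.012, 0.014, 0.016], limit_ohm=0.020)
print(remaining)  # about 2 inspection intervals left
```

A small remaining margin would be the cue to schedule replacement or treatment of the connector at the next planned inspection rather than waiting for communication errors.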

Local bus failure

The local bus, or field bus, is generally a data communication network built on twisted-pair wiring. Because the connected equipment consists of primary components or control equipment directly connected to the production process, the working environment is harsh and the failure rate is high. The bus is also vulnerable to misoperation by maintenance personnel, which affects the production process. In addition, the bus itself can cause communication failures for various reasons.

The effective measures to prevent such failures are as follows. First, handle the connection points between the local bus and the field equipment properly, so that removing a device does not affect the normal operation of the bus. Second, install bus branches where they are not easily touched. In addition, it is best to make the bus dual redundant to improve communication reliability.

Address identification error

Whether on a field component or a bus interface, a wrong address identification will inevitably disrupt the communication network. The address identification of every component must therefore be kept correct and protected against human misoperation and erroneous modification. System expansion should generally be carried out while the system is shut down. Especially for systems that use token communication, any addition to or removal from the configuration must be released to the network only when the system is out of service, to avoid unpredictable consequences.
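Address conflicts can be checked offline before a configuration is released to the network. The following is a minimal sketch, assuming the configuration can be exported as (station name, address) pairs; the station names, the export format, and the valid address range are all hypothetical.

```python
from collections import defaultdict


def find_address_conflicts(nodes, valid_range=(1, 64)):
    """Return (duplicates, out_of_range) for a list of (name, address) pairs.

    duplicates maps each address used more than once to the stations using it.
    out_of_range lists stations whose address lies outside valid_range.
    The default valid_range is a placeholder, not a real system limit.
    """
    by_address = defaultdict(list)
    for name, address in nodes:
        by_address[address].append(name)

    low, high = valid_range
    duplicates = {addr: names for addr, names in by_address.items() if len(names) > 1}
    out_of_range = sorted(name for name, addr in nodes if not low <= addr <= high)
    return duplicates, out_of_range


# Hypothetical configuration export: (station name, bus address)
nodes = [("OPS-1", 1), ("OPS-2", 2), ("CP-A", 10), ("CP-B", 10), ("FBM-7", 99)]
duplicates, out_of_range = find_address_conflicts(nodes)
print(duplicates)    # {10: ['CP-A', 'CP-B']}
print(out_of_range)  # ['FBM-7']
```

Running such a check while the system is out of service, before the change is released, catches the simplest class of address errors without touching the live network.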

2. Hardware failure

According to the function of the hardware, DCS faults can be divided into man-machine interface faults and process channel faults. The man-machine interface mainly refers to the engineer station, operator station, printer, keyboard, mouse, and other devices used for human-machine interaction. The process channel mainly refers to the local bus, I/O channels, process processors, primary components, control equipment, and so on. The man-machine interface consists of multiple workstations with the same function, so when one of them fails, the monitoring and operation of the system is generally not affected as long as the fault is handled promptly. A process channel failure on the local bus or primary equipment, by contrast, directly affects the control or detection function, so its consequences are more serious.

Man-machine interface failure

Common man-machine interface failures include mouse operation failure, control operation failure, operator station crashes, abnormal membrane keyboard function, and a printer that does not work. Abnormal mouse operation is generally caused by aging or contamination of the internal mechanical parts after long use, so that the contacts no longer make or break reliably, or by a cable that is not firmly plugged in and so cannot communicate with the host. In this case, check the cable connection and replace the mouse if necessary.

A control operation fails when the mouse operation signal cannot change the state of the process channel. On the one hand, the hardware of the process channel itself may be faulty; on the other hand, the cause may be a software defect in the operator station, which stops responding when the equipment is overloaded or too many process windows are open. After confirming that the process channel is functioning normally, check the operator station and, if necessary, restart it to reinitialize it.

An operator station can crash for many reasons: a hard disk or card failure, a defect in the software itself, a cooling fan failure that overheats the host, or overload. First check the temperature rise of the host, then use the substitution method to check the hard disk, host cards, and so on to locate the faulty part.

Most operator stations use a membrane keyboard. Its main function is to call up process graphics quickly so that operators can monitor process parameters. Its function can become abnormal when the membrane keyboard is configured incorrectly, the keys make poor contact, the signal cable is loose, or a key is pressed by mistake while the host is starting and the startup is left incomplete. Each case should be handled according to its cause.

A printer that does not work is generally a configuration problem. The printing function is also disabled once the printer has been masked out in the system. In addition, a hardware failure in the printer itself can make some or all of its functions abnormal. Recheck the printer's settings and hardware and deal with the problem accordingly.

Process channel failure

The most common process channel faults are card failures and local bus failures. One cause is that the card has been in service for a long time and its components have aged or been damaged. Grounding of external signals, or high-voltage signals entering the card, can also cause channel failure. Cards now generally have good isolation measures, so under normal circumstances the fault does not spread, but once such a fault occurs it directly makes the process control or monitoring function abnormal. The cause of the failure must therefore be found promptly and the card replaced in time.

A failure of a primary component or control equipment sometimes cannot be discovered directly by the operator, and attracts attention only when parameters become abnormal or an alarm is raised. A failure of the control processor (process processor) generally generates an alarm immediately and draws the operator's attention. Control processors now mostly use a 1:1 redundant configuration, so the failure of one of them does not cause serious consequences, but the faulty unit should still be dealt with immediately. During the repair, the healthy processor must not be disturbed by mistake, otherwise serious consequences will follow.

3. Man-made failure

Maintenance or troubleshooting of the system sometimes leads to misoperation, which can happen both to personnel who perform system maintenance frequently and to those who are new to system overhaul and maintenance. Misoperation is most likely when modifying control logic, downloading software, restarting equipment, or forcing equipment or protection signals. In mild cases it makes some measuring points and equipment abnormal; in severe cases it trips the unit or its main auxiliary equipment, with very serious consequences. In chemical plants that use DCS, faults caused by human misoperation account for a large proportion of unsafe incidents.

4. Power failure

The power supply is also a source of many problems. The backup power supply may fail to cut in automatically; an unreasonable fuse configuration or an internal fault in the power supply may interrupt power; fluctuation of the regulated power supply may cause protection to malfunction; and poor contact at a plug may leave the regulated power supply with no output. In some systems, all input signals of an entire cabinet are fed through a single fuse, or a single power supply carries a large external load, and the control power supply is neither paralleled nor redundant.

5. SOE is not working properly

SOE stands for sequence of events. When power equipment has a remote-signaling status change, such as a switch changing position, the power protection device or smart power meter automatically records the time of the change, its cause, and the corresponding telemetry values at the moment the switch trips (such as the three-phase currents and active power), forming an SOE record for later analysis. Many relay protection devices and smart power meters, such as power protection meters from GE Power, Schneider Electric, ABB, Siemens, and other manufacturers, as well as dedicated power RTU equipment, have SOE recording functions.

SOE records play a very important role in analyzing and judging accidents, but in practice, in many power plants the SOE either fails to record a protection action or records a time that does not match the actual situation. For example, Unit 1 of one power plant experienced SOE recall times that did not correspond to the actual trip time, SOE times that could not be retrieved after printing and browsing, a first trip cause that did not appear first in the time sequence, and SOE time sequence data that could not be set. Other power plants found, while analyzing several accidents, that the time sequence in the SOE record deviated from the time sequence in the historical trend, and sometimes the order was even reversed. This shows up as an inconsistency between the historical trend and the SOE occurrence time for the same point, and the deviation is sometimes very large. This delays accident analysis and can even mislead its direction. The SOE problem is related not only to unreasonable system design, in which the SOE points are not all concentrated in one place, but also to poorly considered system hardware and software.
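The two symptoms described above, a large time deviation for the same point and a reversed event order, can be checked mechanically once both records are exported. The sketch below assumes that SOE and historian timestamps are available as seconds keyed by point name; the point names, the data layout, and the 0.5 s tolerance are hypothetical.

```python
def check_soe_against_history(soe, history, tolerance_s=0.5):
    """Compare SOE timestamps with historian timestamps for the same points.

    soe and history map a point name to a timestamp in seconds.
    Returns (deviations, reversed_pairs):
      deviations     - points whose two timestamps differ by more than tolerance_s
      reversed_pairs - pairs of points ordered one way in the SOE record and the
                       opposite way in the historian
    """
    common = [p for p in soe if p in history]
    deviations = {
        p: round(soe[p] - history[p], 3)
        for p in common
        if abs(soe[p] - history[p]) > tolerance_s
    }

    soe_order = sorted(common, key=lambda p: soe[p])
    reversed_pairs = []
    for i, earlier in enumerate(soe_order):
        for later in soe_order[i + 1:]:
            if history[later] < history[earlier]:
                reversed_pairs.append((earlier, later))
    return deviations, reversed_pairs


# Hypothetical trip sequence, in seconds from an arbitrary reference.
soe = {"MFT": 100.000, "TURBINE_TRIP": 100.120, "GEN_BREAKER_OPEN": 100.300}
history = {"MFT": 100.900, "TURBINE_TRIP": 100.200, "GEN_BREAKER_OPEN": 100.400}

deviations, reversed_pairs = check_soe_against_history(soe, history)
print(deviations)      # {'MFT': -0.9}
print(reversed_pairs)  # [('MFT', 'TURBINE_TRIP'), ('MFT', 'GEN_BREAKER_OPEN')]
```

A check like this does not say which record is correct; it only flags the points where the two sources disagree so that the first trip cause can be examined more carefully.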

6. Failure caused by interference

There are also many examples of failures caused by interference. The interference may come from the system itself or from the external environment. Different systems have strict grounding requirements, and once the grounding resistance or grounding method fails to meet them, network communication efficiency drops or the likelihood of bit errors increases. In mild cases this makes some functions abnormal; in severe cases it paralyzes the network.

Power quality also affects the stable operation of the system. The power supply must not only keep the voltage stable, but also switch to the other supply without disturbance when one supply fails; otherwise the switchover interferes with system operation. Switching between the main and standby process control processors can also sometimes cause interference. In addition, high-power radio communication equipment such as mobile phones and walkie-talkies can easily cause interference and endanger the operation of the system.
