Spooler unable to reset out-queue

Article ID: 92998

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

These spooler errors started appearing at around the same time on six different servers where the robot is installed.
 
Apr 25 06:00:20:402 [8880] spooler: QueueAdmin - expire old messages
Apr 25 06:41:08:288 [8880] spooler: rdbReset - open (q2.rdb) failed (Invalid argument)
Apr 25 06:41:08:288 [8880] spooler: FlushMessages - unable to reset out-queue
Apr 25 06:41:08:293 [8880] spooler: FlushMessages - aborting
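
When the same errors show up on several robots, it can help to pull the failing entries out of each spooler log and compare timestamps. The following is a minimal sketch that assumes the log lines have exactly the layout shown in the excerpt above; the names `LINE_RE` and `find_queue_errors` are made up for this illustration and are not part of UIM.

```python
import re

# Assumed layout, taken from the excerpt above:
#   Apr 25 06:41:08:288 [8880] spooler: FlushMessages - unable to reset out-queue
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}:\d{3}) "
    r"\[(?P<pid>\d+)\] spooler: (?P<func>\w+) - (?P<msg>.*)$"
)


def find_queue_errors(lines):
    """Return (timestamp, function, message) for lines that report a failure."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        # "failed" and "unable" are the keywords seen in the excerpt;
        # other failure wording would need to be added here.
        if m and ("failed" in m["msg"] or "unable" in m["msg"]):
            hits.append((m["ts"], m["func"], m["msg"]))
    return hits
```

Passing the lines of a robot's spooler log to `find_queue_errors` returns only the `rdbReset` and `FlushMessages` failures, which makes it easier to see whether the robots started failing at the same moment.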

I have checked the firewalls, file permissions, and disk space, and they all seem to be fine. I have also deleted both of the .rdb files. Everything has been reset or restarted at least once in an attempt to fix the problem.

I have also updated the robot to the latest version, 7.93, though the problem originally occurred on 7.80.

Environment

- UIM 8.5.1, 9.x
- Robot/spooler 7.80 and 7.93

Cause

This alarm is published when the spooler probe fails to create a new queue file (q1.rdb/q2.rdb).

Resolution

These alarms are usually caused by an environmental issue that prevents the spooler probe from accessing one of its queue files. The out-queue corresponds to the q2.rdb file. The spooler does not clear this alarm itself, even after the condition that triggered it has been resolved on the robot. If you continue to receive alarms and QoS from this robot, and the \Nimsoft\robot\q1.rdb file is not continuously growing in size, then the problem is intermittent and most likely environmental.
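
One way to judge whether q1.rdb is continuously growing is to sample its size a few times and compare the readings. The sketch below is only an illustration: the function names, the sample count, and the interval are assumptions, and "growing" is taken to mean that the size never shrinks between samples and ends larger than it started.

```python
import os
import time


def sample_sizes(path, samples=5, interval=60.0):
    """Record the size of `path` `samples` times, `interval` seconds apart."""
    sizes = []
    for i in range(samples):
        # A missing file is recorded as size 0.
        sizes.append(os.path.getsize(path) if os.path.exists(path) else 0)
        if i < samples - 1:
            time.sleep(interval)
    return sizes


def is_steadily_growing(sizes):
    """True if the size never shrinks between samples and ends larger than it began."""
    if len(sizes) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(sizes, sizes[1:]))
    return never_shrinks and sizes[-1] > sizes[0]
```

For example, `is_steadily_growing(sample_sizes(path_to_q1))`, with `path_to_q1` pointing at the q1.rdb file in the robot directory, gives a rough yes/no answer after about four minutes with the default settings.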

The system administrator of the robot machine should check for these common causes of the alarm:
  • The robot disk volume is out of space, or the drive is read-only
  • The robot does not have the proper permissions
  • Two spooler services are running
  • Antivirus software is scanning or locking files in the Nimsoft directory, which prevents the spooler service from functioning correctly (you must create a full exception for all UIM/Nimsoft programs on the robot)
  • There is a file system problem on the system where the robot is installed
  • Backup software is locking the files
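
Some of the checks in the list above (free disk space, whether the directory is writable, whether the queue files can be opened) can be scripted. The following is a rough sketch, not a UIM tool: `check_robot_dir` and its 100 MB threshold are assumptions chosen for illustration, and the script only reflects the permissions of the account that runs it, so it should be run as the same account as the robot.

```python
import os
import shutil
import tempfile


def check_robot_dir(robot_dir, min_free_bytes=100 * 1024 * 1024):
    """Return a list of human-readable problems found in robot_dir."""
    problems = []

    # 1. Free space on the volume that holds the robot directory.
    free = shutil.disk_usage(robot_dir).free
    if free < min_free_bytes:
        problems.append(f"low disk space: {free} bytes free")

    # 2. Can a new file be created here (read-only drive, permissions)?
    try:
        with tempfile.TemporaryFile(dir=robot_dir):
            pass
    except OSError as exc:
        problems.append(f"cannot create files in {robot_dir}: {exc}")

    # 3. Can the existing queue files be opened for read/write?
    #    Nothing is written; a lock held by another program may surface here.
    for name in ("q1.rdb", "q2.rdb"):
        path = os.path.join(robot_dir, name)
        if not os.path.exists(path):
            continue
        try:
            with open(path, "r+b"):
                pass
        except OSError as exc:
            problems.append(f"cannot open {name} for read/write: {exc}")

    return problems
```

An empty list means none of these three checks failed. It does not rule out the other causes in the list, such as a second spooler service or a file system fault.
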

On the other hand, if you are not receiving any messages from the robot generating the alarm and the q1.rdb file is growing in size, the q2.rdb file may be corrupted. To look into this further:

1. Stop the Nimsoft Robot Watcher service on the robot
2. Save a copy of the q2.rdb file, then remove the original from the \Nimsoft\robot directory
3. Start the Nimsoft Robot Watcher service

The spooler will create a new q2.rdb file, and if the problem was due to a corrupted q2.rdb file, messages should start flowing again.
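
Step 2 can be scripted so that the archived copy always gets a unique name. This is a minimal sketch under stated assumptions: `archive_queue_file` is a hypothetical helper, the archive directory is whatever location you choose outside the robot directory, and the Nimsoft Robot Watcher service must still be stopped beforehand and started afterwards as described in steps 1 and 3.

```python
import os
import shutil
import time


def archive_queue_file(robot_dir, archive_dir, name="q2.rdb"):
    """Move robot_dir/name into archive_dir under a timestamped name.

    Returns the new path, or None if the queue file does not exist.
    Run this only while the robot is stopped.
    """
    src = os.path.join(robot_dir, name)
    if not os.path.exists(src):
        return None
    os.makedirs(archive_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = os.path.join(archive_dir, f"{name}.{stamp}.bak")
    # Moving (rather than copying) saves the file and removes the original in one step.
    shutil.move(src, dst)
    return dst
```

Keeping the archive outside the robot directory leaves nothing behind that the spooler could pick up on restart, and the returned path identifies the file to analyze later.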

Analysis of the archived q2.rdb file may reveal corrupt data, such as illegal characters, that was disrupting the normal data flow.
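
A rough first look at the archived file can be scripted by listing bytes that fall outside printable ASCII. This is only a sketch with a loud assumption: this article does not describe the .rdb file format, so if the format legitimately contains binary data, many of the reported offsets will be normal. The function name `find_nonprintable` is made up for this illustration.

```python
def find_nonprintable(data, allowed=b"\t\r\n"):
    """Return (offset, byte value) pairs for bytes outside printable ASCII.

    `data` is a bytes object; tab, carriage return and newline are
    treated as acceptable by default.
    """
    return [
        (i, b)
        for i, b in enumerate(data)
        if not (0x20 <= b <= 0x7E or b in allowed)
    ]
```

Reading the archived copy with `open(path, "rb").read()` and passing the result to `find_nonprintable` yields the offsets to inspect in a hex viewer.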

Additional Information

Additionally, robot machines that generated spooler alarms may be:

  • Unreachable via ping
  • Decommissioned (non-existent / unknown)
  • Moved to a test environment
  • Running host intrusion protection (IPS) software (e.g., installed in /opt/Symantec)

If the problem is not seen at a higher loglevel and there is no impact that the client can tell, these messages can be ignored.