Scheduled jobs not processed by the job scheduler (BG service Issue)

Article ID: 49034

Products

Clarity PPM On Premise, Clarity PPM SaaS

Issue/Introduction

One or more Clarity jobs are stuck in the Waiting or Scheduled status, or remain in the Processing status indefinitely.

Examples of jobs that can be impacted:

  • Create and Update Jaspersoft Users job
  • Datamart Rollup job
  • Index contents and documents for searches job
  • Tomcat access log import/analyze job
  • Synchronize Jaspersoft roles job
  • Time Slicing Sync job
  • Send Action Item Reminders job
  • Update Aggregated Data job
  • Load Data Warehouse (DWH) job (Incremental or full)
  • Time Slicing job
  • Datamart Extraction job

Other jobs of different types may or may not be impacted.
The scheduled time shown is stuck in the past. Pausing and resuming the jobs does not work.
Jobs that are run immediately get stuck in the Scheduled state.
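
To get a quick picture of how many job instances sit in each of these states, a read-only count along the following lines may help. This is only a sketch: it assumes that the status_code column of cmn_sch_jobs (the same column the Resolution queries filter on) holds the states described here.

```sql
-- Read-only: count job instances per state of interest.
-- Assumes cmn_sch_jobs.status_code holds these values, as in the
-- Resolution queries of this article.
select status_code, count(*) as job_count
from cmn_sch_jobs
where status_code in ('WAITING', 'SCHEDULED', 'PROCESSING')
group by status_code;
```

Running the same count before and after the Resolution steps gives a simple way to compare the results.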

Environment

Release: All Supported Clarity releases

Cause

The only known cause is that one or more processing Clarity jobs were killed manually at the database level. This leaves waiting and scheduled jobs stuck indefinitely.

Search for the following error in the logs:

ERROR 2020-09-01 11:09:50,993 [Dispatch Time Slicing : bg@<server> (tenant=clarity)] niku.njs (clarity:admin:<session>:Time Slicing) Error updating job in scheduler bg@<server>
com.niku.union.persistence.PersistenceException: Error getting a DB connection
 at com.niku.union.persistence.PersistenceController.doProcessRequest(PersistenceController.java:620)
 at com.niku.union.persistence.PersistenceController.processRequest(PersistenceController.java:311)
 at com.niku.njs.SchedulerImpl.processDBRequest(SchedulerImpl.java:1614)
 at com.niku.njs.SchedulerImpl.unlockJobs(SchedulerImpl.java:1266)
 at com.niku.njs.SchedulerImpl.unlockJobs(SchedulerImpl.java:1252)
 at com.niku.njs.SchedulerImpl.completeAndReschedule(SchedulerImpl.java:656)
 at com.niku.njs.Dispatcher$BGTask.run(Dispatcher.java:690)
 at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
 at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
 at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.sql.SQLException: Connection unavailable
 at com.niku.union.persistence.connection.ApacheContext.getConnection(ApacheContext.java:213)
 at com.niku.union.persistence.PersistenceController.createLocalContext(PersistenceController.java:461)
 at com.niku.union.persistence.PersistenceController.doProcessRequest(PersistenceController.java:569)

Resolution

For an easy way to find out whether a job is stuck in processing at the DB level but not in the UI, see: Single job is stuck in processing state in Clarity

If the above does not help, perform the following steps in the exact order shown.

Note: The Clarity administrator performs steps 1 and 4. For SaaS Customers, the Broadcom SaaS team will perform steps 2, 3 and 5.

Step 1: Cancel and re-run the jobs

  1. Go to Home > Reports and Jobs
  2. Pause all jobs and reports in the following states: WAITING, SCHEDULED
  3. Cancel all PROCESSING instances.
  4. Cancel all NOT SCHEDULED instances.
  5. Filter for all CANCELLED jobs.
  6. Make a note of the Cancelled jobs and take screenshots/notes of their schedule, as they will need to be re-entered at a later time.
  7. Select and delete all CANCELLED instances.
  8. Run an immediate instance of any fast-running job, such as the Clean User Sessions job.
  9. If the job does not go to the Processing status, delete the job and proceed to Step 2.
  10. If it does go to the Processing status, proceed to Step 5.

Step 2: Remove all possible orphan records and locks on the jobs

  1. Stop all background services/deployments in the environment.
  2. Run the following query:

    select id from cmn_sch_jobs csj
    where csj.is_visible = 0 and csj.job_definition_id not in
    (select id from cmn_sch_job_definitions where upper(job_code) in
    ('JOB_CHECK_HEART_BEAT','BPM_ESC_ESCALATION','BPM_ESC_RESCHEDULE_ESC_JOB','TELEMETRY_JOB','PURGE_CSV_DOWNLOADS'))
    and csj.status_code in ('WAITING', 'SCHEDULED','PROCESSING');

  3. If records are returned, run the following SQL delete statements:

    delete from cmn_sch_jobs csj
    where csj.is_visible = 0 and csj.job_definition_id not in
    (select id from cmn_sch_job_definitions where upper(job_code) in
    ('JOB_CHECK_HEART_BEAT','BPM_ESC_ESCALATION','BPM_ESC_RESCHEDULE_ESC_JOB','TELEMETRY_JOB','PURGE_CSV_DOWNLOADS'))
    and csj.status_code in ('WAITING', 'SCHEDULED','PROCESSING');

    commit;

  4. Check for any locks on the scheduler with the query:

    select * from prlock where prtablename = 'CMN_SCH_JOBS';

  5. If any records are returned, then run the following SQL statement:

    delete from prlock where prtablename = 'CMN_SCH_JOBS';
    commit;

  6. If you are on Clarity 15.3 or higher, proceed with Step 3. Otherwise, start the background services.
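
Before running the delete in item 3 above, it can be useful to see which job definitions the candidate rows belong to. The query below is a read-only sketch that applies the same filter as the Step 2 statements and only adds a grouping by job code. It assumes that id is unique in cmn_sch_job_definitions, so that the join does not multiply rows. The row_count alias is illustrative.

```sql
-- Read-only preview of the rows the Step 2 delete would remove,
-- grouped by job code and status. The WHERE clause is copied from
-- the Step 2 statements; the left join only adds the job code label
-- (it is NULL when no matching definition exists).
select d.job_code, csj.status_code, count(*) as row_count
from cmn_sch_jobs csj
left join cmn_sch_job_definitions d on d.id = csj.job_definition_id
where csj.is_visible = 0 and csj.job_definition_id not in
(select id from cmn_sch_job_definitions where upper(job_code) in
('JOB_CHECK_HEART_BEAT','BPM_ESC_ESCALATION','BPM_ESC_RESCHEDULE_ESC_JOB','TELEMETRY_JOB','PURGE_CSV_DOWNLOADS'))
and csj.status_code in ('WAITING', 'SCHEDULED','PROCESSING')
group by d.job_code, csj.status_code
order by d.job_code, csj.status_code;
```

The sum of row_count should match the number of ids returned by the select in item 2.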

Step 3: Clear the Scheduler Table (Clarity 15.3 and higher)

Starting in 15.3, a new table, CMN_SCH_JOB_CURR_RUNS, holds all currently processing jobs.
This table can sometimes get out of sync. To resolve this:

  1. Make sure Step 1 is performed (all processing jobs are stopped in the UI)
  2. Stop the background services if not done already (as per Step 2)
  3. Run the following statements. The select shows the rows that the delete will remove:

    select * from CMN_SCH_JOB_CURR_RUNS where job_definition_id != -1;

    delete from cmn_sch_job_curr_runs where job_definition_id != -1;

  4. Start the background service/deployment
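
As a sanity check, the cleanup from Steps 2 and 3 can be verified just before the background service is started in item 4. The counts below are a minimal sketch that reuses the tables and filters from the statements above; the column aliases are illustrative. Both counts would be expected to be zero if the deletes took effect.

```sql
-- Read-only verification, run while the background services are still stopped.
-- Rows left in the currently-running table (Step 3 filter):
select count(*) as remaining_curr_runs
from cmn_sch_job_curr_runs
where job_definition_id != -1;

-- Scheduler locks left behind (Step 2 filter):
select count(*) as remaining_locks
from prlock
where prtablename = 'CMN_SCH_JOBS';
```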

Step 5: Recreate the jobs in Clarity

  1. Re-enter the previously deleted 'Cancelled' jobs with their schedule.
  2. Resume (unpause) all 'PAUSED' jobs and reports

Note: For issues with the Time Slicing job being stuck, see also Time Slicing job appears to be hanging - troubleshooting

Step 6: Check for jobs stuck in processing in the Database

If the above does not help, proceed to the steps below. Note: This is needed only on rare occasions.

  1. Check for any jobs stuck in processing in the Database by running the below query.

    select scj.id, scj.name, scj.status_code JOB_STATUS, csj.id RUNID, csj.status_code JOB_RUN_STATUS,
           (select user_name from cmn_sec_users where id = csj.created_by)
    from cmn_sch_jobs scj
    join cmn_sch_job_runs csj on csj.job_id = scj.id
    where csj.status_code = 'PROCESSING';


  2. If any jobs are found where the JOB_RUN_STATUS is PROCESSING but the JOB_STATUS is a different status (the query only returns runs whose JOB_RUN_STATUS is PROCESSING), delete these jobs from the Scheduled Jobs in the UI and then reschedule the jobs.

    Note: Before deleting the jobs, make a note of their recurrence setup so that they can be rescheduled afterwards.
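
To narrow the Step 6 output down to mismatches only, the query can be extended with a comparison of the two status columns. This is a sketch under the assumption that the rows of interest are those where the job and its run disagree on status. It uses only the tables and columns from the query above.

```sql
-- Read-only: runs in PROCESSING whose parent job has a different status.
-- Rows where scj.status_code is NULL are not returned by this comparison.
select scj.id, scj.name, scj.status_code JOB_STATUS, csj.id RUNID, csj.status_code JOB_RUN_STATUS
from cmn_sch_jobs scj
join cmn_sch_job_runs csj on csj.job_id = scj.id
where csj.status_code = 'PROCESSING'
  and scj.status_code <> csj.status_code;
```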
 

Additional Information

See also: Single job is stuck in processing state in Clarity

As part of the symptoms noted in this KB, you may also see that the Process Engine is down or that time slices are not updated. If the above isn't the cause of your issue, see also: