Thursday, April 29, 2010

Executing Linux Shell Script Through DBMS_SCHEDULER

We receive some of refrence data through third party vendors. This usually involves us loading the data from CD or other source, running sqlldr, and then some sql to sync tables across databases, as well as denormalize a table or two.

In order to move this out of my hands and into the business hands I had to make the process error free and easy to do -- push buttons and make magic happen. I did this through multiple steps:

1. The Linux script that uploads data into a QA schema on Development was already in place. I just didn't need a way to execute it via a push of a button on APEX. DBMS_SCHEDULER was the simplest approach.

2. Next I created a refresh_ref_tables package on each of the databases, which would be called from a procedure on the APEX database. By doing this allows me to call the procedures via database links and be able to include more than 4 databases in the process.

3. Created an apex application that allowed for buttons to push and the magic to happen. I even created all the QA reports within the application so they could do away with their spreadsheets on the business side.

The first stumbling block was the DBMS_SCHEDULER job to execute the Linux shell script. I created the job using the job_action as the /usr/bin/sh, and then the actual script is the argument. So the create_job statement looked like this:

dbms_scheduler.create_job(
job_name=>'USED_FOR_SHELL_SCRIPTS',
job_type=>'EXECUTABLE',
job_action=>'/usr/bin/sh',
number_of_arguments=>1);

This can be used for any shell script execution. We made sure that only the DBA group had the ability to schedule this job. A tried several different ways to schedule the actual script, but I kept receiving the file or directory not found. I probably can now just schedule the actual shell script, but for now we will leave it this way as I can several different shell scripts being scheduled.

Within APEX when the page is displayed the user can select a button that will then begin the shell script by executing the following:

begin
dbms_scheduler.set_job_argument_value@orl1(job_name=>'USED_FOR_SHELL_SCRIPTS',
argument_position=>1, argument_value='/home/oracle/script/path/refresh.sh');
end;

Then to run the job:

begin
dbms_scheduler.run_job@orl1(job_name=>'USED_FOR_SHELL_SCRIPTS');

The shell script kicks off and executes. One issue I have found with running shell scripts in this manner is that I can't use pipes or redirects for logfiles. However, I'm using sqlldr and do have the log file option there. The sql statements I actually log the row counts and actions to an audit log.

If errors do occur with executing the script the error is logged to the dba_scheduler_job_run_details table. Some errors appear to be very generic, but with a closer look in this table the actual issue will be discovered.

select log_id, log_date,job_name,status,error#,additional_info from dba_scheduler_job_run_details where job_name='USED_FOR_SHELL_SCRIPTS';

One of the errrors I was repeatly receiving was ORA-27369: job type Executable Failed with exit code: Key has expired

This usually indicates a permissions issue, but where? The additional info column provided the answer:

STANDARD_ERROR="/home/oracle/scripts/path/refresh.sh: line 84: //XXXXX_DESC.dat: Permission denied
cat: write error: Broken pipe
/home/oracle/scripts/path/refresh.sh: "

This was due to the unzip command and if the file already existed. I added the -u option and it elimnated this error.

Once the development script refresh is finished the QA staff handles the data QA to understand the impact of the data changes. Upon their approval we then move the changes into all the databases primary schema that holds these tables.

This is done through a package on each of the databases, that is executed via apex. The package on each of the databases has a different procedure for the different types of reference data that is being handled. They are kicked off via APEX as required.

The actual process that is kicked off from apex is a procedure that resides on the apex database that allows for multiple databases to be selected for the refresh. I used this same concept for user management, as it works well.

The multiple select list is passed in as the pDB parameter. The list will look as follows orcl1:orcl2 or orcl1:orcl4:orcl5

The last one doesn't have a colon so I add a colon. This allows for the IF statements to be built. The vDB is then analyzed for the existence of each of the different databases and if found performs the action. Then it moves on to the next on in the list, until all have been satisfied.

The procedure on APEX is as follows:

refresh(pDB IN VARCHAR2)
IS
vDB VARCHAR2(100);
vJobTime DATE := trunc(sysdate+1)+5/24;

BEGIN

vDB := pDB ||':'; -- this required because multiselect list with have
each selection separated by a colon, but not the
end one.

IF instr(vDB, 'orcl1:') != 0 THEN
-- what I want to execute
EXECUTE IMMEDIATE 'BEGIN refresh_tables@orl1;end';
end if;
-- additional statements to cover each of the databases
IF inst(vDB, 'orcl5:') != 0 THEN
-- In this case I need to ensure it happens after hours
dbms_scheduler.create_job@orcl5
(job_name=>'TEST',
job_type=>'PLSQL_BLOCK',
job_action=>'BEGIN REFRESH;end;',
start_date => vJobTime,
enabled=> TRUE);

end if;

end;

Now we only get involved, if there are failures or the requiremnts change in how the reference data is being handled.

As always the names, and paths have all been changed.

1 comment:

  1. Thanks! Especially for the pointer where to find the real problem!

    select log_id, log_date,job_name,status,error#,additional_info from dba_scheduler_job_run_details where job_name='USED_FOR_SHELL_SCRIPTS';

    ReplyDelete