Thursday, December 15, 2016

Oracle 12C: Moving datafiles feature

Trying to test new Oracle 12C feature "Moving or renaming datafiles" in attempt to fix "
ORA-19566: exceeded limit of 0 corrupt blocks for file". This Oracle error simply means, the block no longer belonging to any extents and Oracle 'marked' it as corruption, so, logically, if I can find a way to scrub it, it should  fix the issue.



According to the Oracle DOC this should perform the following upon moving the datafile."When you rename or relocate online data files, the pointers to the data files, as recorded in the database control file, are changed. The files are also physically renamed or relocated at the operating system level."


set current session to the database that I intend to test which is pdorcl.

SQL> alter session set container=PDBORCL;
Session altered.

Intent to change the vpxadmin3.dbf to vpxadmin4.dbf. Essentially, this can be point to somewhere else.

SQL> col file_name format a130;
SQL> SELECT FILE_NAME, BYTES FROM dba_data_files;

FILE_NAME                                                                               BYTES
-------------------------------------------------------------------------------------- ----------
/u01/data/ORCL/E1428863D6CF20F7E043CB41830AFB4F/datafile/o1_mf_system_8xy2dwhf_.dbf     272629760
/u01/data/ORCL/E1428863D6CF20F7E043CB41830AFB4F/datafile/o1_mf_sysaux_8xy2dwhd_.dbf     681574400
/u01/data/ORCL/E1428863D6CF20F7E043CB41830AFB4F/datafile/o1_mf_users_8xy2fwlk_.dbf      5242880
/u01/data/ORCL/datafile/vpxadmin3.dbf                                                   1073741824


Move the datafile. This takes about 3 minutes to complete.

SQL> alter database move datafile '/u01/data/ORCL/datafile/vpxadmin3.dbf' to '/u01/data/ORCL/datafile/vpxadmin4.dbf';
Database altered.


The datafile name vpxadmin3.dbf is changed to vpxadmin4.dbf

SQL> col file_name format a130;
SQL> SELECT FILE_NAME, BYTES FROM dba_data_files;

FILE_NAME                                                                                 BYTES
----------------------------------------------------------------------------------------- ----------
/u01/data/ORCL/E1428863D6CF20F7E043CB41830AFB4F/datafile/o1_mf_system_8xy2dwhf_.dbf       272629760
/u01/data/ORCL/E1428863D6CF20F7E043CB41830AFB4F/datafile/o1_mf_sysaux_8xy2dwhd_.dbf       681574400
/u01/data/ORCL/E1428863D6CF20F7E043CB41830AFB4F/datafile/o1_mf_users_8xy2fwlk_.dbf        5242880
/u01/data/ORCL/datafile/vpxadmin4.dbf                                                     1073741824


First time I get the chance to test out the "datafile move" feature. I can see that this may come very handy moving things around. Though, the performance can be questionable for large production environment.

Saturday, May 21, 2016

Oracle: CREDENTIAL: ORA-27369: job of type EXECUTABLE failed with exit code: 7 !@#--!@#7#@!--#@!&

This issue is caused by incorrect user being used during dbms_scheduler credential setup.  The exit code of 7 is basically referring to permission issue. Pre-requisite should make sure the jssu, externaljob.ora and extjob have correct permission setup.

Note: there are 2 ways to create scheduler credential. DBMS_SCHEDULER.CREATE_CREDENTIAL is deprecating in 12.1 and preferably be using DBMS_CREDENTIAL.CREATE_CREDENTIAL package 12.1 onward.

Setup: 
Legit OS level user: oracle and password is vmware
Instance level user setup: scott
ascript.sh is a script that is calling another sql script at the OS level.




TEST # 1 : dbms_scheduler with valid OS user - PASSED

exec dbms_scheduler.create_credential('EXT_JOB_CRED1', 'oracle', 'vmware');

exec dbms_scheduler.drop_job ('scott.SCRIPT_JOB1');

exec dbms_scheduler.create_job ( job_name => 'SCRIPT_JOB1', job_type => 'EXECUTABLE', job_action => '/home/oracle/Downloads/script/ascript.sh', start_date => SYSTIMESTAMP, repeat_interval => 'freq=hourly;interval=2', end_date => NULL,enabled => TRUE,auto_drop => FALSE,comments => 'Executable jobfor ascript.sh',credential_name => 'scott.EXT_JOB_CRED1');

EXEC dbms_scheduler.run_job('SCRIPT_JOB1');

SQL> EXEC dbms_scheduler.run_job('SCRIPT_JOB1');

PL/SQL procedure successfully completed.


TEST 2: DBMS Scheduler with random user - FAILED

exec dbms_scheduler.create_credential('EXT_JOB_CRED2', 'somerandomuser', 'vmware123');

exec dbms_scheduler.drop_job ('scott.SCRIPT_JOB2');

exec dbms_scheduler.create_job ( job_name => 'SCRIPT_JOB2', job_type => 'EXECUTABLE', job_action => '/home/oracle/Downloads/script/ascript.sh', start_date => SYSTIMESTAMP, repeat_interval => 'freq=hourly;interval=2', end_date => NULL,enabled => TRUE,auto_drop => FALSE,comments => 'Executable jobfor ascript.sh',credential_name => 'scott.EXT_JOB_CRED9');
EXEC dbms_scheduler.run_job('SCRIPT_JOB2');

SQL> EXEC dbms_scheduler.run_job('SCRIPT_JOB2');
BEGIN dbms_scheduler.run_job('SCRIPT_JOB2'); END;

*
ERROR at line 1:
ORA-27369: job of type EXECUTABLE failed with exit code: 7 !@#--!@#7#@!--#@!&
ORA-06512: at "SYS.DBMS_ISCHED", line 209
ORA-06512: at "SYS.DBMS_SCHEDULER", line 594
ORA-06512: at line 1

Test 3 – using DBMS_CREDENTIAL with random user - FAILED

 exec DBMS_CREDENTIAL.CREATE_CREDENTIAL ('EXT_JOB_CRED3', 'randomuser', 'vmware123');

exec dbms_scheduler.create_job ( job_name => 'SCRIPT_JOB3', job_type => 'EXECUTABLE', job_action => '/home/oracle/Downloads/script/ascript.sh', start_date => SYSTIMESTAMP, repeat_interval => 'freq=hourly;interval=2', end_date => NULL,enabled => TRUE,auto_drop => FALSE,comments => 'Executable jobfor ascript.sh',credential_name => 'scott.EXT_JOB_CRED10');
EXEC dbms_scheduler.run_job('SCRIPT_JOB3');

SQL> EXEC dbms_scheduler.run_job('SCRIPT_JOB3');BEGIN dbms_scheduler.run_job('SCRIPT_JOB3'); END; *ERROR at line 1:ORA-27369: job of type EXECUTABLE failed with exit code: 7 !@#--!@#7#@!--#@!&ORA-06512: at "SYS.DBMS_ISCHED", line 209ORA-06512: at "SYS.DBMS_SCHEDULER", line 594ORA-06512: at line 1


These are just some portions of my tests. There were other things I have tried such as incorrect passwords and etc and they were all failing with exit code of 7. At this point, I am not completely sure if this is a bug or part of the design.


Tuesday, February 23, 2016

Oracle EBS: RW-50010: Error: - script has returned an error: 1

This one is fun. EBS installation failed at 29% after half a day of waiting.


Error

C:\\oracle\\dev\\fs2\\FMW_Home\webtier\OPatch ## Node=NodeId=1698 Type=24 TypeName=filesys_patch_action Name= RefId=901 State=init ConfigDoc=APPS_OHS_HOME ParentDoc=null Topology=R12 Action=os_cmd
    at oracle.apps.fnd.txk.config.FileSysPatchActionNode.doFileSysOSCmd(FileSysPatchActionNode.java:169)
    at oracle.apps.fnd.txk.config.FileSysPatchActionNode.processState(FileSysPatchActionNode.java:101)
    at oracle.apps.fnd.txk.config.PatchActionNode.processState(PatchActionNode.java:187)
    at oracle.apps.fnd.txk.config.PatchNode.processState(PatchNode.java:338)
    at oracle.apps.fnd.txk.config.PatchesNode.processState(PatchesNode.java:79)
    at oracle.apps.fnd.txk.config.InstallNode.processState(InstallNode.java:68)
    at oracle.apps.fnd.txk.config.TXKTopology.traverse(TXKTopology.java:594)
    at oracle.apps.fnd.txk.config.InstallService.doInvoke(InstallService.java:224)
    at oracle.apps.fnd.txk.config.InstallService.invoke(InstallService.java:237)
    at oracle.apps.fnd.txk.config.InstallService.main(InstallService.java:291)

C:\Users\ADMINI~1\AppData\Local\Temp\1>if 1 == 0 goto :INSTAS_OK
C:\Users\ADMINI~1\AppData\Local\Temp\1>echo Cannot install one-off patches
Cannot install one-off patches

RW-50010: Error: - script has returned an error:   1
RW-50004: Error code received when running external process.  Check log file for details.
Running APPL_TOP Install Driver for dev instance

This error is very generic. One should not stopped at this level of troubleshooting. I do not believed looking at this one can guess what is the root cause. Quite a few Oracle noteid look the same but having a different root cause at the end of the resolution.

User will need to look at the install log to look at which stage and line that failed. At 29%, it does half a dozen things particularly moving java files around. User can look at the entire installation in the  adrunfmw.cmd. 




Locate the install log. It usually reside in the EBS install folder under temp.  Example, oracle.apps.fnd.txk.install0.

Look for stdout and stderr or any exception errors. In my case, I encountered the following error.

  <message>Process Completed (3)  cmd /c rmdir /s /q C:\\oracle\\dev\\fs2\\FMW_Home\webtier\OPatch
Stdout:  See C:\oracle\dev\fs2\EBSapps\appl\admin\dev_ebsdev\tmp\T1456139825284_122.tmp
Stderr: The system cannot find the path specified.
</message>
</record>
<record>
  <date>2016-02-22T18:46:09</date>
  <millis>1456141569147</millis>
  <sequence>344</sequence>
  <logger>oracle.apps.fnd.txk.install</logger>
  <level>SEVERE</level>
  <class>oracle.apps.fnd.txk.config.InstallService</class>
  <method>printError</method>
  <thread>10</thread>
  <message>oracle.apps.fnd.txk.config.ProcessStateException: FileSys OS COMMAND Failed : Exit=3 See log for details. CMD= cmd /c rmdir /s /q C:\\oracle\\dev\\fs2\\FMW_Home\webtier\OPatch ## Node=NodeId=1698 Type=24 TypeName=filesys_patch_action Name= RefId=901 State=init ConfigDoc=APPS_OHS_HOME ParentDoc=null Topology=R12 Action=os_cmd
    at oracle.apps.fnd.txk.config.FileSysPatchActionNode.doFileSysOSCmd(FileSysPatchActionNode.java:169)
    at oracle.apps.fnd.txk.config.FileSysPatchActionNode.processState(FileSysPatchActionNode.java:101)
    at oracle.apps.fnd.txk.config.PatchActionNode.processState(PatchActionNode.java:187)
    at oracle.apps.fnd.txk.config.PatchNode.processState(PatchNode.java:338)
    at oracle.apps.fnd.txk.config.PatchesNode.processState(PatchesNode.java:79)
    at oracle.apps.fnd.txk.config.InstallNode.processState(InstallNode.java:68)
    at oracle.apps.fnd.txk.config.TXKTopology.traverse(TXKTopology.java:594)
    at oracle.apps.fnd.txk.config.InstallService.doInvoke(InstallService.java:224)
    at oracle.apps.fnd.txk.config.InstallService.invoke(InstallService.java:237)
    at oracle.apps.fnd.txk.config.InstallService.main(InstallService.java:291)


It is pretty self-explanatory. The required tmp file was not found. What I found though, not only the T1456139825284_122.tmp not found but the entire directory of C:\oracle\dev\fs2\EBSapps\appl\admin\dev_ebsdev\tmp\ was empty. The team have decided to get through this and re-visitng this missing file once the installation is successfully. Just to fast forward a little bit, there will be another missing file that I will point out in a bit.

Taking a look at the adrunfw.cmd command, I realized the script is hack-able. Since we are going to come back and revisit this issue if the installation ever completed, I decided to make a copy of the script and delete the script ending line so it break out of the loop if it failed and move on. Now - another issue found here, the  j11067592_fnd.zip was not residing in the right path where the script will be copying from later on. It was sitting one level outside of fnd directory. I decided to just copy and paste the file into the fnd folder.


As expected, a few hours into the installation, it did make it passed 29% and completed successfully. As of the "T1456139825284_122.tmp" and possibly other files, I have no idea how to address it. That file was no where to be found in any of the medias nor I know what they are or for. Perhaps, run a one-off patching might restore them since, the it failed at the One-Off stage. 

Please leave comments if you know what this T1456139825284_122.tmp is.







Wednesday, January 27, 2016

Oracle datapump migration to 11G triggered [qerxtgetrefoffset_911]

I have worked on this issue with Oraclec Support for over 8 months. It took 5 months for Oracle Engineering to recognize this as a bug and another 3 months find resolution. Oracle Engineering finally came back and stated that they are unable to fix it due to its complexity and will be fixed in latest  12.1.0.2.0 version. The patch id is 13770504 and has been rolled in to it as well.

I have done hundreds of datapump and this one just would not go through. I have tried everything that I can dig through in Oracle Metalink and attempted informal patches from Oracle on 11.2.0.1,11.2.0.3, 11.2.0.4 and 12.0.1.0. Some of the attempts I have made are using VERSION=12, parallel=1 and tried recommended one-off patch from Oracle.

OK, OK, so what was my issue ??

I was trying to migrate their database at 10.2.0.5 to 11.2.0.4 and been getting ORA-­600 [qerxtGetRefOffset_911]. Oracle alert.log, trace files and datapump traces and the issue appear to hit the Datapump Worker Process bug. There were quite a bit of similar bugs have been reported and it seems to me that the fix did not make it into 11.2.0.4 at first. They all appear similar to the following errors


ORA­39014: One or more workers have prematurely exited. 
ORA­39029: worker 1 with process name "DW00" prematurely terminated 
ORA­31671: Worker process DW00 had an unhandled exception. 
ORA­00600: internal error code, arguments: [qerxtGetRefOffset_911], [], [], [], [], [], [], [], [], [], [], [] 
ORA­06512: at "SYS.KUPW$WORKER", line 1887 ORA­06512: at line 2 
ORA­39014: One or more workers have prematurely exited. 
ORA­39029: worker 2 with process name "DW00" prematurely terminated 
ORA­31671: Worker process DW00 had an unhandled exception. 
ORA­00600: internal error code, arguments: [qerxtGetRefOffset_911], [], [], [], [], [], [], [], [], [], [], [] 
ORA­06512: at "SYS.KUPW$WORKER", line 1887 
ORA­06512: at line 2 Job "SYS"."SYS_IMPORT_FULL_01" stopped due to fatal error

DataPump Import (IMPDP) Fails With Internal Error ORA­600 [qerxtgetrefoffset_911] When Importing V10 Dump Files (Doc ID 1369347.1) ORA­600 [qerxtGetRefOffset_911] (Doc ID 1267718.1) ­ this bug stated bug fix at 11.2.0.3, 12.1.0.1 and 11.2.0.2. Upon, trying to patch, it still did not fixed my issue.

Basically, nailed it down to this one table in Oracle 10G with CLOB column storing XML documentation that was causing all the issue. I was able to reproduce the issue in all Oracle 11G dot releases and early Oracle 12G base as well. Oracle finally stated that the bug has been checked into the patch 13770504 . I no longer have the opportunity to test it out as of writing this and I will just take the words for it.

I haven't tried conventional export/import or files copying as the options were not given and this issue is not my call. I was just being instructed to fix the issue by using Datapump.

Thursday, January 21, 2016

Oracle RAC Scan and resolv.conf issue

Oracle RAC evicted. SCAN no longer working. At first, everything look sporadic. Then, it seem to have something to do with network bounced.


Things to check/verify and double check in case someone changed something.

/etc/hosts
/etc/hostname or hostname
/etc/resolv.conf
/etc/named.conf
/var/named/mhrac1.zone (the zone file)
 /etc/sysconfig/network-scripts   -- check all the eth0, eth1, eth2 files here to have a good picture of how the networking looks like in this setup

Long story short -
When running an nslookup check, I was getting the following. I have disabled the firewall for the test, so port 53 should not be blocked. I can ping freely in and out of the node yet I kept getting the following. All configured IPs ping-ed fine.

nslookup dnsrac-scan.wwdom.com
Server:          10.11.188.211
Address:     10.11.188.211#53


** server can't find mhrac1-scan.wwdom.com: NXDOMAIN

Upon checking my resolv.conf, the following did not look right at all. I should have a "search" and the bottom 2 IPs addresses I set up in my mhrac1.zone file, instead, I am having the following

Generated by NetworkManager
search wwdom.com
nameserver 10.11.188.211
nameserver 10.11.188.212
nameserver 10.16.169.90
# NOTE: the libc resolver may not support more than 3 nameservers.
# The nameservers listed below may not be recognized.
nameserver 10.10.1.110
nameserver 75.75.75.75


After fixing the extra 3 IPs from the top, I was able to perform an expected nslookup. I was also able to replicate the issue by restarting the network - "service network restart" and new IPs get inserted every time I did that.


This could be due to when network restarted, it appended other IP addresses on top and the intended ones did not get read from - as the message already hinted.

My guess to permanently preventing from this issue, I need to change the PEERDNS in the  ifcfg-eth0 so, when network restarted, the VM will not reach outside and grab the real domain DNS IPs.  (By the way, the  ifcfg-eth0 is a make up name. Centos7 will provide a random name by default. So, keep track of you MAC address and assigned IPs. Default might look like this  ifcfg-234133445. the same goes for eth1 and eth2.