A couple of days ago, every DB instance on one node of a RAC cluster failed to start with this error:

[oracle@hostname ~]$ srvctl start instance -d DBNAME -i INSTANCE_2
PRCR-1013 : Failed to start resource ora.DBNAME.db
PRCR-1064 : Failed to start resource ora.DBNAME.db on node hostname
CRS-5017: The resource action "ora.DBNAME.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
Process ID: 0
Session ID: 0 Serial number: 0
. For details refer to "(:CLSN00107:)" in "/u01/app/oragrid/diag/crs/hostname/crs/trace/crsd_oraagent_oracle.trc".

CRS-2674: Start of 'ora.DBNAME.db' on 'hostname' failed
[oracle@hostname ~]$

The alert log made it look as if a rogue process was still holding shared memory segments:

ORA-1092 : opitsk aborting process
2023-12-07T10:39:48.004323+00:00
ORA-1092 : opitsk aborting process
2023-12-07T10:39:48.305329+00:00
Warning: 2 processes are still attached to shmid 229403:
 (size: 53248 bytes, creator pid: 537143, last attach/detach pid: 

But ipcs showed no such segment in use; the only one listed belonged to oragrid:

[root@hostname ~]# ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x21a2b3c0 196620     oragrid    600        45056      43

[root@hostname ~]#
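
To make that check less dependent on eyeballing the output, the ipcs listing can be filtered by owner. The following is a minimal sketch, not something from the original troubleshooting session: count_shm_for_owner is a hypothetical helper name, and it assumes the usual Linux `ipcs -m` layout with the owner in the third column.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Oracle tooling): reads `ipcs -m`
# output on stdin and prints how many segments belong to the given owner.
# Assumes the owner is the third whitespace-separated column.
count_shm_for_owner() {
    awk -v owner="$1" '$3 == owner { n++ } END { print n + 0 }'
}

# Usage on the affected node (as root, to see every user's segments):
#   ipcs -m | count_shm_for_owner oracle
```

Fed the listing above, this prints 0 for oracle and 1 for oragrid, which is the hint that leftover shared memory was not the real problem.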

After a failed attempt with an Oracle SR, we noticed that /u02 had disappeared from the server.
The Unix team remounted it, but the contents (and directory structure) were gone:

[oracle@hostname ~]$ ls -tlr /u02/
total 0
[oracle@hostname ~]$ ls -tld /u02/
drwxr-xr-x. 2 root root 6 Dec  7 09:24 /u02/
[oracle@hostname ~]$ 
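
Since the trouble sat in the filesystem rather than in shared memory, a quick check of the diag location before starting an instance is cheap insurance. This is a minimal sketch, assuming the instance expects its diag directories under /u02/app/oracle/diag/rdbms as on this server; check_writable_dir is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: classifies a directory as MISSING, NOT_WRITABLE or OK
# for the user running the script. Run it as oracle rather than root, since
# root passes the -w test on almost anything.
check_writable_dir() {
    if [ ! -d "$1" ]; then
        echo "MISSING"
    elif [ ! -w "$1" ]; then
        echo "NOT_WRITABLE"
    else
        echo "OK"
    fi
}

# Usage as the oracle user before starting the instance:
#   check_writable_dir /u02/app/oracle/diag/rdbms
```

On the empty, root-owned /u02 shown above, this would report MISSING. Where util-linux is installed, `mountpoint -q /u02` can be added to confirm the filesystem is mounted at all.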

We changed the permissions and recreated the expected directories:

[oracle@hostname ~]$ ls -tld /u02/app/oracle/diag/rdbms/
drwxr-xr-x. 5 oracle oinstall 63 Dec  8 11:43 /u02/app/oracle/diag/rdbms/
[oracle@hostname ~]$
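
The equivalent steps could be scripted roughly like this. It is only a sketch: recreate_diag_tree is a hypothetical helper name, the path and the oracle:oinstall ownership are taken from the listing above, and your layout may need more directories than this.

```shell
#!/bin/sh
# Hypothetical helper: recreates the diag tree under a base directory.
#   $1 = base mount point (default /u02)
#   $2 = optional owner:group to apply; when omitted, ownership is untouched
recreate_diag_tree() {
    base="${1:-/u02}"
    owner="${2:-}"
    mkdir -p "$base/app/oracle/diag/rdbms" || return 1
    # 755 matches the drwxr-xr-x mode in the listing above
    chmod 755 "$base/app/oracle/diag/rdbms" || return 1
    if [ -n "$owner" ]; then
        # Changing ownership normally requires root
        chown -R "$owner" "$base/app" || return 1
    fi
}

# Usage as root on the affected node:
#   recreate_diag_tree /u02 oracle:oinstall
```

Keeping the owner optional makes it possible to try the function against a scratch directory first, without root.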

After this, the instances started without issues. Hope this helps, as pretty much everything you find about this error points to rogue processes holding memory.

Last modified: 18 January 2024
