Monday, January 4, 2010

Oracle Database loses its OEM configuration after a Cold Backup

This was an annoying problem that took awhile to track down. In short: after our scheduled cold backups, an Oracle (11g) database would lose its configuration in Oracle Enterprise Manager. It would present a "Metric Collection Error" that would go away after reconfiguring the database. The fix, as it turned out, was pretty simple, but it took awhile to tease out. The problem was that the trace directory (bdump in 10g) was too full. Specifically, the metric collection (the process by which OEM gathers data about the database) was timing out. We ruled out performance problems on the database side; the system is not utilized much at all. Instead, we discovered that because there were a lot of files in the trace directory (> 31k), it was taking a long time for the OEM agent to get to the alert log, which is one of the metrics that it collects. This was hinted at in the emagent.trc file:
2010-01-04 13:01:53,087 Thread-47647632 ERROR TargetManager: TIMEOUT reached in computing dynamic properties for target TESTDB,
oracle_database::compute timings were [decideIncludeDB:0-0] [SystemTablespaceNumber:0-0] [SysauxTablespaceNumber:0-0 ...
[DeduceAlertLogFile:1-1] [GetCPUCount:1-1] [EnabledFeatures:1-1] [GetOSMInstance:1-1] [GetNLSParam:1-1] [GetAdrBase:1-(1)]
So you can see above that one of the things it was trying to do was get at the alert log. It took a long time to enumerate all of the small files in the trace directory, so we shut down the instance, cleared out the trace directory, and restarted the instance. That took care of the problem. In troubleshooting this problem, we also increased the dynamic properties timeout setting (dynamicPropsComputeTimeout_oracle_database) in the emd.properties file (in [agent_home]/sysman/config), changing the value from the default (120) to a larger setting (240). That did not help, though it's a good troubleshooting step, should you run into a similar problem.

1 comment:

  1. Thank you for posting this. It was very helpful!

    ReplyDelete

Thanks for leaving a comment!