Significant reduction in database outages, batch job errors, and the proactive identification and prevention of system issues.
Serco, Inc. is a leading provider of professional facilities management services focused on the federal government. It is an international service company with employees around the world, employing more than 11,000 dedicated professionals in more than 100 locations across North America.
Diamond Technology (DTI) was selected to support Serco Inc.’s Transportation Group with IT/database services for the SFMTA’s San Francisco Parking Meter (SFPM) management application. This infrastructure consists of 2 Oracle 10g databases and 4 MS SQL Server databases housing critical traffic and transportation information and services. DTI has been assigned the following specific tasks and responsibilities:
Challenges
REDO LOG ERRORS
The SFPM Oracle database had been experiencing a number of Redo Log errors that had been of some concern.
DTI’s diagnosis was that, as the number of redo logs generated for a database increases, it becomes less likely that the archiving process will keep up with the writing of the redo logs. Heavy transaction activity forces redo log switches, but the archive process, which copies each filled redo log to the archived log files, may not be fast enough to keep pace. A redo log that has not yet been archived cannot be recycled, so all transactional change activity in the database stops until a redo log has been archived and cleared for reuse.
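The stall described above can be illustrated with a toy model. The following is a minimal sketch in Python, not a representation of Oracle internals or of the SFPM configuration: the function name `simulate`, its parameters, and all of the numbers are hypothetical and chosen only to show the effect. Each tick, the writer adds redo to the active log and the archiver copies a fixed amount from the oldest filled log; a log switch is blocked when every other group is still waiting to be archived.

```python
from collections import deque

def simulate(redo_mb_per_tick, archive_mb_per_tick, log_size_mb, groups, ticks):
    """Toy model: return (log_switches, stalled_ticks) over the given number of ticks."""
    pending = deque()   # MB still to archive for each filled log, oldest first
    fill = 0            # MB written to the active redo log
    switches = 0
    stalled = 0
    for _ in range(ticks):
        # Archiver: copy from the oldest filled log; a fully copied log is free for reuse.
        budget = archive_mb_per_tick
        while pending and budget > 0:
            copied = min(budget, pending[0])
            pending[0] -= copied
            budget -= copied
            if pending[0] == 0:
                pending.popleft()
        # Writer: a full active log needs a free group before it can switch.
        if fill >= log_size_mb:
            if len(pending) < groups - 1:
                pending.append(log_size_mb)
                fill = 0
                switches += 1
            else:
                stalled += 1   # every other group is still waiting to be archived
                continue
        fill += redo_mb_per_tick
    return switches, stalled

# Hypothetical workloads: same redo rate, different archiver throughput.
healthy = simulate(10, 20, 50, 3, 100)
overloaded = simulate(10, 5, 50, 3, 100)
print("archiver keeps up:", healthy)
print("archiver falls behind:", overloaded)
```

In this sketch, adding groups or enlarging the logs only buffers bursts of activity. If the archiver is slower than the sustained redo rate, the writer eventually stalls regardless.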
When there is high data activity and a redo log is too small, the result can be excessive log switching. Once again, this could halt database activity because the archiving process may not be able to keep up.
PREVIOUS DATABASE CRASHES
As the SFPM database had already experienced, a problem that prevents a database from continuing to run is called a database instance failure. An instance failure can be caused by an operating system crash, a hardware problem, a power outage, or another type of software problem.
AGE OF SFPM DATABASES
The SFPM database system was 9+ years old and had undergone various software updates and changes to the main application.
As databases begin to age past eight, ten, twelve plus years, they become increasingly at risk of experiencing a number of age-related problems. Problems can stem from OS compatibility issues. Slow performance and database crashes can begin to originate from data growth issues, index growth issues, fragmentation issues, and more. Aging databases may also suffer from data loss caused by incomplete backup and restore operations, update errors, media failure, user error, unintended changes in the database structure, and other related maintenance issues.
DTI recommended and implemented a database management plan that not only significantly reduced the chances of database performance problems and crashes, but also enabled DTI to provide informed guidance on client-requested system improvements. The result was a significant reduction in database outages and batch job errors, along with the proactive identification and prevention of system issues. Additionally, the client reported faster problem resolution when problems did occur. The services provided included the following:
1. Oracle Database Maintenance
- A. Quarterly maintenance of the SFPM Oracle database
- B. Set up Grid Control for Oracle, with extensive alerts/notifications and better centralization of backups, monitoring, and management
- C. Configure Oracle backups through RMAN, along with cold backups and copying of local drive content to tape, to avoid performance and compatibility issues with Symantec Backup
- D. Verify database size & indexes
- E. Check logs & server health
- F. Dump database for SIT
- G. Provide written reports on server and database status
- H. Provide recommendations on server and database upgrades/improvements.
2. Database Monitoring and Backup
- A. Identify data lost through database corruption or outages and restore it
- B. Assist in hardware failure determination and resolution
- C. Assist in database failure determination and resolution
- D. Assist in hardware maintenance as required; diagnosis and repair of defective hardware by replacing parts; and installation of hardware upgrades and new systems
- E. Assist SCG (SIT) with DBA tasks as needed
3. Database & Application Development
- A. Oracle database development and upgrades
- B. Report Design and Customization
- C. Enhancements to traffic & transportation data systems
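Routine checks such as the log review listed under Oracle Database Maintenance lend themselves to scripting. As a minimal sketch, assuming the log switch timestamps have already been exported from the database (for example from Oracle's `V$LOG_HISTORY` view), the Python below counts switches per hour and flags unusually busy hours. The function name `busy_hours`, the threshold of six switches per hour, and the sample timestamps are all hypothetical and are not taken from the SFPM system.

```python
from collections import Counter
from datetime import datetime

def busy_hours(switch_times, max_per_hour=6):
    """Return {hour_start: count} for hours with more than max_per_hour log switches."""
    per_hour = Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in switch_times
    )
    return {hour: n for hour, n in sorted(per_hour.items()) if n > max_per_hour}

# Illustrative timestamps only, not data from any real system.
sample = [datetime(2020, 1, 1, 9, m) for m in range(0, 60, 5)]  # 12 switches in the 09:00 hour
sample += [datetime(2020, 1, 1, 10, m) for m in (0, 30)]        # 2 switches in the 10:00 hour

flagged = busy_hours(sample)
for hour, count in flagged.items():
    print(hour.isoformat(), count)
```

A report like this could feed the written status reports, with the threshold tuned to whatever switch rate is considered acceptable for the workload.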