Thursday, June 11, 2026

Cross-Version Oracle Database Automation on Oracle Linux (7, 8, 9) with systemd, dbstart/dbshut, and Email Alerts on Failure

 

Introduction

In Oracle environments that do not use Oracle Restart or Grid Infrastructure (typically installed alongside Oracle ASM), databases do not start automatically after a server reboot. To automate database startup and shutdown, administrators commonly use Oracle's standard dbstart and dbshut utilities.

Although these tools provide basic automation, they do not always verify that the database has successfully reached the OPEN state, and they do not provide built-in notification when startup failures occur.

This article demonstrates how to combine dbstart/dbshut, systemd, startup validation checks, and automated email alerts to create a reliable Oracle database auto-start framework. The approach is compatible with Oracle Linux 7, 8, and 9 and can be used with Oracle database versions that support the standard dbstart/dbshut utilities. It also includes a safe testing method for validating the email alert system without affecting a production database.

In this blog post, I design:

  • An Oracle auto-start service using systemd
  • Proper shutdown control
  • An email alert system for service failures
  • A safe test service to validate the alerting mechanism without production risk

This design is simple, stable, and suitable for Oracle Linux production environments.

Part 1: Oracle Database Auto Start Service (Production)

1.1. Creating the oracle-db.service File

As the root user, create the file /etc/systemd/system/oracle-db.service and add the following content:

[Unit]
Description=Oracle Database Service
After=network-online.target
Wants=network-online.target

# Failure (important part)
OnFailure=unit-status-mail@%n.service

[Service]
Type=oneshot
User=oracle
Group=oinstall
WorkingDirectory=/u01/app/oracle
RemainAfterExit=yes

# Execute wrappers
ExecStart=/usr/local/bin/oracle-start.sh
ExecStop=/usr/local/bin/oracle-stop.sh

TimeoutStartSec=10min
TimeoutStopSec=10min

[Install]
WantedBy=multi-user.target

The network-online.target dependency ensures that network services are available before Oracle startup begins. The OnFailure parameter triggers an email alert whenever the Oracle service fails, allowing administrators to be notified immediately.

The service runs as the Oracle software owner (oracle) and uses Type=oneshot because the startup script performs its work and exits. The RemainAfterExit=yes parameter keeps the service in an active state after a successful startup.

Startup and shutdown operations are handled through dedicated wrapper scripts, making the configuration easier to maintain and troubleshoot.

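The timeout settings (TimeoutStartSec and TimeoutStopSec) give Oracle up to 10 minutes each to complete startup and shutdown. Because TimeoutStartSec covers the whole start script, including the listener check and dbstart, systemd may terminate the script shortly before its own 10-minute polling loop finishes. In both cases the unit enters the failed state and the OnFailure action is triggered.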

1.2. Oracle Startup Script (oracle-start.sh)

This script performs more than a standard dbstart operation. First, it checks whether the Oracle listener is already running and starts it only when required. This prevents unnecessary listener startup attempts during server reboot or service restart.

After the listener check, the script executes Oracle's native dbstart utility to start the database instance.
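One prerequisite is worth checking first: dbstart and dbshut only act on instances whose /etc/oratab entry has Y in the third field (SID:ORACLE_HOME:Y); entries flagged N are skipped. The following is a minimal sketch for listing the flagged instances. The function name list_autostart_sids is hypothetical, and the path defaults to /etc/oratab.

```shell
# Hypothetical helper: print the SIDs that dbstart/dbshut will act on,
# i.e. oratab entries of the form SID:ORACLE_HOME:Y
list_autostart_sids() {
    local oratab="${1:-/etc/oratab}"
    # Skip comment lines and blank lines; keep entries whose third field is Y
    awk -F: '!/^[ \t]*#/ && NF >= 3 && $3 == "Y" { print $1 }' "$oratab"
}
```

Calling list_autostart_sids without arguments reads /etc/oratab. If the expected SID is not printed, dbstart will not start that instance.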

The most important part of the script is the validation loop. Instead of assuming that dbstart completed successfully, the script connects to the database using SQL*Plus and checks the instance status every 10 seconds. Startup is considered successful only when the database reaches the OPEN state.

If the database does not reach the OPEN state within 10 minutes, the script returns a failure code to systemd. This allows the OnFailure notification service to send an email alert and notify administrators that manual investigation may be required. The polling loop does not always wait for the full 10 minutes. If the database reaches the OPEN state earlier, the script exits immediately. The full 10-minute window only applies in failure or slow-start scenarios.

As the root user, create the startup script /usr/local/bin/oracle-start.sh and add the following content:

#!/bin/bash
# Description: Production Oracle Startup & Deep Health Verification Wrapper

export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin

echo "=== Oracle Startup Initiated: $(date) ==="

# 1. Start the Listener safely (only if not already running)
$ORACLE_HOME/bin/lsnrctl status >/dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "Listener is stopped. Starting Oracle Listener..."
    $ORACLE_HOME/bin/lsnrctl start
    if [ $? -ne 0 ]; then
        echo "ERROR: Listener failed to start." >&2
        exit 1
    fi
else
    echo "Listener is already running and healthy."
fi

# 2. Start the Database Instance
echo "Starting Oracle Database..."
$ORACLE_HOME/bin/dbstart $ORACLE_HOME
if [ $? -ne 0 ]; then
    echo "ERROR: dbstart script returned a non-zero exit code." >&2
    exit 1
fi

# 3. Poll for OPEN State (60 attempts x 10 seconds = 10 minutes)
# Generous window to safely accommodate heavy instance crash recoveries.
echo "Verifying database instance state..."
for i in {1..60}; do
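    STATUS=$(
    $ORACLE_HOME/bin/sqlplus -s / as sysdba <<'EOF'
set heading off feedback off pagesize 0 verify off
select status from v$instance;
exit;
EOF
    )
    # A non-zero exit code means SQL*Plus could not run or could not log in
    if [ $? -ne 0 ]; then
        echo "ERROR: SQL*Plus status check failed." >&2
        exit 1
    fi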

    # Clean up whitespace/newlines
    STATUS=$(echo "$STATUS" | tr -d '[:space:]')
    echo "Attempt $i: Current database status is '$STATUS'"

    if [ "$STATUS" = "OPEN" ]; then
        echo "SUCCESS: Oracle Database is fully OPEN and operational."
        exit 0
    fi
    
    sleep 10
done

echo "ERROR: Database failed to reach OPEN state within 10 minutes." >&2
exit 1  
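The polling logic can be tried without touching a database by moving it into a function that receives the status probe as a command. This is a sketch under that assumption; wait_for_open and any probe function passed to it are hypothetical names, not part of Oracle's tooling.

```shell
# Hypothetical helper: poll a probe command until it prints OPEN.
# Usage: wait_for_open <attempts> <sleep_seconds> <probe command...>
wait_for_open() {
    local attempts="$1" delay="$2" status i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        # Run the probe and strip all whitespace from its output
        status=$("$@" 2>/dev/null | tr -d '[:space:]')
        echo "Attempt $i: status is '$status'"
        if [ "$status" = "OPEN" ]; then
            return 0
        fi
        # Do not sleep after the final attempt
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

In the startup script, the probe would be a function wrapping the SQL*Plus query, called for example as wait_for_open 60 10 query_instance_status (a hypothetical name). In a test, the probe can be a stub that simply echoes MOUNTED or OPEN.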

1.3. Oracle Shutdown Script (oracle-stop.sh)

This script is used for safe shutdown of the Oracle database and listener, usually triggered by systemd during service stop or server shutdown.

First, the script sets ORACLE_HOME so that Oracle commands can locate the database software. Then dbshut shuts down the database instances cleanly.

After that, lsnrctl stop stops the listener so that no new connections are accepted. The script records both return codes and exits with 1 if either step failed; otherwise, exit 0 tells systemd that shutdown completed without errors.

As the root user, create the shutdown script /usr/local/bin/oracle-stop.sh and add the following content:

#!/bin/bash
# Description: Production Oracle Hardened Shutdown Wrapper

export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin

echo "=== Oracle Shutdown Initiated: $(date) ==="

echo "Shutting down Oracle Database Instance(s)..."
$ORACLE_HOME/bin/dbshut $ORACLE_HOME
DB_RC=$?

echo "Stopping Oracle Listener..."
$ORACLE_HOME/bin/lsnrctl stop
LSNR_RC=$?

if [ $DB_RC -ne 0 ] || [ $LSNR_RC -ne 0 ]; then
    echo "ERROR: Shutdown sequence encountered failures. DB_RC=$DB_RC, LSNR_RC=$LSNR_RC" >&2
    exit 1
fi

echo "=== Oracle Shutdown Completed Successfully: $(date) ==="
exit 0

1.4. Set Script Permissions

After creating the startup and shutdown scripts, we need to set correct ownership and permissions. This step is important because systemd runs the service using the oracle user, so the scripts must be accessible and executable by that user.

chown oracle:oinstall /usr/local/bin/oracle-start.sh /usr/local/bin/oracle-stop.sh
chmod 750 /usr/local/bin/oracle-start.sh /usr/local/bin/oracle-stop.sh
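With the unit file and scripts in place, systemd still has to re-read its configuration, and the unit must be enabled so that the [Install] section takes effect at boot. The following sketch shows those steps, to be run as root. The DRY_RUN switch and both function names are illustrative assumptions that allow the commands to be previewed first.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Register oracle-db.service with systemd and enable it at boot
enable_oracle_unit() {
    run systemctl daemon-reload &&
    run systemctl enable oracle-db.service &&
    run systemctl is-enabled oracle-db.service
}
```

Running DRY_RUN=1 enable_oracle_unit prints the three commands without executing them. Enabling the unit does not start it immediately; the service runs at the next boot or when it is started explicitly.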

Part 2: Systemd Email Alert System (Working Pipeline)

This part sends an email notification whenever a systemd service that declares the OnFailure handler fails. The idea is simple: when such a service ends in the failed state, systemd automatically starts a helper service that sends an email to the DBA or support team.

2.1. Email Template Unit

This unit is responsible only for calling the email script. The important part is %i, which means systemd will pass the failed service name dynamically into this unit. So we always know exactly which service has failed.

Type=oneshot is used because this is not a long-running service; it performs a single action (sending the email) and exits.

As the root user, create the file /etc/systemd/system/unit-status-mail@.service and add the following content:

[Unit]
Description=Send Status Email for %i
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/send-systemd-alert.sh %i
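To make the hand-off concrete: in OnFailure=unit-status-mail@%n.service, %n expands to the full name of the failing unit, so for oracle-db.service systemd starts unit-status-mail@oracle-db.service.service, and inside the template %i is the text between @ and the final .service suffix. The sketch below only mimics that string handling with shell parameter expansion to make the mapping visible. It is an illustration, not how systemd is implemented, and the function name instance_of is hypothetical.

```shell
# Hypothetical illustration: derive the instance name (%i) from a
# template unit name of the form prefix@instance.service
instance_of() {
    local unit="$1"
    local inst="${unit#*@}"     # drop everything up to the first @
    echo "${inst%.service}"     # drop the trailing unit-type suffix
}
```

For example, instance_of unit-status-mail@oracle-db.service.service yields oracle-db.service, which is the value the email script receives as its first argument. The lowercase %i passes the name as written, whereas %I would unescape it and turn the hyphen into a slash.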

2.2. Email Sending Script

This script takes the service name as input, then collects the system hostname and builds the email content.

The email contains:

  • Hostname of server
  • Name of failed service
  • Commands to check service status
  • Commands to check logs

This is very helpful for the DBA team because they can directly see what command should be used for troubleshooting without guessing.

Finally, the script sends an email using mailx and also logs the result in /tmp/systemd-mail.log for debugging purposes.

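As the root user, create the script /usr/local/bin/send-systemd-alert.sh, which sends the email, and add the following content. The script must be executable (for example, chmod 750), because systemd runs it directly through ExecStart.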

#!/bin/bash

UNIT_NAME="$1"

HOSTNAME=$(hostname)
RECIPIENT="youremail@yourdomain.com"

SUBJECT="[ALERT] ${UNIT_NAME} failed on ${HOSTNAME}"

BODY="
Host: ${HOSTNAME}
Unit: ${UNIT_NAME}

Check status:
systemctl status ${UNIT_NAME}

Check logs:
journalctl -u ${UNIT_NAME} -b
"

echo "${BODY}" | /usr/bin/mailx -v -s "${SUBJECT}" "${RECIPIENT}" \
>> /tmp/systemd-mail.log 2>&1
# Capture the mailx return code before any other command can overwrite it
RC=$?

echo "$(date) mailx rc=${RC}" >> /tmp/systemd-mail.log
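Message construction can also be separated from delivery so that the text can be checked without sending mail. This is a sketch; build_alert_body is a hypothetical name, and its output mirrors the body used in the script above.

```shell
# Hypothetical helper: print the alert body for a given unit and host.
# The host defaults to the local hostname when not supplied.
build_alert_body() {
    local unit="$1" host="${2:-$(hostname)}"
    cat <<EOF
Host: ${host}
Unit: ${unit}

Check status:
systemctl status ${unit}

Check logs:
journalctl -u ${unit} -b
EOF
}
```

In the alert script, the result could then be piped to mailx, for example build_alert_body "$UNIT_NAME" | /usr/bin/mailx -s "$SUBJECT" "$RECIPIENT".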

2.3. Systemd Failure Notification Workflow

When a service fails, systemd automatically triggers the action defined in the OnFailure parameter. In this configuration, the failed service name is passed to the unit-status-mail template service, which executes the email script and sends an alert notification to the configured recipient. This allows the DBA or support team to be notified immediately without manually checking service status or system logs.


Part 3: Creating the Test Service

Since my production environment was already online and serving users, I could not test the Oracle auto-start service directly without introducing unnecessary risk. Therefore, I created a separate test service to validate the email notification framework safely.

3.1. oracle-db-test.service

The oracle-db-test.service unit is intentionally designed to be very similar to the production oracle-db.service. It uses the same Oracle user, service type, and OnFailure configuration. This allows us to test the complete systemd alert workflow in an environment that closely matches production behavior.

As the root user, create the file /etc/systemd/system/oracle-db-test.service and add the following content:
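The unit definition itself is missing at this point. The following is a minimal sketch, assuming it mirrors oracle-db.service from Part 1, with ExecStart pointing at the test wrapper from section 3.2 and no ExecStop or [Install] section. The Description text and the helper name print_test_unit are assumptions.

```shell
# Hypothetical helper: print an assumed oracle-db-test.service definition
print_test_unit() {
    cat <<'EOF'
[Unit]
Description=Oracle Database Test Service (alert validation)
After=network-online.target
Wants=network-online.target
OnFailure=unit-status-mail@%n.service

[Service]
Type=oneshot
User=oracle
Group=oinstall
WorkingDirectory=/u01/app/oracle
RemainAfterExit=yes
ExecStart=/usr/local/bin/oracle-db-test-wrapper.sh
TimeoutStartSec=10min
EOF
}
```

As root, the file could then be written with print_test_unit > /etc/systemd/system/oracle-db-test.service. Leaving out the [Install] section keeps the test unit from being enabled at boot by accident.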

3.2. Test Wrapper Script

The wrapper script is responsible for simulating Oracle service execution without making any changes to the database.

It performs a safe lsnrctl status check, which only verifies listener availability and does not start or stop any Oracle component. After a short delay to simulate a validation phase, the script intentionally returns a failure code.

Because systemd treats a non-zero return code as a service failure, the OnFailure action is triggered, and the email notification process starts automatically. 

As the root user, create /usr/local/bin/oracle-db-test-wrapper.sh and add the following content:

#!/bin/bash

export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
export PATH=$ORACLE_HOME/bin:/usr/bin:/bin

echo "======================================"
echo "ORACLE PROD-MIRROR TEST START: $(date)"
echo "User: $(whoami)"
echo "Working Dir: $(pwd)"
echo "======================================"

# STEP 1: mimic listener check (SAFE READ ONLY)
echo "[STEP 1] Checking listener status (no changes)..."
$ORACLE_HOME/bin/lsnrctl status >/dev/null 2>&1
LSNR_RC=$?

if [ $LSNR_RC -eq 0 ]; then
    echo "Listener reachable"
else
    echo "Listener NOT reachable (simulated condition)"
fi

# STEP 2: mimic DB check style (NO DB CONNECTION)
echo "[STEP 2] Simulating DB validation phase..."
sleep 2

# STEP 3: FORCE FAILURE (this is intentional for OnFailure test)
echo "[STEP 3] Forcing controlled failure to trigger OnFailure hook"

exit 1 

3.3. Set Script Permissions

After creating the test wrapper script, set the correct ownership and permissions:

chown oracle:oinstall /usr/local/bin/oracle-db-test-wrapper.sh
chmod 750 /usr/local/bin/oracle-db-test-wrapper.sh

3.4. How to Test

Run the following commands:


systemctl daemon-reload
systemctl start oracle-db-test.service

systemctl daemon-reload instructs systemd to reload all unit files and recognize any new or modified service definitions.

systemctl start oracle-db-test.service starts the test service. The service is expected to fail intentionally, which triggers the OnFailure action and starts the email notification process.

If the alert email is received successfully, we can be confident that the notification framework is working correctly and is ready for use with the production Oracle service.
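Besides waiting for the email, the result can be checked on the server. The sketch below reads the most recent mailx return code recorded by send-systemd-alert.sh in /tmp/systemd-mail.log. The function name last_mail_rc is hypothetical, and it assumes the "mailx rc=" log line format written by that script.

```shell
# Hypothetical helper: print the most recent mailx return code from the log
last_mail_rc() {
    local log="${1:-/tmp/systemd-mail.log}"
    # Take the last "mailx rc=N" entry and keep only N
    grep -o 'mailx rc=[0-9]*' "$log" | tail -n 1 | cut -d= -f2
}
```

In addition, systemctl status oracle-db-test.service should show the unit as failed, and journalctl -u unit-status-mail@oracle-db-test.service.service shows whether the mail unit ran. A return code of 0 from mailx means the message was handed to the local mail system, not necessarily that it was delivered.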

Conclusion

This design provides:

  • An Oracle automatic startup system using systemd
  • Safe shutdown handling
  • An email alert system for service failures
  • A production-mirror test service for validation

This architecture is stable because:

  • systemd handles the service lifecycle
  • Oracle scripts handle the database logic
  • The email system is fully separated and reusable
  • The test service allows safe validation without production risk

This approach is useful in enterprise Oracle environments where downtime is costly and testing must be safe.
