Introduction
In Oracle environments that do not use Oracle ASM (and typically have no Oracle Restart to manage the database), databases are not configured to start automatically after a server reboot. To automate database startup and shutdown, administrators commonly use Oracle's standard dbstart and dbshut utilities.
Although these
tools provide basic automation, they do not always verify that the database has
successfully reached the OPEN state, and they do not provide built-in
notification when startup failures occur.
This article
demonstrates how to combine dbstart/dbshut, systemd, startup
validation checks, and automated email alerts to create a reliable Oracle
database auto-start framework. The approach is compatible with Oracle Linux 7,
8, and 9 and can be used with Oracle database versions that support the
standard dbstart/dbshut utilities. It also includes a safe testing
method for validating the email alert system without affecting a production
database.
In this blog
post, I design:
- An Oracle auto-start service using systemd
- Proper shutdown control
- An email alert system for service failures
- A safe test service to validate the alerting mechanism without production risk
This design is simple, stable, and suitable for Oracle Linux production environments.
Part 1: Oracle Database Auto Start Service (Production)
1.1. Creating oracle-db.service file
As the root user, create the file /etc/systemd/system/oracle-db.service and add the following content:
[Unit]
Description=Oracle Database Service
After=network-online.target
Wants=network-online.target
# Failure (important part)
OnFailure=unit-status-mail@%n.service
[Service]
Type=oneshot
User=oracle
Group=oinstall
WorkingDirectory=/u01/app/oracle
RemainAfterExit=yes
# Execute wrappers
ExecStart=/usr/local/bin/oracle-start.sh
ExecStop=/usr/local/bin/oracle-stop.sh
TimeoutStartSec=10min
TimeoutStopSec=10min
[Install]
WantedBy=multi-user.target
The
network-online.target dependency ensures that network services are available
before Oracle startup begins. The OnFailure parameter triggers an email alert
whenever the Oracle service fails, allowing administrators to be notified
immediately.
The service
runs as the Oracle software owner (oracle) and uses Type=oneshot because the
startup script performs its work and exits. The RemainAfterExit=yes parameter
keeps the service in an active state after a successful startup.
Startup and
shutdown operations are handled through dedicated wrapper scripts, making the
configuration easier to maintain and troubleshoot.
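The timeout settings (TimeoutStartSec and TimeoutStopSec, both set to 10 minutes) give Oracle enough time to complete startup and shutdown before systemd treats the operation as failed. The listener check and dbstart count against the same limit as the polling loop, so a very slow start may be ended by the systemd timeout shortly before the script's own 10-minute window expires. In both cases the unit enters the failed state and the OnFailure action runs.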
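The unit file only takes effect at boot once it has been enabled. A minimal sketch of that step, assuming it is run as root; enable_oracle_autostart is an illustrative helper name, while the systemctl subcommands are standard:

```shell
#!/bin/bash
# Sketch: register oracle-db.service for automatic start at boot (run as root).
# enable_oracle_autostart is an illustrative helper name, not part of systemd.
enable_oracle_autostart() {
    local unit="${1:-oracle-db.service}"
    systemctl daemon-reload || return 1    # pick up the new unit file
    systemctl enable "$unit" || return 1   # link it into multi-user.target
    systemctl is-enabled "$unit"           # prints "enabled" on success
}
```

Calling enable_oracle_autostart only enables the unit; it does not start it, so a database that is already running is not touched. Because systemd runs ExecStop only for active units, the shutdown script takes effect once the service has been started through systemd, for example at the next reboot.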
1.2. Oracle Startup Script (oracle-start.sh)
This script performs more than a standard dbstart operation. First,
it checks whether the Oracle listener is already running and starts it only
when required. This prevents unnecessary listener startup attempts during
server reboot or service restart.
After the listener check, the script executes Oracle's native
dbstart utility to start the database instance.
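One prerequisite is easy to miss: dbstart only acts on instances whose /etc/oratab entry ends in Y. The check below is a small sketch of how to confirm that before relying on the service; oratab_autostart_enabled is an illustrative helper name, and the SID and path in the usage note are the ones used throughout this post.

```shell
#!/bin/bash
# Sketch: report whether a SID is flagged for auto-start in an oratab file.
# Entries have the form SID:ORACLE_HOME:Y|N; dbstart acts on "Y" entries.
# oratab_autostart_enabled is an illustrative helper name.
oratab_autostart_enabled() {
    local sid="$1" oratab="${2:-/etc/oratab}"
    # Skip comment lines, then match the SID in field 1 and "Y" in field 3.
    awk -F: -v sid="$sid" '
        /^[[:space:]]*#/ { next }
        $1 == sid && $3 == "Y" { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$oratab"
}
```

For example, oratab_autostart_enabled ORCL returns success only if /etc/oratab contains a line such as ORCL:/u01/app/oracle/product/19.0.0/dbhome_1:Y.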
The most important part of the script is the validation loop.
Instead of assuming that dbstart completed successfully, the script connects to
the database using SQL*Plus and checks the instance status every 10 seconds.
Startup is considered successful only when the database reaches the OPEN state.
If the database does not reach the OPEN state within 10 minutes,
the script returns a failure code to systemd. This allows the OnFailure
notification service to send an email alert and notify administrators that
manual investigation may be required. The polling loop does not always wait for
the full 10 minutes. If the database reaches the OPEN state earlier, the script
exits immediately. The full 10-minute window only applies in failure or
slow-start scenarios.
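The polling pattern described above can be isolated into a small generic function. This is only a sketch of the idea, not the production script: poll_until is a hypothetical name, and the command that reports the status is passed in as an argument so the loop can be tried without a database.

```shell
#!/bin/bash
# Sketch: run a status command repeatedly until it prints the expected value.
# Usage: poll_until EXPECTED ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
# poll_until is a hypothetical helper name used only for illustration.
poll_until() {
    local expected="$1" attempts="$2" interval="$3"
    shift 3
    local i status
    for ((i = 1; i <= attempts; i++)); do
        # Strip whitespace so "OPEN" followed by a newline still matches.
        status=$("$@" 2>/dev/null | tr -d '[:space:]')
        if [ "$status" = "$expected" ]; then
            return 0    # success: stop immediately, no further waiting
        fi
        sleep "$interval"
    done
    return 1            # window exhausted without reaching the expected state
}
```

With 60 attempts and a 10-second interval this gives the same 10-minute window as the startup script, and it returns as soon as the expected state is seen.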
As the root user, create the Oracle startup script /usr/local/bin/oracle-start.sh and add the following content:
#!/bin/bash
# Description: Production Oracle Startup & Deep Health Verification Wrapper
export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin
echo "=== Oracle Startup Initiated: $(date) ==="
# 1. Start the Listener safely (only if not already running)
$ORACLE_HOME/bin/lsnrctl status >/dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "Listener is stopped. Starting Oracle Listener..."
    $ORACLE_HOME/bin/lsnrctl start
    if [ $? -ne 0 ]; then
        echo "ERROR: Listener failed to start." >&2
        exit 1
    fi
else
    echo "Listener is already running and healthy."
fi
# 2. Start the Database Instance
echo "Starting Oracle Database..."
$ORACLE_HOME/bin/dbstart $ORACLE_HOME
if [ $? -ne 0 ]; then
    echo "ERROR: dbstart script returned a non-zero exit code." >&2
    exit 1
fi
# 3. Poll for OPEN State (60 attempts x 10 seconds = 10 minutes)
# Generous window to safely accommodate heavy instance crash recoveries.
echo "Verifying database instance state..."
for i in {1..60}; do
    # Assumption: the query and the error check are reconstructed from the
    # description above (instance status read from v$instance).
    STATUS=$(
        $ORACLE_HOME/bin/sqlplus -s / as sysdba <<'EOF'
set heading off feedback off pagesize 0
select status from v$instance;
exit;
EOF
    )
    if [ $? -ne 0 ]; then
        echo "ERROR: SQL*Plus could not be executed." >&2
        exit 1
    fi
    # Clean up whitespace/newlines
    STATUS=$(echo "$STATUS" | tr -d '[:space:]')
    echo "Attempt $i: Current database status is '$STATUS'"
    if [ "$STATUS" = "OPEN" ]; then
        echo "SUCCESS: Oracle Database is fully OPEN and operational."
        exit 0
    fi
    sleep 10
done
echo "ERROR: Database failed to reach OPEN state within 10 minutes." >&2
exit 1
1.3. Oracle Shutdown Script (oracle-stop.sh)
This script is used for safe shutdown of the Oracle database and listener, usually triggered by systemd during service stop or server shutdown.
First, we set ORACLE_HOME so Oracle commands can locate the database software. Then dbshut shuts down the database instances that are flagged for automatic handling in /etc/oratab.
After that, lsnrctl stop stops the listener to prevent new connections. The script records the return code of both steps: it exits with 1 if either step failed, and exit 0 tells systemd that shutdown completed without errors.
As the root user, create the Oracle shutdown script /usr/local/bin/oracle-stop.sh and add the following content:
#!/bin/bash
# Description: Production Oracle Hardened Shutdown Wrapper
export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin
echo "=== Oracle Shutdown Initiated: $(date) ==="
echo "Shutting down Oracle Database Instance(s)..."
$ORACLE_HOME/bin/dbshut $ORACLE_HOME
DB_RC=$?
echo "Stopping Oracle Listener..."
$ORACLE_HOME/bin/lsnrctl stop
LSNR_RC=$?
if [ $DB_RC -ne 0 ] || [ $LSNR_RC -ne 0 ]; then
    echo "ERROR: Shutdown sequence encountered failures. DB_RC=$DB_RC, LSNR_RC=$LSNR_RC" >&2
    exit 1
fi
echo "=== Oracle Shutdown Completed Successfully: $(date) ==="
exit 0
1.4. Set Script Permissions
After
creating the startup and shutdown scripts, we need to set correct ownership and
permissions. This step is important because systemd runs the service using the
oracle user, so the scripts must be accessible and executable by that user.
chown oracle:oinstall /usr/local/bin/oracle-start.sh /usr/local/bin/oracle-stop.sh
chmod 750 /usr/local/bin/oracle-start.sh /usr/local/bin/oracle-stop.sh
Part 2: Systemd Email Alert System (Working Pipeline)
This part is responsible for sending an email notification when a systemd service that declares OnFailure fails. The idea is simple: when a service returns an error status, systemd automatically triggers a helper service that sends an email to the DBA or support team.
2.1. Email Template Unit
This unit is responsible only for calling the email script. The important part is %i, the instance name of the template unit: systemd passes the failed service name into this unit through it, so we always know exactly which service has failed.
Type=oneshot is used because this is not a long-running service; it executes one action (send email) and finishes.
As the root user, create the file /etc/systemd/system/unit-status-mail@.service
and add the following content:
[Unit]
Description=Send Status Email for %i
After=network-online.target
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/send-systemd-alert.sh %i
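To make the name passing concrete, the snippet below mimics in plain bash how the failed unit name ends up in %i. It is an illustration only: systemd performs this resolution internally, and template_instance is a hypothetical helper written for this post.

```shell
#!/bin/bash
# Illustration only: systemd performs this resolution internally.
# template_instance is a hypothetical helper that mimics how %i is derived
# from an instantiated unit name of the form name@INSTANCE.service.
template_instance() {
    local unit="$1"
    local rest="${unit#*@}"           # drop everything up to the first "@"
    printf '%s\n' "${rest%.service}"  # drop the trailing ".service" suffix
}

failed_unit="oracle-db.service"                       # value of %n in the failing unit
alert_unit="unit-status-mail@${failed_unit}.service"  # what OnFailure expands to
template_instance "$alert_unit"                       # prints: oracle-db.service
```

The template uses %i rather than %I. As far as I understand the specifiers, %I undoes systemd's name escaping, which would turn the hyphen in oracle-db into a slash, so %i is the safer choice here.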
2.2. Email Sending Script
This script takes the service name as input, then collects the system
hostname and builds the email content.
The email contains:
- Hostname of server
- Name of failed service
- Commands to check service status
- Commands to check logs
This is very helpful for the DBA team because they can directly see
what command should be used for troubleshooting without guessing.
Finally, the script sends an email using mailx and also logs the
result in /tmp/systemd-mail.log for debugging purposes.
As the root user, create /usr/local/bin/send-systemd-alert.sh, the script that actually sends the email, and add the following content:
#!/bin/bash
UNIT_NAME="$1"
HOSTNAME=$(hostname)
RECIPIENT="youremail@yourdomain.com"
SUBJECT="[ALERT] ${UNIT_NAME} failed on ${HOSTNAME}"
BODY="
Host: ${HOSTNAME}
Unit: ${UNIT_NAME}
Check status:
systemctl status ${UNIT_NAME}
Check logs:
journalctl -u ${UNIT_NAME} -b
"
echo "${BODY}" | /usr/bin/mailx -v -s "${SUBJECT}" "${RECIPIENT}" \
    >> /tmp/systemd-mail.log 2>&1
RC=$?  # capture the mailx exit code before another command overwrites $?
echo "$(date) mailx rc=${RC}" >> /tmp/systemd-mail.log
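systemd can only run the alert script if it is executable and a mail client is installed. The pre-flight check below is a sketch of how to verify both; check_alert_prereqs is an illustrative helper name, and the suggested mode 750 is an assumption chosen to match the other scripts in this post.

```shell
#!/bin/bash
# Sketch: verify that the alert script is executable and a mail client exists.
# check_alert_prereqs is an illustrative helper name; mode 750 in the hint is
# an assumption that mirrors the permissions used for the Oracle wrappers.
check_alert_prereqs() {
    local script="${1:-/usr/local/bin/send-systemd-alert.sh}"
    local mailer="${2:-mailx}"
    local rc=0
    if [ ! -x "$script" ]; then
        echo "not executable: $script (try: chmod 750 $script)" >&2
        rc=1
    fi
    if ! command -v "$mailer" >/dev/null 2>&1; then
        echo "mail client not found on PATH: $mailer" >&2
        rc=1
    fi
    return "$rc"
}
```

Running check_alert_prereqs as root with no arguments checks the default script path and mailx; a non-zero return means one of the two prerequisites is missing.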
2.3. Systemd Failure Notification Workflow
When a service fails, systemd automatically triggers the action
defined in the OnFailure parameter. In this configuration, the failed service
name is passed to the unit-status-mail template service, which executes the
email script and sends an alert notification to the configured recipient. This
allows the DBA or support team to be notified immediately without manually
checking service status or system logs.
Part 3: Creating the Test Service
Since my production environment was already online and serving
users, I could not test the Oracle auto-start service directly without
introducing unnecessary risk. Therefore, I created a separate test service to
validate the email notification framework safely.
3.1. oracle-db-test.service
The oracle-db-test.service unit is intentionally designed to be very similar to the production oracle-db.service. It uses the same Oracle user, service type, and OnFailure configuration. This allows us to test the complete systemd alert workflow in an environment that closely matches production behavior.
As the root user, create the file /etc/systemd/system/oracle-db-test.service
and add the following content:
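A minimal sketch of such a unit, written as a small helper that generates the file. The contents are an assumption based on the description above (same user, group, service type, and OnFailure as oracle-db.service, with ExecStart pointing at the test wrapper), and write_test_unit is an illustrative name.

```shell
#!/bin/bash
# Sketch of the test unit, assuming it mirrors oracle-db.service except for
# ExecStart. The contents are an assumption based on the description above.
# write_test_unit is an illustrative helper name.
write_test_unit() {
    local dir="${1:-/etc/systemd/system}"
    cat > "${dir}/oracle-db-test.service" <<'EOF'
[Unit]
Description=Oracle Database Test Service (forces a failure)
OnFailure=unit-status-mail@%n.service

[Service]
Type=oneshot
User=oracle
Group=oinstall
WorkingDirectory=/u01/app/oracle
ExecStart=/usr/local/bin/oracle-db-test-wrapper.sh
EOF
}
```

The sketch leaves out an [Install] section because the test service is started manually and is not meant to run at boot.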
3.2. Test Wrapper Script
The wrapper script is responsible for simulating Oracle service
execution without making any changes to the database.
It performs a safe lsnrctl status check, which only verifies
listener availability and does not start or stop any Oracle component. After a
short delay to simulate a validation phase, the script intentionally returns a
failure code.
Because systemd treats a non-zero return code as a service failure,
the OnFailure action is triggered, and the email notification process starts
automatically.
As the root user, create /usr/local/bin/oracle-db-test-wrapper.sh and add the following content:
#!/bin/bash
export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
export ORACLE_SID=ORCL
export PATH=$ORACLE_HOME/bin:/usr/bin:/bin
echo "======================================"
echo "ORACLE PROD-MIRROR TEST START: $(date)"
echo "User: $(whoami)"
echo "Working Dir: $(pwd)"
echo "======================================"
# STEP 1: mimic listener check (SAFE READ ONLY)
echo "[STEP 1] Checking listener status (no changes)..."
$ORACLE_HOME/bin/lsnrctl status >/dev/null 2>&1
LSNR_RC=$?
if [ $LSNR_RC -eq 0 ]; then
    echo "Listener reachable"
else
    echo "Listener NOT reachable (simulated condition)"
fi
# STEP 2: mimic DB check style (NO DB CONNECTION)
echo "[STEP 2] Simulating DB validation phase..."
sleep 2
# STEP 3: FORCE FAILURE (this is intentional for OnFailure test)
echo "[STEP 3] Forcing controlled failure to trigger OnFailure hook"
exit 1
3.3. Set Script Permissions
After creating the test wrapper script, set the correct ownership
and permissions:
chown oracle:oinstall /usr/local/bin/oracle-db-test-wrapper.sh
chmod 750 /usr/local/bin/oracle-db-test-wrapper.sh
3.4. How to Test
Run the following commands:
systemctl daemon-reload
systemctl start oracle-db-test.service
systemctl daemon-reload instructs systemd to reload all unit files
and recognize any new or modified service definitions.
systemctl start oracle-db-test.service starts the test service. The
service is expected to fail intentionally, which triggers the OnFailure action
and starts the email notification process.
If the alert email is received successfully, we can be confident that the notification framework is working correctly and is ready for use with the production Oracle service.
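Besides checking the inbox, the log written by the alert script shows whether mailx itself reported success. The helper below is a sketch for reading the most recent result; last_mail_rc is an illustrative name, and it relies on the "mailx rc=" line format that send-systemd-alert.sh writes.

```shell
#!/bin/bash
# Sketch: read the exit code of the most recent mailx run from the log that
# send-systemd-alert.sh appends to. last_mail_rc is an illustrative name.
last_mail_rc() {
    local log="${1:-/tmp/systemd-mail.log}"
    # Keep only the "mailx rc=N" fragments, take the newest, print N.
    grep -o 'mailx rc=[0-9]*' "$log" | tail -n 1 | cut -d= -f2
}
```

A result of 0 from last_mail_rc means the mail client accepted the message. On the systemd side, systemctl is-failed oracle-db-test.service should print failed after the test, journalctl -u oracle-db-test.service -b shows the wrapper output, and systemctl reset-failed oracle-db-test.service clears the failed state afterwards.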
Conclusion
In this design, we built:
- An Oracle automatic startup system using systemd
- Safe shutdown handling
- An email alert system for service failures
- A production-mirror test service for validation
This architecture is stable because:
- systemd handles the service lifecycle
- Oracle scripts handle the database logic
- The email system is fully separated and reusable
- The test service allows safe validation without production risk
This approach is useful in enterprise Oracle environments where downtime is critical and testing must be safe.