Script: Self-healing for FHEM and PILIGHT (no pilight ssdp connections found)

Started by ikea, 24 November 2018, 00:10:25

ikea

Hi,

I have been using FHEM and pilight for years. Time and again I ran into the problem that one of the two services, FHEM or PILIGHT, died (apparently for no reason, or because of bugs in FHEM modules). The symptoms were that FHEM did not run at all or only ran incorrectly (WebGUI unreachable) and/or the error message "no pilight ssdp connections found" appeared.
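For illustration, the pilight symptom can be detected by searching the output of a pilight command for exactly that message. This is only a minimal sketch; the function name `pilight_output_is_faulty` is made up for this example and is not part of pilight or FHEM.

```shell
#!/bin/bash

# Hypothetical helper: succeeds (exit code 0) if the given command output
# contains the pilight SSDP error message, fails (exit code 1) otherwise.
pilight_output_is_faulty() {
  local output="$1"
  [[ "$output" == *"no pilight ssdp connections found"* ]]
}
```

In practice you would pass it the combined stdout/stderr of a harmless pilight call, for example the `pilight-send -p raw` probe that the script further down uses.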

I have written a script that is run by CRON every x minutes.
It checks whether FHEM and PILIGHT are working.
If an error is found, the affected services are restarted.
If that still fails after x attempts, the whole system can optionally be rebooted.
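The escalation described above boils down to a small decision based on two counters. The following is a simplified sketch, not the code from the script itself; the function `next_action` and its argument order are invented for this example.

```shell
#!/bin/bash

# Hypothetical helper illustrating the escalation logic.
# Arguments: restarts done so far, restarts allowed before a reboot,
#            reboots done so far, maximum number of reboots,
#            whether rebooting is allowed (YES/NO).
# Prints one of: restart, reboot, give_up
next_action() {
  local restarts="$1" max_restarts="$2" reboots="$3" max_reboots="$4" allow_reboot="$5"
  if [ "$restarts" -lt "$max_restarts" ]; then
    echo "restart"
  elif [ "$allow_reboot" = "YES" ] && [ "$reboots" -lt "$max_reboots" ]; then
    echo "reboot"
  else
    echo "give_up"
  fi
}

next_action 0 3 0 2 YES   # first failure
next_action 3 3 0 2 YES   # all restart attempts used up
next_action 3 3 2 2 YES   # all reboots used up
next_action 3 3 0 2 NO    # rebooting not allowed
```

The four calls print restart, reboot, give_up and give_up, in that order. Persisting the two counters in small files between CRON runs is what lets each run know how far the escalation has already gone.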

Optionally, an error message can be written to your FHEM log.
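Writing to the FHEM log is nothing more than appending a few lines to the current log file, provided it exists. A rough sketch of the idea; the helper name `append_error_banner` and the banner text are assumptions for this example, not the script's exact output.

```shell
#!/bin/bash

# Hypothetical helper: append a timestamped error banner to an existing
# log file. Returns 1 and writes nothing if the log file does not exist.
append_error_banner() {
  local logfile="$1" source_name="$2"
  [ -e "$logfile" ] || return 1
  printf '\n!!! %s %s detected an ERROR !!!\n\n' \
    "$(date +%Y-%m-%d_%H:%M:%S)" "$source_name" >> "$logfile"
}
```

You would call it with the absolute path of your FHEM log, e.g. `append_error_banner "/opt/fhem/log/fhem-$(date +%Y-%m).log" "selfhealer.sh"`, where the path is the default assumed in the script's settings and may differ on your system.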

Here is the script; maybe someone can make use of it.

Download: http://www.emjau.de/downloads/selfhealer_fhem_piligth.sh

The full description of how to use and customize the script is in the comments at the top of the script.

I run it on my Pi 3B with Raspbian.

If anything does not work or you have questions, please contact me with details of your system and an exact description of the error.
I would like to keep improving the script so that it is as easy and trouble-free as possible for all users.
For that I need input from those who have tried it.
Thanks!

UPDATE 2018-11-24: Determining the absolute path when the script is run by CRON is fixed!
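Some background on why this matters: CRON does not start a job from the script's own directory, so relative file names (such as the .log and .status files) end up somewhere else unless the script first changes into its own directory. The script does this with a symlink-resolving loop; a shorter alternative, assuming GNU `readlink` with the `-f` option is available (as on Raspbian), could look like this. The function name `resolve_script_dir` is made up for the example.

```shell
#!/bin/bash

# Hypothetical helper: print the directory that really contains the given
# script file, following any symlinks. Assumes GNU readlink (-f).
resolve_script_dir() {
  local script_path
  script_path="$(readlink -f "$1")" || return 1
  dirname "$script_path"
}

# Typical use at the top of a script started by CRON:
# cd "$(resolve_script_dir "${BASH_SOURCE[0]}")" || exit 1
```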


#!/bin/bash

# =====================================================================================
# Self-Healer for the services FHEM and PILIGHT
# (c)2018 by emjau
# contact fhem@emjau.de for help (put catchword FHEM to subject!)
# =====================================================================================
#
# ! ! ! !
# Please read carefully to understand how to get this script working correctly ! ! ! !
# ! ! ! !
#
# =====================================================================================
#
# This script checks whether the services FHEM and PILIGHT are running properly.
# If a malfunction is detected, these services are restarted.
# After repeated unsuccessful attempts the whole system can be rebooted (if you allow it).
#
# Put this script to [YourDirectory].
# You may change the script's name, it must end with .sh
# The names of the state- and log-files will be automatically adapted.
#
# The script must be executable (755 is sufficient, 777 is not required):
# >> chmod 755 [YourDirectory]/[scriptname].sh
#
# Run the script periodically, e.g. every 5 minutes:
# Entry in CRONTAB:  ( >> sudo crontab -e)
# */5 * * * * [YourDirectory]/[scriptname].sh
#
# Customize the section YOUR SETTINGS !!!
#
# The following files will be created by this script:
#   - [scriptname].log
#   - [scriptname].status
#   - [scriptname].reboots
#
# ...you may reset all counters by deleting the .status and .reboots files.
#
# Check if the script works:
#  Run one of these commands:
#   >> sudo service fhem stop         or
#   >> sudo service pilight stop
# ...then wait for the next run by CRON (or run it manually), read the logfile and test the function of FHEM/PILIGHT
#





#=============================================================================================#
#  YOUR SETTINGS (customize this!)                                                            #
#=============================================================================================#

FHEM_USER="YOUR_USERNAME_FOR_FHEM_WEBGUI" # Your FHEM WebGUI-User
FHEM_PASS="YOUR_PASSWORD_FOR_FHEM_WEBGUI" # Your FHEM WebGUI-Password

FHEM_URL="http://127.0.0.1:8083/fhem" # Your URL + :Port for the  FHEM-WebGUI [default http://127.0.0.1:8083/fhem]

REBOOT=YES # [YES / NO] Should the whole system be rebooted if restarting the services was not successful?
TRIES_BEFORE_REBOOT=3 # [default=3] how many service restarts to try before rebooting the whole system
REBOOTS_MAX=2 # [default=2] maximum number of reboots

LOG_SUCCESSFUL_CHECKS=NO # [YES / NO] if you want to log successful checks too -> set to YES [default=NO]
                         # Warning: setting to YES might cause a big logfile!

WRITE_TO_FHEM_LOGFILE=YES # [YES / NO] do you want a message written to your FHEM logfile if an error was detected?
FHEM_LOGFILE=/opt/fhem/log/fhem-$(date +%Y-%m).log # your FHEM logfile; make sure to give the absolute path!






#=============================================================================================#
#  PROGRAM (do not touch!)                                                                    #
#=============================================================================================#

# determine the absolute path where this script is run from
# ---
SOURCE="${BASH_SOURCE[0]}"
while [ -h "$SOURCE" ]; do # resolve $SOURCE until the file is no longer a symlink
  DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
  SOURCE="$(readlink "$SOURCE")"
  [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE" # if $SOURCE was a relative symlink, we need to resolve it relative to the path where the symlink file was located
done
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
# ---
cd "$DIR" || exit 1 # quote the path (it may contain spaces) and stop if the directory cannot be entered

SCRIPTNAME_FULL=${0##*/} # determine the name of this script
SCRIPTNAME=${SCRIPTNAME_FULL%.*} # cut .sh from scriptname
logfile=$SCRIPTNAME.log
statusfile=$SCRIPTNAME.status
rebootfile=$SCRIPTNAME.reboots

timestamp=$(date +%Y-%m-%d_%H:%M:%S)

if [ ! -e "$statusfile" ]; then echo "0" > "$statusfile"; fi
if [ ! -e "$rebootfile" ]; then echo "0" > "$rebootfile"; fi
if [ ! -e "$logfile" ]; then echo "Logfile created on first run: $timestamp" | tee -a "$logfile"; echo "--------------------------------------------------------" | tee -a "$logfile"; fi

status=$(sed -n '1p;2q' "$statusfile") # read first (and only) row of statusfile
reboot_counter=$(sed -n '1p;2q' "$rebootfile") # read first (and only) row of rebootfile

#echo "$timestamp ... testing reachability of FHEM-WebGui and testing login:" | tee -a $logfile
# Check if login to FHEM-WebGUI is successful and if the button 'Save config' is visible:
testvar=$(/usr/bin/wget "$FHEM_URL" --timeout=10 --tries=2 --http-user="$FHEM_USER" --http-password="$FHEM_PASS" -O - 2>/dev/null | grep 'Save config') # try login and suppress wget's status messages (--http-password replaces the deprecated --http-passwd)
#testvar=$(/usr/bin/wget "$FHEM_URL" --timeout=10 --tries=2 --http-user="$FHEM_USER" --http-password="$FHEM_PASS" -O - | grep 'Save config') # try login WITHOUT suppressing wget's status messages
testvar_len=${#testvar}

failure_fhem=0
failure_pilight=0

if [ $testvar_len -lt 1 ] # if login to FHEM-WebGUI was NOT successful
then
  failure_fhem=1
else
  TEST=$(/usr/local/bin/pilight-send -p raw --code="999 999 999 999" 2>&1)
  if [[ "$TEST" =~ "no pilight ssdp connections found" ]] # if pilight ssdp-connection is faulty
  then
    failure_pilight=1
  fi
fi

#,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
# for testing purposes only !
#,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
#failure_fhem="1"
#failure_pilight="1"
#,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
#,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,



if [ $failure_fhem -gt 0 ] || [ $failure_pilight -gt 0 ] # if login to FHEM-WebGUI was NOT successful  OR  pilight ssdp-connection is faulty
then
  timestamp=$(date +%Y-%m-%d_%H:%M:%S)
  echo "----------------------------------------------------------------------------------------" | tee -a $logfile
  if [ $failure_fhem -gt 0 ]; then echo "$timestamp   !!!   The FHEM WebGUI is down   !!!" | tee -a $logfile; fi
  if [ $failure_pilight -gt 0 ]; then echo "$timestamp   !!!   PiLight SSDP-Connection is broken   !!!" | tee -a $logfile; fi
  if [ "$WRITE_TO_FHEM_LOGFILE" = "YES" ]
  then
    if [ ! -e "$FHEM_LOGFILE" ]
    then
      echo "!!! Your stated FHEM_LOGFILE does not exist: $FHEM_LOGFILE" | tee -a $logfile
    else
      echo $'\n\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n'$timestamp$'\n'$SCRIPTNAME_FULL$' detected an ERROR\nSee '$SCRIPTNAME$'.log for details\n!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n\n' >> $FHEM_LOGFILE
    fi
  fi

  if [[ "$status" =~ ^[0-9]+$ ]] #if $status is integer
  then
    #echo "number of already performed restarts of services (fhem + pilight): $status (max. before reboot: $TRIES_BEFORE_REBOOT)" | tee -a $logfile
    if [ $status -lt $TRIES_BEFORE_REBOOT ]
    then
      if [ $status -eq 0 ] && [ $reboot_counter -gt 0 ]; then echo "System has just been rebooted for the $reboot_counter. time! (REBOOTS_MAX = $REBOOTS_MAX)" | tee -a $logfile; fi
      status_new=$((status+1))
      if [ $reboot_counter -gt 0 ]; then msg_addon="-> after $reboot_counter reboots"; else msg_addon=""; fi
      if [ $failure_fhem -gt 0 ]
      then
        echo "restarting services for the $status_new. time $msg_addon... (FHEM + PILIGHT)" | tee -a $logfile
        echo "  --> this will take about 15 seconds..."
        echo $status_new > $statusfile
        sudo service fhem stop && sleep 3
        sudo service pilight stop && sleep 3
        sudo service pilight start && sleep 3
        sudo service fhem start && sleep 3
      elif [ $failure_pilight -gt 0 ]
      then
        echo "restarting services for the $status_new. time $msg_addon... (PILIGHT only)" | tee -a $logfile
        echo "  --> this will take about 10 seconds..."
        echo $status_new > $statusfile
        sudo service pilight stop && sleep 3
        sudo service pilight start && sleep 3
      else
        echo "    !!!    UNDEFINED ERROR 465_BC    !!!   " | tee -a $logfile
      fi
    else
      echo "Maximum number of service restart attempts has been reached!" | tee -a $logfile
      if [ "$REBOOT" = "YES" ]
      then
        reboot_counter_new=$((reboot_counter+1))
        if [ $reboot_counter_new -gt $REBOOTS_MAX ]
        then
          echo "Maximum number of reboot attempts has been reached. Nothing more can be done... sorry..." | tee -a $logfile
        else
          echo "  -> Now: rebooting the system for the $reboot_counter_new. time!" | tee -a $logfile
          echo $reboot_counter_new > $rebootfile
          echo "0" > $statusfile
          sudo shutdown -r now # actually perform the reboot announced above (comment out again for dry runs)
        fi
      else
        echo "  Reboot not allowed by your settings! (YOUR SETTINGS: REBOOT=$REBOOT) ...no action!" | tee -a $logfile
        echo "  -> Please set REBOOT=YES in section YOUR SETTINGS to allow reboots." | tee -a $logfile
      fi
    fi
  else # if $status is NOT an integer
    echo "$timestamp !!! Undefined last known state! Variable status is not numeric: $status" | tee -a $logfile
  fi
else
  #echo "$timestamp The FHEM WebGUI is running and login was successful." | tee -a $logfile
  if [ "$status" -ne 0 ] || [ "$reboot_counter" -ne 0 ] # if last state was faulty
  then
    echo "----------------------------------------------------------------------------------------" | tee -a $logfile
    echo "$timestamp Finally everything is fine again !!! -> Resetting state and reboot-counter to 0 (OK)..." | tee -a $logfile
    echo "0" > $statusfile
    echo "0" > $rebootfile
  else
    echo "[OK] services FHEM and PILIGHT are fine!"
    if [ "$LOG_SUCCESSFUL_CHECKS" = "YES" ]
    then
      echo "$timestamp [OK] Check by $SCRIPTNAME_FULL was OK..." | tee -a $logfile
    fi
  fi
fi