FHEM Forum

FHEM => fhem-users => Topic started by: Dr. Boris Neubert on 27 October 2008, 06:38:03

Title: [FHZ] watchdog related crashes?
Post by: Dr. Boris Neubert on 27 October 2008, 06:38:03
                                             

Hi,

I recently added watchdogs for my FHTs. Since then fhem crashes after a while.

The last lines in the global log are:

2008.10.26 17:13:05 2: FHT set eg.Hzg report1 report2 5
Use of uninitialized value in numeric le (<=) at /data/Homeautomation/fhem/fhem.pl line 1635.
Use of uninitialized value in subroutine entry at /data/Homeautomation/fhem/fhem.pl line 1637.
Undefined subroutine &main:: called at /data/Homeautomation/fhem/fhem.pl line 1637.

I suppose that there is an adverse interaction between my at definitions

define eg.Hzg.at.1      at +*03:00:00 set eg.Hzg report1 0 report2 5
define eg.Hzg.at.2      at +*23:13:31 set eg.Hzg report1 255 report2 255
(and 8 more of that kind)

and my watchdog definitions

define w02 watchdog eg.Hzg 00:15:00 SAME set eg.Hzg refreshvalues
(and 8 more of that kind)

From the respective Perl code

1632   # Check the internal list.
1633   foreach my $i (keys %intAt) {
1634     my $tim = $intAt{$i}{TRIGGERTIME};
1635     if($tim <= $now) {
1636       no strict "refs";
1637       &{$intAt{$i}{FN}}($intAt{$i}{ARG});
1638       use strict "refs";
1639       delete($intAt{$i});
1640     }
1641     $nextat = $tim if(!$nextat || $nextat > $tim);
1642   }

it seems that the at bookkeeping might be confused.
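
A note on how to narrow this down: the warnings at lines 1635 and 1637 fit an entry in %intAt that has neither TRIGGERTIME nor FN. One way this can happen in Perl (an assumption, not a confirmed diagnosis of fhem.pl) is that keys() is evaluated once, a callback deletes another entry, and the later read of $intAt{$i}{TRIGGERTIME} silently recreates that entry as an empty hash. The sketch below is self-contained; fire_a, fire_b and run_timers are hypothetical names, and the loop is a defensive variant, not the code from CVS.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fhem's internal timer list (%intAt in the quoted code).
my %intAt = (
  a => { TRIGGERTIME => 10, FN => "fire_a", ARG => "a" },
  b => { TRIGGERTIME => 10, FN => "fire_b", ARG => "b" },
);
my @fired;

# Each callback removes the *other* timer while the loop is still running.
sub fire_a { push @fired, $_[0]; delete $intAt{b}; }
sub fire_b { push @fired, $_[0]; delete $intAt{a}; }

sub run_timers {
  my ($now) = @_;
  my $nextat;
  foreach my $i (keys %intAt) {
    # keys() was evaluated once; the entry may have been deleted since.
    # Checking exists() first avoids autovivifying an empty $intAt{$i}.
    next if(!exists($intAt{$i}));
    my $tim = $intAt{$i}{TRIGGERTIME};
    my $fn  = $intAt{$i}{FN};
    if(!defined($tim) || !defined($fn)) {   # incomplete entry: drop it
      delete($intAt{$i});
      next;
    }
    if($tim <= $now) {
      my $arg = $intAt{$i}{ARG};
      delete($intAt{$i});                   # remove the entry before calling FN
      no strict "refs";
      &{$fn}($arg);
      next;
    }
    $nextat = $tim if(!$nextat || $nextat > $tim);
  }
  return $nextat;
}

run_timers(20);
print "fired: @fired, left: ", scalar(keys %intAt), "\n";
```

Whichever timer is visited first fires and removes the other one, so exactly one callback runs and the list ends up empty. Without the exists() check, the second iteration would recreate the deleted entry with no TRIGGERTIME and no FN, which would produce warnings of the kind shown in the log above.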

Any ideas? How to debug?

Regards,
Boris


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "FHZ1000 users on Linux" group.
To post to this group, send email to FHZ1000-users-on-unix@googlegroups.com
To unsubscribe from this group, send email to FHZ1000-users-on-unix+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/FHZ1000-users-on-unix?hl=en
-~----------~----~----~----~------~----~------~--~---
Title: [FHZ] Re: watchdog related crashes?
Post by: rudolfkoenig on 15 November 2008, 11:06:57
                                                   

> I recently added watchdogs for my FHTs. Since then fhem crashes after a while.

I hope I fixed this one, please test it (from the CVS). I think it had
to do with the watchdog adding itself to the internal timer list when
executing its own command, as the command matched the regexp.
Title: [FHZ] Re: watchdog related crashes?
Post by: Dr. Boris Neubert on 17 November 2008, 21:13:18
                                             

On Saturday, 15 November 2008, Rudolf Koenig wrote:
> > I recently added watchdogs for my FHTs. Since then fhem crashes after a
> > while.
>
> I hope I fixed this one, please test it (from the CVS). I think it had
> to do with the watchdog adding itself to the internal timer list when
> executing its own command, as the command matched the regexp.

fhem has been running like a charm with 9 watchdogs for 2 days now. Before the
fix, it crashed reliably after a few hours => I guess your fix solved the
problem.

Thank you.
Boris


Title: [FHZ] Re: watchdog related crashes?
Post by: rudolfkoenig on 27 November 2008, 20:18:04
                                             

On Monday, 17 November 2008, Boris Neubert wrote:
> On Saturday, 15 November 2008, Rudolf Koenig wrote:
> > > I recently added watchdogs for my FHTs. Since then fhem crashes after a
> > > while.
> >
> > I hope I fixed this one, please test it (from the CVS). I think it had
> > to do with the watchdog adding itself to the internal timer list when
> > executing its own command, as the command matched the regexp.
>
> fhem has been running like a charm with 9 watchdogs for 2 days now. Before
> the fix, it crashed reliably after a few hours => I guess your fix solved
> the problem.

I celebrated too soon...

For a few days now I have had random crashes, with the following Perl errors
flooding the main log:

Use of uninitialized value in hash element at /data/Homeautomation/fhem/fhem.pl line 1210.
Use of uninitialized value in string comparison (cmp) at /data/Homeautomation/fhem/fhem.pl line 1210.
...
Use of uninitialized value in string ne at /data/Homeautomation/fhem/fhem.pl line 1217.
Use of uninitialized value in hash element at /data/Homeautomation/fhem/fhem.pl line 871.
Use of uninitialized value in hash element at /data/Homeautomation/fhem/fhem.pl line 1337.
Use of uninitialized value in concatenation (.) or string at /data/Homeautomation/fhem/fhem.pl line 1227.
Use of uninitialized value in concatenation (.) or string at /data/Homeautomation/fhem/fhem.pl line 1250.
Use of uninitialized value in string ne at /data/Homeautomation/fhem/fhem.pl line 1217.

Then a malformed xmllist breaks pgm3, and thus CommandXmlList is no longer
called. Only the following lines are repeated:

Use of uninitialized value in hash element at /data/Homeautomation/fhem/fhem.pl line 1862.
Use of uninitialized value in hash element at /data/Homeautomation/fhem/fhem.pl line 1863.

The mentioned code lines have in common the use of

   $modules{$defs{$dev}{TYPE}}

which has probably become undefined for an as yet unknown reason.
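
One general Perl behaviour that can make such a lookup undefined is autovivification: merely reading a nested key of a device that does not exist creates an empty entry for it. The sketch below demonstrates this on a hypothetical miniature of %defs and %modules; it only illustrates the language behaviour and does not claim to show where fhem.pl goes wrong.

```perl
use strict;
use warnings;

# Hypothetical miniature of fhem's %defs and %modules tables.
my %defs    = ( "hr.Pumpe" => { TYPE => "FS20" } );
my %modules = ( FS20 => { LOADED => 1 } );

my $dev = "hr.Pumpe_timer";          # a name that was never defined

# Merely *reading* a nested key creates the outer entry as an empty hash.
my $type = $defs{$dev}{TYPE};

print "entry exists: ", (exists($defs{$dev}) ? "yes" : "no"), "\n";
print "number of keys: ", scalar(keys %{$defs{$dev}}), "\n";
print "type defined: ", (defined($type) ? "yes" : "no"), "\n";

# Using the undefined type as a hash key is what triggers
# "Use of uninitialized value in hash element".
{
  no warnings "uninitialized";       # silenced here to keep the demo output clean
  my $mod = $modules{$type};
  print "module found: ", (defined($mod) ? "yes" : "no"), "\n";
}
```

After the single read, the entry exists but has no keys, the type is undefined, and no module is found.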

The substantial difference between now and the previous situation is an
additional notify definition:

define bad.piri.notify notify bad.piri set hr.Pumpe on-for-timer 60

Regards,
Boris

Title: [FHZ] Re: watchdog related crashes?
Post by: Dr. Boris Neubert on 27 November 2008, 22:37:29
                                             

> Then, an insane xmllist breaks pgm3 and thus CommandXmlList is not called
> furthermore. Only the following lines are repeated:

Addition: when the error occurs, the xmllist output is invalid XML.

   1
   2           < name="hr.Pumpe_timer" state="" sets="" attrs="room comment">
   3          
   4         <_internal__LIST>
   5   etc...

The error is in line 2 and it is probably related to the follow-on-for-timer
attribute to hr.Pumpe as hr.Pumpe_timer is the autogenerated at command for
setting hr.Pumpe to off after the specified time.

Boris

Title: [FHZ] Re: watchdog related crashes?
Post by: rudolfkoenig on 03 December 2008, 17:59:42
                                                   

> The error is in line 2 and it is probably related to the follow-on-for-timer
> attribute to hr.Pumpe as hr.Pumpe_timer is the autogenerated at command for
> setting hr.Pumpe to off after the specified time.

I cannot find any obvious problem. If I define a switch with
"follow-on-for-timer", it does not corrupt the %defs hash; in fact I have
been using this feature at home for a year. Anyway, I went through fhem.pl
and checked all deep hash references into the %defs hash. Some of the
references were not careful, but they are now. Boris, could you please try
out the CVS version? If nothing serious arises, I intend to package
the current CVS state as 4.5.
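
For readers wondering what a "careful" deep hash reference can look like in Perl, here is a minimal sketch. It assumes the usual pattern of checking exists() on the outer key before touching the inner one; dev_type is an illustrative helper and not a function from fhem.pl, and the actual changes in CVS may look different.

```perl
use strict;
use warnings;

# Hypothetical miniature of fhem's %defs table.
my %defs = ( "hr.Pumpe" => { TYPE => "FS20" } );

# dev_type() is an illustrative helper, not a function from fhem.pl.
# It returns the TYPE of a device, or undef, without creating $defs{$dev}.
sub dev_type {
  my ($dev) = @_;
  return undef if(!defined($dev) || !exists($defs{$dev}));
  return $defs{$dev}{TYPE};
}

foreach my $dev ("hr.Pumpe", "hr.Pumpe_timer") {
  my $type = dev_type($dev);
  print "$dev: ", (defined($type) ? $type : "(not defined)"), "\n";
}
print "entries in defs: ", scalar(keys %defs), "\n";
```

The lookup of the unknown name returns undef and %defs still holds a single entry afterwards, so no empty device is left behind.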

If the problem still arises, then please execute the following command at
the telnet prompt before fhem crashes:

{ foreach my $k (keys %{$defs{"hr.Pumpe_timer"}}) { Log 1, $k } }

and send me the result.

Rudi
Title: [FHZ] Re: watchdog related crashes?
Post by: Dr. Boris Neubert on 04 December 2008, 22:29:54
                                             

On Wednesday, 3 December 2008, Rudolf Koenig wrote:
> > The error is in line 2 and it is probably related to the
> > follow-on-for-timer attribute to hr.Pumpe as hr.Pumpe_timer is the
> > autogenerated at command for setting hr.Pumpe to off after the specified
> > time.
>
> references were not careful, but are now. Boris, could you please try
> out the CVS version? If nothing serious arises, I intend to package

First of all, turning on follow-on-for-timer half-crashed fhem within 15
minutes (logs thrashed, list xml output invalid).

hr.Pumpe is a FS20 device with attr follow-on-for-timer set.

I integrated some debugging code in fhem.pl and found out the following:

(1) hr.Pumpe_timer exists but has no keys at all.

(2) list hr.Pumpe_timer

gives the empty result:

   Internals:
   Attributes:

This is the reason for the message

   Use of uninitialized value in hash element at /data/Homeautomation/fhem/fhem.pl line 1886

and for the invalid xml produced around line 1232.

(3) fhem.save contains

   setstate hr.Pumpe on-for-timer 60
   setstate hr.Pumpe 2008-12-04 19:50:29 state on-for-timer 60

so hr.Pumpe_timer is not a relic from a previous session.

(4) hr.Pumpe_timer is not there after a fresh start of fhem. It just appears
after some time. I will do some more debugging next weekend.
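
To catch the moment when such an empty entry appears, one could scan the device table from time to time and log every entry without a TYPE. The following is only a sketch on a hypothetical miniature of %defs; broken_defs is an illustrative helper, not part of fhem.pl.

```perl
use strict;
use warnings;

# Hypothetical miniature of %defs; the empty entry mimics finding (1).
my %defs = (
  "hr.Pumpe"       => { TYPE => "FS20", NAME => "hr.Pumpe" },
  "hr.Pumpe_timer" => {},
);

# broken_defs() is an illustrative helper, not part of fhem.pl.
# It returns the sorted names of all entries that lack a TYPE.
sub broken_defs {
  my ($tbl) = @_;
  # Only existing keys are inspected, so nothing is autovivified here.
  my @names = grep { !defined($tbl->{$_}{TYPE}) } keys %{$tbl};
  return sort @names;
}

print "suspicious: $_\n" foreach broken_defs(\%defs);
```

Called after each command or from a periodic timer, such a check would show which action first leaves an entry without TYPE behind.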



BTW, an undefined QUEUECNT right after startup of the FHZ devices causes the
following error:

Use of uninitialized value in numeric gt (>) at /data/Homeautomation/fhem/FHEM/00_FHZ.pm line 538, <$fh> line 48.

I will tidy this up later. My eyes are square %\

Regards,
Boris

Title: [FHZ] Re: watchdog related crashes?
Post by: Dr. Boris Neubert on 07 December 2008, 21:16:15
                                             

Hi,

On Thursday, 4 December 2008, Boris Neubert wrote:
> I integrated some debugging code in fhem.pl and found out the following:
> (1) hr.Pumpe_timer exists but has no keys at all.

I was not able to reproduce the problem today. Maybe this is related to the
fixes you introduced last Friday to address the issue:

> BTW, an undefined QUEUECNT right after startup of the FHZ devices causes
> the following error:

I will keep watching issue (1) to see whether it re-emerges next week.

Regards,
Boris

Title: [FHZ] Re: watchdog related crashes?
Post by: Dr. Boris Neubert on 13 December 2008, 12:08:54
                                             

> On Thursday, 4 December 2008, Boris Neubert wrote:
>> I integrated some debugging code in fhem.pl and found out the following:
>> (1) hr.Pumpe_timer exists but has no keys at all.
>
> I was not able to reproduce the problem today. Maybe this is related to
> the fixes you introduced last Friday to address the issue:
> I keep watching issue (1) to see whether it re-emerges next week.

My fhem has been running flawlessly for a week now. The CVS update on 05 Dec
seems to have removed the issue. It would be interesting to know whether
Olaf's "Delete im AT funktioniert nicht" ("delete in at does not work")
problem, which seems to be related or the same, can also be resolved by
upgrading to the latest fhem CVS version.

Regards,
Boris

