THZ Tecalor (LWZ Stiebel Eltron) module support and code improvement.

Begonnen von immi, 02 Februar 2015, 11:42:16

Vorheriges Thema - Nächstes Thema

willybauss

Hi immi,

my logfile shows frquent (3 ... 6 times per day) messages saying

Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?

This is not a new effect, I know it since years, and everything works fine. But from time to time I wonder if I can do something to avoid it. Is there a user accessible value to e.g. increase wait time?
FHEM auf Raspberry Pi B und 2B; THZ (THZ-303SOL), CUL_HM, TCM-EnOcean, SamsungTV, JSONMETER, SYSMON, OBIS, STATISTICS

immi

Zitat von: willybauss am 09 Juni 2017, 19:00:03
my logfile shows frquent (3 ... 6 times per day) messages saying
Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?
This is not a new effect, I know it since years, and everything works fine. But from time to time I wonder if I can do something to avoid it. Is there a user accessible value to e.g. increase wait time?
Hi willy
since when do you have 6 of this logs per day? I have one per month.
is this increasing?
change following lines to get better logs
line 1068:   Log3 $hash->{NAME}, 3, "THZ_Get: Error msg2:  $err ---- $cmdHex2";
line 1080:   Log3 $hash->{NAME}, 3, "THZ_Get: Error msg3:  $err ---- $cmdHex3";

I see 2 ways:

1) I can make an attribute to remove the warning message from the logs (easy but it is just hiding the problem)
2) you can work on waittimes (I can hardcode it later): increase slowly (10% steps) and change only one at the time
- line 741:  $timedelay += 0.6;  this is the minimum waittime in seconds between 2 consequent scheduled readings; it prevents that the tecalor gets to busy e.g. by fhem start, then all register are read in sequence 
- line 1116: select(undef, undef, undef, 0.001); this is the minimum waittime in seconds between 2 consequent unscheduled readings
- line 1160: $buf = DevIo_SimpleReadWithTimeout($hash, 0.1) if(!defined($buf)) ; this is an additional waittime in seconds used in the comunication protocol of each reading.

immi

willybauss

Hi immi,

I don't think it's increasing. I believe I have these entries since 2 years or so. Normally I don't even check the logfile contents, as long as it stays small in monthly size. But currently I try to find the root cause, why my Homematic CUL stops working frequently; needs a 'reopen' then. I observed an approx. 70 ... 80% match of times tamps between CUL stop and these THZ entries. CUL stop happens approx. 2 ... 3 times per month. So I don't believe that CUL is root cause for THZ issue, but I think its the other way round - THZ issue might be cause for the CUL to stop working.

For that reason I wonder if increasing wait times may help. Is fhem blocked during these wait times? If yes, the CUL issue wouldn't be healed by increasing wait times further. I need to try it out. How long may these wait times be - few seconds?

As I understand, the message tells me that Mythz sends a request to the THZ, and the answer doesn't come in expected time frame, right? What can I do to have the THZ's answer earlier? Nothing I guess, right? I have a serial connection with approx. 4 - 5m cable. The cable ends in a serial-USB converter. The converter is plugged into an USB hub, directly connected to RaspberryPi's USB port.

Since the CUL issue happens very rarely we shouldn't expect an answer within weeks. We need to think in months though. I'll let you know as soon as I have any answers.

Thanks so far. Have a nice weekend.
Willy
FHEM auf Raspberry Pi B und 2B; THZ (THZ-303SOL), CUL_HM, TCM-EnOcean, SamsungTV, JSONMETER, SYSMON, OBIS, STATISTICS

immi

Zitat von: willybauss am 10 Juni 2017, 14:14:41
Hi immi,
I don't think it's increasing. I believe I have these entries since 2 years or so. Normally I don't even check the logfile contents, as long as it stays small in monthly size. But currently I try to find the root cause, why my Homematic CUL stops working frequently; needs a 'reopen' then. I observed an approx. 70 ... 80% match of times tamps between CUL stop and these THZ entries. CUL stop happens approx. 2 ... 3 times per month. So I don't believe that CUL is root cause for THZ issue, but I think its the other way round - THZ issue might be cause for the CUL to stop working.
I have not changed anything in the protocoll since 1 year or more. Look since when do you have problems with CUL.

Zitat
What can I do to have the THZ's answer earlier? Nothing I guess, right?
nothing

Zitat
For that reason I wonder if increasing wait times may help. Is fhem blocked during these wait times?
timedelay in line 741 is not blocking fhem
timedelay in line 1116 is  blocking fhem (but 1 ms is not a problem)
timedelay in line  1159 and 1160 are  slowing down fhem only if there is a problem in the comunication


willybauss

Zitat von: immi am 10 Juni 2017, 23:12:21
Look since when do you have problems with CUL.
Since I installed the CUL - approx. a year after Mythz.

I'll keep you informed, but it may take months, as I said.
FHEM auf Raspberry Pi B und 2B; THZ (THZ-303SOL), CUL_HM, TCM-EnOcean, SamsungTV, JSONMETER, SYSMON, OBIS, STATISTICS

willybauss

After having changed the code lines as you suggested the logs show:

2017.06.17 18:54:03 3: Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?
2017.06.17 18:54:03 3: THZ_Get: Error msg2:  InterfaceNotRespondig error found at step1 ---- 0100320A091E1003
2017.06.17 18:54:04 3: THZ_Get: Error msg2:  CRC error in request ---- 0100D10A05C11003
2017.06.17 18:54:25 3: Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?
2017.06.17 18:54:26 3: THZ_Get: Error msg2:  InterfaceNotRespondig error found at step1 ---- 0100A00B14801003
2017.06.17 18:54:42 3: Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?
2017.06.17 18:54:42 3: THZ_Get: Error msg3:  InterfaceNotRespondig error found at step1 ---- 0100410A092D1003
2017.06.17 18:54:42 3: THZ_Get: Error msg2:  new unknown answer  ---- 0100980A05881003
2017.06.17 18:54:59 3: THZ_Get: Error msg2:  CRC error in request ---- 0100910A05811003
2017.06.17 18:55:15 3: Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?
2017.06.17 18:55:15 3: THZ_Get: Error msg2:  InterfaceNotRespondig error found at step1 ---- 0100430A17211003
2017.06.17 18:55:23 3: Mythz THZ_ReadAnswer got no answer from DevIo_SimpleRead. Maybe too slow?
2017.06.17 18:55:23 3: THZ_Get: Error msg2:  InterfaceNotRespondig error found at step1 ---- 01007A0B05691003
2017.06.17 18:55:23 3: THZ_Get: Error msg2:  new unknown answer  ---- 0100830A17611003
2017.06.17 18:55:30 3: THZ_Get: Error msg2:  CRC error in request ---- 0100630C15411003
2017.06.17 18:55:32 3: Mythz NAK!!
FHEM auf Raspberry Pi B und 2B; THZ (THZ-303SOL), CUL_HM, TCM-EnOcean, SamsungTV, JSONMETER, SYSMON, OBIS, STATISTICS

immi

I uploaded v0.160
-improved logging
-reduced blockingtime by NAK

Hi Willy
keep on monitoring, the logging should be more clear now, share futher logs
- InterfaceNotRespondig: your tecalor has ignored the request
- NAK: your tecalor has refused the request becase doing something else (e.g. busy with canbus or with slow eprom or...)
- CRC error in request: your tecalor received a request with a bad crc. (bitkipper) Could be a frequency issue in the serial port or EMV/EMC.

TheTrumpeter

Hello immi,

I found another value; it contains the total amout of heat from active-cooling:
"sCoolHCTotal"                => {cmd2=>"0A0648", cmd3=>"0A0649",     type =>"1clean", unit =>" kWh"},

Currently my THZ only has a value in 0A0648, but I assume it will be 2 bytes as all the other heat-counters.
FHEM auf RPi3, THZ (LWZ404SOL), RPII2C & I2C_MCP342x (ADCPiZero), PowerMap, CustomReadings, RPI_GPIO, Twilight, nanoCUL (WMBus für Diehl Wasserzähler & Regenerationszähler für BWT AqaSmart), ESPEasy, TPLinkHS110

TheTrumpeter

#548
Hello immi,

Some more:

sets:
0B0287 is "Kühlbetrieb HK1", it's only one bit. If it's 1, then cooling is enabled, 0 is disabled.
0B0582 is "HK Temp. kühlen", it's in °C, divide the raw value by 10, then you get the physical value, ex. 0x00AF is 175d which is 17,5°C.
0B0583 is "Hyst. Vorlauftemp kühlen", it's in K, divide the raw value by 10, then you get the physical value.
0B0584 is "HK1 Raumtemperatur Hysterese",  it's in K, divide the raw value by 10, then you get the physical value.

gets:
0B0264 is the "dewpoint of HC1", it's in °C, divide the raw value by 10, then you get the physical value
FHEM auf RPi3, THZ (LWZ404SOL), RPII2C & I2C_MCP342x (ADCPiZero), PowerMap, CustomReadings, RPI_GPIO, Twilight, nanoCUL (WMBus für Diehl Wasserzähler & Regenerationszähler für BWT AqaSmart), ESPEasy, TPLinkHS110

immi


willybauss

Zitat von: immi am 18 Juni 2017, 16:36:57
Hi Willy
keep on monitoring ...
Yesterday's log, filtered for THZ related issues:

2017.06.24 00:17:34 3: THZ_Get: Error msg2:  decode: CRC error in request -- 0A0176 -> 01020D0A1003
2017.06.24 00:32:34 3: THZ_Get: Error msg2:  decode: CRC error in request -- 0A0176 -> 01020D0A1003
2017.06.24 02:49:36 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step2 -- 0A0176 ->
2017.06.24 04:44:36 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- 0A0176 ->
2017.06.24 04:44:49 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- FB ->
2017.06.24 06:35:44 3: THZ_Get: Error msg2:  decode: new unknown answer  -- F4 -> 0101F6F41003
2017.06.24 06:44:36 3: THZ_Get: Error msg2:  decode: new unknown answer  -- 0A0176 -> 01010C0A1003
2017.06.24 08:25:44 3: THZ_Get: Error msg2:  decode: CRC error in request -- F4 -> 0102E7E41003
2017.06.24 08:52:35 1: /dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller_D-if00-port0 disconnected, waiting to reappear (Mythz)
2017.06.24 08:52:37 3: Setting Mythz serial parameters to 115200,8,N,1
2017.06.24 08:52:37 1: /dev/serial/by-id/usb-Prolific_Technology_Inc._USB-Serial_Controller_D-if00-port0 reappeared (Mythz)
2017.06.24 08:53:22 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step2 -- 0A092C ->
2017.06.24 08:53:23 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step0  -- 0A05B8 ->
2017.06.24 08:53:40 3: THZ_Get: Error msg2:  decode: new unknown answer  -- 0A0581 -> 01010C0A1003
2017.06.24 08:53:49 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- 0A17A2 ->
2017.06.24 09:43:29 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- FB ->
2017.06.24 09:45:14 3: THZ_Get: Error msg2:   THZ_Get_Com: error found at step1  NAK!!  -- 0A0176 -> 15
2017.06.24 11:18:29 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- FB ->
2017.06.24 12:44:15 3: THZ_Get: Error msg2:  decode: CRC error in request -- 0A0176 -> 01020D0A1003
2017.06.24 15:48:30 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- FB ->
2017.06.24 16:15:15 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step2 -- 0A0176 ->
2017.06.24 17:54:15 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- 0A0176 ->
2017.06.24 18:07:15 3: THZ_Get: Error msg2:  decode: CRC error in request -- 0A0176 -> 01020D0A1003
2017.06.24 18:35:15 3: THZ_Get: Error msg2:  decode: CRC error in request -- 0A0176 -> 01020D0A1003
2017.06.24 22:40:15 3: THZ_Get: Error msg2:  decode: new unknown answer  -- 0A0176 -> 01010C0A1003
2017.06.24 23:18:30 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step1  -- FB ->
2017.06.24 23:49:15 3: THZ_Get: Error msg2:  decode: CRC error in request -- 0A0176 -> 01020D0A1003
FHEM auf Raspberry Pi B und 2B; THZ (THZ-303SOL), CUL_HM, TCM-EnOcean, SamsungTV, JSONMETER, SYSMON, OBIS, STATISTICS

andre.k

Hi immi,

sometimes I also have timouts when reading data from THZ. I have checked all logfiles of the last 2 years.
In the average there are about 60 timeouts per day, which is 5-6% of the schedued readings. Most of them are errors in step0 and step1.
In order to understand the timing of the communication better, I inserted an additional loop into the function THZ_ReadAnswer.

        my $buf;
        my $cnt = 0;
        #$buf = DevIo_SimpleReadWithTimeout($hash, 0.25) if(!defined($buf)) ; #added for Karl Antwort msg468515
        while (!defined($buf) and ($cnt < 400)) {
          $buf = DevIo_SimpleReadWithTimeout($hash, 0.001);
          $cnt ++;
        }
        Log3($hash, 5, "THZ_ReadAnswer: $cnt ms");

So I can measure the time for the reading process.  Usualy the THZ responds in less than 10ms. But sometimes I got timeouts after 400ms.
At the moment I observe the behavior. I'll come back...
Andre

immi

Zitat von: andre.k am 26 Juni 2017, 10:30:02
So I can measure the time for the reading process.  Usualy the THZ responds in less than 10ms. But sometimes I got timeouts after 400ms.
Hi Andre
I made a similar statistics 1 year ago. I have in mind following distribution
95% under 15ms 
4,95% up to 30ms.
0.05% no answer.

please keep me updated for your distribution above 400ms; We need to find a threshold to differentiate between no answer and slow answer.
The lower the threshold, the better the performance of fhem.
for this investigation please use in your while loop DevIo_SimpleReadWithTimeout($hash, 0.005);



Hi Willy
I see 3 cases for you: 0101, 0102 or no answer:

let us take tecalor answers 1002 CRC error in request -- F4 -> 0102E7E41003
you ask to read F4  and your tecalor understend E4 (bitkipper)
I have no idea why your tecalor gets a wrong bit ; maybe your should shield your serial data line, make it shorter or change the USB-Serial_Controller

0101 is even more interesting  0A0581 -> 01010C0A1003, I do not know what does 0101(hex) mean for tecalor
immi

andre.k

Hi immi,

sometimes after timeout I got a log entry like this2017.06.25 13:16:21 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step2 -- F3 ->
2017.06.25 13:16:25 2: myWP/RAW: 02
2
This RAW: 02 comes probably from funktion THZ_Read. These subroutine is never called from the module THZ self. Do you now when and from where the sub THZ_Read is called?
Second, only for advise :) Could you clean up the old Log calls and replace with the new Log3 calls (see topic https://forum.fhem.de/index.php/topic,14341.0/all.html). We don't need the old loglevel stuff ($115, GetLogLevel...)

Andre

immi

Zitat von: andre.k am 27 Juni 2017, 17:37:53
sometimes after timeout I got a log entry like this2017.06.25 13:16:21 3: THZ_Get: Error msg2:  THZ_ReadAnswer: InterfaceNotRespondig. Maybe too slow THZ_Get_Com: error found at step2 -- F3 ->
2017.06.25 13:16:25 2: myWP/RAW: 02
2
This RAW: 02 comes probably from funktion THZ_Read. These subroutine is never called from the module THZ self. Do you now when and from where the sub THZ_Read is called?
Hi Andre
THZ_Read should watch the tecalor outside the comunication time frame. e.g if the tecalor start to send something without beeing asked. It never happened to me.
In your case something else happened. The timeout in THZ_ReadAnswer was reached and when the tecalor finally send the 02 was too late. THZ_Read collected the garbage.  Do not trust 13:16:21 and 13:16:25. I do not think that your tecalor needed 4 seconds to answer.
It is important to find a threshold to differentiate between no answer and slow answer.

Zitat
Second, only for advise :) Could you clean up the old Log calls and replace with the new Log3 calls (see topic https://forum.fhem.de/index.php/topic,14341.0/all.html). We don't need the old loglevel stuff ($115, GetLogLevel...)
Ok.