FHEM Forum

FHEM - Hardware => Single-board computers => Topic started by: Bartimaus on 19 September 2016, 10:03:01

Title: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 10:03:01
Hi,

a question for the experts:

on my BananaPi (BananianOS) the RootFS hangs roughly every 3-4 weeks. It happened again last night, and since then the FS has been read-only. I noticed it this morning when I saw that nothing had been written to the FHEM log since 00:05.
The only thing that helps then is a reboot after cutting the power.

I moved the RootFS to an HDD at sda1 (SATA on the BananaPi).

Do you know a reason for this? Is the constant writing of logs too much? I have already reduced it as far as possible.

Any other tips?

Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 19 September 2016, 13:31:27
What does kern.log say around that time?
cat /var/log/kern.log
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 14:00:45
Hello Wernieman,

thanks for the hint. (I didn't know about kern.log yet...  ::) )

I have only had a quick look from my phone so far, but it seems that a mount command (CIFS to a NAS) got in the way late last night.

I can only post the whole log this evening.


Sep 18 23:30:06 bananapi kernel: [2369388.192955] CIFS VFS: cifs_mount failed w/return code = -112
Sep 19 00:00:15 bananapi kernel: [2371197.540265] JBD2: Detected IO errors while flushing file data on sda1-8


Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 19 September 2016, 14:05:56
Not the whooole thing, it is probably long ... just the part from around that time, or alternatively:
grep -i i/o /var/log/kern.log

Or, if you have rebooted in the meantime, possibly also kern.log.1 or other rotated files, see also
ls -lha /var/log/kern.log*
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 14:11:44
Hi,

see the edit in the first post. Our posts overlapped a bit.
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 19 September 2016, 14:46:28
Sep 18 23:30:06 bananapi kernel: [2369388.192955] CIFS VFS: cifs_mount failed w/return code = -112
A CIFS mount failed there. How do you mount it?

Sep 19 00:00:15 bananapi kernel: [2371197.540265] JBD2: Detected IO errors while flushing file data on sda1-8
What is in the 2-3 lines before/after it?
This looks like a hardware problem to me ...

grep -C3 -i i/o /var/log/kern.log
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: AxelSchweiss on 19 September 2016, 14:52:26
Hi
In general, the filesystem driver remounts the filesystem RO when it hits serious problems, in order to prevent further damage.
In that case there should be something about it in the log or in dmesg, though. Something like .... mounting read-only   .... or similar.
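
A minimal sketch of how the current state could be checked, assuming a Linux system that exposes /proc/mounts. The function name mount_mode is made up for this example; it only reports whether a mount point currently carries the "ro" option.

```shell
# Print "ro" or "rw" for a mount point, reading a mounts table in
# /proc/mounts format (device mountpoint fstype options dump pass).
# Both arguments are optional; the defaults inspect the live root FS.
mount_mode() {
    mountpoint="${1:-/}"
    table="${2:-/proc/mounts}"
    awk -v mp="$mountpoint" '
        $2 == mp { opts = $4 }          # keep the last matching entry
        END {
            if (opts == "") exit 1      # mount point not found
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") { print "ro"; exit 0 }
            print "rw"
        }' "$table"
}
```

Calling mount_mode without arguments checks the root FS. If it prints "ro", something like dmesg | grep -i 'read-only' should turn up the kernel message mentioned above, as long as it is still in the ring buffer.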

Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 15:17:00
For the backups I mount the stick on the Fritzbox as well as my NAS via crontab.

I should have solved that more elegantly (mount the Fritzbox stick after reboot, and mount the NAS only when a backup is due, followed by a umount).
Instead I tried to mount every 30 min via crontab.

But I don't know whether that was the fault. I log and plot a great deal with FHEM. At first I had redirected the default log directory to a USB stick on the Banana (symlink), until 2 sticks eventually died. Then I attached an SSD via SATA and moved the RootFS there.

This spring the SSD (5 years old) gave up the ghost. No idea whether that was due to the many write cycles. I still had a spare 320 GB 2.5" HDD, which now serves as the RootFS. Maybe this one is done for now too...

As I said, more log excerpts when I'm home....
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 19 September 2016, 15:20:36
I don't know the Banana that well, but have you tried checking the disk with smartctl?
Or reading out the values at all?

Regarding CIFS:
Keyword: automounter .....
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 15:25:05
No, I don't know smartctl either. (smartctl apparently isn't included in Bananian)

Well, mounting via crontab was meant to be my attempt at an "automounter". The NAS only runs sporadically, except at backup time, when it is started automatically beforehand.
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 19 September 2016, 15:35:55
Well ... smart can be installed afterwards ... I just don't know whether the Banana's USB-SATA controller supports it.
apt-get install smartmontools

If you already power it on automatically beforehand, why doesn't that script do the mounting as well?

So:
power on
wait 2 min
check and mount
.... work .....
wait 1 min
sync
umount
check and power off

that is how it runs for me anyway .. only with an external USB HDD (needs its own power supply)
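
The sequence above could be sketched roughly like this for a CIFS share. Everything named here is a placeholder and not taken from the thread: the share, the mount point, the credentials file and the three hooks nas_power_on, do_backup and nas_power_off, which you would have to define yourself. With DRY_RUN=1 (the default) the steps are only printed.

```shell
# Hypothetical names throughout: share, mount point, credentials file and
# the hooks nas_power_on / do_backup / nas_power_off are placeholders.
SHARE="${SHARE:-//nas.example/backup}"
MNT="${MNT:-/mnt/nas}"
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the steps, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

nas_backup() {
    run nas_power_on                         # placeholder: WOL, smart plug, ...
    run sleep 120                            # give the NAS time to boot
    if ! mountpoint -q "$MNT"; then          # check, then mount
        run mount -t cifs "$SHARE" "$MNT" -o credentials=/root/.nascred
    fi
    run do_backup "$MNT"                     # placeholder: the actual work
    run sleep 60
    run sync                                 # flush pending writes
    run umount "$MNT"
    if ! mountpoint -q "$MNT"; then          # check, then power off
        run nas_power_off
    fi
}
```

Mounting only for the duration of the backup means a sleeping NAS never produces failed mount attempts in between, which is what the half-hourly crontab approach ran into.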
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 15:39:34
Hi,

I just installed smartmontools while you were writing that.
What do I need to look out for in the smartctl output?

Extending the script with automount is an idea too. Let's see whether its author has provided for something like that...
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 19 September 2016, 15:48:04
" .. write it yourself ... " ;o)

Regarding SMART:
smartctl -a /dev/sda
Additional parameters may be needed because of USB, just try it the naive way first ...
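
As a rough aid for reading the output, here is a small filter that pulls out the raw values of the attributes that most directly point to media or cabling trouble. The selection and the function name smart_summary are this sketch's own choice, not an official list. It assumes the usual ten-column attribute table that smartctl -a prints for ATA disks.

```shell
# Read `smartctl -a` (or `smartctl -A`) output on stdin and print
# NAME=RAW_VALUE for a handful of attributes. Column 2 is the attribute
# name, column 10 the raw value in the ATA attribute table.
smart_summary() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Reallocated_Event_Count|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ {
            printf "%s=%s\n", $2, $10
        }'
}
```

Usage would be along the lines of sudo smartctl -a /dev/sda | smart_summary. For disks behind a USB bridge, smartctl -d sat -a /dev/sda is the variant that is commonly needed.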
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 19 September 2016, 19:04:08
Hi,

finally home.

Right, "I" now do the auto mount/umount via FHEM/DOIF, very cool, learned something new again. That means no more crontab jobs or fstab entries. Great.

smartctl returns:

user@bananapi:~$ sudo smartctl -a /dev/sda
smartctl 5.41 2011-06-09 r3365 [armv7l-linux-3.4.104+] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Travelstar 5K500.B
Device Model:     Hitachi HTS545032B9A300
Serial Number:    xxxxxxxxxxxx
LU WWN Device Id: 5 000cca 5f1ef0c22
Firmware Version: PB3OC60N
User Capacity:    320.071.851.520 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Mon Sep 19 18:21:47 2016 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  645) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 106) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   163   163   033    Pre-fail  Always       -       2
  4 Start_Stop_Count        0x0012   097   097   000    Old_age   Always       -       4779
  5 Reallocated_Sector_Ct   0x0033   094   094   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   054   054   000    Old_age   Always       -       20288
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1884
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       94
193 Load_Cycle_Count        0x0012   018   018   000    Old_age   Always       -       820400
194 Temperature_Celsius     0x0002   189   189   000    Old_age   Always       -       29 (Min/Max 10/50)
196 Reallocated_Event_Count 0x0032   086   086   000    Old_age   Always       -       1066
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      8955         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



And here is the kernel log


Sep 18 23:17:11 bananapi kernel: [    2.274072] sw-ohci sw-ohci.2: new USB bus registered, assigned bus number 4
Sep 18 23:17:11 bananapi kernel: [    2.285444] sw-ohci sw-ohci.2: irq 97, io mem 0x01c1c400
Sep 18 23:17:11 bananapi kernel: [    2.354051] usb 2-1: new full-speed USB device number 2 using sw-ohci
Sep 18 23:17:11 bananapi kernel: [    2.359791] usb usb4: New USB device found,idVendor=1d6b, idProduct=0001
Sep 18 23:17:11 bananapi kernel: [    2.379151] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
Sep 18 23:17:11 bananapi kernel: [    2.392437] usb usb4: Product: SW USB2.0 'Open' Host Controller (OHCI) Driver
Sep 18 23:17:11 bananapi kernel: [    2.404075] usb usb4: Manufacturer: Linux 3.4.104+ ohci_hcd
Sep 18 23:17:11 bananapi kernel: [    2.412764] usb usb4: SerialNumber: sw-ohci
Sep 18 23:17:11 bananapi kernel: [    2.419634] hub 4-0:1.0: USB hub found
Sep 18 23:17:11 bananapi kernel: [    2.426244] hub 4-0:1.0: 1 port detected
Sep 18 23:17:11 bananapi kernel: [    2.433981] Initializing USB Mass Storage driver...
Sep 18 23:17:11 bananapi kernel: [    2.443798] usbcore: registered new interface driver usb-storage
Sep 18 23:17:11 bananapi kernel: [    2.453359] USB Mass Storage support registered.
Sep 18 23:17:11 bananapi kernel: [    2.462492] mousedev: PS/2 mouse device common for all mice
Sep 18 23:17:11 bananapi kernel: [    2.472660] input: sunxi-ir as /devices/virtual/input/input0
Sep 18 23:17:11 bananapi kernel: [    2.479620] IR Initial OK
Sep 18 23:17:11 bananapi kernel: [    2.484682] sunxi-rtc sunxi-rtc: Warning: RTC time is wrong!
Sep 18 23:17:11 bananapi kernel: [    2.495682] sunxi-rtc sunxi-rtc: rtc core: registered rtc as rtc0
Sep 18 23:17:11 bananapi kernel: [    2.504277] i2c /dev entries driver
Sep 18 23:17:11 bananapi kernel: [    2.511539] config i2c gpio with gpio_config api
Sep 18 23:17:11 bananapi kernel: [    2.517166] axp_mfd 0-0034: AXP (CHIP ID: 0x41) detected
Sep 18 23:17:11 bananapi kernel: [    2.528312] axp_mfd 0-0034: AXP internal temperature monitoring enabled
Sep 18 23:17:11 bananapi kernel: [    2.540025] [AXP]axp driver uning configuration failed(342)
Sep 18 23:17:11 bananapi kernel: [    2.541977] [AXP]power_start = 0
Sep 18 23:17:11 bananapi kernel: [    2.545038] I2C: i2c-0: AW16XX I2C adapter
Sep 18 23:17:11 bananapi kernel: [    2.552546] config i2c gpio with gpio_config api
Sep 18 23:17:11 bananapi kernel: [    2.555897] I2C: i2c-1: AW16XX I2C adapter
Sep 18 23:17:11 bananapi kernel: [    2.563409] config i2c gpio with gpio_config api
Sep 18 23:17:11 bananapi kernel: [    2.566716] I2C: i2c-2: AW16XX I2C adapter
Sep 18 23:17:11 bananapi kernel: [    2.574208] config i2c gpio with gpio_config api
Sep 18 23:17:11 bananapi kernel: [    2.577496] I2C: i2c-3: AW16XX I2C adapter
Sep 18 23:17:11 bananapi kernel: [    2.583399] [ace_drv] start!!!
Sep 18 23:17:11 bananapi kernel: [    2.585720] [ace_drv] init end!!!
Sep 18 23:17:11 bananapi kernel: [    2.587385] [pa_drv] start!!!
Sep 18 23:17:11 bananapi kernel: [    2.589579] [pa_drv] init end!!!
Sep 18 23:17:11 bananapi kernel: [    2.593696] Driver for 1-wire Dallas network protocol.
Sep 18 23:17:11 bananapi kernel: [    2.603051] invalid gpio pin in fex configuration : -1
Sep 18 23:17:11 bananapi kernel: [    2.610562] axp20_ldo1: 1300 mV
Sep 18 23:17:11 bananapi kernel: [    2.619006] axp20_ldo2: 1800 <--> 3300 mV at 3000 mV
Sep 18 23:17:11 bananapi kernel: [    2.629356] axp20_ldo3: 700 <--> 3500 mV at 2800 mV
Sep 18 23:17:11 bananapi kernel: [    2.639466] axp20_ldo4: 1250 <--> 3300 mV at 2800 mV
Sep 18 23:17:11 bananapi kernel: [    2.649861] axp20_buck2: 700 <--> 2275 mV at 1450 mV
Sep 18 23:17:11 bananapi kernel: [    2.660103] axp20_buck3: 700 <--> 3500 mV at 1300 mV
Sep 18 23:17:11 bananapi kernel: [    2.670048] axp20_ldoio0: 1800 <--> 3300 mV at 2800 mV
Sep 18 23:17:11 bananapi kernel: [    2.684857] input: axp20-supplyer as /devices/platform/sunxi-i2c.0/i2c-0/0-0034/axp20-supplyer.28/input/input1
Sep 18 23:17:11 bananapi kernel: [    2.702357] usb 2-1: New USB device found, idVendor=03eb, idProduct=3301
Sep 18 23:17:11 bananapi kernel: [    2.713489] axp20_ldo2: Failed to create debugfs directory
Sep 18 23:17:11 bananapi kernel: [    2.722530] device-mapper: uevent: version 1.0.3
Sep 18 23:17:11 bananapi kernel: [    2.733349] usb 2-1: New USB device strings: Mfr=0, Product=2, SerialNumber=0
Sep 18 23:17:11 bananapi kernel: [    2.740735] device-mapper: ioctl: 4.22.0-ioctl (2011-10-19) initialised: dm-devel@redhat.com
Sep 18 23:17:11 bananapi kernel: [    2.759319] cpuidle: using governor ladder
Sep 18 23:17:11 bananapi kernel: [    2.766276] cpuidle: using governor menu
Sep 18 23:17:11 bananapi kernel: [    2.772460] [mmc-msg] sw_mci_init
Sep 18 23:17:11 bananapi kernel: [    2.781417] [mmc-msg] MMC host used card: 0x9, boot card: 0x0, io_card 8
Sep 18 23:17:11 bananapi kernel: [    2.792972] [mmc-msg] sdc0 set round clock 400000, src 24000000
Sep 18 23:17:11 bananapi kernel: [    2.806867] [mmc-msg] sdc0 set ios: clk 0Hz bm OD pm OFF vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    2.823021] [mmc-msg] sdc0 Probe: base:0xf0154000 irq:64 sg_cpu:f0156000(4fc00000) ret 0.
Sep 18 23:17:11 bananapi kernel: [    2.836053] [mmc-msg] sdc3 set round clock 400000, src 24000000
Sep 18 23:17:11 bananapi kernel: [    2.849946] [mmc-msg] sdc3 set ios: clk 0Hz bm OD pm OFF vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    2.866100] [mmc-msg] sdc3 Probe: base:0xf0158000 irq:67 sg_cpu:f015a000(4fc01000) ret 0.
Sep 18 23:17:11 bananapi kernel: [    2.878786] [mmc_pm]: failed to fetch sdio card configuration!
Sep 18 23:17:11 bananapi kernel: [    2.882245] usb 2-1: Product: Standard USB Hub
Sep 18 23:17:11 bananapi kernel: [    2.884599] sunxi_leds driver init
Sep 18 23:17:11 bananapi kernel: [    2.896155] Registered led device: green:ph24:led1
Sep 18 23:17:11 bananapi kernel: [    2.899980] Registered led device: blue:pg02:led2
Sep 18 23:17:11 bananapi kernel: [    2.905092] ledtrig-cpu: registered to indicate activity on CPUs
Sep 18 23:17:11 bananapi kernel: [    2.917596] usbcore: registered new interface driver usbhid
Sep 18 23:17:11 bananapi kernel: [    2.926042] usbhid: USB HID core driver
Sep 18 23:17:11 bananapi kernel: [    2.928747] hub 2-1:1.0: USB hub found
Sep 18 23:17:11 bananapi kernel: [    2.939679] hub 2-1:1.0: 4 ports detected
Sep 18 23:17:11 bananapi kernel: [    2.945923] ashmem: initialized
Sep 18 23:17:11 bananapi kernel: [    2.952720] logger: created 256K log 'log_main'
Sep 18 23:17:11 bananapi kernel: [    2.961036] logger: created 256K log 'log_events'
Sep 18 23:17:11 bananapi kernel: [    2.969469] logger: created 256K log 'log_radio'
Sep 18 23:17:11 bananapi kernel: [    2.977871] logger: created 256K log 'log_system'
Sep 18 23:17:11 bananapi kernel: [    2.988338] IPv4 over IPv4 tunneling driver
Sep 18 23:17:11 bananapi kernel: [    2.995343] TCP: bic registered
Sep 18 23:17:11 bananapi kernel: [    3.000760] TCP: cubic registered
Sep 18 23:17:11 bananapi kernel: [    3.006634] TCP: westwood registered
Sep 18 23:17:11 bananapi kernel: [    3.012865] TCP: highspeed registered
Sep 18 23:17:11 bananapi kernel: [    3.019146] ehci_irq: port change detect
Sep 18 23:17:11 bananapi kernel: [    3.021488] TCP: hybla registered
Sep 18 23:17:11 bananapi kernel: [    3.026984] TCP: htcp registered
Sep 18 23:17:11 bananapi kernel: [    3.032483] TCP: vegas registered
Sep 18 23:17:11 bananapi kernel: [    3.037977] TCP: veno registered
Sep 18 23:17:11 bananapi kernel: [    3.043741] TCP: scalable registered
Sep 18 23:17:11 bananapi kernel: [    3.049321] TCP: lp registered
Sep 18 23:17:11 bananapi kernel: [    3.054652] TCP: yeah registered
Sep 18 23:17:11 bananapi kernel: [    3.060469] TCP: illinois registered
Sep 18 23:17:11 bananapi kernel: [    3.067294] Initializing XFRM netlink socket
Sep 18 23:17:11 bananapi kernel: [    3.074165] The port change to OHCI now!
Sep 18 23:17:11 bananapi kernel: [    3.078330] NET: Registered protocol family 10
Sep 18 23:17:11 bananapi kernel: [    3.087345] NET: Registered protocol family 17
Sep 18 23:17:11 bananapi kernel: [    3.095365] NET: Registered protocol family 15
Sep 18 23:17:11 bananapi kernel: [    3.104497] [mmc_pm]: No sdio card, please check your config !!
Sep 18 23:17:11 bananapi kernel: [    3.108224] Registering the dns_resolver keytype
Sep 18 23:17:11 bananapi kernel: [    3.115056] VFP support v0.3: implementor 41 architecture 2 part 30 variant 7 rev 4
Sep 18 23:17:11 bananapi kernel: [    3.131369] Registering SWP/SWPB emulation handler
Sep 18 23:17:11 bananapi kernel: [    3.140778] axp20_buck2: Failed to create debugfs directory
Sep 18 23:17:11 bananapi kernel: [    3.152510] [cpu_freq] INF:-------------------V-F Table-------------------
Sep 18 23:17:11 bananapi kernel: [    3.164446] [cpu_freq] INF: voltage = 1450mv frequency = 1008MHz
Sep 18 23:17:11 bananapi kernel: [    3.175592] [cpu_freq] INF: voltage = 1425mv frequency =  912MHz
Sep 18 23:17:11 bananapi kernel: [    3.186740] [cpu_freq] INF: voltage = 1350mv frequency =  864MHz
Sep 18 23:17:11 bananapi kernel: [    3.197881] [cpu_freq] INF: voltage = 1250mv frequency =  720MHz
Sep 18 23:17:11 bananapi kernel: [    3.209024] [cpu_freq] INF: voltage = 1150mv frequency =  528MHz
Sep 18 23:17:11 bananapi kernel: [    3.220220] [cpu_freq] INF: voltage = 1100mv frequency =  312MHz
Sep 18 23:17:11 bananapi kernel: [    3.231369] [cpu_freq] INF: voltage = 1050mv frequency =  144MHz
Sep 18 23:17:11 bananapi kernel: [    3.242512] [cpu_freq] INF: voltage = 1000mv frequency =    0MHz
Sep 18 23:17:11 bananapi kernel: [    3.254440] [cpu_freq] INF:-----------------------------------------------
Sep 18 23:17:11 bananapi kernel: [    3.271054] [cpu_freq] INF:sunxi_cpufreq_initcall, get cpu frequency from sysconfig, max freq: 912MHz, min freq: 720MHz
Sep 18 23:17:11 bananapi kernel: [    3.285601] registered taskstats version 1
Sep 18 23:17:11 bananapi kernel: [    3.294138] I2C: i2c-4: HDMI I2C adapter
Sep 18 23:17:11 bananapi kernel: [    3.374048] usb 4-1: new full-speed USB device number 2 using sw-ohci
Sep 18 23:17:11 bananapi kernel: [    3.565356] usb 4-1: New USB device found, idVendor=0658, idProduct=0200
Sep 18 23:17:11 bananapi kernel: [    3.578170] usb 4-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
Sep 18 23:17:11 bananapi kernel: [    3.675256] usb 2-1.2: new full-speed USB device number 3 using sw-ohci
Sep 18 23:17:11 bananapi kernel: [    3.727532] [mmc-msg] mmc 0 detect change, present 1
Sep 18 23:17:11 bananapi kernel: [    3.805519] usb 2-1.2: New USB device found, idVendor=03eb, idProduct=204b
Sep 18 23:17:11 bananapi kernel: [    3.818672] usb 2-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Sep 18 23:17:11 bananapi kernel: [    3.828692] usb 2-1.2: Product: CUL868
Sep 18 23:17:11 bananapi kernel: [    3.835930] usb 2-1.2: Manufacturer: busware.de
Sep 18 23:17:11 bananapi kernel: [    3.925250] usb 2-1.3: new full-speed USB device number 4 using sw-ohci
Sep 18 23:17:11 bananapi kernel: [    4.066517] usb 2-1.3: New USB device found, idVendor=0403, idProduct=6001
Sep 18 23:17:11 bananapi kernel: [    4.079685] usb 2-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Sep 18 23:17:11 bananapi kernel: [    4.090490] usb 2-1.3: Product: FT232R USB UART
Sep 18 23:17:11 bananapi kernel: [    4.097991] usb 2-1.3: Manufacturer: FTDI
Sep 18 23:17:11 bananapi kernel: [    4.105321] usb 2-1.3: SerialNumber: AH02Z25S
Sep 18 23:17:11 bananapi kernel: [    4.236570] [mmc-msg] sdc0 set ios: clk 0Hz bm PP pm UP vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.247995] [mmc-msg] sdc0 power on
Sep 18 23:17:11 bananapi kernel: [    4.276985] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.291311] [mmc-msg] sdc0 set round clock 400000, src 24000000
Sep 18 23:17:11 bananapi kernel: [    4.374277] [mmc-err] smc 0 err, cmd 52,  RTO
Sep 18 23:17:11 bananapi kernel: [    4.382783] [mmc-err] smc 0 err, cmd 52,  RTO
Sep 18 23:17:11 bananapi kernel: [    4.395495] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.415624] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.430377] [mmc-err] smc 0 err, cmd 5,  RTO
Sep 18 23:17:11 bananapi kernel: [    4.438694] [mmc-err] smc 0 err, cmd 5,  RTO
Sep 18 23:17:11 bananapi kernel: [    4.447019] [mmc-err] smc 0 err, cmd 5,  RTO
Sep 18 23:17:11 bananapi kernel: [    4.455346] [mmc-err] smc 0 err, cmd 5,  RTO
Sep 18 23:17:11 bananapi kernel: [    4.468613] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.486371] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.506493] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing LEGACY(SDR12) dt B
Sep 18 23:17:11 bananapi kernel: [    4.550934] [mmc-msg] sdc0 set ios: clk 400000Hz bm PP pm ON vdd 3.3V width 1 timing SD-HS(SDR25) dt B
Sep 18 23:17:11 bananapi kernel: [    4.568685] [mmc-msg] sdc0 set ios: clk 50000000Hz bm PP pm ON vdd 3.3V width 1 timing SD-HS(SDR25) dt B
Sep 18 23:17:11 bananapi kernel: [    4.583323] [mmc-msg] sdc0 set round clock 42857143, src 600000000
Sep 18 23:17:11 bananapi kernel: [    4.652997] [mmc-msg] sdc0 set ios: clk 50000000Hz bm PP pm ON vdd 3.3V width 4 timing SD-HS(SDR25) dt B
Sep 18 23:17:11 bananapi kernel: [    4.666944] mmc0: new high speed SDHC card at address 0007
Sep 18 23:17:11 bananapi kernel: [    4.676291] mmcblk0: mmc0:0007 SD08G 7.42 GiB
Sep 18 23:17:11 bananapi kernel: [    4.684253]  mmcblk0: p1 p2
Sep 18 23:17:11 bananapi kernel: [   13.321607] Timeout waiting for EDID info
Sep 18 23:17:11 bananapi kernel: [   13.331995] disp clks: lcd 74250000 pre_scale 1 hdmi 74250000 pll 297000000 2x 0
Sep 18 23:17:11 bananapi kernel: [   13.864961] Console: switching to colour frame buffer device 160x45
Sep 18 23:17:11 bananapi kernel: [   13.897754] axp20_buck3: incomplete constraints, leaving on
Sep 18 23:17:11 bananapi kernel: [   13.908364] axp20_buck2: incomplete constraints, leaving on
Sep 18 23:17:11 bananapi kernel: [   13.918856] axp20_ldo4: incomplete constraints, leaving on
Sep 18 23:17:11 bananapi kernel: [   13.929256] axp20_ldo3: incomplete constraints, leaving on
Sep 18 23:17:11 bananapi kernel: [   13.946337] axp20_ldo2: incomplete constraints, leaving on
Sep 18 23:17:11 bananapi kernel: [   13.956735] axp20_ldo1: incomplete constraints, leaving on
Sep 18 23:17:11 bananapi kernel: [   13.965398] console [netcon0] enabled
Sep 18 23:17:11 bananapi kernel: [   13.972715] netconsole: network logging started
Sep 18 23:17:11 bananapi kernel: [   13.981571] otg_wakelock_init: No USB transceiver found
Sep 18 23:17:11 bananapi kernel: [   13.994490] sunxi-rtc sunxi-rtc: setting system clock to 2010-01-01 00:00:00 UTC (1262304000)
Sep 18 23:17:11 bananapi kernel: [   14.010887] ALSA device list:
Sep 18 23:17:11 bananapi kernel: [   14.022825]   #0: sunxi-CODEC  Audio Codec
Sep 18 23:17:11 bananapi kernel: [   14.038799] md: Waiting for all devices to be available before autodetect
Sep 18 23:17:11 bananapi kernel: [   14.056143] md: If you don't use raid, use raid=noautodetect
Sep 18 23:17:11 bananapi kernel: [   14.071496] md: Autodetecting RAID arrays.
Sep 18 23:17:11 bananapi kernel: [   14.084732] md: Scanned 0 and added 0 devices.
Sep 18 23:17:11 bananapi kernel: [   14.096607] md: autorun ...
Sep 18 23:17:11 bananapi kernel: [   14.107380] md: ... autorun DONE.
Sep 18 23:17:11 bananapi kernel: [   14.860067] EXT4-fs warning (device sda1): ext4_clear_journal_err:4239: Filesystem error recorded from previous mount: IO failure
Sep 18 23:17:11 bananapi kernel: [   14.887161] EXT4-fs warning (device sda1): ext4_clear_journal_err:4240: Marking fs in need of filesystem check.
Sep 18 23:17:11 bananapi kernel: [   14.920550] EXT4-fs (sda1): warning: mountin&am

Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 20 September 2016, 08:38:29
Could you please put the log in "code tags"? As it is, it is not readable (for me). Also, it is from the last reboot (you can tell from the log timestamps), so the error is not occurring at the moment, right?

According to SMART the disk is O.K. ....
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 20 September 2016, 08:41:26
I did put the log in code tags, just like the SMART output, but they were ignored. Maybe because the text is too long?

I'll take a look this evening.

Thanks for the info about the disk's condition  :)
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 20 September 2016, 09:03:07
Did you check the opening/closing code tags, maybe deleted a [ by accident?
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 20 September 2016, 09:18:01
No idea, I've edited it again, it should be fine now
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 20 September 2016, 09:20:32
As already written: that log file is from the last reboot. Could you look in an older one?
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 20 September 2016, 10:35:15
Ok, I'll do that this evening from the PC.
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 20 September 2016, 18:55:07
Here is an older log

Sep 13 12:26:57 bananapi kernel: [1897592.497738] EXT4-fs warning (device sda1): ext4_end_bio:249: I/O error writing to inode 4326714 (offset 3670016 size 524288 start$
Sep 13 12:26:57 bananapi kernel: [1897592.520534] sd 0:0:0:0: [sda] Unhandled error code
Sep 13 12:26:57 bananapi kernel: [1897592.528801] sd 0:0:0:0: [sda]  Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Sep 13 12:26:57 bananapi kernel: [1897592.548163] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 01 d0 e8 00 00 04 00 00
Sep 13 12:26:57 bananapi kernel: [1897592.556557] end_request: I/O error, dev sda, sector 30468096
Sep 13 12:26:57 bananapi kernel: [1897592.561703] Buffer I/O error on device sda1, logical block 3808256
Sep 13 12:26:57 bananapi kernel: [1897592.573199] Buffer I/O error on device sda1, logical block 3808257
Sep 13 12:26:57 bananapi kernel: [1897592.596641] Buffer I/O error on device sda1, logical block 3808258
Sep 13 12:26:57 bananapi kernel: [1897592.616474] Buffer I/O error on device sda1, logical block 3808259
Sep 13 12:26:57 bananapi kernel: [1897592.627434] Buffer I/O error on device sda1, logical block 3808260
Sep 13 12:26:57 bananapi kernel: [1897592.638904] Buffer I/O error on device sda1, logical block 3808261
Sep 13 12:26:57 bananapi kernel: [1897592.644053] Buffer I/O error on device sda1, logical block 3808262
Sep 13 12:26:57 bananapi kernel: [1897592.649191] Buffer I/O error on device sda1, logical block 3808263
Sep 13 12:26:57 bananapi kernel: [1897592.654374] Buffer I/O error on device sda1, logical block 3808264
Sep 13 12:26:57 bananapi kernel: [1897592.659512] Buffer I/O error on device sda1, logical block 3808265
Sep 13 12:26:57 bananapi kernel: [1897592.664682] Buffer I/O error on device sda1, logical block 3808266
Sep 13 12:26:57 bananapi kernel: [1897592.669819] Buffer I/O error on device sda1, logical block 3808267
Sep 13 12:26:57 bananapi kernel: [1897592.674991] Buffer I/O error on device sda1, logical block 3808268
Sep 13 12:26:57 bananapi kernel: [1897592.692781] Buffer I/O error on device sda1, logical block 3808269
Sep 13 12:26:57 bananapi kernel: [1897592.704280] Buffer I/O error on device sda1, logical block 3808270
Sep 13 12:26:57 bananapi kernel: [1897592.715766] Buffer I/O error on device sda1, logical block 3808271
Sep 13 12:26:57 bananapi kernel: [1897592.727287] Buffer I/O error on device sda1, logical block 3808272
Sep 13 12:26:57 bananapi kernel: [1897592.738788] Buffer I/O error on device sda1, logical block 3808273
Sep 13 12:26:57 bananapi kernel: [1897592.750282] Buffer I/O error on device sda1, logical block 3808274
Sep 13 12:26:57 bananapi kernel: [1897592.768104] Buffer I/O error on device sda1, logical block 3808275
Sep 13 12:26:57 bananapi kernel: [1897592.779576] Buffer I/O error on device sda1, logical block 3808276
Sep 13 12:26:57 bananapi kernel: [1897592.791050] Buffer I/O error on device sda1, logical block 3808277
Sep 13 12:26:57 bananapi kernel: [1897592.802535] Buffer I/O error on device sda1, logical block 3808278
Sep 13 12:26:57 bananapi kernel: [1897592.807681] Buffer I/O error on device sda1, logical block 3808279
Sep 13 12:26:57 bananapi kernel: [1897592.812818] Buffer I/O error on device sda1, logical block 3808280
Sep 13 12:26:57 bananapi kernel: [1897592.829867] Buffer I/O error on device sda1, logical block 3808281
Sep 13 12:26:57 bananapi kernel: [1897592.842657] Buffer I/O error on device sda1, logical block 3808282
Sep 13 12:26:57 bananapi kernel: [1897592.854149] Buffer I/O error on device sda1, logical block 3808283
Sep 13 12:26:57 bananapi kernel: [1897592.865622] Buffer I/O error on device sda1, logical block 3808284


IMO that doesn't look good at all. Would an fsck help?
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 20 September 2016, 19:50:31
Sep 13 12:26:57 bananapi kernel: [1897592.497738] EXT4-fs warning (device sda1): ext4_end_bio:249: I/O error writing to inode 4326714 (offset 3670016 size 524288 start$
Sep 13 12:26:57 bananapi kernel: [1897592.520534] sd 0:0:0:0: [sda] Unhandled error code
Sep 13 12:26:57 bananapi kernel: [1897592.528801] sd 0:0:0:0: [sda]  Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
Sep 13 12:26:57 bananapi kernel: [1897592.548163] sd 0:0:0:0: [sda] CDB: Write(10): 2a 00 01 d0 e8 00 00 04 00 00
Sep 13 12:26:57 bananapi kernel: [1897592.556557] end_request: I/O error, dev sda, sector 30468096

Is there anything before the first message?

This looks like a hardware fault to me: "I/O error writing to inode" ...

So there is a "defect" somewhere on the path from the kernel to the disk. Since you wrote that it is a USB disk, can you connect it to another Linux machine and run an fsck over it?
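
As a cross-check, the CDB bytes in the log excerpt can be decoded by hand. A minimal Python sketch, assuming the standard WRITE(10) layout (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in bytes 7-8) and 512-byte logical sectors; decode_write10 is just an illustrative helper name, not part of any tool:

```python
# Decode the SCSI WRITE(10) command block from the kernel log line
# "[sda] CDB: Write(10): 2a 00 01 d0 e8 00 00 04 00 00".
CDB_HEX = "2a 00 01 d0 e8 00 00 04 00 00"
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def decode_write10(cdb_hex):
    """Return (lba, block_count) for a 10-byte WRITE(10) CDB given as hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return lba, blocks


lba, blocks = decode_write10(CDB_HEX)
print("LBA:", lba)
print("blocks:", blocks)
print("bytes:", blocks * SECTOR_SIZE)
```

The decoded LBA is the sector named in the end_request line (30468096), and the length is the size from the EXT4 warning (524288 bytes), so these messages all describe the same timed-out write.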
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 20 September 2016, 19:57:40
It's a SATA disk. But I could put it in a USB enclosure, connect it to the QNAP, and run fsck over it...

Yes, there was quite a bit before that in the log, but this excerpt seemed noteworthy to me...
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 20 September 2016, 20:59:36
What filesystem?

And once you have it mounted on the Synology, could you try shuffling several GB back and forth?

P.S. Please don't forget the backup ...
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 20 September 2016, 22:09:51
Hm, ext4 I think

It's a QNAP.
I make a backup every day
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: AxelSchweiss on 21 September 2016, 01:00:15
It is ext4  ;)
Sep 13 12:26:57 bananapi kernel: [1897592.497738] EXT4-fs warning (device sda1): ext4_end_bio:249: I/O error writing to inode 4326714 (offset 3670016 size 524288 start$

Does the disk have something like an auto power-down?
If so, it may be that it does not wake up again, or wakes up too slowly.

Run an "Extended Self Test" with smartmontools.
Once it has finished, you will see the result at the end of the smartctl -a listing.

It then looks like this, for example:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     32383         -
# 2  Short offline       Completed without error       00%     32359         -
# 3  Short offline       Completed without error       00%     32335         -
# 4  Extended offline    Completed without error       00%     32317         -
# 5  Short offline       Completed without error       00%     32287         -
# 6  Short offline       Completed without error       00%     32263         -
# 7  Short offline       Completed without error       00%     32239         -
# 8  Short offline       Completed without error       00%     32215         -

Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 21 September 2016, 07:10:13
According to smartmontools the extended test is supposed to take 106 minutes. Is everything blocked during that time, and can the test be stopped?
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 21 September 2016, 08:06:20
No .. the test runs in the background, on the disk itself. In theory the disk is slower during that time; in practice that should not be measurable in your case (because of the USB connection).
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 21 September 2016, 08:21:50
Thanks.

Why do you keep coming back to the USB connection????
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 21 September 2016, 16:42:42
Here is the result of the HDD check.

Anything unusual? I'm not very familiar with this...

smartctl 5.41 2011-06-09 r3365 [armv7l-linux-3.4.104+] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Travelstar 5K500.B
Device Model:     Hitachi HTS545032B9A300
Serial Number:    xxxxxxxxxxxxxxxxxxx
LU WWN Device Id: 5 000cca 5f1ef0c22
Firmware Version: PB3OC60N
User Capacity:    320.071.851.520 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Wed Sep 21 16:36:50 2016 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  645) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 106) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -  0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -  0
  3 Spin_Up_Time            0x0007   170   170   033    Pre-fail  Always       -  2
  4 Start_Stop_Count        0x0012   097   097   000    Old_age   Always       -  4780
  5 Reallocated_Sector_Ct   0x0033   094   094   005    Pre-fail  Always       -  0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -  0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -  0
  9 Power_On_Hours          0x0012   054   054   000    Old_age   Always       -  20334
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -  0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -  1885
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -  0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -  94
193 Load_Cycle_Count        0x0012   018   018   000    Old_age   Always       -  826541
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always       -  33 (Min/Max 10/50)
196 Reallocated_Event_Count 0x0032   086   086   000    Old_age   Always       -  1066
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -  0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -  0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -  0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     20328         -
# 2  Short offline       Completed without error       00%      8955         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 21 September 2016, 19:43:03
Looks good (by the way, your disk is at 33 °C ;o) )
So ... smartd is not 100% reliable, but going by these values I would tend to assume the disk is OK. What else comes to mind ...

- Have you generated traffic on the disk? (Copy large amounts of data to it, ideally using a second system)
- Have you checked that the cables are seated properly?

For now, though, I would look for a problem in the I/O area of the Cubi :o( and not on the SATA side, because:
Quote:
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -  0

Interesting: your disk has only spun up 2 times, but has over 4000 start/stops with 1800 power-ons??
Quote:
  3 Spin_Up_Time            0x0007   170   170   033    Pre-fail  Always       -  2
  4 Start_Stop_Count        0x0012   097   097   000    Old_age   Always       -  4780
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -  1885

Otherwise the values are normal for more than 20000 hours of operation, and no value (including those mentioned above) is critical.
Quote:
  9 Power_On_Hours          0x0012   054   054   000    Old_age   Always       -  20334
P.S. 20334 h ~ 2.3 years
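
For anyone who wants to pull these numbers out of the listing instead of reading them by eye, here is a rough Python sketch. It assumes the ten-column attribute layout smartctl printed above (RAW_VALUE in the tenth column) and a 365-day year; parse_attributes is a made-up helper name:

```python
# Attribute lines copied from the smartctl listing in this thread.
ATTRIBUTE_LINES = """\
  3 Spin_Up_Time            0x0007   170   170   033    Pre-fail  Always       -  2
  4 Start_Stop_Count        0x0012   097   097   000    Old_age   Always       -  4780
  9 Power_On_Hours          0x0012   054   054   000    Old_age   Always       -  20334
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -  1885
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -  0
"""


def parse_attributes(text):
    """Map attribute name -> normalized value, threshold and raw value."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Columns: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = {
            "value": int(fields[3]),
            "thresh": int(fields[5]),
            "raw": int(fields[9]),
        }
    return attrs


attrs = parse_attributes(ATTRIBUTE_LINES)
hours = attrs["Power_On_Hours"]["raw"]
years = hours / (24 * 365)  # assumption: 365-day year
print(f"{hours} h is about {years:.1f} years")
```

This only reads the first token of the raw column, which is enough for the plain counters quoted here.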




Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: AxelSchweiss on 21 September 2016, 21:35:16
Quote from: Wernieman on 21 September 2016, 19:43:03
Otherwise the values are normal for more than 20000 hours of operation, and no value (including those mentioned above) is critical. P.S. 20334 h ~ 2.3 years

Uhh, yes there is ... this one
193 Load_Cycle_Count        0x0012   018   018   000    Old_age   Always       -  826541

Have a read of the first post in the thread linked here: http://www.linuxforen.de/forums/showthread.php?267786-Der-gr%FCne-Festplattentod-Load-Cycle-Count-Liste-anf%E4lliger-Festplatten

Hitachi is mentioned there too.
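
To put that raw value in perspective, it can be set against the power-on hours from the same listing. A small Python sketch using only those two numbers; it assumes the head load cycles were spread evenly over the whole power-on time, which is a simplification:

```python
# Raw values taken from the smartctl listing above.
LOAD_CYCLE_COUNT = 826541   # attribute 193, raw value
POWER_ON_HOURS = 20334      # attribute 9, raw value

# Assumption: cycles are spread evenly over the power-on time.
cycles_per_hour = LOAD_CYCLE_COUNT / POWER_ON_HOURS
seconds_between_cycles = 3600 / cycles_per_hour

print(f"{cycles_per_hour:.1f} load cycles per hour")
print(f"one load cycle roughly every {seconds_between_cycles:.0f} seconds")
```

That works out to about 40 head parks per hour on average, i.e. one roughly every minute and a half.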
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 21 September 2016, 21:44:01
Hi guys,

thanks for the analyses. I'll set that up with hdparm. Apart from that, I suspect my home-made automounter. It caused the logs to overflow. I have now replaced it with a mount-on-request :)

I'll keep an eye on the disk all the same. With what I've learned, I may also check the retired SSD again..
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 22 September 2016, 12:24:41
You are right in principle about the Load_Cycle_Count, but going by the other values the disk is not defective yet. Only the wear is higher ...
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: AxelSchweiss on 22 September 2016, 13:02:25
 :)
Yes ... I'm just saying: Load_Cycle.
But quite a few disks have the Load_Cycle problem.
For WD disks there is even a dedicated tool for it.
Defective by design, basically.
Inkjet printers were given a "usage limit" too, after all.
You have to make money somehow ... if not with quality, then by other means.


Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 22 September 2016, 13:28:56
Unlike with printers, I have to disagree with you when it comes to hard disks ...

When the disk goes into that special mode, it parks the heads. For notebook disks this makes sense, because they are more shock-resistant while parked. That a mechanism is designed for some maximum limit is understandable, since you simply cannot optimize for infinity.

That the manufacturer chose values for a normal disk that may be too low for 24/7 use ...
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 23 September 2016, 16:34:32
Hi guys,

could you please take a look at this?

I think the SSD has had it, hasn't it? (Even though I'm writing this from a laptop in which I installed it and put Knoppix on it. In the Banana it caused nothing but problems...)


knoppix@Microknoppix:~$ sudo smartctl -t long /dev/sda
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.6-64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 4 minutes for test to complete.
Test will complete after Fri Sep 23 16:17:02 2016

Use smartctl -X to abort test.
knoppix@Microknoppix:~$ sudo smartctl -a  /dev/sda     
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.2.6-64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Crucial/Micron RealSSD m4/C400
Device Model:     M4-CT064M4SSD2
Serial Number:    ohne
LU WWN Device Id: 5 00a075 103169892
Firmware Version: 000F
User Capacity:    64.023.257.088 bytes [64,0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Fri Sep 23 16:28:26 2016 CEST

==> WARNING: This drive may hang after 5184 hours of power-on time:
http://www.tomshardware.com/news/Crucial-m4-Firmware-BSOD,14544.html
See the following web pages for firmware updates:
http://www.crucial.com/support/firmware.aspx
http://www.micron.com/products/solid-state-storage/client-ssd#software

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 115) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (  295) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (   4) minutes.
Conveyance self-test routine
recommended polling time:        (   3) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   050    Pre-fail  Always       -       566
  5 Reallocated_Sector_Ct   0x0033   095   095   010    Pre-fail  Always       -       47104 (0 7)
  9 Power_On_Hours          0x0032   100   100   001    Old_age   Always       -       13980
 12 Power_Cycle_Count       0x0032   100   100   001    Old_age   Always       -       1546
170 Grown_Failing_Block_Ct  0x0033   095   095   010    Pre-fail  Always       -       247
171 Program_Fail_Count      0x0032   100   100   001    Old_age   Always       -       8959
172 Erase_Fail_Count        0x0032   100   100   001    Old_age   Always       -       0
173 Wear_Leveling_Count     0x0033   089   089   010    Pre-fail  Always       -       332
174 Unexpect_Power_Loss_Ct  0x0032   100   100   001    Old_age   Always       -       203
181 Non4k_Aligned_Access    0x0022   100   100   001    Old_age   Always       -       1482 401 1081
183 SATA_Iface_Downshift    0x0032   100   100   001    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   001    Old_age   Always       -       698
188 Command_Timeout         0x0032   100   100   001    Old_age   Always       -       0
189 Factory_Bad_Block_Ct    0x000e   100   100   001    Old_age   Always       -       48
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       0
195 Hardware_ECC_Recovered  0x003a   100   100   001    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   001    Old_age   Always       -       247
197 Current_Pending_Sector  0x0032   100   100   001    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   001    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   001    Old_age   Always       -       0
202 Perc_Rated_Life_Used    0x0018   089   089   001    Old_age   Offline      -       11
206 Write_Error_Rate        0x000e   100   100   001    Old_age   Always       -       8959

SMART Error Log Version: 1
ATA Error Count: 0
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 0 occurred at disk power-on lifetime: 13974 hours (582 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 b8 9c 8d e0  Error: UNC 8 sectors at LBA = 0x008d9cb8 = 9280696

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 b8 9c 8d e0 00  35d+11:23:19.744  READ DMA
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00  35d+11:23:19.744  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00  35d+11:23:19.744  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error -1 occurred at disk power-on lifetime: 13974 hours (582 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 b8 9c 8d e0  Error: UNC 8 sectors at LBA = 0x008d9cb8 = 9280696

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 b8 9c 8d e0 00  35d+11:23:19.744  READ DMA
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00  35d+11:23:19.744  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00  35d+11:23:19.744  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error -2 occurred at disk power-on lifetime: 13974 hours (582 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 b8 9c 8d e0  Error: UNC 8 sectors at LBA = 0x008d9cb8 = 9280696

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 b8 9c 8d e0 00  35d+11:23:19.744  READ DMA
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00  35d+11:23:19.744  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00  35d+11:23:19.744  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error -3 occurred at disk power-on lifetime: 13974 hours (582 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 b8 9c 8d e0  Error: UNC 8 sectors at LBA = 0x008d9cb8 = 9280696

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 b8 9c 8d e0 00  35d+11:23:19.744  READ DMA
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00  35d+11:23:19.744  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 00  35d+11:23:19.744  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00  35d+11:23:19.744  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]

Error -4 occurred at disk power-on lifetime: 13974 hours (582 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 b8 9c 8d e0  Error: UNC 8 sectors at LBA = 0x008d9cb8 = 9280696

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 b8 9c 8d e0 00  35d+11:23:19.744  READ DMA
  c8 00 80 30 70 8c e0 00  35d+11:23:19.744  READ DMA
  c8 00 18 70 6e 84 e0 00  35d+11:23:19.744  READ DMA
  c8 00 20 a0 03 8a e0 00  35d+11:23:19.744  READ DMA
  35 00 88 00 84 bc e0 00  35d+11:23:19.744  WRITE DMA EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       30%     13979         92273608

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Wernieman on 23 September 2016, 20:33:25
Run a smartctl test ...

That said, the read/write error values make me suspicious ...
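
The failing address in the error log entries can be rebuilt from the register dump. A minimal Python sketch, assuming 28-bit LBA addressing as used by the READ DMA command shown there (SN, CL and CH hold LBA bits 0-23, the low nibble of DH holds bits 24-27); lba28_from_registers is a hypothetical helper name:

```python
# Register row from the SMART error log above: ER ST SC SN CL CH DH
REGISTER_ROW = "40 51 08 b8 9c 8d e0"


def lba28_from_registers(row):
    """Return (lba, sector_count) from an ATA register dump in hex."""
    er, st, sc, sn, cl, ch, dh = (int(x, 16) for x in row.split())
    # LBA28: DH[3:0] = bits 27-24, CH = bits 23-16, CL = bits 15-8, SN = bits 7-0
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return lba, sc


lba, sectors = lba28_from_registers(REGISTER_ROW)
print(f"UNC {sectors} sectors at LBA = 0x{lba:08x} = {lba}")
```

This reproduces the "UNC 8 sectors at LBA = 0x008d9cb8 = 9280696" annotation that smartctl prints next to the registers.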
Title: Re: RootFS goes RO instead of RW after 3-4 weeks
Post by: Bartimaus on 26 September 2016, 12:20:51
Morning,

according to smartctl my disk has already used up 84% of its cycles. So another 2% within the last few days...

Following this link http://wiki.ubuntuusers.de/Notebook-Festplatten-Bug
I have now changed the disk's APM value from 128 to 254. I'll keep watching. :o