Server stability problems

Gabriel gabriel at maxpark.cz
Sun Mar 20 11:33:28 CET 2011


Hello list,

I have a problem with one server that reboots on its own under load. It
looks like a perfectly clean reboot, with no sign of a crash. It happens
mainly at night, while the database backup is running.
Today's entry in the output of last:
reboot           ~                         Sun Mar 20 04:54
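
To check whether the reboots really cluster around the backup window, the
reboot records can be tallied by hour of day. This is only a minimal sketch:
it assumes each record has the same shape as the line above (reboot, ~,
weekday, month, day, HH:MM), and the function name reboot_hours is made up
for illustration.

```python
from collections import Counter


def reboot_hours(last_output):
    """Count reboot records per hour of day in `last`-style text.

    Assumes the record layout shown in the post:
    reboot ~ <weekday> <month> <day> <HH:MM>
    """
    hours = Counter()
    for line in last_output.splitlines():
        fields = line.split()
        # Skip anything that is not a reboot record of the expected shape.
        if len(fields) < 6 or fields[0] != "reboot":
            continue
        hour_text = fields[5].split(":")[0]
        if ":" not in fields[5] or not hour_text.isdigit():
            continue
        hours[int(hour_text)] += 1
    return hours


sample = "reboot           ~                         Sun Mar 20 04:54\n"
print(reboot_hours(sample))
```

Feeding it the saved output of several weeks of last would show whether the
reboots line up with the hour in which the backup job starts.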

In addition, scp transfers in either direction fail with "Connection reset"
or "bad MAC packet", most reliably when I transfer larger files (on the
order of gigabytes). It looks like a RAM problem, so I ran a stress test on
the machine for half a day, and to my surprise the server got through it
without any trouble.
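
One way to separate corruption on the wire from corruption inside the
machine is to hash the same large file several times locally. If the digests
differ between passes, the data is being damaged before it ever reaches the
network (RAM, controller, or disk path). This is a minimal sketch, assuming
Python is available on the server; the function names and the chunk size are
illustrative, not part of any existing tool.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, passes=5):
    """Hash the same file several times and return the distinct digests.

    A healthy machine yields exactly one digest; more than one means the
    same bytes were read or processed differently between passes.
    """
    return {file_digest(path) for _ in range(passes)}
```

Calling repeated_digests on a multi-gigabyte file should return a set with a
single element. Comparing file_digest results on the sending and receiving
host after a transfer that did complete shows whether the copy itself was
altered.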

The server configuration is as follows:

FreeBSD 8.2-RELEASE #0: Wed Mar 2 00:29:27 CET 2011
 root@zeus:/usr/obj/usr/src/sys/zeus amd64
CPU: Dual Core AMD Opteron(tm) Processor 280 (2389.99-MHz K8-class CPU)
 Origin = "AuthenticAMD" Id = 0x20f12 Family = f Model = 21 Stepping = 2
 Features=0x178bfbff
 Features2=0x1
 AMD Features=0xe2500800
 AMD Features2=0x3
real memory = 12884901888 (12288 MB)
avail memory = 12372381696 (11799 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
bge0: <Broadcom Gigabit Ethernet Controller, ASIC rev. 0x002003> mem 0xfc8c0000-0xfc8cffff,0xfc8b0000-0xfc8bffff irq 24 at device 9.0 on pci2
bge0: CHIP ID 0x00002003; ASIC REV 0x02; CHIP REV 0x20; PCI-X
bge1: <Broadcom Gigabit Ethernet Controller, ASIC rev. 0x002003> mem 0xfc8f0000-0xfc8fffff,0xfc8e0000-0xfc8effff irq 25 at device 9.1 on pci2
bge1: CHIP ID 0x00002003; ASIC REV 0x02; CHIP REV 0x20; PCI-X
ZFS filesystem version 4
ZFS storage pool version 15
ad4: 238475MB <Seagate ST3250310NS SN06> at ata2-master UDMA100 SATA 1.5Gb/s
ad6: 238475MB <Seagate ST3250310NS SN06> at ata3-master UDMA100 SATA 1.5Gb/s

A ZFS /home is attached to the server over NFS via gigabit Ethernet.
The RAM is, of course, ECC.

Root is on a mirrored ZFS pool:
[root@zeus]:(~)# zpool status
  pool: system
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        system      ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ad4p3   ONLINE       0     0     0
            ad6p3   ONLINE       0     0     0

errors: No known data errors


Has anyone ever run into these or similar problems? I am really at a loss
as to where to look or how to track the problem down.
I will gladly provide any further information needed for the investigation.

Thanks for any help.
Gabriel


