Please help me work out whether this is a SCSI bus problem or a disk problem

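Two V490 servers are connected by SCSI cable to a Sun JBOD disk array.
The system had been running normally until recently. Over the last few days both hosts have suddenly started logging SCSI errors at night, yet format shows the disks as normal, the indicator lights on the array are normal, and the applications are running normally.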


itellin1:

Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 4.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 4.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 4.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 4.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 4.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 8.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        Log info 110b0000 received for target 4.
Jan 15 22:44:37 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:37 itellin1        got external SCSI bus reset.
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@8,0 (sd53):
Jan 15 22:44:37 itellin1        SCSI transport failed: reason 'reset': retrying command
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@4,0 (sd50):
Jan 15 22:44:37 itellin1        SCSI transport failed: reason 'reset': retrying command
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@8,0 (sd53):
Jan 15 22:44:37 itellin1        Error for Command: write(10)               Error Level: Retryable
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  Requested Block: 36831300                  Error Block: 36831300
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 062034F00Z  
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@4,0 (sd50):
Jan 15 22:44:37 itellin1        Error for Command: write(10)               Error Level: Retryable
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  Requested Block: 37238568                  Error Block: 37238568
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 0451B9S3P8  
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2
Jan 17 00:04:36 itellin1 scsi: [ID 107833 kern.notice] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 17 00:04:36 itellin1        got external SCSI bus reset.
Jan 17 00:04:36 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 17 00:04:36 itellin1        Log info 110b0000 received for target 8.
Jan 17 00:04:36 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 17 00:04:36 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 17 00:04:36 itellin1        Log info 110b0000 received for target 4.
Jan 17 00:04:36 itellin1        scsi_status=0, ioc_status=804b, scsi_state=8
Jan 17 00:04:36 itellin1 scsi: [ID 365881 kern.info] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 17 00:04:36 itellin1        Log info 110b0000 received for target 4.
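
The sd driver prints ASC 0x29 as "<vendor unique code 0x29>", but 0x29 is, as far as I recall from the T10 ASC/ASCQ list, a standard additional sense code for reset-related events, and 0x29/0x02 means "SCSI BUS RESET OCCURRED". That matches the 'reset' transport failure reason in the same log. The following is a minimal sketch that decodes such lines; the table is a hand-copied subset and the names `ASC_29` and `decode` are made up for illustration.

```python
import re

# Hand-copied subset of the standard ASC/ASCQ assignments for ASC 0x29.
# Treat the wording as an assumption and check it against the T10 list.
ASC_29 = {
    0x00: "POWER ON, RESET, OR BUS DEVICE RESET OCCURRED",
    0x01: "POWER ON OCCURRED",
    0x02: "SCSI BUS RESET OCCURRED",
    0x03: "BUS DEVICE RESET FUNCTION OCCURRED",
    0x04: "DEVICE INTERNAL RESET",
}

# Matches the "ASC: 0x29 (...), ASCQ: 0x2" part of an sd driver message.
SENSE_RE = re.compile(r"ASC:\s*0x([0-9a-fA-F]+).*?ASCQ:\s*0x([0-9a-fA-F]+)")

def decode(line):
    """Return a description for the ASC/ASCQ pair in a log line, or None."""
    m = SENSE_RE.search(line)
    if not m:
        return None
    asc, ascq = int(m.group(1), 16), int(m.group(2), 16)
    if asc == 0x29:
        return ASC_29.get(ascq, "unknown ASCQ for ASC 0x29")
    return "not covered by this sketch"

line = ("Jan 15 22:44:37 itellin1 scsi: [ID 107833 kern.notice]  "
        "ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2")
print(decode(line))
```

If that reading is right, the Unit Attention entries are the disks reporting that they saw a bus reset, not the disks reporting a media fault.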

----------------


itellin2
Jan 14 20:34:37 itellin2 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 14 20:34:37 itellin2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2
Jan 14 22:24:35 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@a,0 (sd55):
Jan 14 22:24:35 itellin2        SCSI transport failed: reason 'reset': retrying command
Jan 14 22:24:35 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@3,0 (sd49):
Jan 14 22:24:35 itellin2        SCSI transport failed: reason 'reset': retrying command
Jan 14 22:24:35 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@c,0 (sd57):
Jan 14 22:24:35 itellin2        SCSI transport failed: reason 'reset': retrying command
Jan 14 22:24:35 itellin2 scsi: [ID 107833 kern.notice] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 14 22:24:35 itellin2        got external SCSI bus reset.
Jan 15 22:44:36 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@c,0 (sd57):
Jan 15 22:44:36 itellin2        SCSI transport failed: reason 'reset': retrying command
Jan 15 22:44:36 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@a,0 (sd55):
Jan 15 22:44:36 itellin2        SCSI transport failed: reason 'reset': retrying command
Jan 15 22:44:36 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@3,0 (sd49):
Jan 15 22:44:36 itellin2        SCSI transport failed: reason 'reset': retrying command
Jan 15 22:44:36 itellin2 scsi: [ID 107833 kern.notice] /pci@1d,700000/scsi@1,1 (mpt3):
Jan 15 22:44:36 itellin2        got external SCSI bus reset.
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@a,0 (sd55):
Jan 15 22:44:37 itellin2        Error for Command: write(10)               Error Level: Retryable
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Requested Block: 16166528                  Error Block: 16166528
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 062034GNKA  
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@c,0 (sd57):
Jan 15 22:44:37 itellin2        Error for Command: write(10)               Error Level: Retryable
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Requested Block: 16166656                  Error Block: 16166656
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 062034MEC4  
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@3,0 (sd49):
Jan 15 22:44:37 itellin2        Error for Command: write(10)               Error Level: Retryable
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Requested Block: 16156720                  Error Block: 16156720
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: 0508BA2DN9  
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 15 22:44:37 itellin2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (<vendor unique code 0x29>), ASCQ: 0x2, FRU: 0x2
Jan 17 00:04:36 itellin2 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/scsi@1,1/sd@a,0 (sd55):
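
In the excerpts above, both hosts log "got external SCSI bus reset" within a second of each other on Jan 15 (22:44:37 on itellin1, 22:44:36 on itellin2). A rough way to check this across the full logs is sketched below. It assumes the messages keep the format shown here, that the year is 2011 (syslog lines carry no year), and that the two hosts' clocks agree to within a few seconds; `reset_times` and `shared_resets` are illustrative names.

```python
import re
from datetime import datetime, timedelta

# Matches the continuation line that carries the reset message.
RESET_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+got external SCSI bus reset"
)

def reset_times(lines, year=2011):
    """Map host name -> list of datetimes at which a bus reset was logged."""
    out = {}
    for line in lines:
        m = RESET_RE.match(line)
        if m:
            # The year is an assumption; syslog timestamps do not include it.
            stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            out.setdefault(m.group(2), []).append(stamp)
    return out

def shared_resets(times_a, times_b, tolerance=timedelta(seconds=5)):
    """Return (a, b) pairs of resets that fall within `tolerance` of each other."""
    return [(a, b) for a in times_a for b in times_b if abs(a - b) <= tolerance]

sample = [
    "Jan 15 22:44:37 itellin1        got external SCSI bus reset.",
    "Jan 17 00:04:36 itellin1        got external SCSI bus reset.",
    "Jan 14 22:24:35 itellin2        got external SCSI bus reset.",
    "Jan 15 22:44:36 itellin2        got external SCSI bus reset.",
]
times = reset_times(sample)
pairs = shared_resets(times["itellin1"], times["itellin2"])
for a, b in pairs:
    print(a, b)
```

Resets that line up on both hosts would point at an event on the shared bus rather than at one particular disk. The excerpts here are truncated, so the other dates cannot be compared from this sample alone.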


I did a rough analysis.
Both machines report errors on three disks:

/pci@1d,700000/scsi@1,1/sd@8,0 (sd53):
c3t8d0          Soft Errors: 0 Hard Errors: 11 Transport Errors: 157
Vendor: SEAGATE  Product: ST373207LSUN72G  Revision: 045A Serial No: 2034F00Z

/pci@1d,700000/scsi@1,1/sd@4,0 (sd50)
c3t4d0          Soft Errors: 0 Hard Errors: 10 Transport Errors: 150
Vendor: SEAGATE  Product: ST373307LSUN72G  Revision: 0507 Serial No: 3HZ9S3P800007523

---------------------
/pci@1d,700000/scsi@1,1/sd@a,0 (sd55):
c3t10d0         Soft Errors: 0 Hard Errors: 10 Transport Errors: 378
Vendor: SEAGATE  Product: ST373207LSUN72G  Revision: 045A Serial No: 2034GNKA


/pci@1d,700000/scsi@1,1/sd@c,0 (sd57):
c3t12d0         Soft Errors: 0 Hard Errors: 10 Transport Errors: 403
Vendor: SEAGATE  Product: ST373207LSUN72G  Revision: 045A Serial No: 2034MEC4


/pci@1d,700000/scsi@1,1/sd@3,0 (sd49)
c3t3d0          Soft Errors: 0 Hard Errors: 8 Transport Errors: 43
Vendor: SEAGATE  Product: ST373307LSUN72G  Revision: 0707 Serial No: 3HZA2DN900007532
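
To compare the counters above side by side, a small parser can help. This is only a sketch: it assumes the listing keeps the "Soft Errors / Hard Errors / Transport Errors" layout shown above (which resembles `iostat -En` output, though the post does not say which command produced it), and `parse_counters` is an illustrative name.

```python
import re

# One counter line per device, in the layout shown in the listing above.
COUNTER_RE = re.compile(
    r"^(c\d+t\d+d\d+)\s+Soft Errors:\s*(\d+)\s+Hard Errors:\s*(\d+)"
    r"\s+Transport Errors:\s*(\d+)"
)

def parse_counters(text):
    """Return (device, soft, hard, transport) tuples found in the listing."""
    rows = []
    for line in text.splitlines():
        m = COUNTER_RE.match(line.strip())
        if m:
            rows.append(
                (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
            )
    return rows

listing = """\
c3t8d0          Soft Errors: 0 Hard Errors: 11 Transport Errors: 157
c3t4d0          Soft Errors: 0 Hard Errors: 10 Transport Errors: 150
c3t10d0         Soft Errors: 0 Hard Errors: 10 Transport Errors: 378
c3t12d0         Soft Errors: 0 Hard Errors: 10 Transport Errors: 403
c3t3d0          Soft Errors: 0 Hard Errors: 8 Transport Errors: 43
"""

# Sort so the devices with the most transport errors come first.
rows = sorted(parse_counters(listing), key=lambda r: r[3], reverse=True)
for dev, soft, hard, transport in rows:
    print(f"{dev:8s} soft={soft} hard={hard} transport={transport}")
```

On the numbers posted here, transport errors outnumber hard errors on every listed disk and no disk shows soft errors.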

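In theory it is very unlikely that this many disks would fail at the same time. But the SCSI cables are firmly connected and cannot have come loose. What do you all think could be causing this?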

Author: twlogin   Posted: 2011-01-26

This post was last edited by yulemi on 2011-01-27 11:30.

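It's a JBOD, so let's wait for an expert to answer.
It is very unlikely that this many disks would fail at the same time. What storage model is it? What OS version?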

Author: yulemi   Posted: 2011-01-27

You can check the speed to each scsi target with:

  # prtpicl -v | egrep "NAME=|sync-speed" | grep -v spindle

Author: 东方蜘蛛   Posted: 2011-01-27