Aug 18, 2010
by
pauln
The client is needlessly issuing read-before-write requests:
[1282083602:849976 sliricthr01:14541:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=98304 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=0
[1282083602:962496 sliricthr03:14543:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=229376 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083602:983695 sliricthr02:14542:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=360448 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083602:995708 sliricthr18:14558:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=491520 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083602:999686 sliricthr14:14554:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=622592 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:015023 sliricthr00:14540:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=753664 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:021103 sliricthr03:14543:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=884736 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:027943 sliricthr10:14550:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1015808 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:035300 sliricthr09:14549:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1146880 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:042009 sliricthr21:14561:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1277952 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:047650 sliricthr08:14548:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1409024 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:055499 sliricthr21:14561:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1540096 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:061232 sliricthr14:14554:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1671168 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:068275 sliricthr04:14544:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1802240 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:074854 sliricthr08:14548:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=1933312 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:080775 sliricthr17:14557:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2064384 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:088872 sliricthr21:14561:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2195456 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:096729 sliricthr08:14548:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2326528 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:103729 sliricthr21:14561:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2457600 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083603:109248 sliricthr29:14569:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2588672 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083606:738186 sliricthr19:14559:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2719744 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083607:150602 sliricthr17:14557:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=2850816 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083607:185957 sliricthr08:14548:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:3 sz=76546032 :: bmapno=1 size=1048576 off=0 rw=43 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083607:204033 sliricthr21:14561:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:4 sz=76546032 :: bmapno=1 size=1048576 off=1048576 rw=43 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
[1282083607:221164 sliricthr20:14560:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:5 sz=76546032 :: bmapno=1 size=655344 off=2097152 rw=43 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302
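The pattern in the log above is easier to see once the request fields are pulled out and grouped. The following is a minimal sketch, not part of SLASH2: the names `parse_io_line`, `group_by_rw`, and `strides` are made up for illustration, and the sample lines are copied from the log. It assumes only that each line carries `size=`, `off=`, and `rw=` fields. The `rw=42` requests are presumably the reads, since they are the 32 KiB requests that precede the 1 MiB ones.

```python
import re

# Lines copied verbatim from the log above: three rw=42 and three rw=43 requests.
SAMPLE = [
    "[1282083602:849976 sliricthr01:14541:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=98304 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=0",
    "[1282083602:962496 sliricthr03:14543:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=229376 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302",
    "[1282083602:983695 sliricthr02:14542:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:2 sz=76546032 :: bmapno=1 size=32768 off=360448 rw=42 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302",
    "[1282083607:185957 sliricthr08:14548:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:3 sz=76546032 :: bmapno=1 size=1048576 off=0 rw=43 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302",
    "[1282083607:204033 sliricthr21:14561:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:4 sz=76546032 :: bmapno=1 size=1048576 off=1048576 rw=43 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302",
    "[1282083607:221164 sliricthr20:14560:gen:sli_ric_handle_io:162] fcmh@0x71af20 fg:0x080000002d8eb0:0 DA ref:5 sz=76546032 :: bmapno=1 size=655344 off=2097152 rw=43 sbd_seq=4294967302 biod_cur_seqkey[0]=4294967302",
]

# Only the per-request fields are extracted; "sz=" (the file size) is ignored.
FIELD_RE = re.compile(r"\b(bmapno|size|off|rw)=(\d+)")


def parse_io_line(line):
    """Return the request fields of one sli_ric_handle_io line as integers."""
    return {key: int(value) for key, value in FIELD_RE.findall(line)}


def group_by_rw(lines):
    """Map each rw code to its list of (offset, size) pairs, in log order."""
    grouped = {}
    for line in lines:
        fields = parse_io_line(line)
        if {"size", "off", "rw"} <= fields.keys():
            grouped.setdefault(fields["rw"], []).append((fields["off"], fields["size"]))
    return grouped


def strides(extents):
    """Return the distance between consecutive request offsets."""
    offsets = [off for off, _ in extents]
    return [b - a for a, b in zip(offsets, offsets[1:])]


for rw, extents in sorted(group_by_rw(SAMPLE).items()):
    total = sum(size for _, size in extents)
    print(f"rw={rw}: {len(extents)} requests, {total} bytes, strides {strides(extents)}")
```

Feeding the whole log through `group_by_rw` instead of `SAMPLE` gives the same breakdown for every request.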
Aug 17, 2010
by
pauln
- Single-threaded write test size verification from multiple clients (STWT_SZV) - PASS
- Multi-threaded write test size verification from multiple clients - FAIL
Multi-threaded write test size verification from multiple clients
This is NOT working at log level 5.
Note: these tests used the bessemer@PSC I/O backend with two I/O nodes.
group 8peReadWrite {
files_per_dir = 4;
tree_depth = 0;
tree_width = 0;
pes = 4;
test_freq = 0;
block_freq = 0;
path = /s2/pauln;
output_path = /home/pauln/fio/tmp;
filename = largeioc;
file_size = 4g;
block_size = 1m;
thrash_lock = yes;
samedir = yes;
samefile = no;
intersperse = no;
seekoff = no;
fsync_block = no;
verify = yes;
barrier = yes;
time_block = yes;
block_barrier = no;
time_barrier = no;
iterations = 1;
debug_conf = no;
debug_block = no;
debug_memory = no;
debug_buffer = no;
debug_output = no;
debug_dtree = no;
debug_barrier = no;
debug_iofunc = no;
iotests (
WriteEmUp [create:openwr:write:close]
)
}
All blocks were written; each of the four PEs reports its final block, block# 4095:
(pauln@lemon:TGFIO_tests)$ grep "block# 4095" ./largeio.test1.outc
1282068769.436847 PE_00002 do_io() :: bl_wr 0000.090650 MB/s 0011.031429 block# 4095 bwait 00.000000
1282068775.774260 PE_00003 do_io() :: bl_wr 0000.088850 MB/s 0011.254921 block# 4095 bwait 00.000000
1282068776.913274 PE_00001 do_io() :: bl_wr 0000.083602 MB/s 0011.961443 block# 4095 bwait 00.000000
1282068778.415998 PE_00000 do_io() :: bl_wr 0000.075364 MB/s 0013.268957 block# 4095 bwait 00.000000
However, every file should be 4294967296 bytes (4 GiB), and only pe1's file
is. At least both clients agree on the sizes, which points to the MDS or
sliod as the culprit.
Orange:
-rw-r--r-- 1 pauln staff 4215275520 Aug 17 14:07 fio_f.pe0.largeioc.0.0
-rw-r--r-- 1 pauln staff 4294967296 Aug 17 14:07 fio_f.pe1.largeioc.0.0
-rw-r--r-- 1 pauln staff 4202037232 Aug 17 14:07 fio_f.pe2.largeioc.0.0
-rw-r--r-- 1 pauln staff 4215930864 Aug 17 14:07 fio_f.pe3.largeioc.0.0
Lemon:
-rw-r--r-- 1 pauln staff 4215275520 Aug 17 14:07 fio_f.pe0.largeioc.0.0
-rw-r--r-- 1 pauln staff 4294967296 Aug 17 14:07 fio_f.pe1.largeioc.0.0
-rw-r--r-- 1 pauln staff 4202037232 Aug 17 14:07 fio_f.pe2.largeioc.0.0
-rw-r--r-- 1 pauln staff 4215930864 Aug 17 14:07 fio_f.pe3.largeioc.0.0
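As a quick check on the numbers above, the expected size, block count, and per-file shortfall can be worked out from the config values and the listing. This is a minimal sketch: `parse_size` and `shortfalls` are names made up here, and it assumes the `4g` and `1m` suffixes are binary units, which matches the expected size of 4294967296 bytes stated above.

```python
def parse_size(text):
    """Convert a size such as '4g' or '1m' to bytes, assuming binary units."""
    units = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
    text = text.strip().lower()
    if text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


FILE_SIZE = parse_size("4g")    # file_size from the config above
BLOCK_SIZE = parse_size("1m")   # block_size from the config above

# Sizes copied from the ls output above (identical on Orange and Lemon).
OBSERVED = {
    "fio_f.pe0.largeioc.0.0": 4215275520,
    "fio_f.pe1.largeioc.0.0": 4294967296,
    "fio_f.pe2.largeioc.0.0": 4202037232,
    "fio_f.pe3.largeioc.0.0": 4215930864,
}


def shortfalls(observed, expected):
    """Return how many bytes each file falls short of the expected size."""
    return {name: expected - size for name, size in observed.items()}


blocks = FILE_SIZE // BLOCK_SIZE
print(f"expected size {FILE_SIZE} bytes, {blocks} blocks, last block# {blocks - 1}")
for name, missing in shortfalls(OBSERVED, FILE_SIZE).items():
    print(f"{name}: short by {missing} bytes ({missing / BLOCK_SIZE:.3f} blocks)")
```

With a 1 MiB block size, the last block index is 4095, which agrees with the grep for "block# 4095" above.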
Single-threaded write test with size verification from multiple clients
This test is working at log level 5 on clients and servers.
stat(2) results from the writer client and from a third-party client are
both correct, with the third-party client timing out its cached size
attributes after 8 seconds.
group 8peReadWrite {
files_per_dir = 1;
tree_depth = 0;
tree_width = 0;
pes = 1;
test_freq = 0;
block_freq = 0;
path = /s2/pauln;
output_path = /home/pauln/fio/tmp;
filename = largeiob;
file_size = 4g;
block_size = 1m;
thrash_lock = yes;
samedir = yes;
samefile = no;
intersperse = no;
seekoff = no;
fsync_block = no;
verify = yes;
barrier = yes;
time_block = yes;
block_barrier = no;
time_barrier = no;
iterations = 1;
debug_conf = no;
debug_block = no;
debug_memory = no;
debug_buffer = no;
debug_output = no;
debug_dtree = no;
debug_barrier = no;
debug_iofunc = no;
iotests (
WriteEmUp [create:openwr:write:close]
)
}
Orange:
-rw-r--r-- 1 pauln staff 4294967296 Aug 17 13:58 fio_f.pe0.largeiob.0.0
Lemon:
-rw-r--r-- 1 pauln staff 4294967296 Aug 17 13:58 fio_f.pe0.largeiob.0.0
Wow. Been a while since I’ve updated this!
Aug 4, 2010
by
yanovich
Paul gave a Birds of a Feather talk at TeraGrid 2010.
Aug 3, 2010
by
yanovich
The Web site is now available.
Mar 23, 2009
by
pauln
After a few go-rounds with various firewall and security mumbo-jumbo, we
have finally mounted wolverine’s SLASH2 export at WVU.
Here is the df(1) output, followed by a gdb stack trace for the first bug:
(root@castor:mount_slash)# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 68959192 5880592 59575628 9% /
/dev/sda1 101086 20617 75250 22% /boot
tmpfs 1029476 640 1028836 1% /dev/shm
/slashfs_client 478468950 672976 477795975 1% /slashfs_client
Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffe6be8950 (LWP 28034)]
0x0000003e8c232f05 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install fuse-libs-2.7.4-2.fc10.x86_64 glibc-2.9-3.x86_64
(gdb) bt
#0 0x0000003e8c232f05 in raise () from /lib64/libc.so.6
#1 0x0000003e8c234a73 in abort () from /lib64/libc.so.6
#2 0x0000000000456477 in _psclogv (
fn=0x4aef30 "..//..//psc_fsutil_libs/include/psc_util/lock.h",
func=0x4aef25 "_tands", line=203, subsys=5, level=0, options=0,
fmt=0x4aef80 "lock %p has invalid value (%d)", ap=0x7fffe6be78c0)
at ..//..//psc_fsutil_libs/psc_util/log.c:225
#3 0x00000000004566af in _psc_fatal (
fn=0x4aef30 "..//..//psc_fsutil_libs/include/psc_util/lock.h",
func=0x4aef25 "_tands", line=203, subsys=5, level=0, options=0,
fmt=0x4aef80 "lock %p has invalid value (%d)")
at ..//..//psc_fsutil_libs/psc_util/log.c:246
#4 0x0000000000404f9b in _tands (s=0x7ffe58)
at ..//..//psc_fsutil_libs/include/psc_util/lock.h:203
#5 0x0000000000404ebd in spinlock (s=0x7ffe58)
at ..//..//psc_fsutil_libs/include/psc_util/lock.h:212
#6 0x0000000000404e58 in reqlock (sl=0x7ffe58)
at ..//..//psc_fsutil_libs/include/psc_util/lock.h:255
#7 0x00000000004105f2 in slash2fuse_lookup_helper (req=0x7fffe0011460,
parent=13240447, name=0x7fffe43a1038 "fio_f.pe6.8peRW_1mbs.0.35")
at main.c:1099
#8 0x0000000000403ee2 in slash2fuse_listener_loop (arg=0x0)
at fuse_listener.c:260
#9 0x0000003e8ce073da in start_thread () from /lib64/libpthread.so.0
#10 0x0000003e8c2e62bd in clone () from /lib64/libc.so.6
(gdb)