The dataset size is about 120 MB, so the transfer rate works out to roughly 0.66 MB/s. That is very slow, and I can assure you the disk I/O and CPU are capable of much better. It is also running on localhost, so there should be minimal network delay for the ACK packets. bmon shows the lo interface averaging 650 KiB/s. The PDU size is set to the maximum; I also tried the default 16K.
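For reference, the implied transfer time follows directly from the numbers above:

```shell
# Rough sanity check: 120 MB at ~0.66 MB/s is about three minutes
awk 'BEGIN { size_mb = 120; rate_mb_s = 0.66; printf "%.0f seconds\n", size_mb / rate_mb_s }'
# → 182 seconds
```

That is orders of magnitude slower than what a localhost transfer between SSD-backed processes should take.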
I tried versions 3.6.0 (2011-01-06) and 3.6.1 (2012-11-02). The 3.6.0 is from the Ubuntu apt repository, and I compiled 3.6.1 myself. Neither should be a debug binary.
Any advice on what I'm doing wrong? Can anyone confirm the same, better, or worse performance on their system?
I cannot confirm this bad performance. I just transferred 131 DICOM images with a total size of 1.2 GB in ~3 seconds. Without storing the received datasets as files (storescp option --ignore), it took ~2 seconds.
I have the same performance problem. On the same network, transferring DICOM images to/from an OsiriX station is more than twice as fast as transferring to/from another machine that uses our software (and DCMTK). Is there any tweaking that can be done to accelerate image transfers?
First of all, you should make sure that the hostname lookup does not cause timeout issues. The various SCP tools in the DCMTK have a -dhl option for this purpose. Also, playing around with the PDU size might help. However, I have no idea why OsiriX should be twice as fast (since it is probably also using DCMTK for networking).
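To make those suggestions concrete, a receiver/sender pair with hostname lookup disabled and a larger PDU might be started along these lines (the AE defaults are kept; the port and the file paths are placeholders):

```shell
# receiver: disable reverse DNS lookup (-dhl), raise the PDU size, and
# discard incoming datasets (--ignore) for a pure network measurement
storescp -dhl --max-pdu 131072 --ignore 11112

# sender: use a matching PDU size; DATADIR is a placeholder directory of DICOM files
storescu --max-pdu 131072 localhost 11112 DATADIR/*.dcm
```

With --ignore on the receiver, any remaining slowness points at the network or protocol layer rather than at disk writes.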
Ok, I again performed the tests with 131 files (see above). The first call is slower but still takes around 9 seconds for reading via network (1 GBit/s NAS with hard disks), transferring to localhost and storing as files (on an SSD). When I repeat the same call, I get something around 5 seconds. The third call is below 5 seconds. Using --ignore (i.e., not storing files on the receiver side) and having the files (apparently) in the cache, I get something between 2 and 3 seconds.
Don't ask me why the results were different this morning.
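The warm-up effect described above can be reproduced by timing the same call repeatedly (this assumes a storescp is already listening; the port and path are placeholders):

```shell
# repeat the identical transfer three times; later runs benefit from the
# filesystem page cache on the sending side, so they should get faster
for i in 1 2 3; do
  time storescu --max-pdu 131072 localhost 11112 DATADIR/*.dcm
done
```

If the second and third runs are clearly faster, the first measurement was dominated by reading the files from the NAS rather than by the DICOM transfer itself.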
I tried it on Windows 7 and it does indeed complete on the order of 10 seconds (I don't have the time utility on Windows). Very bizarre. Same arguments. May I ask what system you were testing on?
Here's some more testing I did:
I converted the dataset to Explicit VR Little Endian
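For anyone repeating this, the conversion can be done with DCMTK's dcmconv tool (the file names here are placeholders):

```shell
# rewrite a dataset using the Explicit VR Little Endian transfer syntax
dcmconv +te input.dcm output_explicit_le.dcm
```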
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Stepping: 9
CPU MHz: 1200.000
BogoMIPS: 5187.86
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
As I said, the local drive is a fast SSD (used for saving the received files) and the network drive is a conventional hard disk in a NAS connected at 1 GBit/s (used for reading the files to be transmitted).