java.net.Socket (via Streams)
With the Socket class, writes use SocketOutputStream. The write() method ends up invoking a JNI method, socketWrite0(). This function has a local stack-allocated buffer of length MAX_BUFFER_LEN (set to 8192 in net_util_md.h). If the array fits in this buffer, it is copied using GetByteArrayRegion(). Finally, the implementation calls NET_Send, a wrapper around the send() system call. This means that every call to write a byte array in Java makes at least one copy that could be avoided in C. Even worse, if the Java byte array is longer than 8192 bytes, the code calls malloc() to allocate a buffer of up to 64 kB, then copies into that buffer.
Conclusion
Don't make calls to write() with arrays larger than 8 kB, since calling malloc() and free() for each write will impact performance.
java.nio.SocketChannel
With the newer NIO package, writes must use ByteBuffers. When writing, the data first ends up at sun.nio.ch.SocketChannelImpl. It acquires some locks, then calls sun.nio.ch.IOUtil.write, which checks the type of ByteBuffer. If it is a heap buffer, a temporary direct ByteBuffer is allocated from a pool and the data is copied using ByteBuffer.put(). The direct ByteBuffer is eventually written by calling sun.nio.ch.FileDispatcherImpl.write0, a JNI method. The Unix implementation finally calls write() with the raw address from the direct ByteBuffer.
Benchmark conclusion
- OutputStream: When writing byte[] arrays larger than 8192 bytes, performance takes a hit. Read/write in chunks ≤ 8192 bytes.
- ByteBuffer: Direct ByteBuffers are faster than heap buffers for filling with bytes and integers. However, array copies are faster with heap ByteBuffers (results not shown here). Allocation and deallocation are apparently more expensive for direct ByteBuffers as well.
- Little endian or big endian: Doesn't matter for byte[], but little endian is faster for putting ints in ByteBuffers on a little endian machine.
- ByteBuffer versus byte[]: ByteBuffers are faster for I/O, but worse for filling with data.
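The OutputStream advice in the list above (keep each write at or below 8192 bytes) can be sketched roughly as follows. The class name ChunkedWriter, the method writeChunked, and the constant MAX_CHUNK are hypothetical names made up for this illustration; the 8192 figure is simply the MAX_BUFFER_LEN value quoted earlier, hard-coded here as an assumption. A ByteArrayOutputStream stands in for the stream you would normally get from socket.getOutputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper, not part of the JDK.
final class ChunkedWriter {
    // Assumed to match the native stack buffer size (MAX_BUFFER_LEN = 8192) described above.
    static final int MAX_CHUNK = 8192;

    /** Writes data to out in slices of at most MAX_CHUNK bytes; returns the number of write() calls made. */
    static int writeChunked(OutputStream out, byte[] data) throws IOException {
        int calls = 0;
        for (int off = 0; off < data.length; off += MAX_CHUNK) {
            int len = Math.min(MAX_CHUNK, data.length - off);
            out.write(data, off, len); // each call stays within the assumed stack buffer size
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[20000];
        // Stand-in for socket.getOutputStream() so the sketch runs without a network peer.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int calls = writeChunked(sink, payload);
        System.out.println(calls + " writes, " + sink.size() + " bytes");
    }
}
```

Whether the extra write() calls cost less than the malloc()/free() pair they avoid probably depends on the JVM version and platform, so it is worth measuring on your own workload.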
Direct ByteBuffers provide very efficient I/O, but getting data into and out of them is more expensive than byte[] arrays. Thus, the fastest choice is going to be application dependent.
If the buffer size is at least 2048 bytes, it is actually faster to fill a byte[] array, copy it into a direct ByteBuffer, and then write that than to write the byte[] array directly. However, for small writes (512 bytes or less), writing the byte[] array using OutputStream is slightly faster.
Generally, using NIO can be a performance win, particularly for large writes. You want to allocate a single direct ByteBuffer, and reuse it for all I/O to and from a particular channel. However, you should serialize and deserialize your data using byte[] arrays, since accessing individual elements from a ByteBuffer is slow.
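As a minimal sketch of that recommendation: the data is serialized into a byte[] array, then bulk-copied into one direct ByteBuffer that is allocated once and reused for every write. The class name DirectBufferWriter and its capacity parameter are hypothetical, introduced only for this example. The demo uses Channels.newChannel() over a ByteArrayOutputStream as a stand-in for a real SocketChannel, and the inner loop assumes a blocking channel.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK.
final class DirectBufferWriter {
    private final WritableByteChannel channel;
    private final ByteBuffer direct; // allocated once, reused for all writes on this channel

    DirectBufferWriter(WritableByteChannel channel, int capacity) {
        this.channel = channel;
        this.direct = ByteBuffer.allocateDirect(capacity);
    }

    /** Copies data into the reused direct buffer, one capacity-sized slice at a time, and writes each slice. */
    void write(byte[] data) throws IOException {
        int off = 0;
        while (off < data.length) {
            int len = Math.min(direct.capacity(), data.length - off);
            direct.clear();
            direct.put(data, off, len); // bulk copy from the byte[] array
            direct.flip();
            while (direct.hasRemaining()) {
                channel.write(direct); // assumes a blocking channel; a write may still be partial
            }
            off += len;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stand-in for a connected SocketChannel so the sketch runs without a network peer.
        DirectBufferWriter writer = new DirectBufferWriter(Channels.newChannel(sink), 4096);
        byte[] message = "hello".getBytes(StandardCharsets.UTF_8); // serialize into a byte[] first
        writer.write(message);
        System.out.println(sink.size() + " bytes written");
    }
}
```

With a non-blocking SocketChannel, the inner loop would spin whenever the socket buffer is full, so a real implementation would register for write readiness with a Selector instead.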