Tuesday, July 21, 2015

Java NIO FileChannel

Creating FileChannel instance

java.nio.channels.FileChannel offers an alternative way of transferring file data. In traditional I/O, file data is read via java.io.FileInputStream and written via java.io.FileOutputStream. A FileChannel instance can both read data from and write data to a file. However, not every FileChannel instance supports both operations: the way an instance is constructed determines whether it can read, write, or both.

Create a FileChannel instance from FileInputStream instance

A FileChannel instance that is created via the getChannel() method of a java.io.FileInputStream instance can only read data from a file into a buffer. An attempt to write data into this channel will throw java.nio.channels.NonWritableChannelException.

FileInputStream is = new FileInputStream("D:\\test.txt");
FileChannel channel = is.getChannel();
channel.read(ByteBuffer.allocate(10)); //OK
channel.write(ByteBuffer.allocate(10)); //java.nio.channels.NonWritableChannelException

Create a FileChannel instance from FileOutputStream instance

A FileChannel instance that is created via the getChannel() method of a java.io.FileOutputStream instance can only write data from a buffer into a file. An attempt to read data from this channel will throw java.nio.channels.NonReadableChannelException.

FileOutputStream os = new FileOutputStream("D:\\test.txt");
FileChannel channel = os.getChannel();
channel.write(ByteBuffer.allocate(10)); //OK
channel.read(ByteBuffer.allocate(10)); //java.nio.channels.NonReadableChannelException
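
The two stream-based cases above can be checked side by side. The sketch below is only an illustration: the class name StreamChannelCapabilityDemo and the temporary files are assumptions made for the example (the snippets above use D:\test.txt), and try-with-resources is used so that the streams and channels get closed.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonReadableChannelException;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original article.
final class StreamChannelCapabilityDemo {

    // Returns true when a channel taken from a FileInputStream refuses to write.
    static boolean inputChannelRejectsWrite() {
        try {
            Path file = Files.createTempFile("channel-demo", ".txt"); // assumed temp file
            try (FileInputStream in = new FileInputStream(file.toFile());
                 FileChannel channel = in.getChannel()) {
                channel.write(ByteBuffer.allocate(10));
                return false; // not expected for a read-only channel
            } catch (NonWritableChannelException expected) {
                return true;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true when a channel taken from a FileOutputStream refuses to read.
    static boolean outputChannelRejectsRead() {
        try {
            Path file = Files.createTempFile("channel-demo", ".txt"); // assumed temp file
            try (FileOutputStream out = new FileOutputStream(file.toFile());
                 FileChannel channel = out.getChannel()) {
                channel.read(ByteBuffer.allocate(10));
                return false; // not expected for a write-only channel
            } catch (NonReadableChannelException expected) {
                return true;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Input channel rejects write: " + inputChannelRejectsWrite());
        System.out.println("Output channel rejects read: " + outputChannelRejectsRead());
    }
}
```

Closing the channel also closes the stream it came from, so listing both in the try-with-resources header is harmless.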

Create a FileChannel instance from RandomAccessFile instance

Creating a java.io.RandomAccessFile instance requires a mode argument. A FileChannel instance that is created via the getChannel() method of a RandomAccessFile opened in "r" mode can only read data from a file into a buffer. An attempt to write data into this channel will throw java.nio.channels.NonWritableChannelException. A RandomAccessFile opened in "rw" mode yields a channel that supports both reading and writing.

RandomAccessFile raf = new RandomAccessFile("D:\\test1.txt", "r");
FileChannel channel = raf.getChannel();
channel.read(ByteBuffer.allocate(10)); //OK
channel.write(ByteBuffer.allocate(10)); //java.nio.channels.NonWritableChannelException
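
For comparison, here is a minimal sketch of the "rw" mode, where the same channel is used for both writing and reading. The class name RandomAccessChannelDemo, the method name and the temporary file are assumptions made for this example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of the original article.
final class RandomAccessChannelDemo {

    // Writes the text through an "rw" channel, rewinds, and reads it back.
    static String writeThenReadBack(String text) {
        try {
            Path file = Files.createTempFile("raf-demo", ".txt"); // assumed temp file
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
                 FileChannel channel = raf.getChannel()) {
                byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

                ByteBuffer out = ByteBuffer.wrap(bytes);
                while (out.hasRemaining()) {
                    channel.write(out); // a single write may not drain the buffer
                }

                channel.position(0); // go back to the start before reading
                ByteBuffer in = ByteBuffer.allocate(bytes.length);
                while (in.hasRemaining() && channel.read(in) != -1) {
                    // keep reading until the buffer is full or end-of-file is hit
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenReadBack("abcdefg"));
    }
}
```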

Create a FileChannel instance by using Paths

We can also open a file channel from a java.nio.file.Path. The OpenOption values passed to FileChannel.open() determine what the channel is allowed to do. If neither the APPEND nor the WRITE option is specified, then an attempt to write data into this channel will throw java.nio.channels.NonWritableChannelException.

Path path = Paths.get("D:\\test.txt");
EnumSet<StandardOpenOption> options
   = EnumSet.of(StandardOpenOption.READ); // read-only: neither WRITE nor APPEND is given
FileChannel channel = FileChannel.open(path, options);
channel.read(ByteBuffer.allocate(10)); //OK
channel.write(ByteBuffer.allocate(10)); //java.nio.channels.NonWritableChannelException
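
To see how the chosen options affect writability, the sketch below opens the same file three times with different options and tries a one-byte write each time. The class name OpenOptionDemo, the helper canWrite and the temporary file are assumptions for illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original article.
final class OpenOptionDemo {

    // Opens the file with the given options and reports whether a write is accepted.
    static boolean canWrite(Path file, OpenOption... options) {
        try (FileChannel channel = FileChannel.open(file, options)) {
            channel.write(ByteBuffer.wrap(new byte[] {(byte) 'x'}));
            return true;
        } catch (NonWritableChannelException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the results for READ only, READ + WRITE, and APPEND only, in that order.
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("open-option-demo", ".txt"); // assumed temp file
            try {
                return new boolean[] {
                    canWrite(file, StandardOpenOption.READ),
                    canWrite(file, StandardOpenOption.READ, StandardOpenOption.WRITE),
                    canWrite(file, StandardOpenOption.APPEND)
                };
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] results = demo();
        System.out.println("READ only:    " + results[0]);
        System.out.println("READ + WRITE: " + results[1]);
        System.out.println("APPEND only:  " + results[2]);
    }
}
```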

What can be done by FileChannel

FileChannel is far more powerful than a traditional I/O stream. It implements a number of interfaces that extend its capabilities well beyond reading and writing file data.

FileChannel is a SeekableByteChannel

Querying file size

Given the file D:\test.txt with the content below:

abcdefg

We can query the file size with the code below.

RandomAccessFile file = new RandomAccessFile("D:\\test.txt", "rw");
SeekableByteChannel fileChannel = file.getChannel();

long size = fileChannel.size(); // size in long data type.
System.out.println("Query file size: " + size); // Query file size: 7

This size is commonly used as the capacity of the ByteBuffer, which works fine as long as the file is small.

    // The cast is only safe while size <= Integer.MAX_VALUE; a larger value overflows.
    ByteBuffer buffer = ByteBuffer.allocate((int) size);
    fileChannel.read(buffer);
    printBuffer(buffer);

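private static void printBuffer(ByteBuffer buffer) {
    buffer.flip(); // switch the buffer from filling mode to draining mode
    while (buffer.hasRemaining()) {
        // Treat each byte as one character; only valid for single-byte text such as ASCII.
        System.out.print((char) (buffer.get() & 0xFF) + " ");
    }
    System.out.println();
}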

a b c d e f g
When the file is big, using a single ByteBuffer to hold the whole file content will not work. The file size is a long, while the capacity of a byte buffer is an int, so a file larger than Integer.MAX_VALUE bytes cannot fit in one buffer. To avoid this problem, we can allocate a fixed-size byte buffer and use it repeatedly to read the whole file content.

    ByteBuffer buffer = ByteBuffer.allocate(3);
    while (fileChannel.read(buffer) > 0) {
        printBuffer(buffer);
        buffer.clear();
    }

a b c
d e f
g
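
The same loop can be packaged as a small, self-contained helper. This is only a sketch: the names ChunkedReadDemo, readChunks and demo are assumptions, the temporary file stands in for D:\test.txt, and each chunk is decoded as US-ASCII, which is only valid for single-byte text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original article.
final class ChunkedReadDemo {

    // Reads the file through one reusable buffer and returns each chunk as text.
    static List<String> readChunks(Path file, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(chunkSize);
            while (channel.read(buffer) != -1) { // -1 signals end-of-file
                buffer.flip();                   // prepare the buffer for draining
                chunks.add(StandardCharsets.US_ASCII.decode(buffer).toString());
                buffer.clear();                  // make the buffer ready for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunks;
    }

    // Creates a temp file holding "abcdefg" and reads it back in 3-byte chunks.
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("chunk-demo", ".txt"); // assumed temp file
            try {
                Files.write(file, "abcdefg".getBytes(StandardCharsets.US_ASCII));
                return readChunks(file, 3);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The loop compares the result of read() with -1 rather than 0, which is a slightly different but equivalent stop condition here because the buffer always has room when read() is called.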

Querying and modifying current position

FileChannel also maintains a moving index called the current position, which indicates the next byte in the file to be read or written. This "current position" can be queried and modified.

    ByteBuffer buffer = ByteBuffer.allocate(3);
    while (fileChannel.read(buffer) > 0) {
        System.out.println("Query current position: " + fileChannel.position());
        printBuffer(buffer);
        buffer.clear();
    }

Query current position: 3
a b c
Query current position: 6
d e f
Query current position: 7
g
The code below moves the "current position" to a position that is less than the file size. Data is therefore read starting from that position.

    ByteBuffer buffer = ByteBuffer.allocate(3);
    fileChannel.position(2);
    while (fileChannel.read(buffer) > 0) {
        System.out.println("Query current position: " + fileChannel.position());
        printBuffer(buffer);
        buffer.clear();
    }

Query current position: 5
c d e
Query current position: 7
f g
If we set the "current position" to a position that is greater than the file size and try to read the file, then nothing is read into the buffer and the read() operation returns -1, which indicates end-of-file.

    ByteBuffer buffer = ByteBuffer.allocate(3);
    fileChannel.position(10); // new position greater than file size.
    System.out.println(fileChannel.read(buffer)); // read() reports end-of-file

-1
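
The effect of the position on a single read can be condensed into one helper. The sketch below assumes a 7-byte file containing abcdefg, as in the examples above; the class name PositionDemo, the method bytesReadFrom and the temporary file are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original article.
final class PositionDemo {

    // Moves the channel to the given position in a 7-byte file and performs
    // one read into a 3-byte buffer. Returns what read() returned.
    static int bytesReadFrom(long position) {
        try {
            Path file = Files.createTempFile("position-demo", ".txt"); // assumed temp file
            try {
                Files.write(file, "abcdefg".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    channel.position(position);
                    return channel.read(ByteBuffer.allocate(3));
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("From position 2:  " + bytesReadFrom(2));
        System.out.println("From position 10: " + bytesReadFrom(10));
    }
}
```

Note that setting a position beyond the end of the file is not an error in itself; only the following read reports end-of-file.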

If we set the "current position" to a position that is greater than the file size and then write to the file, the file grows to accommodate the new content. The values of the bytes between the previous end-of-file and the newly written bytes are unspecified.

    System.out.println("Current file size: " + fileChannel.size());
    fileChannel.position(10); // new position greater than file size.
    fileChannel.write(ByteBuffer.wrap("klmn".getBytes("UTF-8")));
    System.out.println("New file size: " + fileChannel.size());

Current file size: 7
New file size: 14
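
The new size is simply the write position plus the number of bytes written, unless the write stays inside the existing content. A small sketch, again assuming a 7-byte file containing abcdefg; the class name WritePastEndDemo, the method sizeAfterWriteAt and the temporary file are assumptions for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original article.
final class WritePastEndDemo {

    // Writes the text at the given position of a 7-byte file and returns the new size.
    static long sizeAfterWriteAt(long position, String text) {
        try {
            Path file = Files.createTempFile("write-demo", ".txt"); // assumed temp file
            try {
                Files.write(file, "abcdefg".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel channel = FileChannel.open(
                        file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                    channel.position(position);
                    ByteBuffer data = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                    while (data.hasRemaining()) {
                        channel.write(data); // a single write may not drain the buffer
                    }
                    return channel.size();
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Write at 10: " + sizeAfterWriteAt(10, "klmn")); // beyond the end
        System.out.println("Write at 2:  " + sizeAfterWriteAt(2, "klmn"));  // overwrite in place
    }
}
```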

Truncating file

We can truncate a file by passing the new size to the truncate() method of the FileChannel that is connected to the file. Any bytes beyond the new size are removed.

    System.out.println("Current file size: " + fileChannel.size());
    fileChannel.truncate(5);
    System.out.println("New file size: " + fileChannel.size());

    ByteBuffer buffer = ByteBuffer.allocate((int) fileChannel.size());
    fileChannel.read(buffer);
    printBuffer(buffer);

Current file size: 7
New file size: 5
a b c d e
What happens if the "current position" is greater than the given size? After truncation, the "current position" is set to the given size, which is the index just past the last byte of the truncated file.

    System.out.println("Current file size: " + fileChannel.size());
    fileChannel.position(6);

    // position is greater than new size
    System.out.println("Current position: " + fileChannel.position());
    fileChannel.truncate(5);
    System.out.println("New file size: " + fileChannel.size());

    // position is set to the new size
    System.out.println("New position: " + fileChannel.position()); 

Current file size: 7
Current position: 6
New file size: 5
New position: 5
If the given size is equal to or greater than the current file size, then nothing is truncated.

    System.out.println("Current file size: " + fileChannel.size());
    fileChannel.truncate(fileChannel.size());
    System.out.println("New file size: " + fileChannel.size());

Current file size: 7
New file size: 7
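
The three truncation cases above can be summarised with one helper that reports the size and the position after truncate(). This is a sketch under the same assumption of a 7-byte file; the class name TruncateDemo, the method truncate and the temporary file are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical demo class; not part of the original article.
final class TruncateDemo {

    // Sets the position, truncates a 7-byte file to newSize,
    // and returns {size after truncation, position after truncation}.
    static long[] truncate(long startPosition, long newSize) {
        try {
            Path file = Files.createTempFile("truncate-demo", ".txt"); // assumed temp file
            try {
                Files.write(file, "abcdefg".getBytes(StandardCharsets.US_ASCII));
                // truncate() needs a writable channel
                try (FileChannel channel = FileChannel.open(
                        file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                    channel.position(startPosition);
                    channel.truncate(newSize);
                    return new long[] {channel.size(), channel.position()};
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(truncate(2, 5))); // position below the new size
        System.out.println(Arrays.toString(truncate(6, 5))); // position above the new size
        System.out.println(Arrays.toString(truncate(6, 7))); // new size equals the file size
    }
}
```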

FileChannel is a GatheringByteChannel and ScatteringByteChannel

FileChannel is able to read/write a sequence of bytes from/to one or more byte buffers in a single invocation. This is useful when we would like to treat the buffers as different segments of the byte sequence and process them differently right after a read/write operation. Bear in mind that FileChannel does not know how the byte sequence should be segmented. It simply accepts whatever buffers we pass to it and performs the read/write operation on them in order, starting from the first one. When the first buffer is completely drained (on write) or filled (on read), it moves on to the next one, and so on.

In other words, it is our responsibility to determine the data segmentation, so we need to know the exact format of the data that we are transferring. A format normally starts with a fixed-size segment followed by a variable-length segment. One good example is data transferred over a network, where a message consists of a header and a body that are processed differently.

Gathering data segments into channel

    RandomAccessFile file = new RandomAccessFile("D:\\test1.txt", "rw");
    GatheringByteChannel fileChannel = file.getChannel();

    ByteBuffer header = ByteBuffer.wrap("header".getBytes());
    ByteBuffer body = ByteBuffer.wrap("body".getBytes());
    fileChannel.write(new ByteBuffer[]{header, body});

Scattering data segments from channel

    RandomAccessFile file = new RandomAccessFile("D:\\test1.txt", "rw");
    ScatteringByteChannel fileChannel = file.getChannel();

    ByteBuffer header = ByteBuffer.allocate(6);
    ByteBuffer body = ByteBuffer.allocate(10);
    fileChannel.read(new ByteBuffer[]{header, body});

    System.out.println("Print header...");
    printBuffer(header); //header

    System.out.println("\nPrint body...");
    printBuffer(body); //body
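
The "fixed-size header followed by a variable-length body" idea can be sketched as a round trip through one file. The format used here (a 4-byte length header followed by a UTF-8 body of at most 64 bytes) is an assumption invented for this example, as are the class name ScatterGatherDemo, the constant MAX_BODY and the temporary file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original article.
final class ScatterGatherDemo {

    static final int MAX_BODY = 64; // assumed upper bound for the body, in bytes

    // Gathers {length header, body} into a file, then scatters them back out.
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        if (payload.length > MAX_BODY) {
            throw new IllegalArgumentException("body is longer than " + MAX_BODY + " bytes");
        }
        try {
            Path file = Files.createTempFile("scatter-gather-demo", ".bin"); // assumed temp file
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Gathering write: fixed-size header first, variable-length body second.
                ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
                header.putInt(payload.length);
                header.flip();
                ByteBuffer body = ByteBuffer.wrap(payload);
                ByteBuffer[] outgoing = {header, body};
                while (header.hasRemaining() || body.hasRemaining()) {
                    channel.write(outgoing);
                }

                // Scattering read: the header buffer is filled before the body buffer.
                channel.position(0);
                ByteBuffer inHeader = ByteBuffer.allocate(Integer.BYTES);
                ByteBuffer inBody = ByteBuffer.allocate(MAX_BODY);
                ByteBuffer[] incoming = {inHeader, inBody};
                while (inBody.hasRemaining() && channel.read(incoming) != -1) {
                    // keep reading until end-of-file or until the buffers are full
                }

                inHeader.flip();
                int length = inHeader.getInt(); // the header tells us how long the body is
                inBody.flip();
                byte[] received = new byte[length];
                inBody.get(received);
                return new String(received, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("body"));
    }
}
```

The channel itself never interprets the header; it is our code that reads the length from the first buffer and uses it to slice the second one.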

Transferring data between file channels

FileChannel provides an API that allows us to transfer data from one file to another directly, without involving any intermediate buffer in our code.

    RandomAccessFile doc1 = new RandomAccessFile("D:\\doc1.txt", "rw"); // contains abc
    RandomAccessFile doc2 = new RandomAccessFile("D:\\doc2.txt", "rw"); // contains xyz

    FileChannel doc1Channel = doc1.getChannel();
    FileChannel doc2Channel = doc2.getChannel();

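    // Reads up to 3 bytes from doc2's current position (0) into doc1 at position 3.
    doc1Channel.transferFrom(doc2Channel, 3, 3); // doc1 now contains abcxyz
    // Writes bytes 0-2 of doc1 at doc2's current position, which is now 3.
    doc1Channel.transferTo(0, 3, doc2Channel); // doc2 now contains xyzabc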
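
A common use of transferTo() is copying a whole file. Since a single call may transfer fewer bytes than requested, the sketch below calls it in a loop. The class name TransferDemo, the methods copy and demo, and the temporary files are assumptions for this example, not something taken from the snippet above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original article.
final class TransferDemo {

    // Copies source to target with transferTo() and returns the number of bytes copied.
    static long copy(Path source, Path target) {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // nothing more could be transferred; avoid spinning forever
                }
                position += moved;
            }
            return position;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the content to a temp file, copies it, and returns what the copy contains.
    static String demo(String content) {
        try {
            Path source = Files.createTempFile("transfer-src", ".txt"); // assumed temp files
            Path target = Files.createTempFile("transfer-dst", ".txt");
            try {
                Files.write(source, content.getBytes(StandardCharsets.UTF_8));
                copy(source, target);
                return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("abcxyz"));
    }
}
```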


References:
http://docs.oracle.com/javase/7/docs/api/java/nio/channels/FileChannel.html
