Basics of the N5 API for Java developers. This tutorial shows how to read and write n-dimensional image data and structured metadata into HDF5, N5, and Zarr containers using the N5 API.
hdf5
n5
zarr
imglib2
tutorial
Authors
John Bogovic
Caleb Hulbert
Published
February 27, 2024
Modified
April 23, 2024
This tutorial for Java developers covers the most basic functionality of the N5 API for storing large, chunked n-dimensional image data and structured metadata. The N5 API and documentation refer to n-dimensional images as “datasets”, terminology inherited from HDF5. We will use this terminology in this tutorial. If you are used to working with Python and NumPy, an n-dimensional image or dataset is what you know as an ndarray. We will learn about:
creating readers and writers
modifying and inspecting the hierarchy (“folder structure”)
saving and loading datasets
saving and loading metadata
Readers and writers
N5Readers and N5Writers form the basis of the N5 API and allow you to read and write data, respectively. We generally recommend using an N5Factory to create readers and writers:
Code
// N5Factory can make N5Readers and N5Writers
var factory = new N5Factory();
// trying to open a reader for a container that does not yet exist will throw an error
// var n5Reader = factory.openReader("my-container.n5");
// creating a writer creates a container at the given location
// if it does not already exist
var n5Writer = factory.openWriter("my-container.n5");
// now we can make a reader
var n5Reader = factory.openReader("my-container.n5");
// test if the container exists
n5Reader.exists(""); // true
// "" and "/" both refer to the root of the container
n5Reader.exists("/"); // true
The N5 API gives you access to a number of different storage formats: HDF5, Zarr, and N5’s own format. N5Factory’s convenience methods try to infer the storage format from the extension of the path you provide:
Code
factory.openWriter("my-container.h5").getClass(); // HDF5 Format N5Writer
factory.openWriter("my-container.n5").getClass(); // N5 Format N5Writer
factory.openWriter("my-container.zarr").getClass(); // Zarr Format N5Writer
In fact, it is possible to read with N5Writers since every N5Writer is also an N5Reader, so from now on we’ll just be using the n5Writer.
Try it!
We use the N5 storage format for the rest of the tutorial, but everything works just as well with an HDF5 file or a Zarr container.
Groups
N5 containers form hierarchies of groups - think “nested folders on your file system.” It’s easy to create groups and test if they exist:
Notice that these methods only give information about what groups are present and do not provide information about metadata or datasets.
Note
Some storage or access systems (for example, AWS S3) separate permissions for reading and listing, so it may be possible to read data without being able to list it.
Datasets
N5 stores datasets (n-dimensional arrays) in particular groups in the hierarchy.
Warning
Datasets must be terminal (leaf) nodes in the container hierarchy, i.e. a dataset cannot contain another group or dataset.
We recommend using code from n5-ij or n5-imglib2 to write datasets. The examples in this post will use the latter.
The N5Utils class in n5-imglib2 has many useful methods, but in this post, we’ll cover simple methods for reading and writing. First, N5Utils.save writes a dataset and required metadata to the container at a group that you specify. The group will be created if it does not already exist. The parameters will be discussed in more detail below.
Code
// the parameters
var img = demoImage(64, 64); // the image to write - size 64 x 64
var groupPath = "data";
var blockSize = new int[]{32, 32};
var compression = new GzipCompression();
// save the image
N5Utils.save(img, n5Writer, groupPath, blockSize, compression);
var exec = Executors.newFixedThreadPool(4); // with 4 parallel threads
N5Utils.save(img, n5Writer, groupPath, blockSize, compression, exec);
Reading the dataset from the container is also easy with N5Utils.open:
Code
var loadedImg = N5Utils.open(n5Writer, groupPath);
Util.getTypeFromInterval(loadedImg).getClass(); // FloatType
Arrays.toString(loadedImg.dimensionsAsLongArray()); // [64, 64]
Overwriting data is possible
This save method DOES NOT perform any checks prior to writing data and will overwrite data that exists in the specified location. Be sure to check and take appropriate action if it is possible that data could already be at a particular location and container to avoid data loss or corruption.
This example shows that data can be overwritten:
Code
// overwrite our previous data
var img = ArrayImgs.unsignedBytes(2, 2);
N5Utils.save(img, n5Writer, groupPath, blockSize, compression);
// load the new data, the old data are no longer accessible
var loadedImg = N5Utils.open(n5Writer, groupPath);
Arrays.toString(loadedImg.dimensionsAsLongArray()); // [2, 2]
Parameter details
groupPath
is the location inside the container that will store the dataset. You can store a dataset at the root of a container by specifying "" or "/" as the groupPath. In this case, the container will only be able to store one dataset (see the warning above).
blockSize
is a very important parameter. HDF5, N5, and Zarr all break up the datasets they store into equally sized blocks or “chunks”. The block size parameter specifies the size of these blocks.
For the example above, we stored an image of size 64 x 64 using blocks sized 32 x 32. As a result, N5 uses four blocks to store the entire image:
Code
printBlocks("my-container.n5/data");
my-container.n5/data/1/1 is 1762 bytes
my-container.n5/data/1/0 is 2012 bytes
my-container.n5/data/0/1 is 1763 bytes
my-container.n5/data/0/0 is 2020 bytes
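The block file names above encode each block's position in the block grid. As a rough sketch, the block that holds a given pixel can be found by dividing the pixel coordinate by the block size in each dimension. The class and method names here are hypothetical, and the path layout (first dimension first) is an assumption about the N5 storage format on a file system; other formats may arrange their chunks differently.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of the N5 API.
final class BlockLookup {

    /** Grid position of the block containing a pixel: floor(pixel / blockSize) per dimension. */
    static long[] gridPosition(long[] pixel, int[] blockSize) {
        long[] grid = new long[pixel.length];
        for (int d = 0; d < pixel.length; d++) {
            grid[d] = Math.floorDiv(pixel[d], (long) blockSize[d]);
        }
        return grid;
    }

    /** Assumed N5 file system layout: dataset path followed by one path element per dimension. */
    static String blockPath(String datasetPath, long[] grid) {
        StringJoiner joiner = new StringJoiner("/");
        joiner.add(datasetPath);
        for (long g : grid) {
            joiner.add(Long.toString(g));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // pixel (40, 10) in an image stored with 32 x 32 blocks
        long[] grid = gridPosition(new long[]{40, 10}, new int[]{32, 32});
        System.out.println(blockPath("my-container.n5/data", grid)); // my-container.n5/data/1/0
    }
}
```

Under these assumptions, reading a small region only requires the blocks whose grid positions overlap that region, which is why the block size matters for access patterns.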
Quiz: How many blocks would there be if the block size was 64 x 8?
Answer:
There would be eight blocks.
One block covers the first dimension, but it takes 8 blocks to cover the second dimension (\(8 \times 8 = 64\)). The code below demonstrates this:
Code
// remove the old data
n5Writer.remove(groupPath);
// rewrite with a different block size
var blockSize = new int[]{64, 8};
N5Utils.save(img, n5Writer, groupPath, blockSize, compression);
// how many blocks are there?
printBlocks("my-container.n5/data");
my-container.n5/data/0/1 is 837 bytes
my-container.n5/data/0/7 is 847 bytes
my-container.n5/data/0/3 is 839 bytes
my-container.n5/data/0/6 is 844 bytes
my-container.n5/data/0/0 is 968 bytes
my-container.n5/data/0/4 is 846 bytes
my-container.n5/data/0/2 is 840 bytes
my-container.n5/data/0/5 is 847 bytes
Try it!
N5 lets you store your image in a single file if you want - just provide a block size that is equal to or larger than the image size.
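The block counts in this section follow from simple arithmetic: the number of blocks along each dimension is the image size divided by the block size, rounded up, and the total is the product over all dimensions. The sketch below uses a hypothetical helper (not part of the N5 API) to reproduce the three cases discussed above.

```java
// Hypothetical helper, not part of the N5 API.
final class BlockCount {

    /** Total number of blocks needed to cover an image of the given dimensions. */
    static long countBlocks(long[] dimensions, int[] blockSize) {
        long total = 1;
        for (int d = 0; d < dimensions.length; d++) {
            // ceiling division: a partially filled block still counts as a block
            long blocksAlongDim = (dimensions[d] + blockSize[d] - 1) / blockSize[d];
            total *= blocksAlongDim;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] dims = {64, 64};
        System.out.println(countBlocks(dims, new int[]{32, 32}));   // 4
        System.out.println(countBlocks(dims, new int[]{64, 8}));    // 8
        System.out.println(countBlocks(dims, new int[]{128, 128})); // 1
    }
}
```

The last case corresponds to the tip above: a block size equal to or larger than the image yields a single block.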
compression
Each block is compressed independently, using the specified compression. Use RawCompression to store blocks without compression.
Code
// rewrite without compression
var groupPath = "dataNoCompression";
var blockSize = new int[]{32, 32};
var compression = new RawCompression();
N5Utils.save(img, n5Writer, groupPath, blockSize, compression);
// what size are the blocks?
Code
printBlocks("my-container.n5/dataNoCompression");
my-container.n5/dataNoCompression/1/1 is 4108 bytes
my-container.n5/dataNoCompression/1/0 is 4108 bytes
my-container.n5/dataNoCompression/0/1 is 4108 bytes
my-container.n5/dataNoCompression/0/0 is 4108 bytes
Notice that blocks were previously ~1700-2000 bytes and are now ~4100 without compression.
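The uncompressed size of 4108 bytes can be accounted for exactly. The sketch below assumes the block header layout described in the N5 format specification for default-mode blocks (a 2-byte mode, a 2-byte dimension count, and 4 bytes per dimension) and 4-byte float pixels, as reported for this image earlier. The helper name is hypothetical.

```java
// Hypothetical helper, not part of the N5 API.
final class RawBlockSize {

    /**
     * Expected size in bytes of an uncompressed block, assuming the
     * default-mode N5 block header: mode (2) + nDims (2) + 4 bytes per dimension.
     */
    static long rawBlockBytes(int[] blockSize, int bytesPerElement) {
        long headerBytes = 2 + 2 + 4L * blockSize.length;
        long elements = 1;
        for (int s : blockSize) {
            elements *= s;
        }
        return headerBytes + elements * bytesPerElement;
    }

    public static void main(String[] args) {
        // 32 x 32 block of 4-byte floats: 12-byte header + 4096 bytes of data
        System.out.println(rawBlockBytes(new int[]{32, 32}, 4)); // 4108
    }
}
```

Comparing this figure with the gzip-compressed sizes gives a quick estimate of the compression ratio for a given dataset.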
The available compression options at the time of this writing are:
N5 can also store rich structured metadata in addition to array data. This tutorial will discuss basic, low-level metadata operations. Advanced operations and metadata standards may be described in a future tutorial.
Basics
N5Writers have a setAttribute method for writing metadata to the storage backend. It takes three arguments:
<T> void setAttribute(String groupPath, String attributePath, T attribute)
groupPath : the group in which to store this metadata
attributePath : the name of this attribute
attribute : the metadata attribute to be stored. Can be an arbitrary type (denoted T).
Note
There are differences between an attribute “name” and an attribute “path”, but attribute “paths” are an advanced topic and will be covered elsewhere.
<T> T getAttribute(String groupPath, String attributePath, Class<T> clazz)
The last argument (Class<T>) lets you specify the type that getAttribute should return. An N5Exception will be thrown if the requested type cannot be created from the requested attribute. If an attribute does not exist, null will be returned (see the last example of this section). Consider these examples:
Code
// create a group inside the container (think: "folder")
var groupName = "put-data-in-me";
n5Writer.createGroup(groupName);
// attributes have names and values
// make an attribute called "date" with a String value
var attributeName = "date";
n5Writer.setAttribute(groupName, attributeName, "2024-Jan-01");
// Ask the N5 API to make a double array from the date attribute
// it will try and fail, so an exception will be thrown
try {
    var nothing = n5Writer.getAttribute(groupName, attributeName, double[].class);
} catch (N5Exception e) {
    System.out.println("Error: could not get attribute as double[]");
}
// get the value of the "date" attribute as a String
String date = n5Writer.getAttribute(groupName, attributeName, String.class);
date
Error: could not get attribute as double[]
2024-Jan-01
Sometimes it is possible to interpret an attribute as multiple different types:
Code
n5Writer.setAttribute(groupName, "a", 42);
var num = n5Writer.getAttribute(groupName, "a", double.class); // 42.0
var str = n5Writer.getAttribute(groupName, "a", String.class); // "42"
Rich metadata
It is possible to save attributes of arbitrary types, enabling you to structure your metadata into classes that are easy to save and load directly. For example, if we define a metadata class FunWithMetadata:
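A minimal sketch of such a class is given here. The field names (name, count, values) are hypothetical, chosen only to match the constructor call and the printed output that follow; the actual class may differ.

```java
import java.util.Arrays;

// Hypothetical metadata class: the field names are assumptions,
// chosen to match the constructor call and printed output in this section.
class FunWithMetadata {
    String name;
    int count;
    double[] values;

    FunWithMetadata(String name, int count, double[] values) {
        this.name = name;
        this.count = count;
        this.values = values;
    }

    @Override
    public String toString() {
        // e.g. FunWithMetadata{Dorothy(2): [2.72, 3.14]}
        return String.format("FunWithMetadata{%s(%d): %s}", name, count, Arrays.toString(values));
    }
}
```

A class made of plain fields like these is a reasonable starting point for metadata that should be saved and loaded as a single attribute.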
var metadata = new FunWithMetadata("Dorothy", 2, new double[]{2.72, 3.14});
n5Writer.setAttribute(groupName, "metadata", metadata);
// get attribute as an instance of FunWithMetadata
n5Writer.getAttribute(groupName, "metadata", FunWithMetadata.class);
FunWithMetadata{Dorothy(2): [2.72, 3.14]}
To retrieve all the metadata in a group as JSON:
Code
// get attribute as an instance of JsonElement
n5Writer.getAttribute(groupName, "/", JsonElement.class);
You can remove attributes by their name as well. To return the element that was removed, just provide the class for that element (this mirrors the remove method for Lists in Java).
Code
// set attributes
n5Writer.setAttribute(groupName, "sender", "Alice");
n5Writer.setAttribute(groupName, "receiver", "Bob");
// notice that they're set
n5Writer.getAttribute(groupName, "sender", String.class); // Alice
n5Writer.getAttribute(groupName, "receiver", String.class); // Bob
// remove "sender"
n5Writer.removeAttribute(groupName, "sender");
// remove "receiver" and store result in a variable
var receiver = n5Writer.removeAttribute(groupName, "receiver", String.class); // Bob
n5Writer.getAttribute(groupName, "sender", String.class); // null
n5Writer.getAttribute(groupName, "receiver", String.class); // null
Working with Dataset Metadata
Metadata that describes datasets can be read and written in the same way as any other metadata. However, there are special DatasetAttributes methods for working with dataset metadata safely. N5Reader.getDatasetAttributes and N5Writer.setDatasetAttributes ensure the metadata is always a valid representation of dataset metadata. DatasetAttributes should, however, only be set when the dataset is initially saved. This ensures the required metadata stays tightly coupled with the data. For example, dataset metadata should be set through the N5Writer.createDataset methods (or indirectly through the N5Utils.save methods mentioned above).
Code
var arrayMetadata = n5Writer.getDatasetAttributes("data");
arrayMetadata.getDimensions();
arrayMetadata.getBlockSize();
arrayMetadata.getDataType();
arrayMetadata.getCompression();
Warning
The attributes that N5 uses to read datasets can be set with setAttribute, and modifying them could corrupt your data. Do not manually set these attributes unless you absolutely know what you’re doing!
dimensions
blockSize
dataType
compression
The attributes that describe datasets are also accessible using getAttribute. Try running: