Convert a number of array shards to a single array store
In the previous notebooks, we’ve seen how to incrementally create a collection and train models on it.
Once we have a collection of validated array shards, we might want to concatenate them into one big array store.
This is what the CELLxGENE team does to create Census: a large number of .h5ad
files are concatenated into a single TileDB-SOMA array store.
This requires duplicating the data that's already present in the collection of .h5ad
files, but it speeds up ad-hoc queries for slices by arbitrary metadata.
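To make the trade-off concrete, here is a minimal, library-free sketch. It is not how TileDB-SOMA or AnnData work internally: the `shards` structure, `concat_shards`, and `query` are hypothetical names made up for illustration. Each shard stands in for one .h5ad file, and the concatenated store copies every row once so that a metadata lookup no longer has to open each shard in turn.

```python
from collections import defaultdict

# Hypothetical shards: each stands in for one .h5ad file, holding per-cell
# metadata ("obs") and one expression row per cell ("X").
shards = [
    {"obs": [{"cell_type": "T cell"}, {"cell_type": "B cell"}], "X": [[1, 0], [0, 2]]},
    {"obs": [{"cell_type": "T cell"}], "X": [[3, 1]]},
]


def concat_shards(shards):
    """Copy all rows into a single store and index them by metadata value."""
    store = {"obs": [], "X": [], "index": defaultdict(list)}
    for shard in shards:
        for obs_row, x_row in zip(shard["obs"], shard["X"]):
            row_id = len(store["X"])
            # The data is duplicated here: the shard keeps its copy as well.
            store["obs"].append(dict(obs_row))
            store["X"].append(list(x_row))
            for key, value in obs_row.items():
                store["index"][(key, value)].append(row_id)
    return store


def query(store, key, value):
    """Return the expression rows whose metadata field `key` equals `value`."""
    return [store["X"][i] for i in store["index"].get((key, value), [])]


store = concat_shards(shards)
t_cells = query(store, "cell_type", "T cell")
```

A real array store adds chunking, compression, and on-disk indexing on top of this idea, but the cost and benefit are the same: one extra copy of the data in exchange for queries that touch a single store.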
See how this looks for CELLxGENE here: CELLxGENE: scRNA-seq.