H5matref {BiocSklearn} | R Documentation
Obtain an HDF5 dataset reference suitable for handling as a numpy matrix
H5matref(filename, dsname = "assay001")
filename | a pathname to an HDF5 file
dsname | internal name of the HDF5 matrix to use; defaults to 'assay001'
instance of (S3) "h5py._hl.dataset.Dataset"
This should only be used with the persistent environment discipline of basilisk. Additional support is planned in Bioc 3.12.
## Not run:
fn = system.file("ban_6_17/assays.h5", package="BiocSklearn")
ban = H5matref(fn)
ban
proc = basilisk::basiliskStart(bsklenv)
basilisk::basiliskRun(proc, function() {
  np = import("numpy", convert=FALSE) # ensure
  print(ban$shape)
  print(np$take(ban, 0:3, 0L))
  fullpca = skPCA(ban)
  dim(getTransformed(fullpca))
  ta = np$take
})
basilisk::basiliskStop(proc)

## End(Not run)
# project samples
## Not run:
# on celaya2 this code throws errors, and
# I have seen
# .../lib/python2.7/site-packages/sklearn/decomposition/incremental_pca.py:271: RuntimeWarning: Mean of empty slice.
#   explained_variance[self.n_components_:].mean()
# .../lib/python2.7/site-packages/numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
#   ret = ret.dtype.type(ret / rcount)
ta(ban, 0:20, 0L)$shape
st = skPartialPCA_step(ta(ban, 0:20, 0L))
st = skPartialPCA_step(ta(ban, 21:40, 0L), obj=st)
st = skPartialPCA_step(ta(ban, 41:63, 0L), obj=st)
oo = st$transform(ban)
dim(oo)
cor(oo[,1:4], getTransformed(fullpca)[,1:4])

## End(Not run) # so blocking this part of example for now
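The partial PCA steps in the example use three hand-written, 0-based index ranges (0:20, 21:40, 41:63) that together cover indices 0 through 63. As an illustration only, such consecutive blocks could be generated in base R. The helper name row_blocks and the block size of 22 are assumptions made for this sketch; they are not part of BiocSklearn.

```r
# Hypothetical helper (not part of BiocSklearn): split the 0-based indices
# 0..(n - 1) into consecutive blocks of at most `size` indices each.
row_blocks <- function(n, size) {
  stopifnot(n >= 1, size >= 1)
  idx <- seq_len(n) - 1L   # 0-based indices, as numpy expects
  grp <- idx %/% size      # block id for each index
  unname(split(idx, grp))  # list of integer vectors, one per block
}

# 64 indices in blocks of at most 22 (block size chosen for illustration)
blocks <- row_blocks(64L, 22L)
lengths(blocks)
range(blocks[[1]])
```

Each element of blocks is an integer vector of the same kind as 0:20, so it could presumably be passed as the index argument in a call such as ta(ban, blocks[[1]], 0L) inside the same basilisk session. That step is not shown here because it needs the Python environment.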