This repository contains code for the paper "beachmat: a Bioconductor C++ API for accessing high-throughput biological data from a variety of R matrix types" by Lun et al. (2018).
The provided code will check the performance of different matrix types for row/column access, using simulated and real data sets. To run the tests on your machine, please read the following instructions.
- Install beachmat from Bioconductor.
- Enter `timings` and run `R CMD INSTALL --clean package`. This requires installation of RcppArmadillo and RcppEigen.
- `timings/` contains scripts for timings (in milliseconds) for accessing data from different matrix representations.
- `timings/chunking/` contains scripts for timing rechunking, as well as checking the chunk cache logic.
- `memory/` contains scripts for memory usage for different matrix representations.
- `miscellaneous` contains scripts to compare timings to R, and to verify the no-copy access method of RcppArmadillo and RcppEigen.
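
The setup steps above can be sketched as a small shell helper. This is only an illustration: the function name `install_timing_package` is hypothetical, and it assumes that `Rscript` is on your `PATH`, that you start from the repository root, and that you are happy to install packages through BiocManager (which also handles the CRAN packages RcppArmadillo and RcppEigen).

```shell
# Hypothetical helper, not part of the repository.
# Assumes it is called from the repository root with R available on PATH.
install_timing_package() {
    # Make sure BiocManager itself is available before using it.
    Rscript -e 'if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")' || return 1

    # Install beachmat plus the two Rcpp packages that the timing package requires.
    Rscript -e 'BiocManager::install(c("beachmat", "RcppArmadillo", "RcppEigen"))' || return 1

    # Build and install the package in timings/; the subshell keeps the
    # caller's working directory unchanged.
    ( cd timings && R CMD INSTALL --clean package )
}
```

Defining the function does not install anything; call `install_timing_package` once you are ready to run the setup.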
Enter `real/zeisel` and download the count matrix for the Zeisel data set.
- Execute the `zeisel_time.R` script to generate timings (in milliseconds) for matrix access to this data. This will also determine memory usage for each matrix representation.
- Execute the `detection_stats.R` script to generate timings (in milliseconds) for computing various cell- or gene-based statistics from this data.
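
As a rough sketch, the two scripts could be run in sequence as shown below. The function name `run_zeisel_benchmarks` is hypothetical. The sketch assumes that the count matrix has already been downloaded into `real/zeisel`, and that the scripts are meant to be launched with `Rscript` from inside that directory; the repository may expect a different invocation.

```shell
# Hypothetical helper, not part of the repository.
# Assumes the Zeisel count matrix is already present in real/zeisel.
run_zeisel_benchmarks() {
    (
        cd real/zeisel || exit 1
        # Timings and memory usage for matrix access.
        Rscript zeisel_time.R || exit 1
        # Timings for cell- or gene-based statistics.
        Rscript detection_stats.R
    )
}
```

Call `run_zeisel_benchmarks` from the repository root; the subshell leaves your working directory untouched.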
Enter `real/10X` and install TENxBrainData.
Read the `README.md` file for the order of evaluation of the various Rmarkdown scripts.