This is a collection of experiments designed to test the feasibility of performing data transformations in kernel space. The tests are written to run on the FreeBSD kernel.
You should already have a working copy of FreeBSD built from source in /usr/src.
You will also need a few packages:
pkg install git bison autoconf automake rsync
To begin, clone the repository somewhere convenient.
git clone [email protected]:gharris1727/kgrep.git kgrep
cd kgrep
Prepare to build the repository by initializing dependencies and creating object directories.
make init obj objlink
Build and load all of the kernel modules and userspace programs.
make
sudo make load
This exposes the /dev/kgrep_control and /dev/kagrep_control devices, to which we can send commands.
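To confirm that the load succeeded, one option is to check that both device nodes exist. The following is a minimal sketch; the check_devices helper is hypothetical and not part of the repository.

```shell
# Hypothetical helper: report whether each argument exists as a character device.
# Prints one "ok:" or "missing:" line per path and returns non-zero if any is missing.
check_devices() {
    rc=0
    for dev in "$@"; do
        if [ -c "$dev" ]; then
            echo "ok: $dev"
        else
            echo "missing: $dev"
            rc=1
        fi
    done
    return "$rc"
}

# Example: check_devices /dev/kgrep_control /dev/kagrep_control
```

If either device is reported as missing, the corresponding module most likely did not load.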
To remove everything, run each of the following commands to unload all modules and delete all built objects.
sudo make unload
make clean
make cleandir cleandir
The executables for each program will be available at the following paths inside the repository:
ugrep/obj/ugrep
kgrep/cli/obj/kgrep
kagrep/cli/obj/kagrep
Both ugrep and kgrep share the same command-line syntax. For example, to search /var/log/messages for instances of 'kernel', print the output to stdout, and stop after 1000 matches:
ugrep/obj/ugrep kernel /var/log/messages - 1000
kgrep/cli/obj/kgrep kernel /var/log/messages - 1000
A similar invocation for kagrep is as follows:
kagrep/cli/obj/kagrep kernel fgrep o /var/log/messages - 1000
Notice the additional "fgrep" and "o" arguments for configuring the internal matcher. "fgrep" indicates that fixed-string matching should be used, and "o" indicates that only the match itself should be output.
Other values for the "matcher" and "flags" fields are listed in kagrep/module/INTERFACE.md.
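Since ugrep and kgrep accept the same arguments, a quick sanity check is to run both on the same input and compare what they print. The sketch below assumes the two programs are expected to produce byte-identical output in the same order, which this README does not state; same_output is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper: run two commands with identical arguments and
# report whether their standard output is byte-for-byte the same.
# Usage: same_output CMD_A CMD_B [ARGS...]
same_output() {
    cmd_a=$1
    cmd_b=$2
    shift 2

    # Temporary files hold each command's output for comparison.
    out_a=$(mktemp "${TMPDIR:-/tmp}/same_output.XXXXXX") || return 2
    out_b=$(mktemp "${TMPDIR:-/tmp}/same_output.XXXXXX") || return 2

    "$cmd_a" "$@" > "$out_a"
    "$cmd_b" "$@" > "$out_b"

    if cmp -s "$out_a" "$out_b"; then
        echo "identical"
        rc=0
    else
        echo "different"
        rc=1
    fi

    rm -f "$out_a" "$out_b"
    return "$rc"
}

# Example: same_output ugrep/obj/ugrep kgrep/cli/obj/kgrep kernel /var/log/messages - 1000
```

A "different" result is not necessarily a bug: /var/log/messages may change between the two runs, and the programs may legitimately order or format matches differently.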
In order to run scripted tests on the above applications, make sure they are all built as directed.
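One way to verify this precondition before starting a long test run is sketched below. The missing_binaries helper is hypothetical; it only checks that each path is an executable file, not that the kernel modules are loaded.

```shell
# Hypothetical helper: print every listed path that is not an executable
# regular file. Succeeds only if nothing was printed.
missing_binaries() {
    rc=0
    for bin in "$@"; do
        if [ ! -f "$bin" ] || [ ! -x "$bin" ]; then
            echo "$bin"
            rc=1
        fi
    done
    return "$rc"
}

# Example, run from the repository root:
# missing_binaries ugrep/obj/ugrep kgrep/cli/obj/kgrep kagrep/cli/obj/kagrep
```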
There are currently two scripted tests, urandom.sh and enron.sh.
urandom.sh reads data from /dev/urandom, and populates urandom.csv with results from each test run.
Because it does not rely on a prepared dataset, it can be run immediately after cloning and building.
In order to operate on the prepared datasets, we have to first download and preprocess them.
make datasets
The enron dataset can be found in enron/obj/ along with the preprocessed files.
enron.sh reads data from enron.tar, a single 1.7G tarball filled with emails.
The results of the test are written to enron.csv.