diff --git a/docs/content/browser.md b/docs/content/browser.md new file mode 100644 index 00000000..9eab0712 --- /dev/null +++ b/docs/content/browser.md @@ -0,0 +1,6 @@ ++++ +Categories = ["Lab"] ++++ + +See `examples/planets` and `examples/simstats` for example uses of the **Browser** graphical interface. Go API docs: [[doc:lab.Browser]] + diff --git a/docs/content/cluster.md b/docs/content/cluster.md new file mode 100644 index 00000000..3ea6f6a5 --- /dev/null +++ b/docs/content/cluster.md @@ -0,0 +1,6 @@ ++++ +Categories = ["Stats"] ++++ + +**Cluster** computes agglomerative clustering. Go docs: [[doc:stats/cluster]] + diff --git a/docs/content/datatree.md b/docs/content/datatree.md new file mode 100644 index 00000000..a492c861 --- /dev/null +++ b/docs/content/datatree.md @@ -0,0 +1,8 @@ ++++ +Categories = ["Lab"] ++++ + +**DataTree** provides a [core/filetree](https://cogentcore.org/core/filetree) with support for standard data types, including viewing a [[tensorfs]] virtual filesystem, as part of a data [[Browser]]. + +Go API docs: [[doc:lab.DataTree]] + diff --git a/docs/content/glm.md b/docs/content/glm.md new file mode 100644 index 00000000..0435e455 --- /dev/null +++ b/docs/content/glm.md @@ -0,0 +1,6 @@ ++++ +Categories = ["Stats"] ++++ + +**glm** computes generalized linear models. Go docs: [[doc:stats/glm]] + diff --git a/docs/content/goal.md b/docs/content/goal.md new file mode 100644 index 00000000..27833e0b --- /dev/null +++ b/docs/content/goal.md @@ -0,0 +1,36 @@ +**Goal** is the _Go augmented language_ with support for two additional modes, in addition to standard Go: + +* [[shell|$ shell mode $]] that operates like a standard command-line shell (e.g., `bash`), with space-separated elements and standard shell functionality including input / output redirection. Goal automatically detects most instances of shell mode based on the syntax of the line, but it can always be explicitly indicated with surrounding `$`s. + +* [[math|# math mode #]] that supports Python-like concise mathematical expressions operating on [[tensor]] elements. + +Here is an example of shell mode, mixing Go and shell code: + +```goal +for i, f := range goalib.SplitLines($ls -la$) { // ls executes, returns string + echo {i} {strings.ToLower(f)} // {} surrounds Go within shell +} +``` + +where Go code is explicitly indicated by the `{}` braces. + +Here is an example of math mode: +```goal +# x := 1. / (1. + exp(-wts[:, :, :n] * acts[:])) +``` + +You can also intermix math within Go code: +```goal +for _, x := range #[1,2,3]# { + fmt.Println(#x**2#) +} +``` + +Goal can be used in an interpreted mode by using the [yaegi](https://github.com/traefik/yaegi) Go interpreter (and can be used as your shell executable in a terminal), and it can also replace the standard `go` compiler in command-line mode, to build compiled executables using the extended Goal syntax. + +A key design feature of Goal is that it always _transpiles directly to Go_ in a purely syntactically driven way, so the output of Goal is pure Go code. + +Goal can also be used in conjunction with [[gosl]] to build programs that transparently run on GPU hardware in addition to standard CPUs (as standard Go programs).
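+ +To give a concrete sense of the transpilation (an illustrative sketch; the exact generated code may differ -- the mapping follows the [[math]] reference tables), a math mode expression becomes element-wise tensor function calls: + +```go +// math mode input: # c := a + b # +// transpiles to a pure Go call such as: +c := tmath.Add(a, b) +```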
+ +## Goal pages + diff --git a/docs/content/gosl.md b/docs/content/gosl.md new file mode 100644 index 00000000..3301e822 --- /dev/null +++ b/docs/content/gosl.md @@ -0,0 +1,252 @@ +**Gosl** allows you to write Go programs that run on [[GPU]] hardware, by transpiling Go into the WGSL shader language used by [WebGPU](https://www.w3.org/TR/webgpu/), thereby establishing the _Go shader language_. + +Gosl uses the [core gpu](https://github.com/cogentcore/core/tree/main/gpu) compute shader system, and operates within the overall [[Goal]] framework of an augmented version of the Go language. + +The relevant regions of Go code to be run on the GPU are tagged using the `//gosl:start` and `//gosl:end` comment directives, and this code must only use basic expressions and concrete types that will compile correctly in a GPU shader (see [[#Restrictions]] below). Method functions and pass-by-reference pointer arguments to `struct` types are supported and incur no additional compute cost due to inlining (see notes below for more detail). + +See [[doc:examples/basic]] and [[doc:examples/rand]] for complete working examples. + +Typically, `gosl` is called from a `go generate` command, e.g., by including this comment directive: + +``` +//go:generate gosl +``` + +To install the `gosl` command: +```bash +$ go install cogentcore.org/lab/gosl@latest +``` + +It is also strongly recommended to install the `naga` WGSL compiler from [wgpu](https://github.com/gfx-rs/wgpu) and the `tint` compiler from [dawn](https://dawn.googlesource.com/dawn/). Both of these are used if available to validate the generated GPU shader code. It is much faster to fix the issues at generation time rather than when trying to run the app later. Once code passes validation in both of these compilers, it should load fine in your app, and if the Go version runs correctly, there is a good chance of at least some reasonable behavior on the GPU. + +## Usage + +There are two key elements for GPU-enabled code: + +1. One or more [[#Kernels]] compute functions that take an _index_ argument and perform computations for that specific index of data, _in parallel_. **GPU computation is effectively just a parallel `for` loop**. On the GPU, each such kernel is implemented by its own separate compute shader code, and one of the main functions of `gosl` is to generate this code from the Go sources, in the automatically created `shaders/` directory. + +2. [[#Global variables]] on which the kernel functions _exclusively_ operate: all relevant data must be specifically copied from the CPU to the GPU and back. As explained in the [[GPU]] docs, each GPU compute shader is effectively a _standalone_ program operating on these global variables. To replicate this environment on the CPU, so the code works in both contexts, we need to make these variables global in the CPU (Go) environment as well. + +`gosl` generates a file named `gosl.go` in your package directory that initializes the GPU with all of the global variables, along with functions for running the kernels and syncing the global variable data back and forth between the CPU and GPU. + +## Kernels + +Each distinct compute kernel must be tagged with a `//gosl:kernel` comment directive, as in this example (from `examples/basic`): +```go +// Compute does the main computation. +func Compute(i uint32) { //gosl:kernel + Params[0].IntegFromRaw(int(i)) +} +``` + +The kernel functions receive a `uint32` index argument, and use this to index into the global variables containing the relevant data.
Typically the kernel code itself just calls other relevant function(s) using the index, as in the above example. Critically, _all_ of the data that a kernel function ultimately depends on must be contained within the global variables, and these variables must have been sync'd up to the GPU from the CPU prior to running the kernel (more on this below). + +In CPU mode, the kernel is effectively run in a `for` loop like this: +```go + for i := range n { + Compute(uint32(i)) + } +``` +A parallel goroutine-based mechanism is actually used, but conceptually this is what it does, on both the CPU and the GPU. To reiterate: **GPU computation is effectively just a parallel for loop**. + +## Global variables + +The global variables on which the kernels operate are declared in the usual Go manner, as a single `var` block, which is marked at the top using the `//gosl:vars` comment directive: + +```go +//gosl:vars +var ( + // Params are the parameters for the computation. + //gosl:read-only + Params []ParamStruct + + // Data is the data on which the computation operates. + // 2D: outer index is data, inner index is: Raw, Integ, Exp vars. + //gosl:dims 2 + Data *tensor.Float32 +) +``` + +All such variables must be either: +1. A `slice` of GPU-alignment compatible `struct` types, such as `ParamStruct` in the above example. In general such structs should be marked as `//gosl:read-only` due to various challenges associated with writing to structs, detailed below. +2. A `tensor` of a GPU-compatible elemental data type (`float32`, `uint32`, or `int32`), with the number of dimensions indicated by the `//gosl:dims <n>` tag as shown above. This is the preferred type for writable data. + +You can also just declare a slice of elemental GPU-compatible data values such as `float32`, but it is generally preferable to use the tensor instead, because it has built-in support for higher-dimensional indexing in a way that is transparent between CPU and GPU. + +### Tensor data + +On the GPU, the tensor data is represented using a simple flat array of the basic data type. To index into this array, the _strides_ for each dimension are encoded in a special `TensorStrides` tensor that is managed by `gosl`, in the generated `gosl.go` file. `gosl` automatically generates the appropriate indexing code using these strides (which is why the number of dimensions is needed). + +Whenever the strides of any tensor variable change, and at least once at initialization, your code must call the function that copies the current strides up to the GPU: +```go + ToGPUTensorStrides() +``` + +### Multiple tensor variables for large data + +The size of each memory buffer is limited by the GPU, to at most 4GB on modern GPU hardware. Therefore, if you need any single tensor that holds more than this amount of data, a bank of multiple variables is required. `gosl` provides helper functions to make this relatively straightforward. + +TODO: this could be encoded in the TensorStrides. It will always be the outer-most index that determines when it gets over threshold, which all can be pre-computed. + +### Systems and Groups + +Each kernel belongs to a `gpu.ComputeSystem`, and each such system has one specific configuration of memory variables. In general, it is best to use a single set of global variables, and perform as much of the computation as possible on this set of variables, to minimize the number of memory transfers.
However, if necessary, multiple systems can be defined, using an optional additional system name argument for the `vars` and `kernel` tags. + +In addition, the vars can be organized into _groups_, which generally should have similar memory syncing behavior, as documented in the [core gpu](https://github.com/cogentcore/core/tree/main/gpu) system. + +Here's an example with multiple groups: +```go +//gosl:vars [system name] +var ( + // Layer-level parameters + //gosl:group -uniform Params + Layers []LayerParam // note: struct with appropriate memory alignment + + // Path-level parameters + Paths []PathParam + + // Unit state values + //gosl:group Units + Units tensor.Float32 + + // Synapse weight state values + Weights tensor.Float32 +) +``` + +## Memory syncing + +Each global variable gets an automatically-generated `*Var` enum (e.g., `DataVar` for the global variable named `Data`), which is used for the memory syncing functions, making it easy to specify any number of such variables to sync at once, which is by far the most efficient approach. All of this is in the generated `gosl.go` file. For example: + +```go + ToGPU(ParamsVar, DataVar) +``` + +This specifies that the current contents of `Params` and `Data` are to be copied up to the GPU, which is guaranteed to complete by the time the next kernel run starts, within a given system. + +## Kernel running + +As with memory transfers, it is much more efficient to run multiple kernels in sequence, all operating on the current data variables, followed by a single sync of the updated global variable data that has been computed. Thus, there are separate functions for specifying the kernels to run, followed by a single "Done" function that actually submits the entire batch of kernels, along with memory sync commands to get the data back from the GPU. For example: + +```go + RunCompute1(n) + RunCompute2(n) + ... + RunDone(Data1Var, Data2Var) // launch all kernels and get data back to given vars +``` + +In CPU mode, `RunDone` is a no-op, and each kernel simply runs during its `Run` command. + +It is absolutely essential to understand that _all data must already be on the GPU_ at the start of the first Run command, and that any CPU-based computation between these calls is completely irrelevant for the GPU. Thus, it typically makes sense to have a sequence of Run commands grouped together into a logical unit, with the relevant `ToGPU` calls at the start, and a final `RunDone` that grabs everything of relevance back from the GPU. + +## GPU relevant code tagging + +In a large GPU-based application, you should organize your code as you normally would in any standard Go application, distributing it across different files and packages. The GPU-relevant parts of each of those files can be tagged with the gosl tags: +``` +//gosl:start + +< Go code to be translated > + +//gosl:end +``` +to make this code available to all of the shaders that are generated. + +Use the `//gosl:import "package/path"` directive to import GPU-relevant code from other packages, similar to the standard Go import directive. It is assumed that many other Go imports are not GPU relevant, so this separate directive is required. + +If any `enums` variables are defined, pass the `-gosl` flag to the `core generate` command to ensure that the `N` value is tagged with `//gosl:start` and `//gosl:end` tags.
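+ +For example (a minimal sketch; the helper function itself is purely illustrative), a tagged region in any file of the package might look like this: + +```go +//gosl:start + +// ClampZero is GPU-safe Go code, available to all generated shaders. +func ClampZero(x float32) float32 { + if x < 0 { + return 0 + } + return x +} + +//gosl:end +```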
+ +**IMPORTANT:** all `.go` and `.wgsl` files are removed from the `shaders` directory prior to processing to ensure everything there is current -- always specify a different source location for any custom `.wgsl` files that are included. + +## Command line usage + +``` +gosl [flags] +``` + +The flags are: +``` + -debug + enable debugging messages while running + -exclude string + comma-separated list of names of functions to exclude from exporting to WGSL (default "Update,Defaults") + -keep + keep temporary converted versions of the source files, for debugging + -out string + output directory for shader code, relative to where gosl is invoked -- must not be an empty string (default "shaders") +``` + +`gosl` always operates on the current directory, looking for all files with `//gosl:` tags, and accumulating all the `import` files that they include, etc. + +Any `struct` types encountered will be checked for 16-byte alignment of sub-types and overall sizes as an even multiple of 16 bytes (4 `float32` or `int32` values), which is the alignment used in the WGSL and GLSL shader languages, and presumably the underlying GPU hardware. Look for error messages on the output from the gosl run. This ensures that direct byte-wise copies of data between CPU and GPU will be successful. The fact that `gosl` operates directly on the original CPU-side Go code uniquely enables it to perform these alignment checks, which are otherwise a major source of difficult-to-diagnose bugs. + +## Restrictions + +In general shader code should be simple mathematical expressions and data types, with minimal control logic via `if` and `for` statements, and only using the subset of Go that is consistent with C. Here are specific restrictions: + +* Can only use `float32`, `[u]int32` for basic types (`int` is converted to `int32` automatically), and `struct` types composed of these same types -- no other Go types (e.g., `map`, slices, `string`, etc.) are compatible. There are strict alignment restrictions on 16 byte (e.g., 4 `float32`'s) intervals that are enforced via the `alignsl` sub-package. + +* WGSL does _not_ support 64 bit float or int. + +* Use `slbool.Bool` instead of `bool` -- it defines a Go-friendly interface based on an `int32` basic type. + +* Alignment and padding of `struct` fields is key -- this is automatically checked by `gosl`. + +* WGSL does not support enum types, but standard Go `const` declarations will be converted. Use an `int32` or `uint32` data type. It will automatically deal with the simple incrementing `iota` values, but not more complex cases. Also, for bitflags, define explicitly, not using the `bitflags` package, and use `0x01`, `0x02`, `0x04` etc instead of `1<<2` -- in theory the latter should be ok but in practice it complains. + +* Cannot use multiple return values, or multiple assignment of variables in a single `=` expression. + +* *Can* use multiple variable names with the same type (e.g., `min, max float32`) -- this will be properly converted to the more redundant form with the type repeated, for WGSL. + +* `switch` `case` statements are _purely_ self-contained -- no `fallthrough` allowed! Multiple items per `case` are supported, however. Every `switch` _must_ have a `default` case. + +* WGSL does specify that new variables are initialized to 0, like Go, but also somehow discourages that use-case.
It is safer to initialize directly: +```go + val := float32(0) // guaranteed 0 value + var val float32 // ok but generally avoid +``` + +* Use the automatically-generated `GetX` methods to get a local variable to a slice of structs: +```go + ctx := GetCtx(0) +``` +This automatically does the right thing on GPU while returning a pointer to the indexed struct on CPU. + +* tensor variables can only be used in `storage` (not `uniform`) memory, due to restrictions on dynamic sizing and alignment. Aside from this constraint, it is possible to designate a group of variables to use uniform memory, with the `-uniform` argument as the first item in the `//gosl:group` comment directive. + +## Other language features + +* [tour-of-wgsl](https://google.github.io/tour-of-wgsl/types/pointers/passing_pointers/) is a good reference to explain things more directly than the spec. + +* `ptr` provides a pointer arg +* `private` scope = within the shader code "module", i.e., one thread. +* `function` = within the function, not outside it. +* `workgroup` = shared across workgroup -- could be powerful (but slow!) -- need to learn more. + +## Atomic access + +WGSL adopts the Metal (lowest common denominator) strong constraint of imposing a _type_ level restriction on atomic operations: you can only do atomic operations on variables that have been declared atomic, as in: + +``` +var<storage, read_write> PathGBuf: array<atomic<i32>>; +... +atomicAdd(&PathGBuf[idx], val); +``` + +This also unfortunately has the side-effect that you cannot do _non-atomic_ operations on atomic variables, as discussed extensively here: https://github.com/gpuweb/gpuweb/issues/2377. Gosl automatically detects the use of atomic functions on GPU variables, and tags them as atomic. + +## Random numbers: slrand + +See [[doc:gosl/slrand]] for a shader-optimized random number generation package, which is supported by `gosl` -- it will convert `slrand` calls into appropriate WGSL named function calls. `gosl` will also copy the `slrand.wgsl` file, which contains the full source code for the RNG, into the destination `shaders` directory, so it can be included with a simple local path: + +```go +//gosl:wgsl mycode +// #include "slrand.wgsl" +//gosl:end mycode +``` + +## Performance + +With sufficiently large N, and ignoring the data copying setup time, an ~80x speedup is typical on a MacBook Pro with an M1 processor. The `rand` example produces a 175x speedup! + +## Gosl pages + diff --git a/goal/GPU.md b/docs/content/gpu.md similarity index 68% rename from goal/GPU.md rename to docs/content/gpu.md index 8e573f54..eff4bccb 100644 --- a/goal/GPU.md +++ b/docs/content/gpu.md @@ -1,12 +1,20 @@ -# Goal GPU support ++++ +Categories = ["Gosl"] +Title = "GPU" +Name = "GPU" ++++ -The use of massively parallel _Graphical Processsing Unit_ (_GPU_) hardware has revolutionized machine learning and other fields, producing many factors of speedup relative to traditional _CPU_ (_Central Processing Unit_) computation. However, there are numerous challenges for supporting GPU-based computation, relative to the more flexible CPU coding. +The use of massively parallel _Graphical Processing Unit_ (**GPU**) hardware has revolutionized machine learning and other fields, producing many factors of speedup relative to traditional _CPU_ (_Central Processing Unit_) computation. However, there are numerous challenges for supporting GPU-based computation, relative to the more flexible CPU coding.
-The Goal framework provides a solution to these challenges that enables the same Go-based code to work efficiently and reasonably naturally on both the GPU and CPU (i.e., standard Go execution), via the [gosl](../gosl) package. Debugging code on the GPU is notoriously difficult because the usual tools are not directly available (not even print statements), so the ability to run exactly the same code on the CPU is invaluable, in addition to the benefits in portability across platforms without GPU hardware. +The [[Gosl]] Go shader language, operating within the broader [[Goal]] augmented version of the Go language, provides a solution to these challenges that enables the same Go-based code to work efficiently and reasonably naturally on both the GPU and CPU (i.e., standard Go execution). -See the [gosl](../gosl) documentation for the details on how to write code that works on the GPU. The remainder of this document provides an overview of the overall approach in relation to other related tools. +Debugging code on the GPU is notoriously difficult because the usual tools are not directly available (not even print statements), so the ability to run exactly the same code on the CPU and GPU is invaluable, in addition to the benefits in portability across platforms without GPU hardware. -The two most important challenges are: +See the [[gosl]] documentation for the details on how to write code that works on the GPU. The remainder of this document provides an overview of the overall approach in relation to other related tools. + +## Challenges for GPU computation + +The two most important challenges for GPU-based programs are: @@ -24,20 +32,21 @@ The [JAX](https://github.com/jax-ml/jax) framework in Python provides one soluti We take a different approach, which is much simpler implementationally but requires a bit more work from the developer, which is to provide tools that allow _you_ to organize your computation into kernel-sized chunks according to your knowledge of the problem, and transparently turn that code into the final CPU and GPU programs. -In many cases, a human programmer can most likely out-perform the automatic compilation process, by knowing the full scope of what needs to be computed, and figuring out how to package it most efficiently per the above constraints. In the end, you get maximum efficiency and complete transparency about exactly what is being computed, perhaps with fewer "gotcha" bugs arising from all the magic happening under the hood, but again it may take a bit more work to get there. +In many cases, a human programmer can most likely out-perform the automatic compilation process, by knowing the full scope of what needs to be computed, and figuring out how to package it most efficiently per the above constraints. In the end, you get maximum efficiency and complete transparency about exactly what is being computed, perhaps with fewer "gotcha" bugs arising from all the magic happening under the hood, but it may take a bit more work to get there. -The role of Goal is to allow you to express the full computation in the clear, simple, Go language, using intuitive data structures that minimize the need for additional boilerplate to run efficiently on CPU and GPU.
This ability to write a single codebase that runs efficiently on CPU and GPU is similar to the [SYCL](https://en.wikipedia.org/wiki/SYCL) framework (and several others discussed on that wikipedia page), which builds on [OpenCL](https://en.wikipedia.org/wiki/OpenCL), both of which are based on the C / C++ programming language. +The role of [[Gosl]] and [[Goal]] is to allow you to express the full computation in the clear, simple, Go language, using intuitive data structures that minimize the need for additional boilerplate to run efficiently on CPU and GPU. This ability to write a single codebase that runs efficiently on CPU and GPU is similar to the [SYCL](https://en.wikipedia.org/wiki/SYCL) framework (and several others discussed on that wikipedia page), which builds on [OpenCL](https://en.wikipedia.org/wiki/OpenCL), both of which are based on the C / C++ programming language. -In addition to the critical differences between Go and C++ as languages, Goal targets only one hardware platform: WebGPU (via our [gpu](../gpu) package), so it is more specifically optimized for this use-case. Furthermore, SYCL and other approaches require you to write GPU-like code that can also run on the CPU (with lots of explicit fine-grained memory and compute management), whereas Goal provides a more natural CPU-like programming model, while imposing some stronger constraints that encourage more efficient implementations. +In addition to the critical differences between Go and C++ as languages, Gosl targets only one hardware platform: WebGPU (via the [core gpu](https://github.com/cogentcore/core/tree/main/gpu) package), so it is more specifically optimized for this use-case. Furthermore, SYCL and other approaches require you to write GPU-like code that can also run on the CPU (with lots of explicit fine-grained memory and compute management), whereas Goal provides a more natural CPU-like programming model, while imposing some stronger constraints that encourage more efficient implementations. The bottom line is that the fantasy of being able to write CPU-native code and have it magically "just work" on the GPU with high levels of efficiency is just that: a fantasy. The reality is that code must be specifically structured and organized to work efficiently on the GPU. Goal just makes this process relatively clean and efficient and easy to read, with a minimum of extra boilerplate. The resulting code should be easily understood by anyone familiar with the Go language, even if that isn't the way you would have written it in the first place. The reward is that you can get highly efficient results with significant GPU-accelerated speedups that work on _any platform_, including the web and mobile phones, all with a single easy-to-read codebase. -# Kernel functions +## Kernel functions First, we assume the scope is a single Go package that implements a set of computations on some number of associated data representations. The package will likely contain a lot of CPU-only Go code that manages all the surrounding infrastructure for the computations, in terms of creating and configuring the data in memory, visualization, i/o, etc. The GPU-specific computation is organized into some (hopefully small) number of **kernel** functions that are conceptually called using a **parallel for loop**, e.g., something like this:
The GPU-specific computation is organized into some (hopefully small) number of **kernel** functions, that are conceptually called using a **parallel for loop**, e.g., something like this: -```Go + +```go for i := range parallel(data) { Compute(i) } @@ -49,11 +58,11 @@ We assume that multiple kernels will in general be required, and that there is l Even though the GPU kernels must each be compiled separately into a single distinct WGSL _shader_ file that is run under WebGPU, they can `import` a shared codebase of files, and thus replicate the same overall shared code structure as the CPU versions. -The GPU code can only handle a highly restricted _subset_ of Go code, with data structures having strict alignment requirements, and no `string` or other composite variable-length data structures (maps, slices etc). Thus, the [gosl](../gosl) package recognizes `//gosl:start` and `//gosl:end` comment directives surrounding the GPU-safe (and relevant) portions of the overall code. Any `.go` or `.goal` file can contribute GPU relevant code, including in other packages, and the gosl system automatically builds a shadow package-based set of `.wgsl` files accordingly. +The GPU code can only handle a highly restricted _subset_ of Go code, with data structures having strict alignment requirements, and no `string` or other composite variable-length data structures (maps, slices etc). Thus, [[Gosl]] recognizes `//gosl:start` and `//gosl:end` comment directives surrounding the GPU-safe (and relevant) portions of the overall code. Any `.go` or `.goal` file can contribute GPU relevant code, including in other packages, and the gosl system automatically builds a shadow package-based set of `.wgsl` files accordingly. > Each kernel function is marked with a `//gosl:kernel` directive, and the name of the function is used to create the name of the GPU shader file. -```Go +```go // Compute does the main computation. func Compute(i uint32) { //gosl:kernel Params[0].IntegFromRaw(&Data[i]) @@ -68,7 +77,7 @@ Perhaps the strongest constraints for GPU programming stem from the need to orga Thus, names must be chosen appropriately for these variables, given their global scope within the Go package. The specific _values_ for these variables can be dynamically set in an easy way, but the variables themselves are global. -Within the [gpu](../gpu) framework, each `ComputeSystem` defines a specific organization of such GPU buffer variables, and maximum efficiency is achieved by minimizing the number of such compute systems, and associated memory buffers. Each system also encapsulates the associated kernel shaders that operate on the associated memory data, so +Within the [core gpu](https://github.com/cogentcore/core/tree/main/gpu) framework, each `ComputeSystem` defines a specific organization of such GPU buffer variables, and maximum efficiency is achieved by minimizing the number of such compute systems, and associated memory buffers. Each system also encapsulates the associated kernel shaders that operate on the associated memory data, so > Kernels and variables both must be defined within a specific system context. @@ -104,21 +113,10 @@ Furthermore: > Pointer-based access of global variables is not supported in GPU mode. -You have to use _indexes_ into arrays exclusively. Thus, some of the data structures you may need to copy up to the GPU include index variables that determine how to access other variables. TODO: do we need helpers for any of this? - -# Examples +You have to use _indexes_ into arrays exclusively. 
Thus, some of the data structures you may need to copy up to the GPU include index variables that determine how to access other variables. -A large and complex biologically-based neural network simulation framework called [axon](https://github.com/emer/axon) has been implemented using `gosl`, allowing 1000's of lines of equations and data structures to run through standard Go on the CPU, and accelerated significantly on the GPU. This allows efficient debugging and unit testing of the code in Go, whereas debugging on the GPU is notoriously difficult. +## Examples -# TODO - -## Optimization - -can run naga on wgsl code to get wgsl code out, but it doesn't seem to do much dead code elimination: https://github.com/gfx-rs/wgpu/tree/trunk/naga - -``` -naga --compact gpu_applyext.wgsl tmp.wgsl -``` +A large and complex biologically-based neural network simulation framework called [axon](https://github.com/emer/axon) has been implemented using `gosl`, allowing thousands of lines of equations and data structures to run through standard Go on the CPU, and accelerated significantly on the GPU. This allows efficient debugging and unit testing of the code in Go, whereas debugging on the GPU is notoriously difficult. -https://github.com/LucentFlux/wgsl-minifier does radical minification but the result is unreadable so we don't know if it is doing dead code elimination. in theory it is just calling naga --compact for that. diff --git a/docs/content/histogram.md b/docs/content/histogram.md new file mode 100644 index 00000000..a678a4e5 --- /dev/null +++ b/docs/content/histogram.md @@ -0,0 +1,6 @@ ++++ +Categories = ["Stats"] ++++ + +**Histogram** computes histograms. Go docs: [[doc:stats/histogram]] + diff --git a/docs/content/home.md b/docs/content/home.md index 73270ab5..8df808c0 100644 --- a/docs/content/home.md +++ b/docs/content/home.md @@ -11,12 +11,14 @@ Cogent Lab is still under development, but the basic API should be somewhat stab Features include: -* The [[goal]] language transpiler (generates standard Go code) that supports more concise math and matrix expressions that are largely compatible with the widely used numpy framework, in addition to shell command syntax, so it can be used as a replacement for a command-line shell. +* The [[Goal]] language transpiler (generates standard Go code) that supports more concise [[math]], [[matrix]], and [[stats]] expressions that are largely compatible with the widely used [NumPy](https://numpy.org/doc/stable/index.html) framework, in addition to [[shell]] command syntax, so it can be used as a replacement for a command-line shell. -* A [[tensor]] representation for n-dimensional data, which serves as the universal data type within the Lab framework. The [[table]] uses tensors as columns for tabular, heterogenous data (similar to the widely-used pandas data table), and the [[tensorfs]] is a hierarchical filesystem for tensor data that serves as the shared data workspace. + +* The [[Gosl]] (_Go shader language_) that allows you to write Go (and [[Goal]]) functions that run on either the CPU or the [[GPU]], using the WebGPU framework that supports full GPU compute functionality in the web browser and on desktop platforms. + +* A [[tensor]] representation for n-dimensional data, which serves as the universal data type within the Lab framework.
The [[table]] uses tensors as columns for tabular, heterogeneous data (similar to the widely-used [pandas](https://pandas.pydata.org/) data table), and the [[tensorfs]] is a hierarchical filesystem for tensor data that serves as the shared data workspace. * Interactive, full-featured [[plot]]s and other GUI visualization tools. -* The overarching `lab` API for flexibly connecting data and visualization components, providing the foundation for interactive data analysis applications integrating different Cogent Lab elements. +* The [[lab]] user interface API for flexibly connecting data and visualization components, providing the foundation for interactive data analysis applications integrating different Cogent Lab elements. + -* The [[gosl]] (Go shader language) that allows you to write Go (and [[goal]]) functions that run on either the CPU or the GPU, using the WebGPU framework that supports full GPU compute functionality through the web browser. diff --git a/docs/content/lab.md b/docs/content/lab.md new file mode 100644 index 00000000..aad7a9e0 --- /dev/null +++ b/docs/content/lab.md @@ -0,0 +1,11 @@ + +**Lab** contains graphical interface elements for Cogent Lab (Go docs: [[doc:lab]]), including: + +* [[Browser]] is a default combination of the following elements. + +* [[Tabs]] provides functions for creating tabs with data elements such as [[plot]]s and views of [[table]]s. + +* [[DataTree]] provides a tree-structured view of [[tensorfs]] and regular filesystem data. + +## Lab pages + diff --git a/docs/content/math.md b/docs/content/math.md new file mode 100644 index 00000000..d97a917c --- /dev/null +++ b/docs/content/math.md @@ -0,0 +1,393 @@ ++++ +Categories = ["Goal"] ++++ + +**Math** mode in [[Goal]] provides a [NumPy](https://numpy.org/doc/stable/index.html)-like language for mathematical expressions involving [[tensor]] data elements (see [[#Reference tables]]), which are transpiled into Go code for compilation or interactive interpretation. It is activated by a `#` character on a line, which is otherwise not a recognized symbol in Go, and `##` starts and stops multi-line blocks of math mode. + +The most important thing to remember about math mode is that everything must be a tensor! Any variables created in math mode are automatically tensors, but anything created outside of math mode must be converted to a tensor using the `array` function. + +```Goal +fv := 42.0 // this is a float64 + +// now we enter math mode: +## +tv := 42.0 // this is a tensor.Float64 +tfv := array(fv) // as is this +## + +fmt.Printf("fv: %v %T\n", fv, fv) +fmt.Printf("tv: %v %T\n", tv, tv) +fmt.Printf("tfv: %v %T\n", tfv, tfv) +``` + +## Basics + +Here's how you can create, inspect, and manipulate tensor data: + +```Goal +## +a := [[1., 2., 3.], [4., 5., 6.]] +aShape := a.shape // tensor of sizes of each dim +b := zeros(aShape) // which can be used to make a new one +c := a.reshape(3,2) // preserves data while reshaping +d := arange(1, 11) // like a for loop from 1..11 (exclusive max) +e := linspace(1., 3., 11, true) // floats over a range: final arg = inclusive +## + +fmt.Println("a:", a) +fmt.Println("aShape:", aShape) +fmt.Println("b:", b) +fmt.Println("c:", c) +fmt.Println("d:", d) +fmt.Println("e:", e) +``` + +(go ahead and play around with any of the above expressions to explore the effects!) + +Note that, as in Python, you do need to add a decimal point to have a number treated as a floating point value in most cases -- otherwise it will be an int.
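+ +For example (a minimal illustration of this rule): + +```Goal +## +iv := [1, 2, 3] // no decimal points: an int tensor +xv := [1., 2., 3.] // decimal points: a float64 tensor +## + +fmt.Println("iv:", iv) +fmt.Println("xv:", xv) +```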
+ +You can perform math operations directly using standard operators: + +```Goal +## +a := [[1., 2., 3.], [4., 5., 6.]] +b := a * a +c := sin(a) +d := a * [3., 2., 1.] // smaller dims apply repeatedly +## + +fmt.Println("a:", a) +fmt.Println("b:", b) +fmt.Println("c:", c) +fmt.Println("d:", d) +``` + +See [[tensor math#Alignment of shapes]] for more details on [[tensor math]] operations, using the NumPy [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) logic. + +### Tensorfs + +In an interactive Goal shell (which we simulate here in the docs), variables in math mode are automatically saved to the [[tensorfs]] virtual data filesystem: + +```Goal +## +// make a new tensorfs directory for this example +mkdir tfs +cd tfs +a := [[1., 2., 3.], [4., 5., 6.]] +setcp("a_copy", a) // this will preserve values + +b := a * a +a += a + +// list the current directory: +ls -l +// go back to root (all vars from this page are there!) +cd .. +// (add another ls -l here to see them all..) +## + +fmt.Println("a:", a) +fmt.Println("a get:", # get("tfs/a") #) +fmt.Println("a copy:", # get("tfs/a_copy") #) +fmt.Println("b:", b) +``` + +Note that the filesystem variables are pointers to the live variables, so they always reflect the latest changes; the `setcp` command is therefore useful for saving a copy that does not get further updated. + +## Slicing and indexing + +Math mode provides NumPy-like ways of extracting data from a tensor (examples follow Go versions in [[tensor#Views and values]]). + +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,4) + +row1 := x[1] +col1 := x[:,1] +## + +fmt.Println("x:", x) +fmt.Println("row1:", row1) +fmt.Println("col1:", col1) +``` + +Here, `:` is an empty `slice` expression that indicates all values in that dimension. + +To get the column as a column vector, use `newaxis`: + +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,4) + +col1 := x[:, 1, newaxis] +## + +fmt.Println("col1:", col1) +``` + +To get ranges within each dimension, or reorder, use `slice` expressions similar to those used in accessing Go slices, but with a third `Step` value, as in a standard Go `for` loop: + +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,4) + +row1 := x[1, 1:3] // only a subset of columns +col1 := x[::-1, 1] // reverse order of rows dimension +## + +fmt.Println("row1:", row1) +fmt.Println("col1:", col1) +``` + +Ellipsis (`...`) makes it easy to get the last dimension(s): + +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,2,2) + +last1 := x[..., 1] +## + +fmt.Println("x:", x) +fmt.Println("last1:", last1) +``` + +As in [NumPy](https://numpy.org/doc/stable/index.html) (and standard Go slices), indexed subsets of a tensor are _views_ onto the original tensor, so that changes to the original values are immediately seen in these views. Use `copy` to make a new separate copy of the values to break this connection.
+ +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,2,2) + +last1 := x[..., 1] +cp1 := copy(last1) + +x[..., 1, 1] = 3.14 // note how we assign to all rows +## + +fmt.Println("x:", x) +fmt.Println("last1:", last1) +fmt.Println("cp1:", cp1) +``` + +### Masked by booleans + +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,4) + +m := x[x>=6] +mi := x >= 6 +## + +fmt.Println("m:", m) +fmt.Println("mi:", mi) +``` + +### Arbitrary indexes + +```Goal +## +x := linspace(0., 12., 12., false).reshape(3,4) + +ixs := [[0, 1], [0, 1], [0, 2], [0, 2], [1, 1], [1, 1], [2, 2], [2, 2]].reshape(2,4,2) + +ix := x[ixs] +## + +fmt.Println("ix:", ix) +fmt.Println("ixs:", ixs) +``` + + +## Reference tables + +The following tables summarize Goal math-mode syntax in terms of [NumPy](https://numpy.org/doc/stable/index.html) and the underlying Go code generated. For MATLAB equivalents, see [numpy-for-matlab-users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html). + +* The _same:_ in Goal means that the same NumPy syntax works in Goal, minus the `np.` prefix, and likewise for _or:_ (where Goal also has additional syntax). +* In the `tensor` Go code, we sometimes just write a scalar number for simplicity, but these are actually `tensor.NewFloat64Scalar` etc. +* Goal also has support for `string` tensors, e.g., for labels, and operators such as addition that make sense for strings are supported. Otherwise, strings are automatically converted to numbers using the `tensor.Float` interface. If you have any doubt about whether you've got a `tensor.Float64` when you expect one, use `tensor.AsFloat64Tensor`, which ensures this. + +### Tensor shape + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `a.NumDim()` | `ndim(a)` or `a.ndim` | `np.ndim(a)` or `a.ndim` | number of dimensions of tensor `a` | +| `a.Len()` | `len(a)` or `a.len` or: | `np.size(a)` or `a.size` | number of elements of tensor `a` | +| `a.Shape().Sizes` | same: | `np.shape(a)` or `a.shape` | "size" of each dimension in a; `shape` returns a 1D `int` tensor | +| `a.Shape().Sizes[1]` | same: | `a.shape[1]` | the number of elements of the 2nd dimension of tensor `a` | +| `tensor.Reshape(a, 10, 2)` | same except no `a.shape = (10,2)`: | `a.reshape(10, 2)` or `np.reshape(a, 10, 2)` or `a.shape = (10,2)` | set the shape of `a` to a new shape that has the same total number of values (len or size); No option to change order in Goal: always row major; Goal does _not_ support direct shape assignment version. | +| `tensor.Reshape(a, tensor.AsIntSlice(sh)...)` | same: | `a.reshape(sh)` or `np.reshape(a, sh)` | set shape based on list of dimension sizes in tensor `sh` | +| `tensor.Reshape(a, -1)` or `tensor.As1D(a)` | same: | `a.reshape(-1)` or `np.reshape(a, -1)` | a 1D vector view of `a`; Goal does not support `ravel`, which is nearly identical. | +| `tensor.Flatten(a)` | same: | `b = a.flatten()` | returns a 1D copy of a | +| `b := tensor.Clone(a)` | `b := copy(a)` or: | `b = a.copy()` | direct assignment `b = a` in Goal or NumPy just makes variable b point to tensor a; `copy` is needed to generate new underlying values (MATLAB always makes a copy) | +| `tensor.Squeeze(a)` | same: |`a.squeeze()` | remove singleton dimensions of tensor `a`.
| + + +### Constructing + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `tensor.NewFloat64FromValues(` `1, 2, 3)` | `[1., 2., 3.]` | `np.array([1., 2., 3.])` | define a 1D tensor | +| (reshape) | `[[1., 2., 3.], [4., 5., 6.]]` or: | `np.array([[1., 2., 3.], [4., 5., 6.]])` | define a 2x3 2D tensor | +| (reshape) | `[[a, b], [c, d]]` or `block([[a, b], [c, d]])` | `np.block([[a, b], [c, d]])` | construct a matrix from blocks `a`, `b`, `c`, and `d` | +| `tensor.NewFloat64(3,4)` | `zeros(3,4)` | `np.zeros((3, 4))` | 3x4 2D tensor of float64 zeros; Goal does not use "tuple" so no double parens | +| `tensor.NewFloat64(3,4,5)` | `zeros(3, 4, 5)` | `np.zeros((3, 4, 5))` | 3x4x5 three-dimensional tensor of float64 zeros | +| `tensor.NewFloat64Ones(3,4)` | `ones(3, 4)` | `np.ones((3, 4))` | 3x4 2D tensor of 64-bit floating point ones | +| `tensor.NewFloat64Full(5.5, 3,4)` | `full(5.5, 3, 4)` | `np.full((3, 4), 5.5)` | 3x4 2D tensor of 5.5; Goal variadic arg structure requires value to come first | +| `tensor.NewFloat64Rand(3,4)` | `rand(3, 4)` or `slrand(c, fi, 3, 4)` | `rng.random((3, 4))` | 3x4 2D float64 tensor with uniform random 0..1 elements; `rand` uses current Go `rand` source, while `slrand` uses the [[doc:gosl/slrand]] GPU-safe call with counter `c` and function index `fi` and key = index of element | +| TODO: | TODO: |`np.concatenate((a,b),1)` or `np.hstack((a,b))` or `np.column_stack((a,b))` or `np.c_[a,b]` | concatenate columns of a and b | +| TODO: | TODO: |`np.concatenate((a,b))` or `np.vstack((a,b))` or `np.r_[a,b]` | concatenate rows of a and b | +| TODO: | TODO: |`np.tile(a, (m, n))` | create m by n copies of a | +| TODO: | TODO: |`a[np.r_[:len(a),0]]` | `a` with copy of the first row appended to the end | + +### Ranges and grids + +See [NumPy](https://numpy.org/doc/stable/user/how-to-partition.html) docs for details. + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `tensor.NewIntRange(1, 11)` | same: |`np.arange(1., 11.)` or `np.r_[1.:11.]` or `np.r_[1:10:10j]` | create an increasing vector; `arange` in Goal is always ints; use `linspace` or `tensor.AsFloat64` for floats | +| . | same: |`np.arange(10.)` or `np.r_[:10.]` or `np.r_[:9:10j]` | create an increasing vector; 1 arg is the stop value in a slice | +| . | . |`np.arange(1.,11.)` `[:, np.newaxis]` | create a column vector | +| `t.NewFloat64` `SpacedLinear(` `1, 3, 4, true)` | `linspace(1,3,4,true)` |`np.linspace(1,3,4)` | 4 equally spaced samples between 1 and 3, inclusive of end (use `false` at end for exclusive) | +| . | . |`np.mgrid[0:9.,0:6.]` or `np.meshgrid(r_[0:9.],` `r_[0:6.])` | two 2D tensors: one of x values, the other of y values | +| . | . |`ogrid[0:9.,0:6.]` or `np.ix_(np.r_[0:9.],` `np.r_[0:6.])` | the best way to eval functions on a grid | +| . | . |`np.meshgrid([1,2,4],` `[2,4,5])` | . | +| . | . |`np.ix_([1,2,4],` `[2,4,5])` | the best way to eval functions on a grid | + +### Basic indexing + +See [NumPy basic indexing](https://numpy.org/doc/stable/user/basics.indexing.html#basic-indexing). Tensor Go uses the `Reslice` function for all cases (repeated `tensor.` prefix replaced with `t.` to take less space). Here you can clearly see the advantage of Goal in allowing significantly more succinct expressions to be written for accomplishing critical tensor functionality.
+ +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `t.Reslice(a, 1, 4)` | same: |`a[1, 4]` | access element in second row, fifth column in 2D tensor `a` | +| `t.Reslice(a, -1)` | same: |`a[-1]` | access last element | +| `t.Reslice(a,` `1, t.FullAxis)` | same: |`a[1]` or `a[1, :]` | entire second row of 2D tensor `a`; unspecified dimensions are equivalent to `:` (could omit second arg in Reslice too) | +| `t.Reslice(a,` `Slice{Stop:5})` | same: |`a[0:5]` or `a[:5]` or `a[0:5, :]` | 0..4 rows of `a`; uses same Go slice ranging here: (start:stop) where stop is _exclusive_ | +| `t.Reslice(a,` `Slice{Start:-5})` | same: |`a[-5:]` | last 5 rows of 2D tensor `a` | +| `t.Reslice(a,` `t.NewAxis,` `Slice{Start:-5})` | same: |`a[newaxis, -5:]` | last 5 rows of 2D tensor `a`, as a column vector | +| `t.Reslice(a,` `Slice{Stop:3},` `Slice{Start:4, Stop:9})` | same: |`a[0:3, 4:9]` | The first through third rows and fifth through ninth columns of a 2D tensor, `a`. | +| `t.Reslice(a,` `Slice{Start:2,` `Stop:25,` `Step:2}, t.FullAxis)` | same: |`a[2:21:2,:]` | every other row of `a`, starting with the third and going to the twenty-first | +| `t.Reslice(a,` `Slice{Step:2},` `t.FullAxis)` | same: |`a[::2, :]` | every other row of `a`, starting with the first | +| `t.Reslice(a,` `Slice{Step:-1},` `t.FullAxis)` | same: |`a[::-1,:]` | `a` with rows in reverse order | +| `t.Clone(t.Reslice(a,` `1, t.FullAxis))` | `b = copy(a[1, :])` or: | `b = a[1, :].copy()` | without the copy, `b` would point to a view of values in `a`; `copy` creates distinct values, in this case of _only_ the 2nd row of `a` -- i.e., it "concretizes" a given view into a literal, memory-continuous set of values for that view. | +| `tmath.Assign(` `t.Reslice(a,` `Slice{Stop:5}),` `t.NewIntScalar(2))` | same: |`a[:5] = 2` | assign the value 2 to 0..4 rows of `a` | +| (you get the idea) | same: |`a[:5] = b[:, :5]` | assign the values in the first 5 columns of `b` to the first 5 rows of `a` | + +### Boolean tensors and indexing + +See [NumPy boolean indexing](https://numpy.org/doc/stable/user/basics.indexing.html#boolean-array-indexing). + +Note that Goal only supports boolean logical operators (`&&` and `||`) on boolean tensors, not the single bitwise operators `&` and `|`. + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `tmath.Greater(a, 0.5)` | same: | `(a > 0.5)` | `bool` tensor of shape `a` with elements `(v > 0.5)` | +| `tmath.And(a, b)` | `a && b` | `np.logical_and(a,b)` | element-wise AND operator on `bool` tensors | +| `tmath.Or(a, b)` | `a \|\| b` | `np.logical_or(a,b)` | element-wise OR operator on `bool` tensors | +| `tmath.Negate(a)` | `!a` | ? | element-wise negation on `bool` tensors | +| `tmath.Assign(` `tensor.Mask(a,` `tmath.Less(a, 0.5)),` `0)` | same: |`a[a < 0.5]=0` | `a` with elements less than 0.5 zeroed out | +| `tensor.Flatten(` `tensor.Mask(a,` `tmath.Less(a, 0.5)))` | same: |`a[a < 0.5].flatten()` | a 1D list of the elements of `a` < 0.5 (as a copy, not a view) | +| `tensor.Mul(a,` `tmath.Greater(a, 0.5))` | same: |`a * (a > 0.5)` | `a` with elements less than 0.5 zeroed out | + +### Advanced index-based indexing + +See [NumPy integer indexing](https://numpy.org/doc/stable/user/basics.indexing.html#integer-array-indexing).
Note that the current NumPy version of advanced indexing is rather complex and difficult for many people to understand, as articulated in this [NEP 21 proposal](https://numpy.org/neps/nep-0021-advanced-indexing.html). + +**TODO:** not yet implemented: + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| . | . |`a[np.ix_([1, 3, 4], [0, 2])]` | rows 2,4 and 5 and columns 1 and 3. | +| . | . |`np.nonzero(a > 0.5)` | find the indices where (a > 0.5) | +| . | . |`a[:, v.T > 0.5]` | extract the columns of `a` where column vector `v` > 0.5 | +| . | . |`a[:,np.nonzero(v > 0.5)[0]]` | extract the columns of `a` where vector `v` > 0.5 | +| . | . |`a[:] = 3` | set all values to the same scalar value | +| . | . |`np.sort(a)` or `a.sort(axis=0)` | sort each column of a 2D tensor, `a` | +| . | . |`np.sort(a, axis=1)` or `a.sort(axis=1)` | sort each row of 2D tensor, `a` | +| . | . |`I = np.argsort(a[:, 0]); b = a[I,:]` | save the tensor `a` as tensor `b` with rows sorted by the first column | +| . | . |`np.unique(a)` | a vector of unique values in tensor `a` | + +### Basic math operations (add, multiply, etc) + +In Goal and NumPy, the standard `+, -, *, /` operators perform _element-wise_ operations because those are well-defined for all dimensionalities and are consistent across the different operators, whereas matrix multiplication is specifically used in a 2D linear algebra context, and is not well defined for the other operators. + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `tmath.Add(a,b)` | same: |`a + b` | element-wise addition; Goal does this string-wise for string tensors | +| `tmath.Mul(a,b)` | same: |`a * b` | element-wise multiply | +| `tmath.Div(a,b)` | same: |`a/b` | element-wise divide. _important:_ this always produces a floating point result. | +| `tmath.Mod(a,b)` | same: |`a%b` | element-wise modulus (works for float and int) | +| `tmath.Pow(a,3)` | same: | `a**3` | element-wise exponentiation | +| `tmath.Cos(a)` | same: | `cos(a)` | element-wise function application | + +### 2D Matrix Linear Algebra + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| `matrix.Mul(a,b)` | same: |`a @ b` | matrix multiply | +| `tensor.Transpose(a)` | or `a.T` |`a.transpose()` or `a.T` | transpose of `a` | +| TODO: | . |`a.conj().transpose()` or `a.conj().T` | conjugate transpose of `a` | +| `matrix.Det(a)` | `matrix.Det(a)` | `np.linalg.det(a)` | determinant of `a` | +| `matrix.Identity(3)` | . |`np.eye(3)` | 3x3 identity matrix | +| `matrix.Diagonal(a)` | . |`np.diag(a)` | returns a vector of the diagonal elements of 2D tensor, `a`. Goal returns a read / write view. | +| . | . |`np.diag(v, 0)` | returns a square diagonal matrix whose nonzero values are the elements of vector, v | +| `matrix.Trace(a)` | . |`np.trace(a)` | returns the sum of the elements along the diagonal of `a`. | +| `matrix.Tri()` | . |`np.tri()` | returns a new 2D Float64 matrix with 1s in the lower triangular region (including the diagonal) and the remaining upper triangular elements zero | +| `matrix.TriL(a)` | . |`np.tril(a)` | returns a copy of `a` with the lower triangular elements (including the diagonal) from `a` and the remaining upper triangular elements zeroed out | +| `matrix.TriU(a)` | . |`np.triu(a)` | returns a copy of `a` with the upper triangular elements (including the diagonal) from `a` and the remaining lower triangular elements zeroed out | +| . | .
|`linalg.inv(a)` | inverse of square 2D tensor a | +| . | . |`linalg.pinv(a)` | pseudo-inverse of 2D tensor a | +| . | . |`np.linalg.matrix_rank(a)` | matrix rank of a 2D tensor a | +| . | . |`linalg.solve(a, b)` if `a` is square; `linalg.lstsq(a, b)` otherwise | solution of `a x = b` for x | +| . | . |Solve `a.T x.T = b.T` instead | solution of `x a = b` for `x` | +| . | . |`U, S, Vh = linalg.svd(a); V = Vh.T` | singular value decomposition of a | +| . | . |`linalg.cholesky(a)` | Cholesky factorization of a 2D tensor | +| . | . |`D,V = linalg.eig(a)` | eigenvalues and eigenvectors of `a` | +| . | . |`D,V = linalg.eig(a, b)` | eigenvalues and eigenvectors of `a`, `b` | +| . | . |`D,V = eigs(a, k=3)` | find the k=3 largest eigenvalues and eigenvectors of 2D tensor, a | +| . | . |`Q,R = linalg.qr(a)` | QR decomposition | +| . | . |`P,L,U = linalg.lu(a)` where `a == P@L@U` | LU decomposition with partial pivoting (note: P(MATLAB) == transpose(P(NumPy))) | +| . | . |`x = linalg.lstsq(Z, y)` | perform a linear regression of the form `Z x = y` | + +### Statistics + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| . | `a.max()` or `max(a)` or `stats.Max(a)` | `a.max()` or `np.nanmax(a)` | maximum element of `a`, Goal always ignores `NaN` as missing data | +| . | . |`a.max(0)` | maximum element of each column of tensor `a` | +| . | . |`a.max(1)` | maximum element of each row of tensor `a` | +| . | . |`np.maximum(a, b)` | compares a and b element-wise, and returns the maximum value from each pair | +| `stats.L2Norm(a)` | . | `np.sqrt(v @ v)` or `np.linalg.norm(v)` | L2 norm of vector v | +| . | . |`cg` | conjugate gradients solver | + +### FFT and complex numbers + +todo: huge amount of work needed to support complex numbers throughout! + +| `tensor` Go | Goal | NumPy | Notes | +| ------------ | ----------- | ------ | ---------------- | +| . | . |`np.fft.fft(a)` | Fourier transform of `a` | +| . | . |`np.fft.ifft(a)` | inverse Fourier transform of `a` | +| . | . |`signal.resample(x, np.ceil(len(x)/q))` | downsample with low-pass filtering | + +### Tensorfs + +The [[tensorfs]] data filesystem provides a global filesystem-like workspace for storing tensor data, and [[Goal]] has special commands and functions to facilitate interacting with it. + +In an interactive `goal` shell, when you do `##` to switch into math mode, the prompt changes to show your current directory in the tensorfs, not the regular OS filesystem, and the final prompt character turns into a `#`. + +Use `get` and `set` (aliases for `tensorfs.Get` and `tensorfs.Set`) to retrieve and store data in the tensorfs: + +* `x := get("path/to/item")` retrieves the tensor data value at given path, which can then be used directly in an expression or saved to a new variable as in this example. + +* `set("path/to/item", x)` saves tensor data to given path, overwriting any existing value for that item if it already exists, and creating a new one if not. `x` can be any data expression. + +You can use the standard shell commands to navigate around the data filesystem: + +* `cd <dir>` to change the current working directory. By default, new variables created in the shell are also recorded into the current working directory for later access. + +* `ls [-l,r] [dir]` lists the contents of a directory; without arguments, it shows the current directory. The `-l` option shows each element on a separate line with its shape. `-r` does a recursive list through subdirectories.
+ +* `mkdir ` makes a new subdirectory. + diff --git a/docs/content/matrix.md b/docs/content/matrix.md new file mode 100644 index 00000000..beff5dad --- /dev/null +++ b/docs/content/matrix.md @@ -0,0 +1,55 @@ ++++ +Categories = ["Tensor"] ++++ + +**Matrix** provides standard 2D linear algebra functions on [[tensor]]s, using [gonum](https://github.com/gonum/gonum) functions for the implementations. + +Basic matrix multiplication: + +```Goal +## +a := linspace(1., 4., 4., true).reshape(2, 2) +v := [2., 3.] + +b := matrix.Mul(a, a) +c := matrix.Mul(a, v) +d := matrix.Mul(v, a) +## + +fmt.Println("a:", a) +fmt.Println("b:", b) +fmt.Println("c:", c) +fmt.Println("d:", d) +``` + +And other standard matrix operations: + +```Goal +## +a := linspace(1., 4., 4., true).reshape(2, 2) + +t := tensor.Transpose(a) +d := matrix.Det(a) +i := matrix.Inverse(a) +## + +fmt.Println("t:", t) +fmt.Println("d:", d) +fmt.Println("i:", i) +``` + +Including eigenvector functions: + + + +```Goal +## +a := [[2., 1.], [1., 2.]] + +v := matrix.EigSym(a) +## + +fmt.Println("a:", a) +fmt.Println("v:", v) +``` + diff --git a/docs/content/metric.md b/docs/content/metric.md new file mode 100644 index 00000000..3ddb559e --- /dev/null +++ b/docs/content/metric.md @@ -0,0 +1,62 @@ ++++ +Categories = ["Stats"] ++++ + +**Metric** computes distance metrics for comparing [[tensor]]s. The different metrics supported are: [[doc:stats/metric.Metrics]]. + +```Goal +## +x := rand(12) +y := rand(12) + +l2 := metric.L2Norm(x, y) +r := metric.Correlation(x, y) +## + +fmt.Println("x:", x) +fmt.Println("y:", y) +fmt.Println("l2:", l2) +fmt.Println("r:", r) +``` + +As with statistics, n-dimensional data is treated in a row-based manner, computing a metric value over the data across rows: + +```Goal +## +x := rand(12).reshape(3,4) +y := rand(12).reshape(3,4) + +l2 := metric.L2Norm(x, y) +r := metric.Correlation(x, y) +## + +fmt.Println("x:", x) +fmt.Println("y:", y) +fmt.Println("l2:", l2) +fmt.Println("r:", r) +``` + +To get a single value for each row representing the metric computed on the elements within that row, you need to iterate and slice: + + + + +```Goal +## +x := rand(12).reshape(3,4) +y := rand(12).reshape(3,4) +l2 := zeros(3) +## + +for i := range 3 { + ## + l2[i] = metric.L2Norm(x[i], y[i]) + ## +} + +fmt.Println("x:", x) +fmt.Println("y:", y) +fmt.Println("l2:", l2) + +``` + diff --git a/docs/content/plot-editor.md b/docs/content/plot-editor.md index f2606c3c..5f0c5aa4 100644 --- a/docs/content/plot-editor.md +++ b/docs/content/plot-editor.md @@ -1,5 +1,5 @@ +++ -Categories = ["Plots"] +Categories = ["Plot"] +++ A **plot editor** allows you to create data plots that users can customize interactively. diff --git a/docs/content/plot.md b/docs/content/plot.md index 9f8f21f7..4536a978 100644 --- a/docs/content/plot.md +++ b/docs/content/plot.md @@ -1,8 +1,4 @@ -+++ -Categories = ["Plots"] -+++ - -**Plots** allow you to graphically plot data. See [[plot editor]] for interactive customization of plots. +You can graphically **Plot** data using the [[doc:plots]] package. See [[plot editor]] for interactive customization of plots. You can plot a [[vector]]: @@ -31,3 +27,131 @@ plot.Styler(x, func(s *plot.Style) { }) plots.NewLine(plt, x) ``` + + + + + + +### Tensor metadata + +Styler functions can be attached directly to a `tensor.Tensor` via its metadata, and the `Plotter` elements will automatically grab these functions from any data source that has such metadata set. 
This allows the data generator to directly set default styling parameters, which can always be overridden later by adding more styler functions. Tying the plot styling directly to the source data keeps all of the relevant logic in one place, instead of spreading it across different parts of the code.

Here is an example of how this works:

```Goal
tx, ty := tensor.NewFloat64(21), tensor.NewFloat64(21)
for i := range 21 {
    tx.SetFloat1D(float64(i*5), i)
    ty.SetFloat1D(50.0+40*math.Sin((float64(i)/8)*math.Pi), i)
}
// attach stylers to the Y axis data: that is where plotter looks for it
plot.SetStyler(ty, func(s *plot.Style) {
    s.Plot.Title = "Test Line"
    s.Plot.XAxis.Label = "X Axis"
    s.Plot.YAxisLabel = "Y Axis"
    s.Plot.Scale = 2
    s.Plot.XAxis.Range.SetMax(105)
    s.Plot.SetLinesOn(plot.On).SetPointsOn(plot.On)
    s.Line.Color = colors.Uniform(colors.Red)
    s.Point.Color = colors.Uniform(colors.Blue)
    s.Range.SetMin(0).SetMax(100)
})

// somewhere else in the code:

plt := lab.NewPlot(b)
// NewLine automatically gets stylers from ty tensor metadata
plots.NewLine(plt, plot.Data{plot.X: tx, plot.Y: ty})
```

## Plot Types

The following are the built-in standard plot types in the `plots` package:

## 1D and 2D XY Data

### XY

`XY` is the workhorse standard Plotter, taking at least `X` and `Y` inputs, and plotting lines and / or points at each X, Y point.

Optionally, `Size` and / or `Color` inputs can be provided, which apply to the points. Thus, by using a `Point.Shape` of `Ring` or `Circle`, you can create a bubble plot by providing Size and Color data.

### Bar

`Bar` takes `Y` inputs, and draws bars of corresponding height.

An optional `High` input can be provided to also plot error bars above each bar.

To create a plot with multiple bars, multiple Bar Plotters are created, with `Style.Width` parameters that have a shared `Stride = 1 / number of bars` and an `Offset` that increments for each bar added. The `plots.NewBars` function handles this directly.

### ErrorBar

`XErrorBar` and `YErrorBar` take `X`, `Y`, `Low`, and `High` inputs, and draw an `I`-shaped error bar at the X, Y coordinate with the error "handles" around it.

### Labels

`Labels` takes `X`, `Y`, and `Labels` string inputs and plots labels at the given coordinates.

### Box

`Box` takes `X`, `Y` (median line), `U`, `V` (first and third quartile values), and `Low`, `High` (Min, Max) inputs, and renders a box plot with error bars.

### XFill, YFill

`XFill` and `YFill` are used to draw filled regions between pairs of X or Y points, using the `X`, `Y`, and `Low`, `High` values to specify the center point (X, Y) and the region below / left and above / right to fill around that central point.

XFill along with an XY line can be used to draw the equivalent of the [matplotlib fill_between](https://matplotlib.org/stable/plot_types/basic/fill_between.html#sphx-glr-plot-types-basic-fill-between-py) plot.

YFill can be used to draw the equivalent of the [matplotlib violin plot](https://matplotlib.org/stable/plot_types/stats/violin.html#sphx-glr-plot-types-stats-violin-py).

### Pie

`Pie` takes a list of `Y` values that are plotted as the size of segments of a circular pie plot. Y values are automatically normalized for plotting.

TODO: implement, details on mapping,

## 2D Grid-based

### ColorGrid

Input = Values and X, Y size

### Contour

??

### Vector

X,Y,U,V

Quiver?

## 3D

TODO: use math32 3D projection math so that each 3D point can be reduced to 2D. For anything you want to be able to use in SVG, it needs to ultimately be 2D, so it makes sense to support basic versions here, including XYZ (points, lines), Bar3D, and wireframe.

Could also have a separate plot3d package based on `xyz` that is true 3D, for interactive 3D plots of surfaces or things that don't make sense in this more limited 2D world.

## Statistical plots

The `statplot` package provides functions taking `tensor` data that produce statistical plots of the data, including Quartiles (Box with Median, Quartile, Min, Max), Histogram (Bar), Violin (YFill), Range (XFill), Cluster...

TODO: add a Data scatter that plots points to overlay on top of Violin or Box.

### LegendGroups

* implements current legend grouping logic -- ends up being a multi-table output -- not sure how to interface.

### Histogram

### Quartiles

### Violin

### Range

### Cluster

## Plot pages

diff --git a/docs/content/shell.md b/docs/content/shell.md
new file mode 100644
index 00000000..4cbb4efc
--- /dev/null
+++ b/docs/content/shell.md
@@ -0,0 +1,173 @@
+++
Categories = ["Goal"]
+++

In general, the [[Goal]] shell mode behavior mimics that of `bash`.

The following documentation describes specific use cases.

## Environment variables

* `set ` (space delimited as in all shell mode, no equals)

## Output redirection

* Standard output redirect: `>` and `>&` (and `|`, `|&` if needed)

## Control flow

* Any error stops the script execution, except for statements wrapped in `[ ]`, indicating an "optional" statement, e.g.:

```sh
cd some; [mkdir sub]; cd sub
```

* `&` at the end of a statement runs it in the background (as in bash) -- otherwise the shell waits until it completes before continuing.

* The `jobs`, `fg`, `bg`, and `kill` builtin commands function as they do in bash.

## Shell functions (aliases)

Use the `command` keyword to define new functions for Shell mode execution, which can then be used like any other command, for example:

```sh
command list {
    ls -la args...
}
```

```sh
cd data
list *.tsv
```

The `command` is transpiled into a Go function that takes `args ...string`. In the command function body, you can use the `args...` expression to pass all of the args, or `args[1]` etc to refer to specific positional indexes, as usual.

The command function name is registered so that the standard shell execution code can run the function, passing the args. You can also call it directly from Go code using the standard parentheses expression.

## Script Files and Makefile-like functionality

As with most scripting languages, a file of goal code can be made directly executable by adding a "shebang" expression at the start of the file:

```sh
#!/usr/bin/env goal
```

When executed this way, any additional args are available via an `args []any` variable, which can be passed to a command as follows:
```go
install {args...}
```
or by referring to specific arg indexes etc.
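
For example, a minimal executable goal script (a sketch; the `hello.goal` name and the echoed output are just for illustration) could iterate over the args like this:

```goal
#!/usr/bin/env goal

// args holds any additional command-line arguments
for i, a := range args {
    echo arg {i} = {a} // {} embeds Go values within shell code
}
```

Running `./hello.goal one two` (after making the file executable) would then print each argument in turn.
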
To make a script behave like a standard Makefile, you can define different `command`s for each of the make targets, and then add the following at the end of the file to use the args to run commands:

```go
goal.RunCommands(args)
```

See [make](cmd/goal/testdata/make) in `cmd/goal/testdata/make` for an example, which can be run like this:

```sh
./make build
```

Note that there is nothing special about the name `make` here, so this can be done with any file.

The `make` package defines a number of useful utility functions that accomplish the standard dependency and file timestamp checking functionality from the standard `make` command, as in the [magefile](https://magefile.org/dependencies/) system. Note that, compared to magefile, the goal direct shell command syntax makes the resulting make files much closer to a standard bash-like Makefile, while still having all the benefits of Go control flow and expressions.

TODO: implement and document above.

## SSH connections to remote hosts

Any number of active SSH connections can be maintained and used dynamically within a script, including simple ways of copying data among the different hosts (including the local host). The Go mode execution is always on the local host in one running process, and only the shell commands are executed remotely, enabling a unique ability to easily coordinate and distribute processing and data across various hosts.

Each host maintains its own working directory and environment variables, which can be configured and re-used by default whenever using a given host.

* `gossh hostname.org [name]` establishes a connection, using the given optional name to refer to this connection. If the name is not provided, a sequential number will be used, starting with 1, with 0 always referring to the local host.

* `@name` then refers to the given host in all subsequent commands, with `@0` referring to the local host where the goal script is running.

* You can use a variable name for the server, like this (the explicit `$ $` shell mode is required because a line starting with `{` is not recognized as shell code):
```Go
server := "@myserver"
${server} ls$
```

### Explicit per-command specification of host

```sh
@name cd subdir; ls
```

### Default host

```sh
@name // or:
gossh @name
```

uses the given host for all subsequent commands (unless explicitly specified), until the default is changed. Use `gossh @0` to return to localhost.

### Redirect input / output among hosts

The output of a remote host command can be sent to a file on the local host:
```sh
@name cat hostfile.tsv > @0:localfile.tsv
```
Note the use of the `:` colon delimiter after the host name here. TODO: You cannot send output to a remote host file (e.g., `> @host:remotefile.tsv`) -- maybe with sftp?

The output of any command can also be piped to a remote host as its standard input:
```sh
ls *.tsv | @host cat > files.txt
```

### scp to copy files easily

The builtin `scp` function allows easy copying of files across hosts, using the persistent connections established with `gossh` instead of creating new connections as in the standard scp command.

`scp` is _always_ run from the local host, with the remote host filename specified as `@name:remotefile`:

```sh
scp @name:hostfile.tsv localfile.tsv
```

Importantly, file wildcard globbing works as expected:
```sh
scp @name:*.tsv @0:data/
```

and entire directories can be copied, as in `cp -a` or `cp -r` (this behavior is automatic and does not require a flag).

### Close connections

```sh
gossh close
```

This will close all active connections and return the default host to `@0`. All active connections are also automatically closed when the shell terminates.

## Other Utilities

TODO: need a replacement for findnm -- very powerful but garbage..

## Rules for Go vs. Shell determination

These are the rules used to determine whether a line is Go vs. Shell (word = IDENT token):

* `$` at the start: Shell
* Within Shell, `{}`: Go
* Within Go, `$ $`: Shell
* Line starts with the `go` keyword: if no `( )` then Shell, else Go
* Line is one word: Shell
* Line starts with a `path` expression (e.g., `./myexec`): Shell
* Line starts with `"string"`: Shell
* Line starts with `word word`: Shell
* Line starts with `word {`: Shell
* Otherwise: Go

TODO: update above

## Multiple statements per line

* Multiple statements can be combined on one line, separated by `;` as in regular Go and shell languages. Critically, the language determination for the first statement determines the language for the remaining statements; you cannot intermix the two on one line when using `;`.

diff --git a/docs/content/stats.md b/docs/content/stats.md
new file mode 100644
index 00000000..dfc96a07
--- /dev/null
+++ b/docs/content/stats.md
@@ -0,0 +1,63 @@

In addition to the basic statistics functions described below, there are several packages for computing **statistics** on [[tensor]] and [[table]] data:

* [[metric]] computes similarity / distance metrics for comparing two tensors, and associated distance / similarity matrix functions.

* [[cluster]] implements agglomerative clustering of items based on metric distance / similarity matrix data.

* [[convolve]] convolves data (e.g., for smoothing).

* [[glm]] fits a general linear model for one or more dependent variables as a function of one or more independent variables. This encompasses all forms of regression.

* [[histogram]] bins data into groups and reports the frequency of elements in the bins.

## Stats

The standard statistics functions supported are enumerated in [[doc:stats/stats.Stats]], and include things like `Mean`, `Var`iance, etc.

```Goal
##
x := linspace(0., 12., 12., false)
d := x.reshape(3,2,2) // n-dimensional data is handled

mean := stats.Mean(x)
meand := stats.Mean(d)
##

fmt.Println("x:", x)
fmt.Println("mean:", mean)
fmt.Println("d:", d)
fmt.Println("mean d:", meand)
```

You can see that the stats on n-dimensional data are automatically computed across the _row_ (outermost) dimension. You can reshape your data and the results as needed to get the statistics you want.

## Grouping and stats

The `stats` package has functions that group values in a [[tensor]] or a [[table]] so that statistics can be computed across the groups.
The grouping uses [[tensorfs]] to organize the groups and statistics, as in the following example: + +```Goal +dt := table.New().SetNumRows(4) +dt.AddStringColumn("Name") +dt.AddFloat32Column("Value") +for i := range 4 { + gp := "A" + if i >= 2 { + gp = "B" + } + dt.Column("Name").SetStringRow(gp, i, 0) + dt.Column("Value").SetFloatRow(float64(i), i, 0) +} +dir, _ := tensorfs.NewDir("Group") +stats.TableGroups(dir, dt, "Name") +stats.TableGroupStats(dir, stats.StatMean, dt, "Value") +gdt := stats.GroupStatsAsTableNoStatName(dir) + +fmt.Println("dt:", dt) +fmt.Println("tensorfs listing:") +fmt.Println(dir.ListLong(true, 2)) + +fmt.Println("gdt:", gdt) +``` + +## Stats pages + diff --git a/docs/content/table.md b/docs/content/table.md new file mode 100644 index 00000000..bff9d765 --- /dev/null +++ b/docs/content/table.md @@ -0,0 +1,85 @@ +**table** provides a DataTable / DataFrame structure similar to [pandas](https://pandas.pydata.org/) and [xarray](http://xarray.pydata.org/en/stable/) in Python, and [Apache Arrow Table](https://github.com/apache/arrow/tree/master/go/arrow/array/table.go), using [[tensor]] n-dimensional columns aligned by common outermost row dimension. + +Data in the table is accessed by first getting the `Column` tensor (typically by name), and then using the [[doc:tensor.RowMajor]] methods to access data within that tensor in a row-wise manner: + +```Goal +dt := table.New() +dt.AddStringColumn("Name") +dt.AddFloat64Column("Data", 2, 2) +dt.SetNumRows(3) + +dt.Column("Name").SetStringRow("item0", 0, 0) +dt.Column("Name").SetStringRow("item1", 1, 0) +dt.Column("Name").SetStringRow("item2", 2, 0) + +dt.Column("Data").SetFloatRow(55, 0, 0) +dt.Column("Data").SetFloatRow(102, 1, 1) // note: last arg is 1D "cell" index +dt.Column("Data").SetFloatRow(37, 2, 3) + +val := dt.Column("Data").FloatRow(2, 3) + +fmt.Println(dt) +fmt.Printf("val: %v\n", val) +``` + +## Sorting and filtering + +The `Column` method creates a [[doc:tensor.Rows]] for the underlying column values, with a list of indexes used for the row-level access, which enables efficient sorting and filtering by row, as only these indexes need to be updated, not the underlying data values. The indexes are maintained on the table, which provides an indexed view onto the underlying data values that are stored in a separate [[doc:table.Columns]] structure. Thus, there can be multiple different such table views onto the same underlying columns data. + +```Goal +dt := table.New() +dt.AddStringColumn("Name") +dt.AddFloat64Column("Data") +dt.SetNumRows(3) + +fruits := []string{"peach", "apple", "orange"} + +for i := range 3 { + dt.Column("Name").SetStringRow(fruits[i], i, 0) + dt.Column("Data").SetFloatRow(float64(i+1), i, 0) +} + +dt.Sequential() +dt.SortColumn("Data", tensor.Descending) +fmt.Println(dt) + +dt.Sequential() +dt.Filter(func(dt *table.Table, row int) bool { + return dt.Column("Data").FloatRow(row, 0) > 1 +}) +fmt.Println(dt) +``` + +## CSV / TSV file format + +Tables can be saved and loaded from CSV (comma separated values) or TSV (tab separated values) files. See the next section for special formatting of header strings in these files to record the type and tensor cell shapes. + +### Type and Tensor Headers + +To capture the type and shape of the columns, we support the following header formatting. We weren't able to find any other widely supported standard (please let us know if there is one that we've missed!) 

Here is the mapping of special header prefix characters to standard types:
```go
'$': etensor.STRING,
'%': etensor.FLOAT32,
'#': etensor.FLOAT64,
'|': etensor.INT64,
'@': etensor.UINT8,
'^': etensor.BOOL,
```

Columns that have tensor cell shapes (not just scalars) are marked as such, with the *first* such column having a `<ndims:y,x..>` suffix indicating the shape of the *cells* in this column, e.g., `<2:5,4>` indicates a 2D cell Y=5,X=4. Each individual column is then indexed as `[ndims:x,y..]`, e.g., the first would be `[2:0,0]`, then `[2:0,1]`, etc.

### Example

Here's a TSV file for a scalar String column (`Name`), a 2D 1x4 tensor float32 column (`Input`), and a 2D 1x2 float32 `Output` column.

```
_H: $Name %Input[2:0,0]<2:1,4> %Input[2:0,1] %Input[2:0,2] %Input[2:0,3] %Output[2:0,0]<2:1,2> %Output[2:0,1]
_D: Event_0 1 0 0 0 1 0
_D: Event_1 0 1 0 0 1 0
_D: Event_2 0 0 1 0 0 1
_D: Event_3 0 0 0 1 0 1
```

diff --git a/docs/content/tabs.md b/docs/content/tabs.md
new file mode 100644
index 00000000..6a319c16
--- /dev/null
+++ b/docs/content/tabs.md
@@ -0,0 +1,6 @@
+++
Categories = ["Lab"]
+++

**Tabs** makes it easy to add various graphical views of data elements in a tabbed [[Browser]] view. Go API docs: [[doc:lab.Tabs]].

diff --git a/docs/content/tensor-math.md b/docs/content/tensor-math.md
new file mode 100644
index 00000000..8becb030
--- /dev/null
+++ b/docs/content/tensor-math.md
@@ -0,0 +1,73 @@
+++
Categories = ["Tensor"]
+++

The [[doc:tensor/tmath]] package implements most of the standard library [math](https://pkg.go.dev/math) functions, including basic arithmetic operations, for [[tensor]]s.

For example:

```Goal
x := tensor.NewFromValues(0., 1., 2., 3.)
add := tmath.Add(x, x)
sub := tmath.Sub(x, x)
mul := tmath.Mul(x, x)
div := tmath.Div(x, x)

fmt.Println("add:", add)
fmt.Println("sub:", sub)
fmt.Println("mul:", mul)
fmt.Println("div:", div)
```

As you can see, the operations are performed element-wise; see the [[matrix]] package for 2D matrix multiplication and related operations.

Standard math functions can also be applied element-wise:

```Goal
x := tensor.NewFromValues(0., 1., 2., 3.)
sin := tmath.Sin(x)
atan := tmath.Atan2(x, tensor.NewFromValues(3.0))
pow := tmath.Pow(x, tensor.NewFromValues(2.0))

fmt.Println("sin:", sin)
fmt.Println("atan:", atan)
fmt.Println("pow:", pow)
```

See the info below on [[#Alignment of shapes]] for the rules governing the way that different-shaped tensors are aligned for these computations.

Parallel goroutines are used to implement these computations when the tensors are sufficiently large to make it generally beneficial to do so.

There are also `*Out` versions of each function, which take an additional output tensor to store the results into, instead of creating a new one. For computationally intensive pipelines, it can be significantly more efficient to re-use pre-allocated outputs (which are automatically and efficiently resized to the proper capacity if needed).

## Alignment of shapes

The NumPy concept of [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) is critical for flexibly defining the semantics for how functions taking two n-dimensional Tensor arguments behave when they have different shapes. Ultimately, the computation operates by iterating over the length of the longest tensor, and the question is how to _align_ the shapes so that a meaningful computation results from this.

If both tensors are 1D and the same length, then a simple matched iteration over both can take place. However, the broadcasting logic defines what happens when there is a systematic relationship between the two, enabling powerful (but sometimes difficult to understand) computations to be specified.

The following examples demonstrate the logic.

Innermost dimensions that match in size are iterated over as you'd expect:
```
Image  (3d array): 256 x 256 x 3
Scale  (1d array):             3
Result (3d array): 256 x 256 x 3
```

Anything with a dimension size of 1 (a "singleton") will match against any other sized dimension:
```
A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5
```
In the innermost dimension here, the single value in A acts like a "scalar" in relationship to the 5 values in B along that same dimension, operating on each one in turn. Likewise for the singleton second-to-last dimension in B.

Any non-1 mismatch represents an error:
```
A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3  # second-from-last dimensions mismatched
```

The `AlignShapes` function performs this shape alignment logic, and the `WrapIndex1D` function is used to compute a 1D index into a given shape, based on the total output shape sizes, wrapping any singleton dimensions around as needed. These are used in the [tmath](tmath) package, for example, to implement the basic binary math operators.

diff --git a/docs/content/tensor.md b/docs/content/tensor.md
new file mode 100644
index 00000000..d12046cc
--- /dev/null
+++ b/docs/content/tensor.md
@@ -0,0 +1,261 @@

The **tensor.Tensor** represents n-dimensional data of various types, providing similar functionality to the widely used [NumPy](https://numpy.org/doc/stable/index.html) libraries in Python, and the commercial MATLAB framework.

The [[Goal]] [[math]] mode operates on tensor data exclusively: see the documentation there for convenient shortcut expressions for common tensor operations. This page documents the underlying Go language implementation of tensors. See [[doc:tensor]] for the Go API docs, [[tensor math]] for basic math operations that can be performed on tensors, and [[stats]] for statistics functions operating on tensor data.

A tensor can be constructed from a Go slice, and accessed using a 1D index into that slice:

```Goal
x := tensor.NewFromValues(0, 1, 2, 3)
val := x.Float1D(2)

fmt.Println(val)
```

Note that the type of the tensor is inferred from the values, using standard Go rules, so you would need to add a decimal to obtain floating-point numbers instead of `int`s:

```Goal
x := tensor.NewFromValues(0., 1., 2., 3.)
val := x.Float1D(2)

fmt.Printf("value: %v %T\n", val, val)
```

You can reshape the tensor by setting the number of values along any number of dimensions, preserving any values that are compatible with the new shape, and access values using n-dimensional indexes:

```Goal
x := tensor.NewFromValues(0, 1, 2, 3)
x.SetShapeSizes(2, 2)
val := x.Float(1, 0)

fmt.Println(val)
```

The dimensions are organized in _row major_ format (same as [NumPy](https://numpy.org/doc/stable/index.html)), so the number of rows comes first, then columns; the last dimension (i.e., columns in this case) is the _innermost_ dimension, so that the values within each row form a contiguous array in memory, while the values within a column are _not_ contiguous.
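
To make the row-major layout concrete, here is a quick sketch (using only the accessors shown above), where the 1D index of an element is `row*ncols + col`:

```Goal
x := tensor.NewFromValues(0., 1., 2., 3., 4., 5.)
x.SetShapeSizes(2, 3) // 2 rows, 3 columns, row-major

// row 1, column 0 lives at 1D index 1*3 + 0 = 3
fmt.Println(x.Float(1, 0)) // 3
fmt.Println(x.Float1D(3))  // same underlying value
```
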
+ +You can create a tensor with a specified shape, and fill it with a single value: + +```Goal +x := tensor.NewFloat32(2, 2) +tensor.SetAllFloat64(x, 1) + +fmt.Println(x) +``` + +Note the detailed formatting available from the standard stringer `String()` method on any tensor, providing the shape sizes on the first line, with dimensional indexes for the values. + +A given tensor can hold any standard Go value type, including `int`, `float32` and `float64`, and `string` values (using Go generics for the numerical types), and it provides accessor methods for the following "core" types: +* `Float` methods set and return `float64` values. +* `Int` methods set and return `int` values. +* `String` methods set and return `string` values. + +For example, you can directly get a `string` representation of any value: + +```Goal +x := tensor.NewFromValues(0, 1, 2, 3) +val := x.String1D(2) + +fmt.Println(val) +``` + +### Setting values + +To set a value, you typically use a type-specific method most appropriate for the underlying data type: + +```Goal +x := tensor.NewFloat32(2, 2) +tensor.SetAllFloat64(x, 1) + +x.SetFloat(3.14, 0, 1) // value comes first, then the appropriate number of indexes as varargs... + +fmt.Println(x) +``` + +There are also `Value`, `Value1D`, and `Set`, `Set1D` methods that use Generics to operate on the actual underlying data type: + +```Goal +x := tensor.NewFloat32(2, 2) +tensor.SetAllFloat64(x, 1) + +x.Set(3.1415, 0, 1) + +val := x.Value(0, 1) +v1d := x.Value1D(1) + +fmt.Println(x) +fmt.Printf("val: %v %T\n", val, val) +fmt.Printf("v1d: %v\n", v1d) +``` + +## Views and values + +The abstract [[doc:tensor.Tensor]] interface is implemented (and extended) by the concrete [[doc:tensor.Values]] types, which are what we've been getting in the above examples, and directly manage an underlying Go slice of values. These can be reshaped and appended to, like a Go slice. + +In addition, there are various _View_ types that wrap other tensors and provide more flexible ways of accessing the tensor values, and provide all of the same core functionality present in [NumPy](https://numpy.org/doc/stable/index.html). 
+ +### Sliced + +First, this is the starting Values tensor, as a 3x4 matrix: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +fmt.Println(x) +``` + +Using the [[doc:tensor.Reslice]] function, you can extract any subset from this 2D matrix, for example the values in a given row or column: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +row1 := tensor.Reslice(x, 1) // row is first index; column index is unspecified = all +col1 := tensor.Reslice(x, tensor.FullAxis, 1) // explicitly request all rows + +fmt.Println("row1:", row1) +fmt.Println("col1:", col1) +``` + +Note that the column values got turned into a 1D tensor in this process -- to keep it as a column vector (2D with 1 column and 3 rows), you need to add an extra "blank" dimension, which can be done using the `tensor.NewAxis` value: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +col1 := tensor.Reslice(x, tensor.FullAxis, 1, tensor.NewAxis) + +fmt.Println("col1:", col1) +``` + +You can also specify sub-ranges along each dimension, or even reorder the values, by using a [[doc:tensor.Slice]] element that has `Start`, `Stop` and `Step` values, like those of a standard Go `for` loop expression, with sensible default behavior for zero values: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +col1 := tensor.Reslice(x, tensor.Slice{Step: -1}, 1) + +fmt.Println("col1:", col1) +``` + +You can use `tensor.Ellipsis` to specify `FullAxis` for all the dimensions up to those specified, to flexibly focus on the innermost dimensions: + +```Goal +x := tensor.NewFloat64(3, 2, 2) +x.CopyFrom(tensor.NewIntRange(12)) + +last1 := tensor.Reslice(x, tensor.Ellipsis, 1) + +fmt.Println("x:", x) +fmt.Println("last1:", last1) +``` + +As in [NumPy](https://numpy.org/doc/stable/index.html) (and standard Go slices), the [[doc:tensor.Sliced]] view wraps the original source tensor, so that if you change a value in that original source, _the value automatically changes in the view_ as well. Use the `AsValues()` method on a view to get a new concrete [[doc:tensor.Values]] representation of the view (equivalent to the NumPy `copy` function). + +```Goal +x := tensor.NewFloat64(3, 2, 2) +x.CopyFrom(tensor.NewIntRange(12)) + +last1 := tensor.Reslice(x, tensor.Ellipsis, 1) + +fmt.Println("values:", x) +fmt.Println("last1:", last1) + +values := last1.AsValues() +x.Set(3.14, 1, 0, 1) + +fmt.Println("values:", x) +fmt.Println("last1:", last1) +fmt.Println("values:", values) +``` + +### Masked by booleans + +You can apply a boolean mask to a tensor, to extract arbitrary values where the boolean value is true: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +m := tensor.NewMasked(x).Filter(func(tsr tensor.Tensor, idx int) bool { + return tsr.Float1D(idx) >= 6 +}) +vals := m.AsValues() + +fmt.Println("masked: ", m) +fmt.Println("vals: ", vals) +``` + +Note that missing values are encoded as `NaN`, which allows the resulting [[doc:tensor.Masked]] view to retain the shape of the original, and all of the other math functions operating on tensors properly treat `NaN` as a missing value that is ignored. You can also get the concrete values as shown, but this reduces the shape to 1D by default. 
+ +### Indexes + +You can extract arbitrary values from a tensor using a list of indexes (as a tensor), where the shape of that list then determines the shape of the resulting view: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +ixs := tensor.NewIntFromValues( + 0, 1, + 0, 1, + 0, 2, + 0, 2, + 1, 1, + 1, 1, + 2, 2, + 2, 2) +ixs.SetShapeSizes(2,4,2) // note: last 2 is the number of indexes into source + +ix := tensor.NewIndexed(x, ixs) + +fmt.Println(ix) +``` + +You can also feed Masked indexes into the [[doc:tensor.Indexed]] view to get a reshaped view: + +```Goal +x := tensor.NewFloat64(3, 4) +x.CopyFrom(tensor.NewIntRange(12)) + +m := tensor.NewMasked(x).Filter(func(tsr tensor.Tensor, idx int) bool { + return tsr.Float1D(idx) >= 6 +}) +ixs := m.SourceIndexes(true) +ixs.SetShapeSizes(2,3,2) +ix := tensor.NewIndexed(x, ixs) + +fmt.Println("masked:", ix) +``` + +### Differences from NumPy + +[NumPy](https://numpy.org/doc/stable/index.html) is somewhat confusing with respect to the distinction between _basic indexing_ (using a single index or sliced ranges of indexes along each dimension) versus _advanced indexing_ (using an array of indexes or bools). Basic indexing returns a _view_ into the original data (where changes to the view directly affect the underlying type), while advanced indexing returns a _copy_. + +However, rather confusingly (per this [stack overflow question](https://stackoverflow.com/questions/15691740/does-assignment-with-advanced-indexing-copy-array-data)), you can do direct assignment through advanced indexing (more on this below): +```Python +a[np.array([1,2])] = 5 # or: +a[a > 0.5] = 1 # boolean advanced indexing +``` + +In the tensor package, all of the View types ([[doc:tensor.Sliced]], [[doc:tensor.Reshaped]], [[doc:tensor.Masked]], and [[doc:tensor.Indexed]]) are unambiguously wrappers around a source tensor, and their values change when the source changes. Use `.AsValues()` to break that connection and get the view as a new set of concrete values. + +### Row, Cell access + +The [[doc:tensor.RowMajor]] interface provides a convenient set of methods to access tensors where the first, outermost dimension is a row, and there may be multiple remaining dimensions after that. All concrete [[doc:tensor.Values]] tensors implement this interface. + +For example, you can easily get a `SubSpace` tensor that contains the values within a given row, and set values within a row tensor using a flat 1D "cell" index that applies to the values within a row: + +```Goal +x := tensor.NewFloat64(3, 2, 2) +x.CopyFrom(tensor.NewIntRange(12)) + +x.SetFloatRow(3.14, 1, 2) // set 1D cell 2 in row 1 +row1 := x.RowTensor(1) + +fmt.Println("values:", x) +fmt.Println("row1:", row1) +``` + +## Tensor pages + diff --git a/docs/content/tensorfs.md b/docs/content/tensorfs.md new file mode 100644 index 00000000..b0bfb03f --- /dev/null +++ b/docs/content/tensorfs.md @@ -0,0 +1,104 @@ +**tensorfs** provides a virtual filesystem for [[tensor]] data, which can be accessed for example in [[Goal]] [[math]] mode expressions, like the variable storage system in [IPython / Jupyter](https://ipython.readthedocs.io/en/stable/interactive/tutorial.html), with the advantage that the hierarchical structure of a filesystem allows data to be organized in more intuitive and effective ways. For example, data at different time scales can be put into different directories, or multiple different statistics computed on a given set of data can be put into a subdirectory. 
[[stats#Groups]] creates pivot-table style groups of values as directories, for example.

`tensorfs` implements the Go [fs](https://pkg.go.dev/io/fs) interface, and can be accessed using fs-general tools, including the cogent core `filetree` and the [[Goal]] shell.

There are two main APIs, one for direct usage within Go, and another that is used by the [[Goal]] framework for interactive shell-based access, which always operates relative to a current working directory.

## Go API

There are type-specific accessor methods for the standard high-frequency data types: `Float64`, `Float32`, `Int`, and `StringValue` (`String` is taken by the stringer interface):

```Goal
dir, _ := tensorfs.NewDir("root")
x := dir.Float64("data", 3, 3)

fmt.Println(dir.ListLong(true, 2))
fmt.Println(x)
```

These are wrappers around the underlying generic `Value` function:

```go
x := tensorfs.Value[float64](dir, "data", 3, 3)
```

These methods create the given tensor if it does not yet exist, and otherwise return it, providing a robust order-independent way of accessing / constructing the relevant data.

For efficiency, _there are no checks_ on the existing value relative to the arguments passed, so if you end up using the same name for two different things, that will cause problems that will hopefully become evident. If you want to ensure that the size is correct, you should use an explicit `tensor.SetShapeSizes` call, which is still quite efficient if the size is the same. You can also have an initial call to `Value` that has no size args, and then set the size later -- that works fine.

There are also a few other variants of the `Value` functionality:
* `Scalar` calls `Value` with a size of 1.
* `Values` makes multiple tensor values of the same shape, with a final variadic list of names.
* `ValueType` takes a `reflect.Kind` arg for the data type, which can then be a variable.
* `SetTensor` sets a tensor on a node of the given name, creating the node if needed. This is also available as the `Set` method on a directory node.

`DirTable` returns a [[table]] with all the tensors under a given directory node, which can then be used for making plots or doing other forms of data analysis. This works best when each tensor has the same outermost row dimension. The table is persistent and very efficient, using direct pointers to the underlying tensor values.

## Directories

A given [[doc:tensorfs.Node]] can either have a [[tensor]] value or be a _subdirectory_ containing a list of other node elements.

To make a new subdirectory:

```Goal
dir, _ := tensorfs.NewDir("root")
subdir := dir.Dir("sub")
x := subdir.Float64("data", 3, 3)

fmt.Println(dir.ListLong(true, 2))
fmt.Println(x)
```

If the subdirectory doesn't exist yet, it will be made; otherwise the existing one is returned. Any errors will be logged and a nil returned, likely causing a panic unless you expect it to fail and check for that.
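
For example (a sketch reusing the calls above), requesting the same directory and value names again returns the existing nodes rather than creating duplicates:

```Goal
dir, _ := tensorfs.NewDir("root")
a := dir.Dir("sub").Float64("data", 3, 3)
b := dir.Dir("sub").Float64("data", 3, 3) // returns the existing tensor

b.SetFloat1D(3.14, 0)
fmt.Println(a.Float1D(0)) // 3.14: a and b refer to the same tensor
```
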
## Operating over values across directories

The `ValuesFunc` method on a directory node allows you to easily extract a list of values across any number of subdirectories (it only returns the final value "leaves" of the filetree):

```Goal
dir, _ := tensorfs.NewDir("root")
subdir := dir.Dir("sub")
x := subdir.Float64("x", 3, 3)
subsub := subdir.Dir("stats")
y := subsub.Float64("y", 1)
z := subsub.Float64("z", 1)

fmt.Println(dir.ListLong(true, 2))

vals := dir.ValuesFunc(nil) // nil = get everything
for _, v := range vals {
    fmt.Println(v)
}
```

Thus, even if you have statistics or other data nested down deep, this will "flatten" the hierarchy and allow you to process it. Here's a version that actually filters the nodes:

```Goal
dir, _ := tensorfs.NewDir("root")
subdir := dir.Dir("sub")
x := subdir.Float64("x", 5, 5)
subsub := subdir.Dir("stats")
y := subsub.Float64("y", 1)
z := subsub.Float64("z", 1)

fmt.Println(dir.ListLong(true, 2))

vals := dir.ValuesFunc(func(n *tensorfs.Node) bool {
    if n.IsDir() { // can filter by dirs here too (get to see everything)
        return true
    }
    return n.Tensor.NumDims() == 1
})
for _, v := range vals {
    fmt.Println(v)
}
```

There are parallel `Node` and `Value` access methods for directory nodes, with the Value ones being (see the sketch below):

* `tsr := dir.Value("name")` returns the tensor directly; it panics if the name is not valid.
* `tsrs, err := dir.Values("name1", "name2")` returns a slice of tensor values within the directory by name. A plain `.Values()` call returns all values.
* `tsrs := dir.ValuesFunc(nil)` walks down directories (applying the optional filter function) and returns a flat list of all tensors found. It goes in "directory order" = the order in which nodes were added.
* `tsrs := dir.ValuesAlphaFunc(nil)` is like `ValuesFunc` but traverses in alphabetical order at each node.
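
Here is a sketch of the name-based accessors from the list above, building on the same directory layout as the previous examples:

```Goal
dir, _ := tensorfs.NewDir("root")
sub := dir.Dir("sub")
sub.Float64("x", 2)
sub.Float64("y", 2)

x := sub.Value("x") // panics if "x" is not present
tsrs, err := sub.Values("x", "y")
if err != nil {
    fmt.Println(err)
}

fmt.Println(x)
fmt.Println(len(tsrs)) // 2
```
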
+ + diff --git a/docs/docs.go b/docs/docs.go index aca96714..acfc8dea 100644 --- a/docs/docs.go +++ b/docs/docs.go @@ -9,6 +9,7 @@ import ( "cogentcore.org/core/content" "cogentcore.org/core/core" + "cogentcore.org/core/htmlcore" _ "cogentcore.org/lab/yaegilab" ) @@ -18,6 +19,8 @@ var econtent embed.FS func main() { b := core.NewBody("Cogent Lab") ct := content.NewContent(b).SetContent(econtent) + ctx := ct.Context + ctx.AddWikilinkHandler(htmlcore.GoDocWikilink("doc", "cogentcore.org/lab")) b.AddTopBar(func(bar *core.Frame) { core.NewToolbar(bar).Maker(ct.MakeToolbar) }) diff --git a/go.mod b/go.mod index 2b2471c1..3ee86809 100644 --- a/go.mod +++ b/go.mod @@ -8,9 +8,9 @@ go 1.23.4 // https://github.com/googleapis/go-genproto/issues/1015 require ( - cogentcore.org/core v0.3.12-0.20250622201146-a16b152763fe + cogentcore.org/core v0.3.12-0.20250629235109-951ce94de7ce github.com/cogentcore/readline v0.1.3 - github.com/cogentcore/yaegi v0.0.0-20240724064145-e32a03faad56 + github.com/cogentcore/yaegi v0.0.0-20250622201820-b7838bdd95eb github.com/mitchellh/go-homedir v1.1.0 github.com/nsf/termbox-go v1.1.1 github.com/stretchr/testify v1.10.0 diff --git a/go.sum b/go.sum index 3d720925..7060c1b0 100644 --- a/go.sum +++ b/go.sum @@ -1,5 +1,5 @@ -cogentcore.org/core v0.3.12-0.20250622201146-a16b152763fe h1:Q8xl3SzvUB/TLaGlXoXQSm+gljmjA7c+d5p8ubXO4DI= -cogentcore.org/core v0.3.12-0.20250622201146-a16b152763fe/go.mod h1:A82XMVcq3XOiG9TpT+rt7/iYD5Eu87bxxmTk8O7F4cM= +cogentcore.org/core v0.3.12-0.20250629235109-951ce94de7ce h1:sF2rNFNzzof1mW/ZkbxjEFgd5mZC52V4qgtbY4F80AA= +cogentcore.org/core v0.3.12-0.20250629235109-951ce94de7ce/go.mod h1:A82XMVcq3XOiG9TpT+rt7/iYD5Eu87bxxmTk8O7F4cM= github.com/Bios-Marcel/wastebasket/v2 v2.0.3 h1:TkoDPcSqluhLGE+EssHu7UGmLgUEkWg7kNyHyyJ3Q9g= github.com/Bios-Marcel/wastebasket/v2 v2.0.3/go.mod h1:769oPCv6eH7ugl90DYIsWwjZh4hgNmMS3Zuhe1bH6KU= github.com/BurntSushi/toml v0.3.1/go.mod h1:xHWCNGjB5oqiDr8zfno3MHue2Ht5sIBksp03qcyfWMU= @@ -28,8 +28,8 @@ github.com/cogentcore/readline v0.1.3 h1:tYmjP3XHvsGwhsDLkAp+vBhkERmLFENZfftyPOR github.com/cogentcore/readline v0.1.3/go.mod h1:IHVtJHSKXspK7CMg3OC/bbPEXxO++dFlug/vsPktvas= github.com/cogentcore/webgpu v0.23.0 h1:hrjnnuDZAPSRsqBjQAsJOyg2COGztIkBbxL87r0Q9KE= github.com/cogentcore/webgpu v0.23.0/go.mod h1:ciqaxChrmRRMU1SnI5OE12Cn3QWvOKO+e5nSy+N9S1o= -github.com/cogentcore/yaegi v0.0.0-20240724064145-e32a03faad56 h1:Fz1uHiFCHnijFcMXzn36KLamcx5q4pxoR5rKCrcXIcQ= -github.com/cogentcore/yaegi v0.0.0-20240724064145-e32a03faad56/go.mod h1:+MGpZ0srBmeJ7aaOLTdVss8WLolt0/y/plVHLpxgd3A= +github.com/cogentcore/yaegi v0.0.0-20250622201820-b7838bdd95eb h1:vXYqPLO36pRyyk1cVILVlk+slDI+Q7N4bgeWlh1sjA0= +github.com/cogentcore/yaegi v0.0.0-20250622201820-b7838bdd95eb/go.mod h1:+MGpZ0srBmeJ7aaOLTdVss8WLolt0/y/plVHLpxgd3A= github.com/coreos/etcd v3.3.10+incompatible/go.mod h1:uF7uidLiAD3TWHmW31ZFd/JWoc32PjwdhPthX9715RE= github.com/coreos/go-etcd v2.0.0+incompatible/go.mod h1:Jez6KQU2B/sWsbdaef3ED8NzMklzPG4d5KIOhIy30Tk= github.com/coreos/go-semver v0.2.0/go.mod h1:nnelYz7RCh+5ahJtPPxZlU+153eP4D4r3EedlOD2RNk= diff --git a/goal/README.md b/goal/README.md index ac3daa9d..411b6e7b 100644 --- a/goal/README.md +++ b/goal/README.md @@ -2,6 +2,10 @@ Goal is an augmented version of the Go language, which combines the best parts of Go, `bash`, and Python, to provide and integrated shell and numerical expression processing experience, which can be combined with the [yaegi](https://github.com/traefik/yaegi) interpreter to provide an interactive "REPL" (read, 
evaluate, print loop). +See the [Cogent Lab Docs](https://cogentcore.org/lab/goal) for full documentation. + +## Design discussion + Goal transpiles directly into Go, so it automatically leverages all the great features of Go, and remains fully compatible with it. The augmentation is designed to overcome some of the limitations of Go in specific domains: * Shell scripting, where you want to be able to directly call other executable programs with arguments, without having to navigate all the complexity of the standard [os.exec](https://pkg.go.dev/os/exec) package. @@ -52,392 +56,3 @@ The rationale and mnemonics for using `$` and `#` are as follows: * `#` is commonly used to refer to numbers. It is also often used as a comment syntax, but on balance the number semantics and uniqueness relative to Go syntax outweigh that issue. -# Examples - -Here are a few useful examples of Goal code: - -You can easily perform handy duration and data size formatting: - -```go -22010706 * time.Nanosecond // 22.010706ms -datasize.Size(44610930) // 42.5 MB -``` - -# Shell mode - -## Environment variables - -* `set ` (space delimited as in all shell mode, no equals) - -## Output redirction - -* Standard output redirect: `>` and `>&` (and `|`, `|&` if needed) - -## Control flow - -* Any error stops the script execution, except for statements wrapped in `[ ]`, indicating an "optional" statement, e.g.: - -```sh -cd some; [mkdir sub]; cd sub -``` - -* `&` at the end of a statement runs in the background (as in bash) -- otherwise it waits until it completes before it continues. - -* `jobs`, `fg`, `bg`, and `kill` builtin commands function as in usual bash. - -## Shell functions (aliases) - -Use the `command` keyword to define new functions for Shell mode execution, which can then be used like any other command, for example: - -```sh -command list { - ls -la args... -} -``` - -```sh -cd data -list *.tsv -``` - -The `command` is transpiled into a Go function that takes `args ...string`. In the command function body, you can use the `args...` expression to pass all of the args, or `args[1]` etc to refer to specific positional indexes, as usual. - -The command function name is registered so that the standard shell execution code can run the function, passing the args. You can also call it directly from Go code using the standard parentheses expression. - -## Script Files and Makefile-like functionality - -As with most scripting languages, a file of goal code can be made directly executable by appending a "shebang" expression at the start of the file: - -```sh -#!/usr/bin/env goal -``` - -When executed this way, any additional args are available via an `args []any` variable, which can be passed to a command as follows: -```go -install {args...} -``` -or by referring to specific arg indexes etc. - -To make a script behave like a standard Makefile, you can define different `command`s for each of the make commands, and then add the following at the end of the file to use the args to run commands: - -```go -goal.RunCommands(args) -``` - -See [make](cmd/goal/testdata/make) for an example, in `cmd/goal/testdata/make`, which can be run for example using: - -```sh -./make build -``` - -Note that there is nothing special about the name `make` here, so this can be done with any file. - -The `make` package defines a number of useful utility functions that accomplish the standard dependency and file timestamp checking functionality from the standard `make` command, as in the [magefile](https://magefile.org/dependencies/) system. 
Note that the goal direct shell command syntax makes the resulting make files much closer to a standard bash-like Makefile, while still having all the benefits of Go control and expressions, compared to magefile. - -TODO: implement and document above. - -## SSH connections to remote hosts - -Any number of active SSH connections can be maintained and used dynamically within a script, including simple ways of copying data among the different hosts (including the local host). The Go mode execution is always on the local host in one running process, and only the shell commands are executed remotely, enabling a unique ability to easily coordinate and distribute processing and data across various hosts. - -Each host maintains its own working directory and environment variables, which can be configured and re-used by default whenever using a given host. - -* `gossh hostname.org [name]` establishes a connection, using given optional name to refer to this connection. If the name is not provided, a sequential number will be used, starting with 1, with 0 referring always to the local host. - -* `@name` then refers to the given host in all subsequent commands, with `@0` referring to the local host where the goal script is running. - -* You can use a variable name for the server, like this (the explicit `$ $` shell mode is required because a line starting with `{` is not recognized as shell code): -```Go -server := "@myserver" -${server} ls$ -``` - -### Explicit per-command specification of host - -```sh -@name cd subdir; ls -``` - -### Default host - -```sh -@name // or: -gossh @name -``` - -uses the given host for all subsequent commands (unless explicitly specified), until the default is changed. Use `gossh @0` to return to localhost. - -### Redirect input / output among hosts - -The output of a remote host command can be sent to a file on the local host: -```sh -@name cat hostfile.tsv > @0:localfile.tsv -``` -Note the use of the `:` colon delimiter after the host name here. TODO: You cannot send output to a remote host file (e.g., `> @host:remotefile.tsv`) -- maybe with sftp? - -The output of any command can also be piped to a remote host as its standard input: -```sh -ls *.tsv | @host cat > files.txt -``` - -### scp to copy files easily - -The builtin `scp` function allows easy copying of files across hosts, using the persistent connections established with `gossh` instead of creating new connections as in the standard scp command. - -`scp` is _always_ run from the local host, with the remote host filename specified as `@name:remotefile` - -```sh -scp @name:hostfile.tsv localfile.tsv -``` - -Importantly, file wildcard globbing works as expected: -```sh -scp @name:*.tsv @0:data/ -``` - -and entire directories can be copied, as in `cp -a` or `cp -r` (this behavior is automatic and does not require a flag). - -### Close connections - -```sh -gossh close -``` - -Will close all active connections and return the default host to @0. All active connections are also automatically closed when the shell terminates. - -## Other Utilties - -** TODO: need a replacement for findnm -- very powerful but garbage.. - -## Rules for Go vs. Shell determination - -These are the rules used to determine whether a line is Go vs. Shell (word = IDENT token): - -* `$` at the start: Shell. 
-* Within Shell, `{}`: Go -* Within Go, `$ $`: Shell -* Line starts with `go` keyword: if no `( )` then Shell, else Go -* Line is one word: Shell -* Line starts with `path` expression (e.g., `./myexec`) : Shell -* Line starts with `"string"`: Shell -* Line starts with `word word`: Shell -* Line starts with `word {`: Shell -* Otherwise: Go - -TODO: update above - -## Multiple statements per line - -* Multiple statements can be combined on one line, separated by `;` as in regular Go and shell languages. Critically, the language determination for the first statement determines the language for the remaining statements; you cannot intermix the two on one line, when using `;` - -# Math mode - -The math mode in Goal is designed to be generally compatible with Python NumPy / SciPy syntax, so that the widespread experience with that syntax transfers well to Goal. This syntax is also largely compatible with MATLAB and other languages as well. However, we did not fully replicate the NumPy syntax, instead choosing to clean up a few things and generally increase consistency with Go. - -In general the Goal global functions are named the same as NumPy, without the `np.` prefix, which improves readability. It should be very straightforward to write a conversion utility that converts existing NumPy code into Goal code, and that is a better process than trying to make Goal itself perfectly compatible. - -All elements of a Goal math expression are [tensors](../tensor) (i.e., `tensor.Tensor`), which can represent everything from a scalar to an n-dimenstional tensor, with different _views_ that support the arbitrary slicing and flexible forms of indexing documented in the table below. These are called an `ndarray` in NumPy terms. See [array vs. tensor](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html#array-or-matrix-which-should-i-use) NumPy docs for more information. Note that Goal does not have a distinct `matrix` type; everything is a tensor, and when these are 2D, they function appropriately via the [matrix](../tensor/matrix) package. - -The _view_ versions of `Tensor` include `Sliced`, `Reshaped`, `Masked`, `Indexed`, and `Rows`, each of which wraps around another "source" `Tensor`, and provides its own way of accessing the underlying data: - -* `Sliced` has an arbitrary set of indexes for each dimension, so access to values along that dimension go through the indexes. Thus, you could reverse the order of the columns (dimension 1), or only operate on a subset of them. - -* `Masked` has a `tensor.Bool` tensor that filters access to the underlying source tensor through a mask: anywhere the bool value is `false`, the corresponding source value is not settable, and returns `NaN` (missing value) when accessed. - -* `Indexed` uses a tensor of indexes where the final, innermost dimension is the same size as the number of dimensions in the wrapped source tensor. The overall shape of this view is that of the remaining outer dimensions of the Indexes tensor, and like other views, assignment and return values are taken from the corresponding indexed value in the wrapped source tensor. - - The current NumPy version of indexed is rather complex and difficult for many people to understand, as articulated in this [NEP 21 proposal](https://numpy.org/neps/nep-0021-advanced-indexing.html). The `Indexed` view at least provides a simpler way of representing the indexes into the source tensor, instead of requiring multiple parallel 1D arrays. 
- -* `Rows` is an optimized version of `Sliced` with indexes only for the first, outermost, _row_ dimension. - -The following sections provide a full list of equivalents between the `tensor` Go code, Goal, NumPy, and MATLAB, based on the table in [numpy-for-matlab-users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html). -* The _same:_ in Goal means that the same NumPy syntax works in Goal, minus the `np.` prefix, and likewise for _or:_ (where Goal also has additional syntax). -* In the `tensor.Go` code, we sometimes just write a scalar number for simplicity, but these are actually `tensor.NewFloat64Scalar` etc. -* Goal also has support for `string` tensors, e.g., for labels, and operators such as addition that make sense for strings are supported. Otherwise, strings are automatically converted to numbers using the `tensor.Float` interface. If you have any doubt about whether you've got a `tensor.Float64` when you expect one, use `tensor.AsFloat64Tensor` which makes sure. - -## Tensor shape - -| `tensor` Go | Goal | NumPy | MATLAB | Notes | -| ------------ | ----------- | ------ | ------ | ---------------- | -| `a.NumDim()` | `ndim(a)` or `a.ndim` | `np.ndim(a)` or `a.ndim` | `ndims(a)` | number of dimensions of tensor `a` | -| `a.Len()` | `len(a)` or `a.len` or: | `np.size(a)` or `a.size` | `numel(a)` | number of elements of tensor `a` | -| `a.Shape().Sizes` | same: | `np.shape(a)` or `a.shape` | `size(a)` | "size" of each dimension in a; `shape` returns a 1D `int` tensor | -| `a.Shape().Sizes[1]` | same: | `a.shape[1]` | `size(a,2)` | the number of elements of the 2nd dimension of tensor `a` | -| `tensor.Reshape(a, 10, 2)` | same except no `a.shape = (10,2)`: | `a.reshape(10, 2)` or `np.reshape(a, 10, 2)` or `a.shape = (10,2)` | `reshape(a,10,2)` | set the shape of `a` to a new shape that has the same total number of values (len or size); No option to change order in Goal: always row major; Goal does _not_ support direct shape assignment version. | -| `tensor.Reshape(a, tensor.AsIntSlice(sh)...)` | same: | `a.reshape(10, sh)` or `np.reshape(a, sh)` | `reshape(a,sh)` | set shape based on list of dimension sizes in tensor `sh` | -| `tensor.Reshape(a, -1)` or `tensor.As1D(a)` | same: | `a.reshape(-1)` or `np.reshape(a, -1)` | `reshape(a,-1)` | a 1D vector view of `a`; Goal does not support `ravel`, which is nearly identical. | -| `tensor.Flatten(a)` | same: | `b = a.flatten()` | `b=a(:)` | returns a 1D copy of a | -| `b := tensor.Clone(a)` | `b := copy(a)` or: | `b = a.copy()` | `b=a` | direct assignment `b = a` in Goal or NumPy just makes variable b point to tensor a; `copy` is needed to generate new underlying values (MATLAB always makes a copy) | -| `tensor.Squeeze(a)` | same: |`a.squeeze()` | `squeeze(a)` | remove singleton dimensions of tensor `a`. 
-
-## Constructing
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `tensor.NewFloat64FromValues(` `[]float64{1, 2, 3})` | `[1., 2., 3.]` | `np.array([1., 2., 3.])` | `[ 1 2 3 ]` | define a 1D tensor |
-| | `[[1., 2., 3.], [4., 5., 6.]]` or: | `np.array([[1., 2., 3.], [4., 5., 6.]])` | `[ 1 2 3; 4 5 6 ]` | define a 2x3 2D tensor |
-| | `[[a, b], [c, d]]` or `block([[a, b], [c, d]])` | `np.block([[a, b], [c, d]])` | `[ a b; c d ]` | construct a matrix from blocks `a`, `b`, `c`, and `d` |
-| `tensor.NewFloat64(3,4)` | `zeros(3,4)` | `np.zeros((3, 4))` | `zeros(3,4)` | 3x4 2D tensor of float64 zeros; Goal does not use "tuple" so no double parens |
-| `tensor.NewFloat64(3,4,5)` | `zeros(3, 4, 5)` | `np.zeros((3, 4, 5))` | `zeros(3,4,5)` | 3x4x5 three-dimensional tensor of float64 zeros |
-| `tensor.NewFloat64Ones(3,4)` | `ones(3, 4)` | `np.ones((3, 4))` | `ones(3,4)` | 3x4 2D tensor of 64-bit floating point ones |
-| `tensor.NewFloat64Full(5.5, 3,4)` | `full(5.5, 3, 4)` | `np.full((3, 4), 5.5)` | ? | 3x4 2D tensor of 5.5; Goal's variadic arg structure requires the value to come first |
-| `tensor.NewFloat64Rand(3,4)` | `rand(3, 4)` or `slrand(c, fi, 3, 4)` | `rng.random((3, 4))` | `rand(3,4)` | 3x4 2D float64 tensor with uniform random 0..1 elements; `rand` uses the current Go `rand` source, while `slrand` uses the [gosl](../gpu/gosl/slrand) GPU-safe call with counter `c` and function index `fi` and key = index of element |
-| TODO: | |`np.concatenate((a,b),1)` or `np.hstack((a,b))` or `np.column_stack((a,b))` or `np.c_[a,b]` | `[a b]` | concatenate columns of a and b |
-| TODO: | |`np.concatenate((a,b))` or `np.vstack((a,b))` or `np.r_[a,b]` | `[a; b]` | concatenate rows of a and b |
-| TODO: | |`np.tile(a, (m, n))` | `repmat(a, m, n)` | create m by n copies of a |
-| TODO: | |`a[np.r_[:len(a),0]]` | `a([1:end 1],:)` | `a` with copy of the first row appended to the end |
-
-## Ranges and grids
-
-See the [NumPy](https://numpy.org/doc/stable/user/how-to-partition.html) docs for details.
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `tensor.NewIntRange(1, 11)` | same: |`np.arange(1., 11.)` or `np.r_[1.:11.]` or `np.r_[1:10:10j]` | `1:10` | create an increasing vector; `arange` in Goal is always ints; use `linspace` or `tensor.AsFloat64` for floats |
-| | same: |`np.arange(10.)` or `np.r_[:10.]` or `np.r_[:9:10j]` | `0:9` | create an increasing vector; 1 arg is the stop value in a slice |
-| | |`np.arange(1.,11.)` `[:, np.newaxis]` | `[1:10]'` | create a column vector |
-| `t.NewFloat64` `SpacedLinear(` `1, 3, 4, true)` | `linspace(1,3,4,true)` |`np.linspace(1,3,4)` | `linspace(1,3,4)` | 4 equally spaced samples between 1 and 3, inclusive of end (use `false` at end for exclusive) |
-| | |`np.mgrid[0:9.,0:6.]` or `np.meshgrid(r_[0:9.],` `r_[0:6.])` | `[x,y]=meshgrid(0:8,0:5)` | two 2D tensors: one of x values, the other of y values |
-| | |`ogrid[0:9.,0:6.]` or `np.ix_(np.r_[0:9.],` `np.r_[0:6.])` | | the best way to eval functions on a grid |
-| | |`np.meshgrid([1,2,4],` `[2,4,5])` | `[x,y]=meshgrid([1,2,4],[2,4,5])` | |
-| | |`np.ix_([1,2,4],` `[2,4,5])` | | the best way to eval functions on a grid |
-
-## Basic indexing
-
-See [NumPy basic indexing](https://numpy.org/doc/stable/user/basics.indexing.html#basic-indexing). Tensor Go uses the `Reslice` function for all cases (repeated `tensor.` prefix replaced with `t.` to take less space).
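For example, here is a minimal sketch of basic math-mode slicing in Goal, assuming `a` is an existing 2D tensor (these forms all appear in the table below):

```goal
# b := a[1, :]    // view of the entire second row of a
# c := a[:5]      // view of the first five rows
# a[:, 2] = 0.    // assign 0 to the third column
```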
Here you can clearly see the advantage of Goal: it allows significantly more succinct expressions for critical tensor functionality.
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `t.Reslice(a, 1, 4)` | same: |`a[1, 4]` | `a(2,5)` | access element in second row, fifth column in 2D tensor `a` |
-| `t.Reslice(a, -1)` | same: |`a[-1]` | `a(end)` | access last element |
-| `t.Reslice(a,` `1, t.FullAxis)` | same: |`a[1]` or `a[1, :]` | `a(2,:)` | entire second row of 2D tensor `a`; unspecified dimensions are equivalent to `:` (could omit second arg in Reslice too) |
-| `t.Reslice(a,` `Slice{Stop:5})` | same: |`a[0:5]` or `a[:5]` or `a[0:5, :]` | `a(1:5,:)` | 0..4 rows of `a`; uses same Go slice ranging here: (start:stop) where stop is _exclusive_ |
-| `t.Reslice(a,` `Slice{Start:-5})` | same: |`a[-5:]` | `a(end-4:end,:)` | last 5 rows of 2D tensor `a` |
-| `t.Reslice(a,` `t.NewAxis,` `Slice{Start:-5})` | same: |`a[newaxis, -5:]` | ? | last 5 rows of 2D tensor `a`, as a column vector |
-| `t.Reslice(a,` `Slice{Stop:3},` `Slice{Start:4, Stop:9})` | same: |`a[0:3, 4:9]` | `a(1:3,5:9)` | The first through third rows and fifth through ninth columns of a 2D tensor, `a`. |
-| `t.Reslice(a,` `Slice{Start:2,` `Stop:25,` `Step:2}, t.FullAxis)` | same: |`a[2:21:2,:]` | `a(3:2:21,:)` | every other row of `a`, starting with the third and going to the twenty-first |
-| `t.Reslice(a,` `Slice{Step:2},` `t.FullAxis)` | same: |`a[::2, :]` | `a(1:2:end,:)` | every other row of `a`, starting with the first |
-| `t.Reslice(a,` `Slice{Step:-1},` `t.FullAxis)` | same: |`a[::-1,:]` | `a(end:-1:1,:)` or `flipud(a)` | `a` with rows in reverse order |
-| `t.Clone(t.Reslice(a,` `1, t.FullAxis))` | `b = copy(a[1, :])` or: | `b = a[1, :].copy()` | `y=x(2,:)` | without the copy, `y` would point to a view of values in `x`; `copy` creates distinct values, in this case of _only_ the 2nd row of `x` -- i.e., it "concretizes" a given view into a literal, memory-contiguous set of values for that view. |
-| `tmath.Assign(` `t.Reslice(a,` `Slice{Stop:5}),` `t.NewIntScalar(2))` | same: |`a[:5] = 2` | `a(1:5,:) = 2` | assign the value 2 to 0..4 rows of `a` |
-| (you get the idea) | same: |`a[:5] = b[:, :5]` | `a(1:5,:) = b(:, 1:5)` | assign the values in the first 5 columns of `b` to the first 5 rows of `a` |
-
-## Boolean tensors and indexing
-
-See [NumPy boolean indexing](https://numpy.org/doc/stable/user/basics.indexing.html#boolean-array-indexing).
-
-Note that Goal only supports boolean logical operators (`&&` and `||`) on boolean tensors, not the single bitwise operators `&` and `|`.
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `tmath.Greater(a, 0.5)` | same: | `(a > 0.5)` | `(a > 0.5)` | `bool` tensor of shape `a` with elements `(v > 0.5)` |
-| `tmath.And(a, b)` | `a && b` | `logical_and(a,b)` | `a & b` | element-wise AND operator on `bool` tensors |
-| `tmath.Or(a, b)` | `a \|\| b` | `np.logical_or(a,b)` | `a \| b` | element-wise OR operator on `bool` tensors |
-| `tmath.Not(a)` | `!a` | `np.logical_not(a)` | `~a` | element-wise negation on `bool` tensors |
-| `tmath.Assign(` `tensor.Mask(a,` `tmath.Less(a, 0.5)),` `0)` | same: |`a[a < 0.5]=0` | `a(a<0.5)=0` | `a` with elements less than 0.5 zeroed out |
-| `tensor.Flatten(` `tensor.Mask(a,` `tmath.Less(a, 0.5)))` | same: |`a[a < 0.5].flatten()` | ? | a 1D list of the elements of `a` < 0.5 (as a copy, not a view) |
-| `tensor.Mul(a,` `tmath.Greater(a, 0.5))` | same: |`a * (a > 0.5)` | `a .* (a>0.5)` | `a` with elements less than 0.5 zeroed out |
-
-## Advanced index-based indexing
-
-See [NumPy integer indexing](https://numpy.org/doc/stable/user/basics.indexing.html#integer-array-indexing). Note that the current NumPy version of indexed is rather complex and difficult for many people to understand, as articulated in this [NEP 21 proposal](https://numpy.org/neps/nep-0021-advanced-indexing.html).
-
-**TODO:** not yet implemented:
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| | |`a[np.ix_([1, 3, 4], [0, 2])]` | `a([2,4,5],[1,3])` | rows 2, 4, and 5 and columns 1 and 3 |
-| | |`np.nonzero(a > 0.5)` | `find(a > 0.5)` | find the indices where (a > 0.5) |
-| | |`a[:, v.T > 0.5]` | `a(:,find(v>0.5))` | extract the columns of `a` where column vector `v` > 0.5 |
-| | |`a[:,np.nonzero(v > 0.5)[0]]` | `a(:,find(v > 0.5))` | extract the columns of `a` where vector `v` > 0.5 |
-| | |`a[:] = 3` | `a(:) = 3` | set all values to the same scalar value |
-| | |`np.sort(a)` or `a.sort(axis=0)` | `sort(a)` | sort each column of a 2D tensor, `a` |
-| | |`np.sort(a, axis=1)` or `a.sort(axis=1)` | `sort(a, 2)` | sort each row of 2D tensor, `a` |
-| | |`I = np.argsort(a[:, 0]); b = a[I,:]` | `[b,I]=sortrows(a,1)` | save the tensor `a` as tensor `b` with rows sorted by the first column |
-| | |`np.unique(a)` | `unique(a)` | a vector of unique values in tensor `a` |
-
-## Basic math operations (add, multiply, etc)
-
-In Goal and NumPy, the standard `+, -, *, /` operators perform _element-wise_ operations, because those are well-defined for all dimensionalities and are consistent across the different operators; matrix multiplication, by contrast, is specifically used in a 2D linear algebra context, and is not well defined for the other operators.
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `tmath.Add(a,b)` | same: |`a + b` | `a + b` | element-wise addition; Goal does this string-wise for string tensors |
-| `tmath.Mul(a,b)` | same: |`a * b` | `a .* b` | element-wise multiply |
-| `tmath.Div(a,b)` | same: |`a/b` | `a./b` | element-wise divide. _important:_ this always produces a floating point result. |
-| `tmath.Mod(a,b)` | same: |`a%b` | `mod(a,b)` | element-wise modulus (works for float and int) |
-| `tmath.Pow(a,3)` | same: | `a**3` | `a.^3` | element-wise exponentiation |
-| `tmath.Cos(a)` | same: | `cos(a)` | `cos(a)` | element-wise function application |
-
-## 2D Matrix Linear Algebra
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `matrix.Mul(a,b)` | same: |`a @ b` | `a * b` | matrix multiply |
-| `tensor.Transpose(a)` | <- or `a.T` |`a.transpose()` or `a.T` | `a.'` | transpose of `a` |
-| TODO: | |`a.conj().transpose()` or `a.conj().T` | `a'` | conjugate transpose of `a` |
-| `matrix.Det(a)` | `matrix.Det(a)` | `np.linalg.det(a)` | ? | determinant of `a` |
-| `matrix.Identity(3)` | <- |`np.eye(3)` | `eye(3)` | 3x3 identity matrix |
-| `matrix.Diagonal(a)` | <- |`np.diag(a)` | `diag(a)` | returns a vector of the diagonal elements of 2D tensor, `a`. Goal returns a read / write view. |
-| | |`np.diag(v, 0)` | `diag(v,0)` | returns a square diagonal matrix whose nonzero values are the elements of vector, v |
-| `matrix.Trace(a)` | <- |`np.trace(a)` | `trace(a)` | returns the sum of the elements along the diagonal of `a`. |
-| `matrix.Tri()` | <- |`np.tri()` | `tri()` | returns a new 2D Float64 matrix with 1s in the lower triangular region (including the diagonal) and the remaining upper triangular elements zero |
-| `matrix.TriL(a)` | <- |`np.tril(a)` | `tril(a)` | returns a copy of `a` with the lower triangular elements (including the diagonal) from `a` and the remaining upper triangular elements zeroed out |
-| `matrix.TriU(a)` | <- |`np.triu(a)` | `triu(a)` | returns a copy of `a` with the upper triangular elements (including the diagonal) from `a` and the remaining lower triangular elements zeroed out |
-| | |`linalg.inv(a)` | `inv(a)` | inverse of square 2D tensor a |
-| | |`linalg.pinv(a)` | `pinv(a)` | pseudo-inverse of 2D tensor a |
-| | |`np.linalg.matrix_rank(a)` | `rank(a)` | matrix rank of a 2D tensor a |
-| | |`linalg.solve(a, b)` if `a` is square; `linalg.lstsq(a, b)` otherwise | `a\b` | solution of `a x = b` for x |
-| | |Solve `a.T x.T = b.T` instead | `b/a` | solution of x a = b for x |
-| | |`U, S, Vh = linalg.svd(a); V = Vh.T` | `[U,S,V]=svd(a)` | singular value decomposition of a |
-| | |`linalg.cholesky(a)` | `chol(a)` | Cholesky factorization of a 2D tensor |
-| | |`D,V = linalg.eig(a)` | `[V,D]=eig(a)` | eigenvalues and eigenvectors of `a` |
-| | |`D,V = linalg.eig(a, b)` | `[V,D]=eig(a,b)` | eigenvalues and eigenvectors of `a`, `b` |
-| | |`D,V = eigs(a, k=3)` | `[V,D]=eigs(a,3)` | find the k=3 largest eigenvalues and eigenvectors of 2D tensor, a |
-| | |`Q,R = linalg.qr(a)` | `[Q,R]=qr(a,0)` | QR decomposition |
-| | |`P,L,U = linalg.lu(a)` where `a == P@L@U` | `[L,U,P]=lu(a)` where `a==P'*L*U` | LU decomposition with partial pivoting (note: P(MATLAB) == transpose(P(NumPy))) |
-| | |`x = linalg.lstsq(Z, y)` | `x = Z\y` | perform a linear regression of the form `Z x = y` |
-
-## Statistics
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| `stats.Max(a)` | `a.max()` or `max(a)` | `a.max()` or `np.nanmax(a)` | `max(max(a))` | maximum element of `a`; Goal always ignores `NaN` as missing data |
-| | |`a.max(0)` | `max(a)` | maximum element of each column of tensor `a` |
-| | |`a.max(1)` | `max(a,[],2)` | maximum element of each row of tensor `a` |
-| | |`np.maximum(a, b)` | `max(a,b)` | compares a and b element-wise, and returns the maximum value from each pair |
-| `stats.L2Norm(a)` | | `np.sqrt(v @ v)` or `np.linalg.norm(v)` | `norm(v)` | L2 norm of vector `v` |
-| | |`cg` | `conjgrad` | conjugate gradients solver |
-
-## FFT and complex numbers
-
-todo: huge amount of work needed to support complex numbers throughout!
-
-| `tensor` Go  | Goal        | NumPy  | MATLAB | Notes            |
-| ------------ | ----------- | ------ | ------ | ---------------- |
-| | |`np.fft.fft(a)` | `fft(a)` | Fourier transform of `a` |
-| | |`np.fft.ifft(a)` | `ifft(a)` | inverse Fourier transform of `a` |
-| | |`signal.resample(x, np.ceil(len(x)/q))` | `decimate(x, q)` | downsample with low-pass filtering |
-
-## tensorfs
-
-The [tensorfs](../tensor/tensorfs) data filesystem provides a global filesystem-like workspace for storing tensor data, and Goal has special commands and functions to facilitate interacting with it.
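For example, here is a minimal sketch of storing and retrieving data with the `get` and `set` functions described below (the `"sim/loss"` path and the `loss` variable are hypothetical):

```goal
set("sim/loss", loss)  // store tensor data at a tensorfs path
x := get("sim/loss")   // retrieve it later for use in expressions
```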
In an interactive `goal` shell, when you do `##` to switch into math mode, the prompt changes to show your current directory in the tensorfs, not the regular OS filesystem, and the final prompt character turns into a `#`.
-
-Use `get` and `set` (aliases for `tensorfs.Get` and `tensorfs.Set`) to retrieve and store data in the tensorfs:
-
-* `x := get("path/to/item")` retrieves the tensor data value at the given path, which can then be used directly in an expression or saved to a new variable, as in this example.
-
-* `set("path/to/item", x)` saves tensor data to the given path, overwriting any existing value for that item if it already exists, and creating a new one if not. `x` can be any data expression.
-
-You can use the standard shell commands to navigate around the data filesystem:
-
-* `cd <dir>` changes the current working directory. By default, new variables created in the shell are also recorded into the current working directory for later access.
-
-* `ls [-l,r] [dir]` lists the contents of a directory; without arguments, it shows the current directory. The `-l` option shows each element on a separate line with its shape. `-r` does a recursive list through subdirectories.
-
-* `mkdir <dir>` makes a new subdirectory.
-
-TODO: other commands, etc.
-
diff --git a/goal/TODO.md b/goal/TODO.md
index 6eefd395..0f65aadb 100644
--- a/goal/TODO.md
+++ b/goal/TODO.md
@@ -1,5 +1,10 @@
 This is a list of all the stuff that doesn't work in goal.
 
+## docs issues
+
+* for loop max in metric.md doesn't work..
+* matrix.EigSym return 2 values not working -- only 1
+
 ## converting between go / tensor
 
 * `for range` expression that deals with iterating over tensors, in math mode
diff --git a/goal/interpreter/interpreter.go b/goal/interpreter/interpreter.go
index f065467f..6a81cb6d 100644
--- a/goal/interpreter/interpreter.go
+++ b/goal/interpreter/interpreter.go
@@ -262,7 +262,7 @@ func (in *Interpreter) Interactive() error {
 		in.Goal.Errors = nil
 		v, hasPrint, err := in.Eval(line)
 		if err == nil && !hasPrint && v.IsValid() && !v.IsZero() && v.Kind() != reflect.Func {
-			fmt.Println(v.Interface())
+			in.Goal.Config.StdIO.Println(v.Interface())
 		}
 	}
 }
diff --git a/goal/transpile/math.go b/goal/transpile/math.go
index c9ae0c49..0079f1d3 100644
--- a/goal/transpile/math.go
+++ b/goal/transpile/math.go
@@ -368,7 +368,7 @@ func (mp *mathParse) expr(ex ast.Expr) {
 
 	case *ast.Ellipsis:
 		cfun := mp.funcs.Peek()
-		if cfun != nil && cfun.Name == "tensor.Reslice" {
+		if cfun != nil && cfun.Name == "tensor.AnySlice" {
 			mp.out.Add(token.IDENT, "tensor.Ellipsis")
 			mp.idx++
 		} else {
@@ -725,10 +725,7 @@ func (mp *mathParse) indexExpr(il *ast.IndexExpr) {
 
 func (mp *mathParse) basicSlicingExpr(il *ast.IndexExpr) {
 	iil := il.Index.(*ast.IndexListExpr)
-	fun := "tensor.Reslice"
-	if mp.exprsAreBool(iil.Indices) {
-		fun = "tensor.Mask"
-	}
+	fun := "tensor.AnySlice"
 	mp.startFunc(fun)
 	mp.out.Add(token.LPAREN)
 	mp.expr(il.X)
@@ -859,6 +856,7 @@ var numpyFuncs = map[string]funWrap{
 	"copy":    {"tensor.Clone", ""},
 	"get":     {"tensorfs.Get", ""},
 	"set":     {"tensorfs.Set", ""},
+	"setcp":   {"tensorfs.SetCopy", ""},
 	"flatten": {"tensor.Flatten", "nofun"},
 	"squeeze": {"tensor.Squeeze", "nofun"},
 }
diff --git a/goal/transpile/datafs.go b/goal/transpile/tensorfs.go
similarity index 100%
rename from goal/transpile/datafs.go
rename to goal/transpile/tensorfs.go
diff --git a/goal/transpile/transpile_test.go b/goal/transpile/transpile_test.go
index b06562c0..e64c5a46 100644
--- a/goal/transpile/transpile_test.go
+++ b/goal/transpile/transpile_test.go
@@ -302,32
+302,32 @@ func TestMath(t *testing.T) { {"# reshape(x, sh)", `tensor.Reshape(x, tensor.AsIntSlice(sh) ...)`}, {"# reshape(arange(36), 6, 6)", `tensor.Reshape(tensor.NewIntRange(36), 6, 6)`}, {"# a.reshape(6, 6)", `tensor.Reshape(a, 6, 6)`}, - {"# a[1, 2]", `tensor.Reslice(a, 1, 2)`}, - {"# a[:, 2]", `tensor.Reslice(a, tensor.FullAxis, 2)`}, - {"# a[1:3:1, 2]", `tensor.Reslice(a, tensor.Slice { Start:1, Stop:3, Step:1 } , 2)`}, - {"# a[::-1, 2]", `tensor.Reslice(a, tensor.Slice { Step:-1 } , 2)`}, - {"# a[:3, 2]", `tensor.Reslice(a, tensor.Slice { Stop:3 } , 2)`}, - {"# a[2:, 2]", `tensor.Reslice(a, tensor.Slice { Start:2 } , 2)`}, - {"# a[2:, 2, newaxis]", `tensor.Reslice(a, tensor.Slice { Start:2 } , 2, tensor.NewAxis)`}, - {"# a[..., 2:]", `tensor.Reslice(a, tensor.Ellipsis, tensor.Slice { Start:2 } )`}, - {"# a[:, 2] = b", `tmath.Assign(tensor.Reslice(a, tensor.FullAxis, 2), b)`}, - {"# a[:, 2] += b", `tmath.AddAssign(tensor.Reslice(a, tensor.FullAxis, 2), b)`}, + {"# a[1, 2]", `tensor.AnySlice(a, 1, 2)`}, + {"# a[:, 2]", `tensor.AnySlice(a, tensor.FullAxis, 2)`}, + {"# a[1:3:1, 2]", `tensor.AnySlice(a, tensor.Slice { Start:1, Stop:3, Step:1 } , 2)`}, + {"# a[::-1, 2]", `tensor.AnySlice(a, tensor.Slice { Step:-1 } , 2)`}, + {"# a[:3, 2]", `tensor.AnySlice(a, tensor.Slice { Stop:3 } , 2)`}, + {"# a[2:, 2]", `tensor.AnySlice(a, tensor.Slice { Start:2 } , 2)`}, + {"# a[2:, 2, newaxis]", `tensor.AnySlice(a, tensor.Slice { Start:2 } , 2, tensor.NewAxis)`}, + {"# a[..., 2:]", `tensor.AnySlice(a, tensor.Ellipsis, tensor.Slice { Start:2 } )`}, + {"# a[:, 2] = b", `tmath.Assign(tensor.AnySlice(a, tensor.FullAxis, 2), b)`}, + {"# a[:, 2] += b", `tmath.AddAssign(tensor.AnySlice(a, tensor.FullAxis, 2), b)`}, {"# cos(a)", `tmath.Cos(a)`}, {"# stats.Mean(a)", `stats.Mean(a)`}, {"# (stats.Mean(a))", `(stats.Mean(a))`}, {"# stats.Mean(reshape(a,36))", `stats.Mean(tensor.Reshape(a, 36))`}, - {"# z = a[1:5,1:5] - stats.Mean(ra)", `z = tmath.Sub(tensor.Reslice(a, tensor.Slice { Start:1, Stop:5 } , tensor.Slice { Start:1, Stop:5 } ), stats.Mean(ra))`}, + {"# z = a[1:5,1:5] - stats.Mean(ra)", `z = tmath.Sub(tensor.AnySlice(a, tensor.Slice { Start:1, Stop:5 } , tensor.Slice { Start:1, Stop:5 } ), stats.Mean(ra))`}, {"# metric.Matrix(metric.Cosine, a)", `metric.Matrix(metric.Cosine, a)`}, {"# a > 5", `tmath.Greater(a, tensor.NewIntScalar(5))`}, {"# !a", `tmath.Not(a)`}, - {"# a[a > 5]", `tensor.Mask(a, tmath.Greater(a, tensor.NewIntScalar(5)))`}, - {"# a[a > 5].flatten()", `tensor.Flatten(tensor.Mask(a, tmath.Greater(a, tensor.NewIntScalar(5))))`}, - {"# a[:3, 2].copy()", `tensor.Clone(tensor.Reslice(a, tensor.Slice { Stop:3 } , 2))`}, - {"# a[:3, 2].reshape(4,2)", `tensor.Reshape(tensor.Reslice(a, tensor.Slice { Stop:3 } , 2), 4, 2)`}, + {"# a[a > 5]", `tensor.AnySlice(a, tmath.Greater(a, tensor.NewIntScalar(5)))`}, + {"# a[a > 5].flatten()", `tensor.Flatten(tensor.AnySlice(a, tmath.Greater(a, tensor.NewIntScalar(5))))`}, + {"# a[:3, 2].copy()", `tensor.Clone(tensor.AnySlice(a, tensor.Slice { Stop:3 } , 2))`}, + {"# a[:3, 2].reshape(4,2)", `tensor.Reshape(tensor.AnySlice(a, tensor.Slice { Stop:3 } , 2), 4, 2)`}, {"# a > 5 || a < 1", `tmath.Or(tmath.Greater(a, tensor.NewIntScalar(5)), tmath.Less(a, tensor.NewIntScalar(1)))`}, {"# fmt.Println(a)", `fmt.Println(a)`}, {"# }", `}`}, - {"# if a[1,2] == 2 {", `if tmath.Equal(tensor.Reslice(a, 1, 2), tensor.NewIntScalar(2)).Bool1D(0) {`}, + {"# if a[1,2] == 2 {", `if tmath.Equal(tensor.AnySlice(a, 1, 2), tensor.NewIntScalar(2)).Bool1D(0) {`}, {"# for i := 0; i < 3; 
i++ {", `for i := tensor.Tensor(tensor.NewIntScalar(0)); tmath.Less(i, tensor.NewIntScalar(3)).Bool1D(0); tmath.Inc(i) {`}, {"# for i, v := range a {", `for i := 0; i < a.Len(); i++ { v := a .Float1D(i)`}, {`# x := get("item")`, `x := tensor.Tensor(tensorfs.Get("item"))`}, diff --git a/gosl/README.md b/gosl/README.md index 6ae25895..517b5bdf 100644 --- a/gosl/README.md +++ b/gosl/README.md @@ -2,257 +2,9 @@ `gosl` implements _Go as a shader language_ for GPU compute shaders (using [WebGPU](https://www.w3.org/TR/webgpu/)), **enabling standard Go code to run on the GPU**. -`gosl` converts Go code to WGSL which can then be loaded directly into a WebGPU compute shader, using the [gpu](../../gpu) GPU compute shader system. It operates within the overall [Goal](../README.md) framework of an augmented version of the Go language. See the [GPU](../GPU.md) documentation for an overview of issues in GPU computation. - -The relevant regions of Go code to be run on the GPU are tagged using the `//gosl:start` and `//gosl:end` comment directives, and this code must only use basic expressions and concrete types that will compile correctly in a GPU shader (see [Restrictions](#restrictions) below). Method functions and pass-by-reference pointer arguments to `struct` types are supported and incur no additional compute cost due to inlining (see notes below for more detail). +`gosl` converts Go code to WGSL which can then be loaded directly into a WebGPU compute shader, using the [core gpu](https://github.com/cogentcore/core/tree/main/gpu) compute shader system. It operates within the overall [Goal](../goal/README.md) framework of an augmented version of the Go language. See [examples/basic](examples/basic) and [rand](examples/rand) for complete working examples. -Typically, `gosl` is called from a go generate command, e.g., by including this comment directive: - -``` -//go:generate gosl -``` - -To install the `gosl` command: -```bash -$ go install cogentcore.org/lab/gosl/@latest -``` - -It is also strongly recommended to install the `naga` WGSL compiler from https://github.com/gfx-rs/wgpu and the `tint` compiler from https://dawn.googlesource.com/dawn/ Both of these are used if available to validate the generated GPU shader code. It is much faster to fix the issues at generation time rather than when trying to run the app later. Once code passes validation in both of these compilers, it should load fine in your app, and if the Go version runs correctly, there is a good chance of at least some reasonable behavior on the GPU. - -# Usage - -There are two key elements for GPU-enabled code: - -1. One or more [Kernel](#kernels) compute functions that take an _index_ argument and perform computations for that specific index of data, _in parallel_. **GPU computation is effectively just a parallel `for` loop**. On the GPU, each such kernel is implemented by its own separate compute shader code, and one of the main functions of `gosl` is to generate this code from the Go sources, in the automatically created `shaders/` directory. - -2. [Global variables](#global-variables) on which the kernel functions _exclusively_ operate: all relevant data must be specifically copied from the CPU to the GPU and back. As explained in the [GPU](../GPU.md) docs, each GPU compute shader is effectively a _standalone_ program operating on these global variables. To replicate this environment on the CPU, so the code works in both contexts, we need to make these variables global in the CPU (Go) environment as well. 
-
-`gosl` generates a file named `gosl.go` in your package directory that initializes the GPU with all of the global variables, along with functions for running the kernels and syncing the global variable data back and forth between the CPU and GPU.
-
-## Kernels
-
-Each distinct compute kernel must be tagged with a `//gosl:kernel` comment directive, as in this example (from `examples/basic`):
-```Go
-// Compute does the main computation.
-func Compute(i uint32) { //gosl:kernel
-	Params[0].IntegFromRaw(int(i))
-}
-```
-
-The kernel functions receive a `uint32` index argument, and use this to index into the global variables containing the relevant data. Typically the kernel code itself just calls other relevant function(s) using the index, as in the above example. Critically, _all_ of the data that a kernel function ultimately depends on must be contained within the global variables, and these variables must have been sync'd up to the GPU from the CPU prior to running the kernel (more on this below).
-
-In the CPU mode, the kernel is effectively run in a `for` loop like this:
-```Go
-	for i := range n {
-		Compute(uint32(i))
-	}
-```
-A parallel goroutine-based mechanism is actually used, but conceptually this is what it does, on both the CPU and the GPU. To reiterate: **GPU computation is effectively just a parallel for loop**.
-
-## Global variables
-
-The global variables on which the kernels operate are declared in the usual Go manner, as a single `var` block, which is marked at the top using the `//gosl:vars` comment directive:
-
-```Go
-//gosl:vars
-var (
-	// Params are the parameters for the computation.
-	//gosl:read-only
-	Params []ParamStruct
-
-	// Data is the data on which the computation operates.
-	// 2D: outer index is data, inner index is: Raw, Integ, Exp vars.
-	//gosl:dims 2
-	Data *tensor.Float32
-)
-```
-
-All such variables must be either:
-1. A `slice` of GPU-alignment compatible `struct` types, such as `ParamStruct` in the above example. In general such structs should be marked as `//gosl:read-only` due to various challenges associated with writing to structs, detailed below.
-2. A `tensor` of a GPU-compatible elemental data type (`float32`, `uint32`, or `int32`), with the number of dimensions indicated by the `//gosl:dims <n>` tag as shown above. This is the preferred type for writable data.
-
-You can also just declare a slice of elemental GPU-compatible data values such as `float32`, but it is generally preferable to use the tensor instead, because it has built-in support for higher-dimensional indexing in a way that is transparent between CPU and GPU.
-
-### Tensor data
-
-On the GPU, the tensor data is represented using a simple flat array of the basic data type. To index into this array, the _strides_ for each dimension are encoded in a special `TensorStrides` tensor that is managed by `gosl`, in the generated `gosl.go` file. `gosl` automatically generates the appropriate indexing code using these strides (which is why the number of dimensions is needed).
-
-Whenever the strides of any tensor variable change, and at least once at initialization, your code must call the function that copies the current strides up to the GPU:
-```Go
-	ToGPUTensorStrides()
-```
-
-### Multiple tensor variables for large data
-
-The size of each memory buffer is limited by the GPU, to a maximum of at most 4GB on modern GPU hardware. Therefore, if you need any single tensor that holds more than this amount of data, a bank of multiple variables is required.
`gosl` provides helper functions to make this relatively straightforward.
-
-TODO: this could be encoded in the TensorStrides. It will always be the outer-most index that determines when it gets over threshold, which all can be pre-computed.
-
-### Systems and Groups
-
-Each kernel belongs to a `gpu.ComputeSystem`, and each such system has one specific configuration of memory variables. In general, it is best to use a single set of global variables, and perform as much of the computation as possible on this set of variables, to minimize the number of memory transfers. However, if necessary, multiple systems can be defined, using an optional additional system name argument for the `args` and `kernel` tags.
-
-In addition, the vars can be organized into _groups_, which generally should have similar memory syncing behavior, as documented in the [gpu](../gpu) system.
-
-Here's an example with multiple groups:
-```Go
-//gosl:vars [system name]
-var (
-	// Layer-level parameters
-	//gosl:group -uniform Params
-	Layers []LayerParam // note: struct with appropriate memory alignment
-
-	// Path-level parameters
-	Paths []PathParam
-
-	// Unit state values
-	//gosl:group Units
-	Units tensor.Float32
-
-	// Synapse weight state values
-	Weights tensor.Float32
-)
-```
-
-## Memory syncing
-
-Each global variable gets an automatically-generated `*Var` enum (e.g., `DataVar` for the global variable named `Data`) that is used for the memory syncing functions, making it easy to specify any number of such variables to sync, which is by far the most efficient approach. All of this is in the generated `gosl.go` file. For example:
-
-```Go
-	ToGPU(ParamsVar, DataVar)
-```
-
-This specifies that the current contents of `Params` and `Data` are to be copied up to the GPU, which is guaranteed to complete by the time the next kernel run starts, within a given system.
-
-## Kernel running
-
-As with memory transfers, it is much more efficient to run multiple kernels in sequence, all operating on the current data variables, followed by a single sync of the updated global variable data that has been computed. Thus, there are separate functions for specifying the kernels to run, followed by a single "Done" function that actually submits the entire batch of kernels, along with memory sync commands to get the data back from the GPU. For example:
-
-```Go
-	RunCompute1(n)
-	RunCompute2(n)
-	...
-	RunDone(Data1Var, Data2Var) // launch all kernels and get data back to given vars
-```
-
-For CPU mode, `RunDone` is a no-op, and it just runs each kernel during each `Run` command.
-
-It is absolutely essential to understand that _all data must already be on the GPU_ at the start of the first Run command, and that any CPU-based computation between these calls is completely irrelevant for the GPU. Thus, it typically makes sense to have a sequence of Run commands grouped together into a logical unit, with the relevant `ToGPU` calls at the start, and the final `RunDone` grabbing everything of relevance back from the GPU.
-
-## GPU relevant code tagging
-
-In a large GPU-based application, you should organize your code as you normally would in any standard Go application, distributing it across different files and packages. The GPU-relevant parts of each of those files can be tagged with the gosl tags:
-```
-//gosl:start
-
-< Go code to be translated >
-
-//gosl:end
-```
-to make this code available to all of the shaders that are generated.
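For example, a hypothetical tagged region might look like the following (the `Integ` function here is purely illustrative, not part of the actual examples):

```Go
//gosl:start

// Integ performs one step of exponential integration toward a raw value.
// This code compiles both as regular Go and as WGSL for the GPU.
func Integ(integ, raw, dt float32) float32 {
	return integ + dt*(raw-integ)
}

//gosl:end
```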
- -Use the `//gosl:import "package/path"` directive to import GPU-relevant code from other packages, similar to the standard Go import directive. It is assumed that many other Go imports are not GPU relevant, so this separate directive is required. - -If any `enums` variables are defined, pass the `-gosl` flag to the `core generate` command to ensure that the `N` value is tagged with `//gosl:start` and `//gosl:end` tags. - -**IMPORTANT:** all `.go` and `.wgsl` files are removed from the `shaders` directory prior to processing to ensure everything there is current -- always specify a different source location for any custom `.wgsl` files that are included. - -# Command line usage - -``` -gosl [flags] -``` - -The flags are: -``` - -debug - enable debugging messages while running - -exclude string - comma-separated list of names of functions to exclude from exporting to WGSL (default "Update,Defaults") - -keep - keep temporary converted versions of the source files, for debugging - -out string - output directory for shader code, relative to where gosl is invoked -- must not be an empty string (default "shaders") -``` - -`gosl` always operates on the current directory, looking for all files with `//gosl:` tags, and accumulating all the `import` files that they include, etc. - -Any `struct` types encountered will be checked for 16-byte alignment of sub-types and overall sizes as an even multiple of 16 bytes (4 `float32` or `int32` values), which is the alignment used in WGSL and glsl shader languages, and the underlying GPU hardware presumably. Look for error messages on the output from the gosl run. This ensures that direct byte-wise copies of data between CPU and GPU will be successful. The fact that `gosl` operates directly on the original CPU-side Go code uniquely enables it to perform these alignment checks, which are otherwise a major source of difficult-to-diagnose bugs. - -# Restrictions - -In general shader code should be simple mathematical expressions and data types, with minimal control logic via `if`, `for` statements, and only using the subset of Go that is consistent with C. Here are specific restrictions: - -* Can only use `float32`, `[u]int32` for basic types (`int` is converted to `int32` automatically), and `struct` types composed of these same types -- no other Go types (i.e., `map`, slices, `string`, etc) are compatible. There are strict alignment restrictions on 16 byte (e.g., 4 `float32`'s) intervals that are enforced via the `alignsl` sub-package. - -* WGSL does _not_ support 64 bit float or int. - -* Use `slbool.Bool` instead of `bool` -- it defines a Go-friendly interface based on a `int32` basic type. - -* Alignment and padding of `struct` fields is key -- this is automatically checked by `gosl`. - -* WGSL does not support enum types, but standard go `const` declarations will be converted. Use an `int32` or `uint32` data type. It will automatically deal with the simple incrementing `iota` values, but not more complex cases. Also, for bitflags, define explicitly, not using `bitflags` package, and use `0x01`, `0x02`, `0x04` etc instead of `1<<2` -- in theory the latter should be ok but in practice it complains. - -* Cannot use multiple return values, or multiple assignment of variables in a single `=` expression. - -* *Can* use multiple variable names with the same type (e.g., `min, max float32`) -- this will be properly converted to the more redundant form with the type repeated, for WGSL. 
-
-* `switch` `case` statements are _purely_ self-contained -- no `fallthrough` allowed! It does support multiple items per `case`, however. Every `switch` _must_ have a `default` case.
-
-* WGSL does specify that new variables are initialized to 0, like Go, but also somehow discourages that use-case. It is safer to initialize directly:
-```Go
-	val := float32(0) // guaranteed 0 value
-	var val float32 // ok but generally avoid
-```
-
-* Use the automatically-generated `GetX` methods to get a local variable to a slice of structs:
-```Go
-	ctx := GetCtx(0)
-```
-This automatically does the right thing on GPU while returning a pointer to the indexed struct on CPU.
-
-* tensor variables can only be used in `storage` (not `uniform`) memory, due to restrictions on dynamic sizing and alignment. Aside from this constraint, it is possible to designate a group of variables to use uniform memory, with the `-uniform` argument as the first item in the `//gosl:group` comment directive.
-
-## Other language features
-
-* [tour-of-wgsl](https://google.github.io/tour-of-wgsl/types/pointers/passing_pointers/) is a good reference to explain things more directly than the spec.
-
-* `ptr` provides a pointer arg
-* `private` scope = within the shader code "module", i.e., one thread.
-* `function` = within the function, not outside it.
-* `workgroup` = shared across workgroup -- could be powerful (but slow!) -- need to learn more.
-
-## Atomic access
-
-WGSL adopts the Metal (lowest common denominator) strong constraint of imposing a _type_ level restriction on atomic operations: you can only do atomic operations on variables that have been declared atomic, as in:
-
-```
-var<storage, read_write> PathGBuf: array<atomic<i32>>;
-...
-atomicAdd(&PathGBuf[idx], val);
-```
-
-This also unfortunately has the side-effect that you cannot do _non-atomic_ operations on atomic variables, as discussed extensively here: https://github.com/gpuweb/gpuweb/issues/2377. Gosl automatically detects the use of atomic functions on GPU variables, and tags them as atomic.
-
-## Random numbers: slrand
-
-See [slrand](https://github.com/emer/gosl/v2/tree/main/slrand) for a shader-optimized random number generation package, which is supported by `gosl` -- it will convert `slrand` calls into appropriate WGSL named function calls. `gosl` will also copy the `slrand.wgsl` file, which contains the full source code for the RNG, into the destination `shaders` directory, so it can be included with a simple local path:
-
-```Go
-//gosl:wgsl mycode
-// #include "slrand.wgsl"
-//gosl:end mycode
-```
-
-# Performance
-
-With sufficiently large N, and ignoring the data copying setup time, a ~80x speedup is typical on a MacBook Pro with an M1 processor. The `rand` example produces a 175x speedup!
-
-# Implementation / Design Notes
-
-# Links
-
-Key docs for WGSL as compute shaders:
+See the [Cogent Lab Docs](https://cogentcore.org/lab/gosl) for full documentation.
 
diff --git a/lab/README.md b/lab/README.md
index cc8cf3b1..87459a74 100644
--- a/lab/README.md
+++ b/lab/README.md
@@ -2,15 +2,5 @@
 The lab package provides GUI elements for data exploration and visualization, and a simple `Browser` implementation that combines these elements.
 
-* `FileTree` (with `FileNode` elements), implementing a [filetree](https://github.com/cogentcore/core/tree/main/filetree) that has support for a [tensorfs](../tensorfs) filesystem, and data files in an actual filesystem. It has a `Tabber` pointer that handles the viewing actions on `tensorfs` elements (showing a Plot, etc).
-
-* `Tabber` interface and `Tabs` base implementation provide methods for showing data plots and editors in tabs.
-
-* `Terminal` running a `goal` shell that supports interactive commands operating on the `tensorfs` data etc. TODO!
-
-* `Browser` provides a hub structure connecting the above elements, which can be included in an actual GUI widget that also provides additional functionality / GUI elements.
-
-The basic `Browser` puts the `FileTree` in a left `Splits` and the `Tabs` on the right, and supports interactive exploration and visualization of data. See the [basic](examples/basic) example for a simple instance.
-
-In the [emergent](https://github.com/emer) framework, these elements are combined with other GUI elements to provide a full neural network simulation environment on top of the databrowser foundation.
+See the [Cogent Lab Docs](https://cogentcore.org/lab/lab) for full documentation.
 
diff --git a/matrix/README.md b/matrix/README.md
index c27c88a6..2f516208 100644
--- a/matrix/README.md
+++ b/matrix/README.md
@@ -2,6 +2,8 @@
 This package provides interfaces for `Tensor` types to the [gonum](https://github.com/gonum/gonum) functions for linear algebra, defined on the 2D `mat.Matrix` interface.
 
+See the [Cogent Lab Docs](https://cogentcore.org/lab/matrix) for full documentation.
+
 # TODO
 
 Add following functions here:
diff --git a/plot/README.md b/plot/README.md
index 2119b0a3..6928ee78 100644
--- a/plot/README.md
+++ b/plot/README.md
@@ -1,5 +1,9 @@
 # Plot
 
+See the [Cogent Lab Docs](https://cogentcore.org/lab/plot) for full documentation.
+
+## Design discussion
+
 The `plot` package generates 2D plots of data using the Cogent Core `paint` rendering system. The `plotcore` sub-package has Cogent Core Widgets that can be used in applications.
 * `Plot` is just a wrapper around a `plot.Plot`, for code-generated plots.
 * `Editor` is an interactive plot viewer that supports selection of which data to plot, and GUI configuration of plot parameters.
@@ -16,162 +20,10 @@ The GUI constraint requires a more systematic, factorial organization of the spa
 * Each `Plotter` element can generally handle multiple different data elements, that are index-aligned. For example, the basic `XY` plotter requires a `Y` Valuer, and typically an `X`, but indexes will be used if it is not present. It optionally uses `Size` or `Color` Valuers that apply to the Point elements. A `Bar` gets at least a `Y` but also optionally a `High` Valuer for an error bar. The `plot.Data` = `map[Roles]Valuer` is used to create new Plotter elements, allowing an unordered and explicit way of specifying the `Roles` of each `Valuer` item. Each Plotter also allows a single `Valuer` (i.e., Tensor) argument instead of the data, for a convenient minimal plot case. There are also shortcut methods for `NewXY` and `NewY`.
 
-Here is a minimal example for how a plotter XY Line element is created using Y data `yd`:
-
-```Go
-plt := plot.NewPlot()
-plots.NewLine(plt, yd)
-```
-
-And here's a more complex example setting the `plot.Data` map of roles to data:
-
-```Go
-plots.NewLine(plt, plot.Data{plot.X: xd, plot.Y: yd, plot.Low: low, plot.High: high})
-```
-
 The table-driven plotting case uses a `Group` name along with the `Roles` type (`X`, `Y` etc) and Plotter type names to organize different plots based on `Style` settings. Columns with the same Group name all provide data to the same plotter using their different Roles, making it easy to configure various statistical plots of multiple series of grouped data.
Different plotter types (including custom ones) are registered along with their accepted input roles, to allow any type of plot to be generated. -# Styling - -`plot.Style` contains the full set of styling parameters, which can be set using Styler functions that are attached to individual plot elements (e.g., lines, points etc) that drive the content of what is actually plotted (based on the `Plotter` interface). - -Each such plot element defines a `Styler` method, e.g.,: - -```Go -plt := plot.NewPlot() -ln := plots.NewLine(plt, data).Styler(func(s *plot.Style) { - s.Plot.Title = "My Plot" // overall Plot styles - s.Line.Color = colors.Uniform(colors.Red) // line-specific styles -}) -``` - -The `Plot` field (of type `PlotStyle`) contains all the properties that apply to the plot as a whole. Each element can set these values, and they are applied in the order the elements are added, so the last one gets final say. Typically you want to just set these plot-level styles on one element only and avoid any conflicts. - -The rest of the style properties (e.g., `Line`, `Point`) apply to the element in question. There are also some default plot-level settings in `Plot` that apply to all elements, and the plot-level styles are updated first, so in this way it is possible to have plot-wide settings applied from one styler, that affect all plots (e.g., the line width, and whether lines and / or points are plotted or not). - -## Tensor metadata - -Styler functions can be attached directly to a `tensor.Tensor` via its metadata, and the `Plotter` elements will automatically grab these functions from any data source that has such metadata set. This allows the data generator to directly set default styling parameters, which can always be overridden later by adding more styler functions. Tying the plot styling directly to the source data allows all of the relevant logic to be put in one place, instead of spreading this logic across different places in the code. - -Here is an example of how this works: - -```Go - tx, ty := tensor.NewFloat64(21), tensor.NewFloat64(21) - for i := range tx.DimSize(0) { - tx.SetFloat1D(float64(i*5), i) - ty.SetFloat1D(50.0+40*math.Sin((float64(i)/8)*math.Pi), i) - } - // attach stylers to the Y axis data: that is where plotter looks for it - plot.SetStyler(ty, func(s *plot.Style) { - s.Plot.Title = "Test Line" - s.Plot.XAxis.Label = "X Axis" - s.Plot.YAxisLabel = "Y Axis" - s.Plot.Scale = 2 - s.Plot.XAxis.Range.SetMax(105) - s.Plot.SetLinesOn(plot.On).SetPointsOn(plot.On) - s.Line.Color = colors.Uniform(colors.Red) - s.Point.Color = colors.Uniform(colors.Blue) - s.Range.SetMin(0).SetMax(100) - }) - - // somewhere else in the code: - - plt := plot.New() - // NewLine automatically gets stylers from ty tensor metadata - plots.NewLine(plt, plot.Data{plot.X: tx, plot.Y: ty}) - plt.Draw() -``` - -# Plot Types - -The following are the builtin standard plot types, in the `plots` package: - -## 1D and 2D XY Data - -### XY - -`XY` is the workhorse standard Plotter, taking at least `X` and `Y` inputs, and plotting lines and / or points at each X, Y point. - -Optionally `Size` and / or `Color` inputs can be provided, which apply to the points. Thus, by using a `Point.Shape` of `Ring` or `Circle`, you can create a bubble plot by providing Size and Color data. - -### Bar - -`Bar` takes `Y` inputs, and draws bars of corresponding height. - -An optional `High` input can be provided to also plot error bars above each bar. 
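For example, here is a sketch of a bar plot with error bars, assuming a `plots.NewBar` constructor parallel to the `plots.NewLine` calls shown above (`yd` and `errs` are index-aligned data):

```Go
plt := plot.NewPlot()
// Y provides the bar heights; High adds an error bar above each bar.
plots.NewBar(plt, plot.Data{plot.Y: yd, plot.High: errs})
```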
-
-To create a plot with multiple error bars, multiple Bar Plotters are created, with `Style.Width` parameters that have a shared `Stride = 1 / number of bars` and an `Offset` that increments for each bar added. The `plots.NewBars` function handles this directly.
-
-### ErrorBar
-
-`XErrorBar` and `YErrorBar` take `X`, `Y`, `Low`, and `High` inputs, and draw an `I` shaped error bar at the X, Y coordinate with the error "handles" around it.
-
-### Labels
-
-`Labels` takes `X`, `Y` and `Labels` string inputs and plots labels at the given coordinates.
-
-### Box
-
-`Box` takes `X`, `Y` (median line), `U`, `V` (box first and 3rd quartile values), and `Low`, `High` (Min, Max) inputs, and renders a box plot with error bars.
-
-### XFill, YFill
-
-`XFill` and `YFill` are used to draw filled regions between pairs of X or Y points, using the `X`, `Y`, and `Low`, `High` values to specify the center point (X, Y) and the region below / left and above / right to fill around that central point.
-
-XFill along with an XY line can be used to draw the equivalent of the [matplotlib fill_between](https://matplotlib.org/stable/plot_types/basic/fill_between.html#sphx-glr-plot-types-basic-fill-between-py) plot.
-
-YFill can be used to draw the equivalent of the [matplotlib violin plot](https://matplotlib.org/stable/plot_types/stats/violin.html#sphx-glr-plot-types-stats-violin-py).
-
-### Pie
-
-`Pie` takes a list of `Y` values that are plotted as the size of segments of a circular pie plot. Y values are automatically normalized for plotting.
-
-TODO: implement, details on mapping,
-
-## 2D Grid-based
-
-### ColorGrid
-
-Input = Values and X, Y size
-
-### Contour
-
-??
-
-### Vector
-
-X,Y,U,V
-
-Quiver?
-
-## 3D
-
-TODO: use math32 3D projection math and you can just take each 3d point and reduce to 2D. For stuff you want to actually be able to use in SVG, it needs to ultimately be 2D, so it makes sense to support basic versions here, including XYZ (points, lines), Bar3D, wireframe.
-
-Could also have a separate plot3d package based on `xyz` that is true 3D for interactive 3D plots of surfaces or things that don't make sense in this more limited 2D world.
-
-# Statistical plots
-
-The `statplot` package provides functions taking `tensor` data that produce statistical plots of the data, including Quartiles (Box with Median, Quartile, Min, Max), Histogram (Bar), Violin (YFill), Range (XFill), Cluster...
-
-TODO: add a Data scatter that plots points to overlay on top of Violin or Box.
-
-## LegendGroups
-
-* implements current legend grouping logic -- ends up being a multi-table output -- not sure how to interface.
-
-## Histogram
-
-## Quartiles
-
-## Violin
-
-## Range
-
-## Cluster
-
 # History
 
 The code is adapted from the [gonum plot](https://github.com/gonum/plot) package (which in turn was adapted from Google's [plotinum](https://code.google.com/archive/p/plotinum/)), to use the Cogent Core [styles](../styles) and [paint](../paint) rendering framework, which also supports SVG output of the rendering.
@@ -185,6 +37,8 @@ Here is the copyright notice for that package:
 # TODO
 
+* Min / Max not just for extending but also _limiting_ the range -- currently doesn't do
+
 * tensor index
 
 * Grid? in styling.
diff --git a/stats/README.md b/stats/README.md index 3c84a69d..74fb9bc1 100644 --- a/stats/README.md +++ b/stats/README.md @@ -2,10 +2,16 @@ There are several packages here for operating on [tensor](../), and [table](../table) data, for computing standard statistics and performing related computations, such as normalizing the data. +* [stats](stats) computes standard summary statistics (mean, standard deviation, etc). + +* [metric](metric) computes similarity / distance metrics for comparing two tensors, and associated distance / similarity matrix functions. + * [cluster](cluster) implements agglomerative clustering of items based on [metric](metric) distance / similarity matrix data. + * [convolve](convolve) convolves data (e.g., for smoothing). + * [glm](glm) fits a general linear model for one or more dependent variables as a function of one or more independent variables. This encompasses all forms of regression. + * [histogram](histogram) bins data into groups and reports the frequency of elements in the bins. -* [metric](metric) computes similarity / distance metrics for comparing two tensors, and associated distance / similarity matrix functions, including PCA and SVD analysis functions that operate on a covariance matrix. -* [stats](stats) provides a set of standard summary statistics on a range of different data types, including basic slices of floats, to tensor and table data. It also includes the ability to extract Groups of values and generate statistics for each group, as in a "pivot table" in a spreadsheet. + diff --git a/stats/metric/README.md b/stats/metric/README.md index 7d117418..466fe0c0 100644 --- a/stats/metric/README.md +++ b/stats/metric/README.md @@ -5,6 +5,8 @@ type MetricFunc func(a, b, out tensor.Tensor) error ``` +See the [Cogent Lab Docs](https://cogentcore.org/lab/metric) for full documentation. + The metric functions always operate on the outermost _row_ dimension, and it is up to the caller to reshape the tensors to accomplish the desired results. The two tensors must have the same shape. * To obtain a single summary metric across all values, use `tensor.As1D`. 
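As a usage sketch, the metric functions can also be called from Goal math mode; for example, the transpile tests in this same changeset show a row-wise similarity matrix computed as:

```goal
# rm := metric.Matrix(metric.Cosine, a)  // pairwise Cosine similarity between the rows of a
```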
diff --git a/stats/metric/enumgen.go b/stats/metric/enumgen.go index c9802a23..28cb8a13 100644 --- a/stats/metric/enumgen.go +++ b/stats/metric/enumgen.go @@ -13,7 +13,7 @@ const MetricsN Metrics = 13 var _MetricsValueMap = map[string]Metrics{`L2Norm`: 0, `SumSquares`: 1, `L1Norm`: 2, `Hamming`: 3, `L2NormBinTol`: 4, `SumSquaresBinTol`: 5, `InvCosine`: 6, `InvCorrelation`: 7, `CrossEntropy`: 8, `DotProduct`: 9, `Covariance`: 10, `Correlation`: 11, `Cosine`: 12} -var _MetricsDescMap = map[Metrics]string{0: `L2Norm is the square root of the sum of squares differences between tensor values, aka the L2 Norm.`, 1: `SumSquares is the sum of squares differences between tensor values.`, 2: `L1Norm is the sum of the absolute value of differences between tensor values, the L1 Norm.`, 3: `Hamming is the sum of 1s for every element that is different, i.e., "city block" distance.`, 4: `L2NormBinTol is the [L2Norm] square root of the sum of squares differences between tensor values, with binary tolerance: differences < 0.5 are thresholded to 0.`, 5: `SumSquaresBinTol is the [SumSquares] differences between tensor values, with binary tolerance: differences < 0.5 are thresholded to 0.`, 6: `InvCosine is 1-[Cosine], which is useful to convert it to an Increasing metric where more different vectors have larger metric values.`, 7: `InvCorrelation is 1-[Correlation], which is useful to convert it to an Increasing metric where more different vectors have larger metric values.`, 8: `CrossEntropy is a standard measure of the difference between two probabilty distributions, reflecting the additional entropy (uncertainty) associated with measuring probabilities under distribution b when in fact they come from distribution a. It is also the entropy of a plus the divergence between a from b, using Kullback-Leibler (KL) divergence. It is computed as: a * log(a/b) + (1-a) * log(1-a/1-b).`, 9: `DotProduct is the sum of the co-products of the tensor values.`, 10: `Covariance is co-variance between two vectors, i.e., the mean of the co-product of each vector element minus the mean of that vector: cov(A,B) = E[(A - E(A))(B - E(B))].`, 11: `Correlation is the standardized [Covariance] in the range (-1..1), computed as the mean of the co-product of each vector element minus the mean of that vector, normalized by the product of their standard deviations: cor(A,B) = E[(A - E(A))(B - E(B))] / sigma(A) sigma(B). Equivalent to the [Cosine] of mean-normalized vectors.`, 12: `Cosine is high-dimensional angle between two vectors, in range (-1..1) as the normalized [DotProduct]: inner product / sqrt(ssA * ssB). 
See also [Correlation].`} +var _MetricsDescMap = map[Metrics]string{0: `L2Norm is the square root of the sum of squares differences between tensor values, aka the Euclidean distance.`, 1: `SumSquares is the sum of squares differences between tensor values.`, 2: `L1Norm is the sum of the absolute value of differences between tensor values, the L1 Norm.`, 3: `Hamming is the sum of 1s for every element that is different, i.e., "city block" distance.`, 4: `L2NormBinTol is the [L2Norm] square root of the sum of squares differences between tensor values, with binary tolerance: differences < 0.5 are thresholded to 0.`, 5: `SumSquaresBinTol is the [SumSquares] differences between tensor values, with binary tolerance: differences < 0.5 are thresholded to 0.`, 6: `InvCosine is 1-[Cosine], which is useful to convert it to an Increasing metric where more different vectors have larger metric values.`, 7: `InvCorrelation is 1-[Correlation], which is useful to convert it to an Increasing metric where more different vectors have larger metric values.`, 8: `CrossEntropy is a standard measure of the difference between two probabilty distributions, reflecting the additional entropy (uncertainty) associated with measuring probabilities under distribution b when in fact they come from distribution a. It is also the entropy of a plus the divergence between a from b, using Kullback-Leibler (KL) divergence. It is computed as: a * log(a/b) + (1-a) * log(1-a/1-b).`, 9: `DotProduct is the sum of the co-products of the tensor values.`, 10: `Covariance is co-variance between two vectors, i.e., the mean of the co-product of each vector element minus the mean of that vector: cov(A,B) = E[(A - E(A))(B - E(B))].`, 11: `Correlation is the standardized [Covariance] in the range (-1..1), computed as the mean of the co-product of each vector element minus the mean of that vector, normalized by the product of their standard deviations: cor(A,B) = E[(A - E(A))(B - E(B))] / sigma(A) sigma(B). Equivalent to the [Cosine] of mean-normalized vectors.`, 12: `Cosine is high-dimensional angle between two vectors, in range (-1..1) as the normalized [DotProduct]: inner product / sqrt(ssA * ssB). See also [Correlation].`} var _MetricsMap = map[Metrics]string{0: `L2Norm`, 1: `SumSquares`, 2: `L1Norm`, 3: `Hamming`, 4: `L2NormBinTol`, 5: `SumSquaresBinTol`, 6: `InvCosine`, 7: `InvCorrelation`, 8: `CrossEntropy`, 9: `DotProduct`, 10: `Covariance`, 11: `Correlation`, 12: `Cosine`} diff --git a/stats/metric/metrics.go b/stats/metric/metrics.go index 8685f33c..b4b0a87f 100644 --- a/stats/metric/metrics.go +++ b/stats/metric/metrics.go @@ -32,7 +32,7 @@ type Metrics int32 //enums:enum -trim-prefix Metric const ( // L2Norm is the square root of the sum of squares differences - // between tensor values, aka the L2 Norm. + // between tensor values, aka the Euclidean distance. MetricL2Norm Metrics = iota // SumSquares is the sum of squares differences between tensor values. diff --git a/stats/stats/README.md b/stats/stats/README.md index 0e76fc64..74fe6a10 100644 --- a/stats/stats/README.md +++ b/stats/stats/README.md @@ -4,7 +4,9 @@ The `stats` package provides standard statistic computations operating on the `t ```Go type StatsFunc func(in, out tensor.Tensor) error ``` -n + +See the [Cogent Lab Docs](https://cogentcore.org/lab/stats) for full documentation. + The stats functions always operate on the outermost _row_ dimension, and it is up to the caller to reshape the tensor to accomplish the desired results. 
* To obtain a single summary statistic across all values, use `tensor.As1D`. diff --git a/table/README.md b/table/README.md index 55e3f152..de6742aa 100644 --- a/table/README.md +++ b/table/README.md @@ -1,9 +1,9 @@ # table -[![Go Reference](https://pkg.go.dev/badge/cogentcore.org/core/table.svg)](https://pkg.go.dev/cogentcore.org/core/table) - **table** provides a DataTable / DataFrame structure similar to [pandas](https://pandas.pydata.org/) and [xarray](http://xarray.pydata.org/en/stable/) in Python, and [Apache Arrow Table](https://github.com/apache/arrow/tree/master/go/arrow/array/table.go), using [tensor](../tensor) n-dimensional columns aligned by common outermost row dimension. +See the [Cogent Lab Docs](https://cogentcore.org/lab/table) for full documentation. + See [examples/dataproc](examples/dataproc) for a demo of how to use this system for data analysis, paralleling the example in [Python Data Science](https://jakevdp.github.io/PythonDataScienceHandbook/03.08-aggregation-and-grouping.html) using pandas, to see directly how that translates into this framework. Whereas an individual `Tensor` can only hold one data type, the `Table` allows coordinated storage and processing of heterogeneous data types, aligned by the outermost row dimension. The main `tensor` data processing functions are defined on the individual tensors (which are the universal computational element in the `tensor` system), but the coordinated row-wise indexing in the table is important for sorting or filtering a collection of data in the same way, and grouping data by a common set of "splits" for data analysis. Plotting is also driven by the table, with one column providing a shared X axis for the rest of the columns. @@ -20,97 +20,3 @@ There are also multi-column `Sort` and `Filter` methods on the Table itself. It is very low-cost to create a new View of an existing Table, via `NewView`, as they can share the underlying `Columns` data. -# Cheat Sheet - -`dt` is the Table pointer variable for examples below: - -## Table Access - -Column data access: - -```Go -// FloatRow is a method on the `tensor.Rows` returned from the `Column` method. -// This is the best method to use in general for generic 1D data access, -// as it works on any data from 1D on up (although it only samples the first value -// from higher dimensional data) . -val := dt.Column("Values").FloatRow(3) -``` - -```Go -dt.Column("Name").SetStringRow(4) -``` - -To access higher-dimensional "cell" level data using a simple 1D index into the cell patterns: - -```Go -// FloatRow is a method on the `tensor.Rows` returned from the `Column` method. -// This is the best method to use in general for generic 1D data access, -// as it works on any data from 1D on up (although it only samples the first value -// from higher dimensional data) . 
-val := dt.Column("Values").FloatRow(3, 2) -``` - -```Go -dt.Column("Name").SetStringRow("Alia", 4, 1) -``` - -todo: more - -## Sorting and Filtering - -## Splits ("pivot tables" etc), Aggregation - -Create a table of mean values of "Data" column grouped by unique entries in "Name" column, resulting table will be called "DataMean": - -```Go -byNm := split.GroupBy(ix, []string{"Name"}) // column name(s) to group by -split.Agg(byNm, "Data", agg.AggMean) // -gps := byNm.AggsToTable(etable.AddAggName) // etable.AddAggName or etable.ColNameOnly for naming cols -``` - -Describe (basic stats) all columns in a table: - -```Go -ix := etable.NewRows(et) // new view with all rows -desc := agg.DescAll(ix) // summary stats of all columns -// get value at given column name (from original table), row "Mean" -mean := desc.Float("ColNm", desc.RowsByString("Agg", "Mean", etable.Equals, etable.UseCase)[0]) -``` - -# CSV / TSV file format - -Tables can be saved and loaded from CSV (comma separated values) or TSV (tab separated values) files. See the next section for special formatting of header strings in these files to record the type and tensor cell shapes. - -## Type and Tensor Headers - -To capture the type and shape of the columns, we support the following header formatting. We weren't able to find any other widely supported standard (please let us know if there is one that we've missed!) - -Here is the mapping of special header prefix characters to standard types: -```Go -'$': etensor.STRING, -'%': etensor.FLOAT32, -'#': etensor.FLOAT64, -'|': etensor.INT64, -'@': etensor.UINT8, -'^': etensor.BOOL, -``` - -Columns that have tensor cell shapes (not just scalars) are marked as such with the *first* such column having a `` suffix indicating the shape of the *cells* in this column, e.g., `<2:5,4>` indicates a 2D cell Y=5,X=4. Each individual column is then indexed as `[ndims:x,y..]` e.g., the first would be `[2:0,0]`, then `[2:0,1]` etc. - -## Example - -Here's a TSV file for a scalar String column (`Name`), a 2D 1x4 tensor float32 column (`Input`), and a 2D 1x2 float32 `Output` column. - -``` -_H: $Name %Input[2:0,0]<2:1,4> %Input[2:0,1] %Input[2:0,2] %Input[2:0,3] %Output[2:0,0]<2:1,2> %Output[2:0,1] -_D: Event_0 1 0 0 0 1 0 -_D: Event_1 0 1 0 0 1 0 -_D: Event_2 0 0 1 0 0 1 -_D: Event_3 0 0 0 1 0 1 -``` - -## Logging one row at a time - - - - diff --git a/table/io.go b/table/io.go index 8903eb36..510de963 100644 --- a/table/io.go +++ b/table/io.go @@ -6,6 +6,7 @@ package table import ( "bufio" + "bytes" "encoding/csv" "fmt" "io" @@ -51,6 +52,13 @@ func (dt *Table) SaveCSV(filename fsx.Filename, delim tensor.Delims, headers boo return err } +// String returns a string of the CSV formatted file for the table. +func (dt *Table) String() string { + var b bytes.Buffer + dt.WriteCSV(&b, tensor.Tab, true) + return b.String() +} + // OpenCSV reads a table from a comma-separated-values (CSV) file // (where comma = any delimiter, specified in the delim arg), // using the Go standard encoding/csv reader conforming to the official CSV standard. diff --git a/tensor/README.md b/tensor/README.md index 9da1d044..f8fba8a2 100644 --- a/tensor/README.md +++ b/tensor/README.md @@ -4,6 +4,10 @@ Tensor and related sub-packages provide a simple yet powerful framework for repr The [Goal](../goal) augmented version of the _Go_ language directly supports NumPy-like operations on tensors. 
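For orientation, here is a minimal sketch of creating and accessing a tensor directly in Go, using the access methods documented in the examples in this README (the constructor and method names are taken from those examples; treat the details as illustrative):

```Go
// a 3x4 float64 tensor; the outermost dimension is the row dimension
x := tensor.NewFloat64(3, 4)
x.SetFloat(1.5, 2, 3) // set the value at row 2, column 3
v := x.Float(2, 3)    // read it back by n-dimensional index: 1.5
s := x.String1D(11)   // same element via flat 1D indexing, as a string
```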
A `Tensor` is comparable to the NumPy `ndarray` type, and it provides the universal representation of a homogenous data type throughout all the packages here, from scalar to vector, matrix and beyond. All functions take and return `Tensor` arguments. +See the [Cogent Lab Docs](https://cogentcore.org/lab/tensor) for full documentation. + +## Design discussion + The `Tensor` interface is implemented at the basic level with n-dimensional indexing into flat Go slices of any numeric data type (by `Number`), along with `String`, and `Bool` (which uses [bitslice](bitslice) for maximum efficiency). These implementations satisfy the `Values` sub-interface of Tensor, which supports the most direct and efficient operations on contiguous memory data. The `Shape` type provides all the n-dimensional indexing with arbitrary strides to allow any ordering, although _row major_ is the default and other orders have to be manually imposed. In addition, there are five important "view" implementations of `Tensor` that wrap another "source" Tensor to provide more flexible and efficient access to the data, consistent with the NumPy functionality. See [Basic and Advanced Indexing](#basic-and-advanced-indexing) below for more info. @@ -74,117 +78,6 @@ There are various standard shapes of tensor data that different functions expect The `SetNumRows` function can be used to progressively increase the number of rows to fit more data, as is typically the case when logging data (often using a [table](table)). You can set the row dimension to 0 to start -- that is (now) safe. However, for greatest efficiency, it is best to set the number of rows to the largest expected size first, and _then_ set it back to 0. The underlying slice of data retains its capacity when sized back down. During incremental increasing of the slice size, if it runs out of capacity, all the elements need to be copied, so it is more efficient to establish the capacity up front instead of having multiple incremental re-allocations. -# Cheat Sheet - -TODO: update - -`ix` is the `Rows` tensor for these examples: - -## Tensor Access - -### 1D - -```Go -// 5th element in tensor regardless of shape: -val := ix.Float1D(5) -``` - -```Go -// value as a string regardless of underlying data type; numbers converted to strings. -str := ix.String1D(2) -``` - -### 2D Row, Cell - -```Go -// value at row 3, cell 2 (flat index into entire `SubSpace` tensor for this row) -// The row index will be indirected through any `Indexes` present on the Rows view. -val := ix.FloatRow(3, 2) -// string value at row 2, cell 0. this is safe for 1D and 2D+ shapes -// and is a robust way to get 1D data from tensors of unknown shapes. -str := ix.FloatRow(2, 0) -``` - -```Go -// get the whole n-dimensional tensor of data cells at given row. -// row is indirected through indexes. -// the resulting tensor is a "subslice" view into the underlying data -// so changes to it will automatically update the parent tensor. -tsr := ix.RowTensor(4) -.... -// set all n-dimensional tensor values at given row from given tensor. -ix.SetRowTensor(tsr, 4) -``` - -```Go -// returns a flat, 1D Rows view into n-dimensional tensor values at -// given row. This is used in compute routines that operate generically -// on the entire row as a flat pattern. -ci := tensor.Cells1D(ix, 5) -``` - -### Full N-dimensional Indexes - -```Go -// for 3D data -val := ix.Float(3,2,1) -``` - -# `Tensor` vs. 
Python NumPy - -The [Goal](../goal) language provides a reasonably faithful translation of NumPy `ndarray` syntax into the corresponding Go tensor package implementations. For those already familiar with NumPy, it should mostly "just work", but the following provides a more in-depth explanation for how the two relate, and when you might get different results. - -## Basic and Advanced Indexing - -NumPy distinguishes between _basic indexing_ (using a single index or sliced ranges of indexes along each dimension) versus _advanced indexing_ (using an array of indexes or bools). Basic indexing returns a **view** into the original data (where changes to the view directly affect the underlying type), while advanced indexing returns a **copy**. - -However, rather confusingly (per this [stack overflow question](https://stackoverflow.com/questions/15691740/does-assignment-with-advanced-indexing-copy-array-data)), you can do direct assignment through advanced indexing (more on this below): -```Python -a[np.array([1,2])] = 5 # or: -a[a > 0.5] = 1 # boolean advanced indexing -``` - -Although powerful, the semantics of all of this is a bit confusing. In the `tensor` package, we provide what are hopefully more clear and concrete _view_ types that have well-defined semantics, and cover the relevant functionality, while perhaps being a bit easier to reason with. These were described at the start of this README. The correspondence to NumPy indexing is as follows: - -* Basic indexing by individual integer index coordinate values is supported by the `Number`, `String`, `Bool` `Values` Tensors. For example, `Float(3,1,2)` returns the value at the given coordinates. The `Sliced` (and `Rows`) and `Reshaped` views then complete the basic indexing with arbitrary reordering and filtering along entire dimension values, and reshaping dimensions. As noted above, `Reslice` supports the full NumPy basic indexing syntax, and `Reshape` implements the NumPy `reshape` function. - -* The `Masked` view corresponds to the NumPy _advanced_ indexing using a same-shape boolean mask, although in the NumPy case it makes a copy (although practically it is widely used for direct assignment as shown above.) Critically, you can always extract just the `true` values from a Masked view by using the `AsValues` method on the view, which returns a 1D tensor of those values, similar to what the boolean advanced indexing produces in NumPy. In addition, the `SourceIndexes` method returns a 1D list of indexes of the `true` (or `false`) values, which can be used for the `Indexed` view. - -* The `Indexed` view corresponds to the array-based advanced indexing case in NumPy, but again it is a view, not a copy, so the assignment semantics are as expected from a view (and how NumPy behaves some of the time). Note that the NumPy version uses `n` separate index tensors, where each such tensor specifies the value of a corresponding dimension index, and all such tensors _must have the same shape_; that form can be converted into the single Indexes form with a utility function. Also, NumPy advanced indexing has a somewhat confusing property where it de-duplicates index references during some operations, such that `a+=1` only increments +1 even when there are multiple elements in the view. The tensor version does not implement that special case, due to its direct view semantics. 
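To make the `Masked` view semantics discussed here concrete, a minimal sketch, assuming a float64 tensor `x` (and using the `Filter` method, which returns the view for chaining as of the `tensor/masked.go` change below):

```Go
// keep only the values > 0.5 in x, via a boolean Masked view
m := tensor.NewMasked(x) // the mask starts out all true
m.Filter(func(tsr tensor.Tensor, i int) bool {
	return tsr.Float1D(i) > 0.5
})
pos := m.AsValues() // 1D tensor of just the remaining (true) values
```

Because `Filter` now returns the `*Masked` view, the same thing can be written as a single chain: `tensor.NewMasked(x).Filter(...).AsValues()`.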
- -To reiterate, all view tensors have a `AsValues` function, equivalent to the `copy` function in NumPy, which turns the view into a corresponding basic concrete value Tensor, so the copy semantics of advanced indexing (modulo the direct assignment behavior) can be achieved when assigning to a new variable. - -## Alignment of shapes for computations ("broadcasting") - -The NumPy concept of [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) is critical for flexibly defining the semantics for how functions taking two n-dimensional Tensor arguments behave when they have different shapes. Ultimately, the computation operates by iterating over the length of the longest tensor, and the question is how to _align_ the shapes so that a meaningful computation results from this. - -If both tensors are 1D and the same length, then a simple matched iteration over both can take place. However, the broadcasting logic defines what happens when there is a systematic relationship between the two, enabling powerful (but sometimes difficult to understand) computations to be specified. - -The following examples demonstrate the logic: - -Innermost dimensions that match in dimension are iterated over as you'd expect: -``` -Image (3d array): 256 x 256 x 3 -Scale (1d array): 3 -Result (3d array): 256 x 256 x 3 -``` - -Anything with a dimension size of 1 (a "singleton") will match against any other sized dimension: -``` -A (4d array): 8 x 1 x 6 x 1 -B (3d array): 7 x 1 x 5 -Result (4d array): 8 x 7 x 6 x 5 -``` -In the innermost dimension here, the single value in A acts like a "scalar" in relationship to the 5 values in B along that same dimension, operating on each one in turn. Likewise for the singleton second-to-last dimension in B. - -Any non-1 mismatch represents an error: -``` -A (2d array): 2 x 1 -B (3d array): 8 x 4 x 3 # second from last dimensions mismatched -``` - -The `AlignShapes` function performs this shape alignment logic, and the `WrapIndex1D` function is used to compute a 1D index into a given shape, based on the total output shape sizes, wrapping any singleton dimensions around as needed. These are used in the [tmath](tmath) package for example to implement the basic binary math operators. - # Printing format The following are examples of tensor printing via the `Sprintf` function, which is used with default values for the `String()` stringer method on tensors. It does a 2D projection of higher-dimensional tensors, using the `Projection2D` set of functions, which assume a row-wise outermost dimension in general, and pack even sets of inner dimensions into 2D row x col shapes (see examples below). diff --git a/tensor/masked.go b/tensor/masked.go index 53d5caba..efa8ee79 100644 --- a/tensor/masked.go +++ b/tensor/masked.go @@ -253,11 +253,12 @@ func (ms *Masked) SetInt1D(val int, i int) { // Filter sets the mask values using given Filter function. // The filter function gets the 1D index into the source tensor. 
-func (ms *Masked) Filter(filterer func(tsr Tensor, idx int) bool) {
+func (ms *Masked) Filter(filterer func(tsr Tensor, idx int) bool) *Masked {
 	n := ms.Tensor.Len()
 	for i := range n {
 		ms.Mask.SetBool1D(filterer(ms.Tensor, i), i)
 	}
+	return ms
 }
 
 // check for interface impl
diff --git a/tensor/sliced.go b/tensor/sliced.go
index a5090951..5b8d46ea 100644
--- a/tensor/sliced.go
+++ b/tensor/sliced.go
@@ -51,6 +51,31 @@ func NewSliced(tsr Tensor, idxs ...[]int) *Sliced {
 	return sl
 }
 
+// AnySlice returns a new Tensor view of the given tensor, selecting
+// a [Masked], [Indexed], or [Sliced] view based on the types of the
+// given index variables:
+// - If a [Bool] tensor is provided then [NewMasked] is called.
+// - If a single tensor is provided with Len > max(1, tsr.NumDims()),
+// [NewIndexed] is called.
+// - Otherwise, [Reslice] is called with the args.
+func AnySlice(tsr Tensor, idx ...any) Tensor {
+	n := len(idx)
+	if n == 0 {
+		return tsr
+	}
+	if n == 1 {
+		if b, ok := idx[0].(*Bool); ok {
+			return NewMasked(tsr, b)
+		}
+		if i, ok := idx[0].(Tensor); ok {
+			if i.Len() > 1 && i.Len() > tsr.NumDims() {
+				return NewIndexed(tsr, AsInt(i))
+			}
+		}
+	}
+	return Reslice(tsr, idx...)
+}
+
 // Reslice returns a new [Sliced] (and potentially [Reshaped]) view of given tensor,
 // with given slice expressions for each dimension, which can be:
 // - an integer, indicating a specific index value along that dimension.
@@ -77,6 +102,20 @@ func Reslice(tsr Tensor, sls ...any) Tensor {
 	ci := 0
 	for d := range ns {
 		s := sls[d]
+		if st, ok := s.(Tensor); ok {
+			doReshape = true // doesn't add to new shape.
+			ni := st.Len()
+			for i := range ni {
+				ix := st.Int1D(i)
+				if ix < 0 {
+					ixs[ci] = []int{tsr.DimSize(ci) + ix}
+				} else {
+					ixs[ci] = []int{ix}
+				}
+				ci++
+			}
+			continue
+		}
 		switch x := s.(type) {
 		case int:
 			doReshape = true // doesn't add to new shape.
diff --git a/tensorfs/README.md b/tensorfs/README.md
index 50aa6c64..7a443bfa 100644
--- a/tensorfs/README.md
+++ b/tensorfs/README.md
@@ -2,6 +2,10 @@
 `tensorfs` is a virtual file system that implements the Go `fs` interface, and can be accessed using fs-general tools, including the cogent core `filetree` and the `goal` shell.
 
+See the [Cogent Lab Docs](https://cogentcore.org/lab/tensorfs) for full documentation.
+
+## Design discussion
+
 Values are represented using the [tensor] package universal data type: the `tensor.Tensor`, which can represent everything from a single scalar value up to n-dimensional collections of patterns, in a range of data types.
 
 A given `Node` in the file system is either:
@@ -10,73 +14,10 @@ A given `Node` in the file system is either:
 Each Node has a name which must be unique within the directory. The nodes in a directory are processed in the order of its ordered map list, which initially reflects the order added, and can be re-ordered as needed. An alphabetical sort is also available with the `Alpha` versions of methods, and is the default sort for standard FS operations.
 
-The hierarchical structure of a filesystem naturally supports various kinds of functions, such as various time scales of logging, with lower-level data aggregated into upper levels. Or hierarchical splits for a pivot-table effect.
-
-# Usage
-
-There are two main APIs, one for direct usage within Go, and another that is used by the [goal](../goal) framework for interactive shell-based access, which always operates relative to a current working directory.
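Returning to the `AnySlice` function added in `tensor/sliced.go` above, here is a hedged sketch of how its dispatch plays out for a 2D tensor `x` (the `NewIntFromValues` constructor and the `Slice` range struct are assumed from the package API, not part of this change):

```Go
mv := tensor.AnySlice(x, mask)                             // mask is a *tensor.Bool => NewMasked view
iv := tensor.AnySlice(x, tensor.NewIntFromValues(3, 7, 1)) // index tensor with Len > NumDims => NewIndexed view
sv := tensor.AnySlice(x, 2, tensor.Slice{Start: 1})        // anything else falls through to Reslice
```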
- -## Go API - -The primary Go access function is the generic `Value`: - -```Go -tsr := tensorfs.Value[float64](dir, "filename", 5, 5) -``` - -This returns a `tensor.Values` for the node `"filename"` in the directory Node `dir` with the tensor shape size of 5x5, and `float64` values. - -If the tensor was previously created, then it is returned, and otherwise it is created. This provides a robust single-function API for access and creation, and it doesn't return any errors, so the return value can used directly, in inline expressions etc. - -For efficiency, _there are no checks_ on the existing value relative to the arguments passed, so if you end up using the same name for two different things, that will cause problems that will hopefully become evident. If you want to ensure that the size is correct, you should use an explicit `tensor.SetShapeSizes` call, which is still quite efficient if the size is the same. You can also have an initial call to `Value` that has no size args, and then set the size later -- that works fine. - -There are also functions for high-frequency types, defined on the `Node`: `Float64`, `Float32`, `Int`, and `StringValue` (`String` is taken by `fmt.Stringer`, `StringValue` is used in `tensor`), e.g.,: - -```Go -tsr := dir.Float64("filename", 5, 5) -``` - -There are also a few other variants of the `Value` functionality: -* `Scalar` calls `Value` with a size of 1. -* `Values` makes multiple tensor values of the same shape, with a final variadic list of names. -* `ValueType` takes a `reflect.Kind` arg for the data type, which can then be a variable. -* `SetTensor` sets a tensor to a node of given name, creating the node if needed. This is also available as the `Set` method on a directory node. - -`DirTable` returns a `table.Table` with all the tensors under a given directory node, which can then be used for making plots or doing other forms of data analysis. This works best when each tensor has the same outer-most row dimension. The table is persistent and very efficient, using direct pointers to the underlying tensor values. - -### Directories - -Directories are `Node` elements that have a `nodes` value (ordered map of named nodes) instead of a tensor value. - -The primary way to make / access a subdirectory is the `Dir` method: -```Go -subdir := dir.Dir("subdir") -``` -If the subdirectory doesn't exist yet, it will be made, and otherwise it is returned. Any errors will be logged and a nil returned, likely causing a panic unless you expect it to fail and check for that. - -There are parallel `Node` and `Value` access methods for directory nodes, with the Value ones being: - -* `tsr := dir.Value("name")` returns tensor directly, will panic if not valid -* `tsrs, err := dir.Values("name1", "name2")` returns a slice of tensor values within directory by name. a plain `.Values()` returns all values. -* `tsrs := dir.ValuesFunc()` walks down directories (unless filtered) and returns a flat list of all tensors found. Goes in "directory order" = order nodes were added. -* `tsrs := dir.ValuesAlphaFunc()` is like `ValuesFunc` but traverses in alpha order at each node. - -### Existing items and unique names - -As in a real filesystem, names must be unique within each directory, which creates issues for how to manage conflicts between existing and new items. 
To make the overall framework maximally robust and eliminate the need for a controlled initialization-then-access ordering, we generally adopt the "Recycle" logic: +The uniqueness constraint of names within each directory creates issues for how to manage conflicts between existing and new items. To make the overall framework maximally robust and eliminate the need for a controlled initialization-then-access ordering, we generally adopt the "Recycle" logic: * _Return an existing item of the same name, or make a new one._ In addition, if you really need to know if there is an existing item, you can use the `Node` method to check for yourself -- it will return `nil` if no node of that name exists. Furthermore, the global `NewDir` function returns an `fs.ErrExist` error for existing items (e.g., use `errors.Is(fs.ErrExist)`), as used in various `os` package functions. -## `goal` Command API - -The following shell command style functions always operate relative to the global `CurDir` current directory and `CurRoot` root, and `goal` in math mode exposes these methods directly. Goal operates on tensor valued variables always. - -* `Chdir("subdir")` change current directory to subdir. -* `Mkdir("subdir")` make a new directory. -* `List()` print a list of nodes. -* `tsr := Get("mydata")` get tensor value at "mydata" node. -* `Set("mydata", tsr)` set tensor to "mydata" node. - diff --git a/tensorfs/commands.go b/tensorfs/commands.go index 93189b4a..97a5caa2 100644 --- a/tensorfs/commands.go +++ b/tensorfs/commands.go @@ -6,6 +6,7 @@ package tensorfs import ( "fmt" + "io" "io/fs" "path" "strings" @@ -21,6 +22,10 @@ var ( // CurRoot is the current root tensorfs system. // A default root tensorfs is created at startup. CurRoot *Node + + // ListOutput is where to send the output of List commands, + // if non-nil (otherwise os.Stdout). + ListOutput io.Writer ) func init() { @@ -93,7 +98,11 @@ func List(opts ...string) error { } } ls := dir.List(long, recursive) - fmt.Println(ls) + if ListOutput != nil { + fmt.Fprintln(ListOutput, ls) + } else { + fmt.Println(ls) + } return nil } @@ -156,3 +165,11 @@ func Set(name string, tsr tensor.Tensor) error { SetTensor(cd, tsr, name) return nil } + +// SetCopy sets tensor to given name or path relative to the +// current working directory. +// Unlike [Set], this version saves a [tensor.Clone] of the tensor, +// so future changes to the tensor do not affect this value. +func SetCopy(name string, tsr tensor.Tensor) error { + return Set(name, tensor.Clone(tsr)) +} diff --git a/tensorfs/fs.go b/tensorfs/fs.go index da095cd9..8c5d0f6f 100644 --- a/tensorfs/fs.go +++ b/tensorfs/fs.go @@ -40,10 +40,12 @@ func (nd *Node) Sub(dir string) (fs.FS, error) { if err := nd.mustDir("Sub", dir); err != nil { return nil, err } - if !fs.ValidPath(dir) { - return nil, &fs.PathError{Op: "Sub", Path: dir, Err: errors.New("invalid name")} - } - if dir == "." || dir == "" || dir == nd.name { + // todo: this does not allow .. expressions, so we can't use it: + // if !fs.ValidPath(dir) { + // return nil, &fs.PathError{Op: "Sub", Path: dir, Err: errors.New("invalid path")} + // } + if dir == "." || dir == "" || dir == nd.name { // todo: this last condition seems bad. + // need tests return nd, nil } cd := dir @@ -61,6 +63,14 @@ func (nd *Node) Sub(dir string) (fs.FS, error) { return cur, nil } cd = rest + if root == ".." 
{ + if cur.Parent != nil { + cur = cur.Parent + continue + } else { + return nil, &fs.PathError{Op: "Sub", Path: dir, Err: errors.New("already at root")} + } + } sd, ok := cur.nodes.AtTry(root) if !ok { return nil, &fs.PathError{Op: "Sub", Path: dir, Err: errors.New("directory not found")} diff --git a/tensorfs/list.go b/tensorfs/list.go index cb15736d..b6f2ce38 100644 --- a/tensorfs/list.go +++ b/tensorfs/list.go @@ -22,7 +22,11 @@ const ( func (nd *Node) String() string { if !nd.IsDir() { - return nd.Tensor.Label() + lb := nd.Tensor.Label() + if !strings.HasPrefix(lb, nd.name) { + lb = nd.name + " " + lb + } + return lb } return nd.List(Short, DirOnly) } diff --git a/yaegilab/labsymbols/cogentcore_org-lab-lab.go b/yaegilab/labsymbols/cogentcore_org-lab-lab.go index c756d2a3..d80bbae0 100644 --- a/yaegilab/labsymbols/cogentcore_org-lab-lab.go +++ b/yaegilab/labsymbols/cogentcore_org-lab-lab.go @@ -12,6 +12,7 @@ func init() { Symbols["cogentcore.org/lab/lab/lab"] = map[string]reflect.Value{ // function, constant and variable definitions "AsDataTree": reflect.ValueOf(lab.AsDataTree), + "DirAndFileNoSlash": reflect.ValueOf(lab.DirAndFileNoSlash), "FirstComment": reflect.ValueOf(lab.FirstComment), "IsTableFile": reflect.ValueOf(lab.IsTableFile), "Lab": reflect.ValueOf(&lab.Lab).Elem(), diff --git a/yaegilab/labsymbols/cogentcore_org-lab-plot.go b/yaegilab/labsymbols/cogentcore_org-lab-plot.go index b3c51473..74fee429 100644 --- a/yaegilab/labsymbols/cogentcore_org-lab-plot.go +++ b/yaegilab/labsymbols/cogentcore_org-lab-plot.go @@ -59,6 +59,7 @@ func init() { "On": reflect.ValueOf(plot.On), "PlotX": reflect.ValueOf(plot.PlotX), "PlotY": reflect.ValueOf(plot.PlotY), + "PlotYR": reflect.ValueOf(plot.PlotYR), "PlotterByType": reflect.ValueOf(plot.PlotterByType), "Plotters": reflect.ValueOf(&plot.Plotters).Elem(), "Plus": reflect.ValueOf(plot.Plus), @@ -159,7 +160,7 @@ type _cogentcore_org_lab_plot_Plotter struct { WData func() (data plot.Data, pixX []float32, pixY []float32) WPlot func(pt *plot.Plot) WStylers func() *plot.Stylers - WUpdateRange func(plt *plot.Plot, xr *minmax.F64, yr *minmax.F64, zr *minmax.F64) + WUpdateRange func(plt *plot.Plot, x *minmax.F64, y *minmax.F64, yr *minmax.F64, z *minmax.F64) } func (W _cogentcore_org_lab_plot_Plotter) ApplyStyle(plotStyle *plot.PlotStyle, idx int) { @@ -170,8 +171,8 @@ func (W _cogentcore_org_lab_plot_Plotter) Data() (data plot.Data, pixX []float32 } func (W _cogentcore_org_lab_plot_Plotter) Plot(pt *plot.Plot) { W.WPlot(pt) } func (W _cogentcore_org_lab_plot_Plotter) Stylers() *plot.Stylers { return W.WStylers() } -func (W _cogentcore_org_lab_plot_Plotter) UpdateRange(plt *plot.Plot, xr *minmax.F64, yr *minmax.F64, zr *minmax.F64) { - W.WUpdateRange(plt, xr, yr, zr) +func (W _cogentcore_org_lab_plot_Plotter) UpdateRange(plt *plot.Plot, x *minmax.F64, y *minmax.F64, yr *minmax.F64, z *minmax.F64) { + W.WUpdateRange(plt, x, y, yr, z) } // _cogentcore_org_lab_plot_Thumbnailer is an interface wrapper for Thumbnailer type @@ -185,11 +186,11 @@ func (W _cogentcore_org_lab_plot_Thumbnailer) Thumbnail(pt *plot.Plot) { W.WThum // _cogentcore_org_lab_plot_Ticker is an interface wrapper for Ticker type type _cogentcore_org_lab_plot_Ticker struct { IValue interface{} - WTicks func(min float64, max float64, nticks int) []plot.Tick + WTicks func(mn float64, mx float64, nticks int) []plot.Tick } -func (W _cogentcore_org_lab_plot_Ticker) Ticks(min float64, max float64, nticks int) []plot.Tick { - return W.WTicks(min, max, nticks) +func (W 
_cogentcore_org_lab_plot_Ticker) Ticks(mn float64, mx float64, nticks int) []plot.Tick { + return W.WTicks(mn, mx, nticks) } // _cogentcore_org_lab_plot_Valuer is an interface wrapper for Valuer type diff --git a/yaegilab/tensorsymbols/cogentcore_org-lab-tensor.go b/yaegilab/tensorsymbols/cogentcore_org-lab-tensor.go index 2e001cb6..2107554a 100644 --- a/yaegilab/tensorsymbols/cogentcore_org-lab-tensor.go +++ b/yaegilab/tensorsymbols/cogentcore_org-lab-tensor.go @@ -15,6 +15,7 @@ func init() { "AddShapes": reflect.ValueOf(tensor.AddShapes), "AlignForAssign": reflect.ValueOf(tensor.AlignForAssign), "AlignShapes": reflect.ValueOf(tensor.AlignShapes), + "AnySlice": reflect.ValueOf(tensor.AnySlice), "As1D": reflect.ValueOf(tensor.As1D), "AsFloat32": reflect.ValueOf(tensor.AsFloat32), "AsFloat64": reflect.ValueOf(tensor.AsFloat64), diff --git a/yaegilab/tensorsymbols/cogentcore_org-lab-tensorfs.go b/yaegilab/tensorsymbols/cogentcore_org-lab-tensorfs.go index 31e46194..1ebc5eeb 100644 --- a/yaegilab/tensorsymbols/cogentcore_org-lab-tensorfs.go +++ b/yaegilab/tensorsymbols/cogentcore_org-lab-tensorfs.go @@ -18,6 +18,7 @@ func init() { "DirTable": reflect.ValueOf(tensorfs.DirTable), "Get": reflect.ValueOf(tensorfs.Get), "List": reflect.ValueOf(tensorfs.List), + "ListOutput": reflect.ValueOf(&tensorfs.ListOutput).Elem(), "Long": reflect.ValueOf(tensorfs.Long), "Mkdir": reflect.ValueOf(tensorfs.Mkdir), "NewDir": reflect.ValueOf(tensorfs.NewDir), @@ -26,6 +27,7 @@ func init() { "Record": reflect.ValueOf(tensorfs.Record), "Recursive": reflect.ValueOf(tensorfs.Recursive), "Set": reflect.ValueOf(tensorfs.Set), + "SetCopy": reflect.ValueOf(tensorfs.SetCopy), "SetTensor": reflect.ValueOf(tensorfs.SetTensor), "Short": reflect.ValueOf(tensorfs.Short), "ValueType": reflect.ValueOf(tensorfs.ValueType), diff --git a/yaegilab/yaegilab.go b/yaegilab/yaegilab.go index 77e2b47a..d129e64c 100644 --- a/yaegilab/yaegilab.go +++ b/yaegilab/yaegilab.go @@ -12,6 +12,7 @@ import ( "cogentcore.org/core/base/errors" "cogentcore.org/core/yaegicore" "cogentcore.org/lab/goal/interpreter" + "cogentcore.org/lab/tensorfs" "cogentcore.org/lab/yaegilab/labsymbols" "cogentcore.org/lab/yaegilab/tensorsymbols" "github.com/cogentcore/yaegi/interp" @@ -44,6 +45,9 @@ func (in *Interpreter) ImportUsed() { } func (in *Interpreter) Eval(src string) (res reflect.Value, err error) { + tensorfs.ListOutput = in.Goal.Config.StdIO.Out + in.Interpreter.Goal.TrState.MathRecord = true res, _, err = in.Interpreter.Eval(src) + tensorfs.ListOutput = nil return }
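Finally, a minimal sketch of using the new `tensorfs` hooks from plain Go code, mirroring the redirection that `Eval` performs above (capturing into a buffer is an assumed usage pattern, not an API requirement):

```Go
package main

import (
	"bytes"
	"fmt"

	"cogentcore.org/lab/tensor"
	"cogentcore.org/lab/tensorfs"
)

func main() {
	var buf bytes.Buffer
	tensorfs.ListOutput = &buf // send List output to the buffer instead of os.Stdout
	tensorfs.List()            // list the current tensorfs directory
	tensorfs.ListOutput = nil  // restore the default
	fmt.Print(buf.String())

	// SetCopy stores a clone, so later changes to the
	// tensor do not affect the stored value.
	tsr := tensor.NewFloat64(5)
	tensorfs.SetCopy("snapshot", tsr)
	tsr.SetFloat1D(1, 0) // "snapshot" still holds the original zeros
}
```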