## Introduction
A very common fitting problem is to find a vector x which solves the matrix equation A x = y.
Currently such problems cannot be solved using symfit, unless one explicitly writes out all the components of the system. For example, if A is a 2x2 matrix, then in symfit we have to write:
```python
A_mat = np.array([[2, 3],
                  [4, 1]])
y_dat = np.array([2, 2])

x1, x2 = parameters('x1, x2')
y1, y2 = variables('y1, y2')
a11, a12, a21, a22 = variables('a11, a12, a21, a22')
model = {
    y1: a11 * x1 + a12 * x2,
    y2: a21 * x1 + a22 * x2,
}
fit = Fit(model, y1=y_dat[0], y2=y_dat[1], a11=A_mat[0, 0], ...)
```

Apart from being unnecessarily verbose, this also becomes computationally expensive very quickly.
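As a quick sanity check of the underlying linear algebra (independent of symfit entirely), the same 2x2 system from the example above can be solved directly with NumPy:

```python
import numpy as np

# The same 2x2 system A @ x = y as in the example above.
A_mat = np.array([[2.0, 3.0],
                  [4.0, 1.0]])
y_dat = np.array([2.0, 2.0])

# For a square, well-conditioned A one could use np.linalg.solve;
# lstsq also covers over-determined systems (more equations than unknowns).
x, residuals, rank, sv = np.linalg.lstsq(A_mat, y_dat, rcond=None)
print(x)  # [0.4 0.4]
```

This is the exact solution the fit should converge to for noise-free data.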
Compare this to the following syntax, which uses sympy's beautiful Indexed objects:

```python
A, x, y = symbols('A, x, y', cls=IndexedBase)
i, j = symbols('i, j', cls=Idx)
model = {
    y[i]: A[i, j] * x[j]  # or y[i]: Sum(A[i, j] * x[j], j) for the mathematicians
}
fit = Fit(model, A=A_mat, y=y_dat)
```

This syntax is infinitely more readable, and it instructs symfit on how to interpret the data, so the arrays we feed to `Fit` do not have to be dissected, giving a huge performance gain.
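As a small illustration of how these objects behave in plain sympy (no symfit involved): given an explicit range, `Sum` over an `Indexed` expression expands to exactly the component-wise form that the verbose example above spells out by hand:

```python
from sympy import symbols, IndexedBase, Idx, Sum

A, x, y = symbols('A, x, y', cls=IndexedBase)
i, j = symbols('i, j', cls=Idx)

# One row of A @ x, with the inner index j summed over an explicit range.
row = Sum(A[i, j] * x[j], (j, 0, 1)).doit()
print(row)  # A[i, 0]*x[0] + A[i, 1]*x[1]
```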
The only downside to this beautiful code is that it doesn't work. Firstly, for symfit to be happy, the distinction between parameters and variables has to be reintroduced. Secondly, `IndexedParameterBase` and `IndexedParameter` objects have to be added. Thirdly, `Variable` will have to inherit from `IndexedBase` instead of `Symbol`. With these changes, the example ends up looking like this:
```python
A, y = variables('A, y')
x = symbols('x', cls=IndexedParameterBase)
i, j = symbols('i, j', cls=Idx)
model = {
    y[i]: A[i, j] * x[j]
}
fit = Fit(model, A=A_mat, y=y_dat)
```

## New Features
Changing to this syntax will open symfit up to a whole range of exciting and powerful features.
These are just some of the examples:
- Global fitting problems will be a walk in the park. For example, a model with a global intercept but a local slope per condition could be written as

  ```python
  A, y = variables('A, y')
  x_loc = symbols('x_loc', cls=IndexedParameterBase)
  x_glob, = parameters('x_glob')
  i, j = symbols('i, j', cls=Idx)
  model = {
      y[i]: A[i, j] * x_loc[j] + x_glob
  }
  fit = Fit(model, A=A_mat, y=y_dat)
  ```
- Equality constraints could be added to any model in the form of Lagrange multipliers. For example, suppose we want to do a linear fit `y = a * x + b`, subject to `a + b == 1`. Using this new syntax, one would write:

  ```python
  a, b = parameters('a, b')
  x, y = variables('x, y')
  l, = parameters('l')  # Lagrange multiplier
  i = symbols('i', cls=Idx)
  model = {y[i]: a * x[i] + b}
  fit = Fit(model, constraints={l: Eq(a + b, 1)})
  ```

  Internally, this can then be wrapped as such:

  ```python
  chi2 = Sum((y[i] - (a * x[i] + b))**2, i)
  L = chi2 + l * (a + b - 1)
  ```

  The system of equations to be solved is then the Jacobian of the Lagrangian `L`. Issue #148 (Constraints with any minimizer using Lagrange multipliers) can therefore not be solved before this is introduced.
- Constraints which need data. So far, constraints can only be given on relations between `Parameter`s, not with regard to data. Soon one could do something like (artificial example):

  ```python
  a, b = parameters('a, b')
  x, y = variables('x, y')
  i = symbols('i', cls=Idx)
  model = {y[i]: a * x[i] + b}
  constraints = [
      Eq(a * Sum(x[i], (i, 0, len(xdata))), 1)
  ]
  ```

  I do not necessarily know use-cases for this, but perhaps you do!
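To make the Lagrange-multiplier idea concrete, here is a standalone sketch in plain sympy (with made-up data and symbol names of my choosing) that builds the Lagrangian for the constrained linear fit `y = a * x + b`, `a + b == 1`, and solves the stationarity conditions directly:

```python
from sympy import symbols, solve

a, b, lam = symbols('a b lam')  # lam is the Lagrange multiplier

# Hypothetical data; the unconstrained best fit is a = 1, b = 1.
xdata = [0, 1, 2]
ydata = [1, 2, 3]

# chi2 with the constraint a + b == 1 enforced via a Lagrange multiplier.
chi2 = sum((yk - (a * xk + b))**2 for xk, yk in zip(xdata, ydata))
L = chi2 + lam * (a + b - 1)

# Stationary point of the Lagrangian: all partial derivatives vanish.
sol = solve([L.diff(a), L.diff(b), L.diff(lam)], [a, b, lam], dict=True)[0]
print(sol[a], sol[b])  # 1 0
```

The constrained estimates satisfy `a + b == 1` exactly, which is precisely the machinery the proposed syntax would automate.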
## Possible Issues
- A point of discussion is whether to write sums explicitly, or to go for the Einstein summation convention as used above. As a reminder, in the Einstein convention we assume that repeated indices are summed over. Hence, instead of `Sum(A[i, j] * x[j], j)` we write simply `A[i, j] * x[j]`, where the sum is implicit. However, this leads to problems when taking derivatives. For example, deriving this sum with respect to `x[k]`, one would expect to get `A[i, k]` after summation. But if no sum is performed, the answer is `KroneckerDelta(j, k)*A[i, j]`. Despite sympy's documentation saying that summation is implicit with these objects, the answer it returns when deriving `A[i, j] * x[j]` is `KroneckerDelta(j, k)*A[i, j]`, so clearly no sum has been performed. This might cause problems in our determination of Jacobians, and therefore a sum should be performed at some point. A solution might be to require users to write the sum explicitly, or to see if there is some way to infer which indices have been repeated, and to add a sum over those.
- Another danger lies in inferring the range of the indices from the data, which is needed to perform such sums. The danger here is that if we assign this range to the `Idx` provided by the user, we end up modifying the object they thought they were dealing with. Definitely not gentlemanly. Possible solutions are to work with a Dummy-copy of the original `Idx`, or to not assign the range to the `Idx` objects directly but to give it to `Sum` when needed, e.g. `Sum(x[i], (i, 0, m - 1))` where `m` is `len(x)`. However, this would require more bookkeeping on our part.
- Should `Variable` inherit from `IndexedBase`, or do we need a new type `IndexedVariable` to keep the distinction between the two types? In my experience so far, if you create an instance of `IndexedBase` but you don't provide it with an `Idx`, you can still infer its name etc., so at this time I do not see a reason why changing to `Variable(IndexedBase)` should not be possible without any loss of functionality.
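The derivative behaviour described in the first point above can be checked in a couple of lines of plain sympy (behaviour as observed in recent sympy versions; it may differ across releases):

```python
from sympy import symbols, IndexedBase, Idx, KroneckerDelta

A, x = symbols('A, x', cls=IndexedBase)
i, j, k = symbols('i, j, k', cls=Idx)

# Implicit-sum (Einstein) form: no summation is actually performed,
# so a KroneckerDelta survives instead of the expected A[i, k].
d = (A[i, j] * x[j]).diff(x[k])
print(d)
```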