Generic Vector<T> Class Implementation With Use of Expression Trees





This article shows another way to implement generic operators, using a vector class as the running example.
Content
- Introduction
- Expression Trees
- Dive Deeply into Implementation
- Generic T[] Binary Operator
- Time Tests
- Vector<T> Class Implementation
- Code Use and Limitations
- Future Development
- History
- References
Introduction
Using generics in a Vector-like class implementation is a never-ending story. This article describes another way to implement one.
The main problem with a generic Vector implementation is that there is no way to directly implement something like
class Vector<T>
{
public static Vector<T> operator +(Vector<T> a, Vector<T> b)
{
int cnt = a.Length;
Vector<T> result = new Vector<T>(new T[cnt]);
for(int i = 0; i < cnt; i++)
result[i] = a[i] + b[i];
return result;
}
}
The compiler reports an error at the line
result[i] = a[i] + b[i];
So if you still want to implement it, you have to work around this problem and somehow avoid the line of code above.
One approach that helps here is to use lambda expressions, because operator implementations are very similar to lambda expressions. Consider the line of code
Func<double, double, double> add = (a, b) => (a + b);
It behaves exactly as we would expect from the + operator on the double type. If the generic version is written the same way, however, the compiler reports an error, even though the declaration Func<T, T, T> add;
is accepted by the compiler.
Func<T, T, T> add = (a, b) => (a + b); // gives a compiler error
So the question is how to fill the variable Func<T, T, T> add;
with a proper generic implementation?
Expression Trees
Expression trees are a set of classes for building structures that can be compiled into delegates. Some articles describing the process, structures, classes, etc. can be found here:
- Abul Kayes, http://www.codeproject.com/Articles/235860/Expression-Tree-Basics
- Sacha Barber, http://www.codeproject.com/Articles/30604/A-Journey-into-Expressions
- Alexandra Rusina, http://blogs.msdn.com/b/csharpfaq/archive/2009/11/19/debugging-expression-trees-in-visual-studio-2010.aspx
- Pranay Rana, http://www.codeproject.com/Tips/438804/Expression-Tree
Going directly to the goal of this article, here is a very simple use of an expression tree:
ParameterExpression ap = Expression.Parameter(typeof(double), "a");
ParameterExpression bp = Expression.Parameter(typeof(double), "b");
Expression operationResult = Expression.Add(ap, bp);
Expression<Func<double, double, double>> lambda = Expression.Lambda<Func<double, double, double>>(operationResult, ap, bp);
Func<double, double, double> add = lambda.Compile();
This code creates a Func<double, double, double> add
delegate with the same functionality as
Func<double, double, double> add = (a, b) => (a + b);
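To check this claim, the two delegates can be compared in a small self-contained program. This is only a sketch: the class name ExpressionAddDemo and the method name BuildAdd are chosen here for illustration and do not come from the article's download.

```csharp
using System;
using System.Linq.Expressions;

static class ExpressionAddDemo
{
    // Builds (a, b) => a + b for double from an expression tree.
    public static Func<double, double, double> BuildAdd()
    {
        ParameterExpression ap = Expression.Parameter(typeof(double), "a");
        ParameterExpression bp = Expression.Parameter(typeof(double), "b");
        Expression body = Expression.Add(ap, bp);
        return Expression.Lambda<Func<double, double, double>>(body, ap, bp).Compile();
    }

    static void Main()
    {
        Func<double, double, double> fromTree = BuildAdd();
        Func<double, double, double> fromLambda = (a, b) => a + b;

        // Both delegates should give the same result for the same input.
        Console.WriteLine(fromTree(1.5, 2.25) == fromLambda(1.5, 2.25));
    }
}
```

The only practical difference is when the work happens: the lambda is compiled with the program, while the expression tree is built and compiled when BuildAdd is called.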
Dive Deeply into Implementation
So the non-generic form is implemented. What about the generic form? If the double
keyword is replaced by T in the implementation, will it still work? (A method header and footer were added to define a full method.)
Func<T, T, T> CreateAdd()
{
ParameterExpression ap = Expression.Parameter(typeof(T), "a");
ParameterExpression bp = Expression.Parameter(typeof(T), "b");
Expression operationResult = Expression.Add(ap, bp);
Expression<Func<T, T, T>> lambda = Expression.Lambda<Func<T, T, T>>(operationResult, ap, bp);
Func<T, T, T> add = lambda.Compile();
return add;
}
This implementation is nearly the same as
Func<T, T, T> add = (a, b) => (a + b);
The difference is that the CreateAdd
method is accepted by the compiler, while the expression tree is built and lambda.Compile()
is called only at run time, where an InvalidOperationException is thrown (by Expression.Add) if operator + is not defined for type T. The second implementation is checked at compile time, where the test for a generic operator + fails.
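The run-time nature of this check can be seen in a short sketch. The names RuntimeCheckDemo and SupportsAdd are assumptions made for this example, and CreateAdd is written here as a generic method rather than as a member of a generic class.

```csharp
using System;
using System.Linq.Expressions;

static class RuntimeCheckDemo
{
    // Same construction as CreateAdd, written as a generic method.
    public static Func<T, T, T> CreateAdd<T>()
    {
        ParameterExpression ap = Expression.Parameter(typeof(T), "a");
        ParameterExpression bp = Expression.Parameter(typeof(T), "b");
        // Expression.Add throws InvalidOperationException when T has no + operator.
        Expression body = Expression.Add(ap, bp);
        return Expression.Lambda<Func<T, T, T>>(body, ap, bp).Compile();
    }

    // Returns true when an addition delegate can be built for T.
    public static bool SupportsAdd<T>()
    {
        try { CreateAdd<T>(); return true; }
        catch (InvalidOperationException) { return false; }
    }

    static void Main()
    {
        // int has a built-in + operator.
        Console.WriteLine(CreateAdd<int>()(2, 3));
        // decimal defines an op_Addition method.
        Console.WriteLine(SupportsAdd<decimal>());
        // string concatenation is a compiler feature, not an operator method.
        Console.WriteLine(SupportsAdd<string>());
    }
}
```

A check like SupportsAdd could be run once per type, for example in a static constructor, so that an unsupported T is reported early rather than at the first addition.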
The CreateAdd
method can be generalized to all operators, so a single implementation is enough for any kind of binary operator.
public static class FuncGenerator<T>
{
/// <summary>
/// Convert BinaryExpression into Func which acts as operator on types T and T. T = T op T, where op is provided by f param.
/// </summary>
/// <param name="f">Func which provides BinaryExpression.</param>
/// <returns>Func<T, T, T> </returns>
public static Func<T, T, T> ExpressionToFunc(Func<ParameterExpression, ParameterExpression, BinaryExpression> f)
{
ParameterExpression ap = Expression.Parameter(typeof(T), "a");
ParameterExpression bp = Expression.Parameter(typeof(T), "b");
Expression operationResult = f(ap, bp);
Expression<Func<T, T, T>> lambda = Expression.Lambda<Func<T, T, T>>(operationResult, ap, bp);
return lambda.Compile();
}
}
Now it is easy to call the method FuncGenerator<T>.ExpressionToFunc()
to create the appropriate Func delegate. The user must still be familiar with expression trees, though.
Func<double, double, double> add = FuncGenerator<double>.ExpressionToFunc((a, b) => Expression.Add(a, b));
Func<double, double, double> sub = FuncGenerator<double>.ExpressionToFunc((a, b) => Expression.Subtract(a, b));
Now we have a generic implementation of binary operators and the rest is quite easy. Or is it?
Generic T[] Binary Operator
Below is a generic implementation of binary operations on arrays (see the method ExpressionToFuncArray
). This implementation uses a for loop and evaluates the Func delegate returned by FuncGenerator<T>.ExpressionToFunc()
.
public static class FuncGenerator<T>
{
/// <summary>
/// Convert BinaryExpression into Func which acts as operator on types T and T. T = T op T, where op is provided by f param.
/// </summary>
/// <param name="f">Func which provides BinaryExpression.</param>
/// <returns>Func<T, T, T> </returns>
public static Func<T, T, T> ExpressionToFunc(Func<ParameterExpression, ParameterExpression, BinaryExpression> f)
{
ParameterExpression ap = Expression.Parameter(typeof(T), "a");
ParameterExpression bp = Expression.Parameter(typeof(T), "b");
Expression operationResult = f(ap, bp);
Expression<Func<T, T, T>> lambda = Expression.Lambda<Func<T, T, T>>(operationResult, ap, bp);
return lambda.Compile();
}
public static Func<T[], T[], T[]> ExpressionToFuncArray(Func<ParameterExpression, ParameterExpression, BinaryExpression> f)
{
Func<T, T, T> op = ExpressionToFunc(f);
return (a, b) =>
{
int len = a.Length;
T[] result = new T[len];
for (int i = 0; i < len; i++)
result[i] = op(a[i], b[i]);
return result;
};
}
}
Focus now on the line
result[i] = op(a[i], b[i]);
The op
delegate is called n times per array operation, which is really time consuming. So we have a generic implementation, but as testing shows, it is very slow. Let us go back to expression trees and hope they give us a solution.
public static Func<double[], double[], double[]> ArrayExpressionToFunc()
{
return (a, b) =>
{
int len = a.Length;
double[] result = new double[len];
for (int i = 0; i < len; i++)
result[i] = a[i] + b[i];
return result;
};
}
Above is a non-generic implementation for the double
type. This implementation is faster than the generic implementation introduced earlier. Now let us try to construct an expression tree that behaves exactly like this non-generic implementation.
/// <summary>
/// Convert BinaryExpression into Func which acts as operator on types double[] and double[]. double[] = double[] op double[], where op is provided by f param.
/// </summary>
/// <param name="f"></param>
/// <returns></returns>
public static Func<double[], double[], double[]> ArrayExpressionToFunc(Func<IndexExpression, IndexExpression, BinaryExpression> f)
{
//a, b are input parameters of the returned Func<double[], double[], double[]> delegate
//c is the output (result)
//i is the loop variable
//
// //the implementation looks like:
// for (int i = a.Length - 1; i > -1; i--)
// c[i] = a[i] op b[i];
// return c;
//
ParameterExpression apA = Expression.Parameter(typeof(double[]), "a");
ParameterExpression bpA = Expression.Parameter(typeof(double[]), "b");
ParameterExpression operationResult = Expression.Parameter(typeof(double[]), "c");
ParameterExpression iA = Expression.Parameter(typeof(int), "i");
LabelTarget labelReturn = Expression.Label(typeof(double[]));
//this block represent block inside loop
Expression innerBlock = Expression.Block(
Expression.SubtractAssign(iA, Expression.Constant(1)),
Expression.IfThen(Expression.Equal(iA, Expression.Constant(-1)),
Expression.Return(labelReturn, operationResult)),
Expression.Assign(Expression.ArrayAccess(operationResult, iA), f(Expression.ArrayAccess(apA, iA), Expression.ArrayAccess(bpA, iA)))
);
//expression for easy implementation of new double[i] constructor
Expression<Func<int, double[]>> newTA = (i) => new double[i];
//main body of Func. Variable initialization and loop execution
Expression addeA = Expression.Block(
new[] { iA, operationResult },
Expression.Assign(iA, Expression.ArrayLength(apA)),
Expression.Assign(operationResult, Expression.Invoke(newTA, iA)),
Expression.Loop(innerBlock, labelReturn)
);
//Compilation to get result.
Expression<Func<double[], double[], double[]>> lambdaA = Expression.Lambda<Func<double[], double[], double[]>>(addeA, apA, bpA);
return lambdaA.Compile();
}
This implementation returns a delegate that applies a binary operator to double arrays. Possible use:
Func<double[], double[], double[]> addArray = FuncGenerator<double>.ArrayExpressionToFunc((a, b) => Expression.Add(a, b));
Func<double[], double[], double[]> subArray = FuncGenerator<double>.ArrayExpressionToFunc((a, b) => Expression.Subtract(a, b));
Now only the last step is needed for a generic implementation: replace the double
keyword in the appropriate places with the T
type parameter.
/// <summary>
/// Convert BinaryExpression into Func which acts as operator on types T[] and T[]. T[] = T[] op T[], where op is provided by f param.
/// </summary>
/// <param name="f"></param>
/// <returns></returns>
public static Func<T[], T[], T[]> ArrayExpressionToFunc(Func<IndexExpression, IndexExpression, BinaryExpression> f)
{
//a, b are input parameters of the returned Func<T[], T[], T[]> delegate
//c is the output (result)
//i is the loop variable
//
// //the implementation looks like:
// for (int i = a.Length - 1; i > -1; i--)
// c[i] = a[i] op b[i];
// return c;
//
ParameterExpression apA = Expression.Parameter(typeof(T[]), "a");
ParameterExpression bpA = Expression.Parameter(typeof(T[]), "b");
ParameterExpression operationResult = Expression.Parameter(typeof(T[]), "c");
ParameterExpression iA = Expression.Parameter(typeof(int), "i");
LabelTarget labelReturn = Expression.Label(typeof(T[]));
//this block represent block inside loop
Expression innerBlock = Expression.Block(
Expression.SubtractAssign(iA, Expression.Constant(1)),
Expression.IfThen(Expression.Equal(iA, Expression.Constant(-1)),
Expression.Return(labelReturn, operationResult)),
Expression.Assign(Expression.ArrayAccess(operationResult, iA), f(Expression.ArrayAccess(apA, iA), Expression.ArrayAccess(bpA, iA)))
);
//expression for easy implementation of new T[i] constructor
Expression<Func<int, T[]>> newTA = (i) => new T[i];
//main body of Func. Variable initialization and loop execution
Expression addeA = Expression.Block(
new[] { iA, operationResult },
Expression.Assign(iA, Expression.ArrayLength(apA)),
Expression.Assign(operationResult, Expression.Invoke(newTA, iA)),
Expression.Loop(innerBlock, labelReturn)
);
//Compilation to get result.
Expression<Func<T[], T[], T[]>> lambdaA = Expression.Lambda<Func<T[], T[], T[]>>(addeA, apA, bpA);
return lambdaA.Compile();
}
The code can be used to define array operators:
Func<int[], int[], int[]> addArray = FuncGenerator<int>.ArrayExpressionToFunc((a, b) => Expression.Add(a, b));
Func<int[], int[], int[]> subArray = FuncGenerator<int>.ArrayExpressionToFunc((a, b) => Expression.Subtract(a, b));
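As a cross-check, the generic array operator can be put into a small self-contained program. The sketch below is a variant rather than the article's exact code: the names ArrayOperatorSketch and Build are made up here, locals are declared with Expression.Variable, and the result array is created with Expression.NewArrayBounds instead of invoking a helper lambda. It also assumes that both input arrays have the same length.

```csharp
using System;
using System.Linq.Expressions;

static class ArrayOperatorSketch<T>
{
    // Builds (a, b) => c where c[i] = a[i] op b[i] for every index i.
    public static Func<T[], T[], T[]> Build(Func<Expression, Expression, BinaryExpression> op)
    {
        ParameterExpression a = Expression.Parameter(typeof(T[]), "a");
        ParameterExpression b = Expression.Parameter(typeof(T[]), "b");
        ParameterExpression c = Expression.Variable(typeof(T[]), "c");
        ParameterExpression i = Expression.Variable(typeof(int), "i");
        LabelTarget done = Expression.Label(typeof(T[]));

        // Loop body: i -= 1; if (i == -1) return c; c[i] = a[i] op b[i];
        Expression body = Expression.Block(
            Expression.SubtractAssign(i, Expression.Constant(1)),
            Expression.IfThen(
                Expression.Equal(i, Expression.Constant(-1)),
                Expression.Return(done, c)),
            Expression.Assign(
                Expression.ArrayAccess(c, i),
                op(Expression.ArrayAccess(a, i), Expression.ArrayAccess(b, i))));

        Expression block = Expression.Block(
            new[] { i, c },
            Expression.Assign(i, Expression.ArrayLength(a)),
            // new T[i], built directly instead of invoking a helper lambda
            Expression.Assign(c, Expression.NewArrayBounds(typeof(T), i)),
            Expression.Loop(body, done));

        return Expression.Lambda<Func<T[], T[], T[]>>(block, a, b).Compile();
    }
}

static class ArrayOperatorProgram
{
    static void Main()
    {
        Func<int[], int[], int[]> add =
            ArrayOperatorSketch<int>.Build((x, y) => Expression.Add(x, y));
        int[] sum = add(new[] { 1, 2, 3 }, new[] { 10, 20, 30 });
        Console.WriteLine(string.Join(";", sum));
    }
}
```

Because the loop counts down from a.Length, an empty input array returns an empty result without entering the assignment.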
Time Tests
To test and compare the implementations above, a set of tests was written:
- Simple implementation (fully specified at compile time, expected to be the fastest)
- Func<double, double, double> in a for loop (a delegate for the + operator on type T, expected to be slow)
- Func<double[], double[], double[]> created at compile time (a delegate for the + operator on type T[], expected to be fast)
- Func<double[], double[], double[]> created at runtime (expression trees used to get the + operator on type T[], hoped to be fast enough)
The code for all tests is below:
int lastTick = 0;
private void Msg(string msg)
{
int thisTick = Environment.TickCount;
int delta = thisTick - lastTick;
textBox1.AppendText(delta.ToString() + "ms\t" + msg + "\r\n");
lastTick = thisTick;
}
private void TimeTests()
{
double[] a = new double[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
double[] b = new double[] { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };
double[] r = new double[] { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
Func<double, double, double> Add = (ap, bp) => (ap + bp);
Func<double[], double[], double[]> AddA = (ap, bp) =>
{
int len = ap.Length;
double[] result = new double[len];
for (int i = 0; i < len; i++)
result[i] = ap[i] + bp[i];
return result;
};
Func<double[], double[], double[]> AddA2 = FuncGenerator<double>.CreateArrayOperatorFuncAdd();
Msg("beginLoops");
int cnt = 1000000; //1M
//Simple implementation
for (int i = 0; i < cnt; i++)
{
r = new double[a.Length];
for (int j = 0; j < a.Length; j++)
r[j] = a[j] + b[j];
}
Msg("Simple implementation");
//Func<double, double, double> in for loop
for (int i = 0; i < cnt; i++)
for (int j = 0; j < a.Length; j++)
r[j] = Add(a[j], b[j]);
Msg("Func<double, double, double> in for loop");
//Func<double[], double[], double[]> created at compile time
for (int i = 0; i < cnt; i++)
r = AddA(a, b);
Msg("Func<double[], double[], double[]> created at compile time");
//Func<double[], double[], double[]> created at runtime
for (int i = 0; i < cnt; i++)
r = AddA2(a, b);
Msg("Func<double[], double[], double[]> created at runtime");
}
Result of this test:
0ms beginLoops
109ms Simple implementation
282ms Func<double, double, double> in for loop
109ms Func<double[], double[], double[]> without loop
109ms Func<double[], double[], double[]> created runtime
"Simple implementation" is the fastest and is on par with "created runtime". "Func in for loop" is the slowest. This shows that the goal was reached: the expression-tree implementation performs as well as the hand-written one. The difference between the slowest and the fastest implementation is interesting, as it is spent on the repeated calls of the Func<double, double, double>
delegate.
Please take into account that garbage collection sometimes runs during the test, so the results can vary, but the introduced implementation remains on par with "Simple implementation".
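The resolution of Environment.TickCount is limited by the system timer, typically 10 to 16 ms, which also limits how precisely the figures above can be read. For finer measurements, a Stopwatch-based helper could be used instead. The following is only a sketch under that assumption; the names StopwatchTimingSketch and Measure are hypothetical, and wrapping each test in an Action adds one delegate call per repetition to every measurement.

```csharp
using System;
using System.Diagnostics;

static class StopwatchTimingSketch
{
    // Runs the action `repeat` times and returns the elapsed time in milliseconds.
    public static double Measure(Action action, int repeat)
    {
        action(); // warm-up call so that JIT compilation is not measured
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < repeat; i++)
            action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        double[] a = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        double[] b = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 };
        double[] r = new double[a.Length];
        Func<double, double, double> add = (x, y) => x + y;

        // Direct use of the + operator inside the loop.
        double direct = Measure(() =>
        {
            for (int j = 0; j < a.Length; j++)
                r[j] = a[j] + b[j];
        }, 1000000);

        // One delegate call per element.
        double viaDelegate = Measure(() =>
        {
            for (int j = 0; j < a.Length; j++)
                r[j] = add(a[j], b[j]);
        }, 1000000);

        Console.WriteLine("direct:       {0:F1} ms", direct);
        Console.WriteLine("via delegate: {0:F1} ms", viaDelegate);
    }
}
```

The absolute numbers depend on the machine and build configuration, so only the ratio between the two lines is meaningful.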
Vector<T> Class Implementation
All the partial jobs are done, so let us now introduce a template for the full implementation. A full implementation is not the goal, so only the main parts are shown.
/// <summary>
/// Generic class for arithmetic operator overloading demonstration
/// </summary>
/// <typeparam name="T"></typeparam>
class Vector<T>
{
//static operator delegates
private static Func<T, T, T> Add = FuncGenerator<T>.CreateOperatorFuncAdd();
private static Func<T[], T[], T[]> AddT = FuncGenerator<T>.CreateArrayOperatorFuncAdd();
//http://stackoverflow.com/questions/1717444/combining-two-lamba-expressions-in-c-sharp
//the ideas for the implementation are based on the article
//http://blogs.msdn.com/b/csharpfaq/archive/2009/11/19/debugging-expression-trees-in-visual-studio-2010.aspx
//
private T[] values;
public Vector(int dim)
{
this.values = new T[dim];
}
public Vector(IEnumerable<T> initialValues)
{
values = initialValues.ToArray();
}
public static Vector<T> operator +(Vector<T> a, Vector<T> b)
{
return AddT(a.values, b.values);
}
/// <summary>
/// Allows easy conversion of an array to a vector of the same item type. Can be used with operator overloading.
/// </summary>
/// <param name="a">array to be converted</param>
/// <returns></returns>
public static implicit operator Vector<T>(T[] a)
{
return new Vector<T>(a);
}
/// <summary>
/// Some kind of conversion of Vector into string
/// </summary>
/// <returns></returns>
public override string ToString()
{
string result = "";
result = values.Select(t => t.ToString()).Aggregate((a, b) => a + ";" + b);
return "[" + result + "]";
}
/// <summary>
/// Some kind of parsing string into Vector. Items must be delimited by ';' char. On left side '[' char is expected, on right side ']' char is expected.
/// This is just a demonstration class, so I am not proud of this implementation, as some strange string forms will be successfully parsed into T[].
/// </summary>
/// <param name="value">String value which will be parsed.</param>
/// <param name="parse">Parse delegate for conversion from string into T value.</param>
/// <returns></returns>
public static T[] Parse(string value, Func<string, T> parse)
{
if (value[0] != '[' || value[value.Length - 1] != ']')
throw new FormatException(string.Format("{0} is not valid format for type {1}", value, typeof(T[]).ToString()));
string tmpStr = value.Substring(1, value.Length - 2).Trim();
string[] items = tmpStr.Split(new char[] { ';' });
var values = items.Select(s => parse(s.Trim()));
return values.ToArray();
}
}
Code Use and Limitations
The basic form of the generic Vector<T> has been introduced, so let us now show a possible way to use it. First, there is a set of numeric types that work very well with this implementation.
/// <summary>
/// Initialization values for numeric types
/// </summary>
string aValue = "[0; 1; 2; 3; 4; 5; 6; 7; 8; 9]";
string bValue = "[1; 1; 1; 1; 1; 1; 1; 1; 1; 1]";
/// <summary>
/// Generic type test
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="parser"></param>
private void TypeTest<T>(Func<string, T> parser)
{
Msg("Test for " + typeof(T).ToString());
Vector<T> a = Vector<T>.Parse(aValue, parser);
Vector<T> b = Vector<T>.Parse(bValue, parser);
Vector<T> result = a + b;
Msg(string.Format("c = a + b; => {0} = {1} + {2}", result.ToString(), a.ToString(), b.ToString()));
}
/// <summary>
/// Test of native numeric types
/// </summary>
private void ImplementationTest()
{
TypeTest<int>(s => int.Parse(s));
TypeTest<uint>(s => uint.Parse(s));
TypeTest<short>(s => short.Parse(s));
TypeTest<ushort>(s => ushort.Parse(s));
TypeTest<long>(s => long.Parse(s));
TypeTest<ulong>(s => ulong.Parse(s));
TypeTest<double>(s => double.Parse(s));
TypeTest<float>(s => float.Parse(s));
TypeTest<decimal>(s => decimal.Parse(s));
}
Result of this test:
0ms Test for System.Int32
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
15ms Test for System.UInt32
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
0ms Test for System.Int16
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
16ms Test for System.UInt16
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
0ms Test for System.Int64
16ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
0ms Test for System.UInt64
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
15ms Test for System.Double
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
16ms Test for System.Single
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
16ms Test for System.Decimal
0ms c = a + b; => [1;2;3;4;5;6;7;8;9;10] = [0;1;2;3;4;5;6;7;8;9] + [1;1;1;1;1;1;1;1;1;1]
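Unfortunately, there are types that cannot be used this way even though one might expect them to work. The presented implementation does not work for the byte
and sbyte
types, because Expression.Add defines no addition operator for them (C# itself promotes such operands to int before adding).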
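One possible workaround, which is not part of the article's code, is to build the promotion into the expression tree by hand: convert both operands to int, add them, and convert the sum back. The sketch below assumes this approach; the name ByteAddSketch is hypothetical, and the wrap-around on overflow is a design choice of this example.

```csharp
using System;
using System.Linq.Expressions;

static class ByteAddSketch
{
    // Builds (a, b) => (byte)((int)a + (int)b), mirroring what C# does for byte operands.
    public static Func<byte, byte, byte> CreateAdd()
    {
        ParameterExpression ap = Expression.Parameter(typeof(byte), "a");
        ParameterExpression bp = Expression.Parameter(typeof(byte), "b");
        Expression sum = Expression.Add(
            Expression.Convert(ap, typeof(int)),
            Expression.Convert(bp, typeof(int)));
        // Expression.Convert is unchecked, so sums above 255 wrap around.
        Expression body = Expression.Convert(sum, typeof(byte));
        return Expression.Lambda<Func<byte, byte, byte>>(body, ap, bp).Compile();
    }

    static void Main()
    {
        Func<byte, byte, byte> add = CreateAdd();
        Console.WriteLine(add(10, 20));
        Console.WriteLine(add(200, 100)); // wraps modulo 256
    }
}
```

If overflow should raise an exception instead, Expression.ConvertChecked could replace the outer Expression.Convert.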
The last test that has to be run with this implementation is a test on user-defined types. For this test, UserNumericType
(with an overloaded + operator) was prepared, and the hypothesis that the structure Vector<Vector<double>>
also works was set.
/// <summary>
/// Test of types (with overloaded operator) defined by user
/// </summary>
private void UserTypeTest()
{
Vector<UserNumericType> a = new UserNumericType[] { new UserNumericType(0, 1), new UserNumericType(10, 10) };
Vector<UserNumericType> b = new UserNumericType[] { new UserNumericType(0, 1), new UserNumericType(20, 20) };
Vector<UserNumericType> result = a + b;
Msg("Test for UserNumericType");
Msg(string.Format("c = a + b; => {0} = {1} + {2}", result.ToString(), a.ToString(), b.ToString()));
Vector<Vector<double>> avvd = new Vector<double>[] { new double[] { 0, 1, 2 }, new double[] { 10, 10, 10 } };
Vector<Vector<double>> bvvd = new Vector<double>[] { new double[] { 1, 1, 1 }, new double[] { 21, 21, 21 } };
Vector<Vector<double>> resultvvd = avvd + bvvd;
Msg("Test for Vector<Vector<double>> type");
Msg(string.Format("c = a + b; => {0} = {1} + {2}", resultvvd.ToString(), avvd.ToString(), bvvd.ToString()));
}
Results for this test:
0ms Test for UserNumericType
16ms c = a + b; => [(0; 2);(30; 30)] = [(0; 1);(10; 10)] + [(0; 1);(20; 20)]
15ms Test for Vector<Vector<double>> type
0ms c = a + b; => [[1;2;3];[31;31;31]] = [[0;1;2];[10;10;10]] + [[1;1;1];[21;21;21]]
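The definition of UserNumericType is not shown above, so the following sketch uses a hypothetical two-component struct of the same name to illustrate why user-defined types work: an overloaded + operator is compiled to a static op_Addition method, and Expression.Add looks that method up. The fields X and Y, the class UserTypeSketch, and the ToString format are assumptions made for this example.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical two-component value type; the article's own UserNumericType may differ.
struct UserNumericType
{
    public readonly double X;
    public readonly double Y;

    public UserNumericType(double x, double y) { X = x; Y = y; }

    // The overloaded operator becomes a static op_Addition method,
    // which Expression.Add finds by reflection.
    public static UserNumericType operator +(UserNumericType a, UserNumericType b)
    {
        return new UserNumericType(a.X + b.X, a.Y + b.Y);
    }

    public override string ToString()
    {
        return "(" + X + "; " + Y + ")";
    }
}

static class UserTypeSketch
{
    // Generic addition delegate built from an expression tree.
    public static Func<T, T, T> CreateAdd<T>()
    {
        ParameterExpression ap = Expression.Parameter(typeof(T), "a");
        ParameterExpression bp = Expression.Parameter(typeof(T), "b");
        return Expression.Lambda<Func<T, T, T>>(Expression.Add(ap, bp), ap, bp).Compile();
    }

    static void Main()
    {
        Func<UserNumericType, UserNumericType, UserNumericType> add = CreateAdd<UserNumericType>();
        Console.WriteLine(add(new UserNumericType(10, 10), new UserNumericType(20, 20)));
    }
}
```

The same mechanism is what lets Vector<Vector<double>> work: Vector<double> declares operator +, so the outer vector's element-wise delegate can be built for it.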
Future Development
It was mentioned many times in this article that the goal is not to implement a perfect Vector<T> class, so that has not been done. Only some principles that form a solid base for future development were set out. Anyone interested can find many vector implementations on the internet under a friendly licence and rework them according to the principles presented in this article.
History
2014-07-21 First implementation after a few days of investigation.
References
- Abul Kayes, http://www.codeproject.com/Articles/235860/Expression-Tree-Basics
- Sacha Barber, http://www.codeproject.com/Articles/30604/A-Journey-into-Expressions
- Alexandra Rusina, http://blogs.msdn.com/b/csharpfaq/archive/2009/11/19/debugging-expression-trees-in-visual-studio-2010.aspx
- Pranay Rana, http://www.codeproject.com/Tips/438804/Expression-Tree