
Releases: ClosedXML/ClosedXML

0.105.0-rc

22 Jan 22:53
Pre-release

What's Changed

Major enhancements

Automatic fixer of function names

The correct name for newer functions (post 2013) is not the one shown in the GUI (e.g. the correct name for CONCAT is _xlfn.CONCAT). That is a rather obscure fact not known to most developers. The formula setters (e.g. IXLCell.FormulaA1) now automatically fix function names, so they are stored correctly in the file.

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet();
// Originally required "_xlfn.CONCAT(\"hello\", \" world!\")";
ws.Cell("A1").FormulaA1 = "CONCAT(\"hello\", \" world!\")"; 

[Image: Pre-0.105]
[Image: 0.105]

Sorting updates references

In many cases, the sorted area has a column with references. The formula often references another row. Pre-0.105, the references in the formulas weren't updated correctly.

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet();
ws.Cell("A1").Value = 4;
ws.Cell("B1").FormulaA1 = "A1+1";
ws.Cell("A2").Value = 2;
ws.Cell("B2").FormulaA1 = "A2+1";
ws.Cell("A3").Value = 1;
ws.Cell("B3").FormulaA1 = "A3+1";

ws.Range("A1:B3").Sort(1, XLSortOrder.Ascending);

Reimplementation/refactoring of old function infrastructure

Basically all implemented functions should be more faithful to how Excel behaves, and evaluation of functions should be faster. Implemented functions should be "complete" in the sense that they work correctly for various arguments (e.g. various forms of ROMAN or pattern search in SUMIFS).

The functions (before refactoring) had serious problems with ranges, errors, type coercion and structured references. The original parser back then didn't even parse literal arrays ({1,2,3;4,5,6}). The parser and other parts were updated, but because there were ~180 functions, the original implementation was kept and the functions were reused through an adapter. Except the adapter never worked right and there were some other serious problems.

Changes

Bugfixes

  • Improve VML inset parsing by @jahav in #2595
  • Fix evaluation of LOG10 function by @jahav in #2597
  • Structural change caused ParsingException on defined names (Unexpected token EofSymbolId) by @Igor-Zlatomrezhev in #2462
  • Don't round TimeSpan to milliseconds by @jahav in #2444
  • Update rich text when text of a rich run changes by @jahav in #2516
  • Mark IF function as a range function by @jahav in #2509
  • Fix whitespace preservation in a rich text by @jahav in #2512
  • Hyperlinks move on structural changes by @jahav in #2412 +related changes:
    • Move hyperlink from MiscSlice to XLHyperlinks collection by @jahav in #2409
    • Remove unused hyperlink prop from misc slice by @jahav in #2408
    • Move IXLRangeBase.Hyperlinks to IXLWorksheet.Hyperlinks by @jahav in #2407
    • Fix hyperlink copy between sheets by @jahav in #2605

Enhancements

  • Add load/save of external pivot cache source by @jahav in #2585
  • Add consolidate,scenario pivot sources for load/save by @jahav in #2586
  • Sorting of ranges adjusts formula references by @jahav in #2413
  • Add a missing prefix to future functions by @jahav in #2598

Dependencies

Technical debt

  • Add methods to XLSheetRange to determine an area after area deletion by @jahav in #2410
  • Add methods to XLSheetRange to determine an area after area insertion by @jahav in #2411
  • Use strong name for dlls by @jahav in #2599

Documentation

  • Update NuGet Badge for Improved Visual Consistency by @hitensam in #2468

Performance

  • Fix O(n^2) issue in pivot cache creation by @jahav in #2403

Remove legacy function infrastructure


0.104.2

15 Nov 00:19

What's Changed

  • Update DocumentFormat.OpenXml due to vulnerability in System.IO.Packaging 8.0.0 by @jahav in #2503
  • Update RBush 4.0.0 due to security analyzer by @jahav in #2504

Full Changelog: 0.104.1...0.104.2

0.104.1

30 Sep 11:12

Release notes from 0.102.1 to the 0.104.1.

A summary of breaking changes is available at docs.closedxml.io.

OpenXML SDK

OpenXML SDK has released version 3. Version 0.104.0 uses it as a dependency.

XLParser replaced with ClosedParser

The XLParser has been replaced with ClosedParser. The key benefits are

  • performance - ~2μs/formula, it's likely formulas will be parseable on demand, which is necessary for construction of a dependency tree
  • A1/R1C1 parsing parity - both modes can be parsed with no problems
  • AST oriented - it's likely a construction of AST in memory won't even be necessary, just use AST factory to evaluate formula directly

There is also a visualizer to display AST in a browser at https://parser.closedxml.io
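
As a small illustration of the A1/R1C1 parity, the same formula can be set in one notation and read back in the other. This is a minimal sketch that assumes only the IXLCell.FormulaA1 and IXLCell.FormulaR1C1 properties; the cell addresses are arbitrary, and the exact text of the converted formula is not asserted here.

```csharp
using System;
using ClosedXML.Excel;

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet();

// Set the formula in A1 notation...
ws.Cell("B2").FormulaA1 = "A1+C3";

// ...and read the same formula back in both notations.
Console.WriteLine(ws.Cell("B2").FormulaA1);
Console.WriteLine(ws.Cell("B2").FormulaR1C1);
```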


Formula Calculation

In previous versions, formulas were calculated recursively. Each formula checked its supporting cells for other formulas and, if there were some, they were recursively evaluated. There was some logic to decrease the number of evaluations. That works for very simple cases, but isn't very good for various non-happy paths (i.e. cells weren't calculated when they should have been).

This version has replaced it with a standard

  • dependency tree for checking which formulas are dirty and need to be recalculated
  • calculation chain that manages dependencies and order of formulas during calculation

For more info, see the docs: Microsoft has a page about the principles (Excel Recalculation), and there is one covering the API at docs.closedxml.io.
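
The two structures can be sketched in a few lines of plain C#. This is not ClosedXML's implementation, only a hypothetical mini-model: the cell names and the `precedents` map are made up for illustration, and cycle detection is left out. A change of A1 first marks every transitive dependent as dirty, then the dirty formulas are ordered so that each one is evaluated after its dirty inputs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-model: each formula cell lists the cells it reads.
var precedents = new Dictionary<string, string[]>
{
    ["B1"] = new[] { "A1" },        // B1 = A1 + 1
    ["C1"] = new[] { "B1" },        // C1 = B1 * 2
    ["D1"] = new[] { "A1", "C1" },  // D1 = A1 + C1
};

// Invert the map: for every cell, which formulas depend on it?
var dependents = new Dictionary<string, List<string>>();
foreach (var (formula, inputs) in precedents)
    foreach (var input in inputs)
    {
        if (!dependents.TryGetValue(input, out var list))
            dependents[input] = list = new List<string>();
        list.Add(formula);
    }

// Step 1 (dependency tree): a change of A1 dirties all transitive dependents.
var dirty = new HashSet<string>();
var queue = new Queue<string>(new[] { "A1" });
while (queue.Count > 0)
{
    var cell = queue.Dequeue();
    if (!dependents.TryGetValue(cell, out var users)) continue;
    foreach (var user in users)
        if (dirty.Add(user))
            queue.Enqueue(user);
}

// Step 2 (calculation chain): order dirty formulas after their dirty inputs.
var chain = new List<string>();
var visited = new HashSet<string>();
void Visit(string formula)
{
    if (!visited.Add(formula)) return;
    foreach (var input in precedents[formula])
        if (dirty.Contains(input))
            Visit(input);
    chain.Add(formula);
}
foreach (var formula in dirty) Visit(formula);

Console.WriteLine(string.Join(" -> ", chain)); // B1 -> C1 -> D1
```

Compared with the recursive approach, only formulas reachable from the changed cell are touched, and each is evaluated once.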


Structured references

The new parser also allows basic evaluation of structured references. A structured reference must use the official grammar, not Excel's user-friendly names (e.g. Pastry[@Name] is the user-friendly name for Pastry[[#This Row],[Name]]). It's now possible to write:

using var wb = new XLWorkbook();
var ws = wb.AddWorksheet();
ws.Cell("A1").InsertTable(new Pastry[]
{
    new("Cake", 14),
    new("Waffle", 3),
}, "Pastry");

ws.Cell("D1").FormulaA1 = "SUM(Pastry[Price])";
ws.Cell("D3").FormulaA1 = "\"Pastry \" & Pastry[[#This Row],[Name]]";
wb.RecalculateAllFormulas();

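Console.WriteLine($"Expected: {17}, Actual: {ws.Cell("D1").Value}");
Console.WriteLine($"Expected: \"Pastry Waffle\", Actual: {ws.Cell("D3").Value}");

// Assumed definition of the row type; it is not shown in the original snippet.
record Pastry(string Name, int Price);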

Expected: 17, Actual: 17
Expected: "Pastry Waffle", Actual: Pastry Waffle

Renaming sheet updates formulas

When a sheet is renamed, formulas referencing the sheet are also updated. This is part of a long-term effort to fix the effects of structural changes to a workbook. It will be a long road (e.g. deleting a sheet still doesn't switch references to #REF!), but it is one of the basic features that should work across the board.

using var wb = new XLWorkbook();
var sheet = wb.AddWorksheet();
var anotherSheet = wb.AddWorksheet("Another");
sheet.Cell("A1").FormulaA1 = "Another!B4";
anotherSheet.Name = "Changed";
Console.WriteLine(sheet.Cell("A1").FormulaA1);

Changed!B4

Workbook structure

Internal structure has been cleaned up and optimized.

The dirty tracking has been moved out of cells to formulas and thus memory taken up by a single cell value is now only 16 bytes instead of 24 (?) bytes in 0.102. Of course there are some other structures around that take up memory as well, but the single cell value is now 16 bytes (I hoped for 8, but not feasible with double, DateTime and TimeSpan as possible cell values - all take up 8 bytes... not enough bits).

The same string in different instances is no longer duplicated; only one instance is used. As seen in the following test, this can lead to a significant decrease in memory consumption. 250k rows with 10 text columns (same string, different instance): 117 MiB in 0.103 vs 325 MiB in 0.102.1.
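
The idea behind the deduplication can be sketched as a simple string pool. This is an illustrative sketch only, not ClosedXML's actual shared string table: the `Intern` helper and both collections are hypothetical names. Equal texts map to one stored instance, and a cell would only need to keep the small integer id.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool: maps each distinct text to a small integer id.
var ids = new Dictionary<string, int>();
var texts = new List<string>();

int Intern(string text)
{
    if (!ids.TryGetValue(text, out var id))
    {
        id = texts.Count;   // next free slot
        texts.Add(text);    // the single stored instance
        ids[text] = id;
    }
    return id;
}

// Two distinct string instances with equal content...
var first = new string("hello".ToCharArray());
var second = new string("hello".ToCharArray());

Console.WriteLine(ReferenceEquals(first, second));   // False
Console.WriteLine(Intern(first) == Intern(second));  // True
Console.WriteLine(texts.Count);                      // 1
```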

InsertData performance

Insert 250k rows of 10 columns of text and 5 columns of numbers (gist).

| Description | Rows | Columns | Time/Memory to insert data | Save workbook | Total time/memory |
|---|---|---|---|---|---|
| 0.103.0-beta | 250 000 | 15 | 1.619 sec / 117 MiB | 6.343 sec | 477 MiB |
| 0.102.1 | 250 000 | 15 | 7.160 sec / 325 MiB | 6.676 sec | 692 MiB |

Loading of cells is now done through streaming

Workbooks with a large number of cells should see a ~15%-20% speedup (as long as they contain mainly values, not styles or OLAP metadata).

Reading the 250k rows from the previous section:

| Description | Rows | Columns | Time to load data | Used memory |
|---|---|---|---|---|
| 0.103.0-beta | 250 000 | 15 | 15.648 sec | 236 MiB |
| 0.102.1 | 250 000 | 15 | 20.460 sec | 329 MiB |

Of course, this includes all stuff from 0.103.0-beta. Version 0.103 never got a non-beta release.

Pivot tables

The internal structure of pivot tables, along with most other features, has been completely overhauled. This update should significantly reduce crashes when loading and saving workbooks containing pivot tables.

The main issue with the previous internal structure was that it didn't align with the structure used by OOXML. This was problematic because we need to support all valid files. As a result, we have to handle a wide range of inputs and correctly convert them to our internal structure, which is rather hard. A more clear 1:1 mapping with OOXML is much simpler and more reliable.

AutoFilter

The Autofilter feature has been revamped, which includes some API changes. Its behavior is now more closely aligned with how Excel operates. The XML documentation provides detailed explanations, and there is a dedicated documentation page. Several bugs have also been fixed.

For more details, refer to the Autofilter section of the migration guide.

Source link

Although ClosedXML still doesn't have source package (Fody static weaving causes pdb mismatch and nuget will refuse symbol package), there is a source link info in the package.

SourceLink basically takes a repository and a commit from the package and retrieves the source directly from the forge (in this case GitHub).

CommonCrawl dataset

When a workbook is valid, ClosedXML shouldn't throw on load. That is a rather high priority (more than saving or manipulation). Unfortunately, it is hard to find the areas that cause the most problems.

One of the activities going on in the background is trying to use Excel files from around the internet (found by CommonCrawl) to evaluate how bad it is. There aren't results yet, but it is something that is going on.

What's Changed

Technical debt

  • Add shared string table for plain text by @jahav in #2115
  • Store rich text as an immutable rich text by @jahav in #2116
  • Save SST part directly from SST instance. by @jahav in #2118
  • Move saving of parts into separate writers by @jahav in #2177
  • Enable nullable in a few more classes by @sbeca in #2188

Performance improvements

  • Convert InsertData to streaming&bulk by @jahav in #2173
  • Load cells from workbook using a streaming by @jahav in #2174
  • Improve sorting performance by @igitur in #1649
  • Remove multiple enumerations by @jahav in #2236
  • Optimise workbook loading by stopping unneeded invalidation by @sbeca in #2284
  • Remove IXLStylized.Styles property by @jahav in #2361
  • Convert XLNumberFormatKey and XLAlignmentKey to readonly structs by @jahav in #2364
  • Convert XLBorderKey, XLFillKey, XLFontKey and XLProtectionKey to readonly structs by @jahav in #2365
  • Convert XLStyleKey to readonly struct by @jahav in #2366
  • Eliminate a couple of performance killers - cherry pick for develop by @jahav in #2371

Features


0.104-rc1

17 Sep 14:30
Pre-release

A release candidate 1 for 0.104.0.

Of course, this includes all stuff from 0.103.0-beta. Version 0.103 never got a non-beta release.

OpenXML SDK

OpenXML SDK has released version 3.0.1. The 0.104-rc1 uses it as a dependency.

Pivot tables

The internal structure of pivot tables, along with most other features, has been completely overhauled. This update should significantly reduce crashes when loading and saving workbooks containing pivot tables.

The main issue with the previous internal structure was that it didn't align with the structure used by OOXML. This was problematic because we need to support all valid files. As a result, we have to handle a wide range of inputs and correctly convert them to our internal structure, which is rather hard. A more clear 1:1 mapping with OOXML is much simpler and more reliable.

AutoFilter

The Autofilter feature has been revamped, which includes some API changes. Its behavior is now more closely aligned with how Excel operates. The XML documentation provides detailed explanations, and there is a dedicated documentation page. Several bugs have also been fixed.

For more details, refer to the Autofilter section of the migration guide.

CommonCrawl dataset

When a workbook is valid, ClosedXML shouldn't throw on load. That is a rather high priority (more than saving or manipulation). Unfortunately, it is hard to find the areas that cause the most problems.

One of the activities going on in the background is trying to use Excel files from around the internet (found by CommonCrawl) to evaluate how bad it is. There aren't results yet, but it is something that is going on.

What's Changed

Breaking changes

  • First page number can be negative -> change API type to int by @jahav in #2237
  • Rename IXLNamedRange to IXLDefinedName by @jahav in #2258

AutoFilter

  • AutoFilter rework - 1/? - Regular filter matches string. by @jahav in #2238
  • AutoFilter rework - 2/? - fix types for custom filters by @jahav in #2239
  • AutoFilter rework - 3/? - Top and average filter refactor, remove setters of internal state by @jahav in #2240
  • AutoFilter rework - 4/? - Top/Average filters work after loading by @jahav in #2241
  • AutoFilter rework - 5/? - Unify Regular and DateTimeGrouping filters by @jahav in #2242
  • AutoFilter rework - 6/7 - Add tests by @jahav in #2243
  • AutoFilter rework - 7/7 - Add documentation by @jahav in #2245

Formulas

  • Update ClosedXML.Parser to 1.0 by @jahav in #2250
  • Implement structured references by @jahav in #2251
  • Replace regex-powered code for A1-R1C1 formula conversion with AST-based one by @jahav in #2253
  • Change source of truth for defined names from union of ranges to a formula by @jahav in #2263
  • When sheet is renamed, rename it also in defined name formula by @jahav in #2264
  • Reimplement legacy MAX function. by @jahav in #2269
  • Implement Large Formula - Targets #1716 by @NickNack2020 in #2050
  • Update sheet names in formulas when sheet is renamed by @jahav in #2273
  • Implement FV and IPMT Excel functions and adapt 2 existing functions by @sbeca in #2199
  • Reimplement COUNTA function. by @jahav in #2277
  • Reimplement FACT function by @jahav in #2280
  • Reimplement COMBIN function by @jahav in #2281
  • Add BINOMDIST function by @jahav in #2282

Docs

Performance

  • Improve sorting performance by @igitur in #1649
  • Remove multiple enumerations by @jahav in #2236
  • Optimise workbook loading by stopping unneeded invalidation by @sbeca in #2284
  • Remove IXLStylized.Styles property by @jahav in #2361
  • Convert XLNumberFormatKey and XLAlignmentKey to readonly structs by @jahav in #2364
  • Convert XLBorderKey, XLFillKey, XLFontKey and XLProtectionKey to readonly structs by @jahav in #2365
  • Convert XLStyleKey to readonly struct by @jahav in #2366
  • Eliminate a couple of performance killers - cherry pick for develop by @jahav in #2371

Dependencies

Fixes

  • Preserve VML part with form controls across load/save. by @jahav in #2205
  • Ignore sheets with invalid id by @mihailmacarie in #2008
  • Fixed NullReferenceException at loading workbook with empty si element by @psynomorph in #2218
  • Fix InvalidCastException at workbook loading by @lvxiao312 in #2231
  • Get formula only if it is neither null or empty by @PascalGeschwillBIS in #2216
  • Fix column/row style combination for non-materialized cells. by @jahav in #2249
  • Fix ROW function so it works in array formulas. by @jahav in #2268
  • Make legacy functions work with error argument by @jahav in #2270
  • Write cells with empty text to file (used to be treated as blanks) by @jahav in #2278
  • Add a load test for dialog sheet by @jahav in #2334
  • Fix TimeSpan conversion by @OldBuddy in #2318
  • Fix an edge case when ranges got unmerged when the adjacent range of the same size got deleted by @Pankraty in #2358
  • Correctly load quoted and unquoted sheet names in print area definitions by @igitur in #2380

Pivot tables

  • Implement a data structure to hold pivot table close to file structure by @jahav in #2275
  • Load all PivotTableDefinition fields into the XLPivotTable by @jahav in #2285
  • Start new pivot table part writer by @jahav in #2287
  • Write pivot table row/column axes, filter fields and data fields by @jahav in #2290
  • Add conditional formatting structure to pivot table by @jahav in #2297
  • Reimplement pivot table logic by @jahav in #2307
  • Pivot table grand total styles by @jahav in #2374
  • Modify pivot field labels style by @jahav in #2375
  • Fix the pivot table Layout property by @jahav in #2384
  • Add style API for pivot field headers by @jahav in #2385
  • Pivot style subtotals by @jahav in #2386
  • Pivot style data by @jahav in #2393
  • Improve error message when adding a pivot value field by @jahav in #2394
  • Style intersection of pivot value field and axis field by @jahav in #2395
  • Style pivot area based on pivot field axis values by @jahav in #2396
  • Add support for Excel files containing 'Adobe' branded jpeg images by @ctmatt in #2391

New Contributors


0.102.3

18 Jul 14:15

What's Changed

Full Changelog: 0.102.2...0.102.3

0.102.2

05 Jan 14:11

Add a warning about allowed ranges of DocumentFormat.OpenXML; see issue #2220 and PR #2246.

What's Changed

  • Add dependency version range for DocumentFormat.OpenXml by @jahav in #2246

Full Changelog: 0.102.1...0.102.2

0.104.0-preview2

26 Oct 19:19
Pre-release

Second test release for checking SourceLink support on nuget (first failed due to fody/PDB checksum) #2070

0.104.0-preview1

26 Oct 18:16
Pre-release

Test release for checking SourceLink support #2070

0.103.0-beta

28 Sep 22:37

There won't be a non-beta release for 0.103. The production release will be 0.104, not 0.103. This milestone was about fixing technical debt, but ultimately it needs some more time to mature before it is sent to the users.

There are some nice performance updates, so in spirit of release early, release often, there will be a beta package on nuget.

Breaking changes

Rich text is now immutable behind the scenes (and will likely be turned into immutable in the future). It should be transparent to the user, though IXLPhonetic no longer has setters for its IXLPhonetic.Text, IXLPhonetic.Start and IXLPhonetic.End properties.

New calculation engine just works in a different way and will behave differently.

Significant changes

XLParser replaced with ClosedParser

The XLParser has been replaced with ClosedParser. The key benefits are

  • performance - ~2μs/formula, it's likely formulas will be parseable on demand, which is necessary for construction of a dependency tree
  • A1/R1C1 parsing parity - both modes can be parsed with no problems
  • AST oriented - it's likely a construction of AST in memory won't even be necessary, just use AST factory to evaluate formula directly

There is also a visualizer to display AST in a browser at https://parser.closedxml.io


Formula Calculation

In previous versions, formulas were calculated recursively. Each formula checked its supporting cells for other formulas and, if there were some, they were recursively evaluated. There was some logic to decrease the number of evaluations. That works for very simple cases, but isn't very good for various non-happy paths (i.e. cells weren't calculated when they should have been).

This version has replaced it with a standard

  • dependency tree for checking which formulas are dirty and need to be recalculated
  • calculation chain that manages dependencies and order of formulas during calculation

For more info, see the docs: Microsoft has a page about the principles (Excel Recalculation), and there is one covering the API at docs.closedxml.io.


Workbook structure

Internal structure has been cleaned up and optimized.

The dirty tracking has been moved out of cells to formulas and thus memory taken up by a single cell value is now only 16 bytes instead of 24 (?) bytes in 0.102. Of course there are some other structures around that take up memory as well, but the single cell value is now 16 bytes (I hoped for 8, but not feasible with double, DateTime and TimeSpan as possible cell values - all take up 8 bytes... not enough bits).

The same string in different instances is no longer duplicated; only one instance is used. As seen in the following test, this can lead to a significant decrease in memory consumption. 250k rows with 10 text columns (same string, different instance): 117 MiB in 0.103 vs 325 MiB in 0.102.1.

InsertData performance

Insert 250k rows of 10 columns of text and 5 columns of numbers (gist).

| Description | Rows | Columns | Time/Memory to insert data | Save workbook | Total time/memory |
|---|---|---|---|---|---|
| 0.103.0-beta | 250 000 | 15 | 1.619 sec / 117 MiB | 6.343 sec | 477 MiB |
| 0.102.1 | 250 000 | 15 | 7.160 sec / 325 MiB | 6.676 sec | 692 MiB |

Loading of cells is now done through streaming

Workbooks with a large number of cells should see a ~15%-20% speedup (as long as they contain mainly values, not styles or OLAP metadata).

Reading the 250k rows from the previous section:

| Description | Rows | Columns | Time to load data | Used memory |
|---|---|---|---|---|
| 0.103.0-beta | 250 000 | 15 | 15.648 sec | 236 MiB |
| 0.102.1 | 250 000 | 15 | 20.460 sec | 329 MiB |

What's Changed

Technical debt

  • Add shared string table for plain text by @jahav in #2115
  • Store rich text as an immutable rich text by @jahav in #2116
  • Save SST part directly from SST instance. by @jahav in #2118
  • Replace XLParser with ClosedParser. by @jahav in #2138
  • Get areas a formula depends on by @jahav in #2152
  • An initial work on a dependency tree for formulas by @jahav in #2155
  • Add names areas to dependency tree by @jahav in #2156
  • Add API to remove cell formula from dependency tree by @jahav in #2160
  • Add XLCalculationChain by @jahav in #2167
  • Add ability to detect cycle in calculation chain by @jahav in #2169
  • Evaluate calculation chain by @jahav in #2172
  • Move saving of parts into separate writers by @jahav in #2177

Performance improvements

  • Convert InsertData to streaming&bulk by @jahav in #2173
  • Load cells from workbook using a streaming by @jahav in #2174

Features

  • Add FontScheme property to a font by @jahav in #2114
  • Implement loading of workbook theme colors by @sbeca in #2117

Bugfixes

  • update accessibility of string.Contains(char) polyfill to internal by @Applesauce314 in #2134
  • Make SheetId a unique across XLWorksheets by @jahav in #2142
  • Fix incorrect logic check in Array Rescale by @sbeca in #2157
  • Fix ROUND (and probably others) not handling the result of binary operations on refs by @sbeca in #2153

Documentation

  • Add themes documentation by @jahav in #2154
  • Add documentation about how formulas are calculated. by @jahav in #2176

New Contributors

Full Changelog: 0.102.0...0.103.0-beta

0.102.1 - SixLabors.Fonts dependency update

18 Aug 23:04

SixLabors.Fonts has released version 1.0.0 and some .NET Framework projects suddenly have errors due to NuGet behavior.

If a project consumes ClosedXML through packages.config instead of the PackageReference style, NuGet will resolve version 1.0.0 instead of the declared beta19 dependency. SixLabors.Fonts has API changes, so it will start to throw MissingMethodException.

The issue should only affect .NET Framework projects, not .NET Core projects, which use the PackageReference style by default.

What's Changed

  • Update SixLabors.Fonts dependency to version 1.0.0 by @jahav in #2149

Full Changelog: 0.102.0...0.102.1