Merged
Changes from 1 commit
Commits
46 commits
164bb0a
Implemented IndexOf and LastIndexOf functions
mkhamoyan May 30, 2023
1e4e9a5
Updated test cases
mkhamoyan May 31, 2023
292b915
Remove not needed parts
mkhamoyan May 31, 2023
9d23f7b
Implemented IsPrefix, IsSuffix functions
mkhamoyan Jun 1, 2023
1f8b5d5
Remove logs
mkhamoyan Jun 1, 2023
7b91b6c
Fix CI build failures
mkhamoyan Jun 2, 2023
c4a2d3c
Refactored
mkhamoyan Jun 2, 2023
001c366
Fixed some test cases for OSX
mkhamoyan Jun 6, 2023
0ec8d79
Minor changes in test cases
mkhamoyan Jun 6, 2023
fa7322c
test case minor refactoring
mkhamoyan Jun 6, 2023
fc2837b
Merge branch 'main' into hybrid_collation_functions
mkhamoyan Jun 6, 2023
a7ab572
Merge branch 'main' into hybrid_collation_functions
mkhamoyan Jun 7, 2023
36de1a9
Merge branch 'main' into hybrid_collation_functions
mkhamoyan Jun 14, 2023
139a6ba
Changed IndexOf functions implementation
mkhamoyan Jun 15, 2023
f3da1e5
Fix build failue
mkhamoyan Jun 15, 2023
5c3f172
Minor fixes
mkhamoyan Jun 15, 2023
ab317f7
Minor fix
mkhamoyan Jun 15, 2023
24094fe
Refactor as per review comments
mkhamoyan Jun 16, 2023
c425d38
Refactored Indexing functions calls
mkhamoyan Jun 16, 2023
a3f44b4
Updated doc and added comments
mkhamoyan Jun 16, 2023
cbaaf80
Applied changes suggested by @jkotas
mkhamoyan Jun 16, 2023
caebcbe
Refactored some files
mkhamoyan Jun 19, 2023
71b026e
Make the doc more readable
mkhamoyan Jun 19, 2023
9dda83a
Refactored IndexOf function
mkhamoyan Jun 19, 2023
88c6861
Add more comments in IndexOF function
mkhamoyan Jun 19, 2023
460aba0
remove localizedStandardRangeOfString
mkhamoyan Jun 20, 2023
db0a8f8
Initial changes for casing functions
mkhamoyan Jun 20, 2023
3d5a195
Added exception in case mixed compositions
mkhamoyan Jun 21, 2023
006bdb7
Merge branch 'hybrid_collation_functions' into hybrid_casing_functions
mkhamoyan Jun 21, 2023
24136fc
Merge branch 'main' into hybrid_casing_functions
mkhamoyan Jun 21, 2023
d9fa03b
Update test cases
mkhamoyan Jun 21, 2023
356b250
Refactor casing functions
mkhamoyan Jun 22, 2023
040f214
Updated doc and did refactoing
mkhamoyan Jun 22, 2023
583f7bb
align code lines
mkhamoyan Jun 22, 2023
a244198
Update test comment
mkhamoyan Jun 22, 2023
f7823a3
Merge branch 'main' into hybrid_casing_functions
mkhamoyan Jun 23, 2023
8c2efb4
Order alphabetically function declarations
mkhamoyan Jun 23, 2023
ba2f1b3
Done minor refactoring
mkhamoyan Jun 23, 2023
ddd17c4
Refactor as requested by review
mkhamoyan Jun 26, 2023
f39dd83
Minor refactoring
mkhamoyan Jun 26, 2023
f46a9ba
Fix casing function implementation
mkhamoyan Jun 27, 2023
6661e4f
Update doc and test cases
mkhamoyan Jun 27, 2023
60a945a
Fix index in Append function
mkhamoyan Jun 27, 2023
2257cbe
minor refactoring
mkhamoyan Jun 27, 2023
3ff5de7
Update method comment and remove GetCurrentLocale
mkhamoyan Jun 27, 2023
bb6fbf5
Use Interop.GlobalizationInterop.ResultCode
mkhamoyan Jun 28, 2023
Make the doc more readable
mkhamoyan committed Jun 19, 2023
commit 71b026e81e1ff2a6cd6ee130e9e7aef68469013f
80 changes: 43 additions & 37 deletions docs/design/features/globalization-hybrid-mode.md
@@ -277,7 +277,7 @@ new CultureInfo("de-DE").CompareInfo.IndexOf("strasse", "stra\u00DFe", 0, Compar

For OSX platforms we are using native apis instead of ICU data.

**String comparison**
## String comparison

Affected public APIs:
- CompareInfo.Compare,
@@ -292,47 +292,47 @@ The number of `CompareOptions` and `NSStringCompareOptions` combinations are lim

- `None`:

`CompareOptions.None` is mapped to `NSStringCompareOptions.NSLiteralSearch`
`CompareOptions.None` is mapped to `NSStringCompareOptions.NSLiteralSearch`

There are some behaviour changes. Below are examples of such cases.
There are some behaviour changes. Below are examples of such cases.

| **character 1** | **character 2** | **CompareOptions** | **hybrid globalization** | **icu** | **comments** |
|:---------------:|:---------------:|--------------------|:------------------------:|:-------:|:-------------------------------------------------------:|
| `\u3042` あ | `\u30A1` ァ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u304D\u3083` きゃ | `\u30AD\u30E3` キャ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u304D\u3083` きゃ | `\u30AD\u3083` キゃ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u3070\u3073\uFF8C\uFF9E\uFF8D\uFF9E\u307C` ばびブベぼ | `\u30D0\u30D3\u3076\u30D9\uFF8E\uFF9E` バビぶベボ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u3060` だ | `\u30C0` ダ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| **character 1** | **character 2** | **CompareOptions** | **hybrid globalization** | **icu** | **comments** |
|:---------------:|:---------------:|--------------------|:------------------------:|:-------:|:-------------------------------------------------------:|
| `\u3042` あ | `\u30A1` ァ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u304D\u3083` きゃ | `\u30AD\u30E3` キャ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u304D\u3083` きゃ | `\u30AD\u3083` キゃ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u3070\u3073\uFF8C\uFF9E\uFF8D\uFF9E\u307C` ばびブベぼ | `\u30D0\u30D3\u3076\u30D9\uFF8E\uFF9E` バビぶベボ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| `\u3060` だ | `\u30C0` ダ | None | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |

- `StringSort` :

`CompareOptions.StringSort` is mapped to `NSStringCompareOptions.NSLiteralSearch`. ICU's default is to use "StringSort", i.e. nonalphanumeric symbols come before alphanumeric. `NSLiteralSearch` works the same way.
`CompareOptions.StringSort` is mapped to `NSStringCompareOptions.NSLiteralSearch`. ICU's default is to use "StringSort", i.e. nonalphanumeric symbols come before alphanumeric. `NSLiteralSearch` works the same way.

- `IgnoreCase`:

`CompareOptions.IgnoreCase` is mapped to `NSStringCompareOptions.NSCaseInsensitiveSearch | NSStringCompareOptions.NSLiteralSearch`
`CompareOptions.IgnoreCase` is mapped to `NSStringCompareOptions.NSCaseInsensitiveSearch | NSStringCompareOptions.NSLiteralSearch`

There are some behaviour changes. Below are examples of such cases.
There are some behaviour changes. Below are examples of such cases.

| **character 1** | **character 2** | **CompareOptions** | **hybrid globalization** | **icu** | **comments** |
|:---------------:|:---------------:|--------------------|:------------------------:|:-------:|:-------------------------------------------------------:|
| `\u3060` だ | `\u30C0` ダ | IgnoreCase | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |
| **character 1** | **character 2** | **CompareOptions** | **hybrid globalization** | **icu** | **comments** |
|:---------------:|:---------------:|--------------------|:------------------------:|:-------:|:-------------------------------------------------------:|
| `\u3060` だ | `\u30C0` ダ | IgnoreCase | 1 | -1 | hiragana and katakana characters are ordered differently compared to ICU |

- `IgnoreNonSpace`:

`CompareOptions.IgnoreNonSpace` is mapped to `NSStringCompareOptions.NSDiacriticInsensitiveSearch | NSStringCompareOptions.NSLiteralSearch`
`CompareOptions.IgnoreNonSpace` is mapped to `NSStringCompareOptions.NSDiacriticInsensitiveSearch | NSStringCompareOptions.NSLiteralSearch`

- `IgnoreWidth`:

`CompareOptions.IgnoreWidth` is mapped to `NSStringCompareOptions.NSWidthInsensitiveSearch | NSStringCompareOptions.NSLiteralSearch`
`CompareOptions.IgnoreWidth` is mapped to `NSStringCompareOptions.NSWidthInsensitiveSearch | NSStringCompareOptions.NSLiteralSearch`

- All combinations that contain below `CompareOptions` always throw `PlatformNotSupportedException`:

`IgnoreSymbols`,
`IgnoreSymbols`,

`IgnoreKanaType`,
`IgnoreKanaType`,
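
The option mapping listed above can be sketched as a small helper. This is only an illustration, not the runtime's actual code: `MapToNSStringCompareOptions` is a hypothetical name, the constants are assumed to mirror the raw values of Foundation's `NSStringCompareOptions`, and the sketch simply ORs flags together, whereas the real set of supported combinations may be narrower.

```csharp
using System;
using System.Globalization;

// Assumed to mirror the raw values of Foundation's NSStringCompareOptions.
const uint NSCaseInsensitiveSearch = 1;
const uint NSLiteralSearch = 2;
const uint NSDiacriticInsensitiveSearch = 128;
const uint NSWidthInsensitiveSearch = 256;

// Hypothetical helper: translates the CompareOptions described in the doc
// into the native flag value, rejecting options with no native equivalent.
uint MapToNSStringCompareOptions(CompareOptions options)
{
    const CompareOptions unsupported =
        CompareOptions.IgnoreSymbols | CompareOptions.IgnoreKanaType;

    if ((options & unsupported) != 0)
        throw new PlatformNotSupportedException(
            $"CompareOptions '{options}' has no NSStringCompareOptions equivalent.");

    // None and StringSort both map to NSLiteralSearch, which is also part
    // of every other mapping in the list.
    uint native = NSLiteralSearch;

    if ((options & CompareOptions.IgnoreCase) != 0)
        native |= NSCaseInsensitiveSearch;
    if ((options & CompareOptions.IgnoreNonSpace) != 0)
        native |= NSDiacriticInsensitiveSearch;
    if ((options & CompareOptions.IgnoreWidth) != 0)
        native |= NSWidthInsensitiveSearch;

    return native;
}

Console.WriteLine(MapToNSStringCompareOptions(CompareOptions.IgnoreCase)); // 3
```

Rejecting the unsupported flags before building the native value keeps the `PlatformNotSupportedException` behaviour in one place, whichever other options are combined with them.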

**String starts with / ends with**
## String starts with / ends with

Affected public APIs:
- CompareInfo.IsPrefix
@@ -348,9 +348,9 @@ Apple Native API does not expose locale-sensitive endsWith/startsWith function.

- `IgnoreSymbols`

As there is no IgnoreSymbols equivalent in NSStringCompareOptions all `CompareOptions` combinations that include `IgnoreSymbols` throw `PlatformNotSupportedException`
As there is no IgnoreSymbols equivalent in NSStringCompareOptions all `CompareOptions` combinations that include `IgnoreSymbols` throw `PlatformNotSupportedException`

**String indexing**
## String indexing

Affected public APIs:
- CompareInfo.IndexOf
@@ -362,38 +362,44 @@ Mapped to Apple Native API `rangeOfString:options:range:locale:`(https://develop

In `rangeOfString:options:range:locale:` objects are compared by checking the Unicode canonical equivalence of their code point sequences.
In cases where search string contains diaeresis and has different normalization form than in source string result can be incorrect.
Here are covered these cases with diaeresis:

Here are the covered cases with diaeresis:
1. Search string contains diaeresis and has same normalization form as in source string.
2. Search string contains diaeresis and matches the same letters in the source string with a different char length, and the matching substring is normalized in the source.

a. The search string, normalized to form C, is a substring of the source string. Example: search string: `U\u0308` source string: `Source is \u00DC` => matchLength is 1

b. The search string, normalized to form D, is a substring of the source string. Example: search string: `\u00FC` source string: `Source is \u0075\u0308` => matchLength is 2

Not covered case:
Search string contains diaeresis but with source string they have same letters with different char lengths but substring is not
normalized in source. example: search string: `U\u0308 and \u00FC` source string: `Source is a\u0308\u0308a and \u0075\u0308`
as it is visible from example normalizaing search strin to form C or D will not help to find substring in source string.

Search string contains diaeresis and matches the same letters in the source string with a different char length, but the substring is not
normalized in the source. Example: search string: `U\u0308 and \u00FC` source string: `Source is a\u0308\u0308a and \u0075\u0308`.
As the example shows, normalizing the search string to form C or D will not help to find the substring in the source string.
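
The match lengths in the two covered cases follow directly from Unicode normalization, which can be checked with `string.Normalize`. The snippet below is a minimal sketch of that observation only; it does not reproduce the native `rangeOfString:options:range:locale:` call, and the variable names are illustrative.

```csharp
using System;
using System.Text;

// Case a: the decomposed search string "U" + combining diaeresis (U+0308)
// composes to the single char U+00DC under form C, hence matchLength 1.
string composed = "U\u0308".Normalize(NormalizationForm.FormC);
Console.WriteLine(composed == "\u00DC"); // True
Console.WriteLine(composed.Length);      // 1

// Case b: the precomposed search string U+00FC decomposes to
// "u" (U+0075) + U+0308 under form D, hence matchLength 2.
string decomposed = "\u00FC".Normalize(NormalizationForm.FormD);
Console.WriteLine(decomposed == "\u0075\u0308"); // True
Console.WriteLine(decomposed.Length);            // 2
```

In the case that is not covered, the search string mixes a decomposed and a precomposed sequence, so no single normalization form of it lines up with the un-normalized source.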

- `IgnoreSymbols`

As there is no IgnoreSymbols equivalent in NSStringCompareOptions all `CompareOptions` combinations that include `IgnoreSymbols` throw `PlatformNotSupportedException`
As there is no IgnoreSymbols equivalent in NSStringCompareOptions all `CompareOptions` combinations that include `IgnoreSymbols` throw `PlatformNotSupportedException`

- Some letters consist of more than one grapheme.

Apple Native Api does not guarantee that string will be segmented by letters but by graphemes. E.g. in `cs-CZ` and `sk-SK` "ch" is 1 letter, 2 graphemes. The following code with `HybridGlobalization` switched off returns -1 (not found) while with `HybridGlobalization` switched on, it returns 1.
Apple Native Api does not guarantee that string will be segmented by letters but by graphemes. E.g. in `cs-CZ` and `sk-SK` "ch" is 1 letter, 2 graphemes. The following code with `HybridGlobalization` switched off returns -1 (not found) while with `HybridGlobalization` switched on, it returns 1.

``` C#
new CultureInfo("sk-SK").CompareInfo.IndexOf("ch", "h"); // -1 or 1
```
``` C#
new CultureInfo("sk-SK").CompareInfo.IndexOf("ch", "h"); // -1 or 1
```

- Some graphemes have multi-grapheme equivalents.
E.g. in `de-DE` ß (%u00DF) is one letter and one grapheme and "ss" is one letter and is recognized as two graphemes. Apple Native API's equivalent of `IgnoreNonSpace` treats them as the same letter when comparing. Similar case: dz (%u01F3) and dz.
E.g. in `de-DE` ß (%u00DF) is one letter and one grapheme and "ss" is one letter and is recognized as two graphemes. Apple Native API's equivalent of `IgnoreNonSpace` treats them as the same letter when comparing. Similar case: dz (%u01F3) and dz.

Using `IgnoreNonSpace` for these two with `HybridGlobalization` off, also returns 0 (they are equal). However, the workaround used in `HybridGlobalization` will compare them grapheme-by-grapheme and will return -1.
Using `IgnoreNonSpace` for these two with `HybridGlobalization` off, also returns 0 (they are equal). However, the workaround used in `HybridGlobalization` will compare them grapheme-by-grapheme and will return -1.

``` C#
new CultureInfo("de-DE").CompareInfo.IndexOf("strasse", "stra\u00DFe", 0, CompareOptions.IgnoreNonSpace); // 0 or -1
``` C#
new CultureInfo("de-DE").CompareInfo.IndexOf("strasse", "stra\u00DFe", 0, CompareOptions.IgnoreNonSpace); // 0 or -1
```


**SortKey**
## SortKey

Affected public APIs:
- CompareInfo.GetSortKey