@@ -777,6 +777,8 @@ that is, if they consist of the same sequence of Unicode code points after
[Unicode Normalization Form C](https://unicode.org/reports/tr15/) ("NFC")
has been applied to both.

+ The _names_ are [immutable identifiers](https://www.unicode.org/reports/tr31/#Immutable_Identifier_Syntax).
+
> [!NOTE]
> Implementations are not required to normalize all _names_.
> Comparisons of _name_ values only need be done "as-if" normalization
@@ -786,12 +788,6 @@ has been applied to both.
> implementations can often substitute checking for actually applying normalization
> to _name_ values.
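
The "as-if" comparison described in the note can be sketched as follows. This is a minimal illustration, not part of the change under review: `names_equal` is a hypothetical helper, and the only library calls assumed are Python's standard `unicodedata.is_normalized` (Python 3.8+) and `unicodedata.normalize`.

```python
import unicodedata


def names_equal(a: str, b: str) -> bool:
    """Compare two names as if NFC had been applied to both.

    Hypothetical helper for illustration only.
    """
    # Fast path: checking whether a string is already in NFC is usually
    # cheaper than building a normalized copy of it.
    if unicodedata.is_normalized("NFC", a) and unicodedata.is_normalized("NFC", b):
        return a == b
    # Slow path: at least one input is not in NFC, so normalize both.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

With this sketch, a precomposed `é` (U+00E9) and `e` followed by U+0301 compare as equal, while inputs that are already normalized never pay for a normalization pass.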

- Valid content for _names_ is based on <cite>Namespaces in XML 1.0</cite>'s
- [NCName](https://www.w3.org/TR/xml-names/#NT-NCName).
- This is different from XML's [Name](https://www.w3.org/TR/xml/#NT-Name)
- in that it MUST NOT contain a U+003A COLON `:`.
- Otherwise, the set of characters allowed in a _name_ is large.
-
> [!NOTE]
> _External variables_ can be passed in that are not valid _names_.
> Such variables cannot be referenced in a _message_,
@@ -843,15 +839,64 @@ option = identifier o "=" o (literal / variable)
identifier = [namespace ":"] name
namespace = name
name = [bidi] name-start *name-char [bidi]
- name-start = ALPHA / "_"
- / %xC0-D6 / %xD8-F6 / %xF8-2FF
- / %x370-37D / %x37F-61B / %x61D-1FFF / %x200C-200D
- / %x2070-218F / %x2C00-2FEF / %x3001-D7FF
- / %xF900-FDCF / %xFDF0-FFFC / %x10000-EFFFF
+ name-start = ALPHA
+ ; omit Cc: %x0-1F, Whitespace: « », Ascii: «!"#$%&'()*»
+ / %x2B ; «+» omit Ascii: «,-./0123456789:;<=>?@» «[\]^»
+ / %x5F ; «_» omit Cc: %x7F-9F, Whitespace: %xA0, Ascii: «`» «{|}~»
+ / %xA1-61B ; omit BidiControl: %x61C
+ / %x61D-167F ; omit Whitespace: %x1680
+ / %x1681-1FFF ; omit Whitespace: %x2000-200A
+ / %x200B-200D ; omit BidiControl: %x200E-200F
+ / %x2010-2027 ; omit Whitespace: %x2028-2029 %x202F, BidiControl: %x202A-202E
+ / %x2030-205E ; omit Whitespace: %x205F
+ / %x2060-2065 ; omit BidiControl: %x2066-2069
+ / %x206A-2FFF ; omit Whitespace: %x3000
+ / %x3001-D7FF ; omit Cs: %xD800-DFFF
+ / %xE000-FDCF ; omit NChar: %xFDD0-FDEF
+ / %xFDF0-FFFD ; omit NChar: %xFFFE-FFFF
+ / %x10000-1FFFD ; omit NChar: %x1FFFE-1FFFF
+ / %x20000-2FFFD ; omit NChar: %x2FFFE-2FFFF
+ / %x30000-3FFFD ; omit NChar: %x3FFFE-3FFFF
+ / %x40000-4FFFD ; omit NChar: %x4FFFE-4FFFF
+ / %x50000-5FFFD ; omit NChar: %x5FFFE-5FFFF
+ / %x60000-6FFFD ; omit NChar: %x6FFFE-6FFFF
+ / %x70000-7FFFD ; omit NChar: %x7FFFE-7FFFF
+ / %x80000-8FFFD ; omit NChar: %x8FFFE-8FFFF
+ / %x90000-9FFFD ; omit NChar: %x9FFFE-9FFFF
+ / %xA0000-AFFFD ; omit NChar: %xAFFFE-AFFFF
+ / %xB0000-BFFFD ; omit NChar: %xBFFFE-BFFFF
+ / %xC0000-CFFFD ; omit NChar: %xCFFFE-CFFFF
+ / %xD0000-DFFFD ; omit NChar: %xDFFFE-DFFFF
+ / %xE0000-EFFFD ; omit NChar: %xEFFFE-EFFFF
+ / %xF0000-FFFFD ; omit NChar: %xFFFFE-FFFFF
+ / %x100000-10FFFD ; omit NChar: %x10FFFE-10FFFF
name-char = name-start / DIGIT / "-" / "."
- / %xB7 / %x300-36F / %x203F-2040
```

+ > [!NOTE]
+ > Syntactically, the definitions of `identifier` and `name-char` provide backwards compatibility over time by allowing a stable,
+ > wide range of characters.
+ > So when there is a new character in a version of Unicode, it can be used in any conformant implementation of MessageFormat.
+ > The definition currently excludes:
+ > * Most ASCII except for letters and characters used for numbers
+ >   * This avoids conflicts with syntax characters, and reserves some characters for future syntax.
+ > * Bidirectional controls (`Bidi_C`)
+ > * Control characters (`GC=Cc`, but not Format characters: `GC=Cf`)
+ > * Whitespace characters (`WSpace`)
+ > * Surrogate code points (`GC=Cs`)
+ > * Non-Characters (`NChar`)
+
+ This syntax allows a wide range of characters in _names_ and _identifiers_.
+ Implementers and authors of _functions_ and _messages_,
+ including _functions_, _options_, and _operands_ (variable names),
+ SHOULD avoid creating _names_ that could produce confusion or harm usability
+ by choosing names consistent with the following guidelines.
+ MessageFormat tools, such as linters, SHOULD warn when _names_ chosen by users
+ violate these constraints.
+ >
+ > 1. [Unicode Default Identifier Syntax](https://www.unicode.org/reports/tr31/#Default_Identifier_Syntax)
+ > 2. [Unicode General Security Profile for Identifiers](https://www.unicode.org/reports/tr39/#General_Security_Profile)
+
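
As a rough illustration of what the added `name-start` and `name-char` rules accept, the ranges can be transcribed into a small syntax check of the kind a linter might run first. This is a sketch under stated assumptions: the function names and the `NAME_START_RANGES` table are hypothetical, the optional `[bidi]` wrappers in `name` are ignored because their definition is not part of this hunk, and the check covers only the grammar, not the TR31 or TR39 guidelines listed above.

```python
# Code point ranges transcribed from the added `name-start` lines of the ABNF.
# The table name and helpers below are illustrative, not part of the spec.
NAME_START_RANGES = [
    (0x41, 0x5A), (0x61, 0x7A),  # ALPHA
    (0x2B, 0x2B),                # "+"
    (0x5F, 0x5F),                # "_"
    (0xA1, 0x61B),
    (0x61D, 0x167F),
    (0x1681, 0x1FFF),
    (0x200B, 0x200D),
    (0x2010, 0x2027),
    (0x2030, 0x205E),
    (0x2060, 0x2065),
    (0x206A, 0x2FFF),
    (0x3001, 0xD7FF),
    (0xE000, 0xFDCF),
    (0xFDF0, 0xFFFD),
    # Planes 1 through 16, each minus its two trailing noncharacters.
] + [(plane << 16, (plane << 16) | 0xFFFD) for plane in range(1, 0x11)]


def is_name_start(ch: str) -> bool:
    """True if the single character `ch` matches `name-start`."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


def is_name_char(ch: str) -> bool:
    """True if `ch` matches `name-char` (name-start / DIGIT / "-" / ".")."""
    return is_name_start(ch) or ch in "0123456789-."


def is_syntactic_name(s: str) -> bool:
    """True if `s` matches `name-start *name-char` (bidi wrappers ignored)."""
    return bool(s) and is_name_start(s[0]) and all(is_name_char(c) for c in s[1:])
```

A tool that wants to emit the warnings recommended above would still need Unicode property data (for example `XID_Start`/`XID_Continue` and the identifier status data from UTS #39) on top of this purely syntactic check.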

### Escape Sequences

An **_<dfn>escape sequence</dfn>_** is a two-character sequence starting with