Implemented feature as in #560 (ints/doubles as strings) #564

Merged
miloyip merged 26 commits into Tencent:master from corporateshark:stringnumbers
Mar 3, 2016

@corporateshark
Contributor

Fast parsing of ints/doubles as strings as described in #560

Note: a null terminator is never appended to the number string, so always use the explicit length.
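
Because no terminator is appended, a consumer has to build its value from the pointer and the explicit length. The following is a minimal sketch of that idea, assuming the number arrives as a (pointer, length) pair that points into a larger JSON buffer. `NumberToken` and `ToOwnedString` are hypothetical names used for illustration, not part of RapidJSON.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical token type: a view into the JSON buffer, NOT null-terminated.
struct NumberToken {
    const char* str;
    std::size_t length;
};

// Copy exactly `length` characters; never rely on a trailing '\0'.
inline std::string ToOwnedString(const NumberToken& t) {
    assert(t.str != nullptr);
    return std::string(t.str, t.length);
}
```

Constructing `std::string(t.str)` without the length would instead read on past the number into the rest of the buffer.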

@corporateshark changed the title from "Implemented feature as in #560" to "Implemented feature as in #560 (ints/doubles as strings)" on Feb 28, 2016
@miloyip
Collaborator

miloyip commented Feb 29, 2016

I think:

  1. BaseReaderHandler::Number() should call String() instead of Default() (Similar to Key()).
  2. Why can kParseNumbersAsStringsFlag only be enabled together with kParseInsituFlag?
  3. Is it also possible to eliminate some code paths in ParseNumber() when this flag is on (to improve performance)? The bottom line is that it must still validate numbers against the JSON standard.
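
Point 1 can be sketched with a simplified CRTP base class. This is only an illustration of the forwarding idea, assuming a handler shape loosely modelled on BaseReaderHandler; `SketchHandlerBase` and `CollectStrings` are made-up names, and the signatures are not RapidJSON's actual declarations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified CRTP base (illustrative only, not RapidJSON code).
template <typename Derived>
struct SketchHandlerBase {
    bool Default() { return true; }
    bool String(const char*, std::size_t, bool) {
        return static_cast<Derived&>(*this).Default();
    }
    // Keys fall back to String() ...
    bool Key(const char* str, std::size_t length, bool copy) {
        return static_cast<Derived&>(*this).String(str, length, copy);
    }
    // ... and raw numbers do the same instead of going straight to Default().
    bool RawNumber(const char* str, std::size_t length, bool copy) {
        return static_cast<Derived&>(*this).String(str, length, copy);
    }
};

// A handler that only overrides String() still receives numbers as text.
struct CollectStrings : SketchHandlerBase<CollectStrings> {
    std::string last;
    bool String(const char* str, std::size_t length, bool /*copy*/) {
        assert(str != nullptr);
        last.assign(str, length);  // explicit length, no terminator assumed
        return true;
    }
};
```

With this arrangement, a handler written before the flag existed keeps working as long as it already handles String().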

@corporateshark
Contributor Author

Some code fragments can be eliminated without breaking the validation rules, but it does not seem that much can be done. For example, code like this can be excluded:

            if (expMinus)
                exp = -exp;

However, it is not in any tight loop, and disabling anything else in the parsing code would break the JSON validation rules (that might be a subject for yet another flag, but not this one).

It seems the encoding unit tests are broken because of this line:

typename InputStream::Ch *head = is.PutBegin();

What is the proper way to handle the encoding of input and output streams?

@miloyip
Collaborator

miloyip commented Feb 29, 2016

As an example, the following code involving i, significandDigit, etc. should not be necessary for kParseNumbersAsStringsFlag:

        // Parse int: zero / ( digit1-9 *DIGIT )
        unsigned i = 0;
        uint64_t i64 = 0;
        bool use64bit = false;
        int significandDigit = 0;
        if (RAPIDJSON_UNLIKELY(s.Peek() == '0')) {
            i = 0;
            s.TakePush();
        }
        else if (RAPIDJSON_LIKELY(s.Peek() >= '1' && s.Peek() <= '9')) {
            i = static_cast<unsigned>(s.TakePush() - '0');

            if (minus)
                while (RAPIDJSON_LIKELY(s.Peek() >= '0' && s.Peek() <= '9')) {
                    if (RAPIDJSON_UNLIKELY(i >= 214748364)) { // 2^31 = 2147483648
                        if (RAPIDJSON_LIKELY(i != 214748364 || s.Peek() > '8')) {
                            i64 = i;
                            use64bit = true;
                            break;
                        }
                    }
                    i = i * 10 + static_cast<unsigned>(s.TakePush() - '0');
                    significandDigit++;
                }
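
For comparison, when only the text of the number is needed, the grammar can be checked without accumulating i, i64 or significandDigit. The following is a rough sketch under that assumption. It works on a plain character range instead of RapidJSON's stream classes, it skips range checks, and `ScanJsonNumber` is a hypothetical helper, not library code.

```cpp
#include <cassert>
#include <cstddef>

inline bool IsDigit(char c) { return c >= '0' && c <= '9'; }

// Returns the length of the JSON number starting at `p`, or 0 if the text
// does not match: [-] (0 | digit1-9 *DIGIT) [. 1*DIGIT] [(e|E) [+|-] 1*DIGIT]
inline std::size_t ScanJsonNumber(const char* p, const char* end) {
    assert(p != nullptr && p <= end);
    const char* s = p;
    if (s != end && *s == '-') ++s;

    // int: zero / ( digit1-9 *DIGIT ), no value is accumulated
    if (s == end) return 0;
    if (*s == '0') {
        ++s;
    } else if (*s >= '1' && *s <= '9') {
        do { ++s; } while (s != end && IsDigit(*s));
    } else {
        return 0;
    }

    // frac: at least one digit must follow the '.'
    if (s != end && *s == '.') {
        ++s;
        if (s == end || !IsDigit(*s)) return 0;
        while (s != end && IsDigit(*s)) ++s;
    }

    // exp: optional sign, then at least one digit
    if (s != end && (*s == 'e' || *s == 'E')) {
        ++s;
        if (s != end && (*s == '+' || *s == '-')) ++s;
        if (s == end || !IsDigit(*s)) return 0;
        while (s != end && IsDigit(*s)) ++s;
    }
    return static_cast<std::size_t>(s - p);
}
```

The returned length is exactly what a string-style callback would need alongside the start pointer.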

For the following line:

typename InputStream::Ch *head = is.PutBegin();

That should only be possible for an in situ stream. For other streams (such as a file stream), you cannot get a pointer to the current position.

I think you should make use of the existing NumberStream to back up the characters on a stack, with modifications for '.', 'e' and the exponent digits.

I would also suggest appending \0, as in the String() event.
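
The backup idea can be sketched as follows. This is an assumption-laden illustration rather than RapidJSON's NumberStream: `BackupNumberStream`, `IsNumberChar` and `CollectNumberText` are hypothetical names, a std::istream stands in for any stream that cannot hand out pointers, and the loop collects number characters without validating the grammar.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // only needed for std::istringstream in the usage note
#include <string>
#include <vector>

// Copies every consumed character onto a stack, so the number text can be
// returned even though the underlying stream has no addressable buffer.
class BackupNumberStream {
public:
    explicit BackupNumberStream(std::istream& is) : is_(is) {}

    // Look at the next character without consuming it ('\0' at end of input).
    char Peek() const {
        int c = is_.peek();
        return c == std::char_traits<char>::eof() ? '\0' : static_cast<char>(c);
    }
    // Consume one character and back it up on the stack.
    char TakePush() {
        assert(is_.peek() != std::char_traits<char>::eof());
        char c = static_cast<char>(is_.get());
        stack_.push_back(c);
        return c;
    }
    std::size_t Length() const { return stack_.size(); }
    // Append '\0' (safe here: the stack is a private copy) and return the text.
    const char* Pop() {
        stack_.push_back('\0');
        return stack_.data();
    }

private:
    std::istream& is_;
    std::vector<char> stack_;
};

inline bool IsNumberChar(char c) {
    return (c >= '0' && c <= '9') || c == '-' || c == '+' ||
           c == '.' || c == 'e' || c == 'E';
}

// Collects the characters of one number, including '.', 'e' and exponent digits.
inline std::string CollectNumberText(std::istream& is) {
    BackupNumberStream s(is);
    while (IsNumberChar(s.Peek())) s.TakePush();
    const char* str = s.Pop();
    std::size_t length = s.Length() - 1;  // exclude the '\0' pushed by Pop()
    return std::string(str, length);
}
```

For example, calling `CollectNumberText` on a `std::istringstream` holding `-12.5e3,"next"` collects the number text and leaves the comma unread in the stream.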

@miloyip
Collaborator

miloyip commented Feb 29, 2016

Besides, I think new unit tests are necessary as well.

@corporateshark
Contributor Author

Do you mean something like this?

        if (parseFlags & kParseNumbersAsStringsFlag) {

            if (parseFlags & kParseInsituFlag) {
                ...
            }
            else {
                const char* str = s.Pop();
                SizeType length = static_cast<SizeType>(s.Length()) - 1;
                cont = handler.RawNumber(str, SizeType(length), true);
            }

        }

@corporateshark force-pushed the stringnumbers branch 3 times, most recently from 6707351 to 8430fc0 on March 2, 2016 00:31
@corporateshark
Contributor Author

Unit tests added.

@corporateshark
Contributor Author

Zero-termination in in-situ mode seems problematic. Consider this valid JSON fragment:

                    "BufferWidth": 512,
                    "BufferHeight": 512,

Inserting \0 after 512 would overwrite the comma, making any further parsing invalid. Currently no zero character is inserted in in-situ mode, and this is explicitly stated in the documentation. Does that sound like a viable solution?

@miloyip
Collaborator

miloyip commented Mar 3, 2016

Yes, you are right. Unlike a string, a number has no closing " that could be replaced by a null terminator in in situ mode.
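
The difference between the two cases can be shown on a small mutable buffer. This is only a sketch of the problem discussed above; `TerminateInPlace` and the two demo functions are hypothetical helpers, not RapidJSON code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Terminate a token in place by overwriting the character right after it.
inline void TerminateInPlace(char* buffer, std::size_t tokenEnd) {
    assert(buffer != nullptr);
    buffer[tokenEnd] = '\0';
}

// String token: the closing quote is spare, so the comma survives.
inline bool CommaSurvivesForString() {
    char json[] = "[\"ab\",1]";
    TerminateInPlace(json, 4);  // index 4 is the closing '"'
    return std::strcmp(json + 2, "ab") == 0 && json[5] == ',';
}

// Number token: the next character is the comma itself, so it is destroyed.
inline bool CommaSurvivesForNumber() {
    char json[] = "[512,513]";
    TerminateInPlace(json, 4);  // index 4 is ','
    return json[4] == ',';
}
```

This is why the in situ path reports numbers with an explicit length and leaves the buffer untouched.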

@miloyip added a commit that referenced this pull request on Mar 3, 2016: Implemented feature as in #560 (ints/doubles as strings)
@miloyip merged commit 6d0f0b2 into Tencent:master on Mar 3, 2016
@miloyip
Collaborator

miloyip commented Mar 3, 2016

Thank you for your contribution.

@corporateshark deleted the stringnumbers branch on March 18, 2016 01:23