BigTable: Adding a row generator on a table. #4679

Merged
merged 14 commits into from
Jan 26, 2018

Conversation

aneepct
Contributor

@aneepct aneepct commented Dec 29, 2017

Adding a row generator on a table. This will allow iteration over the rows in a table instead of reading the rows into an internal dictionary first. As soon as a row has been validated, it is available in the iterator, by using yield in the generator. The read_rows() method on table will still return PartialRowsData, and consume_all() will now use the row generator to process the ReadRowsResponses and populate the rows in an internal dictionary. The yield_rows() method on table provides a row iterator and does not store any of the rows internally. The rows are available as soon as they are read from the channel and validated.
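A minimal sketch of the two consumption styles described above, using plain Python stand-ins rather than the real Bigtable classes. The names `fake_responses`, `yield_rows`, and `consume_all` here are hypothetical and only mirror the described behavior; validation is reduced to checking that each row has a key.

```python
def fake_responses():
    """Stand-in for a streaming ReadRows response iterator (made-up data)."""
    yield {"row_key": b"row-1", "cells": {"cf:col": b"a"}}
    yield {"row_key": b"row-2", "cells": {"cf:col": b"b"}}
    yield {"row_key": b"row-3", "cells": {"cf:col": b"c"}}


def yield_rows(responses):
    """Yield each row as soon as it has been validated; store nothing."""
    for response in responses:
        if not response["row_key"]:
            raise ValueError("row is missing a key")
        yield response


def consume_all(responses):
    """Drain the generator into a dictionary keyed by row key."""
    rows = {}
    for row in yield_rows(responses):
        rows[row["row_key"]] = row
    return rows


# Streaming: each row is handled as it arrives; only the keys are kept here.
streamed_keys = [row["row_key"] for row in yield_rows(fake_responses())]

# Eager: every row ends up in the dictionary before the caller sees any of them.
all_rows = consume_all(fake_responses())
```

In this sketch the eager path is built on top of the generator, which matches the description of consume_all() reusing the row generator.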

Closes #4586.

… rows in a table instead of reading the rows into an internal dictionary first. As soon as a row has been validated, it is available in the iterator, by using yield in the generator.
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Dec 29, 2017
@googlebot

So there's good news and bad news.

👍 The good news is that everyone who needs to sign a CLA (the pull request submitter and all commit authors) has done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

@googlebot googlebot added cla: no This human has *not* signed the Contributor License Agreement. and removed cla: yes This human has signed the Contributor License Agreement. labels Jan 2, 2018
@@ -194,8 +194,66 @@ class PartialRowsData(object):

    def __init__(self, response_iterator):
        self._response_iterator = response_iterator
        self.generator = YieldRowsData(response_iterator)

    def __eq__(self, other):
        if not isinstance(other, self.__class__):
            return False

        return other._response_iterator == self._response_iterator

    def __ne__(self, other):
        return not self.__eq__(other)

        """
        # NOTE: To avoid duplicating large objects, this is just the
        # mutable private data.
        return self._rows

    :param response_iterator: A streaming iterator returned from a
                              ``ReadRows`` request.
    """
    START = "Start"  # No responses yet processed.

    def state(self):
        """State machine state.

        :rtype: str

@@ -425,18 +451,9 @@ def _copy_from_previous(self, cell):
cell.row_key = previous.row_key
if not cell.family_name:
cell.family_name = previous.family_name
# NOTE: ``cell.qualifier`` **can** be empty string.
if cell.qualifier is None:
if not cell.qualifier:

@@ -189,117 +189,85 @@ def test_row_key_getter(self):


class TestPartialRowsData(unittest.TestCase):
@staticmethod

@staticmethod
def _get_target_class():
def _get_partial_target_class():

@tseaver tseaver added the api: bigtable Issues related to the Bigtable API. label Jan 2, 2018
@sduskis
Contributor

sduskis commented Jan 9, 2018

Since consume_next() was a public method, does that mean that we need to maintain backwards compatibility? If so, can we just have it read a single row via the generator?
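One way the suggestion above could look, as a rough sketch only: a `consume_next()` that pulls exactly one row from a row generator. `PartialRows` is a hypothetical stand-in, not the library's `PartialRowsData`, and returning a boolean on exhaustion is an assumption made for this example.

```python
class PartialRows:
    """Hypothetical stand-in, not the library's PartialRowsData."""

    def __init__(self, row_iterator):
        self._generator = iter(row_iterator)
        self.rows = {}

    def consume_next(self):
        """Read exactly one row from the generator; return False when exhausted."""
        try:
            key, value = next(self._generator)
        except StopIteration:
            return False
        self.rows[key] = value
        return True


data = PartialRows([(b"row-1", "a"), (b"row-2", "b")])
first = data.consume_next()
rows_after_first = dict(data.rows)  # snapshot after a single row
second = data.consume_next()
third = data.consume_next()  # generator is exhausted by now
```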

@dhermes
Contributor

dhermes commented Jan 9, 2018

does that mean that we need to maintain backwards compatibility

Not until 1.0 (current version is 0.28.1). We typically try to provide a "warning" period though before breaking an interface, but in a case like this it might not be possible and it'd be fine to make a breaking change in the release (we can just include that in the release notes).
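For reference, a "warning" period of the kind mentioned here is commonly implemented with the standard `warnings` module. The sketch below is illustrative only; the function body and message are assumptions, not code from this PR.

```python
import warnings


def consume_next():
    """Hypothetical legacy method kept for one release with a warning."""
    warnings.warn(
        "consume_next() is deprecated; iterate over the row generator instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this function
    )
    return None


# Capture the warning to show that callers would see it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    consume_next()

warning_categories = [w.category for w in caught]
```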

@chemelnucfin chemelnucfin added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jan 15, 2018
@zakons
Contributor

zakons commented Jan 18, 2018

All requested review changes have been made, but we have not found a way to write unit tests that cover the iterator / generator code. Please share any suggestions so this can be completed.

@chemelnucfin
Contributor

This would close #1812 from discussion in #4761

@aneepct aneepct changed the title from "Adding a row generator on a table." to "To: BigTable: Adding a row generator on a table." Jan 22, 2018
@aneepct aneepct changed the title from "To: BigTable: Adding a row generator on a table." to "BigTable: Adding a row generator on a table." Jan 22, 2018
@zakons
Contributor

zakons commented Jan 24, 2018

@tseaver @sduskis We are getting 100% code coverage in PyCharm, which uses coverage.py:

[Image: yield_row_coverage]

However, circleci is identifying lines 338-341 as NOT having coverage. This is the only thing holding up the merge for this PR. Any suggestions or ideas would be very much appreciated so we can complete this PR.

@tseaver
Contributor

tseaver commented Jan 24, 2018

@zakons The output on CircleCI is 338 -> 341, which is a branch miss -- a normal gap would be denoted 338-341. Can you configure the PyCharm -> coverage bit so that it enables branch coverage?

To mitigate: the branch is the test if self._cell: -- that condition is always true, and therefore the call to self._save_current_cell() is always executed. We need a test which gets to that point without having a cell assigned, or else (if that is impossible) we can just take out the if altogether.
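A small sketch of why line coverage and branch coverage can disagree on an `if` with no `else`. `RowBuilder` and `finish_row` are hypothetical names for illustration; only `self._cell` and `_save_current_cell()` come from the discussion above.

```python
class RowBuilder:
    """Hypothetical stand-in for the state machine discussed above."""

    def __init__(self):
        self._cell = None
        self.saved = []

    def _save_current_cell(self):
        self.saved.append(self._cell)
        self._cell = None

    def finish_row(self):
        # Branch coverage tracks two exits from this line: into the body
        # and past it. Line coverage only sees that the body ran.
        if self._cell:
            self._save_current_cell()
        return list(self.saved)


# Covers the "true" branch only: every line runs, but one branch is missed.
with_cell = RowBuilder()
with_cell._cell = "cell-1"
result_with_cell = with_cell.finish_row()

# Covers the "false" branch: reach the check without a cell assigned.
without_cell = RowBuilder()
result_without_cell = without_cell.finish_row()
```

With coverage.py, branch measurement is enabled with `coverage run --branch` or with `branch = True` under `[run]` in `.coveragerc`.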

@zakons
Contributor

zakons commented Jan 26, 2018

@tseaver That was the issue - branch coverage. Your help is very much appreciated. Can you please fix the cla so it is green (see OK on this from @aneepct) and I believe we are ready to merge. Thanks!

@tseaver tseaver merged commit 166278f into googleapis:master Jan 26, 2018
@sduskis
Contributor

sduskis commented Jan 26, 2018

Thanks!

@chemelnucfin chemelnucfin added cla: yes This human has signed the Contributor License Agreement. and removed cla: no This human has *not* signed the Contributor License Agreement. labels Feb 3, 2018
@zakons zakons deleted the feature/row_iterator branch February 5, 2018 22:06
@zunger-humu

zunger-humu commented Mar 5, 2018

So, guess which breaking change didn't get mentioned in the release notes? ;)

Also, the documentation is now out-of-date, so it ended up taking some source diving to figure out how to use the new code. I think this means that Table.read_rows needs to be marked as deprecated, or its documentation otherwise modified to indicate when you should use it and when you should use yield_rows. TBH, from the code I think read_rows can just be ripped out; if you need to pull all the information, you can just wrap a call to yield_rows in list().
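The `list()` idea can be sketched as follows. `FakeTable` is a hypothetical stand-in and not the google-cloud-bigtable `Table` class; the real `yield_rows` may take arguments that are not shown here.

```python
class FakeTable:
    """Hypothetical stand-in; not the google-cloud-bigtable Table class."""

    def __init__(self, rows):
        self._rows = rows

    def yield_rows(self):
        # Rows are produced lazily, one at a time.
        for row in self._rows:
            yield row


table = FakeTable([("row-1", "a"), ("row-2", "b")])

# Pull everything at once by materializing the generator.
everything = list(table.yield_rows())

# Or build a dictionary keyed by row key, similar in spirit to what
# consume_all() populates.
by_key = dict(table.yield_rows())
```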

Labels
api: bigtable Issues related to the Bigtable API. cla: yes This human has signed the Contributor License Agreement. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
8 participants