Standardize materialize filenames a little #1283
Signed-off-by: Henry Lindeman <[email protected]>
Few nits, nice work!
    assert len(self.metadata["lineage_links"]["from_ids"]) > 0

    self.data["doc_id"] = mkdocid()
    if "doc_id" not in self.data:
Why do we need this?
Otherwise, metadata documents get a new doc_id every time they're materialized (since materialize calls the constructor), which I'm pretty sure is incorrect behavior.
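The guard discussed here can be illustrated with a minimal, self-contained sketch. It does not use the real sycamore classes: `MetadataDoc` and this `mkdocid` are hypothetical stand-ins, and the only assumption is that materialization rebuilds the object by passing the stored data back through the constructor.

```python
import uuid


def mkdocid() -> str:
    # Hypothetical stand-in for the project's ID generator.
    return uuid.uuid4().hex


class MetadataDoc:
    """Toy stand-in for a metadata document; not the real sycamore class."""

    def __init__(self, **data):
        self.data = dict(data)
        # Only mint an ID when the incoming data does not already carry one,
        # so rebuilding from stored data keeps the original ID.
        if "doc_id" not in self.data:
            self.data["doc_id"] = mkdocid()


original = MetadataDoc(note="first write")
# Simulates a read from materialized data re-running the constructor.
reloaded = MetadataDoc(**original.data)
assert reloaded.data["doc_id"] == original.data["doc_id"]
```

Without the `if` guard, `reloaded` would receive a fresh ID on every read, which is the behavior the reply describes as incorrect.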
    # Create a function that fails for specific documents
    def failing_map(doc):
        # logger.info(doc)
remove
    ctx = sycamore.init(exec_mode=self.exec_mode)

    with tempfile.TemporaryDirectory() as tmpdir1, tempfile.TemporaryDirectory() as tmpdir2:
        docs = make_docs(10)
At the end of this test, can we assert that we have 11 docs? make_docs should give you 10 Documents and 1 MetadataDocument.
Depending on how docs get batched through noop, I can't guarantee that there will be 11, but I can assert that there is at least one.
        return d


    logger = logging.getLogger(__name__)
remove
lib/sycamore/sycamore/materialize.py
Outdated
    ret = []
    count = 0
    for fi in self._fshelper.list_files(self._root):
        # logger.info(fi)
remove
Signed-off-by: Henry Lindeman <[email protected]>
Intent: add ability to materialize from a list of doc_ids
Resulting changes: