@juliaputko
Contributor

db_identifier uniqueness testing for multi database support

Contributor

@billschereriii billschereriii left a comment

Looks like a good start on the problem of catching duplicate database identifiers. The names you have used for variables and methods are making the code a bit hard to follow though.

In broad strokes, there are some areas that haven't been touched on yet in the PR. I assume you're intending to address these:

  • There are several placeholders where you are clearly intending to make changes but haven't yet.
  • I don't see the changes to the launcher code to invoke the new database ID names
  • Changelog isn't updated

@ashao ashao marked this pull request as draft August 16, 2023 19:38
Member

@ashao ashao left a comment

Good start to this PR!

General comments:

  • Try to avoid adding properties to objects unless they absolutely need them. This restricts the flow of information and can help avoid writing too much spaghetti code down the line.
  • Most functions actually do something, so a good naming convention is to start with a verb followed by a short description of what the function does.
  • Python uses the soft convention of indicating that a function or property is private with a leading underscore, e.g. _foo.
  • While it is in general good to keep functions short, having functions that are one-liners is generally discouraged.
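
A minimal sketch of the naming conventions described above. `DatabaseRegistry` and its methods are hypothetical names invented for illustration and are not part of SmartSim; the point is only the verb-first public method and the underscore-prefixed private helper.

```python
import typing as t


class DatabaseRegistry:
    """Hypothetical toy class, used only to illustrate naming."""

    def __init__(self) -> None:
        # Leading underscore: private by convention, not enforced by Python.
        self._identifiers: t.Set[str] = set()

    def register_identifier(self, db_identifier: str) -> None:
        """Public method: a verb first, then what it acts on."""
        self._raise_if_registered(db_identifier)
        self._identifiers.add(db_identifier)

    def _raise_if_registered(self, db_identifier: str) -> None:
        """Private helper, signalled by the leading underscore."""
        if db_identifier in self._identifiers:
            raise ValueError(f"Identifier {db_identifier!r} is already registered")
```

Registering the same identifier twice on one `DatabaseRegistry` instance raises `ValueError` on the second call.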

:rtype: Orchestrator or derived class
"""

self.append_to_db_identifier_list(db_identifier)
Contributor

The ID list feels like it duplicates information found elsewhere (e.g. the list of orchestrators) and gives us two sources of truth that can fall out of sync. Consider not adding this additional collection.
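
One way to read this suggestion, sketched under assumptions: instead of maintaining a separate identifier list, derive duplicates on demand from the orchestrators themselves. `FakeOrchestrator` and `find_duplicate_db_identifiers` are hypothetical stand-ins, not SmartSim classes or functions.

```python
import typing as t
from dataclasses import dataclass


@dataclass
class FakeOrchestrator:
    """Hypothetical stand-in; only the identifier matters for this sketch."""

    db_identifier: str


def find_duplicate_db_identifiers(
    orchestrators: t.Iterable[FakeOrchestrator],
) -> t.Set[str]:
    """Return identifiers used by more than one orchestrator.

    The orchestrator collection is the single source of truth, so there
    is no second list that could fall out of sync with it.
    """
    seen: t.Set[str] = set()
    duplicates: t.Set[str] = set()
    for orc in orchestrators:
        if orc.db_identifier in seen:
            duplicates.add(orc.db_identifier)
        seen.add(orc.db_identifier)
    return duplicates
```

Because the result is computed from whatever orchestrators exist at call time, removing an orchestrator needs no extra bookkeeping.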

@codecov

codecov bot commented Sep 27, 2023

Codecov Report

Merging #342 (5bdf6eb) into develop (f00c426) will increase coverage by 0.24%.
The diff coverage is 95.16%.

@@             Coverage Diff             @@
##           develop     #342      +/-   ##
===========================================
+ Coverage    89.52%   89.77%   +0.24%     
===========================================
  Files           58       59       +1     
  Lines         3571     3609      +38     
===========================================
+ Hits          3197     3240      +43     
+ Misses         374      369       -5     
Files Coverage Δ
smartsim/_core/control/jobmanager.py 94.03% <100.00%> (+0.75%) ⬆️
smartsim/_core/generation/generator.py 96.82% <100.00%> (+0.76%) ⬆️
smartsim/_core/utils/helpers.py 90.19% <100.00%> (+1.06%) ⬆️
smartsim/database/orchestrator.py 86.71% <100.00%> (+0.14%) ⬆️
smartsim/entity/dbnode.py 92.10% <100.00%> (+0.06%) ⬆️
smartsim/entity/model.py 95.59% <ø> (ø)
smartsim/error/__init__.py 100.00% <ø> (ø)
smartsim/error/errors.py 100.00% <100.00%> (ø)
smartsim/experiment.py 80.95% <100.00%> (+0.81%) ⬆️
smartsim/ml/data.py 94.14% <100.00%> (ø)
... and 4 more

... and 1 file with indirect coverage changes

Collaborator

@al-rigazzi al-rigazzi left a comment

Great stuff! Some cleanup may be needed - I'll leave it up to your judgment, but otherwise looks good.

run: |
python -m pip install git+https://github.com/CrayLabs/SmartRedis.git@develop#egg=smartredis
python -m pip install git+https://github.com/billschereriii/smartredis.git@multidb
Collaborator

I guess this has to be reverted to CrayLabs before we merge

Contributor Author

good call! Thanks - might be the last thing I do to make sure the CI passes

# Versions
SMARTSIM = Version_(get_env("SMARTSIM_VERSION", "0.5.1"))
SMARTREDIS = Version_(get_env("SMARTREDIS_VERSION", "0.4.2"))
SMARTREDIS = Version_(get_env("SMARTREDIS_VERSION", "0.4.1"))
Collaborator

I think this will have to be modified once we can use the CrayLabs branch

Contributor Author

todo before final merge


# Retrieve num_shards to append to client env
client_env[f"SR_DB_TYPE{db_name}"] = (
"Clustered" if len(addresses) > 1 else "Standalone"
Collaborator

As I think we will use "Clustered" and "Standalone" in other classes, I suggest defining them as enumerators (like we do for Status in other classes). That should reduce the chances of misspelling them, and make it easier to read them as constants.
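
A rough sketch of what such an enumeration could look like. The name `DBType` and the helper `get_db_type` are assumptions made for illustration; the PR's actual definitions are not shown in this conversation.

```python
import typing as t
from enum import Enum


class DBType(Enum):
    """Hypothetical enum holding the two database type strings."""

    CLUSTERED = "Clustered"
    STANDALONE = "Standalone"


def get_db_type(addresses: t.Sequence[str]) -> DBType:
    """More than one address implies a clustered database."""
    return DBType.CLUSTERED if len(addresses) > 1 else DBType.STANDALONE


# Usage: store the plain string value in the client environment.
client_env: t.Dict[str, str] = {}
client_env["SR_DB_TYPE_example"] = get_db_type(["10.0.0.1:6379"]).value
```

A misspelled member such as `DBType.STANDALNE` fails immediately with `AttributeError`, whereas a misspelled string literal would pass silently.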

class SSReservedKeywordError(SmartSimError):
"""Raised when a Reserved Keyword is used incorrectly"""

class DBIDConflictError(SmartSimError):
Collaborator

Could we prefix this with SS? Just to make it clear it is a SmartSim-generated error.

steps.append((batch_step, elist))
else:
# if ensemble is to be run as separate job steps, aka not in a batch
# if ensemble is to be run as separate job steps,
Collaborator

This comment is slightly "redundant", as it simply explains what .batch means. Maybe it is left over from debugging?

db_cpus: int = 1,
custom_pinning: t.Optional[t.Iterable[t.Union[int, t.Iterable[int]]]] = None,
debug: bool = False,
db_identifier: t.Optional[str] = "",
Collaborator

I think, but @MattToast can correct me, that if we define the default value as "", the variable type is str and not Optional[str]. But I may be wrong.
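
For reference, a small sketch of the distinction in question. In standard typing, `t.Optional[str]` is shorthand for `t.Union[str, None]`; a default of `""` does not change the declared type, but it does make the `Optional` wider than necessary if `None` is never meant to be passed. The function names below are hypothetical.

```python
import typing as t


def describe_db(db_identifier: str = "") -> str:
    """Plain str is enough when the 'unset' value is the empty string."""
    return db_identifier or "<unnamed>"


def describe_db_optional(db_identifier: t.Optional[str] = None) -> str:
    """Optional[str] fits when None itself is an accepted argument."""
    if db_identifier is None:
        return "<unnamed>"
    return db_identifier


# Optional[X] and Union[X, None] are the same type object.
same_type = t.Optional[str] == t.Union[str, None]
```

With the first signature, a type checker would flag `describe_db(None)`, which is usually what is wanted when `None` has no meaning for the parameter.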

db_cpus: int = 1,
custom_pinning: t.Optional[t.Iterable[t.Union[int, t.Iterable[int]]]] = None,
debug: bool = False,
db_identifier: t.Optional[str] = "",
Collaborator

Same comment as above.

if db_identifier in self.db_identifiers:
logger.warning(
f"A database with the identifier {db_identifier} has already been made"
"An error will be raised if multiple databases are started"
Collaborator

Missing white space at the end of strings, when concatenated the result will not be the desired one (i.e. "... madeAn ... startedwith").
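
The effect can be reproduced with a short sketch. Adjacent string literals are joined by Python with nothing in between, so any separator has to be written explicitly. Only the two message lines visible in the diff are used here; the identifier value and the added period and space are illustrative choices.

```python
db_identifier = "orc_1"  # example value, chosen for illustration

# Adjacent literals are concatenated exactly as written: no space is added.
broken = (
    f"A database with the identifier {db_identifier} has already been made"
    "An error will be raised if multiple databases are started"
)

# Ending the first literal with explicit punctuation and a space fixes it.
fixed = (
    f"A database with the identifier {db_identifier} has already been made. "
    "An error will be raised if multiple databases are started"
)
```

Here `broken` contains the run-together text `madeAn`, while `fixed` reads `made. An`.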

Member

@ashao ashao left a comment

Some very minor changes requested. Thanks for doing such a careful job

" name for db_identifier"
)

db_name_colo = unpack_colo_db_identfifier(db_name_colo)
Member

Misspelling: identfifier -> identifier

raise SSInternalError(
"Colocated database was not configured for either TCP or UDS"
)
client_env[f"SR_DB_TYPE{db_name_colo}"] = "Standalone"
Member

Use the STANDALONE enum from servertypes


self.kill_on_interrupt = True # flag for killing jobs on SIGINT

self.active_db_identifiers: t.Set[str] = set()
Member

Remove this, no longer used

:param job: job instance we are transitioning
:type job: Job
"""
# remove db id from active entity list
Member

Remove this comment


def _gen_orc_dir(self, orchestrator: t.Optional[Orchestrator]) -> None:
def _gen_orc_dir(self, orchestrator_list: t.List[Orchestrator]) -> None:
# orchestrator: t.Optional[Orchestrator]
Member

Remove comment

self._control = Controller(launcher=launcher)
self._launcher = launcher.lower()
self.db_identifiers: t.Set[str] = set()
self.db_dict: t.Dict[str, t.Any] = {}
Member

db_dict is no longer used, remove

Member

I don't think this relies on any of the ML backends. Move this file one directory up. If any of the tests do use an ML backend keep them here.

assert all([stat == status.STATUS_CANCELLED for stat in statuses])


# JPNOTE
Member

Remove comments

def unpack_db_identifier(db_id: str, token: str) -> t.Tuple[str, str]:
"""Unpack the unformatted database identifier using the token,
and format for env variable suffix
:db_id: the unformatted database identifier eg. charizard_0
Member

@ashao ashao Oct 4, 2023

I'll play the role of crotchety, no-fun old dev along with @al-rigazzi here...but we should probably rename this :(

@al-rigazzi al-rigazzi self-requested a review October 10, 2023 17:42
Collaborator

@al-rigazzi al-rigazzi left a comment

LGTM, pending conflicts to be resolved. Thanks for this big PR!

@billschereriii
Contributor

Looks great! Thanks for sticking with this one, it turned out to be a real beast!

@juliaputko juliaputko merged commit a9e64c8 into CrayLabs:develop Oct 11, 2023
5 participants