Conversation

@franluca
Contributor

Issue: when subclassing the JumpStart/SageMaker/Bedrock model runners, one also has to override the __reduce__ function to change the returned class, which is not intuitive.

Description of changes: changed the return value of the __reduce__ functions to self.__class__ rather than a hard-coded class (e.g. BedrockModelRunner), making an override of __reduce__ optional (and very likely unnecessary).
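To illustrate the change described above, here is a minimal sketch using simplified stand-in classes. `HardCodedRunner`, `FlexibleRunner`, their subclasses, and the `rebuild` helper are hypothetical names for illustration, not fmeval APIs; `rebuild` only imitates what unpickling does with the first two items returned by `__reduce__`.

```python
def rebuild(obj):
    """Recreate obj roughly as unpickling would: call the factory
    returned by __reduce__ with the accompanying arguments."""
    factory, args = obj.__reduce__()[:2]
    return factory(*args)


class HardCodedRunner:
    """Hypothetical runner whose __reduce__ names its own class explicitly."""

    def __init__(self, model_id):
        self._model_id = model_id

    def __reduce__(self):
        return HardCodedRunner, (self._model_id,)


class FlexibleRunner:
    """Hypothetical runner whose __reduce__ returns self.__class__."""

    def __init__(self, model_id):
        self._model_id = model_id

    def __reduce__(self):
        return self.__class__, (self._model_id,)


class HardCodedChild(HardCodedRunner):
    pass


class FlexibleChild(FlexibleRunner):
    pass


hard_copy = rebuild(HardCodedChild("some-model"))
flex_copy = rebuild(FlexibleChild("some-model"))

print(type(hard_copy).__name__)  # HardCodedRunner: the subclass is lost
print(type(flex_copy).__name__)  # FlexibleChild: the subclass survives
```

With the hard-coded class, any method overridden in the subclass (such as `predict`) would silently disappear after a serialization round trip.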

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Contributor

@keerthanvasist keerthanvasist left a comment

Why is overriding __reduce__ not necessary? It's critical that __reduce__ is implemented correctly.

Contributor

@danielezhu danielezhu left a comment

Using self.__class__ helps in the case where you subclass one of these model runners and there are no changes to the subclass's __init__ method. In that case, the parent class's __reduce__ can directly be used without modification. However, I'm not sure how relevant this use case is. If you're subclassing one of these classes, you presumably need to pass in additional data to __init__, and this data needs to be captured by serialized_data. In other words, you will need to override __reduce__ in the subclass. Can you give a concrete example of a subclass that you're trying to implement?
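The case raised here, a subclass whose __init__ takes extra data, can be sketched as follows. This is an illustrative toy under stated assumptions: `BaseRunner`, `BrokenPrefixRunner`, `PrefixRunner`, and `rebuild` are hypothetical names, and `rebuild` merely imitates how unpickling calls the factory returned by `__reduce__`.

```python
def rebuild(obj):
    """Recreate obj roughly as unpickling would."""
    factory, args = obj.__reduce__()[:2]
    return factory(*args)


class BaseRunner:
    """Hypothetical parent runner that already returns self.__class__."""

    def __init__(self, model_id):
        self._model_id = model_id

    def __reduce__(self):
        return self.__class__, (self._model_id,)


class BrokenPrefixRunner(BaseRunner):
    """Adds an __init__ argument but inherits the parent's __reduce__,
    so the extra argument is never captured."""

    def __init__(self, model_id, prefix):
        super().__init__(model_id)
        self._prefix = prefix


class PrefixRunner(BrokenPrefixRunner):
    """Overrides __reduce__ so the extra argument is serialized too."""

    def __reduce__(self):
        return self.__class__, (self._model_id, self._prefix)


try:
    rebuild(BrokenPrefixRunner("some-model", "Human: "))
    broken_error = None
except TypeError as exc:
    # The inherited __reduce__ supplies only model_id, so __init__ fails.
    broken_error = exc

fixed = rebuild(PrefixRunner("some-model", "Human: "))
print(type(fixed).__name__, repr(fixed._prefix))
```

So self.__class__ removes the need for an override only when the subclass keeps the parent's __init__ signature; once the signature grows, the override is still required.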

@franluca
Contributor Author

franluca commented Dec 14, 2023

Yes correct. Here's an example:

class ClaudeModelRunner(BedrockModelRunner):

    def predict(self, prompt: str) -> Tuple[Optional[str], Optional[float]]:
        """
        Overridden to globally follow the Claude Human: [...] Assistant: convention.
        todo is there a way to avoid doing this and just use the BedrockModelRunner from the library?
        """
        prompt = f"Human: {prompt}\n\n Assistant:"
        return super().predict(prompt)

In fact, an even more general implementation of __reduce__ could be the following:

    def __reduce__(self):
        """
        Custom serializer method used by Ray when it serializes instances of this
        class in eval_algorithms.util.generate_model_predict_response_for_dataset.
        """
        # serialized_data = (
        #     self._model_id,
        #     self._content_template,
        #     self._output,
        #     self._log_probability,
        #     self._content_type,
        #     self._accept_type,
        # )
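        # Copy the state dict first: it may be the instance's own __dict__,
        # and popping keys from that would strip attributes off the live object.
        serialized_data = dict(super().__reduce__()[2])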
        serialized_data.pop('_extractor', None)
        serialized_data.pop('_composer', None)
        serialized_data.pop('_bedrock_runtime_client', None)
        return self.__class__, tuple(serialized_data.values())

since you basically do not want to serialize these three objects, right?
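One possible variation on this idea, sketched under assumptions rather than taken from the library: list the constructor fields by name instead of relying on the order of the instance dictionary, and let __init__ recreate anything that should not be serialized. `Runner`, `ClaudeRunner`, `_INIT_FIELDS`, `_client`, and `rebuild` are hypothetical names, and `object()` stands in for a real client.

```python
def rebuild(obj):
    """Recreate obj roughly as unpickling would."""
    factory, args = obj.__reduce__()[:2]
    return factory(*args)


class Runner:
    # Attribute names, in the same order as the __init__ parameters.
    _INIT_FIELDS = ("_model_id", "_content_template")

    def __init__(self, model_id, content_template):
        self._model_id = model_id
        self._content_template = content_template
        # Stand-in for a client that should not be serialized;
        # it is recreated whenever __init__ runs.
        self._client = object()

    def __reduce__(self):
        # Only the named fields are serialized; the client is left out.
        args = tuple(getattr(self, name) for name in self._INIT_FIELDS)
        return self.__class__, args


class ClaudeRunner(Runner):
    """Subclass with the same __init__ signature: no override needed."""


original = ClaudeRunner("some-model", '{"prompt": $prompt}')
restored = rebuild(original)

print(type(restored).__name__)                # ClaudeRunner
print(restored._client is original._client)   # False: a fresh client
```

A subclass that adds constructor arguments could extend `_INIT_FIELDS` rather than rewrite `__reduce__`, and nothing is popped from the live instance's state.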

@danielezhu danielezhu merged commit f70243c into main Dec 14, 2023
@danielezhu danielezhu deleted the subclassable-runners branch December 14, 2023 18:29
3 participants