Way to communicate more information between libraries

I'm trying to think of a way for pydantic to communicate extra field information to hypothesis which:
1. is reusable by other libraries - e.g. doesn't use hypothesis types
2. doesn't require any understanding of pydantic internals - e.g. is not based on pydantic core schema
3. can be extended without further integration discussion - e.g. is a proper protocol, not a list of types

For reference [here](https://github.com/pydantic/pydantic/blob/1.10.X-fixes/pydantic/_hypothesis_plugin.py) is the hypothesis plugin from pydantic V1. The types we'd like to support with this system are as follows:
* EmailStr - a string that is a valid email address, could crudely be just a regex
* NameEmail - email and name in the `name <email>` format, can be a regex + email
* PyObject (now ImportString) - string that can represent anything, the hypothesis plugin used a random attribute of `math`
* Color - could be a regex
* Luhn-valid card number - currently generated by trial and error
* IPvAnyAddress - either IPv4 or IPv6
* JsonWrapper - a JSON string, possibly with a type hint

Everything else I think is covered by using `Annotated` and things already defined by `annotated-types`.

Note that in V2 some of the above are implemented as arguments to `Annotated` (albeit with an alias), while others are legitimate custom types.

One idea I thought of was to use JSON Schema - you provide a method or property on either the type or the argument to
`Annotated` which returns some JSON Schema. But looking at the above list, I don't think JSON Schema would help with many of them.

Therefore here's my proposal:

`annotated-types` defines a new property or method on types and arguments to `Annotated` which returns the following 
pieces of information (could be a tuple, a dict with specific keys, or a dataclass defined herein):
* `documentation_example` - a canonical example of the datatype as might be shown in documentation, e.g. `john@example.com`
* `random_example` - a random varying example of the datatype, e.g. `ad90cj-i3ljlkd@dk33w4poedd.co.uk`
* `type_code` - a string that libraries can use to identify the type (e.g. `email`), and thereby decide to do more powerful things - e.g. hypothesis has a better strategy for generating random email addresses than pydantic will. Would be `None` if no `type_code` exists for a field.

The idea is that annotated-types defines:
* the above data structure
* a list of agreed `type_code`s, starting with the ones from above

# Example usage

`EmailStr`: would emit `ExtraTypeInfo('email', 'john@example.com', 'ad90cj-i3ljlkd@dk33w4poedd.co.uk')`.

hypothesis would ignore the crude random example and use its own strategy for email addresses since it recognises the `email` type code.

---

`Color`: would emit `ExtraTypeInfo('color', '#ff0000', '#00ff00')`. If hypothesis doesn't recognize `color`, it could
fall back to using the random example generated by pydantic.
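The consumer-side logic in the two examples above could be sketched roughly like this. It deliberately avoids any real hypothesis API: `GENERATORS`, `_better_email` and `generate` are hypothetical stand-ins for whatever a consuming library would use internally, and `ExtraTypeInfo` is written as a `NamedTuple` here purely for brevity.

```python
import random
from typing import Callable, Dict, NamedTuple, Optional


class ExtraTypeInfo(NamedTuple):
    type_code: Optional[str]
    documentation_example: str
    random_example: str


def _better_email() -> str:
    # Stand-in for a consumer's own, more powerful generator.
    user = ''.join(random.choice('abcdef0123') for _ in range(8))
    return f'{user}@example.org'


# Type codes this (hypothetical) consumer knows how to handle itself.
GENERATORS: Dict[str, Callable[[], str]] = {'email': _better_email}


def generate(info: ExtraTypeInfo) -> str:
    # Prefer the consumer's own generator when the type code is recognised,
    # otherwise fall back to the example supplied by the type's author.
    gen = GENERATORS.get(info.type_code) if info.type_code is not None else None
    return gen() if gen is not None else info.random_example
```

With this, `email` is handled by the consumer's own generator, while `color` (not in `GENERATORS`) falls back to `'#00ff00'`.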

---

If a user wanted their own `UKPostCode` type (an alias of `Annotated[str, UKPostCodeMetadata]`), then `UKPostCodeMetadata` could emit `ExtraTypeInfo(None, 'W1A 1AA', 'sp119dg')`. The type code is `None` since no type code exists for UK postcodes, so hypothesis would just use the random example `'sp119dg'`.
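As a rough sketch of how a consumer might discover this information on a user-defined alias, assuming the hook is a classmethod called `__extra_type_info__` (a made-up name) so that it works whether the class or an instance is passed to `Annotated`. `find_extra_type_info` is likewise a hypothetical helper, and `ExtraTypeInfo` is a `NamedTuple` here only to keep the example short.

```python
from typing import Annotated, Any, NamedTuple, Optional, get_args, get_origin


class ExtraTypeInfo(NamedTuple):
    type_code: Optional[str]
    documentation_example: str
    random_example: str


class UKPostCodeMetadata:
    @classmethod
    def __extra_type_info__(cls) -> ExtraTypeInfo:
        # No agreed type code for UK postcodes, hence None.
        return ExtraTypeInfo(None, 'W1A 1AA', 'sp119dg')


UKPostCode = Annotated[str, UKPostCodeMetadata]


def find_extra_type_info(tp: Any) -> Optional[ExtraTypeInfo]:
    # Only Annotated[...] aliases carry metadata arguments.
    if get_origin(tp) is not Annotated:
        return None
    # get_args returns (base_type, *metadata); skip the base type.
    for meta in get_args(tp)[1:]:
        hook = getattr(meta, '__extra_type_info__', None)
        if callable(hook):
            return hook()
    return None
```

A consumer needs no knowledge of pydantic or of `UKPostCodeMetadata` itself, only of the hook name and the returned structure.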

---

In theory another tool could use this data (e.g. for generating documentation) with no knowledge of hypothesis or pydantic.
