handle string size more consistently in mysql dialect #5
I like this more. As you said, it is not fully consistent, but I think it is closer to what is expected.
lieut-data
left a comment
Thanks for taking a stab at this, @brunoenten!
Whatever change we end up making here will eventually make its way back into mattermost-server (when we point at the updated commits), and change the semantics of newly created tables. This will, in turn, require a migration to "fix" existing databases, otherwise a fully-migrated schema will deviate from a fresh schema -- something for which our CI builds will soon be failing. Any such migration will ultimately then make its way into the product, and impact live customer databases.
@cpoile raised some concerns about TEXT that I think we need to fully think through here before we can pivot any which way. But first, I wanted to understand your pivot from the suggestion of varchar(16384) towards using TEXT anyway. I've commented inline.
if maxsize < 256 {
if maxsize == 0 {
	// Closer match for unbounded text
	return "longtext"
As per https://dev.mysql.com/doc/refman/8.0/en/string-type-overview.html, longtext is:
A TEXT column with a maximum length of 4,294,967,295 or 4GB (2^32 − 1) characters.
In the discussion on the ticket, you mentioned:
I want to follow @lieut-data suggestion but use "varchar(16384)" instead of "text" for maxsize = 0.
Can you elaborate on the thinking here? Isn't using longtext basically the same as text, which invokes the issues @cpoile mentioned on the ticket?
Yes, the change in mind here relates to the comments in PR #4: it is more "expected" by a user that if I don't give any bound for the field, I get an unbounded field (that is the behavior for postgres). Because this is about consistency, I think it makes sense here.
Discussed offline with @jespino, but we agreed longtext makes sense as proposed, though we'll never plan to actually use it in mattermost-server, preferring instead to always set an explicit length.
if maxsize == 0 {
	// Closer match for unbounded text
	return "longtext"
} else if maxsize < 513 {
I'm hesitant to change the boundary conditions here, given the migration considerations above.
Yes, I think if we keep the boundary at 256, we can avoid a migration in certain cases.
That makes sense. I've committed the change.
} else {
	return "text"
	// mysql will choose the right text variant according to the specified size
	return fmt.Sprintf("text(%d)", maxsize)
Also as per https://dev.mysql.com/doc/refman/8.0/en/string-type-overview.html:
An optional length M can be given for this type. If this is done, MySQL creates the column as the smallest TEXT type large enough to hold values M characters long.
So this still creates a TEXT column of some type, with the same concerns as above.
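To make the point above concrete, here is a rough Go sketch of which TEXT variant one would expect MySQL to pick for text(M). The byte limits are the documented maximums for each TEXT type; the function name `textVariant` is hypothetical, and multiplying M by the character set's maximum bytes per character (4 for utf8mb4) is an assumption of this sketch rather than something stated in this thread.

```go
package main

import "fmt"

// textVariant is a hypothetical helper: it returns the smallest TEXT type
// whose byte limit can hold m characters. Scaling by maxBytesPerChar
// (e.g. 4 for utf8mb4) is an assumption of this sketch.
func textVariant(m, maxBytesPerChar uint64) string {
	needed := m * maxBytesPerChar
	switch {
	case needed <= 255: // TINYTEXT: 2^8 - 1 bytes
		return "tinytext"
	case needed <= 65535: // TEXT: 2^16 - 1 bytes
		return "text"
	case needed <= 16777215: // MEDIUMTEXT: 2^24 - 1 bytes
		return "mediumtext"
	default: // LONGTEXT: 2^32 - 1 bytes
		return "longtext"
	}
}

func main() {
	for _, m := range []uint64{63, 512, 16384} {
		fmt.Printf("text(%d) -> %s\n", m, textVariant(m, 4))
	}
}
```

Whatever M is, the result is always one of the TEXT variants and never a varchar, which is the concern raised here.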
Yes, again, if you want consistency between postgres and mysql, that makes a lot of sense, because it is the postgres behavior.
Yes and no, because text/varchar is the same for postgres.
Continuing the discussion from #4, I feel that there's no perfect solution because IMHO the very concept of building a data model from a bunch of structs/objects is flawed. There's just too much black box magic involved as it's akin to trying to use a relational database as an object storage. That being said, I think the solution we found is a good compromise, taking into account the existing data model.
Yes, I agree with that. The current solution is good enough and does not generate too many changes in the data model. I would like to review which fields are affected by this change (basically any text field without
Here's a CSV file with all the impacted columns. I'm looking in more detail into the text to text(n) change to establish exactly which ones would actually change.
Super interesting. If we remove all the text to text(XXX) cases, we only have 3 cases. Actually, if we remove the
Then we will only have 3 changes: Teams.Type, which I would migrate to a VARCHAR(1), changing the type
You're right, we can simply remove the (n) from text(n).
@brunoenten, thanks for digging into the migration details: that gives me the confidence we can proceed without too much impact. I agree with @jespino's thinking about the impact to these tables -- a migration won't be an issue. @jespino, can you just confirm that the proposed changes will be backwards compatible -- that is, we are only relaxing constraints on these columns? (For context to others, we recently committed to /not/ making backwards incompatible schema changes across minor versions.) Otherwise, I'm good with the proposed changes! @brunoenten, let me suggest you also run the
It is not exactly backward compatible. Here are the 3 affected fields, and these are the consequences:
Thanks for the detailed breakdown, @jespino. I think we have to avoid any backwards incompatible schema changes -- at least until v6.0.0. It's true that the application may not have depended on those column sizes, but I'm not sure it's worth the risk. I also note that the
@jespino & @brunoenten, wondering if it would make sense to hop on a call to discuss this further?
As an aside, I fully agree with @brunoenten's comments:
The medium-term plan remains to move away from gorp -- we're experimenting with some changes to this effect in a new project, and will hopefully be able to fold back some thinking to mattermost-server.
lieut-data
left a comment
Thanks, @brunoenten! @jespino helped me understand the impact of the changes here, and how we'll effectively migrate the required columns going forward.
The key takeaway is that we /won't/ be worried about consistency between MySQL and PostgreSQL with respect to columns with a maxlength > 255 -- these will stay as TEXT and VARCHAR(maxsize) respectively. The more important change -- and the one this PR addresses -- is the handling of maxsize == 0 which will fix MySQL to actually be (pseudo-)unbounded, like PostgreSQL.
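For readers skimming the thread, the agreed mapping can be sketched in Go roughly as follows. The function names `mysqlStringType` and `postgresStringType` are hypothetical and are not the actual gorp dialect methods; the sketch also assumes maxsize is never negative.

```go
package main

import "fmt"

// mysqlStringType is a hypothetical helper (not the real gorp method) that
// mirrors the outcome discussed in this PR for the MySQL dialect.
// Assumes maxsize >= 0.
func mysqlStringType(maxsize int) string {
	switch {
	case maxsize == 0:
		// No explicit bound: closest match for unbounded text.
		return "longtext"
	case maxsize < 256:
		// Boundary kept at 256 to avoid migrating existing columns.
		return fmt.Sprintf("varchar(%d)", maxsize)
	default:
		// maxsize > 255 stays as plain TEXT on MySQL.
		return "text"
	}
}

// postgresStringType is a hypothetical helper sketching the PostgreSQL side
// as described in this thread: bounded varchar when a size is given,
// unbounded text otherwise.
func postgresStringType(maxsize int) string {
	if maxsize > 0 {
		return fmt.Sprintf("varchar(%d)", maxsize)
	}
	return "text"
}

func main() {
	for _, n := range []int{0, 26, 255, 256, 1024} {
		fmt.Printf("maxsize=%d mysql=%s postgres=%s\n",
			n, mysqlStringType(n), postgresStringType(n))
	}
}
```

The two dialects agree for maxsize == 0 (both effectively unbounded) and for sizes up to 255; above that they deliberately differ.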
With respect to the actual migration, @jespino and I agreed that even though some of these column lengths are "weird" relative to the application, we'll preserve backwards compatibility at the schema layer by at most relaxing constraints:
| Column | Current (MySQL) | New (MySQL) |
|---|---|---|
| Team.Type | VARCHAR(255) | VARCHAR(255) |
| IncomingWebhook.Username | VARCHAR(255) | VARCHAR(255) |
| IncomingWebhook.IconURL | VARCHAR(255) | VARCHAR(1024) |
This amounts to a few small changes to the table definition in the _store.go files, and a single migration to make IncomingWebhook.IconURL the more useful length of 1024 (matching its OutgoingWebhook equivalent).
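As a small illustration of the "at most relaxing constraints" rule, the Go sketch below checks that each column in the table above keeps or grows its VARCHAR length. The `columnChange` type and `onlyRelaxes` function are hypothetical names for this example, not part of mattermost-server or gorp.

```go
package main

import "fmt"

// columnChange is a hypothetical record of a VARCHAR length change.
type columnChange struct {
	Name   string
	OldLen int
	NewLen int
}

// onlyRelaxes reports whether the change keeps or widens the column,
// which is the sense of "backwards compatible" used in this thread.
func onlyRelaxes(c columnChange) bool {
	return c.NewLen >= c.OldLen
}

func main() {
	// Values taken from the table above.
	changes := []columnChange{
		{"Team.Type", 255, 255},
		{"IncomingWebhook.Username", 255, 255},
		{"IncomingWebhook.IconURL", 255, 1024},
	}
	for _, c := range changes {
		fmt.Printf("%s: %d -> %d, relaxing only: %t\n",
			c.Name, c.OldLen, c.NewLen, onlyRelaxes(c))
	}
}
```

A change such as VARCHAR(255) to VARCHAR(1) would fail this check, which is why the narrower lengths considered earlier in the discussion were dropped.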
Are you interested in helping us make the last of these changes, while also updating the vendored version of mattermost/gorp in the mattermost-server repository?
You're welcome :) I think I understand how the migration system from mattermost-server works. I'll give it a try and also update the vendored gorp.
Use longtext for maxsize=0, varchar for maxsize <= 512 and text if maxsize > 512.
No change to the postgres dialect.
regarding issue mattermost/mattermost#10328
alternative to #4