handle string size more consistently in mysql dialect #5

Merged: lieut-data merged 3 commits into mattermost:master from brunoenten:alternative on May 1, 2019


Conversation

@brunoenten

@brunoenten brunoenten commented Apr 24, 2019


Use longtext for maxsize=0, varchar for maxsize <= 512 and text if maxsize > 512.
No change to the postgres dialect.

regarding issue mattermost/mattermost#10328
alternative to #4

@jespino

jespino commented Apr 24, 2019


I like this one more. As you said, it is not fully consistent, but I think it is closer to what is expected.

@lieut-data lieut-data left a comment

Member

Thanks for taking a stab at this, @brunoenten!

Whatever change we end up making here will eventually make its way back into mattermost-server (when we point at the updated commits), and change the semantics of newly created tables. This will, in turn, require a migration to "fix" existing databases, otherwise a fully-migrated schema will deviate from a fresh schema -- something for which our CI builds will soon be failing. Any such migration will ultimately then make its way into the product, and impact live customer databases.

@cpoile raised some concerns about TEXT that I think we need to fully think through here before we can pivot any which way. But first, I wanted to understand your pivot from the suggestion of varchar(16384) towards using TEXT anyway. I've commented inline.

Comment thread dialect_mysql.go
-   if maxsize < 256 {
+   if maxsize == 0 {
+       // Closer match for unbounded text
+       return "longtext"

@lieut-data lieut-data Apr 24, 2019

Member

As per https://dev.mysql.com/doc/refman/8.0/en/string-type-overview.html, longtext is:

A TEXT column with a maximum length of 4,294,967,295 or 4GB (2^32 − 1) characters.

In the discussion on the ticket, you mentioned:

I want to follow @lieut-data suggestion but use "varchar(16384)" instead of "text" for maxsize = 0.

Can you elaborate on the thinking here? Isn't using longtext basically the same as text, which invokes the issues @cpoile mentioned on the ticket?


Yes, the change in mind here relates to the comments in PR #4 about what a user would more naturally expect: if I don't give the field any bound, I want an unbounded field (that is the postgres behavior). And since this is about consistency, I think it makes sense here.

Member

Discussed offline with @jespino, and we agreed that longtext makes sense as proposed, though we don't plan to actually use it in mattermost-server, preferring instead to always set an explicit length.

Comment thread dialect_mysql.go Outdated
+   if maxsize == 0 {
+       // Closer match for unbounded text
+       return "longtext"
+   } else if maxsize < 513 {

Member

I'm hesitant to change the boundary conditions here, given the migration considerations above.


Yes, I think if we keep the boundary at 256, we can avoid the need for a migration in certain cases.

Author

That makes sense. I've committed the change.

Comment thread dialect_mysql.go Outdated
    } else {
-       return "text"
+       // mysql will choose the right text variant according to the specified size
+       return fmt.Sprintf("text(%d)", maxsize)

Member

Also as per https://dev.mysql.com/doc/refman/8.0/en/string-type-overview.html:

An optional length M can be given for this type. If this is done, MySQL creates the column as the smallest TEXT type large enough to hold values M characters long.

So this still creates a TEXT column of some type, with the same concerns as above.
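To make the quoted rule concrete, here is a rough sketch of how a length M could map onto the TEXT family. The byte limits are the documented maximums for each type; the function name smallestTextType and the bytesPerChar parameter (for example 4 for utf8mb4) are assumptions for illustration, not MySQL or gorp APIs, and the exact server-side calculation may differ.

```go
package main

import "fmt"

// Maximum storage sizes, in bytes, of MySQL's TEXT family.
const (
	tinyTextMax   = 255        // 2^8 - 1
	textMax       = 65535      // 2^16 - 1
	mediumTextMax = 16777215   // 2^24 - 1
	longTextMax   = 4294967295 // 2^32 - 1
)

// smallestTextType is a hypothetical helper: it returns the smallest TEXT
// variant able to hold m characters, assuming each character may need up
// to bytesPerChar bytes. The boolean is false when no TEXT type is large enough.
func smallestTextType(m, bytesPerChar int64) (string, bool) {
	need := m * bytesPerChar
	switch {
	case need <= tinyTextMax:
		return "tinytext", true
	case need <= textMax:
		return "text", true
	case need <= mediumTextMax:
		return "mediumtext", true
	case need <= longTextMax:
		return "longtext", true
	default:
		return "", false
	}
}

func main() {
	for _, m := range []int64{64, 1024, 20000} {
		typ, _ := smallestTextType(m, 4) // assumes a 4-byte charset such as utf8mb4
		fmt.Printf("text(%d) -> %s\n", m, typ)
	}
}
```

Whichever variant is picked, the column is still a member of the TEXT family, which is the point being made in this thread.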


Yes, again, if you want consistency between postgres and mysql, that makes a lot of sense, because it is the postgres behavior.

Author

Yes and no, because text and varchar are the same in postgres.

@brunoenten

Author

Continuing the discussion from #4, I feel that there's no perfect solution because IMHO the very concept of building a data model from a bunch of structs/objects is flawed. There's just too much black box magic involved as it's akin to trying to use a relational database as an object storage.

That being said, I think the solution we found is a good compromise, taking into account the existing data model.

@jespino

jespino commented Apr 26, 2019


Yes, I agree with that: the current solution is good enough and does not generate too many changes in the data model. I would like to review which fields are affected by this change (basically any text field without SetMaxSize).

@brunoenten

brunoenten commented Apr 26, 2019

Author

Here's a CSV file with all the impacted columns. I'm looking in more detail into the text to text(n) change to establish exactly which ones would actually change.

string_mysql_migration-master-20190426.txt

@jespino

jespino commented Apr 26, 2019


Super interesting. If we set aside all the text to text(XXX) cases, only 3 cases remain; in fact, if we remove the (XXX) from text, that will be exactly the situation.

Then we will have only 3 changes: Teams.Type, which I would migrate to a VARCHAR(1) by changing its SetMaxSize in the store, and IncomingWebhooks.Username and IncomingWebhooks.IconURL, whose OutgoingWebhooks counterparts have their sizes set explicitly to 64 and 1024; I would make the same change here. Both changes can be done with a migration without much trouble, because Teams and IncomingWebhooks normally aren't crowded tables.

@brunoenten

Author

You're right, we can simply remove the (n) from text(n).

@lieut-data

Member

@brunoenten, thanks for digging into the migration details: that gives me the confidence we can proceed without too much impact. I agree with @jespino's thinking about the impact to these tables -- a migration won't be an issue.

@jespino, can you just confirm that the proposed changes will be backwards compatible -- that is, we are only relaxing constraints on these columns? (For context to others, we recently committed to /not/ making backwards incompatible schema changes across minor versions.)

Otherwise, I'm good with the proposed changes! @brunoenten, let me suggest you also run make test-db-migration once the updated gorp is in place and the migrations are run -- this will prove that the schemas are now convergent.

@jespino

jespino commented Apr 30, 2019


It is not exactly backward compatible. There are 3 affected fields, and these are the consequences:

  1. We can restrict Team.Type to 1, for example, and it should work for any existing data without altering the database; we would still be using a VARCHAR(255) for old databases and a VARCHAR(1) for new databases.
  2. We can restrict IncomingWebhook.Username to 64 bytes, which converts the VARCHAR(255) to a VARCHAR(64). It will work properly for any incoming webhook with a username under 64 characters, and maybe (I'm not 100% sure) for the rest of them. We don't need to execute a migration here, so new databases will have a VARCHAR(64) and old databases will have a VARCHAR(255).
  3. We can change the field IncomingWebhook.IconURL to 1024, which is the current size for OutgoingWebhook.IconURL; that would convert the VARCHAR(255) field into a TEXT. And there is the problem: I'm not sure if we need to migrate it to have it working properly with the ORM, and for sure, if you don't have a migration, it will restrict the IconURL to 255 characters in mysql.

@lieut-data

Member

Thanks for the detailed breakdown, @jespino. I think we have to avoid any backwards incompatible schema changes -- at least until v6.0.0. It's true that the application may not have depended on those column sizes, but I'm not sure it's worth the risk.

I also note that the make test-db-migration target is slated to become part of the CI, after which it will be impossible to merge changes that don't leave a "new" database in the same state as a "migrated" database (upgrading from 5.0.0). So however we proceed, we must arrive at the same definition for both -- we can't leave them diverging.
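As a rough illustration of the convergence requirement (not the actual make test-db-migration implementation, which this thread does not show), a check of this kind could compare the column types of a fresh schema against a migrated one. The schema type, the diffSchemas helper, and the sample data are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// schema maps "Table.Column" to its column type. Hypothetical representation.
type schema map[string]string

// diffSchemas reports every column whose type differs between a freshly
// created schema and a migrated one, or that exists in only one of them.
func diffSchemas(fresh, migrated schema) []string {
	var diffs []string
	for col, freshType := range fresh {
		migratedType, ok := migrated[col]
		switch {
		case !ok:
			diffs = append(diffs, fmt.Sprintf("%s: missing from migrated schema", col))
		case migratedType != freshType:
			diffs = append(diffs, fmt.Sprintf("%s: fresh=%s migrated=%s", col, freshType, migratedType))
		}
	}
	for col := range migrated {
		if _, ok := fresh[col]; !ok {
			diffs = append(diffs, fmt.Sprintf("%s: missing from fresh schema", col))
		}
	}
	sort.Strings(diffs) // deterministic output regardless of map iteration order
	return diffs
}

func main() {
	// Illustrative data only: a column resized in the store definition
	// without a matching migration.
	fresh := schema{"Teams.Type": "varchar(255)", "IncomingWebhooks.IconURL": "text"}
	migrated := schema{"Teams.Type": "varchar(255)", "IncomingWebhooks.IconURL": "varchar(255)"}
	for _, d := range diffSchemas(fresh, migrated) {
		fmt.Println(d)
	}
}
```

An empty result would mean the two schemas agree; any entry is the kind of divergence that would block a merge once such a check runs in CI.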

@jespino & @brunoenten, wondering if it would make sense to hop on a call to discuss this further?

As an aside, I fully agree with @brunoenten's comments:

the very concept of building a data model from a bunch of structs/objects is flawed

The medium-term plan remains to move away from gorp -- we're experimenting with some changes to this effect in a new project, and will hopefully be able to fold back some thinking to mattermost-server.

@jespino jespino left a comment


LGTM

@lieut-data lieut-data left a comment

Member

Thanks, @brunoenten! @jespino helped me understand the impact of the changes here, and how we'll effectively migrate the required columns going forward.

The key takeaway is that we /won't/ be worried about consistency between MySQL and PostgreSQL with respect to columns with a maxlength > 255 -- these will stay as TEXT and VARCHAR(maxsize) respectively. The more important change -- and the one this PR addresses -- is the handling of maxsize == 0 which will fix MySQL to actually be (pseudo-)unbounded, like PostgreSQL.
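The resulting string mapping can be sketched roughly as follows. This is a minimal illustration of the rule as summarized in this thread, not the code in dialect_mysql.go; the function names mysqlStringType and postgresStringType are assumptions, and negative sizes are not considered.

```go
package main

import "fmt"

// mysqlStringType sketches the MySQL rule agreed in this PR:
// unbounded strings become longtext, short ones varchar(n), the rest text.
func mysqlStringType(maxsize int) string {
	switch {
	case maxsize == 0:
		return "longtext" // closest match for an unbounded string
	case maxsize < 256:
		return fmt.Sprintf("varchar(%d)", maxsize)
	default:
		return "text" // boundary left at 256 to limit migration impact
	}
}

// postgresStringType sketches the unchanged PostgreSQL rule described above:
// varchar(n) for any explicit size, text when no size is given.
func postgresStringType(maxsize int) string {
	if maxsize > 0 {
		return fmt.Sprintf("varchar(%d)", maxsize)
	}
	return "text"
}

func main() {
	for _, n := range []int{0, 64, 255, 256, 1024} {
		fmt.Printf("maxsize=%-4d mysql=%-12s postgres=%s\n",
			n, mysqlStringType(n), postgresStringType(n))
	}
}
```

The two dialects still differ for sizes of 256 and above (text versus varchar(n)), which is the inconsistency the thread accepts in exchange for avoiding a larger migration.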

With respect to the actual migration, @jespino and I agreed that even though some of these column lengths are "weird" relative to the application, we'll preserve backwards compatibility at the schema layer by at most relaxing constraints:

| Column                   | Current (MySQL) | New (MySQL)   |
|--------------------------|-----------------|---------------|
| Team.Type                | VARCHAR(255)    | VARCHAR(255)  |
| IncomingWebhook.Username | VARCHAR(255)    | VARCHAR(255)  |
| IncomingWebhook.IconURL  | VARCHAR(255)    | VARCHAR(1024) |

This amounts to a few small changes to the table definition in the _store.go files, and a single migration to make IncomingWebhook.IconURL the more useful length of 1024 (matching its OutgoingWebhook equivalent).
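The "at most relaxing constraints" rule can be expressed as a simple check: a VARCHAR length change is acceptable only if the new length is not smaller than the old one. The sketch below is illustrative; the columnChange type and relaxesConstraint function are hypothetical and are not part of mattermost-server.

```go
package main

import "fmt"

// columnChange describes a VARCHAR length change for one column.
// Hypothetical type, for illustration only.
type columnChange struct {
	Column string
	OldLen int
	NewLen int
}

// relaxesConstraint reports whether every value that fit in the old column
// still fits in the new one, i.e. the length did not shrink.
func relaxesConstraint(c columnChange) bool {
	return c.NewLen >= c.OldLen
}

func main() {
	changes := []columnChange{
		{"Team.Type", 255, 255},
		{"IncomingWebhook.Username", 255, 255},
		{"IncomingWebhook.IconURL", 255, 1024},
		// The alternative discussed earlier and not adopted: shrinking Username to 64.
		{"IncomingWebhook.Username (not adopted)", 255, 64},
	}
	for _, c := range changes {
		fmt.Printf("%-40s %4d -> %4d compatible=%t\n",
			c.Column, c.OldLen, c.NewLen, relaxesConstraint(c))
	}
}
```

Under this check the three rows of the table pass, while shrinking a column to match the application's expectations would not.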

Are you interested in helping us make the last of these changes, while also updating the vendored version of mattermost/gorp in the mattermost-server repository?

@brunoenten

Author

You're welcome :)

I think I understand how the migration system from mattermost-server works. I'll give it a try and also update the vendored gorp.
