Description
Steps to reproduce
- Run an ownCloud server for a few years, updating it from time to time, with file versions and the trashbin in use
- Convert the database to the utf8mb4 charset, then keep running for some time without paying attention
- Observe a long run time and heavy CPU load during `occ system:cron`
Expected behaviour
`occ system:cron` should run lightning fast
Actual behaviour
A long-running operation, with heavy CPU load from the MariaDB processes.
After following the guide ( https://doc.owncloud.com/server/next/admin_manual/configuration/database/linux_database_configuration.html ), the DB tables end up with the `utf8mb4` charset and the `utf8mb4_bin` collation.
But running `occ system:cron` took a very long time...
> SHOW FULL PROCESSLIST;
| Id | User | Host | db | Command | Time | State | Info | Progress |
...
|| 105 | owncloud | 172.23.0.1:59388 | owncloud | Query | 17 | Sending data | SELECT `fileid`, `storage`, `path`, `parent`, `name`,
`mimetype`, `mimepart`, `size`, `mtime`, `encrypted`,
`etag`, `permissions`, `checksum`
FROM `oc_filecache`
WHERE `storage` = '3' AND `name` COLLATE utf8mb4_general_ci LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685'
As one can see, the query has already been running for 17 seconds, and it can take longer, around 40 seconds. It also forces the collation to `utf8mb4_general_ci`.
The table has around 500k records. I tried adding indexes on the `name` and `path` columns to speed things up. The resulting table structure is:
> show create table oc_filecache;
| Table | Create Table | oc_filecache | CREATE TABLE `oc_filecache` (
`fileid` bigint(20) NOT NULL AUTO_INCREMENT,
`storage` int(11) NOT NULL DEFAULT 0,
`path` varchar(4000) COLLATE utf8mb4_bin DEFAULT NULL,
`path_hash` varchar(32) COLLATE utf8mb4_bin NOT NULL DEFAULT '',
`parent` bigint(20) NOT NULL DEFAULT 0,
`name` varchar(250) COLLATE utf8mb4_bin DEFAULT NULL,
`mimetype` int(11) NOT NULL DEFAULT 0,
`mimepart` int(11) NOT NULL DEFAULT 0,
`size` bigint(20) NOT NULL DEFAULT 0,
`mtime` bigint(20) NOT NULL DEFAULT 0,
`storage_mtime` bigint(20) NOT NULL DEFAULT 0,
`encrypted` int(11) NOT NULL DEFAULT 0,
`unencrypted_size` bigint(20) NOT NULL DEFAULT 0,
`etag` varchar(40) COLLATE utf8mb4_bin DEFAULT NULL,
`permissions` int(11) DEFAULT 0,
`checksum` varchar(255) COLLATE utf8mb4_bin DEFAULT NULL,
PRIMARY KEY (`fileid`),
UNIQUE KEY `fs_storage_path_hash` (`storage`,`path_hash`),
KEY `fs_parent_name_hash` (`parent`,`name`),
KEY `fs_storage_mimetype` (`storage`,`mimetype`),
KEY `fs_storage_mimepart` (`storage`,`mimepart`),
KEY `fs_storage_size` (`storage`,`size`,`fileid`),
KEY `fs_parent_storage_size` (`parent`,`storage`,`size`),
KEY `path_index` (`path`(512)),
KEY `path_hash_index` (`path`(750)) USING HASH,
KEY `name_index` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=2413871 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin ROW_FORMAT=COMPRESSED |
Then I tried fiddling with the query itself, because requesting a collation different from the column's own looked odd. The original query, unchanged:
SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, `size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` FROM `oc_filecache` WHERE `storage` = '3' AND `name` COLLATE utf8mb4_general_ci LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685'
poor results - high time and database load
Set the collation to match the column:
SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, `size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` FROM `oc_filecache` WHERE `storage` = '3' AND `name` COLLATE utf8mb4_bin LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685'
same poor results - high time and database load
No collation forcing:
SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, `size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` FROM `oc_filecache` WHERE `storage` = '3' AND `name` LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685'
swift response with little load
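To repeat this comparison, the three variants can be generated from the captured query text. A minimal Python sketch: the function name `collation_variants` is hypothetical, and the script only builds the SQL strings (prefixed with `EXPLAIN` for printing); it does not connect to a database.

```python
import re

# Query text as captured from SHOW FULL PROCESSLIST in this report.
ORIGINAL = (
    "SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, "
    "`size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` "
    "FROM `oc_filecache` "
    "WHERE `storage` = '3' AND `name` COLLATE utf8mb4_general_ci "
    "LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685'"
)

# Matches the first " COLLATE <name>" clause, including its leading whitespace.
_COLLATE = re.compile(r"\s+COLLATE\s+\w+", re.IGNORECASE)


def collation_variants(query):
    """Return the query with general_ci forced, bin forced, and no COLLATE at all."""
    return {
        "utf8mb4_general_ci": _COLLATE.sub(" COLLATE utf8mb4_general_ci", query, count=1),
        "utf8mb4_bin": _COLLATE.sub(" COLLATE utf8mb4_bin", query, count=1),
        "none": _COLLATE.sub("", query, count=1),
    }


if __name__ == "__main__":
    for label, sql in collation_variants(ORIGINAL).items():
        # Paste each statement into the MariaDB client to compare the plans.
        print(f"-- collation: {label}")
        print(f"EXPLAIN {sql};")
```

Each printed statement can be pasted into the MariaDB client as-is.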
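Running EXPLAIN on the three queries shows the difference: only the last one uses `name_index`. The first two fall back to the `storage` prefix of `fs_storage_path_hash` and are estimated to examine about 316k rows: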
MariaDB [owncloud]> explain SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, `size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` FROM `oc_filecache` WHERE `storage` = '3' AND `name` COLLATE utf8mb4_general_ci LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685';
+------+-------------+--------------+------+------------------------------------------------------------------------------+----------------------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------+------+------------------------------------------------------------------------------+----------------------+---------+-------+--------+-------------+
| 1 | SIMPLE | oc_filecache | ref | fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size | fs_storage_path_hash | 4 | const | 316334 | Using where |
+------+-------------+--------------+------+------------------------------------------------------------------------------+----------------------+---------+-------+--------+-------------+
1 row in set (0.007 sec)
MariaDB [owncloud]> explain SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, `size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` FROM `oc_filecache` WHERE `storage` = '3' AND `name` COLLATE utf8mb4_bin LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685';
+------+-------------+--------------+------+------------------------------------------------------------------------------+----------------------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------+------+------------------------------------------------------------------------------+----------------------+---------+-------+--------+-------------+
| 1 | SIMPLE | oc_filecache | ref | fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size | fs_storage_path_hash | 4 | const | 316334 | Using where |
+------+-------------+--------------+------+------------------------------------------------------------------------------+----------------------+---------+-------+--------+-------------+
1 row in set (0.007 sec)
MariaDB [owncloud]> explain SELECT `fileid`, `storage`, `path`, `parent`, `name`, `mimetype`, `mimepart`, `size`, `mtime`, `encrypted`, `etag`, `permissions`, `checksum` FROM `oc_filecache` WHERE `storage` = '3' AND `name` LIKE 'MathNet.Numerics.5.0.0.v%.d1717775685';
+------+-------------+--------------+-------+-----------------------------------------------------------------------------------------+------------+---------+------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------+-------+-----------------------------------------------------------------------------------------+------------+---------+------+------+------------------------------------+
| 1 | SIMPLE | oc_filecache | range | fs_storage_path_hash,fs_storage_mimetype,fs_storage_mimepart,fs_storage_size,name_index | name_index | 1003 | NULL | 1 | Using index condition; Using where |
+------+-------------+--------------+-------+-----------------------------------------------------------------------------------------+------------+---------+------+------+------------------------------------+
1 row in set (0.011 sec)
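The `key_len` values in these plans can be cross-checked by hand. A minimal Python sketch, assuming the usual MySQL/MariaDB accounting for `key_len` (declared length times the maximum bytes per character, plus 2 length bytes for a VARCHAR, plus 1 byte when the column is nullable); the helper name is made up for illustration.

```python
def varchar_key_len(max_chars, bytes_per_char, nullable):
    """Estimate EXPLAIN key_len for a fully indexed VARCHAR column."""
    length_prefix = 2                 # bytes used to store the value length
    null_flag = 1 if nullable else 0  # extra byte for columns that allow NULL
    return max_chars * bytes_per_char + length_prefix + null_flag


# `storage` is int(11) NOT NULL: a 4-byte integer with no NULL flag.
INT_KEY_LEN = 4

# `name` is varchar(250), utf8mb4 (up to 4 bytes per character), DEFAULT NULL.
NAME_KEY_LEN = varchar_key_len(250, 4, nullable=True)

if __name__ == "__main__":
    print("storage prefix of fs_storage_path_hash:", INT_KEY_LEN)
    print("name_index:", NAME_KEY_LEN)
    # Share of the table the first two plans expect to read, taking the
    # "around 500k records" figure from this report as an approximation.
    print("estimated share of rows examined:", 316334 / 500_000)
```

A `key_len` of 4 therefore means only the `storage` column of `fs_storage_path_hash` is used for the lookup, while 1003 corresponds to the whole `name` column in `name_index`.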
I downloaded the current source package ( https://download.owncloud.com/server/stable/owncloud-complete-20250311.zip ) and searched its internals for collation handling.
The following files use the `utf8mb4_bin` collation:
\lib\private\DB\ConnectionFactory.php
\lib\private\Repair\Collation.php
\lib\private\Setup\MySQL.php
... while these files use the `utf8mb4_general_ci` collation:
\lib\private\DB\AdapterMySQL.php
\lib\private\DB\QueryBuilder\ExpressionBuilder\MySqlExpressionBuilder.php
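The search itself is easy to repeat on an unpacked source tree. A minimal Python sketch, assuming the package has been extracted to a local directory; the function name `find_collations` and the example directory name are placeholders, not something taken from ownCloud.

```python
import os

COLLATIONS = ("utf8mb4_bin", "utf8mb4_general_ci")


def find_collations(root):
    """Map each collation name to the sorted .php files under root that mention it."""
    hits = {name: [] for name in COLLATIONS}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".php"):
                continue
            path = os.path.join(dirpath, filename)
            # Undecodable bytes are replaced so one odd file does not stop the scan.
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read()
            for name in COLLATIONS:
                if name in text:
                    hits[name].append(os.path.relpath(path, root))
    for name in hits:
        hits[name].sort()
    return hits


if __name__ == "__main__":
    # "owncloud" is a placeholder for wherever the zip was extracted.
    for name, files in find_collations("owncloud").items():
        print(name)
        for path in files:
            print("   ", path)
```

Note that this is a plain substring match, so a file that mentions a collation only in a comment is listed as well.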
So...
- the code uses a mix of collations, which I don't understand...
- according to the experiment above, MariaDB seems to cope badly with any collation specified in the query (in the LIKE clause)
- as a result, the engine seems to ignore the existing index
- whether the requested collation matches the column collation makes no difference
- I don't know whether the current collations in my DB are correct, but they seem to match the guide for converting to 4-byte Unicode
- I didn't find any config option that changes this behavior
- I didn't find any existing reports of this issue
- Upgrading the app (which is planned anyway) is not expected to bring a solution, as the same mix of collations still appears in the current source package.
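A possible reading of why the plain LIKE is so cheap here: the pattern starts with literal text, and a B-tree index such as `name_index` can be range-scanned over the literal prefix that precedes the first wildcard. A small Python sketch of that prefix extraction, assuming the default backslash escape character; the function is illustrative only and is not part of ownCloud or MariaDB.

```python
def like_literal_prefix(pattern, escape="\\"):
    """Return the literal text before the first unescaped % or _ in a LIKE pattern."""
    prefix = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character is taken literally, wildcard or not.
            prefix.append(pattern[i + 1])
            i += 2
            continue
        if ch in "%_":
            break  # first real wildcard: the usable prefix ends here
        prefix.append(ch)
        i += 1
    return "".join(prefix)


if __name__ == "__main__":
    print(like_literal_prefix("MathNet.Numerics.5.0.0.v%.d1717775685"))
```

A pattern that begins with `%` yields an empty prefix, in which case no range scan on the index would be expected either way.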
Server configuration
Operating system:
Linux
Web server:
Apache (docker image, 10.13.4); the behavior seems to be the same in the current package ( https://download.owncloud.com/server/stable/owncloud-complete-20250311.zip )
Database:
MariaDB (10.3.10-MariaDB-log)
PHP version:
7.4.3 (docker image, 10.13.4)
ownCloud version: (see ownCloud admin page)
10.13.4 (docker image, 10.13.4)
Updated from an older ownCloud or fresh install:
updated
Where did you install ownCloud from:
docker
Signing status (ownCloud 9.0 and above):
No errors have been found.
Are you using external storage, if yes which one: local/smb/sftp/...
no
Are you using encryption: yes/no
no
Are you using an external user-backend, if yes which one: LDAP/ActiveDirectory/Webdav/...
no