Multiple table_disk tables with s3_plain_rewritable disk with shared endpoint corrupt each other's data #98512

@rienath

Problem

table_disk assumes that the entire disk is dedicated to a single table. When multiple tables use table_disk on the same underlying disk, they share one location and corrupt each other's data.
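For contrast, here is a minimal sketch of a layout in which no two tables share a disk root, assuming a dedicated endpoint prefix per table is enough to keep them apart. This has not been verified against the bug. The table name `t1_isolated`, the cache path, and the endpoint URL are hypothetical, and credentials are omitted.

```sql
-- Hypothetical layout: the table owns its own endpoint prefix,
-- so table_disk's "whole disk belongs to me" assumption holds.
CREATE TABLE t1_isolated (a Int64, b Int64)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS table_disk = 1,
    disk = disk(
        type = cache,
        path = '/tmp/filesystem_caches/isolated_1',  -- hypothetical cache directory
        max_size = '4G',
        disk = disk(
            type = s3_plain_rewritable,
            -- hypothetical per-table prefix, not shared with any other table
            endpoint = 'http://localhost:11111/plain-rewritable/t1_isolated/'));
```

A second table would get its own inner `disk(...)` with a different endpoint prefix, rather than pointing at a disk that another table already uses.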

Repro

Add to server config:

<s3_plain_rewritable>
    <type>s3_plain_rewritable</type>
    <endpoint>http://localhost:11111/plain-rewritable/</endpoint>
</s3_plain_rewritable>

Run:

CREATE TABLE t1 (a Int64, b Int64) ENGINE = MergeTree SETTINGS table_disk = 1, disk = disk(type = cache, path = '/tmp/filesystem_caches/stateful_1', max_size = '4G', disk = 's3_plain_rewritable');

CREATE TABLE t2 (a Int64, b Int64) ENGINE = MergeTree SETTINGS table_disk = 1, disk = disk(type = cache, path = '/tmp/filesystem_caches/stateful_2', max_size = '4G', disk = 's3_plain_rewritable');

INSERT INTO t1 SELECT * FROM generateRandom() LIMIT 100000;

INSERT INTO t2 SELECT * FROM generateRandom() LIMIT 200000;

SELECT * FROM t1;


DB::Exception: Cannot read all data in MergeTreeReaderCompact. Rows read: 8192. Rows expected: 10000: (while reading column b): (while reading from part all_1_1_0/ in table default.t1 (aadd0f6a-b23b-438f-9cdd-d0a77235fc5f) located on disk s3_plain_rewritable of type s3, from mark 0 with max_rows_to_read = 10000, offset = 0): While reading part all_1_1_0: While executing MergeTreeSelect(pool: PrefetchedReadPool, algorithm: Thread). (CANNOT_READ_ALL_DATA)

SELECT * FROM t2;

   ┌────────────────────a─┬────────────────────b─┐
1. │  2141105054873339885 │  2228051567476793245 │
2. │ -4163394820885054867 │  8737670515907451729 │
3. │  3768795027477847959 │  2947923418608131607 │
...
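To see where each table's parts are located after the inserts, a query along these lines may help. It is a diagnostic sketch, not part of the original report, and it relies only on the standard `system.parts` columns. What it returns for `table_disk` tables is not something the report states.

```sql
-- List the active parts of both tables, with the disk and path each part resolves to.
SELECT table, name, disk_name, path, rows
FROM system.parts
WHERE database = currentDatabase()
  AND table IN ('t1', 't2')
  AND active
ORDER BY table, name;
```

If both tables report the same `disk_name` and identical or overlapping `path` values, that would be consistent with the two tables sharing one location.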

Labels: bug (Confirmed user-visible misbehaviour in official release), comp-s3 (DiskS3, Read/WriteBufferFromS3 and s3 table functions)
