Hello.
I need to connect an empty cluster in a second DC (DC2) to an existing cluster (DC1) that already has data.
DC1 cluster info
# /usr/local/bin/leofs-adm status
[System Confiuration]
-----------------------------------+----------
Item | Value
-----------------------------------+----------
Basic/Consistency level
-----------------------------------+----------
system version | 1.4.3
cluster Id | leofs_1
DC Id | dc_1
Total replicas | 1
number of successes of R | 1
number of successes of W | 1
number of successes of D | 1
number of rack-awareness replicas | 0
ring size | 2^128
-----------------------------------+----------
Multi DC replication settings
-----------------------------------+----------
[mdcr] max number of joinable DCs | 2
[mdcr] total replicas per a DC | 1
[mdcr] number of successes of R | 1
[mdcr] number of successes of W | 1
[mdcr] number of successes of D | 1
-----------------------------------+----------
Manager RING hash
-----------------------------------+----------
current ring-hash | 80897912
previous ring-hash | 80897912
-----------------------------------+----------
[State of Node(s)]
-------+------------------------------+--------------+---------+----------------+----------------+----------------------------
type | node | state | rack id | current ring | prev ring | updated at
-------+------------------------------+--------------+---------+----------------+----------------+----------------------------
S | [email protected] | running | | 80897912 | 80897912 | 2022-01-10 14:39:27 +0200
G | [email protected] | running | | 80897912 | 80897912 | 2022-01-10 11:34:47 +0200
-------+------------------------------+--------------+---------+----------------+----------------+----------------------------
# /usr/local/bin/leofs-adm du [email protected]
active number of objects: 9786
total number of objects: 9802
active size of objects: 1222737081
total size of objects: 1262702978
ratio of active size: 96.83%
last compaction start: 2022-01-10 16:07:23 +0200
last compaction end: 2022-01-10 16:07:29 +0200
DC2 cluster info
# /usr/local/bin/leofs-adm status
[System Confiuration]
-----------------------------------+----------
Item | Value
-----------------------------------+----------
Basic/Consistency level
-----------------------------------+----------
system version | 1.4.3
cluster Id | leofs_2
DC Id | dc_2
Total replicas | 2
number of successes of R | 1
number of successes of W | 1
number of successes of D | 1
number of rack-awareness replicas | 2
ring size | 2^128
-----------------------------------+----------
Multi DC replication settings
-----------------------------------+----------
[mdcr] max number of joinable DCs | 2
[mdcr] total replicas per a DC | 1
[mdcr] number of successes of R | 1
[mdcr] number of successes of W | 1
[mdcr] number of successes of D | 1
-----------------------------------+----------
Manager RING hash
-----------------------------------+----------
current ring-hash | 84eb107d
previous ring-hash | 84eb107d
-----------------------------------+----------
[State of Node(s)]
-------+------------------------------+--------------+-----------+----------------+----------------+----------------------------
type | node | state | rack id | current ring | prev ring | updated at
-------+------------------------------+--------------+-----------+----------------+----------------+----------------------------
S | [email protected] | running | R7 | 84eb107d | 84eb107d | 2022-01-10 15:53:13 +0200
S | [email protected] | running | R8 | 84eb107d | 84eb107d | 2022-01-10 15:54:15 +0200
G | [email protected] | running | | 84eb107d | 84eb107d | 2022-01-10 12:18:13 +0200
G | [email protected] | running | | 84eb107d | 84eb107d | 2022-01-10 12:18:20 +0200
-------+------------------------------+--------------+-----------+----------------+----------------+----------------------------
join-cluster
# /usr/local/bin/leofs-adm join-cluster [email protected]:13075 [email protected]:13076
OK
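If it helps, I can also check the inter-cluster state from both manager masters; I am assuming cluster-status is the right command for that:

# run on the DC1 manager master (and likewise on DC2); assumption: this should list the joined remote cluster
/usr/local/bin/leofs-adm cluster-status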
/usr/local/leofs/1.4.3/leo_storage/log/app/crash.log
After running the join-cluster command, errors like the following appear in the crash.log on [email protected]:
{module,"leo_backend_db_eleveldb"},{function,"prefix_search/3"},{line,227},{body,{timeout,{gen_server,call,[leo_object_storage_read_0_1,{head,{29834374738833832619322004778813394310,<<"b2b-cache/cities/04/noparams.xml">>},7007419},30000]}}}
2022-01-10 16:37:12 =ERROR REPORT====
{module,"leo_backend_db_eleveldb"},{function,"prefix_search/3"},{line,227},{body,{timeout,{gen_server,call,[leo_object_storage_read_1_2,{head,{49036670905450747481373050418517138571,<<"b2b-cache/cities/05/noparams.xml">>},7037644},30000]}}}
2022-01-10 16:37:12 =ERROR REPORT====
{module,"leo_backend_db_eleveldb"},{function,"prefix_search/3"},{line,227},{body,{timeout,{gen_server,call,[leo_object_storage_read_0_1,{head,{29834374738833832619322004778813394310,<<"b2b-cache/cities/04/noparams.xml">>},7037659},30000]}}}
2022-01-10 16:37:42 =ERROR REPORT====
{module,"leo_backend_db_eleveldb"},{function,"prefix_search/3"},{line,227},{body,{timeout,{gen_server,call,[leo_object_storage_read_1_2,{head,{49036670905450747481373050418517138571,<<"b2b-cache/cities/05/noparams.xml">>},7067928},30000]}}}
2022-01-10 16:37:42 =ERROR REPORT====
{module,"leo_backend_db_eleveldb"},{function,"prefix_search/3"},{line,227},{body,{timeout,{gen_server,call,[leo_object_storage_read_2_1,{head,{200780644758158633844541877214146315548,<<"b2b-cache/cities/07/noparams.json">>},7067944},30000]}}}
2022-01-10 16:37:42 =ERROR REPORT====
{module,"leo_backend_db_eleveldb"},{function,"prefix_search/3"},{line,227},{body,{timeout,{gen_server,call,[leo_object_storage_read_0_1,{head,{29834374738833832619322004778813394310,<<"b2b-cache/cities/04/noparams.xml">>},7067952},30000]}}}
At the same time, the DC1 cluster becomes very slow:
# s3cmd --config=/opt/s3cmd/b2b.cfg ls s3://b2b-cache/
WARNING: Retrying failed request: /?delimiter=%2F (500 (InternalError): We encountered an internal error. Please try again.)
WARNING: Waiting 3 sec...
WARNING: Retrying failed request: /?delimiter=%2F (500 (InternalError): We encountered an internal error. Please try again.)
WARNING: Waiting 6 sec...
…
…
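The keys in the timeout reports above are ordinary objects in the b2b-cache bucket. If it is useful, I can check individual keys; a sketch, assuming whereis accepts these keys as-is:

# replica state of one of the keys seen in the crash.log timeouts above
/usr/local/bin/leofs-adm whereis b2b-cache/cities/04/noparams.xml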
mq-stats
# /usr/local/bin/leofs-adm mq-stats [email protected]
id | state | number of msgs | batch of msgs | interval | description
--------------------------------+-------------------+----------------|----------------|----------------|-------------------------------------------------------------------------
leo_async_deletion_queue | idling | 0 | 1600 | 500 | requests of removing objects asynchronously
leo_comp_meta_with_dc_queue | idling | 0 | 1600 | 500 | requests of comparing metadata w/remote-node
leo_delete_dir_queue_1 | idling | 0 | 1600 | 500 | requests of removing buckets #1
leo_delete_dir_queue_2 | idling | 0 | 1600 | 500 | requests of removing buckets #2
leo_delete_dir_queue_3 | idling | 0 | 1600 | 500 | requests of removing buckets #3
leo_delete_dir_queue_4 | idling | 0 | 1600 | 500 | requests of removing buckets #4
leo_delete_dir_queue_5 | idling | 0 | 1600 | 500 | requests of removing buckets #5
leo_delete_dir_queue_6 | idling | 0 | 1600 | 500 | requests of removing buckets #6
leo_delete_dir_queue_7 | idling | 0 | 1600 | 500 | requests of removing buckets #7
leo_delete_dir_queue_8 | idling | 0 | 1600 | 500 | requests of removing buckets #8
leo_per_object_queue | idling | 0 | 1600 | 500 | requests of fixing inconsistency of objects
leo_rebalance_queue | idling | 0 | 1600 | 500 | requests of relocating objects
leo_recovery_node_queue | idling | 0 | 1600 | 500 | requests of recovering objects of the node (incl. recover-consistency)
leo_req_delete_dir_queue | idling | 0 | 1600 | 500 | requests of removing directories
leo_sync_by_vnode_id_queue | idling | 0 | 1600 | 500 | requests of synchronizing objects by vnode-id
leo_sync_obj_with_dc_queue | idling | 0 | 1600 | 500 | requests of synchronizing objects w/remote-node
I can see the same users and buckets on DC2, but the endpoints and bucket data from DC1 are not replicated to DC2, and there are no warnings or errors in the DC2 log files.
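Is it expected that join-cluster alone does not push pre-existing objects, and that they have to be pushed explicitly? A sketch of what I am assuming would be needed, which I have not run yet (leofs_2 is the cluster Id from the DC2 status output above):

# assumption: recover-cluster (re)sends objects toward the named remote cluster
/usr/local/bin/leofs-adm recover-cluster leofs_2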
After systemctl restart leofs-storage.service on [email protected], the DC1 cluster returns to normal speed:
# systemctl restart leofs-storage.service
# time s3cmd --config=/opt/s3cmd/b2b.cfg ls s3://b2b-cache/
DIR s3://b2b-cache/01/
DIR s3://b2b-cache/cities/
DIR s3://b2b-cache/warehouses/
real 0m0.256s
user 0m0.161s
sys 0m0.050s
There are no firewalls or other filtering devices between DC1 and DC2.
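For completeness, basic reachability of the manager RPC ports can be confirmed with something like the following (the hostnames are placeholders for the DC2 manager addresses used in join-cluster above):

# placeholder hosts; 13075/13076 are the ports passed to join-cluster
nc -vz <dc2-manager-master-host> 13075
nc -vz <dc2-manager-slave-host> 13076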
Please help!