Thanks to visit codestin.com
Credit goes to github.com

Skip to content

bug fix: concurrent increase by CounterWindow may cause PriorityQueue breaken#12505

Merged
kezhenxu94 merged 3 commits intoapache:masterfrom
kael-aiur:bugfix/counterwindow
Aug 4, 2024
Merged

bug fix: concurrent increase by CounterWindow may cause PriorityQueue breaken#12505
kezhenxu94 merged 3 commits intoapache:masterfrom
kael-aiur:bugfix/counterwindow

Conversation

@kael-aiur
Copy link
Contributor

@kael-aiur kael-aiur commented Aug 2, 2024

Fix <#12503>

  • Add a unit test to verify that the fix works.
  • Explain briefly why the bug exists and how to fix it.

as org.apache.skywalking.oap.meter.analyzer.dsl.counter.CounterWindow are single instance, the windows which type ConcurrentHashMap will return the same PriorityQueue object for the same meter, when the same meter increase with concurrent call, some unexcpected exception will occur in PriorityQueue object, source code:

public class CounterWindow {

    public static final CounterWindow INSTANCE = new CounterWindow();

    private final Map<ID, Tuple2<Long, Double>> lastElementMap = new ConcurrentHashMap<>();
    private final Map<ID, Queue<Tuple2<Long, Double>>> windows = new ConcurrentHashMap<>();

    public Tuple2<Long, Double> increase(String name, ImmutableMap<String, String> labels, Double value, long windowSize, long now) {
        ID id = new ID(name, labels);
        // there will return the same PriorityQueue object for concurrent operation
        Queue<Tuple2<Long, Double>> window = windows.computeIfAbsent(id, unused -> new PriorityQueue<>());
        window.offer(Tuple.of(now, value));  // some unexpected error will occur in this line when concurrent operation
        long waterLevel = now - windowSize;
        Tuple2<Long, Double> peek = window.peek();
        if (peek._1 > waterLevel) {
            return peek;
        }

        Tuple2<Long, Double> result = peek;
        while (peek._1 < waterLevel) {
            result = window.poll();  // some unexpected error will occur in this line when concurrent operation
            peek = window.element();  // some unexpected error will occur in this line when concurrent operation
        }

        // Choose the closed slot to the expected timestamp
        if (waterLevel - result._1 <= peek._1 - waterLevel) {
            return result;
        }

        return peek;
    }
    //....
}
  • If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #12503.
  • Update the CHANGES log.

@wu-sheng wu-sheng added this to the 10.1.0 milestone Aug 2, 2024
@wu-sheng wu-sheng added backend OAP backend related. enhancement Enhancement on performance or codes labels Aug 2, 2024
@wu-sheng
Copy link
Member

wu-sheng commented Aug 2, 2024

@kezhenxu94 Docker compose somehow missing. Could you check CI env as well?

@peachisai
Copy link
Contributor

@kezhenxu94 Docker compose somehow missing. Could you check CI env as well?

I had the same problem as well, I think the command docker compose should instead docker-compose
Ref actions/runner-images#9557

@wu-sheng
Copy link
Member

wu-sheng commented Aug 4, 2024

@kezhenxu94 Docker compose somehow missing. Could you check CI env as well?

I had the same problem as well, I think the command docker compose should instead docker-compose

Ref actions/runner-images#9557

We fixed it through new e2e tool version.

@kezhenxu94 kezhenxu94 merged commit 5c33ee0 into apache:master Aug 4, 2024
@kael-aiur kael-aiur deleted the bugfix/counterwindow branch August 5, 2024 06:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend OAP backend related. enhancement Enhancement on performance or codes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants