This repository was archived by the owner on Feb 22, 2023. It is now read-only.

[camera]fix crash due to race condition in dispatch queue #4619

Contributor

@hellohuanlin hellohuanlin commented Dec 16, 2021

This PR fixes a crash caused by a race condition on the _captureSessionQueue ivar. This kind of race condition is annoying and hard to fix because it takes a lot of luck to reproduce (I accidentally touched the iPhone's home bar, which made the app inactive and then immediately active again, triggering the race).

Crash

*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[AVCaptureAudioDataOutput setSampleBufferDelegate:queue:] NULL queue passed'
terminating with uncaught exception of type NSException

Reproduce
(1) In the sample app, start a video recording, then stop it.
(2) Swipe the home bar just a bit on an iPhone X, so the app becomes inactive and immediately becomes active again.
(3) Tap the video button again and observe the crash.

Root Cause
_captureSessionQueue is checked on the main thread but set to nil on a background thread, which leaves it open to a race condition:

  // Main thread:
  if (_captureSessionQueue == nil) {
    _captureSessionQueue = dispatch_queue_create("io.flutter.camera.captureSessionQueue", NULL);
  }

  // Background thread:
  _captureSessionQueue = nil;

Explain
In the reproduction steps above, (1) assigns a new _captureSessionQueue, (2) triggers the "dispose" and "create" method calls back to back, which results in a race condition, and (3) passes in the _captureSessionQueue that was nil'ed out asynchronously by the dispose call in the previous step.

Solution
(1) It looks like the camera plugin has pretty complicated threading logic, and it's getting hard to maintain. There have been multiple related issues. If we get time, I am interested in refactoring the threading model in the future.
(2) For now, an easy fix for this race condition is to not nil out _captureSessionQueue. DispatchQueues are quite lightweight, as they are not actual threads.
(3) Alternatively, we can dispatch to the main thread again before setting it to nil, which should fix the _captureSessionQueue race for this particular case but may still leave other unknown cases open.
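
To make option (2) concrete, here is a minimal sketch of the idea, not the actual plugin code. The class name CameraQueueOwner and its dispose method are hypothetical stand-ins used only for illustration. The queue is created once in init and is never reassigned, so there is no nil check on one thread that can race with a nil assignment on another.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the plugin class (illustration only).
@interface CameraQueueOwner : NSObject
@property(nonatomic, readonly) dispatch_queue_t captureSessionQueue;
- (void)dispose;
@end

@implementation CameraQueueOwner

- (instancetype)init {
  self = [super init];
  if (self) {
    // Created exactly once; never reassigned or set to nil afterwards.
    _captureSessionQueue = dispatch_queue_create("io.flutter.camera.captureSessionQueue",
                                                 DISPATCH_QUEUE_SERIAL);
  }
  return self;
}

- (void)dispose {
  // Tear down camera resources on the queue, but keep the queue itself alive.
  dispatch_async(self.captureSessionQueue, ^{
    // ... release capture session resources here ...
  });
}

@end
```

With this shape, a later call that passes captureSessionQueue to an API that rejects a NULL queue always receives a valid queue, even if dispose ran just before it.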

Issues
flutter/flutter#96429
flutter/flutter#59124
flutter/flutter#52578 (comment)

Next Step

  • This section is outdated. We wrote up a formal proposal to address various threading related issues.

This PR should fix the crash, but we should start looking at simplifying the threading logic in this plugin. Ideally, a DispatchQueue should have its lifecycle scoped to the resource that the queue manages, instead of manually nil'ing out the reference. HOWEVER, let's step back a bit: should this even be a plugin problem?

If we trace the execution from UI to plugin and back to UI, we get these thread hops:

Flutter's UI thread ->
iOS platform thread ->
camera background thread -> 
iOS platform thread -> 
Flutter's UI thread

The iOS platform thread is involved twice, just to dispatch to another thread, which means the whole process could have been as simple as:

Flutter's UI thread ->
camera background thread -> 
Flutter's UI thread

I think the engine should support invoking plugin methods on a background thread. The benefits are:

  • Avoids unnecessary thread hops.
  • Plugin developers (e.g. camera) don't have to deal with threading problems if they simply want to run code on a background thread.
  • Avoids crashes like this one.

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the relevant style guides and ran the auto-formatter. (Unlike the flutter/flutter repo, the flutter/plugins repo does use dart format.)
  • I signed the CLA.
  • The title of the PR starts with the name of the plugin surrounded by square brackets, e.g. [shared_preferences]
  • I listed at least one issue that this PR fixes in the description above.
  • I updated pubspec.yaml with an appropriate new version according to the pub versioning philosophy, or this PR is exempt from version changes.
  • I updated CHANGELOG.md to add a description of the change, following repository CHANGELOG style.
  • I updated/added relevant documentation (doc comments with ///).
  • I added new tests to check the change I am making, or this PR is test-exempt.
  • All existing and new tests are passing.

@stuartmorgan-g
Contributor

> (1) It looks like the camera plugin has pretty complicated threading logic and it's getting hard to maintain. There have been multiple related issues.

Yes, this is if anything an understatement. It was until recently very wrong, and some small steps to make it somewhat less wrong were taken, but it needs to be overhauled.

Contributor

@stuartmorgan-g stuartmorgan-g left a comment

Could you merge in the latest version of master so that the tests run? A key recently expired, which I think is why everything is red.

@hellohuanlin
Contributor Author

@stuartmorgan thanks for the review. I think we are having code freeze until next year, so I will fix them next year when we come back.

@stuartmorgan-g
Contributor

> I think we are having code freeze until next year

There's no code freeze for this repository (although people should make sure not to land publish-triggering changes unless they will be around to deal with any fallout, as is generally the case here).

Contributor

@cyanglaz cyanglaz left a comment

Thanks for the fix! Left some comments.

@hellohuanlin
Contributor Author

@stuartmorgan Gotcha, thanks for the clarification. I think I will wait until #4608 has landed, because I want to bump the version and changelog in both PRs.

Contributor

@cyanglaz cyanglaz left a comment

LGTM!

@stuartmorgan-g
Contributor

Is this obsoleted by your other changes, or should it be rebased and cleaned up for landing?

@hellohuanlin
Contributor Author

> Is this obsoleted by your other changes, or should it be rebased and cleaned up for landing?

Will rebase and clean up.

@hellohuanlin hellohuanlin force-pushed the camera_dispatch_queue_race_condition branch 6 times, most recently from d394cb5 to d455f20 Compare January 24, 2022 22:48
@hellohuanlin hellohuanlin force-pushed the camera_dispatch_queue_race_condition branch from d455f20 to d45d133 Compare January 24, 2022 22:51
@hellohuanlin
Contributor Author

@stuartmorgan updated. PTAL. Thanks.

Member

@jmagman jmagman left a comment

LGTM. Can you update this PR's description since this doesn't touch anything named _dispatchQueue?

Labels: p: camera, platform-ios, waiting for tree to go green (Use "autosubmit")