Fix the PAF problems in the last PR #38321
PR #38321: Size comparison from 61897c7 to 3107b21 Full report (27 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, nrfconnect, qpg, stm32, telink, tizen)
PR #38321: Size comparison from 61897c7 to 9d1802e Increases above 0.2%:
Full report (75 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, nrfconnect, nxp, psoc6, qpg, stm32, telink, tizen)
PR #38321: Size comparison from 61897c7 to b4fc1c6 Increases above 0.2%:
Full report (75 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, nrfconnect, nxp, psoc6, qpg, stm32, telink, tizen)
Co-authored-by: Boris Zbarsky <[email protected]>
…ture Signed-off-by: Lo,Chin-Ran <[email protected]>
* Update the naming of the functions to be more meaningful.
* Remove the added session info if subscribing fails.
* Change the implementation to parse the input packets.

Signed-off-by: Lo,Chin-Ran <[email protected]>
PR #38321: Size comparison from b24c59e to 0c80456 Increases above 0.2%:
Full report (75 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, nrfconnect, nxp, psoc6, qpg, stm32, telink, tizen)
* Fix by using kUndefinedWiFiPafSessionId as the unused ID.

Signed-off-by: Lo,Chin-Ran <[email protected]>
PR #38321: Size comparison from 4fbe950 to 58fb960 Full report (14 builds for cc13x4_26x4, cc32xx, nrfconnect, qpg, stm32, tizen)
… only if the PAF session has not been established

Signed-off-by: Lo,Chin-Ran <[email protected]>
PR #38321: Size comparison from 4fbe950 to 410e7f9 Increases above 0.2%:
Full report (75 builds for bl602, bl702, bl702l, cc13x4_26x4, cc32xx, cyw30739, efr32, esp32, linux, nrfconnect, nxp, psoc6, qpg, stm32, telink, tizen)
…missioneeDevice()
* Change the shutdown API of WiFiPAFLayer to release the resource inside the function.
* Clean up the callback function once it has stopped discovering the commissionee

Signed-off-by: Lo,Chin-Ran <[email protected]>
| if (mAdvertisingOverWiFiPAF) | ||
| { | ||
| ChipLogProgress(AppServer, "Cancel Wi-Fi PAF publish"); | ||
| WiFiPAF::WiFiPAFLayer::GetWiFiPAFLayer().CloseAllConnections(); |
This needs some serious documentation or API refactoring, not least because I am not 100% sure I follow #38321 (comment)
It sounds like there are three possible states the PAF "advertising" side (commissionee) can be in?
1. Nothing is going on with PAF, everything is shut down.
2. We are "advertising" over PAF but no one has "connected" with us (whatever that means).
3. We are no longer "advertising", but now have a "connection".
It sounds like StopAdvertising will take us from state 2 to state 1, but remain in state 3 if that's where we are, and here the goal is to move from state 3 to state 1?
As a matter of API factoring, it's a little confusing to me that control over this state lives partly on GetWiFiPAFLayer and partly on ConnectivityMgr, and that we need to interrogate WiFiPAFLayer to figure out what to tell the ConnectivityMgr. It's also confusing that we shut down "everything", not just the "connection" we had.
I guess we can do this for now, with a bunch of documentation both here and in StopAdvertising, but as a followup what we should really have is:
1. A clear API for "start advertising".
2. A clear API for "stop advertising, but optionally don't kill the underlying transport for the given PASE session". This API should ideally hide away the "query for sessions" implementation detail. But what it can and should take is the PeerAddress of the PASE session we have (if we have one).
3. A clear API for "tear down whatever thing was supporting the given PASE session", that takes the PeerAddress of the PASE session, if we have one, as input.
APIs 2 and 3 would take an std::optional<PeerAddress>, and then the PAF machinery can operate on those as desired.
Can we be in a state where PAF is "connected" but we do not in fact have a PASE session? I guess we could, especially if we are in the middle of the PASE handshake?
I know that's not what BLE does. What BLE does is broken and we are trying to fix it, by gradually moving it to a model more like what I describe above....
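To make the proposed factoring concrete, here is a minimal sketch of the three states and the three APIs. Everything in it is hypothetical: `PafCommissioneeTransport`, `PafState`, `OnPeerConnected`, `CloseTransportFor`, and the reduced `PeerAddress` struct are illustration-only names, not SDK types. It also assumes that stopping advertising without a matching PASE session drops any existing connection, and that passing no PeerAddress to the tear-down call closes whatever connection exists.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for Transport::PeerAddress; only a PAF id is modelled.
struct PeerAddress
{
    uint32_t pafSessionId;
    bool operator==(const PeerAddress & other) const { return pafSessionId == other.pafSessionId; }
};

// The three commissionee-side states described in the comment above.
enum class PafState
{
    kIdle,        // 1: nothing is going on with PAF
    kAdvertising, // 2: advertising, no peer connected yet
    kConnected,   // 3: no longer advertising, one connection exists
};

// Hypothetical facade; callers never query PAF sessions themselves.
class PafCommissioneeTransport
{
public:
    PafState State() const { return mState; }

    // API 1: start advertising (only meaningful from the idle state).
    void StartAdvertising()
    {
        if (mState == PafState::kIdle)
        {
            mState = PafState::kAdvertising;
        }
    }

    // Models a peer connecting while we advertise (state 2 -> state 3).
    void OnPeerConnected(const PeerAddress & peer)
    {
        if (mState == PafState::kAdvertising)
        {
            mPeer  = peer;
            mState = PafState::kConnected;
        }
    }

    // API 2: stop advertising, keeping the transport only if it backs paseSession.
    void StopAdvertising(std::optional<PeerAddress> paseSession)
    {
        const bool keepConnection =
            (mState == PafState::kConnected) && paseSession.has_value() && (mPeer == paseSession);
        if (!keepConnection)
        {
            Reset();
        }
    }

    // API 3: tear down whatever supports paseSession.
    // Assumption: with no PeerAddress, any existing connection is closed.
    void CloseTransportFor(std::optional<PeerAddress> paseSession)
    {
        if (mState != PafState::kConnected)
        {
            return;
        }
        if (!paseSession.has_value() || mPeer == paseSession)
        {
            Reset();
        }
    }

private:
    void Reset()
    {
        mPeer.reset();
        mState = PafState::kIdle;
    }

    PafState mState = PafState::kIdle;
    std::optional<PeerAddress> mPeer;
};
```

With this shape, the "connected but no PASE session" case has one defined outcome: `StopAdvertising(std::nullopt)` returns to state 1, so the device cannot stay reachable over a PAF connection that never produced a PASE session.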
| WiFiPAF::WiFiPAFSession sessionInfo = { .role = WiFiPAF::WiFiPafRole::kWiFiPafRole_Publisher }; | ||
| WiFiPAF::WiFiPAFLayer & pafLayer = WiFiPAF::WiFiPAFLayer::GetWiFiPAFLayer(); | ||
| WiFiPAF::WiFiPAFSession * pSession = pafLayer.GetPAFInfo(WiFiPAF::PafInfoAccess::kAccSessionId, sessionInfo); | ||
| if ((pSession != nullptr) && (pSession->peer_id == WiFiPAF::kUndefinedWiFiPafSessionId)) |
See comments above. Those implementation details should be hidden inside the API surface...
Importantly: that will let us see whether we have correct behavior for the case when we want to stop advertising but we have a "PAF session" established.... How does that behave? The device should not remain discoverable if PASE over that PAF session fails, for example. Right now it's pretty hard to tell what this is doing.
| #if CHIP_DEVICE_CONFIG_ENABLE_WIFIPAF | ||
| WiFiPAF::WiFiPAFLayer::GetWiFiPAFLayer().Shutdown( | ||
| [](uint32_t id, WiFiPAF::WiFiPafRole role) { DeviceLayer::ConnectivityMgr().WiFiPAFShutdown(id, role); }); | ||
| mSystemState->WiFiPayLayer()->Shutdown(); |
I still don't understand how this is supposed to work.
Say I have two DeviceCommissioner instances. One is in the middle of commissioning over PAF. The other one (totally unrelated) gets shut down.
Won't this break the commissioning? This does not seem like an OK thing to do. This is not a hypothetical scenario: lots of clients have multiple DeviceCommissioner instances around for various reasons, being brought up/down as needed.
Layer shutdown should be paired with layer startup/init. Init happens in ConnectivityManagerImpl::_Init, right? So shutdown should happen in ConnectivityManager shutdown... If we don't have a shutdown for connectivity manager (which it looks like we don't), then we should have it, and it should be paired with whatever does connectivity manager init.
But anyway, it looks like this shutdown happens in platform manager shutdown, which sort of mirrors connectivity manager init happening in platform manager init. So why is this here at all?
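The ownership rule argued for here can be sketched as follows. This is a simplified model, not SDK code: `PafLayer`, `PlatformOwner`, and `Commissioner` are hypothetical stand-ins for the WiFiPAFLayer, whatever performs platform/connectivity manager init and shutdown, and a DeviceCommissioner. The point it shows is that only the component that initialized the layer shuts it down, so one commissioner going away cannot break another one that is mid-commissioning.

```cpp
#include <cassert>

// Hypothetical stand-in for the process-wide PAF layer.
class PafLayer
{
public:
    void Init() { mRunning = true; }
    void Shutdown() { mRunning = false; }
    bool IsRunning() const { return mRunning; }

private:
    bool mRunning = false;
};

// Hypothetical owner playing the role of platform/connectivity manager:
// the only place where layer init and layer shutdown happen, as a pair.
class PlatformOwner
{
public:
    explicit PlatformOwner(PafLayer & layer) : mLayer(layer) { mLayer.Init(); }
    ~PlatformOwner() { mLayer.Shutdown(); }

    PlatformOwner(const PlatformOwner &)             = delete;
    PlatformOwner & operator=(const PlatformOwner &) = delete;

private:
    PafLayer & mLayer;
};

// Hypothetical commissioner: borrows the layer and never shuts it down.
class Commissioner
{
public:
    explicit Commissioner(PafLayer & layer) : mLayer(&layer) {}

    // Releases only this instance's reference; the shared layer keeps running.
    void Shutdown() { mLayer = nullptr; }

    bool CanUsePaf() const { return (mLayer != nullptr) && mLayer->IsRunning(); }

private:
    PafLayer * mLayer;
};
```

In this model, shutting down one `Commissioner` has no effect on the others, and the layer stops only when the `PlatformOwner` that started it is destroyed.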
| (device->IsSecureConnected() == true)) | ||
| { | ||
| ChipLogProgress(Discovery, "Closing all WiFiPAF connections"); | ||
| mSystemState->WiFiPayLayer()->CloseAllConnections(); |
Again, I know you copied the BLE code, but the BLE code is wrong and we are trying to fix it.
Shouldn't we be closing specifically whatever connections are relevant for the PeerAddress of the device involved?
As in, this should not be a CloseAllConnections() call; it should be a CloseConnection() call that takes the PeerAddress.
If under the hood PAF wants, for now, to just close all connections (because it only has one!) and ignore the PeerAddress, that's fine. But the API should not bake that assumption in, because we are trying to move to a world where we in fact support multiple parallel commissioning processes. Especially across multiple different DeviceCommissioner instances.
But it looks like PAF sessions are in fact associated with something like a "device id" (looking at the code in EstablishPASEConnection), and ideally the PeerAddress would just carry that information so we can shut down the right thing.
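A rough sketch of the API shape being asked for, under the assumption that a PeerAddress can carry some PAF device id. `PafConnectionTable`, its methods, and the `pafDeviceId` field are hypothetical names for illustration; the real PeerAddress may not expose such a field today.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical: a PeerAddress reduced to the PAF device id it would carry.
struct PeerAddress
{
    uint32_t pafDeviceId;
};

// Hypothetical connection bookkeeping keyed by that id.
class PafConnectionTable
{
public:
    void Open(const PeerAddress & peer) { mOpen.insert(peer.pafDeviceId); }

    // Closes only the connection belonging to `peer`.
    // Returns false if there was no such connection.
    bool CloseConnection(const PeerAddress & peer) { return mOpen.erase(peer.pafDeviceId) > 0; }

    bool IsOpen(const PeerAddress & peer) const { return mOpen.count(peer.pafDeviceId) > 0; }

    std::size_t OpenCount() const { return mOpen.size(); }

private:
    std::set<uint32_t> mOpen;
};
```

An implementation that supports only one connection could still ignore the argument internally for now; what matters is that callers pass the PeerAddress, so parallel commissioning can be supported later without an API change.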
| } | ||
| #endif | ||
| #if CHIP_DEVICE_CONFIG_ENABLE_WIFIPAF | ||
| if ((mSystemState->WiFiPayLayer() != nullptr) && (device->GetDeviceTransportType() == Transport::Type::kWiFiPAF) && |
I am not sure why WiFiPayLayer() is null-checked here but not elsewhere....
| void SetUpCodePairer::OnWifiPAFDiscoveryError(CHIP_ERROR err) | ||
| { | ||
| if (mWaitingForDiscovery[kWiFiPAFTransport] == false) |
Again, if discovery has stopped this object may have been destroyed and then this is use-after-free.
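One way to avoid this class of bug is to unregister the callback before the listening object can go away, so that a late error is dropped instead of reaching freed memory. The sketch below is a generic illustration with hypothetical names (`PafDiscovery`, `Pairer`, `ReportError`); it does not reflect how SetUpCodePairer or the PAF layer actually register callbacks.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical discovery source that may report an error at any later time.
class PafDiscovery
{
public:
    void SetErrorCallback(std::function<void(int)> callback) { mOnError = std::move(callback); }
    void ClearErrorCallback() { mOnError = nullptr; }

    // Delivers the error if someone is still listening.
    // Returns false when the error was dropped because no callback is set.
    bool ReportError(int err)
    {
        if (!mOnError)
        {
            return false;
        }
        mOnError(err);
        return true;
    }

private:
    std::function<void(int)> mOnError;
};

// Hypothetical listener. Its destructor unregisters the callback, so the
// discovery source can never call into a destroyed object.
class Pairer
{
public:
    explicit Pairer(PafDiscovery & discovery) : mDiscovery(discovery)
    {
        mDiscovery.SetErrorCallback([this](int err) { mLastError = err; });
    }
    ~Pairer() { mDiscovery.ClearErrorCallback(); }

    // The callback captures `this`, so copying would leave it pointing at the wrong object.
    Pairer(const Pairer &)             = delete;
    Pairer & operator=(const Pairer &) = delete;

    int LastError() const { return mLastError; }

private:
    PafDiscovery & mDiscovery;
    int mLastError = 0;
};
```

The same idea applies when discovery is stopped explicitly rather than by destruction: clear the callback at the moment discovery stops, instead of checking a member flag inside the callback.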
| private: | ||
| #if CHIP_DEVICE_CONFIG_ENABLE_WIFIPAF | ||
| WiFiPAFAdvertiseParams mPafAdverParam; |
| WiFiPAFAdvertiseParams mPafAdverParam; | |
| WiFiPAFAdvertiseParams mPAFAdvertisementParams; |
Or even "Parameters".... Or at least mPAFAdvertiseParams. But not "Adver", please.
| CHIP_ERROR WiFiPAFPublish(WiFiPAFAdvertiseParam & args); | ||
| struct WiFiPAFAdvertiseParams | ||
| { | ||
| /* The list of the frequencies to advertise on. Each element (uint16_t) is the channel frequency |
| /* The list of the frequencies to advertise on. Each element (uint16_t) is the channel frequency | |
| /* The list of the frequencies to advertise on. Each element (uint16_t) is the channel | |
| fundamental frequency in MHz, as defined in the Wi-Fi standards. |
| chnl#44: 5220, chnl#149: 5745 | ||
| */ | ||
| ReadOnlyBuffer<uint16_t> freq_list; | ||
| /* publish_id */ |
#38321 (comment) says more information will be added. That still needs to happen, to explain what this publish_id is and how an API consumer is expected to use it.
I did not review these pieces.
Testing
AP: Asus RT-N66U, running at chnl#1
ctrl: Rpi + NetGear, Inc. A6210
dut#1: rpi + Linksys AE6000
dut#2: imx93+iw612
dut:
Fixes #38315 by changing the variable type from uint8_t to size_t
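The details of #38315 are not shown here, so the following is only a generic illustration of why a uint8_t size or counter can be a problem: it wraps around at 256, while size_t does not for realistic sizes. `CountItems` is a hypothetical helper, not code from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts itemCount items using a caller-chosen counter type.
// With an 8-bit counter the result wraps modulo 256.
template <typename Counter>
Counter CountItems(std::size_t itemCount)
{
    Counter n = 0;
    for (std::size_t i = 0; i < itemCount; ++i)
    {
        ++n;
    }
    return n;
}
```

For 300 items, `CountItems<uint8_t>` yields 44 (300 modulo 256), whereas `CountItems<std::size_t>` yields 300.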