arvo: refactors internals, adds error-handling #2366
|
@belisarius222, |
|
Yeah, comparing source (or receiving source with pit) is the right way to solve this. Should we just do that? Working on that is a better use of time than this Ford hack, although it's not good to let this PR languish either.
If not, I could add a hack to Ford to track whether we've built %reef on this desk before, and if not, use the .pit. I think it would be relatively straightforward.
—
~rovnys-ricfer
https://urbit.org
…On Fri, Feb 28, 2020 at 5:18 PM, Joe Bryan < ***@***.*** > wrote:
@belisarius222, %init wouldn't work
right now, as it happens in the "legacy boot" event, before the initial
userspace commit. I don't know %ford's internals too well, maybe there's
an easy way to check for the first %reef build. Hardcoding cases 1/2 might
be fine. But the right way to do the pit short-circuit is for arvo to
persist the source it's running, so %ford can scry and compare. I aim to
add that minimal arvo filesystem soon(tm).
|
|
@belisarius222, I don't want to churn interfaces for short-term workarounds. And I don't want to add state adapters here, or disrupt the outer layers of arvo. I've added back the pit-shortcircuit for %home and %base (which does get built first, you were right). This makes boot fast again. The only risk is making a bad pill (kernel source mismatch), which is already a risk. It should just be avoided. This seems fine for now, IMO, unless you'd prefer to prime the cache to the same effect. I just don't want to be blocked on this problem anymore. |
|
Ok, no this looks fine. Glad it worked. |
LGTM
@joemfb The kernel appears to reload fine when applying this (and ford-no-pit) on top of the latest release as a sanity check, but I do observe a single [%poke %bad-wire /] immediately afterwards.
I ran a |reboot after that and got a bunch of find.goof and find.wite errors and such. I made a trivial change to Arvo and recommitted to force it to reload, again observing [%poke %bad-wire /], and |reboot again afterwards produces the same find.goof and find.wite errors.
Anything to be concerned about here?
|
@jtobin, thanks for checking. Was this a new fake-ship from the arvo tagged release? I've pulled the %ford changes into #2384, and will reintegrate here once that's done. Pushing these changes OTA should work (and is why this milestone has been PR'd), but it still needs to be tested. As for |
|
@jtobin, the That being said, I'm not quite ready for this to be merged. I'd like to redo the master merge once #2384 is in, and clean things up generally. And I want to do a review myself. Also worth noting, this does not need to be in or block the os1 release. Due to the issues we've been having with OTA's, it might be best to hold it until after os1. But we can discuss that separately. |
|
Ok, this is cleaned up and ready on my end. @belisarius222, please take a final look. I've tested committing these changes to a fake-zod booted from the @jtobin, I haven't tried merging this and then cherry-picking the merge commit; I'm not sure if that's sufficient to exclude the changes in question. |
The recent changes all LGTM.
|
On Wed, Mar 04, 2020 at 11:11:48AM -0800, Joe Bryan wrote:
@jtobin, I haven't tried merging this and then cherry-picking the
merge commit; I'm not sure if that's sufficient to exclude the changes
in question.
No sweat, I'll check it out before merging.
|
* master: (484 commits)
  king: Slight CLI cleanup and fix test build.
  king: Add command-line flags to configure HTTP and HTTPS ports.
  groups: reduce metadata updates, removal
  chat: reducer handles metadata removal
  groups: exclude group metadata from channels list
  groups: set and surface group name metadata
  groups: remove dummy 'share' flow, 'default' group
  contacts: rename, migrate '~contacts' to '~groups'
  sh/release: rename vere release tarballs
  vere: patch version bump (v0.10.3 -> v0.10.4.rc1) [ci skip]
  pills: updated brass and solid
  chat: pull room contacts from associated group
  chat: spell 'permanent' correctly
  eyre: remove padding from 'access' input
  chat: only delete metadata for a chat if you created it
  chat: settings inputs add borders on focus
  vere: disables gc on |mass in the daemon process
  chat: remove console.log from metadataAction
  chat: style fixes during review, use metadata-hook
  chat: edit description, color settings
  ...
|
I did one last re-merge (there was a legitimate, minor conflict in %ford as of os1-rc). |
|
The cherry-picked merge doesn't include the %spot hint changes, but it does include the %ford changes that can't yet go out OTA (per #2333 -- presumably this popped up in the conflict you encountered). I think it's ok to merge this, as I don't plan to release any more non-surgical-hotfix updates prior to the OS1 release. This can probably just go out with OS1 proper. |
|
Alternatively, I can probably resolve the %ford conflict that occurs in the cherry-pick to take just the change relevant to this PR, and see if I can get it into the last release prior to OS1. I'll give it a quick test just to check -- if anything looks dodgy I'll just hold off. |
This appears fine. Will merge and push a new Arvo release candidate. |
* origin/arvo-errors: (35 commits)
  pill: all
  vane: jet-hints all vanes for profiling
  arvo: refines crash printfs
  arvo: fix wire (and adapt old) for %vega reset notification
  arvo: removes all vase literals from |va
  arvo: removes all traces of meta-meta card reduction
  arvo: cleanup per review
  arvo: removes vestigial |is core
  arvo: remove refactoring comments
  arvo: replace $milt with $meta
  arvo: replace $mill with $maze
  worker: sends new error-notification events
  arvo: removes %gave, generalizes %hurl
  vane: prints error notifications where not handled
  behn: forward %drip error notifications, refactor %crud handling
  ames: downcast %hear error notification to %hole
  vane: downcast all error notifications to %crud
  arvo: removes (commented out) legacy event routing
  test: updates vane calling convention
  dill: "downcast" +call error notification to %crud
  ...

Signed-off-by: Jared Tobin <[email protected]>
(cherry picked from commit 6ccc843)
This PR refactors arvo's internal engines, and adds new mechanisms by which vanes can propagate error notifications.
The primary changes are:
- `|wink` refactored into `|me` and `|va`
- `|is` refactored into `|le`

The new engines -- namespaced in `|part` -- are inspired to varying degrees by the unreleased, incomplete `neo/arvo` (deleted in 7d4b35c). These changes were made to support the error-notification implementation, and generally improve the quality of arvo's internals. No changes have been made to arvo's external interface or persistent structures, so this extensive refactoring is still suitable for OTA release without staging or adaptation.

The new error notifications involve changes to arvo's internal loop, and the interface between arvo and its vanes:

- `$goof` is added, a new error-notification structure (similar to `$ares` from the deep past)
- `%hurl` is added, a new kernel action to propagate a `$goof` along with a `%pass` or `%give`
- `(unit goof)` is added to the sample of vanes' `+call` and `+take` arms

Inside the vanes, new error notifications are "downcast" to the old style (ie, `%crud` everywhere but %ames, where they become `%crud` or `%hole`). The only change in the error-handling behavior of the vanes is in %behn's `%drip` handling. Errors therein are now propagated to the intended recipients (arriving in their `+take` arm, where they'll be merely printed).

Finally, the worker process is updated to send new error-notification events, including both the `$goof` (ie, bail mote and stack trace) and the original event.

This PR represents an incomplete but viable snapshot of the error-handling work. Additional changes are needed in the vanes (to handle more error notifications, or more fully) and the runtime, specifically the IPC protocol and I/O drivers (to precisely handle errors in error notifications). Additional improvements to arvo also follow from this work, most notably around upgrade. All such changes will be more disruptive than these, and harder to handle without strict versioning coordination between arvo and the runtime. These changes are a foundation upon which incrementally better error-handling can be built for the live network, while larger efforts continue in the background.

Calling the error-notification structure `$goof` is somewhat ... well, goofy. Some other candidates include fail, ruin, crud, flaw, lack, and miss. Feedback is requested.

The changes to arvo and the vanes must be released together, but no intermediate staging is needed, and the runtime changes need not be correlated. Since almost every commit in this PR merits a new pill, but none individually require one, I've departed from the recommended approach and saved the pill update for the end.
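
For readers who don't read Hoon, here is a rough Python model of the error-notification flow described above. It is only an illustrative sketch, not the code in this PR: the `Goof` and `Hurl` classes, the `call` and `worker` functions, the `crash` flag, and the `"exit"` mote value are all hypothetical stand-ins. The only behavior taken from the description and commit list is that an error notification carries a mote and a stack trace alongside the original event, and that vanes downcast it to `%crud` (or to `%hole` for an %ames `%hear`).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Goof:
    """Toy stand-in for $goof: a bail mote plus a stack trace."""
    mote: str
    trace: Tuple[str, ...]


@dataclass(frozen=True)
class Hurl:
    """Toy stand-in for %hurl: a goof travelling with a %pass or %give."""
    goof: Goof
    kind: str   # "pass" or "give"
    card: str   # the wrapped action, reduced to a label here


def downcast(vane: str, task: str, goof: Goof) -> Tuple[str, Goof]:
    """Convert a new-style notification to the old style."""
    if vane == "ames" and task == "hear":
        return ("hole", goof)
    return ("crud", goof)


def call(vane: str, task: str, goof: Optional[Goof] = None,
         crash: bool = False) -> Tuple[str, object]:
    """Toy +call arm: the optional goof plays the role of (unit goof)."""
    if goof is not None:
        return downcast(vane, task, goof)
    if crash:
        # Hypothetical failure, standing in for a crash while handling task.
        raise RuntimeError("simulated crash")
    return ("done", task)


def worker(vane: str, task: str, crash: bool = False) -> Tuple[str, object]:
    """Toy worker: on failure, resend the original event with a goof."""
    try:
        return call(vane, task, crash=crash)
    except RuntimeError as err:
        # "exit" is a placeholder mote; the trace is just the error text.
        goof = Goof(mote="exit", trace=(str(err),))
        return call(vane, task, goof=goof)


if __name__ == "__main__":
    print(worker("behn", "wait"))
    print(worker("behn", "wait", crash=True)[0])
    print(worker("ames", "hear", crash=True)[0])
    print(Hurl(Goof("exit", ()), "give", "wake").kind)
```

In this model, the second delivery cannot fail the same way because the handler takes the error branch as soon as a goof is present. Handling errors that occur inside error notifications is listed above as future runtime work, so the sketch leaves it out.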