An efficient cache of metadata about all your Org files.
Builds fast. My M-x indexed-reset:
indexed: Analyzed 160616 lines in 10734 entries (3397 with ID)
in 2394 files in 1.37s (+ 0.16s to build SQLite DB)
This library came from asking myself “what could I move out of org-node, that’d make sense in core?” Maybe a proposal for upstream, or at least a PoC.
Many Org plugins now do reinvent the wheel, when it comes to keeping track of some or many files and what may be in them.
Example: org-roam’s DB, org-node’s hash tables, orgrr’s hash tables, …, and some just re-run grep all the time, which still leads to writing elisp to cross-reference the results with something useful.
And let’s not talk about the org-agenda… Try putting 2000 files into org-agenda-files! It needs to open each file in real-time to know anything about them, so everyday commands grind to a halt.
Data will exist after setup akin to this and you wait a second or two.
(setq indexed-org-dirs '("~/org" "~/Sync/notes"))
(indexed-updater-mode)
(indexed-roam-mode) ;; optionalTwo different APIs to access the same data.
- sql
- elisp
Why two? It’s free. When the data has been gathered anyway, there is no reason to only insert it into a SQLite db, nor only put it in a hash table.
Hash table is nicer for simple lookups (and slightly faster). SQL excels for complex lookups.
As a bonus, it’s fairly easy to design your own SQL database, so this library does not lock you in. Included is one that matches org-roam’s database perfectly, and a future milestone might be adding one that matches org-sql.
To use SQL, see Appendix I. For the elisp API, see Appendix II.
A design choice: Indexed only delivers data. It could easily ship conveniences like, let’s call it a function “indexed-goto”:
(defun indexed-goto (entry)
(find-file (indexed-file-name entry))
(goto-char (indexed-pos entry))but in my experience, that will spiral into dozens of lines over time, to handle a variety of edge cases, and then it will no longer be universally applicable. Maybe you prefer to handle edge cases different than I do.
So, it is up to you to write your own “goto” function and all else to do with user interaction.
A design choice: Indexed does not use Org code to analyze your files, but a custom, more dumb parser. That’s for three reasons:
- Fast init. Since I want Emacs init to be fast, it’s not allowed to load Org, but I still want to be able to use a command like
org-node-findto browse my Org stuff, before deciding to actually jump into an Org file.That means the data must be able to exist, before Org has loaded.
- Future milestone: I want to be told at init if there’s a dangling Org clock somewhere. Or a deadline.
- Robustness. Many users heavily customize Org, so no surprise that it sometimes breaks. In my experience, it’s very nice then to have an alternative way to browse that does not depend on a functional Org setup.
- Fast rebuild. As they say, there are two hard things in computer science: cache invalidation and naming thingsnote1.
Indexed must update its cache as the user saves, renames and deletes files. Not a difficult problem, until you realize that files may change due to a Git operation, OS file operations, a
rmcommand on the terminal, edits by another Emacs instance, or even remote edits by Logseq.A robust approach to cache invalidation is to avoid trying: ensure that a full rebuild is fast enough that you can just do that instead.
In fact,
indexed-updater-modedoes a bit of both, because it is still important that saving a file does not lag; it does its best to update only the necessary tables on save, and an idle timer causes a full reset every now and then.- [note1]: I sometimes feel I failed at naming this library! Got opinions/ideas? Leave a comment!
Included are two designs:
- A drop-in for org-roam’s
(org-roam-db), called(indexed-roam). - Our own experimental
(indexed-orgdb).- Are you a SQL and Org user? Please write what you think should go in a good DB.
- There is prior art on the matter at org-sql, but your story is still welcome!
- Are you a SQL and Org user? Please write what you think should go in a good DB.
UPDATE NOTICE [2025-03-23 Sun 16:44]: Function (indexed-roam) now returns an EmacSQL connection.
Activating the mode creates an in-memory database by default.
(indexed-roam-mode)Test that it works:
(emacsql (indexed-roam) [:select * :from files :limit 10])To end your dependence on org-roam-db-sync, you can set the following. It will overwrite the “org-roam.db” file.
(setq org-roam-db-update-on-save nil)
(setq indexed-roam-overwrite t)
(indexed-roam-mode)Now, you have a new, all-fake org-roam.db! Test that it works:
(org-roam-db-query [:select * :from files :limit 10])N/B: because (equal (org-roam-db) (indexed-roam)), the above is equivalent to these:
(emacsql (org-roam-db) [:select * :from files :limit 10])
(emacsql (indexed-roam) [:select * :from files :limit 10])There’s a known issue if you use multiple Emacsen, the error “attempt to write a readonly database”. Get unstuck with M-: (org-roam-db--close-all) if that happens.
This DB is a bit different, subject to redesign.
(indexed-orgdb-mode)Note it probably does not work with EmacSQL, just the Emacs 29+ built-in sqlite-select.
In practice, you can often translate a statement like
(org-roam-db-query [:select tag :from tags :where (= id $s1)] id)to
(sqlite-select (indexed-orgdb) "select tag from tags where id = ?;" (list id))or if you like mysterious aliases,
(indexed-orgdb "select tag from tags where id = ?;" id)There are three types of objects: file-data, org-entry and org-link. Logically speaking, files contain entries and entries contain links, right?
Functions of no argument
indexed-org-files- Return all file objects
indexed-org-entries- Return all entry objects
indexed-org-id-nodes- Return all entry objects that have an ID
indexed-org-links-and-citations- Return all link objects
indexed-org-links- Return all link objects with a type such as
id:orhttps:
- Return all link objects with a type such as
Functions operating on raw file paths
- indexed-entry-near-lnum-in-file
- indexed-entry-near-pos-in-file
- indexed-id-nodes-in
- indexed-entries-in
Functions operating on raw id
- indexed-entry-by-id
- indexed-file-by-id
- indexed-links-from
Functions operating on raw titles
- indexed-id-node-by-title
Functions operating on ORG-LINK
- indexed-dest
- indexed-type
- indexed-heading-above
- indexed-pos
- indexed-id-nearby
- (old alias:
indexed-origin. Org-roam calls the same thing “source” and org-node calls it “origin”, but both terms presume an ID-centric design to everything, and make less sense when you allow for the absence of IDs.)
- (old alias:
Functions operating on ENTRY
- indexed-deadline
- indexed-heading-lvl
- indexed-id-links-to
- indexed-olpath
- indexed-olpath-with-self
- indexed-olpath-with-self-with-title
- indexed-olpath-with-title
- indexed-pos
- indexed-priority
- indexed-properties
- indexed-property
- indexed-property-assert
- indexed-roam-aliases – works without
indexed-roam-mode - indexed-roam-reflinks-to – needs
indexed-roam-mode - indexed-roam-refs – needs
indexed-roam-mode - indexed-root-heading-to
- indexed-scheduled
- indexed-tags
- indexed-tags-inherited
- indexed-tags-local
- indexed-todo-state
- indexed-toptitle
Polymorphic functions (that work on all three types)
- indexed-file-name
- indexed-file-data
- indexed-file-title
- indexed-file-title-or-basename
- indexed-file-mtime
Hooks used by command indexed-reset and mode indexed-updater-mode
- indexed-pre-full-reset-functions
- indexed-post-full-reset-functions
- indexed-record-file-functions
- indexed-record-entry-functions
- indexed-record-link-functions
Additional hooks used by mode indexed-updater-mode
- indexed-pre-incremental-update-functions
- indexed-post-incremental-update-functions
- indexed-forget-file-functions
- indexed-forget-entry-functions
- indexed-forget-link-functions
A separate file indexed-x.el is loaded when you enable indexed-updater-mode.
It is separate because indexed-updater-mode is not strictly necessary – it could be replaced by a simple timer that calls indexed-reset every 20 seconds, or whatever you deem suitable.
The file also ships some extra tools.
You may want to call the following functions after inserting entries or links in a custom way, if they need to become indexed instantly without waiting for user to save the buffer:
- indexed-x-ensure-entry-at-point-known
- indexed-x-ensure-link-at-point-known
Examples of when those are useful is when you write a command like org-node-extract-subtree or a subroutine like org-node-backlink–add-in-target.
Steps:
- Read file indexed-roam.el as a reference implementation, or file indexed-orgdb.el if you want only the built-in sqlite feature and no EmacSQL
- See how it looks up the data it needs
- See which things require a
prin1-to-string - See how arguments are ultimately passed to
sqlite-execute
[TODO: write a simpler example impl]
- Hook your own DB-creator onto
indexed-post-full-reset-functions, or just on some hook that suits your use-case - Done!
Modes
- indexed-updater-mode
- indexed-roam-mode
Config settings
- indexed-warn-title-collisions
- indexed-seek-link-types
- indexed-org-dirs
- indexed-org-dirs-exclude
- indexed-sync-with-org-id
- indexed-roam-overwrite
Commands
- indexed-list-dead-id-links
- indexed-list-title-collisions
- indexed-list-problems
- indexed-list-entries
- indexed-list-db-contents
- indexed-reset
Never sit through a slow M-x org-id-update-id-locations again.
(setq indexed-sync-with-org-id t)This tells org-id about everything that Indexed can find under indexed-org-dirs. That’s nice because if you had clicked an ID-link that org-id did not know about, it would react by running org-id-update-id-locations, hanging Emacs for quite a while.
Never had this problem? If you came here from org-node or org-roam, that’s because they solve this problem for you.