pdf-to-bloom

A monorepo of tools that convert PDF documents to Bloom-compatible HTML, using Markdown as an intermediate format.

Packages

This monorepo contains the following packages:

@pdf-to-bloom/lib

The core Node.js library that provides the PDF to Bloom conversion functionality.

@pdf-to-bloom/cli

A command-line interface for converting PDFs to Bloom format.

Requirements

Node.js 22.11.0 or higher
OpenRouter API key
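
As a quick way to confirm the Node.js requirement before installing, a minimal sketch follows. The helper name node_version_ok is hypothetical (it is not part of this repository), and the comparison assumes a sort that supports -V, as GNU coreutils and recent macOS versions do.

```shell
# Minimum Node.js version stated in the Requirements section.
REQUIRED_NODE="22.11.0"

# Hypothetical helper: succeeds when the given version (e.g. "v22.12.0")
# is at least REQUIRED_NODE. Assumes `sort -V` (version sort) is available.
node_version_ok() {
  version="${1#v}"   # strip the leading "v" that `node --version` prints
  lowest="$(printf '%s\n%s\n' "$REQUIRED_NODE" "$version" | sort -V | head -n 1)"
  [ "$lowest" = "$REQUIRED_NODE" ]
}

if command -v node >/dev/null 2>&1 && node_version_ok "$(node --version)"; then
  echo "Node.js $(node --version) satisfies >= $REQUIRED_NODE"
else
  echo "Node.js $REQUIRED_NODE or higher is required" >&2
fi
```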

Development

Setup

# Clone the repository
git clone <repository-url>
cd pdf-to-bloom

# Install dependencies
yarn install

Developing

# Watch the lib and cli
yarn dev:lib   # in one terminal
yarn dev:cli   # in another terminal

# Run all tests once
yarn test

# Run tests in watch mode
yarn test:watch

# Convert a PDF. When --collection is used, the languages specified in the .bloomCollection file are passed to the LLM as a hint about which languages to expect
yarn cli input.pdf # defaults to the most recently opened Bloom collection, for better language detection
yarn cli input.pdf --collection recent # explicitly use the most recently opened Bloom collection (release, alpha, beta, or betainternal)
yarn cli input.pdf --collection path/to/bloom/collection # output to a particular collection
yarn cli input.pdf --output path/to/output/directory # output to a specific directory instead of a collection

# Extract only images from a PDF
yarn cli input.pdf --target images

# Extract markdown and images from a PDF
yarn cli input.pdf --target ocr
yarn cli input.pdf --target ocr --ocr google/gemini-2.5-pro # specify an LLM to do the OCR

See ./packages/cli/README.md for details.
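
To convert several PDFs in one go, the --output form shown above can be wrapped in a loop. This is only a sketch: the function name convert_all, the DRY_RUN switch, and the pdfs/ and out/ directory names are all hypothetical, and whether the CLI creates a missing output directory is not confirmed here. By default the loop only prints the commands it would run.

```shell
# Hypothetical layout: PDFs in ./pdfs, one output directory per PDF under ./out.
# Set DRY_RUN=0 to actually run the conversions instead of printing them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: convert every *.pdf in $1, writing each result under $2/<name>.
convert_all() {
  in_dir="$1"
  out_dir="$2"
  for pdf in "$in_dir"/*.pdf; do
    [ -e "$pdf" ] || continue   # skip when the glob matches nothing
    name="$(basename "$pdf" .pdf)"
    if [ "$DRY_RUN" = "1" ]; then
      echo "yarn cli $pdf --output $out_dir/$name"
    else
      yarn cli "$pdf" --output "$out_dir/$name"
    fi
  done
}

convert_all pdfs out
```

Run from the repository root so that yarn cli resolves to the workspace script.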

Building

yarn build

License

This project is licensed under the MIT License - see the LICENSE file for details.
