
Lightweight Java library that parses Microsoft Word (.docx) files and maps OOXML structures to strongly typed Java POJOs using JAXB. Enables programmatic navigation of document structure without relying on Apache POI.


e-reznik/DocxJavaMapper


DocxJavaMapper

A lightweight Java library that parses Microsoft Word (.docx) files and maps them to Java POJOs using JAXB. Provides a simple, clean API to read and navigate DOCX document content programmatically without the heavyweight Apache POI dependency.

Usage

Parse a DOCX file and navigate its structure using strongly typed Java objects:

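// Parse the file into the object model (entry point as shown under Architecture).
// `file` refers to the .docx to parse.
DJMDocument doc = DocxJM.map(file);

// Walk the body in document order and branch on the element type.
for (BodyElement element : doc.getBody().getBodyElements()) {
    if (element instanceof DJMParagraph paragraph) {
        // Process paragraph
    } else if (element instanceof DJMTable table) {
        // Process table
    }
}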
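
To show how the two branches of that loop might be filled in, here is a minimal, self-contained sketch. It does not use the library: `Element`, `Paragraph`, and `Table` are hypothetical stand-ins for the body element types, chosen only so the example compiles on its own with Java 17.

```java
import java.util.List;

public class BodyWalkSketch {
    // Hypothetical stand-ins for the library's body element types.
    public sealed interface Element permits Paragraph, Table {}
    public record Paragraph(String text) implements Element {}
    public record Table(List<List<String>> rows) implements Element {}

    // Summarize each body element in document order.
    public static String describe(List<Element> elements) {
        StringBuilder out = new StringBuilder();
        for (Element element : elements) {
            if (element instanceof Paragraph paragraph) {
                out.append("P: ").append(paragraph.text()).append('\n');
            } else if (element instanceof Table table) {
                // Count cells across all rows of the table.
                int cells = table.rows().stream().mapToInt(List::size).sum();
                out.append("T: ").append(table.rows().size()).append(" rows, ")
                   .append(cells).append(" cells\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Element> body = List.of(
            new Paragraph("Hello"),
            new Table(List.of(List.of("a", "b"), List.of("c", "d"))));
        System.out.print(describe(body));
    }
}
```

With the real library, the same branching would operate on `DJMParagraph` and `DJMTable`; their accessor names are not shown in this README, so check the classes before relying on any particular getter.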

Tech Stack

  • Java 17
  • JAXB (Jakarta XML Binding) for XML-to-Java mapping
  • Lombok for boilerplate reduction
  • JUnit 5 for testing
  • Maven build system

Architecture

DocxJM.map(file)  ->  DJMDocument
                          |
                          +-- DJMBody
                                 |
                                 +-- DJMParagraph
                                 |      +-- DJMRun (text, formatting)
                                 |      +-- DJMHyperlink
                                 |      +-- DJMParagraphProperties
                                 |
                                 +-- DJMTable
                                        +-- DJMTableRow -> DJMTableCell
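
For background on what a mapper like this has to read: a .docx file is a ZIP package, and the main document content normally lives in the `word/document.xml` part. The sketch below builds a tiny package in memory and pulls the paragraph text out of that part. It is an illustration only and makes two assumptions: it uses the JDK's DOM parser instead of JAXB to stay dependency-free, and the class and method names (`DocxPackageSketch`, `tinyDocx`, `paragraphTexts`) are hypothetical, not part of DocxJavaMapper.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DocxPackageSketch {
    // WordprocessingML main namespace (the "w:" prefix).
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Build a minimal in-memory package containing only word/document.xml.
    // Assumes the paragraph text has no XML special characters.
    public static byte[] tinyDocx(String... paragraphs) throws IOException {
        StringBuilder xml = new StringBuilder("<w:document xmlns:w=\"" + W + "\"><w:body>");
        for (String p : paragraphs) {
            xml.append("<w:p><w:r><w:t>").append(p).append("</w:t></w:r></w:p>");
        }
        xml.append("</w:body></w:document>");
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.toString().getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Locate word/document.xml and return the text of each w:p element.
    public static List<String> paragraphTexts(byte[] docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(docx))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!e.getName().equals("word/document.xml")) continue;
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Copy the entry first so the parser cannot close the ZIP stream.
                Document dom = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(zip.readAllBytes()));
                NodeList paragraphs = dom.getElementsByTagNameNS(W, "p");
                List<String> texts = new ArrayList<>();
                for (int i = 0; i < paragraphs.getLength(); i++) {
                    // Concatenate every w:t text node inside this paragraph.
                    NodeList runs = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W, "t");
                    StringBuilder text = new StringBuilder();
                    for (int j = 0; j < runs.getLength(); j++) {
                        text.append(runs.item(j).getTextContent());
                    }
                    texts.add(text.toString());
                }
                return texts;
            }
        }
        throw new IOException("word/document.xml not found");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(paragraphTexts(tinyDocx("Hello", "World"))); // prints [Hello, World]
    }
}
```

A typed mapping replaces the manual `NodeList` loops with annotated classes, which is where the tree of `DJMDocument`, `DJMBody`, `DJMParagraph`, and `DJMRun` shown above comes in.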

Supported Elements

| Category   | Elements                                                  |
|------------|-----------------------------------------------------------|
| Structure  | Document, Body, Paragraphs, Tables                        |
| Text       | Runs, Hyperlinks                                          |
| Formatting | Bold, Italic, Underline, Strike, Font, Color, Highlight   |
| Layout     | Alignment, Numbering/Lists                                |
| Media      | Drawings, Anchors, Graphics, Pictures                     |
| Metadata   | Relationships                                             |
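
As an illustration of what the Formatting row corresponds to in the underlying XML: in WordprocessingML, run formatting is stored in a `w:rPr` element, with toggles such as `w:b`, `w:i`, and `w:strike` and valued properties such as `w:color`. The sketch below reads a few of these from a single run with the JDK's DOM parser. `RunFormatSketch` and its `RunFormat` record are hypothetical names for this example; how the library exposes the same information on `DJMRun` is not documented here and may differ.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RunFormatSketch {
    // WordprocessingML main namespace (the "w:" prefix).
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical value type; the library's own run classes may look different.
    public record RunFormat(boolean bold, boolean italic, boolean strike, String color) {}

    // A toggle such as <w:b/> is on unless its w:val attribute switches it off.
    static boolean toggle(Element rPr, String name) {
        NodeList nodes = rPr.getElementsByTagNameNS(W, name);
        if (nodes.getLength() == 0) return false;
        String val = ((Element) nodes.item(0)).getAttributeNS(W, "val");
        return !(val.equals("false") || val.equals("0") || val.equals("off"));
    }

    // Return the w:val attribute of a valued property, or null if it is absent.
    static String value(Element rPr, String name) {
        NodeList nodes = rPr.getElementsByTagNameNS(W, name);
        if (nodes.getLength() == 0) return null;
        return ((Element) nodes.item(0)).getAttributeNS(W, "val");
    }

    // Parse one <w:r> element (given as a string) into a RunFormat.
    public static RunFormat parse(String runXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element run = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(runXml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        NodeList props = run.getElementsByTagNameNS(W, "rPr");
        if (props.getLength() == 0) return new RunFormat(false, false, false, null);
        Element rPr = (Element) props.item(0);
        return new RunFormat(toggle(rPr, "b"), toggle(rPr, "i"),
                toggle(rPr, "strike"), value(rPr, "color"));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<w:r xmlns:w=\"" + W + "\"><w:rPr><w:b/><w:i w:val=\"false\"/>"
                + "<w:color w:val=\"FF0000\"/></w:rPr><w:t>Hi</w:t></w:r>";
        System.out.println(parse(xml));
    }
}
```

Note the toggle handling: an empty `<w:b/>` means bold is on, while `<w:i w:val="false"/>` explicitly turns italic off, so checking only for the element's presence would misread the second case.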

Building

mvn clean package

Testing

The project includes an integration test that parses a real DOCX file end to end. It runs in CI on every build.

mvn test

Non-goals

  • Full DOCX editing
  • Rendering or layout calculation
  • Word document creation
