Using emojis with blog plugin causes crash #5555

Description

@perpil

Context

Including emojis in blog content (in my case 💻) causes crashes during serve and build

Bug description

If you include an emoji character like 💻 anywhere in your blog content it crashes with:

lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

Using emojis on main site pages works fine. I tried to work around it by using the pymdownx.emoji extension, but I need the emoji in a code fence, and it wasn't replacing emojis in the code fence (likely by design).

Full trace:

INFO     -  DeprecationWarning: pkg_resources is deprecated as an API
              File
            "/Users/david/Documents/GitHub/mkdocs-material/material/plugins/info/plugin.py",
            line 33, in <module>
                from pkg_resources import get_distribution, working_set
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 121, in <module>
                warnings.warn("pkg_resources is deprecated as an API",
            DeprecationWarning)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('google')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('google.logging')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('mpl_toolkits')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  DeprecationWarning: Deprecated call to
            `pkg_resources.declare_namespace('ruamel')`.
            Implementing implicit namespace packages (as specified in PEP 420)
            is preferred to `pkg_resources.declare_namespace`. See
            https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2870, in activate
                declare_namespace(pkg)
              File
            "/opt/homebrew/lib/python3.10/site-packages/pkg_resources/__init__.py",
            line 2338, in declare_namespace
                warnings.warn(msg, DeprecationWarning, stacklevel=2)
INFO     -  Building documentation...
INFO     -  Cleaning site directory
INFO     -  The following pages exist in the docs directory, but are not
            included in the "nav" configuration:
              - index.md
ERROR    -  Error reading page 'blog/posts/hello-world.md': Document is empty
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 59, in fromstring
    result = getattr(etree, meth)(context)
  File "src/lxml/etree.pyx", line 3257, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1916, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1796, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1085, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 618, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 728, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 657, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Char 0x0 out of allowed range, line 1, column 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/homebrew/bin/mkdocs", line 8, in <module>
    sys.exit(cli())
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/__main__.py", line 234, in serve_command
    serve.serve(dev_addr=dev_addr, livereload=livereload, watch=watch, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/serve.py", line 83, in serve
    builder(config)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/serve.py", line 76, in builder
    build(config, live_server=live_server, dirty=dirty)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/build.py", line 308, in build
    _populate_page(file.page, config, files, dirty)
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/commands/build.py", line 177, in _populate_page
    page.markdown = config.plugins.run_event(
  File "/opt/homebrew/lib/python3.10/site-packages/mkdocs/plugins.py", line 520, in run_event
    result = method(item, **kwargs)
  File "/Users/david/Documents/GitHub/mkdocs-material/material/plugins/blog/plugin.py", line 357, in on_page_markdown
    read = readtime.of_markdown(markdown, rate)
  File "/opt/homebrew/lib/python3.10/site-packages/readtime/api.py", line 40, in of_markdown
    return utils.read_time(markdown, format='markdown', wpm=wpm)
  File "/opt/homebrew/lib/python3.10/site-packages/readtime/utils.py", line 48, in read_time
    el = pq(html)
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 212, in __init__
    elements = fromstring(context, self.parser)
  File "/opt/homebrew/lib/python3.10/site-packages/pyquery/pyquery.py", line 63, in fromstring
    result = getattr(lxml.html, meth)(context)
  File "/opt/homebrew/lib/python3.10/site-packages/lxml/html/__init__.py", line 873, in fromstring
    doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
  File "/opt/homebrew/lib/python3.10/site-packages/lxml/html/__init__.py", line 761, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty
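
The trace above points at the read-time calculation (readtime.of_markdown, then pyquery, then lxml) rather than at page rendering. As a rough illustration only, and not what the blog plugin or the readtime package actually does, a standard-library estimate that never hands the text to an HTML/XML parser could look like the sketch below. The function name estimate_read_minutes and the 265 words-per-minute default are assumptions made for this example.

```python
import math
import re

# Assumption for this sketch: an average reading speed of 265 words per minute.
WORDS_PER_MINUTE = 265


def estimate_read_minutes(markdown: str, wpm: int = WORDS_PER_MINUTE) -> int:
    """Return a rough reading time in whole minutes, never less than 1."""
    # Drop NUL characters defensively; they carry no readable content.
    text = markdown.replace("\x00", "")
    # \w+ is Unicode-aware in Python 3. Emojis are symbols, not word
    # characters, so they are simply skipped instead of being parsed.
    words = re.findall(r"\w+", text)
    return max(1, math.ceil(len(words) / wpm))
```

Calling estimate_read_minutes("Hello 💻 world") returns 1, since the two words round up to the one-minute minimum. Markdown syntax is counted as-is here, so the result is coarser than an estimate computed from rendered HTML.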

Related links

Reproduction

example.zip

Steps to reproduce

  1. mkdocs build

Note that if you delete the file docs/blog/posts/hello-world.md and build again, it works. index.md also contains an emoji: 💻
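
To narrow down which posts are affected in a larger site, a small scan for emoji-like characters may help. The sketch below assumes the trigger is any code point above U+FFFF (outside the Basic Multilingual Plane), which this report does not confirm; find_astral_chars is a hypothetical helper name, not part of MkDocs or Material.

```python
from pathlib import Path


def find_astral_chars(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, character) for each code point above U+FFFF."""
    hits = []
    # Walk every Markdown file below root in a stable order.
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for ch in line:
                # Assumption: characters outside the BMP are the ones to flag.
                if ord(ch) > 0xFFFF:
                    hits.append((str(path), lineno, ch))
    return hits
```

Running find_astral_chars("docs/blog/posts") would list each matching file with the line number and character, so the posts can be checked one at a time.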

Browser

No response

Labels

bug, resolved
