
@minrk
Contributor

@minrk minrk commented Oct 10, 2023

Displaying the full HTML document inline can cause rendering problems and can produce invalid HTML documents after export via nbconvert, mystnb, etc. In particular, my jupyter-book-built pages with %%cython --annotate output end up applying .cython to the document body because of the duplicated <body> tags, which causes the whole page to render in Courier.
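For anyone wanting to check whether an exported page is affected, here is a minimal sketch that counts `<body>` start tags using only the standard library. The names `BodyCounter` and `count_body_tags` are hypothetical helpers for illustration, not part of Cython or nbconvert.

```python
from html.parser import HTMLParser


class BodyCounter(HTMLParser):
    """Count <body> start tags (hypothetical helper, for illustration only)."""

    def __init__(self):
        super().__init__()
        self.body_count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "body":
            self.body_count += 1


def count_body_tags(html):
    """Return the number of <body> start tags found in an HTML string."""
    parser = BodyCounter()
    parser.feed(html)
    parser.close()
    return parser.body_count


# A page that embeds a second full document inside its own body.
nested = (
    "<html><body><div>"
    '<html><body class="cython"><pre>x</pre></body></html>'
    "</div></body></html>"
)
print(count_body_tags(nested))
```

A count above one on an exported page suggests that a full document was embedded inline.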

This is easier and cleaner with a full HTML parser, but I imagine you don't want that as a dependency just for this. My in-production workaround is here:

from functools import partial

from bs4 import BeautifulSoup


def _clean_annotated_html(original_clean, html):
    """Substitute inline Cython annotated output

    extracts body and style contents to avoid conflicts with page-level body/head elements.
    """
    html = original_clean(html)
    page = BeautifulSoup(html, "html.parser")
    chunks = []
    # could do this only once, but then clearing output and re-running would lose style.
    for style in page.find_all("style"):
        chunks.append(str(style))
    # add css to fix cython line padding in jupyter-book output
    chunks.append('<style type="text/css">.cython.line { padding: 0px; }</style>')
    chunks.append("".join(str(element) for element in page.find("body").contents))
    return "\n".join(chunks)


def load_ipython_extension(ip):
    cython_magics = ip.magics_manager.magics["cell"]["cython"].__self__
    original_clean = cython_magics.clean_annotated_html
    cython_magics.clean_annotated_html = partial(_clean_annotated_html, original_clean)

but in the absence of proper HTML parsing, the regular-expression splitting in this PR seems to work well enough, given how basic Cython-generated HTML is and that we only want the contents of the body plus any style tag(s).
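To illustrate the idea, here is a rough sketch of regex-based extraction. It is not the code merged in this PR: the function name `extract_inline_html` and both patterns are assumptions made for illustration, and they rely on the input being simple, well-formed HTML with at most one `<body>`.

```python
import re

# Assumed patterns for illustration; they only cope with simple, well-formed HTML.
_STYLE_RE = re.compile(r"<style\b[^>]*>.*?</style>", re.DOTALL | re.IGNORECASE)
_BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body>", re.DOTALL | re.IGNORECASE)


def extract_inline_html(html):
    """Keep only the <style> elements and the contents of <body>.

    Hypothetical helper: returns the input unchanged if no <body> is found.
    """
    match = _BODY_RE.search(html)
    if match is None:
        return html
    # No capture groups in _STYLE_RE, so findall returns the full elements.
    styles = _STYLE_RE.findall(html)
    return "\n".join(styles + [match.group(1).strip()])


sample = (
    "<!DOCTYPE html><html><head>"
    '<style type="text/css">.cython { font-family: courier; }</style>'
    '</head><body class="cython"><p>hi</p></body></html>'
)
print(extract_inline_html(sample))
```

The result can be embedded in a notebook output or exported page without introducing a second `<html>` or `<body>` element, at the cost of dropping any attributes set on the original `<body>` tag.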

Another (probably better) way would be to request this subset of HTML from the annotation process, but I couldn't see a simple way to do that.

@da-woods da-woods added this to the 3.1 milestone Nov 19, 2023
@da-woods da-woods added the Tools label Nov 19, 2023
@da-woods
Contributor

This looks reasonable to me. I'm going to target this to the next major release, rather than applying it to a point release.

Thanks.

@da-woods da-woods merged commit d5e0f3b into cython:master Nov 19, 2023