Parsing HTML table string with XLSX.read ignores <th> elements #1090

@GigiSan

I'm trying to convert an HTML table to a CSV file. I have to do the conversion server-side, so I pass the table's outerHTML as a string via an $.ajax request to the Node.js server.

It seems like the <th> tags are ignored and not transferred to the workbook. Is there a way to import them as well, or are they not handled by the library itself? I did a quick search of the codebase and couldn't find any "th", but I'm pretty new to GitHub and to how modules are structured, so I might be missing something.

The table looks something like this:

<table>
  <thead>
    <tr>
      <th>Row #</th>
      <th>Label</th>
      <th>Result</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td>SAMPLE_TEXT</td>
      <td>SUCCESS</td>
    </tr>
    <tr>
      <td>2</td>
      <td>SAMPLE_TEXT_WITH_STRING</td>
      <td>ERROR</td>
    </tr>
  </tbody>
</table>

The parsing code, which is executed server-side with Node.js, is the following:

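(...)
// `table` is the table's outerHTML string received from the client
var workbook = XLSX.read(table, {
  type: "string"
});
// Write the workbook out as CSV and resolve with the resulting Buffer
return resolve(XLSX.write(workbook, {
  bookType: "csv",
  type: "buffer"
}));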

The resulting CSV is the following:

1,SAMPLE_TEXT,SUCCESS
2,SAMPLE_TEXT_WITH_STRING,ERROR

UPDATE:
Replacing <th> with <td> works, even when the cells are inside <thead>.

CSV:

Row #,Label,Result
1,SAMPLE_TEXT,SUCCESS
2,SAMPLE_TEXT_WITH_STRING,ERROR
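
For reference, here is a minimal sketch of how that workaround could be applied to the HTML string before parsing. The helper name replaceThWithTd is hypothetical (it is not part of the library), and the regex is a simple assumption that fits plain tables like the one above; it is not a general HTML rewriter.

```javascript
// Hypothetical helper (not part of the library): rewrite <th> cells as <td>
// so that header cells are parsed like ordinary data cells.
function replaceThWithTd(html) {
  // Match "<th" or "</th" only when followed by whitespace, ">" or "/",
  // so that <thead> and </thead> are left untouched.
  return html.replace(/<(\/?)th(?=[\s>\/])/gi, "<$1td");
}

var sample = "<table><thead><tr><th>Row #</th><th>Label</th></tr></thead></table>";
console.log(replaceThWithTd(sample));
// <table><thead><tr><td>Row #</td><td>Label</td></tr></thead></table>
```

The rewritten string can then be passed to XLSX.read with type "string", as in the parsing code above. Note that this naive replacement would also touch any literal "<th " text that appears outside of real tags, which is not a concern for a table like this one.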

Still, it would be nice if <th> was parsed too. ☺
