Segmenting sentences at colons #9

@fhamborg


For example, the following snippet is extracted as a single sentence (ending at the last full stop), but it should perhaps be split at the colons.

Here they “warn” anyone who opposes his radical ideology:
Four police officers were sent to hospital:
Violence against police officers is not only acceptable with Bernie Sanders and Black Lives Matter terrorists, its necessary to create chaos and panic:
What kind of violent protest would be complete without Barack Obama’s good friend, domestic terrorist Bill Ayers:
It’s probably just a coincidence that on a day that <u><b>Obama</b></u> was too busy to attend Nancy Reagan’s funeral, he was able to address a crowd about his hate for Trump only hours before this organized chaos in Chicago:
And finally, we’re wondering how much our Organizer In Chief had to do with this Alinsky style chaos in Chicago:
Illegal aliens, paid Soros protesters, angry Black Lives Matter terrorists inspired by Obama’s race war and Bernie Sanders supporters who have absolutely no idea why they showed up, sent four innocent police officers to the hospital; prevented thousands of innocent Americans from exercising their First Amendment right.

Is this intentional? Is there a way to force splitting at colons? Besides this extreme example, I think I have come across many other cases where syntok did not split at colons.
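
In case it helps anyone with the same problem, one possible workaround is to post-process the extracted sentences and split them again wherever a colon is followed by whitespace (including a line break). This is only a minimal sketch: it assumes the sentences are already available as plain strings, and `split_at_colons` is a hypothetical helper of my own, not part of syntok's API.

```python
import re
from typing import List

# A colon followed by whitespace marks a split point. Colons inside
# tokens such as "10:30" or "http://..." are followed by a non-space
# character, so they are left alone.
_COLON_BOUNDARY = re.compile(r"(?<=:)\s+")


def split_at_colons(sentence: str) -> List[str]:
    """Split one already-segmented sentence at colons followed by whitespace.

    Hypothetical post-processing helper; it works on plain strings and
    does not depend on any particular segmenter.
    """
    parts = _COLON_BOUNDARY.split(sentence)
    # Drop empty fragments, e.g. from trailing whitespace after a final colon.
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    text = (
        "Four police officers were sent to hospital:\n"
        "And finally, we wonder who organized this: "
        "Nobody knows."
    )
    for fragment in split_at_colons(text):
        print(fragment)
```

The trade-off is that every colon followed by whitespace becomes a boundary, so a sentence that introduces a list ("We need three things: a, b and c.") is split as well. Whether that is acceptable depends on the data.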

Labels: enhancement (New feature or request)
