Hi, I did some research on the linked data nodes (application/ld+json), and I think they are the most reliable (and most structured) source of metadata for a news article. I can work on some of the extractors (image, title, publish_date) so that they give priority to the data extracted from the linked data node, if that's all right, @barrust. For now, I will submit a PR that keeps the current extraction method and additionally extracts the authors from the linked data node (schema). :)
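
To make the idea concrete, here is a rough sketch of how authors could be read from a linked data node, using only the Python standard library. The names `LdJsonCollector` and `extract_authors_from_ld_json` are hypothetical and are not existing functions in this project. The sketch assumes the node follows the schema.org convention of an `author` field holding a string, an object with a `name`, or a list of either. Nodes nested under `@graph` are not handled here.

```python
import json
from html.parser import HTMLParser


class LdJsonCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> node."""

    def __init__(self):
        super().__init__()
        self._in_ld_json = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_ld_json = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_ld_json:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_ld_json:
            self.blocks.append("".join(self._buffer))
            self._in_ld_json = False


def extract_authors_from_ld_json(html):
    """Return author names found in linked data nodes, in document order."""
    collector = LdJsonCollector()
    collector.feed(html)

    authors = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed nodes instead of failing the extraction

        # A node may hold a single object or a list of objects.
        nodes = data if isinstance(data, list) else [data]
        for node in nodes:
            if not isinstance(node, dict):
                continue
            candidates = node.get("author", [])
            if not isinstance(candidates, list):
                candidates = [candidates]
            for author in candidates:
                # Assumed shapes: {"name": "..."} or a plain string.
                name = author.get("name") if isinstance(author, dict) else author
                if isinstance(name, str) and name.strip():
                    if name.strip() not in authors:
                        authors.append(name.strip())
    return authors
```

The result of this function could then be merged with whatever the current author extractor returns, so that pages without a linked data node behave as before.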