Implemented

Support importing meta-data from arxiv.org

It would be great if you supported importing metadata from arxiv.org for selected papers, or importing directly from the HTML abstract page (example: https://arxiv.org/abs/1811.03600) instead of only from the PDF.


28 people like this idea

+1

+1

+1 


More generally, I would like to be able to import metadata from journal pages, DOI references, etc.

+1


At least on arXiv's abstract pages, the DOM elements are well annotated with classes, and regex parsing also seems to work well.

If there is some way to register a user-defined mapping (*) from page elements to metadata fields, I will contribute one.

(*) e.g. as JSON:


{
  "domain": "https?://arxiv.org/abs/*",
  "title": [ "xpath://h1[@class='title']/text()", "regex:/^Title:(.*)$/\1/" ],
  ....
}
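
For what it's worth, here is a minimal sketch of how such a mapping might be applied. Everything in it is an assumption for illustration: the `xpath:` and `regex:` rule prefixes follow the idea above but are not an existing feature, `MAPPING`, `apply_rule`, and `extract` are hypothetical names, the domain is written as a plain regular expression, and the sample markup is invented rather than copied from a real abstract page. It uses only the Python standard library, whose `xml.etree.ElementTree` supports a limited XPath subset and needs well-formed markup, so a real importer would probably need a proper HTML parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping; the "xpath:" / "regex:" rule syntax is an assumption
# modeled on the JSON idea above, not an existing feature.
MAPPING = {
    "domain": r"https?://arxiv\.org/abs/.*",
    "title": ["xpath:.//h1[@class='title']", r"regex:/^Title:\s*(.*)$/\1/"],
}


def apply_rule(rule, value):
    """Apply one rule to the current value (a parsed element or a string)."""
    kind, _, spec = rule.partition(":")
    if kind == "xpath":
        node = value.find(spec)
        # Join all text inside the matched element, including child elements.
        return None if node is None else "".join(node.itertext()).strip()
    if kind == "regex":
        # spec looks like /pattern/replacement/; this naive split assumes
        # neither part contains a slash.
        _, pattern, replacement, _ = spec.split("/")
        return re.sub(pattern, replacement, value)
    raise ValueError(f"unknown rule kind: {kind}")


def extract(url, page, mapping=MAPPING):
    """Return {field: value} for a page, or {} if the URL does not match."""
    if not re.fullmatch(mapping["domain"], url):
        return {}
    root = ET.fromstring(page)
    result = {}
    for field, rules in mapping.items():
        if field == "domain":
            continue
        value = root
        # Rules are chained: each one receives the previous rule's output.
        for rule in rules:
            value = apply_rule(rule, value)
            if value is None:
                break
        if value is not None:
            result[field] = value
    return result


# Invented, well-formed sample markup; a real abstract page would differ.
SAMPLE = (
    '<html><body><h1 class="title"><span>Title:</span> '
    "An Example Paper</h1></body></html>"
)
print(extract("https://arxiv.org/abs/1811.03600", SAMPLE))
```

Chaining the rules in list order keeps each rule simple: the XPath rule selects the element, and the regex rule strips the "Title:" label from its text.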

+1

+1

+1

+1

This is a must-have feature for researchers in machine learning, computer vision, and physics.

+1

+1



+1



+1

This is now implemented. Please remember that, because arXiv is a preprint service, the metadata you receive will generally be less complete than it would be for a paper published in a journal.
