A collection of text embeddings of the arXiv corpus by title and abstract (Q5620)
From MaRDI portal
Latest revision as of 15:03, 20 February 2025
Dataset published at Zenodo repository.
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | A collection of text embeddings of the arXiv corpus by title and abstract | Dataset published at Zenodo repository. | |
Statements
arXiv, a popular online repository, hosts numerous preprints across many scientific domains. Beyond its role in disseminating up-to-date knowledge in those domains, arXiv is an interesting complex system in its own right from a text-analytics point of view. In this repository, we provide a collection of text embeddings for (almost) all papers in the arXiv corpus, computed from their titles and abstracts, in order to capture multi-faceted characteristics of scientific knowledge.
8 August 2023
2023-08-08