📚 An intelligent toolkit to automatically parse, complete, and format academic references.

HzaCode/OneCite

OneCite

Citation & Academic Reference Toolkit

Features • Quick Start • 📖 Advanced Usage • 🤝 Contributing


OneCite is a command-line tool and Python library for citation management. It accepts DOIs, paper titles, arXiv IDs, and mixed inputs, and outputs formatted bibliographic entries.


Features

| Feature | Description |
|---|---|
| Fuzzy Matching | Match references against multiple academic databases even from incomplete or inaccurate info. |
| Multiple Formats | Input .txt/.bib → Output BibTeX, APA, or MLA. |
| 4-stage Pipeline | A 4-stage process (clean → query → validate → format) to produce consistent output. |
| Field Completion | Enrich entries by filling in missing fields like journal, volume, pages, and authors. |
| 🎓 7+ Citation Types | Handles journal articles, conference papers, books, software, datasets, theses, and preprints. |
| Domain-Aware Routing | Auto-detects content type and domain (Medical/CS/General) to pick the best data source. |
| Many Identifier Types | Accepts DOI, PMID, arXiv ID, ISBN, GitHub URL, Zenodo DOI, or plain text queries. |
| 🎛️ Interactive Mode | Manually select the correct entry when multiple potential matches are found. |
| Custom Templates | YAML-based templates to control which fields are collected and how entries are typed. |
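
To make the idea of accepting many identifier types concrete, here is a minimal sketch of how an input line could be classified with regular expressions. It is not OneCite's actual detection logic: the function name `classify_identifier`, the labels, and the patterns are assumptions made for illustration. ISBN detection is left out, and the `PMID:` prefix form is assumed.

```python
import re

# Hypothetical patterns for illustration; OneCite's real detection rules may differ.
_PATTERNS = [
    ("zenodo_doi", re.compile(r"^10\.5281/zenodo\.\d+$", re.IGNORECASE)),
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("arxiv", re.compile(r"^(arxiv:)?\d{4}\.\d{4,5}(v\d+)?$", re.IGNORECASE)),
    ("github_url", re.compile(r"^https?://github\.com/[\w.-]+/[\w.-]+/?$")),
    ("pmid", re.compile(r"^pmid:\s*\d+$", re.IGNORECASE)),  # assumed prefix form
]


def classify_identifier(text: str) -> str:
    """Return a coarse identifier label, or "text_query" if nothing matches."""
    candidate = text.strip()
    for label, pattern in _PATTERNS:
        if pattern.match(candidate):
            return label
    return "text_query"


print(classify_identifier("arXiv:2103.00020"))
```

The Zenodo pattern is checked before the generic DOI pattern because a Zenodo DOI also matches the generic one.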

🌐 Data Sources

CrossRef, Semantic Scholar, OpenAlex, PubMed, dblp, arXiv, DataCite, Zenodo, Google Books
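
Domain-aware routing across these sources could look roughly like the sketch below. The keyword lists, the source orderings, and the function names `guess_domain` and `pick_sources` are all hypothetical; the README does not describe how OneCite actually detects a domain or ranks sources.

```python
import re
from typing import List

# Hypothetical keyword lists and source orderings, for illustration only.
DOMAIN_KEYWORDS = {
    "medical": {"clinical", "patient", "cancer", "disease", "therapy"},
    "cs": {"neural", "algorithm", "learning", "network", "compiler"},
}

SOURCE_PRIORITY = {
    "medical": ["PubMed", "CrossRef", "Semantic Scholar"],
    "cs": ["dblp", "arXiv", "Semantic Scholar", "CrossRef"],
    "general": ["CrossRef", "OpenAlex", "Semantic Scholar"],
}


def guess_domain(query: str) -> str:
    """Pick the domain whose keyword set overlaps the query the most."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best_domain, best_hits = "general", 0
    for domain, keywords in DOMAIN_KEYWORDS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best_domain, best_hits = domain, hits
    return best_domain


def pick_sources(query: str) -> List[str]:
    """Return the data sources to try, in order, for this query."""
    return SOURCE_PRIORITY[guess_domain(query)]


print(pick_sources("Neural Architecture Search"))
```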

Quick Start

Install and try OneCite in a few steps.

1. Installation

# Recommended: Install from PyPI
pip install onecite

2. Create an Input File

Create a file named references.txt with your mixed-format references:

# references.txt
# Add blank lines between entries to avoid misidentification

10.1038/nature14539

Attention is all you need, Vaswani et al., NIPS 2017

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

https://github.com/tensorflow/tensorflow

10.5281/zenodo.3233118

arXiv:2103.00020

Smith, J. (2020). Neural Architecture Search. PhD Thesis. Stanford University.
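
The blank-line convention above can be pictured with a short sketch that splits raw input into one string per entry. This is an illustration, not OneCite's parser: the function name `split_entries` is hypothetical, and treating lines that start with `#` as comments is an assumption based on the sample file.

```python
from typing import List


def split_entries(raw_text: str) -> List[str]:
    """Split raw input into entries separated by one or more blank lines."""
    entries: List[str] = []
    current: List[str] = []
    for line in raw_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # assumption: '#' lines are comments, not references
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the entry collected so far.
            entries.append("\n".join(current))
            current = []
    if current:
        entries.append("\n".join(current))
    return entries


sample = "# references.txt\n\n10.1038/nature14539\n\narXiv:2103.00020\n"
print(split_entries(sample))
```

A reference that spans several lines stays together as one entry as long as no blank line interrupts it.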

3. Run OneCite

Run the command below to process your file and generate a clean .bib file.

onecite process references.txt -o results.bib --quiet
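
If you want to drive the same command from a script, a thin wrapper around `subprocess` is one option. The sketch below uses only the arguments shown above (`process`, `-o`, `--quiet`); the helper names `build_command` and `run_onecite` are hypothetical, and it assumes the `onecite` executable is on your PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(input_path: str, output_path: str, quiet: bool = True) -> List[str]:
    """Assemble the CLI call shown in this README as an argument list."""
    cmd = ["onecite", "process", str(input_path), "-o", str(output_path)]
    if quiet:
        cmd.append("--quiet")
    return cmd


def run_onecite(input_path: str, output_path: str) -> str:
    """Run the CLI and return the generated file's text."""
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(build_command(input_path, output_path), check=True)
    return Path(output_path).read_text(encoding="utf-8")


print(build_command("references.txt", "results.bib"))
```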

4. View Output

Your results.bib file now contains entries of different types.

Example output (results.bib), abbreviated:
@article{LeCun2015Deep,
  doi = "10.1038/nature14539",
  title = "Deep learning",
  author = "LeCun, Yann and Bengio, Yoshua and Hinton, Geoffrey",
  journal = "Nature",
  year = 2015,
  volume = 521,
  number = 7553,
  pages = "436-444",
  publisher = "Springer Science and Business Media LLC",
  url = "https://doi.org/10.1038/nature14539",
  type = "journal-article",
}
@inproceedings{Vaswani2017Attention,
  arxiv = "1706.03762",
  title = "Attention Is All You Need",
  author = "Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia",
  year = 2017,
  journal = "arXiv preprint",
  url = "https://arxiv.org/abs/1706.03762",
}
# ... and 5 more entries ...
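
To check quickly which entries ended up in the output, you can scan the file for entry headers. This is a minimal sketch that only reads the `@type{key,` lines and is not a full BibTeX parser; the name `list_entries` is hypothetical.

```python
import re
from typing import List, Tuple

# Matches headers such as "@article{LeCun2015Deep," at the start of a line.
ENTRY_HEADER = re.compile(r"^@(\w+)\{([^,\s]+),", re.MULTILINE)


def list_entries(bibtex_text: str) -> List[Tuple[str, str]]:
    """Return (entry_type, citation_key) pairs in file order."""
    return [(kind.lower(), key) for kind, key in ENTRY_HEADER.findall(bibtex_text)]


sample = '@article{LeCun2015Deep,\n  title = "Deep learning",\n}\n'
print(list_entries(sample))
```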

📖 Advanced Usage

🎨 Multiple Output Formats (APA, MLA)
onecite process refs.txt --output-format apa
# → LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.

onecite process refs.txt --output-format mla
# → LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep Learning." Nature 521.7553 (2015): 436-444.
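
The APA line above can be derived from the BibTeX fields shown in Quick Start. The sketch below illustrates that mapping for a journal article only. It is not OneCite's formatter: the names `apa_author` and `format_apa` are hypothetical, and it assumes authors are written as "Family, Given" and joined with " and ".

```python
def apa_author(name: str) -> str:
    """Convert "LeCun, Yann" to "LeCun, Y."."""
    family, _, given = name.partition(",")
    initials = " ".join(part[0] + "." for part in given.split())
    family = family.strip()
    return f"{family}, {initials}" if initials else family


def format_apa(entry: dict) -> str:
    """Format a journal-article entry in a simplified APA style."""
    authors = [apa_author(a) for a in entry["author"].split(" and ")]
    if len(authors) > 1:
        author_text = ", ".join(authors[:-1]) + ", & " + authors[-1]
    else:
        author_text = authors[0]
    return (
        f"{author_text} ({entry['year']}). {entry['title']}. "
        f"{entry['journal']}, {entry['volume']}({entry['number']}), {entry['pages']}."
    )


entry = {
    "author": "LeCun, Yann and Bengio, Yoshua and Hinton, Geoffrey",
    "year": 2015,
    "title": "Deep learning",
    "journal": "Nature",
    "volume": 521,
    "number": 7553,
    "pages": "436-444",
}
print(format_apa(entry))
```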

Interactive Disambiguation

For ambiguous entries, use the --interactive flag to manually select the correct match and ensure accuracy.

Command:

onecite process ambiguous.txt --interactive

Example Interaction:

Found multiple possible matches for "Deep learning Hinton":
1. Deep learning
   Authors: LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey
   Journal: Nature, 2015
   DOI: 10.1038/nature14539

2. Deep belief networks
   Authors: Hinton, Geoffrey E.
   Journal: Scholarpedia, 2009
   DOI: 10.4249/scholarpedia.5947

Please select (1-2, 0=skip): 1
Selected: Deep learning
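
The prompt in this example takes a 1-based choice, with 0 meaning skip. A minimal sketch of how such a reply could be validated follows; the function name `parse_selection` is hypothetical, and this is not OneCite's own input handling.

```python
from typing import Optional


def parse_selection(reply: str, n_candidates: int) -> Optional[int]:
    """Map a prompt reply to a zero-based index, or None when the user skips.

    Raises ValueError for non-numeric or out-of-range replies.
    """
    choice = int(reply.strip())  # int() raises ValueError on non-numeric input
    if choice == 0:
        return None
    if 1 <= choice <= n_candidates:
        return choice - 1
    raise ValueError(f"expected a number between 0 and {n_candidates}")


print(parse_selection("1", 2))
```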

🐍 Use as a Python Library

Use OneCite directly in your Python scripts.

from onecite import process_references

# A callback can be used for non-interactive selection (e.g., always choose the best match)
def auto_select_callback(candidates):
    return 0  # Index of the best candidate

result = process_references(
    input_content="Deep learning review\nLeCun, Bengio, Hinton\nNature 2015",
    input_type="txt",
    output_format="bibtex",
    interactive_callback=auto_select_callback
)

print(result['output_content'])
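
When the result should end up in a file rather than on the console, a small helper can write `result['output_content']` to disk. The sketch below relies only on the `output_content` key used in this example; the helper name `save_result` is hypothetical, and refusing to write empty output is a design choice of this sketch.

```python
from pathlib import Path


def save_result(result: dict, output_path: str) -> Path:
    """Write the formatted references from a result dict to a file."""
    content = result.get("output_content", "")
    if not content.strip():
        # Fail loudly instead of silently creating an empty file.
        raise ValueError("no formatted output to write")
    path = Path(output_path)
    path.write_text(content, encoding="utf-8")
    return path
```

With the `result` from the example above, you would call `save_result(result, "results.bib")`.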

🤝 Contributing

Contributions are always welcome! Please see CONTRIBUTING.md for development guidelines and instructions on how to submit a pull request.

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

Disclosure

Development was assisted by standard productivity tools including Generative AI for streamlining implementation details. All output was verified and integrated by the maintainer, and no LLMs are used by the package at runtime.