Review Buddy#

Overview#


Review Buddy is a production-ready toolkit for automating systematic literature reviews. Its three-script workflow covers multi-source search, abstract-based filtering (keyword or AI-powered), and PDF download with multiple fallback strategies, so that screening starts from a deduplicated, pre-filtered set of papers.

Perfect for:

  • Systematic reviews following PRISMA guidelines

  • Meta-analyses requiring comprehensive paper collection

  • Reproducible research with documented workflows

  • Large-scale literature reviews across multiple databases

Key Features#

Multi-Source Search (5 Databases)#

  • Scopus: Comprehensive coverage of peer-reviewed literature

  • PubMed: Biomedical and life sciences focus with PMC access

  • arXiv: Pre-prints and cutting-edge research

  • Google Scholar: Broadest academic coverage

  • IEEE Xplore: Engineering and computer science

Smart Filtering#

  • Keyword-Based Filtering: Rule-based abstract screening with customizable criteria

  • AI-Powered Filtering (NEW): Local LLM-based filtering using Ollama

  • Built-in Filters: Non-English, no abstract, animal studies, reviews, epilepsy, BCI

  • Custom Filters: Easy to add domain-specific exclusion criteria

  • Manual Review Queue: Papers flagged for manual verification (AI mode)
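The keyword-based screening described above can be pictured with a minimal sketch. It is not the code in src/abstract_filter.py: the `Paper` record, the `FILTERS` table, and the function names are hypothetical and only illustrate how rule-based exclusion on abstracts might work, with removed papers grouped by the filter that caught them.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical record; the real model in src/models.py may differ."""
    title: str
    abstract: str


# Illustrative exclusion rules (assumed): filter name -> trigger keywords.
FILTERS = {
    "animal_studies": ["mice", "rat", "rats", "rodent"],
    "reviews": ["systematic review", "meta-analysis", "literature review"],
}


def matched_filter(paper, filters=FILTERS):
    """Return the name of the first filter that excludes the paper, else None."""
    if not paper.abstract.strip():
        return "no_abstract"
    text = paper.abstract.lower()
    for name, keywords in filters.items():
        for keyword in keywords:
            # Word boundaries stop "rat" from matching inside "accurate".
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return name
    return None


def apply_filters(papers, filters=FILTERS):
    """Split papers into a kept list and a dict of removed papers per filter."""
    kept, removed = [], defaultdict(list)
    for paper in papers:
        name = matched_filter(paper, filters)
        if name is None:
            kept.append(paper)
        else:
            removed[name].append(paper)
    return kept, dict(removed)


papers = [
    Paper("A", "We trained mice on a maze task."),
    Paper("B", "An accurate strategy for EEG decoding in humans."),
    Paper("C", ""),
]
kept, removed = apply_filters(papers)
```

Adding a domain-specific criterion in this sketch means adding one entry to `FILTERS`; keeping the removed papers per filter makes each exclusion easy to audit afterwards.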

Intelligent Paper Downloading#

  • 10+ Download Strategies: Multiple fallback methods for PDF retrieval

  • Priority Order: Direct PDF → arXiv → bioRxiv/medRxiv → Unpaywall → PMC → Publisher patterns → Crossref → HTML scraping → Sci-Hub (optional)

  • Open Access Focus: Unpaywall, arXiv, PubMed Central (US & Europe)

  • Publisher Patterns: MDPI, Frontiers, Nature, IEEE, ScienceDirect, Springer, PLOS

  • 70-90% Success Rate (depends on source mix and Sci-Hub usage)
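The priority order above amounts to a fallback chain: each strategy is tried in turn until one returns a usable PDF. The sketch below shows that pattern under stated assumptions; `download_with_fallback` and the stub strategies are hypothetical names, and the stubs return canned bytes instead of contacting any service.

```python
from typing import Callable, Optional

# A strategy takes a paper record and returns PDF bytes, or None on failure.
Strategy = Callable[[dict], Optional[bytes]]


def download_with_fallback(paper, strategies):
    """Try each (name, strategy) pair in order; return the first valid PDF."""
    for name, strategy in strategies:
        try:
            data = strategy(paper)
        except Exception:
            # One failing source should not abort the whole chain.
            continue
        # PDF files start with the "%PDF" magic bytes; this rejects HTML
        # error pages that some servers return with a 200 status.
        if data and data.startswith(b"%PDF"):
            return name, data
    return None


# Stub strategies standing in for real network calls (assumed behaviour).
def direct_pdf(paper):
    raise ConnectionError("host unreachable")


def arxiv(paper):
    return b"<html>Not found</html>"


def unpaywall(paper):
    return b"%PDF-1.7 fake content"


chain = [("direct", direct_pdf), ("arxiv", arxiv), ("unpaywall", unpaywall)]
result = download_with_fallback({"doi": "10.1000/example"}, chain)
```

Because the chain is an ordered list, optional strategies such as Sci-Hub can be left out simply by not appending them.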

Export Formats#

  • BibTeX: For LaTeX and reference managers

  • RIS: For EndNote, Mendeley, Zotero

  • CSV: For data analysis and spreadsheets
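To make the export formats concrete, here is a rough sketch of how one record could be serialized to BibTeX and RIS. The dictionary fields and function names are assumptions for illustration, not Review Buddy's actual exporter, and the sketch does no escaping of special characters.

```python
def to_bibtex(paper, key):
    """Render a paper dict as a BibTeX @article entry (no escaping applied)."""
    authors = " and ".join(paper["authors"])
    return (
        f"@article{{{key},\n"
        f"  title = {{{paper['title']}}},\n"
        f"  author = {{{authors}}},\n"
        f"  year = {{{paper['year']}}},\n"
        f"  doi = {{{paper['doi']}}}\n"
        f"}}"
    )


def to_ris(paper):
    """Render a paper dict as an RIS record: two-letter tag, two spaces, '- '."""
    lines = ["TY  - JOUR", f"TI  - {paper['title']}"]
    lines += [f"AU  - {author}" for author in paper["authors"]]
    lines += [f"PY  - {paper['year']}", f"DO  - {paper['doi']}", "ER  - "]
    return "\n".join(lines)


# Hypothetical example record.
paper = {
    "title": "Deep learning for EEG decoding",
    "authors": ["Doe, Jane", "Roe, Richard"],
    "year": 2021,
    "doi": "10.1000/example",
}
bib = to_bibtex(paper, "doe2021")
ris = to_ris(paper)
```

A CSV export of the same records could be written with the standard library's `csv.DictWriter`.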

Smart Deduplication#

Automatic removal of duplicate papers across sources:

  • Title matching with fuzzy logic

  • DOI comparison

  • PubMed prioritization: when duplicates are found, the PubMed record is preferred (better download success)

  • Source tracking for transparency
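The criteria listed above can be combined in a few lines. The following is a simplified sketch, not the project's implementation: the record fields, the 0.9 similarity threshold, and the use of `difflib.SequenceMatcher` for fuzzy title matching are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize_title(title):
    """Lowercase and collapse punctuation so trivial differences are ignored."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def is_duplicate(a, b, threshold=0.9):
    """Compare DOIs when both exist; otherwise fall back to fuzzy titles."""
    if a.get("doi") and b.get("doi"):
        return a["doi"].lower() == b["doi"].lower()
    ratio = SequenceMatcher(
        None, normalize_title(a["title"]), normalize_title(b["title"])
    ).ratio()
    return ratio >= threshold


def deduplicate(papers):
    """Keep one record per paper, preferring PubMed and tracking all sources."""
    # Stable sort: PubMed records come first, so they are the ones kept.
    ordered = sorted(papers, key=lambda p: p["source"] != "PubMed")
    unique = []
    for paper in ordered:
        match = next((u for u in unique if is_duplicate(u, paper)), None)
        if match is None:
            unique.append({**paper, "sources": [paper["source"]]})
        else:
            match["sources"].append(paper["source"])
    return unique


papers = [
    {"title": "Deep Learning for EEG Decoding", "doi": "10.1000/xyz", "source": "Scopus"},
    {"title": "Deep learning for EEG decoding.", "doi": None, "source": "arXiv"},
    {"title": "Deep Learning for EEG Decoding", "doi": "10.1000/XYZ", "source": "PubMed"},
    {"title": "Graph networks in radiology", "doi": "10.1000/abc", "source": "IEEE Xplore"},
]
unique = deduplicate(papers)
```

The pairwise comparison is quadratic in the number of papers, which is acceptable for a sketch; a larger corpus would likely index by DOI first.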

Three-Step Workflow#

Review Buddy uses a simple, production-ready workflow:

1️⃣ Fetch Metadata#

python 01_fetch_metadata.py

Searches 5 databases, deduplicates, exports BibTeX/RIS/CSV.

2️⃣ Filter Abstracts (Optional)#

python 02_abstract_filter.py        # Keyword-based
# OR
python 02_abstract_filter_AI.py     # AI-powered (Ollama)

Excludes unwanted papers based on abstract content.
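For the AI-powered variant, the general idea can be sketched as follows. This is an illustration rather than the contents of src/llm_client.py: the prompt wording, the function names, and the `llama3` model name are assumptions. The request targets Ollama's local `/api/generate` endpoint on its default port, and any reply that is not a clear decision is routed to manual review.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(abstract, criterion):
    """Compose a screening prompt that asks for a one-word decision."""
    return (
        "You are screening papers for a systematic review.\n"
        f"Exclusion criterion: {criterion}\n"
        f"Abstract: {abstract}\n"
        "Answer with exactly one word: EXCLUDE or KEEP."
    )


def parse_decision(reply):
    """Map the model reply to 'exclude', 'keep', or 'manual_review'."""
    word = reply.strip().upper().rstrip(".")
    if word in {"EXCLUDE", "KEEP"}:
        return word.lower()
    # Anything ambiguous goes to the manual review queue.
    return "manual_review"


def ask_ollama(prompt, model="llama3"):
    """Send one non-streaming generation request (needs a running server)."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

With a server running, a single abstract could be screened with `parse_decision(ask_ollama(build_prompt(abstract, "animal studies")))`.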

3️⃣ Download PDFs#

python 03_download_papers.py

Tries 10+ strategies to download full-text papers.

Quick Start Example#

1. Configure your search (01_fetch_metadata.py):

QUERY = "machine learning AND healthcare"
YEAR_FROM = 2020
MAX_RESULTS_PER_SOURCE = 50

2. Set up API keys (.env):

SCOPUS_API_KEY=your_key_here
PUBMED_EMAIL=your.email@example.com

3. Run the scripts:

python 01_fetch_metadata.py     # Search → results/references.bib
python 02_abstract_filter.py    # Filter → results/references_filtered.bib
python 03_download_papers.py    # Download → results/pdfs/

Results:

  • results/papers.csv - All paper metadata

  • results/references.bib - Bibliography from the original search

  • results/references_filtered.bib - Filtered bibliography (if using step 2)

  • results/pdfs/ - Downloaded papers

Architecture#

review_buddy/
├── 01_fetch_metadata.py         # Main search script
├── 02_abstract_filter.py        # Keyword-based filtering
├── 02_abstract_filter_AI.py     # AI-powered filtering (new!)
├── 03_download_papers.py        # PDF downloader
├── .env.example                 # Configuration template
├── query.txt                    # Optional: External query file
├── src/
│   ├── config.py               # Config management
│   ├── models.py               # Paper data model
│   ├── paper_searcher.py       # Search coordinator
│   ├── abstract_filter.py      # Keyword filtering logic
│   ├── ai_abstract_filter.py   # AI filtering logic (new!)
│   ├── llm_client.py           # Ollama client (new!)
│   ├── utils.py                # Helper functions
│   └── searchers/              # Source implementations
│       ├── scopus_searcher.py
│       ├── pubmed_searcher.py
│       ├── arxiv_searcher.py
│       ├── scholar_searcher.py
│       ├── ieee_searcher.py
│       └── paper_downloader.py # Download strategies
├── docs/                        # Comprehensive guides
│   ├── QUERY_SYNTAX.md         # Query building guide
│   ├── FILTER_WORKFLOW_EXAMPLE.md
│   ├── DOWNLOADER_GUIDE.md
│   └── DEDUPLICATION.md
└── results/                     # Output (auto-created)
    ├── papers.csv
    ├── references.bib
    ├── papers_filtered.csv     # After filtering
    ├── references_filtered.bib
    ├── filtered_out/           # Papers removed by each filter
    └── pdfs/

When to Use Review Buddy vs Findpapers#

| Feature | Review Buddy | Findpapers |
| --- | --- | --- |
| Workflow | Script-based (3 steps) | Configuration-based (YAML) |
| Filtering | Keyword + AI options | Limited post-search filtering |
| Learning Curve | Easy (Python scripts) | Easy (YAML config) |
| Customization | Highly customizable filters | Limited to config options |
| Integration | Easy to integrate in pipelines | Standalone tool |
| Best For | Systematic reviews, large projects | Quick searches, one-off reviews |
| Download Success | 70-90% (10+ strategies) | 50-70% (fewer strategies) |

Repository Information#

GitHub: leonardozaggia/review_buddy

The Review Buddy codebase is actively maintained and welcomes contributions from the research community.

Next Steps#

Continue to the next sections to learn:

  1. Installation: Setting up Review Buddy and obtaining API keys

  2. Usage Examples: Complete workflows and code examples

  3. Advanced Features: Custom searchers, batch processing, and more