Paper-Finder#

GUI-based tool for discovering, reviewing, and exporting academic papers across multiple databases. Paper-Finder Interface


Overview#

Paper-Finder is a lightweight, browser-based graphical user interface (GUI) that simplifies the process of searching for academic literature across multiple sources. It is designed for researchers who want a fast, transparent, and reproducible way to explore papers without relying on command-line tools or proprietary platforms.

The tool combines a local Flask backend (for reliable API access) with a clean HTML/JavaScript frontend, making it suitable for office networks, teaching environments, and collaborative research workflows.

Paper-Finder is especially useful as an early-stage literature discovery tool, before moving into systematic screening or automated review pipelines.


Key Features#

✅ Implemented Features#

Graphical User Interface#

  • Runs entirely in the browser

  • No command-line interaction required after setup

  • Clear visual presentation of results

Open-Access Filtering#

  • Restrict results to confirmed open-access papers (currently arXiv)

Selection & Batch Actions#

  • Per-paper checkboxes

  • “Select all” toggle with correct indeterminate state

  • Batch export and download actions

Export Options#

  • Export selected papers to CSV

  • Open all available PDF links in browser tabs

  • Download selected papers as a ZIP archive, including:

    • Metadata CSV

    • Links file

    • PDFs (where available, e.g., arXiv)

Office-Network Friendly#

  • PubMed and arXiv requests are routed through a local Flask server

  • Avoids common corporate firewall issues


Main Capabilities#

Unified Discovery Interface#

Search multiple academic sources from a single screen.

Transparent Metadata Handling#

Titles, authors, year, venue, source, open-access status, and links are clearly exposed.

Manual Review Support#

Designed for human-in-the-loop exploration rather than opaque automation.

Reproducible Outputs#

CSV and ZIP exports can be version-controlled or shared with collaborators.


Getting Started#

Requirements#

  • Python 3.9+

  • Internet access

  • Modern web browser (Chrome, Firefox, Edge)

Installation#

Clone the repository:

git clone https://github.com/sanahassanimam/Automating-the-Information-Extraction
cd Automating-the-Information-Extraction/Paper-Finder

Install Python dependencies:

pip install flask requests

Start the local server:

python server.py

Open your browser and navigate to:

http://127.0.0.1:5174/search.html

That’s it — no additional setup required.


How It Works#

Architecture#

Frontend

  • Plain HTML, CSS, and JavaScript

  • No frameworks or build steps

  • Runs entirely in the browser

Backend

  • Flask server (server.py)

  • Proxies PubMed and arXiv requests

  • Generates ZIP exports server-side

Data Flow#

  1. User submits a query via the GUI

  2. Flask backend fetches results from PubMed and/or arXiv

  3. Link-only sources generate direct search URLs

  4. Results are merged and displayed in the browser

  5. User selects papers and exports data as needed


Exported Data#

CSV Export#

The CSV file includes:

  • Title

  • Authors

  • Publication year

  • Venue / journal

  • DOI (if available)

  • Source database

  • Open-access status

  • PDF URL

  • Landing page URL

This format is suitable for:

  • Spreadsheet analysis

  • Screening tools

  • Downstream automation pipelines

ZIP Export#

The ZIP archive contains:

  • selected_papers.csv

  • links.txt (human-readable overview)

  • pdfs/ directory (when PDFs are available)

  • zip_log.txt (summary of downloaded PDFs)


Comparison with Other Tools#

Feature

Paper-Finder

CLI / Python Tools

Publisher Websites

Interface

GUI (Browser)

Command-line

Web

Ease of Use

⭐⭐⭐⭐⭐

⭐⭐

⭐⭐⭐

Transparency

⭐⭐⭐⭐⭐

⭐⭐⭐⭐

⭐⭐

Automation

⭐⭐⭐

⭐⭐⭐⭐⭐

Reproducibility

⭐⭐⭐⭐

⭐⭐⭐⭐⭐

Setup Overhead

⭐⭐⭐⭐

⭐⭐

⭐⭐⭐⭐⭐


Use Cases#

Paper-Finder is ideal for:

  • Exploratory literature searches

  • Early-stage systematic reviews

  • Teaching literature search strategies

  • Collaborative screening sessions

  • Non-technical researchers

  • Office or university network environments

  • Preparing datasets for downstream review tools


Limitations & Notes#

  • Google Scholar, IEEE Xplore, and Scopus are link-only (no scraping)

  • PubMed open-access status is not resolved automatically

  • PDF downloads depend on source availability (arXiv works best)

  • Not intended as a full systematic review automation engine