
AI-Powered Research Paper Scraper & Evaluator

AI/ML
Research Automation
LLM
Web Scraping
Data Science

Overview

This project automates much of the work involved in a systematic literature review. It searches academic databases such as IEEE for specified keywords, checks relevance in two stages (first against the abstract, then against the full text), and extracts metadata, including author details, along with the full PDF. An LLM then evaluates each paper against predefined criteria, and both the raw data and the AI-generated insights are stored in a central database.
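As a rough illustration of how raw paper data and AI-generated insights might sit side by side, here is a minimal storage sketch. It is not the project's actual schema: the table names, column names, and helper functions are all hypothetical, and it uses Python's built-in sqlite3 with an in-memory database only so the example runs without a server, whereas the project itself stores data in PostgreSQL.

```python
import json
import sqlite3

# Hypothetical schema: one table for raw paper data, one for LLM output.
SCHEMA = """
CREATE TABLE papers (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,      -- JSON-encoded list of author names
    abstract TEXT NOT NULL,
    pdf_path TEXT               -- NULL until the PDF has been downloaded
);
CREATE TABLE evaluations (
    paper_id INTEGER NOT NULL REFERENCES papers(id),
    summary TEXT NOT NULL,
    meets_criteria INTEGER NOT NULL  -- 0 or 1
);
"""


def open_store():
    """Create an in-memory database (stand-in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def save_paper(conn, title, authors, abstract, pdf_path=None):
    """Insert the raw scraped data and return the new paper id."""
    cur = conn.execute(
        "INSERT INTO papers (title, authors, abstract, pdf_path) "
        "VALUES (?, ?, ?, ?)",
        (title, json.dumps(authors), abstract, pdf_path),
    )
    return cur.lastrowid


def save_evaluation(conn, paper_id, summary, meets_criteria):
    """Attach an AI-generated evaluation to an existing paper."""
    conn.execute(
        "INSERT INTO evaluations (paper_id, summary, meets_criteria) "
        "VALUES (?, ?, ?)",
        (paper_id, summary, int(meets_criteria)),
    )


def load_insights(conn):
    """Return (title, summary, meets_criteria) for every evaluated paper."""
    rows = conn.execute(
        "SELECT p.title, e.summary, e.meets_criteria "
        "FROM papers p JOIN evaluations e ON e.paper_id = p.id "
        "ORDER BY p.id"
    ).fetchall()
    return [(title, summary, bool(flag)) for title, summary, flag in rows]
```

Keeping evaluations in their own table means a paper can be re-evaluated against new criteria without touching the scraped data.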

The Solution

I developed a multi-stage scraping and evaluation engine. In the first stage, Selenium navigates research portals and extracts abstracts. In the second, an LLM-based filter decides whether a paper warrants a full download. If it does, the system retrieves the PDF, parses its content, and runs a deeper LLM evaluation that summarizes the findings and assesses quality. All data is structured and stored in a PostgreSQL database for retrieval and analysis.
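The staged flow described above can be sketched as a small gating loop. This is an illustrative outline rather than the project's code: the class and function names are hypothetical, the Selenium scraping, PDF parsing, and LLM calls are passed in as plain callables, and `keyword_abstract_filter` is a simple keyword check standing in for the LLM-based filter.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Candidate:
    """A paper found by the scraper (hypothetical fields)."""
    title: str
    abstract: str
    pdf_url: str


@dataclass
class Evaluation:
    """Result of the deep evaluation stage (hypothetical fields)."""
    title: str
    summary: str
    quality_score: float


def run_pipeline(
    candidates: Iterable[Candidate],
    abstract_filter: Callable[[str], bool],
    fetch_full_text: Callable[[str], Optional[str]],
    evaluate: Callable[[Candidate, str], Evaluation],
) -> List[Evaluation]:
    """Run each candidate through the filter, download, and evaluation stages."""
    results = []
    for paper in candidates:
        # Cheap check on the abstract before any download happens.
        if not abstract_filter(paper.abstract):
            continue
        # Retrieve and parse the PDF; None signals a failed download.
        full_text = fetch_full_text(paper.pdf_url)
        if full_text is None:
            continue
        # Deep evaluation on the full text.
        results.append(evaluate(paper, full_text))
    return results


def keyword_abstract_filter(abstract: str) -> bool:
    """Stand-in for the LLM filter: a plain keyword match."""
    return "literature review" in abstract.lower()
```

Passing the stages in as callables keeps the expensive steps (PDF download and full-text evaluation) behind the abstract filter, and lets each stage be swapped for a stub when testing.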

Tools Used

Python
Selenium
LLM Integration (GPT/Claude)
PostgreSQL
PDF Processing
REST APIs