Shandu: Advanced AI Research System for Comprehensive Knowledge Synthesis

Shandu is an AI research assistant that performs in-depth, multi-source research on a topic. It combines language models, web scraping, and iterative exploration to generate well-structured reports with citations.

🔍 What is Shandu?

Shandu is an LLM-powered research system that automates the research process, from initial query clarification through content analysis to report generation. Built on LangGraph's state-based workflow, it recursively explores a topic, evaluating sources, extracting content, and synthesizing findings along the way.

Key Use Cases

Academic Research: Generate literature reviews, background information, and complex topic analyses
Market Intelligence: Analyze industry trends, competitor strategies, and market opportunities
Content Creation: Produce well-researched articles, blog posts, and reports with proper citations
Technology Exploration: Track emerging technologies, innovations, and technical developments
Policy Analysis: Research regulations, compliance requirements, and policy implications
Competitive Analysis: Compare products, services, and company strategies across industries

⚙️ How Shandu Works

flowchart TB
    subgraph Input
        Q[User Query]
        B[Breadth Parameter]
        D[Depth Parameter]
    end

    DR[Deep Research] -->
    SQ[SERP Queries] -->
    PR[Process Results]

    subgraph Results[Results]
        direction TB
        NL((Learnings))
        ND((Directions))
    end

    PR --> NL
    PR --> ND

    DP{depth > 0?}

    RD["Next Direction:
    - Prior Goals
    - New Questions
    - Learnings"]

    MR[Markdown Report]

    %% Main Flow
    Q & B & D --> DR

    %% Results to Decision
    NL & ND --> DP

    %% Circular Flow
    DP -->|Yes| RD
    RD -->|New Context| DR

    %% Final Output
    DP -->|No| MR

    %% Styling
    classDef input fill:#7bed9f,stroke:#2ed573,color:black
    classDef process fill:#70a1ff,stroke:#1e90ff,color:black
    classDef recursive fill:#ffa502,stroke:#ff7f50,color:black
    classDef output fill:#ff4757,stroke:#ff6b81,color:black
    classDef results fill:#a8e6cf,stroke:#3b7a57,color:black

    class Q,B,D input
    class DR,SQ,PR process
    class DP,RD recursive
    class MR output
    class NL,ND results
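
The loop in the flowchart can be sketched in a few lines of Python. This is a minimal illustration, not Shandu's actual implementation: `fake_search`, `process_results`, `deep_research`, `to_markdown`, and `ResearchState` are hypothetical names, the search and analysis steps are stubbed with placeholder strings, and the rule of following only the first new direction is an assumption made to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResearchState:
    """Accumulated output of the loop (hypothetical structure)."""
    learnings: list[str] = field(default_factory=list)
    directions: list[str] = field(default_factory=list)
    rounds: int = 0


def fake_search(query: str, breadth: int) -> list[str]:
    """Stand-in for the SERP query step: returns `breadth` placeholder results."""
    return [f"result {i} for {query!r}" for i in range(breadth)]


def process_results(results: list[str]) -> tuple[list[str], list[str]]:
    """Stand-in for LLM analysis: one learning and one direction per result."""
    learnings = [f"learning from {r}" for r in results]
    directions = [f"open question about {r}" for r in results]
    return learnings, directions


def deep_research(query: str, breadth: int, depth: int) -> ResearchState:
    """Run search rounds until the depth budget is used up."""
    state = ResearchState()
    current_query = query
    while True:
        results = fake_search(current_query, breadth)
        learnings, directions = process_results(results)
        state.learnings.extend(learnings)
        state.directions = directions
        state.rounds += 1
        depth -= 1
        # "depth > 0?" decision node: stop when the budget is exhausted.
        if depth <= 0 or not directions:
            break
        # "Next Direction": fold a new question into the next round's context.
        current_query = f"{query} (next: {directions[0]})"
    return state


def to_markdown(query: str, state: ResearchState) -> str:
    """Final output node: render the learnings as a Markdown bullet list."""
    lines = [f"# {query}", ""]
    lines += [f"- {item}" for item in state.learnings]
    return "\n".join(lines)
```

Calling `deep_research("solid-state batteries", breadth=3, depth=2)` in this sketch runs two rounds and collects three learnings per round.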

🌟 Key Features

Intelligent State-based Workflow: Leverages LangGraph for a structured, step-by-step research process with clear state transitions
Iterative Deep Exploration: Recursively explores topics with dynamic depth and breadth, adapting to information discovered
Multi-source Information Synthesis: Analyzes data from search engines, web content, and structured knowledge bases
Enhanced Web Scraping: Features dynamic JS rendering, content extraction, and ethical scraping practices
Smart Source Evaluation: Automatically assesses source credibility, relevance, and information value
Content Analysis Pipeline: Uses advanced NLP to extract key information, identify patterns, and synthesize findings
Comprehensive Report Generation: Creates detailed, well-structured reports with proper citations and thematic organization
Parallel Processing Architecture: Implements concurrent operations for efficient multi-query execution
Adaptive Search Strategy: Dynamically adjusts search queries based on discovered information and knowledge gaps
Full Citation Management: Properly attributes all sources with formatted citations in multiple styles

🚀 Installation

# Install from PyPI
pip install shandu

# Install from source
git clone https://github.com/jolovicdev/shandu.git
cd shandu
pip install -e .

🏁 Quick Start

# Configure API settings (supports various LLM providers)
shandu configure

# Run comprehensive research
shandu research "Your research query" --depth 2 --breadth 4 --output report.md

# Quick AI-powered search with web scraping (you don't need Perplexity!)
shandu aisearch "Who is the current sitting president of the United States?" --detailed

# Basic multi-engine search
shandu search "Your search query"

📚 Detailed Usage

Research Command

# --depth:   how deep to explore (1-5, default: 2)
# --breadth: how many parallel queries (2-10, default: 4)
# --output:  save to file instead of terminal
# --verbose: show detailed progress
shandu research "Your research query" \
    --depth 3 \
    --breadth 5 \
    --output report.md \
    --verbose

Example Reports

You can find example reports in the examples directory:

The Intersection of Quantum Computing, Synthetic Biology, and Climate Modeling

shandu research "The Intersection of Quantum Computing, Synthetic Biology, and Climate Modeling" --depth 3 --breadth 3 --output examples/o3-mini-high.md

AI Search Command with Scraping

# --engines:     comma-separated list of search engines
# --max-results: maximum number of results to return
# --output:      save to file instead of terminal
# --detailed:    generate a detailed analysis
shandu aisearch "Your search query" \
    --engines "google,duckduckgo" \
    --max-results 15 \
    --output results.md \
    --detailed

💻 Python API

from shandu.agents import ResearchGraph
from langchain_openai import ChatOpenAI

# Initialize with custom LLM if desired
llm = ChatOpenAI(model="gpt-4")

# Initialize the research graph
researcher = ResearchGraph(
    llm=llm,
    temperature=0.5
)

# Perform deep research
results = researcher.research_sync(
    query="Your research query",
    depth=3,       # How deep to go with recursive research
    breadth=4,     # How many parallel queries to explore
    detail_level="high"
)

# Print or save results
print(results.to_markdown())

🧩 Advanced Architecture

Shandu implements a sophisticated multi-stage research pipeline:

Query Clarification: Interactive questions to understand research needs
Research Planning: Strategic planning for comprehensive topic coverage
Iterative Exploration:
- Smart query generation based on knowledge gaps
- Multi-engine search with parallelized execution
- Relevance filtering of search results
- Intelligent web scraping with content extraction
- Source credibility assessment
- Information analysis and synthesis
- Reflection on findings to identify gaps
Report Generation:
- Theme extraction and organization
- Multi-step report enhancement
- Citation formatting and management
- Section expansion for comprehensive coverage

🔌 Supported Search Engines & Sources

Google Search
DuckDuckGo
Wikipedia
ArXiv (academic papers)
Custom search engines can be added

📊 Technical Capabilities

Dynamic JS Rendering: Handles JavaScript-heavy websites
Content Extraction: Identifies and extracts main content from web pages
Parallel Processing: Concurrent execution of searches and scraping
Caching: Efficient caching of search results and scraped content
Rate Limiting: Respectful access to web resources
Robots.txt Compliance: Ethical web scraping practices
Flexible Output Formats: Markdown, JSON, plain text
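
Two of these capabilities, robots.txt compliance and rate limiting, can be illustrated with the Python standard library alone. The sketch below is an assumption about how such a gate might look, not Shandu's code: `PoliteFetchGate` and its parameters are hypothetical, the robots.txt text is passed in as a string rather than downloaded, and the limiter simply enforces a minimum interval between requests to the same host.

```python
import time
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


class PoliteFetchGate:
    """Hypothetical helper: checks robots.txt rules and spaces out requests per host."""

    def __init__(self, robots_txt, user_agent, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._parser = RobotFileParser()
        self._parser.parse(robots_txt.splitlines())  # parse rules from text
        self._user_agent = user_agent
        self._min_interval = min_interval
        self._clock = clock    # injectable so the gate can be tested without waiting
        self._sleep = sleep
        self._next_free = {}   # host -> earliest time the next request may start

    def allowed(self, url: str) -> bool:
        """Return True if the robots.txt rules permit fetching this URL."""
        return self._parser.can_fetch(self._user_agent, url)

    def wait_turn(self, url: str) -> float:
        """Sleep until the host's minimum interval has passed; return the delay used."""
        host = urlparse(url).netloc
        now = self._clock()
        delay = max(0.0, self._next_free.get(host, now) - now)
        if delay > 0:
            self._sleep(delay)
        self._next_free[host] = now + delay + self._min_interval
        return delay
```

A scraper would call `allowed(url)` before each request and `wait_turn(url)` immediately before fetching; caching could wrap the fetch itself, for example with a dictionary keyed by URL.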

📜 License & INFO

Note: this is just an idea, and I need contributors. This is a project where I'm taking a deep dive into LangChain and LangGraph. This project is licensed under the MIT License; see the LICENSE file for details.

