AI-Driven Research Assistant

Overview

This is an AI-powered research assistant that uses multiple specialized agents for tasks such as data analysis, visualization, and report generation. It is built on LangChain, OpenAI's GPT models, and LangGraph, and combines several agent architectures (supervisor, chain-of-thought, and critic agents) to handle multi-step research workflows.

Key Features

Hypothesis generation and validation
Data processing and analysis
Visualization creation
Web search and information retrieval
Code generation and execution
Report writing
Quality review and revision
Diverse Architectural Integration:
- Supervisor agents for overseeing the analysis process
- Chain-of-thought reasoning for complex problem-solving
- Critic agents for quality assurance and error checking
Innovative Note Taker Agent:
- Continuously records the current state of the project
- Provides a more efficient alternative to transmitting complete historical information
- Enhances the system's ability to maintain context and continuity across different analysis stages
Adaptive Workflow: Dynamically adjusts its analysis approach based on the data and task at hand
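
To illustrate the Note Taker idea described above, here is a minimal sketch of a compact project record that is passed between stages in place of the full message history. It is not the project's actual note_agent: the ProjectNotes class, its fields, and its method names are hypothetical, and the recorded values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProjectNotes:
    """Compact project state kept in place of the full message history (hypothetical)."""
    task: str
    hypothesis: str = ""
    completed_steps: List[str] = field(default_factory=list)
    key_findings: List[str] = field(default_factory=list)

    def record(self, step: str, finding: Optional[str] = None) -> None:
        """Log a finished step and, optionally, one finding worth keeping."""
        self.completed_steps.append(step)
        if finding:
            self.key_findings.append(finding)

    def as_context(self) -> str:
        """Render the notes as a short text block for the next agent's prompt."""
        return "\n".join([
            f"Task: {self.task}",
            f"Hypothesis: {self.hypothesis or '(none yet)'}",
            "Completed: " + (", ".join(self.completed_steps) or "(nothing yet)"),
            "Findings: " + ("; ".join(self.key_findings) or "(none yet)"),
        ])


# Placeholder values, for illustration only.
notes = ProjectNotes(task="Analyze YourDataName.csv")
notes.hypothesis = "placeholder hypothesis"
notes.record("hypothesis")
notes.record("visualization", "placeholder finding")
print(notes.as_context())
```

The point of the design is that the context handed to the next agent stays roughly constant in size, whereas a full transcript grows with every step.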

Why It's Unique

The integration of a dedicated Note Taker agent sets this system apart from traditional data analysis pipelines. By maintaining a concise yet comprehensive record of the project's state, the system can:

Reduce computational overhead
Improve context retention across different analysis phases
Enable more coherent and consistent analysis outcomes

System Requirements

Python 3.10 or higher
Jupyter Notebook environment

Installation

Clone the repository:

git clone https://github.com/starpig1129/AI-Data-Analysis-MultiAgent.git

Create and activate a Conda virtual environment:

conda create -n data_assistant python=3.10
conda activate data_assistant

Install dependencies:

pip install -r requirements.txt

Set up environment variables: rename the ".env Example" file to ".env" and fill in the values:

# Your data storage path (required)
DATA_STORAGE_PATH = ./data_storage/

# Anaconda installation path (required)
CONDA_PATH = /home/user/anaconda3

# Conda environment name (required)
CONDA_ENV = envname

# ChromeDriver executable path (required)
CHROMEDRIVER_PATH = ./chromedriver-linux64/chromedriver

# Firecrawl API key (optional)
# Note: If this key is missing, query capabilities may be reduced
FIRECRAWL_API_KEY = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# OpenAI API key (required)
# Warning: This key is essential; the program will not run without it
OPENAI_API_KEY = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# LangChain API key (optional)
# Used for monitoring the processing
LANGCHAIN_API_KEY = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
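
Before running the system, it can help to confirm that the required values are actually set. The sketch below checks the variable names listed above, using only the standard library. It assumes the variables have already been loaded into the process environment (how the project itself loads the .env file is not shown here), and check_env is a hypothetical helper, not part of the repository.

```python
import os

# Names taken from the .env example above.
REQUIRED_VARS = [
    "DATA_STORAGE_PATH",
    "CONDA_PATH",
    "CONDA_ENV",
    "CHROMEDRIVER_PATH",
    "OPENAI_API_KEY",
]
OPTIONAL_VARS = ["FIRECRAWL_API_KEY", "LANGCHAIN_API_KEY"]


def check_env(env=None):
    """Return (missing_required, missing_optional) lists of variable names.

    `env` defaults to os.environ; pass a dict to check other values.
    A variable counts as missing if it is unset or blank.
    """
    env = os.environ if env is None else env
    missing_required = [n for n in REQUIRED_VARS if not env.get(n, "").strip()]
    missing_optional = [n for n in OPTIONAL_VARS if not env.get(n, "").strip()]
    return missing_required, missing_optional


missing_required, missing_optional = check_env()
if missing_required:
    print("Missing required variables:", ", ".join(missing_required))
if missing_optional:
    print("Missing optional variables:", ", ".join(missing_optional))
```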

Usage

Using Jupyter Notebook

Place YourDataName.csv in the data_storage directory.
Start Jupyter Notebook and open the main.ipynb file.
Run the setup cells to initialize the system and create the workflow.
In the last cell, customize the research task by modifying the userInput variable.
Run the final few cells to execute the research process and view the results.

Using Python Script

You can also run the system directly using main.py:

Place your data file (e.g., YourDataName.csv) in the data_storage directory
Run the script:

python main.py

By default, it will process 'OnlineSalesData.csv'. To analyze a different dataset, modify the user_input variable in the main() function of main.py:

user_input = '''
datapath:YourDataName.csv
Use machine learning to perform data analysis and write complete graphical reports
'''
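
If you switch datasets often, the input string can be assembled programmatically. The following sketch builds a string in the same shape as the example above and fails early when the file is not in the storage directory. build_user_input and its parameters are hypothetical names introduced for illustration, and the default directory name is taken from the instructions above.

```python
from pathlib import Path


def build_user_input(data_file: str, task: str, storage_dir: str = "data_storage") -> str:
    """Build a user_input string in the 'datapath:<file>' + task format.

    Raises FileNotFoundError if the data file is not in storage_dir,
    so a typo in the file name is caught before any API calls are made.
    """
    path = Path(storage_dir) / data_file
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; place the file in {storage_dir}")
    return f"\ndatapath:{data_file}\n{task}\n"
```

Calling build_user_input("YourDataName.csv", "Use machine learning to perform data analysis and write complete graphical reports") yields the same text as the literal above, provided the file exists.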

Main Components

hypothesis_agent: Generates research hypotheses
process_agent: Supervises the entire research process
visualization_agent: Creates data visualizations
code_agent: Writes data analysis code
searcher_agent: Conducts literature and web searches
report_agent: Writes research reports
quality_review_agent: Performs quality reviews
note_agent: Records the research process

Workflow

The system uses LangGraph to create a state graph that manages the entire research process. The workflow includes the following steps:

Hypothesis generation
Human choice (continue or regenerate hypothesis)
Processing (including data analysis, visualization, search, and report writing)
Quality review
Revision as needed
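
The control flow of these steps can be sketched as a small state machine. This is a plain-Python stand-in, not the project's LangGraph graph: the node names, the state keys, and the choice to loop back to processing after a failed review are all assumptions made for illustration, and the human decision is replaced by a callback.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def generate_hypothesis(state: State) -> str:
    state["hypothesis_drafts"] = state.get("hypothesis_drafts", 0) + 1
    return "human_choice"


def human_choice(state: State) -> str:
    # The real system asks a person; a callback stands in for that prompt.
    return "process" if state["approve"](state) else "hypothesis"


def process(state: State) -> str:
    # Stands in for data analysis, visualization, search, and report writing.
    state["process_runs"] = state.get("process_runs", 0) + 1
    return "quality_review"


def quality_review(state: State) -> str:
    # Assumption: a failed review sends the work back to processing for revision.
    return "end" if state["review"](state) else "process"


NODES: Dict[str, Callable[[State], str]] = {
    "hypothesis": generate_hypothesis,
    "human_choice": human_choice,
    "process": process,
    "quality_review": quality_review,
}


def run(state: State, start: str = "hypothesis", max_steps: int = 20) -> List[str]:
    """Follow node transitions until 'end'; return the list of visited nodes."""
    visited: List[str] = []
    node = start
    while node != "end":
        if len(visited) >= max_steps:
            raise RuntimeError("workflow did not finish within max_steps")
        visited.append(node)
        node = NODES[node](state)
    return visited


# Example: reject the first hypothesis, and require one revision pass.
state = {
    "approve": lambda s: s["hypothesis_drafts"] >= 2,
    "review": lambda s: s["process_runs"] >= 2,
}
trace = run(state)
print(" -> ".join(trace))
```

The max_steps guard is there because both loops (regenerate hypothesis, revise after review) could otherwise run indefinitely.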

Customization

You can customize the system behavior by modifying the agent creation and workflow definition in main.ipynb.

Notes

Ensure you have sufficient OpenAI API credits, as the system will make multiple API calls.
The system may take some time to complete the entire research process, depending on the complexity of the task.
WARNING: The agent system may modify the data being analyzed. It is highly recommended to back up your data before using this system.
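
One simple way to follow the backup recommendation is to copy the data directory to a timestamped folder before each run. The sketch below uses only the standard library. backup_data is a hypothetical helper, and the data_backups folder name is an assumption.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_data(storage_dir: str = "data_storage", backup_root: str = "data_backups") -> Path:
    """Copy storage_dir to backup_root/<name>-<timestamp> and return the new path."""
    src = Path(storage_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # copytree creates missing parent directories and fails if dest already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

Call backup_data() from the project root before python main.py; the returned path tells you where the copy was written.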

Current Issues and Solutions

OpenAI Internal Server Error (Error code: 500)
NoteTaker Efficiency Improvement
Overall Runtime Optimization
Refiner output quality needs improvement

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Other Projects

Here are some of my other notable projects:

ShareLMAPI

ShareLMAPI is a local language model sharing API built on FastAPI. It lets different programs or devices share the same local model, which reduces resource consumption. It supports streaming generation and several model configuration methods.

GitHub: ShareLMAPI

PigPig: Advanced Multi-modal LLM Discord Bot

A powerful Discord bot based on multi-modal Large Language Models (LLM), designed to interact with users through natural language. It combines advanced AI capabilities with practical features, offering a rich experience for Discord communities.

GitHub: ai-discord-bot-PigPig
