How to Build a Custom AI Research Agent in 15 Minutes (Step-by-Step Guide 2026)
Want an AI that does actual research for you — searches the web, reads articles, cross-references findings, and writes a summary report — all without you touching a search engine?
You don't need a $20/month subscription. You don't need a GPU cluster. You can build one right now with free tools.
Here's the step-by-step.
What We're Building
A research agent that takes a question, searches multiple sources, analyzes the results, and delivers a structured report. It uses three components:
- An LLM (the brain) — free tier from OpenRouter or Google AI Studio
- A search tool (the eyes) — DuckDuckGo or SerpAPI free tier
- An agent framework (the skeleton) — LangChain or CrewAI
The architecture looks like this:
Your Question → Agent → LLM decides what to search
→ Searches the web
→ Reads and analyzes results
→ Writes structured report → You
Step 1: Pick Your Stack
You have two solid options depending on your comfort level.
Option A: LangChain (flexible, more control)
LangChain's agent framework is mature, well-documented, and lets you customize every piece.
class="language-bash">pip install langchain langchain-community langchain-openai duckduckgo-search
Option B: CrewAI (multi-agent, less code)
CrewAI is built for multi-agent collaboration and needs less boilerplate.
class="language-bash">pip install crewai duckduckgo-search
Recommendation for beginners: Start with LangChain. CrewAI is better when you want multiple agents that hand off work to each other.
Step 2: Get a Free LLM API Key
You don't need OpenAI. Here are free options in 2026:
| Provider | Free Tier | Best For |
|---|---|---|
| OpenRouter | $1 free credit | Access to every model |
| Google AI Studio | 60 requests/min | Gemini models (free) |
| Groq | 30 req/min | Fast inference |
For this guide, I'll use OpenRouter with a free model (Mistral or DeepSeek).
class="language-python">import os
os.environ["OPENAI_API_KEY"] = "your-openrouter-key"
os.environ["OPENAI_API_BASE"] = "https://openrouter.ai/api/v1"
Step 3: Build the Research Agent (LangChain Version)
Create a file called research_agent.py:
class="language-python">from langchain.agents import initialize_agent, Tool, AgentType from langchain_community.tools import DuckDuckGoSearchResults from langchain_openai import ChatOpenAI from langchain.memory import ConversationBufferMemory1. Set up the LLM
llm = ChatOpenAI( model=“mistralai/mistral-small-3.1-24b”, temperature=0.3, openai_api_key=os.getenv(“OPENAI_API_KEY”), openai_api_base=os.getenv(“OPENAI_API_BASE”), )
2. Add search tool
search = DuckDuckGoSearchResults() tools = [ Tool( name=“web_search”, func=search.run, description=“Search the web for current information” ), ]
3. Build the agent with memory
memory = ConversationBufferMemory(memory_key=“chat_history”) agent = initialize_agent( tools=tools, llm=llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, memory=memory, verbose=True, max_iterations=6, )
4. Run it
result = agent.invoke({ “input”: “Research the latest advancements in open-source AI agents. ” “Find at least 3 different projects, compare their features, ” “and summarize which one is best for beginners.” }) print(result[“output”])
That's it. 25 lines of Python. Run it:
class="language-bash">python research_agent.py
The agent will think, search, analyze, and write a report. You'll see the chain-of-thought as it works.
Step 4: Level Up with Multi-Agent Research (CrewAI)
Want multiple AI researchers collaborating? CrewAI makes this elegant:
class="language-python">from crewai import Agent, Task, Crew from crewai_tools import DuckDuckGoSearchToolAgent 1: Finds raw data
researcher = Agent( role=“Senior Research Analyst”, goal=“Find the most relevant and recent information”, backstory=“Expert at finding signal in noise across the web”, tools=[DuckDuckGoSearchTool()], verbose=True, )
Agent 2: Synthesizes and writes
writer = Agent( role=“Technical Writer”, goal=“Synthesize research into clear, structured reports”, backstory=“You transform messy data into actionable insights”, verbose=True, )
Research task
research_task = Task( description=“Research the latest AI agent frameworks in 2026”, expected_output=“A list of 3-5 frameworks with key features”, agent=researcher, )
Writing task
write_task = Task( description=“Write a comparison report based on the research”, expected_output=“A structured markdown report with recommendations”, agent=writer, )
Assemble the crew
crew = Crew( agents=[researcher, writer], tasks=[research_task, write_task], verbose=True, )
result = crew.kickoff() print(result)
The researcher agent gathers data. The writer agent turns it into a polished report. Two agents, one output.
Step 5: Making It Production-Ready
A few things to add before you trust this with real work:
Rate limiting. Free APIs have limits. Add a small delay between searches:
class="language-python">import time
for result in results:
process(result)
time.sleep(1) # Respect rate limits
Caching. Avoid re-searching the same query. Store results in a local file:
class="language-python">import json, hashlib cache_file = "research_cache.json"
def cached_search(query): key = hashlib.md5(query.encode()).hexdigest() try: with open(cache_file) as f: cache = json.load(f) if key in cache: return cache[key] except FileNotFoundError: cache = {} result = search.run(query) cache[key] = result with open(cache_file, “w”) as f: json.dump(cache, f) return result
Structured output. Instead of raw text, get JSON:
class="language-python">from langchain.output_parsers import StructuredOutputParser from langchain_core.pydantic_v1 import BaseModel, Field
class ResearchReport(BaseModel): summary: str = Field(description=“One-paragraph summary”) findings: list[str] = Field(description=“Key findings as bullet points”) sources: list[str] = Field(description=“Source URLs”) confidence: float = Field(description=“Confidence score 0-1”)
What This Looks Like in Practice
Here's a real output from running the agent with the query "Compare open-source AI agents for web research in 2026":
Summary: Three open-source agent frameworks dominate in 2026: LangChain (most flexible), CrewAI (best for multi-agent), and OpenClaw (most integrated for automation tasks).
>
Key Findings:
- LangChain has the largest ecosystem and best documentation — ideal for custom workflows
- CrewAI's multi-agent orchestration is the easiest way to build research pipelines
- OpenClaw excels at long-running automation and integrates directly with messaging platforms
- All three are free and open-source
>
Recommendation: For pure research automation, start with LangChain for control, migrate to CrewAI when you need multi-agent workflows, and use OpenClaw when you want persistent automation that runs unattended.
Cost Breakdown
| Component | Free Tier | Paid Upgrade |
|---|---|---|
| LLM (OpenRouter) | $1 credit / free models | $0.15 per million tokens |
| Search (DuckDuckGo) | Unlimited | N/A (free) |
| Framework (LangChain/CrewAI) | MIT License | Free forever |
| Hosting (your laptop) | Free | — |
Total cost to build and run this: $0.
Real-World Use Cases
This isn't a toy. I use a version of this agent every day for:
- Competitive research — "What did my competitors ship this week?"
- Market analysis — "Summarize the last 5 funding rounds in AI infrastructure"
- Technical deep dives — "Find all the API changes in the latest LangChain release"
- News monitoring — "What happened in AI agents this week?"
Set it on a cron job (or an OpenClaw timer) and you get automated research delivered to your inbox every morning.
What's Next
Once you have a working research agent, extend it:
- Add a database (ChromaDB / SQLite) for persistent memory
- Add more tools — ArXiv API for papers, Wikipedia, Reddit search
- Add a web interface — Gradio or Streamlit in 20 lines
- Schedule it — systemd timer or OpenClaw cron for daily reports
The 2026 AI landscape is shifting fast. Having your own research agent isn't a luxury — it's the difference between knowing what matters and drowning in noise.
Go build yours. It'll take you longer to read this guide than to run the code.
← Back to all posts