Q&A Agent¶
WikiMind's Q&A agent answers questions against your compiled wiki, citing specific articles and suggesting follow-up questions.
Asking Questions¶
Simple question¶
curl -X POST http://localhost:7842/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What are the main arguments for and against microservices?"}'
The response includes:
{
  "query": {
    "id": "...",
    "question": "What are the main arguments...",
    "answer": "Based on your wiki articles...",
    "confidence": "high",
    "source_article_ids": ["Article Title 1", "Article Title 2"],
    "conversation_id": "..."
  },
  "conversation": {
    "id": "...",
    "title": "What are the main arguments..."
  }
}
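In a script, the same response can be unpacked like this (a minimal sketch; the sample values below are placeholders matching the shape shown above, not real IDs):

```python
import json

# A response in the shape documented above (values abbreviated).
raw = """
{
  "query": {
    "id": "q-1",
    "question": "What are the main arguments...",
    "answer": "Based on your wiki articles...",
    "confidence": "high",
    "source_article_ids": ["Article Title 1", "Article Title 2"],
    "conversation_id": "c-1"
  },
  "conversation": {"id": "c-1", "title": "What are the main arguments..."}
}
"""

response = json.loads(raw)
query = response["query"]
print(query["answer"])
print("Sources:", ", ".join(query["source_article_ids"]))

# Keep the conversation id to ask follow-up questions later.
conversation_id = query["conversation_id"]
```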
Follow-up questions¶
Continue a conversation by passing the conversation_id:
curl -X POST http://localhost:7842/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "How does that compare to the monolith approach?",
    "conversation_id": "CONVERSATION_ID"
  }'
The agent uses prior turns as context to disambiguate references like "it" or "that approach". Up to 5 prior turns are included (configurable via WIKIMIND_QA__MAX_PRIOR_TURNS_IN_CONTEXT).
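The windowing can be sketched like this (`build_context` is a hypothetical helper, not the agent's actual code, but the cap mirrors the documented default of 5):

```python
MAX_PRIOR_TURNS = 5  # default; overridable via WIKIMIND_QA__MAX_PRIOR_TURNS_IN_CONTEXT

def build_context(turns: list[dict]) -> str:
    """Render the most recent turns as Q/A pairs for the prompt."""
    window = turns[-MAX_PRIOR_TURNS:]
    return "\n".join(f"Q: {t['question']}\nA: {t['answer']}" for t in window)

turns = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(8)]
context = build_context(turns)
# Only the last 5 turns (q3..q7) make it into the context.
```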
Streaming answers¶
For token-by-token streaming via Server-Sent Events:
curl -N http://localhost:7842/query/stream \
  -H "Content-Type: application/json" \
  -d '{"question": "Summarize my notes on distributed systems"}'
SSE events:
| Event | Description |
|---|---|
| `chunk` | Text delta (partial answer) |
| `done` | Final response with the full answer, sources, and metadata |
| `error` | An error occurred during streaming |
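A minimal client could decode the stream like this (a sketch: the `text` and `answer` payload fields are assumptions for illustration, not the documented wire format):

```python
import json

def parse_sse(stream_text: str):
    """Yield (event, data) pairs from a Server-Sent Events body."""
    for block in stream_text.strip().split("\n\n"):
        event, data = "message", ""
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[6:].strip()
            elif line.startswith("data:"):
                data += line[5:].strip()
        yield event, data

# A captured stream would look roughly like this:
sample = (
    'event: chunk\ndata: {"text": "Distributed systems "}\n\n'
    'event: chunk\ndata: {"text": "trade consistency..."}\n\n'
    'event: done\ndata: {"answer": "Distributed systems trade consistency..."}\n\n'
)

answer = ""
for event, data in parse_sse(sample):
    if event == "chunk":
        answer += json.loads(data)["text"]  # accumulate partial answer deltas
    elif event == "done":
        final = json.loads(data)            # full answer plus metadata
    elif event == "error":
        raise RuntimeError(data)
```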
File-Back¶
When an answer has high or medium confidence, you can file it back to the wiki. This creates a new article from the conversation, making the wiki smarter over time.
Auto file-back¶
Set file_back: true in the request to automatically file high/medium confidence answers:
curl -X POST http://localhost:7842/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is the current state of quantum computing?",
    "file_back": true
  }'
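The gate the agent applies can be mirrored client-side (a trivial sketch; `should_file_back` is a hypothetical helper, not part of the API):

```python
FILE_BACK_CONFIDENCE = {"high", "medium"}

def should_file_back(confidence: str, requested: bool) -> bool:
    """File back only when requested and the answer is confident enough."""
    return requested and confidence.lower() in FILE_BACK_CONFIDENCE
```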
Manual file-back¶
File an entire conversation back to the wiki, or file selected turns from one or more conversations:
curl -X POST http://localhost:7842/query/conversations/file-back \
  -H "Content-Type: application/json" \
  -d '{
    "selections": [
      {"conversation_id": "...", "turn_indices": [0, 1, 2]}
    ]
  }'
Conversations¶
List conversations¶
Returns conversations, most recently updated first.
Get conversation detail¶
Returns the conversation with all its turns (questions and answers).
Export as markdown¶
Returns a standalone markdown document of the conversation.
Fork a conversation¶
Branch a conversation at a specific turn to explore a different line of reasoning:
curl -X POST http://localhost:7842/query/conversations/{conversation_id}/fork \
  -H "Content-Type: application/json" \
  -d '{
    "turn_index": 2,
    "question": "What if we approached it differently?"
  }'
This creates a new conversation that shares turns 0 through turn_index - 1 with the parent. The original branch is preserved immutably.
How Context Retrieval Works¶
When you ask a question, the Q&A agent:
- Extracts key terms from your question
- Scores all wiki articles by term overlap
- Selects the top 5 most relevant articles
- Truncates each to 3,000 characters to fit the context window
- Includes prior conversation turns (up to 5) for multi-turn context
- Sends everything to the LLM with a structured prompt
The LLM responds with JSON containing the answer, confidence level, cited sources, related articles, and follow-up questions.
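The retrieval steps above can be sketched as follows (an illustrative reimplementation, not WikiMind's actual code; a real term extractor would also drop stopwords):

```python
import re

TOP_K = 5          # articles sent to the LLM
MAX_CHARS = 3000   # per-article truncation

def key_terms(text: str) -> set[str]:
    """Lowercased word tokens as a crude stand-in for key-term extraction."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, articles: dict[str, str]) -> list[tuple[str, str]]:
    """Score articles by term overlap with the question; return the top K, truncated."""
    terms = key_terms(question)
    scored = sorted(
        articles.items(),
        key=lambda kv: len(terms & key_terms(kv[1])),
        reverse=True,
    )
    return [(title, body[:MAX_CHARS]) for title, body in scored[:TOP_K]]

articles = {
    "Microservices": "Microservices split a system into small services...",
    "Monoliths": "A monolith keeps the whole system in one deployable...",
    "Gardening": "Tomatoes need full sun and regular watering...",
}
top = retrieve("How do microservices split a system into services?", articles)
```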
Semantic search coming soon
The current retrieval uses simple term overlap. Semantic search via ChromaDB embeddings is planned, which will significantly improve answer quality for conceptual questions.