How to use LangChain and Psychic to answer questions about your documents with AI

Jason Fan
5 min read · May 21, 2023

At Psychic, we frequently encounter teams eager to utilize AI technologies like ChatGPT to sift through their internal knowledge bases. Today, building such a feature is remarkably straightforward, provided you have these three key elements: a data source, a vector database, and an API key for OpenAI.

In this tutorial, we’ll guide you step-by-step on how to make this happen with LangChain, using Psychic as the data loader. This whole project takes less than 30 min if you’re already familiar with Python.

The code for this tutorial can be found here.

If you need any help, feel free to contact us at

Table of Contents

Step 1: Connect to your data sources

Step 2: Clean and chunk your data

Step 3: Turn your chunks into embeddings and load them into a vector database

Step 4: Build a retrieval pipeline to insert data into your prompt

Step 5: Put it all together and deploy


The sample project uses Poetry for package management. While not strictly necessary, it makes it much easier to run this project in a virtual environment without contaminating package versions for other Python projects you might need to run. Instructions for setting up Poetry can be found here.

After installing Poetry, run the following to install the dependencies:

poetry install

Step 1: Connect to Your Data Sources

The first step in this process is to connect to your data sources. There are several ways to do this, but the easiest is to create an account at (it’s free!) and use the Connector playground to connect Notion pages or a Confluence workspace.

You can then query documents from these connectors through Psychic’s Python SDK or data loader in LangChain. To do this, we’ll first create a new Python file. (The uvicorn main:app command in Step 5 expects this module to be named main.) Eventually we will run a full API server from this file, but for now let’s start by loading the documents we just connected through Psychic.

import os

from langchain.schema import Document
from psychicapi import Psychic, ConnectorId

# Create a document loader for Notion. We can also load from other connectors e.g. ConnectorId.gdrive
psychic = Psychic(secret_key=os.getenv("PSYCHIC_SECRET_KEY"))
# Replace connection_id with the connection ID you set while creating a new connection at <>
raw_docs = psychic.get_documents(ConnectorId.notion, "connection_id")
# Convert each raw document into a LangChain Document, keeping its title and URI as metadata
documents = [
    Document(page_content=doc["content"], metadata={"title": doc["title"], "source": doc["uri"]})
    for doc in raw_docs
]

You’ll need to replace connection_id above with whatever you called the new connection when you were using the Psychic dashboard.

You’ll also want to create a .env file for your environment variables.

# Substitute your OpenAI key here. This is necessary to generate embeddings and also for making calls to OpenAI's completions endpoint
# You can get this from <>
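Because every later step depends on these keys, a quick check at startup can catch a missing variable early. The sketch below assumes two variable names: PSYCHIC_SECRET_KEY, which the loader code reads, and OPENAI_API_KEY, which LangChain's OpenAI classes look for by default. The helper missing_env_vars is hypothetical and not part of either library.

```python
import os

# Assumed variable names: PSYCHIC_SECRET_KEY is read by the loader code, and
# OPENAI_API_KEY is what LangChain's OpenAI classes look for by default.
REQUIRED_VARS = ("PSYCHIC_SECRET_KEY", "OPENAI_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment
example_env = {"PSYCHIC_SECRET_KEY": "placeholder", "OPENAI_API_KEY": ""}
print(missing_env_vars(env=example_env))
```

Keep in mind that Python does not read a .env file on its own; the variables need to be exported by your shell or loaded by a tool such as python-dotenv before this check runs.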

Step 2: Chunk Your Data

Once you’ve connected to the data source and loaded your documents, the next step is to chunk your data.

Chunking involves breaking down your data into segments that fit into your LLM’s context window. GPT-3.5’s window is roughly 4,000 tokens, so chunks can take up ~2,000 tokens, assuming you reserve 2,000 tokens for the prompt itself. The best way to do this is to use LangChain’s text splitters. For now we will use the character text splitter, which is the simplest.

from langchain.text_splitter import CharacterTextSplitter

# Setting an overlap for the CharacterTextSplitter helps improve results.
# Note that chunk_size and chunk_overlap are measured in characters, not tokens.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=250)
texts = text_splitter.split_documents(documents)
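To see what overlap does, here is a minimal sketch of fixed-width chunking in plain Python. It is a simplified illustration, not LangChain's implementation: CharacterTextSplitter first splits on a separator and then merges the pieces, while the hypothetical chunk_text below simply slides a window over the raw characters.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=250):
    """Split `text` into fixed-width character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for piece in chunk_text(sample, chunk_size=10, chunk_overlap=4):
    print(piece)
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.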

Step 3: Turn Your Chunks into Embeddings and Load Them into a Vector Database

Now that you have your cleaned and chunked data, the next step is to turn these chunks into embeddings: numeric vectors that capture the meaning of each chunk, so that chunks with similar content end up with similar vectors.

To do this, you can use OpenAI, which lets you convert text to embeddings through a call to their embeddings endpoint. Storing those embeddings in a vector database then allows you to perform efficient similarity searches in high-dimensional spaces. We recommend using Weaviate or Chroma as open source options.

Using Chroma, embeddings are generated and inserted into a temporary local instance of Chroma all in one step:

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
embeddings = OpenAIEmbeddings()
vdb = Chroma.from_documents(texts, embeddings)
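To build some intuition for what "similar" means here, the sketch below computes cosine similarity, one common way to compare embeddings, between a few made-up vectors. The three-dimensional values are invented for illustration (real embeddings have many more dimensions), and this does not claim to be the distance metric Chroma uses internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional "embeddings" for illustration only
vacation_policy = [0.9, 0.1, 0.0]
time_off_question = [0.8, 0.2, 0.1]
deploy_runbook = [0.0, 0.2, 0.9]

# The related pair scores close to 1, the unrelated pair close to 0
print(cosine_similarity(vacation_policy, time_off_question))
print(cosine_similarity(vacation_policy, deploy_runbook))
```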

Step 4: Build a Retrieval Pipeline to Insert Data into Your Prompt

Now comes the fun part. Our data is vectorized and ready to be used, but how do we actually retrieve it? To do this, we need to create an API endpoint that handles all the data retrieval as well as calls to the OpenAI completions endpoint to get back an LLM-generated response.

Requests to this endpoint should contain a query from the user, which we will also vectorize and use to perform an approximate nearest neighbor search using our vector database.
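Conceptually, the search step looks like the sketch below: compare the query vector against every stored vector and keep the closest few. This is an exact, brute-force version using squared Euclidean distance. A vector database replaces the full scan with an approximate index so it stays fast at scale. The vectors, chunk titles, and the nearest_chunks helper are all made up for illustration.

```python
import heapq


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_chunks(query_vector, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are closest to the query vector.

    `indexed_chunks` is a list of (text, vector) pairs.
    """
    closest = heapq.nsmallest(
        k, indexed_chunks, key=lambda item: squared_distance(query_vector, item[1])
    )
    return [text for text, _ in closest]


index = [
    ("Vacation policy", [0.9, 0.1, 0.0]),
    ("Deploy runbook", [0.0, 0.2, 0.9]),
    ("Expense policy", [0.7, 0.3, 0.1]),
]
query = [0.8, 0.2, 0.1]
print(nearest_chunks(query, index, k=2))
```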

We’ll implement this with FastAPI. Like its name suggests, it’s the fastest way to spin up a local API server in Python.

from fastapi import FastAPI, Request
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.llms import OpenAI

app = FastAPI()

@app.get("/get_answer")
async def get_answer(request: Request):
    # Build a QA chain that "stuffs" the retrieved chunks into a single prompt
    chain = RetrievalQAWithSourcesChain.from_chain_type(
        OpenAI(temperature=0), chain_type="stuff", retriever=vdb.as_retriever()
    )
    # The query is read from the JSON request body, e.g. {"query": "..."}
    body = await request.json()
    query = body["query"]
    answer = chain({"question": query}, return_only_outputs=True)
    return {"answer": answer}

We are using LangChain’s RetrievalQAWithSourcesChain to retrieve the relevant chunks from Chroma and insert them into a QA prompt. This module then makes a call to OpenAI’s completions endpoint and returns the answer from OpenAI. We can include this answer in our own API response.
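The "stuff" chain type means all retrieved chunks are placed into one prompt ahead of the question. The sketch below shows the general idea. The template wording, the build_stuffed_prompt helper, and the example sources are hypothetical, and LangChain's actual prompt differs.

```python
def build_stuffed_prompt(question, chunks):
    """Place every retrieved chunk, tagged with its source, ahead of the question."""
    context = "\n\n".join(
        f"Content: {chunk['text']}\nSource: {chunk['source']}" for chunk in chunks
    )
    return (
        "Answer the question using only the context below, "
        "and list the sources you used.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieved chunks; in the tutorial these come from the retriever
retrieved = [
    {"text": "Employees get 20 vacation days per year.", "source": "notion://hr/vacation"},
    {"text": "Unused days roll over until March.", "source": "notion://hr/rollover"},
]
print(build_stuffed_prompt("How many vacation days do I get?", retrieved))
```

Since every chunk lands in a single prompt, the number and size of retrieved chunks have to fit within the model's context window, which is why chunk size matters in Step 2.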

Step 5: Put It All Together and Deploy

Run the following commands from your terminal:

poetry shell
uvicorn main:app --reload

You should see this message if the server successfully started.

INFO:     Uvicorn running on <> (Press CTRL+C to quit)

Now we have an API that we can call from anywhere to get back answers based on the documents we ingested through Psychic.

You can use Postman or a similar HTTP client to test the API.
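If you would rather test from Python, a small client using only the standard library might look like the sketch below. It assumes the server is listening on uvicorn's default local address (http://127.0.0.1:8000) and that the endpoint reads a JSON body with a query field, as in Step 4. The helper names build_request and ask are made up for this example.

```python
import json
import urllib.request

# Assumption: uvicorn's default local address; change it if your server differs
BASE_URL = "http://127.0.0.1:8000"


def build_request(query, base_url=BASE_URL):
    """Build a GET request to /get_answer carrying the query as a JSON body."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/get_answer",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


def ask(query, base_url=BASE_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# With the server running:
# print(ask("What is our vacation policy?"))
```

Sending a body with a GET request is unusual, and some HTTP clients and proxies drop it. If you run into that, switching the route to POST is a common alternative.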

When deploying this to production, all the server-side code will still work as expected. You’ll just need to do two things:

  • Deploy a persistent instance of Chroma or Weaviate to use in production
  • Use the Psychic Link package to add the modal you used to connect your data to any React app. This is great for establishing connections to your users’ data rather than your own.


And there you have it! With these steps, you can now effortlessly connect to your knowledge base, convert your documents into embeddings, and start answering questions about them using AI technologies like ChatGPT. Psychic simplifies the whole process, letting you unlock the full potential of AI in navigating your unstructured data.