Common LangChain Recipes

Bhaskar S *UPDATED* 03/22/2025


In the article on LangChain, we covered the basics of the LangChain framework and how to get started with it. In this article, we will provide some common code recipes for working with LangChain.

Installation and Setup

The installation and setup will be on an Ubuntu 24.04 LTS based Linux desktop. Ensure that Ollama is installed and set up on the desktop (see instructions).

In addition, ensure that the Python 3.x programming language as well as the Jupyter Notebook package are installed and set up on the desktop.

Finally, ensure that all the LangChain packages are properly set up on the desktop (see instructions).

Assuming that the IP address on the Linux desktop is, start the Ollama platform by executing the following command in the terminal window:

$ docker run --rm --name ollama --network="host" -p -v $HOME/.ollama:/root/.ollama ollama/ollama:0.6.2

For the LLM model, we will be using the recently released IBM Granite 3.1 1B model.

Open a new terminal window and execute the following docker command to download the LLM model:

$ docker exec -it ollama ollama run granite3.1-moe:1b

For the multi model (one that can also process images), we will be using the recently released Gemma 3 4B model.

Open a new terminal window and execute the following docker command to download the LLM model:

$ docker exec -it ollama ollama run gemma3:4b

With this, we are ready to showcase the common code recipes using LangChain.

Code Recipes for LangChain

Create a file called .env with the following environment variables defined:


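To load the environment variables and assign them to Python variables, execute the following code snippet:

from dotenv import load_dotenv, find_dotenv

import os

# Locate the .env file and load its entries into the process environment
load_dotenv(find_dotenv())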

llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_vl_model = os.getenv('OLLAMA_VL_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')
gpu_dataset = os.getenv('GPU_DATASET')
chroma_db_dir = os.getenv('CHROMA_DB_DIR')
receipt_image = os.getenv('RECEIPT_IMAGE')

Executing the above Python code snippet generates no output.
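One detail worth keeping in mind: os.getenv returns None when a variable is missing, so a call such as float(os.getenv('LLM_TEMPERATURE')) fails with a TypeError that does not name the missing variable. The following is a minimal sketch of a guard around that lookup. The helper name require_env and the placeholder temperature value are assumptions made for illustration and are not part of the original setup.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if value is None or value.strip() == '':
        raise RuntimeError(f'Environment variable {name} is not set; check the .env file')
    return value


# Placeholder values for illustration only; the real values come from the .env file
os.environ.setdefault('LLM_TEMPERATURE', '0.0')
os.environ.setdefault('OLLAMA_MODEL', 'granite3.1-moe:1b')

llm_temperature = float(require_env('LLM_TEMPERATURE'))
ollama_model = require_env('OLLAMA_MODEL')
print(llm_temperature, ollama_model)
```

With a guard like this, a typo in the .env file surfaces as an error that names the variable, rather than as a conversion error a few lines later.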

To initialize an instance of Ollama running our desired LLM model granite3.1-moe:1b, execute the following code snippet:

from langchain_ollama import OllamaLLM

ollama_llm = OllamaLLM(base_url=ollama_base_url, model=ollama_model, temperature=llm_temperature)

Executing the above Python code snippet generates no output.

To initialize an instance of Ollama running our desired multi model gemma3:4b, execute the following code snippet:

from langchain_ollama import OllamaLLM

ollama_vl_llm = OllamaLLM(base_url=ollama_base_url, model=ollama_vl_model, temperature=llm_temperature)

Executing the above Python code snippet generates no output.

To initialize an instance of vector embedding class corresponding to the model running in Ollama, execute the following code snippet:

from langchain_ollama import OllamaEmbeddings

ollama_embedding = OllamaEmbeddings(base_url=ollama_base_url, model=ollama_model)

Executing the above Python code snippet generates no output.

Recipe 1 : Execute a simple LLM prompt

The following code snippet creates a simple prompt template, an LLM chain, and executes the chain to get a response from the LLM model:

from langchain_core.prompts import PromptTemplate

template = """
Question: {question}

Answer: Summarize in less than {tokens} words.
"""

prompt = PromptTemplate.from_template(template=template)

chain = prompt | ollama_llm

result = chain.invoke({'question': 'describe langchain ai framework', 'tokens': 50})

print(result)

Executing the above Python code snippet generates the following typical output:


Langchain is an open-source AI framework that enables developers to build and deploy language models for various applications, including chatbots, content generation, and translation services. It supports multiple languages and offers a modular architecture for customization.

We have successfully demonstrated the Python code snippet for Recipe 1.
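To see the text the model actually receives in Recipe 1, it can help to render the template by hand. The sketch below uses plain str.format as a stand-in for PromptTemplate, on the assumption that the default template format fills simple {name} placeholders the same way. It does not call LangChain or the LLM.

```python
template = """
Question: {question}

Answer: Summarize in less than {tokens} words.
"""

# str.format stands in for PromptTemplate here (an assumption for illustration)
rendered = template.format(question='describe langchain ai framework', tokens=50)
print(rendered)
```

The rendered string is what the chain passes to the model once both placeholders are filled in.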

Recipe 2 : Generate Structured Output from LLM

The following code snippet creates an LLM chat instance, a data class that will conform to the structured output, a chain using the data class as the schema, and executes the chain to get a structured response from the LLM model:

from pydantic import BaseModel
from langchain_ollama import ChatOllama

class GpuSpecs(BaseModel):
  name: str
  vram: int
  cuda_cores: int

ollama_struct_llm = ChatOllama(base_url=ollama_base_url, model=ollama_model, format='json', temperature=llm_temperature)

structured_llm = ollama_struct_llm.with_structured_output(GpuSpecs)

result2 = structured_llm.invoke('Get the GPU specs for RTX 4070 Ti')

print(result2)

Executing the above Python code snippet generates the following typical output:


name='RTX 4070 Ti' vram=128 cuda_cores=128

We have successfully demonstrated the Python code snippet for Recipe 2.
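Conceptually, structured output amounts to parsing the model's JSON reply and coercing it into the declared fields. The following is a rough stdlib-only sketch of that idea, using a dataclass in place of the pydantic model. The function parse_gpu_specs and the hand-written raw_reply string are hypothetical and only mimic the shape of the output shown above. This is not how LangChain implements with_structured_output.

```python
import json
from dataclasses import dataclass


@dataclass
class GpuSpecs:
    name: str
    vram: int
    cuda_cores: int


def parse_gpu_specs(raw_json: str) -> GpuSpecs:
    """Parse a JSON string and coerce its fields to the types declared in GpuSpecs."""
    data = json.loads(raw_json)
    return GpuSpecs(name=str(data['name']), vram=int(data['vram']), cuda_cores=int(data['cuda_cores']))


# Hypothetical raw model reply, shaped like the output shown above
raw_reply = '{"name": "RTX 4070 Ti", "vram": 128, "cuda_cores": 128}'
specs = parse_gpu_specs(raw_reply)
print(specs)
```

Note that a schema constrains only the shape and types of the reply. The values themselves still come from the model and should be verified independently.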

Recipe 3 : Chat with LLM preserving History

The following code snippet creates a chat session, an in-memory chat history store, an LLM chain using the history store, and executes the chain to get responses from the LLM model:

from langchain_ollama import ChatOllama
from langchain_core.chat_history import BaseChatMessageHistory, InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

ollama_chat_llm = ChatOllama(base_url=ollama_base_url, model=ollama_model, temperature=llm_temperature)

in_memory_store = {}

def get_chat_history(session_id: str) -> BaseChatMessageHistory:
  if session_id not in in_memory_store:
    in_memory_store[session_id] = InMemoryChatMessageHistory()
  return in_memory_store[session_id]

prompt2 = ChatPromptTemplate.from_messages([
  ('system', 'Helpful AI assistant!'),
  # Prior turns for the session are injected here under the 'chat_history' key
  MessagesPlaceholder(variable_name='chat_history'),
  ('human', '{input}')
])

config = {'configurable': {'session_id': 'recipes'}}

chain2 = prompt2 | ollama_chat_llm

chain2_with_history = RunnableWithMessageHistory(chain2, get_chat_history, input_messages_key='input', history_messages_key='chat_history')

result3 = chain2_with_history.invoke({'input': 'Suggest top 3 budget GPUs in one line'}, config=config)

print(result3.content)

result4 = chain2_with_history.invoke({'input': 'Not impressed, try again'}, config=config)

print(result4.content)

Executing the above Python code snippet generates the following typical output:


1. NVIDIA GeForce RTX 3060: High-performance GPU for gaming and professional use, with competitive price.
2. AMD Radeon RX 5600 XT: Affordable option offering excellent single-GPU performance.
3. NVIDIA GeForce GTX 1650 Super: Balanced budget choice providing good gaming performance.
1. NVIDIA GeForce RTX 3060 Ti: High-end GPU for demanding tasks and professional use, with excellent price-performance ratio.
2. AMD Radeon RX 5700 XT: Budget-friendly option offering competitive single-GPU performance in gaming and professional applications.
3. NVIDIA GeForce GTX 1660 Super: Balanced budget choice providing good gaming performance at a lower price point.

We have successfully demonstrated the Python code snippet for Recipe 3.
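The second request in Recipe 3 only makes sense because the first exchange is replayed to the model. The sketch below illustrates that bookkeeping with plain Python data structures. The names history_store, fake_llm, and chat are hypothetical, and fake_llm is a stand-in that merely reports how many messages it was given, so no model is called.

```python
from collections import defaultdict

# Maps a session id to its ordered list of (role, text) messages
history_store: dict[str, list[tuple[str, str]]] = defaultdict(list)


def fake_llm(messages: list[tuple[str, str]]) -> str:
    """Hypothetical stand-in for the chat model; reports how many messages it was given."""
    return f'reply based on {len(messages)} message(s)'


def chat(session_id: str, user_input: str) -> str:
    """Send the prior history plus the new input, then record both turns."""
    history = history_store[session_id]
    messages = [('system', 'Helpful AI assistant!')] + history + [('human', user_input)]
    reply = fake_llm(messages)
    history.append(('human', user_input))
    history.append(('ai', reply))
    return reply


first = chat('recipes', 'Suggest top 3 budget GPUs in one line')
second = chat('recipes', 'Not impressed, try again')
print(first)
print(second)
```

The second call sees four messages instead of two, which is the effect the session-keyed history store provides in the recipe.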

Recipe 4 : Q&A on a CSV file content using LLM

For this recipe, we will create a CSV file called gpu_specs.csv that will contain the specs for a handful of popular GPU cards. The following lists the contents of the CSV file:

NVIDIA,GeForce RTX 4060,12,5888,AD104
NVIDIA,GeForce RTX 4070,12,7680,AD104
NVIDIA,GeForce RTX 4080,16,9728,AD103
NVIDIA,GeForce RTX 4090,24,17408,AD102
AMD,Radeon RX 6950 XT,16,5120,Navi 21
AMD,Radeon RX 7700 XT,8, 4096,Navi 33
AMD,Radeon RX 7800 XT,12,8192,Navi 32
AMD,Radeon RX 7900 XT,16,12288,Navi 31
NVIDIA,GeForce RTX 3060 Ti,8,4864,GA103S
NVIDIA,GeForce RTX 3070 Ti,8,5632,GA104
NVIDIA,GeForce RTX 3080,12,8960,GA102
NVIDIA,GeForce RTX 3090 Ti,24,10752,GA102
AMD,Radeon RX 6400,4,768,Navi 24
AMD,Radeon RX 6500 XT,4,1024,Navi 24
AMD,Radeon RX 6650 XT,8,2048,Navi 23
AMD,Radeon RX 6750 XT,12,2560,Navi 22
AMD,Radeon RX 6850M XT,12,2560,Navi 22
Intel,Arc A770,16,4096,DG2-512
Intel,Arc A780,16,4096,DG2-512

The following code snippet creates a CSV file loader, loads the rows from the file as documents into an in-memory vector store using the embedding class, creates a prompt template with the data from the CSV file as the context, creates a Q&A LLM chain, and executes the chain to get answers to questions from the LLM model:

from langchain_core.documents import Document
# The document loaders are provided by the langchain_community package
from langchain_community.document_loaders import CSVLoader
from langchain_core.vectorstores import InMemoryVectorStore
from langchain.chains import RetrievalQA

loader = CSVLoader(gpu_dataset, encoding='utf-8')

encoding = '\ufeff'
newline = '\n'
comma_space = ', '

# Strip the byte order mark and flatten each row into a single line
documents = [Document(metadata=doc.metadata,
                      page_content=doc.page_content.lstrip(encoding).replace(newline, comma_space)) for doc in loader.load()]

vector_store = InMemoryVectorStore(ollama_embedding)

# Embed the row documents and add them to the vector store
vector_store.add_documents(documents)

template2 = """
Given the following context, answer the question based only on the provided context.

Context: {context}

Question: {question}
"""

prompt3 = PromptTemplate.from_template(template2)

retriever = vector_store.as_retriever()

# The prompt must expose the {context} and {question} variables
qa_chain = RetrievalQA.from_chain_type(llm=ollama_llm, retriever=retriever, chain_type_kwargs={'prompt': prompt3})

result5 = qa_chain.invoke({'query': 'what is the memory size on GeForce RTX 3070 Ti'})

print(result5)

result6 = qa_chain.invoke({'query': 'who is the manufacturer of rx 7800 xt'})

print(result6)

result7 = qa_chain.invoke({'query': 'what is the GPU chip on RTX 3090 Ti'})

print(result7)

Executing the above Python code snippet generates the following typical output:


{'query': 'what is the memory size on GeForce RTX 3070 Ti', 'result': 'The memory size on GeForce RTX 3070 Ti is 8GB.'}
{'query': 'who is the manufacturer of rx 7800 xt', 'result': 'The manufacturer of RX 7800 XT is AMD.'}
{'query': 'what is the GPU chip on RTX 3090 Ti', 'result': "The GPU chip on NVIDIA's GeForce RTX 3090 Ti is GA102."}

We have successfully demonstrated the Python code snippet for Recipe 4.
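To make the row-to-document step in Recipe 4 more concrete, the sketch below reproduces it with the standard csv module only. It assumes that each row is rendered as one "column: value" line per field, which is roughly what the CSV loader emits, and then flattened into a single line. The header names in sample_csv are assumptions made for illustration.

```python
import csv
import io

# Hypothetical header row; the column names are assumptions for illustration
sample_csv = (
    'vendor,model,vram_gb,cores,chip\n'
    'NVIDIA,GeForce RTX 3070 Ti,8,5632,GA104\n'
    'AMD,Radeon RX 7800 XT,12,8192,Navi 32\n'
)

newline = '\n'
comma_space = ', '

documents = []
for row in csv.DictReader(io.StringIO(sample_csv)):
    # One "column: value" line per field, roughly what the CSV loader emits per row
    page_content = newline.join(f'{key}: {value}' for key, value in row.items())
    # Flatten the row into a single line before embedding
    documents.append(page_content.replace(newline, comma_space))

for doc in documents:
    print(doc)
```

Each flattened line carries the column names alongside the values, which gives the embedding model and the LLM some context for what each number means.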

Recipe 5 : Q&A on a PDF file content using LLM

For this recipe, we will analyze the Nvidia 3rd Quarter 2024 financial report !!!

Also, make sure to install the additional Python module(s) by executing the following command:

$ pip install pypdf

The following code snippet creates a PDF file loader, loads the page chunks from the PDF file as embedded documents into the persistent vector store Chroma using the embedding class, creates a prompt template with the data from the PDF file as the context, creates a Q&A LLM chain, and executes the chain to get answers to questions from the LLM model:

# The document loaders are provided by the langchain_community package
from langchain_community.document_loaders import PyPDFLoader
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA

nvidia_q3_2024 = './data/NVIDIA-3rd-Qtr.pdf'

pdf_loader = PyPDFLoader(nvidia_q3_2024)

pdf_pages = pdf_loader.load_and_split()

vector_store2 = Chroma(collection_name='pdf_docs', embedding_function=ollama_embedding, persist_directory=chroma_db_dir)

# Embed the page chunks and store them in the persistent Chroma collection
vector_store2.add_documents(pdf_pages)

template3 = """
Given the following context, answer the question based only on the provided context.

Context: {context}

Question: {question}
"""

prompt4 = PromptTemplate.from_template(template3)

retriever2 = vector_store2.as_retriever()

qa_chain2 = RetrievalQA.from_chain_type(llm=ollama_llm, retriever=retriever2, chain_type_kwargs={'prompt': prompt4})

result8 = qa_chain2.invoke({'query': 'what was the revenue in q3 2024'})

print(result8)

result9 = qa_chain2.invoke({'query': 'what were the expenses in q3 2023'})

print(result9)

Executing the above Python code snippet generates the following typical output:


{'query': 'what was the revenue in q3 2024', 'result': 'The revenue for Q3 2024 is $17,475 million.'}
{'query': 'what were the expenses in q3 2023', 'result': "In Q3 FY23, NVIDIA's operating expenses increased by 16% compared to Q2 FY23. The specific increase was $298 million from $2,576 million to $2,874 million. This growth in expenses is part of the company's strategy to support its growth engines such as GPUs, CPUs, networking, AI foundry services, and NVIDIA AI Enterprise software."}

We have successfully demonstrated the Python code snippet for Recipe 5.
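The retriever in Recipes 4 and 5 works by comparing the embedding of the question against the embeddings of the stored chunks and returning the closest ones as context. The following is a minimal sketch of that ranking step. Cosine similarity is used here as one common choice; the distance metric actually used by a given vector store may differ. The three-dimensional vectors and chunk labels are made up for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k chunk texts whose vectors are most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up 3-dimensional vectors; real embeddings come from the embedding model
indexed_chunks = [
    ('revenue chunk', [0.9, 0.1, 0.0]),
    ('expenses chunk', [0.1, 0.9, 0.0]),
    ('legal notice chunk', [0.0, 0.1, 0.9]),
]
query_vector = [0.8, 0.2, 0.0]

best = top_k(query_vector, indexed_chunks, k=2)
print(best)
```

Only the chunks returned by this ranking reach the {context} slot of the prompt, so the quality of the answer depends heavily on the quality of the embeddings.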

Recipe 6 : Basic usage of the ReAct Framework

The LangChain ReAct Framework is a technique that enables LangChain virtual agents to interact with LLMs through prompting, which mimics the reasoning (Re) and acting (Act) behavior of a human to solve problems in an environment, using various external tools. In other words, the ReAct framework enables LLMs to reason and act based on the situation in the environment.
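One way to picture the reason and act cycle is as a loop that reads each model reply and decides whether it requests a tool call or contains the final answer. The sketch below shows only that classification step, using regular expressions over the Action / Action Input / Final Answer labels. It is an illustration and not LangChain's actual output parser. The function parse_step and the two hand-written replies are hypothetical.

```python
import re

ACTION_RE = re.compile(r'Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<input>.+)')
FINAL_RE = re.compile(r'Final Answer:\s*(?P<answer>.+)', re.DOTALL)


def parse_step(llm_text: str):
    """Classify one model reply as either a tool call or a final answer."""
    final = FINAL_RE.search(llm_text)
    if final:
        return ('final', final.group('answer').strip())
    action = ACTION_RE.search(llm_text)
    if action:
        return ('action', action.group('tool').strip(), action.group('input').strip())
    raise ValueError('Reply has neither an Action nor a Final Answer')


# Hypothetical model replies, written by hand for illustration
step_1 = 'Thought: I should check the OS type\nAction: execute_shell_command\nAction Input: echo $OSTYPE'
step_2 = 'Thought: I now know the final answer\nFinal Answer: The system is running Linux.'

print(parse_step(step_1))
print(parse_step(step_2))
```

In a full agent loop, an 'action' result would trigger the named tool, and the tool's output would be appended to the prompt as an Observation before the model is asked again.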

For this recipe, we will demonstrate a virtual agent that mimics the behavior of a basic sysadmin !!!

The following code snippet creates a custom tool for executing shell commands, creates a ReAct prompt template, creates a ReAct agent that will use the LLM model and the custom tool, creates an instance of AgentExecutor that enables the multi-step reasoning process for the ReAct agent, and invokes the virtual agent to get specific answers to the questions from the LLM model:

import subprocess
from langchain_core.prompts import PromptTemplate
from langchain_ollama import ChatOllama
from langchain.agents import tool
from langchain.agents import AgentExecutor, create_react_agent

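# The @tool decorator turns the function into a tool the agent can call by name
@tool
def execute_shell_command(command: str) -> str:
  """Tool to execute shell commands"""
  print(f'Executing shell command: {command}')
  try:
    result = subprocess.run(command, shell=True, check=True, text=True, capture_output=True)
    if result.returncode != 0:
      return f'Error executing shell command - {command}'
    return result.stdout
  except subprocess.CalledProcessError as e:
    # Report the failure back to the agent instead of raising
    return f'Error executing shell command - {command}: {e}'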

react_template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat no more than N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought: {agent_scratchpad}'''

react_prompt_template = PromptTemplate.from_template(react_template)

tools = [execute_shell_command]

react_agent = create_react_agent(ollama_chat_llm, tools, react_prompt_template, stop_sequence=True)

agent_executor = AgentExecutor(agent=react_agent, tools=tools, max_iterations=5, max_execution_time=30, verbose=True, handle_parsing_errors=True)

question_1 = 'Can you find what operating system is running on this system?'

print(agent_executor.invoke({'input': question_1}))


question_2 = 'Can you all the network interfaces from this system?'

print(agent_executor.invoke({'input': question_2}))


question_3 = 'Linux system seems a bit sluggish, can you help find which processes maybe consuming system resources?'

print(agent_executor.invoke({'input': question_3}))

Executing the above Python code snippet generates the following trimmed output:


> Entering new AgentExecutor chain...
INFO 2025-01-12 07:43:40,853 - Executing shell command: echo $OSTYPE
> Finished chain.
{'input': 'Can you find what operating system is running on this system?',
 'output': 'The operating system running on this system is Linux.'}
> Entering new AgentExecutor chain
INFO 2025-01-12 07:46:04,836 - Executing shell command: ifconfig | grep 'inet addr:' | awk '{print $2}' | xargs ip addr show
> Finished chain.
{'input': 'Can you all the network interfaces from this system?',
 'output': 'The network interfaces from this system are:\n1: lo (loopback) with IP address and netmask 65536\n2: enp42s0 (Ethernet) with IP address, MAC address 00:00:00:00:00:00, and netmask 1500\n3: wlo1 (Wireless LoRaWAN) with IP address, MAC address 11:11:11:11:11:11, and netmask 64\n4: docker0 (Docker network interface) with IP address'}
> Entering new AgentExecutor chain
INFO 2025-01-12 07:49:11,068 - Executing shell command: ps aux | grep -i 'cpu' | awk '{print $2}' | sort | uniq -c
> Finished chain.
{'input': 'Linux system seems a bit sluggish, can you help find which processes maybe consuming system resources?',
 'output': 'The processes consuming the most system resources on your Linux system are:\n\n1. 105 CPU usage\n2. 123 CPU usage\n3. 154 CPU usage\n4. 20 CPU usage\n5. 21 CPU usage\n6. 27 CPU usage\n7. 33 CPU usage\n8. 3325 CPU usage\n9. 3660 CPU usage\n10. 39 CPU usage\n11. 4474 CPU usage\n12. 45 CPU usage\n13. 4503 CPU usage\n14. 4537 CPU usage\n15. 51 CPU usage\n16. 5129 CPU usage\n17. 5131 CPU usage\n18. 55 CPU usage\n19. 57 CPU usage\n20. 63 CPU usage\n21. 69 CPU usage\n22. 75 CPU usage\n23. 755 CPU usage\n24. 81 CPU usage\n25. 87 CPU usage\n26. 93 CPU usage\n27. 99 CPU usage\n28. PID: 1'}

We have successfully demonstrated the Python code snippet for Recipe 6.
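A note on the shell tool's error handling: when subprocess.run is called with check=True, any non-zero exit status raises CalledProcessError, so failures are reported through the except branch rather than through a return code check. The sketch below demonstrates that behavior in isolation. The helper name run_shell is hypothetical, and the commands assume a POSIX-style shell.

```python
import subprocess


def run_shell(command: str) -> str:
    """Run a shell command and return its stdout, or an error string on failure."""
    try:
        # check=True raises CalledProcessError on any non-zero exit status
        result = subprocess.run(command, shell=True, check=True, text=True, capture_output=True)
        return result.stdout
    except subprocess.CalledProcessError as e:
        return f'Error executing shell command - {command} (exit status {e.returncode})'


print(run_shell('echo hello'))
print(run_shell('exit 3'))
```

Returning an error string instead of raising lets the agent see the failure as an Observation and try a different command. Keep in mind that running model-generated commands with shell=True is only appropriate on a machine where arbitrary command execution is acceptable.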

Recipe 7 : Image Processing using Multi Model

A Multi Model (more commonly called a multimodal model) is an AI model that can process multiple types of data, such as text, images, and audio.

For this recipe, we will demonstrate the Optical Character Recognition (OCR) capabilities of the Multi Model by processing the image of a Receipt of Transactions !!!

The following code snippet defines a method to convert a JPG image to base64 string, defines a method to create a chat prompt, creates an instance of chat running the multi model, creates a chain using the chat prompt and the chat instance, and finally invokes the chain passing in the prompt text and the image to process:

from langchain_core.messages import HumanMessage
from io import BytesIO
from PIL import Image

import base64

def jpg_to_base64(image):
  jpg_buffer = BytesIO()
  pil_image = Image.open(image)
  pil_image.save(jpg_buffer, format='JPEG')
  return base64.b64encode(jpg_buffer.getvalue()).decode('utf-8')

def create_chat_prompt(data):
  text = data['text']
  image = data['image']

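  image_part = {
    'type': 'image_url',
    'image_url': f'data:image/jpeg;base64,{image}',
  }

  content_parts = []

  text_part = {'type': 'text', 'text': text}

  # The message carries both the image and the text instruction
  content_parts.append(image_part)
  content_parts.append(text_part)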

  return [HumanMessage(content=content_parts)]

ollama_chat_vlm = ChatOllama(base_url=ollama_base_url, model=ollama_vl_model, temperature=llm_temperature)

vl_chain = create_chat_prompt | ollama_chat_vlm

result10 = vl_chain.invoke({'text': 'Itemize all the transactions from the receipt image in detail', 
                            'image': jpg_to_base64(receipt_image)})

print(result10.content)


Executing the above Python code snippet generates the following trimmed output:


Okay, here's a detailed breakdown of all the transactions from the receipt image:

**Darth Vader #1234: Transactions**

*   Feb 17: AMAZON MKTPL*N606Z9AF3Amzn.com/billWA - $9.87
*   Feb 17: AMAZON MKTPL*L89WB2J13Amzn.com/billWA - $29.99

**Rey Skywalker #9876: Transactions**

*   Feb 17: TJMAX*0224LREYNCEVILLENJ - $21.31
*   Feb 17: WEGMANS*93PRINCETONNJ - $17.79
*   Feb 17: TJ MAX*82BEAST WINDSORNJ - $90.58
*   Feb 17: TRADER JOE S*607PRINCETONNJ - $2.69
*   Feb 18: WEGMANS*93PRINCETONNJ - $19.35

Let me know if you'd like me to perform any calculations or have any other questions about the receipt!

We have successfully demonstrated the Python code snippet for Recipe 7.
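The image reaches the model in Recipe 7 as a base64 data URL embedded in the message content. The sketch below shows that encoding step on its own, without PIL or a real image file. The bytes in fake_jpeg_bytes are a stand-in made up for illustration; a real call would use the contents of the JPEG file.

```python
import base64

# Stand-in bytes for illustration; a real call would read the JPEG file contents
fake_jpeg_bytes = b'\xff\xd8\xff\xe0' + b'not a real image'

encoded = base64.b64encode(fake_jpeg_bytes).decode('utf-8')
data_url = f'data:image/jpeg;base64,{encoded}'

message_content = [
    {'type': 'image_url', 'image_url': data_url},
    {'type': 'text', 'text': 'Itemize all the transactions from the receipt image in detail'},
]

# Decoding the payload after the comma recovers the original bytes
decoded = base64.b64decode(data_url.split(',', 1)[1])
print(data_url[:40])
print(decoded == fake_jpeg_bytes)
```

Base64 encoding grows the payload by roughly a third, so large images translate into noticeably larger requests to the model.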


Quick Primer on LangChain

LangChain Documentation

© PolarSPARC