Common LangChain Recipes
Bhaskar S | 12/28/2024
Overview
In the article on LangChain, we covered the basics of the LangChain framework and how to get started with it. In this article, we will cover some common code recipes for working with LangChain.
Installation and Setup
The installation and setup will be on an Ubuntu 24.04 LTS based Linux desktop. Ensure that Ollama is installed and set up on the desktop (see instructions).
In addition, ensure that the Python 3.x programming language as well as the Jupyter Notebook package are installed and set up on the desktop.
Finally, ensure that all the LangChain packages are properly set up on the desktop (see instructions).
For the LLM model, we will be using the recently released IBM Granite 3.1 MoE 1B model.
Open a new terminal window and execute the following docker command to download the LLM model:
$ docker exec -it ollama ollama run granite3.1-moe:1b
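To optionally confirm that the model has been downloaded and is available to Ollama, one can list the locally installed models as follows:

$ docker exec -it ollama ollama list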
With this, we are ready to showcase the common code recipes using LangChain.
Code Recipes for LangChain
Create a file called .env with the following environment variables defined:
LLM_TEMPERATURE=0.0
OLLAMA_MODEL='granite3.1-moe:1b'
OLLAMA_BASE_URL='http://192.168.1.25:11434'
CHROMA_DB_DIR='/home/bswamina/.chromadb'
GPU_DATASET='./data/gpu_specs.csv'
To load the environment variables and assign them to Python variables, execute the following code snippet:
from dotenv import load_dotenv, find_dotenv
import os

load_dotenv(find_dotenv())

llm_temperature = float(os.getenv('LLM_TEMPERATURE'))
ollama_model = os.getenv('OLLAMA_MODEL')
ollama_base_url = os.getenv('OLLAMA_BASE_URL')
gpu_dataset = os.getenv('GPU_DATASET')
chroma_db_dir = os.getenv('CHROMA_DB_DIR')
Executing the above Python code snippet generates no output.
To initialize an instance of Ollama running our desired LLM model granite3.1-moe:1b, execute the following code snippet:
from langchain_ollama import OllamaLLM

ollama_llm = OllamaLLM(base_url=ollama_base_url, model=ollama_model, temperature=llm_temperature)
Executing the above Python code snippet generates no output.
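As an optional sanity check (assuming the Ollama server referenced in the .env file is up and reachable), one could make a direct, one-off call to the model before building any chains, as in the following sketch:

# Optional sanity check - a direct, one-off call to the LLM model
print(ollama_llm.invoke('Say hello in exactly five words'))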
To initialize an instance of the vector embeddings class corresponding to the model running in Ollama, execute the following code snippet:
from langchain_ollama import OllamaEmbeddings

ollama_embedding = OllamaEmbeddings(base_url=ollama_base_url, model=ollama_model)
Executing the above Python code snippet generates no output.
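To optionally verify that the embeddings class is working, one could embed a sample query and inspect the size of the returned vector (the vector length depends on the chosen model), as in the following sketch:

# Optional check - embed a sample text and inspect the vector dimension
sample_vector = ollama_embedding.embed_query('what is a gpu ?')
print(len(sample_vector))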
Recipe 1 : Execute a simple LLM prompt
The following code snippet creates a simple prompt template, an LLM chain, and executes the chain to get a response from the LLM model:
from langchain_core.prompts import PromptTemplate

template = """
Question: {question}

Answer: Summarize in less than {tokens} words.
"""

prompt = PromptTemplate.from_template(template=template)

chain = prompt | ollama_llm

result = chain.invoke({'question': 'describe langchain ai framework', 'tokens': 50})

print(result)
Executing the above Python code snippet generates the following typical output:
Langchain is an open-source AI framework that enables developers to build and deploy language models for various applications, including chatbots, content generation, and translation services. It supports multiple languages and offers a modular architecture for customization.
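Note that the same chain could also be invoked in streaming mode, which emits the response incrementally as it is generated. The following is a minimal sketch of that variation:

# Stream the response chunks as they are generated by the LLM model
for chunk in chain.stream({'question': 'describe langchain ai framework', 'tokens': 50}):
    print(chunk, end='', flush=True)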
We have successfully demonstrated the Python code snippet for Recipe 1.
Recipe 2 : Generate Structured Output from LLM
The following code snippet creates an LLM chat instance, a data class that defines the structured output, a chain using the data class as the schema, and executes the chain to get a structured response from the LLM model:
from pydantic import BaseModel
from langchain_ollama import ChatOllama

class GpuSpecs(BaseModel):
    name: str
    vram: int
    cuda_cores: int

ollama_struct_llm = ChatOllama(base_url=ollama_base_url, model=ollama_model, format='json', temperature=llm_temperature)

structured_llm = ollama_struct_llm.with_structured_output(GpuSpecs)

result2 = structured_llm.invoke('Get the GPU specs for RTX 4070 Ti')

print(result2)
Executing the above Python code snippet generates the following typical output:
name='RTX 4070 Ti' vram=128 cuda_cores=128
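Since the result is an instance of the GpuSpecs data class, the individual fields can be accessed directly (or the object converted to a plain dictionary), as shown in the following sketch:

# Access the individual fields of the structured result
print(result2.name, result2.vram, result2.cuda_cores)

# Or convert the pydantic object to a plain dictionary
print(result2.model_dump())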
We have successfully demonstrated the Python code snippet for Recipe 2.
Recipe 3 : Chat with LLM preserving History
The following code snippet creates a chat session, an in-memory chat history store, an LLM chain using the history store, and executes the chain to get responses from the LLM model:
from langchain_ollama import ChatOllama
from langchain_core.chat_history import BaseChatMessageHistory, InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

ollama_chat_llm = ChatOllama(base_url=ollama_base_url, model=ollama_model, temperature=llm_temperature)

in_memory_store = {}

def get_chat_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in in_memory_store:
        in_memory_store[session_id] = InMemoryChatMessageHistory()
    return in_memory_store[session_id]

prompt2 = ChatPromptTemplate.from_messages([
    ('system', 'Helpful AI assistant!'),
    MessagesPlaceholder(variable_name='chat_history'),
    ('human', '{input}')
])

config = {'configurable': {'session_id': 'recipes'}}

chain2 = prompt2 | ollama_chat_llm

chain2_with_history = RunnableWithMessageHistory(chain2, get_chat_history, input_messages_key='input', history_messages_key='chat_history')

result3 = chain2_with_history.invoke({'input': 'Suggest top 3 budget GPUs in one line'}, config=config)
print(result3.content)

print('-------------------------')

result4 = chain2_with_history.invoke({'input': 'Not impressed, try again'}, config=config)
print(result4.content)
Executing the above Python code snippet generates the following typical output:
1. NVIDIA GeForce RTX 3060: High-performance GPU for gaming and professional use, with competitive price.
2. AMD Radeon RX 5600 XT: Affordable option offering excellent single-GPU performance.
3. NVIDIA GeForce GTX 1650 Super: Balanced budget choice providing good gaming performance.
-------------------------
1. NVIDIA GeForce RTX 3060 Ti: High-end GPU for demanding tasks and professional use, with excellent price-performance ratio.
2. AMD Radeon RX 5700 XT: Budget-friendly option offering competitive single-GPU performance in gaming and professional applications.
3. NVIDIA GeForce GTX 1660 Super: Balanced budget choice providing good gaming performance at a lower price point.
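To confirm that the conversation is indeed being preserved, one could optionally peek at the accumulated messages in the in-memory history store for the session, as in the following sketch:

# Inspect the accumulated chat history for the 'recipes' session
for message in get_chat_history('recipes').messages:
    print(f'{message.type}: {message.content[:80]}')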
We have successfully demonstrated the Python code snippet for Recipe 3.
Recipe 4 : Q&A on a CSV file content using LLM
For this recipe, we will create a CSV file called gpu_specs.csv that will contain the specs for a handful of popular GPU cards. The following lists the contents of the CSV file:
manufacturer,productName,memorySize,memoryClock,gpuChip
NVIDIA,GeForce RTX 4060,12,5888,AD104
NVIDIA,GeForce RTX 4070,12,7680,AD104
NVIDIA,GeForce RTX 4080,16,9728,AD103
NVIDIA,GeForce RTX 4090,24,17408,AD102
AMD,Radeon RX 6950 XT,16,5120,Navi 21
AMD,Radeon RX 7700 XT,8,4096,Navi 33
AMD,Radeon RX 7800 XT,12,8192,Navi 32
AMD,Radeon RX 7900 XT,16,12288,Navi 31
NVIDIA,GeForce RTX 3060 Ti,8,4864,GA103S
NVIDIA,GeForce RTX 3070 Ti,8,5632,GA104
NVIDIA,GeForce RTX 3080,12,8960,GA102
NVIDIA,GeForce RTX 3090 Ti,24,10752,GA102
AMD,Radeon RX 6400,4,768,Navi 24
AMD,Radeon RX 6500 XT,4,1024,Navi 24
AMD,Radeon RX 6650 XT,8,2048,Navi 23
AMD,Radeon RX 6750 XT,12,2560,Navi 22
AMD,Radeon RX 6850M XT,12,2560,Navi 22
Intel,Arc A770,16,4096,DG2-512
Intel,Arc A780,16,4096,DG2-512
The following code snippet creates a CSV file loader, loads the rows from the file as documents into an in-memory vector store using the embedding class, creates a prompt template with the data from the CSV file as the context, creates a Q&A LLM chain, and executes the chain to get answers to questions from the LLM model:
from langchain_core.documents import Document
from langchain.document_loaders import CSVLoader
from langchain_core.vectorstores import InMemoryVectorStore
from langchain.chains import RetrievalQA

loader = CSVLoader(gpu_dataset, encoding='utf-8')

encoding = '\ufeff'
newline = '\n'
comma_space = ', '

documents = [Document(metadata=doc.metadata, page_content=doc.page_content.lstrip(encoding).replace(newline, comma_space))
             for doc in loader.load()]

vector_store = InMemoryVectorStore(ollama_embedding)
vector_store.add_documents(documents)

template2 = """
Given the following context, answer the question based only on the provided context.

Context: {context}

Question: {question}
"""

prompt3 = PromptTemplate.from_template(template2)

retriever = vector_store.as_retriever()

qa_chain = RetrievalQA.from_chain_type(llm=ollama_llm, retriever=retriever, chain_type_kwargs={'prompt': prompt3})

result5 = qa_chain.invoke({'query': 'what is the memory size on GeForce RTX 3070 Ti'})
print(result5)

print('-------------------------')

result6 = qa_chain.invoke({'query': 'who is the manufacturer of rx 7800 xt'})
print(result6)

print('-------------------------')

result7 = qa_chain.invoke({'query': 'what is the GPU chip on RTX 3090 Ti'})
print(result7)
Executing the above Python code snippet generates the following typical output:
{'query': 'what is the memory size on GeForce RTX 3070 Ti', 'result': 'The memory size on GeForce RTX 3070 Ti is 8GB.'}
-------------------------
{'query': 'who is the manufacturer of rx 7800 xt', 'result': 'The manufacturer of RX 7800 XT is AMD.'}
-------------------------
{'query': 'what is the GPU chip on RTX 3090 Ti', 'result': "The GPU chip on NVIDIA's GeForce RTX 3090 Ti is GA102."}
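To see which CSV rows the retriever actually surfaces for a given question, one could optionally query the in-memory vector store directly, as in the following sketch:

# Inspect the top matching documents returned by the vector store
matches = vector_store.similarity_search('memory size of GeForce RTX 3070 Ti', k=2)
for doc in matches:
    print(doc.page_content)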
We have successfully demonstrated the Python code snippet for Recipe 4.
Recipe 5 : Q&A on a PDF file content using LLM
For this recipe, we will use the NVIDIA 3rd Quarter 2024 financial report and analyze its contents !!!
Also, ensure that the additional Python module(s) are installed by executing the following command:
$ pip install pypdf
The following code snippet creates a PDF file loader, loads the page chunks from the PDF file as embedded documents into the persistent vector store Chroma using the embedding class, creates a prompt template with the data from the PDF file as the context, creates a Q&A LLM chain, and executes the chain to get answers to questions from the LLM model:
from langchain.document_loaders import PyPDFLoader
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA

nvidia_q3_2024 = './data/NVIDIA-3rd-Qtr.pdf'

pdf_loader = PyPDFLoader(nvidia_q3_2024)
pdf_pages = pdf_loader.load_and_split()

vector_store2 = Chroma(collection_name='pdf_docs', embedding_function=ollama_embedding, persist_directory=chroma_db_dir)
vector_store2.add_documents(pdf_pages)

template3 = """
Given the following context, answer the question based only on the provided context.

Context: {context}

Question: {question}
"""

prompt4 = PromptTemplate.from_template(template3)

retriever2 = vector_store2.as_retriever()

qa_chain2 = RetrievalQA.from_chain_type(llm=ollama_llm, retriever=retriever2, chain_type_kwargs={'prompt': prompt4})

result8 = qa_chain2.invoke({'query': 'what was the revenue in q3 2024'})
print(result8)

print('-------------------------')

result9 = qa_chain2.invoke({'query': 'what were the expenses in q3 2023'})
print(result9)
Executing the above Python code snippet generates the following typical output:
{'query': 'what was the revenue in q3 2024', 'result': 'The revenue for Q3 2024 is $17,475 million.'}
-------------------------
{'query': 'what were the expenses in q3 2023', 'result': "In Q3 FY23, NVIDIA's operating expenses increased by 16% compared to Q2 FY23. The specific increase was $298 million from $2,576 million to $2,874 million. This growth in expenses is part of the company's strategy to support its growth engines such as GPUs, CPUs, networking, AI foundry services, and NVIDIA AI Enterprise software."}
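Since the embedded documents are persisted in the Chroma directory, a later session could re-open the same collection without re-processing the PDF file. The following is a minimal sketch, assuming the same embedding model is used:

# Re-open the persisted Chroma collection in a new session
vector_store3 = Chroma(collection_name='pdf_docs', embedding_function=ollama_embedding, persist_directory=chroma_db_dir)

# Verify the persisted documents are searchable
print(vector_store3.similarity_search('q3 2024 revenue', k=1))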
We have successfully demonstrated the Python code snippet for Recipe 5.
References