LangChain Basics

Rawon

October 21, 2025

🚀 LangChain Basics: From Calling an LLM to LCEL

💡 Calling an LLM

Besides hosted commercial services such as GPT and Claude, you can also run models locally with Ollama, an open-source tool.

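📦 Installing Ollama and downloading a model

Download the installer for your operating system from https://ollama.com/download and run it. Models are pulled with a single command, much like git clone, and you can browse the available models at https://ollama.com/search.

Things to consider when choosing a model:

tools: the model supports tool calling
1.5b, 7b: the parameter count (in billions); the smaller the number, the lighter the model

Example model download: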

bash

ollama pull llama3.2:1b

🐍 Setting up a virtual environment

bash

python -m venv .venv
source .venv/bin/activate

⚠️ Note: In editors such as Cursor, after creating an .ipynb file you need to select a Jupyter kernel once before the cells will run properly.

🔧 Using Ollama with LangChain

python

%pip install -q langchain-ollama

from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.2:1b")
llm.invoke("What is the capital of France?")

The result looks like this:

python

AIMessage(content='The capital of France is Paris.', additional_kwargs={}, response_metadata={'model': 'llama3.2:1b', 'created_at': '2025-10-21T01:49:25.346267Z', 'done': True, 'done_reason': 'stop'}, usage_metadata={'input_tokens': 11, 'output_tokens': 12, 'total_tokens': 23})

🤖 Using ChatGPT

You can also use an OpenAI model (the family behind ChatGPT) instead of Ollama.

python

%pip install -q langchain-openai

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
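llm.invoke("What is the capital of France?")

Without an API key, however, this raises an OpenAIError. There are two ways to fix it:

Pass the API key directly to the client (ChatOpenAI)
Set the OPENAI_API_KEY environment variable (recommended) ✅

After issuing an API key on the official OpenAI site, create a .env file in the project root and store the key there: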

plain

OPENAI_API_KEY=your-api-key-here

Then load the environment variables at the very top of your code, before the client is created:

python

%pip install -q python-dotenv

from dotenv import load_dotenv
load_dotenv()

Run the call again and the answer comes back as expected! 🎉
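To make the .env step a little less magical, here is a rough, standard-library-only sketch of the idea behind load_dotenv: read KEY=VALUE lines from a file and copy them into os.environ. The helper name load_env_file and the variable DEMO_API_KEY are made up for this illustration, and the real python-dotenv handles many more cases (quoting rules, export prefixes, variable expansion), so treat this as a sketch and not as the library's implementation.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path, override=False):
    """Hypothetical helper: parse KEY=VALUE lines and copy them into os.environ."""
    parsed = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a KEY=VALUE separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Unless override is requested, keep a variable that is already set.
        if override or key not in os.environ:
            os.environ[key] = value
        parsed[key] = value
    return parsed


# Demo with a throwaway file and a made-up variable name.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# demo file\nDEMO_API_KEY=your-api-key-here\n", encoding="utf-8")
    parsed = load_env_file(env_path)

print(parsed)                        # {'DEMO_API_KEY': 'your-api-key-here'}
print("DEMO_API_KEY" in os.environ)  # True
```

The point of going through the environment is that the key never appears in the notebook itself, so the code can be shared without leaking it.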

📝 Prompt templates

python

from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3.2:1b")
llm.invoke("What is the capital of France?")

In the code above, the string passed to invoke, "What is the capital of France?", is the prompt. A prompt is the instruction or question you send to the LLM.

LangChain accepts the following types as a prompt:

PromptValue (produced by a PromptTemplate)
str (a plain string)
list of BaseMessage (a list of messages)

🎯 Using PromptTemplate

PromptTemplate lets you define a prompt once and reuse it with different values.

python

from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate(
    template="What is the capital of {country}?",
    input_variables=["country"]
)

prompt = prompt_template.invoke({"country": "France"})
print(prompt)

llm.invoke(prompt_template.invoke({"country": "France"}))

The parts of a prompt template:

template: the prompt text, as a string
{}: a placeholder where a value is injected at run time
input_variables: the list of variable names that fill the placeholders
invoke: takes the values as a dictionary and returns the completed prompt
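The relationship between template, {} placeholders, and input_variables can be sketched with nothing but the standard library. The two helper names below (find_placeholders, fill_template) are hypothetical and this is not how PromptTemplate is implemented; it only mirrors the idea of format-style substitution with an up-front check for missing values.

```python
from string import Formatter


def find_placeholders(template):
    """Return the {name} placeholders of a format-style template, in order."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


def fill_template(template, values):
    """Fill the template, failing early if a placeholder has no value."""
    missing = [name for name in find_placeholders(template) if name not in values]
    if missing:
        raise KeyError(f"missing values for: {missing}")
    return template.format(**values)


template = "What is the capital of {country}?"
print(find_placeholders(template))                     # ['country']
print(fill_template(template, {"country": "France"}))  # What is the capital of France?
```

Because the variable names can be read straight out of the template string, listing them by hand in input_variables mostly serves as documentation and as an early check.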

💬 Working with BaseMessage

You can also pass the prompt as messages.

python

from langchain_core.messages import HumanMessage

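# invoke expects a str, a PromptValue, or a list of messages, so wrap the message in a list
llm.invoke([HumanMessage(content="What is the capital of France?")])

HumanMessage is a subclass of BaseMessage. The main BaseMessage subclasses are:

SystemMessage 🎭: defines the LLM's role or persona
HumanMessage 👤: a message from the user
AIMessage 🤖: a response from the LLM
ToolMessage 🛠️: carries the result when an agent uses a tool

An example that supplies conversation history as context: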

python

from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

llm.invoke([
    SystemMessage(content="You are a helpful assistant that can answer questions."),
    HumanMessage(content="What is the capital of France?"),
    AIMessage(content="The capital of France is Paris."),
    HumanMessage(content="What is the capital of Germany?"),
])

💡 Tip: Passing examples to the LLM as if they were earlier turns of the conversation tends to produce more accurate and consistent answers. This technique is called few-shot prompting.
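When there are more than one or two examples, writing the history by hand gets repetitive. Below is a small sketch of a helper that builds the message list from question/answer pairs. The function name build_few_shot_messages is made up for this post, and the messages are plain (role, content) tuples so the snippet runs without LangChain installed.

```python
def build_few_shot_messages(system_prompt, examples, question):
    """Return (role, content) tuples: system prompt, example turns, then the real question."""
    messages = [("system", system_prompt)]
    for example_question, example_answer in examples:
        messages.append(("human", example_question))
        messages.append(("ai", example_answer))
    messages.append(("human", question))
    return messages


messages = build_few_shot_messages(
    system_prompt="You are a helpful assistant. Answer with the city name only.",
    examples=[
        ("What is the capital of France?", "Paris"),
        ("What is the capital of Japan?", "Tokyo"),
    ],
    question="What is the capital of Germany?",
)

for role, content in messages:
    print(f"{role:>6}: {content}")
```

My understanding is that LangChain chat models accept (role, content) tuples like these as shorthand for message objects, so the list could be passed to llm.invoke(messages); treat that as an assumption and verify it against your installed version.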

🎨 ChatPromptTemplate

If you care about extensibility and maintainability, ChatPromptTemplate is the better choice.

python

from langchain_core.prompts import ChatPromptTemplate

chat_prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant!"),
    ("human", "What is the capital of {country}?")
])

chat_prompt = chat_prompt_template.invoke({"country": "France"})
print(chat_prompt)

Output:

python

messages=[
    SystemMessage(content='You are a helpful assistant!', additional_kwargs={}, response_metadata={}),
    HumanMessage(content='What is the capital of France?', additional_kwargs={}, response_metadata={})
]

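📌 Important: When a ChatPromptTemplate message contains a placeholder, write it as a (role, content) tuple. A ready-made message object such as HumanMessage(content=...) is passed through as-is, so its {placeholders} are not filled in.

🔍 Parsing the output

An AIMessage response carries a lot besides content: additional_kwargs, response_metadata, usage_metadata, and so on. Since content is usually all you need, LangChain's output parsers make it easy to extract.

✂️ Using StrOutputParser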

python

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt_template = PromptTemplate(
    template="What is the capital of {country}?",
    input_variables=["country"]
)

prompt = prompt_template.invoke({"country": "France"})
print(prompt)

ai_message = llm.invoke(prompt_template.invoke({"country": "France"}))
print(ai_message)

output_parser = StrOutputParser()
answer = output_parser.invoke(ai_message)
print(answer)

Output:

python

text='What is the capital of France?'

AIMessage(content='The capital of France is Paris.', additional_kwargs={}, response_metadata={...})

The capital of France is Paris.

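ai_message is an AIMessage object, while answer is a plain string (str).

💻 Dev tip: A string is the easiest form for a frontend to consume. A JavaScript frontend has no notion of LangChain's Python BaseMessage class, so it is better to parse the response into a string before returning it.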
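As a rough illustration of that tip, the sketch below turns a message object into a small JSON payload that any frontend can read. FakeAIMessage is a hypothetical stand-in with only the two fields used here (it is not LangChain's AIMessage), and to_frontend_payload is a made-up helper name.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for AIMessage with only the fields used below."""
    content: str
    usage_metadata: dict = field(default_factory=dict)


def to_frontend_payload(message):
    """Keep only what a frontend typically needs, serialized as a JSON string."""
    return json.dumps({
        "answer": message.content,
        "total_tokens": message.usage_metadata.get("total_tokens"),
    })


message = FakeAIMessage(
    content="The capital of France is Paris.",
    usage_metadata={"input_tokens": 11, "output_tokens": 12, "total_tokens": 23},
)
payload = to_frontend_payload(message)
print(payload)
# {"answer": "The capital of France is Paris.", "total_tokens": 23}
```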

If you want a shorter answer, adjust the prompt:

python

prompt_template = PromptTemplate(
    template="What is the capital of {country}? Return the name of the city only.",
    input_variables=["country"]
)

Output:

python

Paris

📊 Structured Output with Pydantic

Structured, JSON-style output is often needed. JsonOutputParser exists, but the model may return something that is not valid JSON, in which case parsing fails.

To receive structured data more reliably, use Pydantic.
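The failure mode is easy to reproduce with the standard library alone. The sketch below is not how JsonOutputParser works internally (the real parser is more forgiving); the replies are invented strings that stand in for model output, and try_parse_json is a made-up helper. It shows two distinct problems: text that is not JSON at all, and valid JSON whose field has the wrong type.

```python
import json


def try_parse_json(text):
    """Return the parsed object, or None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


clean_reply = '{"capital": "Paris", "population": 67000000}'
chatty_reply = 'Sure! Here is the data: {"capital": "Paris", "population": 67000000}'
wrong_type_reply = '{"capital": "Paris", "population": "about 67 million"}'

print(try_parse_json(clean_reply))    # {'capital': 'Paris', 'population': 67000000}
print(try_parse_json(chatty_reply))   # None

parsed = try_parse_json(wrong_type_reply)
print(isinstance(parsed["population"], int))  # False
```

The second problem is the one a schema catches: the reply parses fine, but population is a string where an integer was expected.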

python

from pydantic import BaseModel, Field

class CountryDetail(BaseModel):
    capital: str = Field(description="The capital of the country")
    population: int = Field(description="The population of the country")
    area: float = Field(description="The area of the country in square kilometers")
    language: str = Field(description="The language of the country")
    currency: str = Field(description="The currency of the country")

structured_llm = llm.with_structured_output(CountryDetail)

python

country_detail_prompt = PromptTemplate(
    template="""Give following information about {country}:
    - Capital
    - Population
    - Area
    - Language
    - Currency

    Return it in the structured format.
    """,
    input_variables=["country"]
)

json_ai_message = structured_llm.invoke(country_detail_prompt.invoke({"country": "France"}))

Output:

python

json_ai_message
# CountryDetail(capital='Paris', population=67000000, area=643801.0, language='French', currency='Euro (EUR)')

json_ai_message.model_dump()
# {
#   'capital': 'Paris',
#   'population': 67000000,
#   'area': 643801.0,
#   'language': 'French',
#   'currency': 'Euro (EUR)'
# }

✅ Benefit: Pydantic gives you type validation, field descriptions, and default values, which makes the output more reliable and predictable.

⛓️ LCEL (LangChain Expression Language)

Let's look again at the code written so far:

python

answer = output_parser.invoke(llm.invoke(prompt_template.invoke({"country": "France"})))

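This code runs in the following order:

prompt_template.invoke: builds the prompt
llm.invoke: generates the LLM's answer
output_parser.invoke: parses the result

All three do different jobs, yet all are called through invoke. That is because they all inherit from the Runnable class.

Runnable defines the invoke method, which returns the component's output.

💡 Key idea: LangChain is built from compositions of Runnables (executable components).

🔗 Building a chain with the pipe operator

Nested invoke calls are hard to read. LCEL lets you connect components with Python's | operator, which Runnable overloads to mean "pipe the output of the left side into the right side".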

python

chain = prompt_template | llm | output_parser

# Now a single call runs the whole pipeline
answer = chain.invoke({"country": "France"})

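Data flows from left to right, which makes the code much easier to read:

prompt_template: takes the input and builds the prompt
llm: takes the prompt and generates the AI response
output_parser: takes the AI response and parses it into a string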

🎯 Bonus: A chain is itself a Runnable, so it can be invoked and also used as part of another chain. This lets you build complex pipelines out of small modules.

Example:

python

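# Several chains can be combined.
# input_parser and data_cleaner are placeholder Runnables that are not defined in this post;
# data_cleaner must return a dict such as {"country": ...} so that prompt_template can use it.
preprocessing_chain = input_parser | data_cleaner
main_chain = preprocessing_chain | prompt_template | llm | output_parser

result = main_chain.invoke(raw_input)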
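To see why the pipe syntax works at all, here is a toy sketch of the mechanism in plain Python. MiniRunnable is a made-up class, not LangChain's Runnable, and the three components are fake stand-ins (no model is called); the only point is that overloading __or__ lets a | b return a new object whose invoke feeds the output of a into b.

```python
class MiniRunnable:
    """Toy stand-in for a Runnable: wraps a function and exposes invoke()."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new runnable that feeds a's output into b.
        return MiniRunnable(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt template, the LLM, and the output parser.
prompt = MiniRunnable(lambda d: f"What is the capital of {d['country']}?")
fake_llm = MiniRunnable(lambda text: {"content": f"(answer to: {text})"})
parser = MiniRunnable(lambda message: message["content"])

chain = prompt | fake_llm | parser
print(chain.invoke({"country": "France"}))
# (answer to: What is the capital of France?)

# The chain is itself a MiniRunnable, so it can be extended like any other component.
shouting_chain = chain | MiniRunnable(str.upper)
print(shouting_chain.invoke({"country": "France"}))
# (ANSWER TO: WHAT IS THE CAPITAL OF FRANCE?)
```

The real Runnable adds much more on top of this (batching, streaming, async), but the composition idea is the same.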

If you purchase through this link, I may earn a commission. 🤗

https://inf.run/iBxHp
