Introducing LangSmith: Streamline Your LLM Application Development


LangSmith is an all-in-one platform for building and deploying production-grade Large Language Model (LLM) applications. It lets you closely monitor and evaluate your application, so you can ship quickly and with confidence. You don't need LangChain to use LangSmith; it works on its own.

Simplified Development and Deployment

LangSmith provides a seamless experience for every stage of the LLM application lifecycle. Whether you’re building with LangChain or not, LangSmith has got you covered.

Getting Started

  1. Install LangSmith using pip: pip install -U langsmith
  2. Create an API key on the Settings page
  3. Set up your environment with the API key and enable tracing:

export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>

# The examples below use the OpenAI API, though it's not required to use LangSmith
export OPENAI_API_KEY=<your-openai-api-key>
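
If you'd rather configure these from Python than from your shell, the same variables can be set with os.environ before you create any clients (the placeholder values are yours to supply):

import os

# Equivalent setup from within Python; set these before creating any LangSmith or OpenAI clients.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-api-key>"
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"  # only needed for the OpenAI examples below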

Logging Traces and Evaluations

LangSmith offers several ways to log traces, including the traceable decorator and the wrap_openai client wrapper shown below. You can also run evaluations with the evaluate function, which takes a system to test, a dataset of test cases, and optional evaluators to grade the results. See the Integrations page for more.

Log a Trace

import openai
from langsmith.wrappers import wrap_openai
from langsmith import traceable

# Auto-trace LLM calls in-context
client = wrap_openai(openai.Client())

@traceable # Auto-trace this function
def pipeline(user_input: str):
    result = client.chat.completions.create(
        messages=[{"role": "user", "content": user_input}],
        model="gpt-3.5-turbo"
    )
    return result.choices[0].message.content

pipeline("Hello, world!")
# Out:  Hello there! How can I assist you today?
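
The traceable decorator isn't limited to wrapped LLM clients; any Python function can be traced, and calls between traced functions show up as nested runs. Here's a minimal sketch reusing the pipeline function above (the helper names are illustrative, not part of the LangSmith API):

from langsmith import traceable

@traceable  # Traced like any other run
def format_prompt(subject: str) -> str:
    return f"Write a one-line greeting about {subject}."

@traceable  # Calls to other traceable functions appear as nested child runs
def greeting_pipeline(subject: str) -> str:
    prompt = format_prompt(subject)
    return pipeline(prompt)  # reuses the traced pipeline defined above

greeting_pipeline("LangSmith")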

Create your first evaluation

Evaluation requires a system to test, data to serve as test cases, and optionally evaluators to grade the results. Here we define a simple exact-match evaluator.

from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# Define dataset: these are your test cases
dataset_name = "Sample Dataset"
dataset = client.create_dataset(dataset_name, description="A sample dataset in LangSmith.")
client.create_examples(
    inputs=[
        {"postfix": "to LangSmith"},
        {"postfix": "to Evaluations in LangSmith"},
    ],
    outputs=[
        {"output": "Welcome to LangSmith"},
        {"output": "Welcome to Evaluations in LangSmith"},
    ],
    dataset_id=dataset.id,
)

# Define your evaluator
def exact_match(run, example):
    return {"score": run.outputs["output"] == example.outputs["output"]}

experiment_results = evaluate(
    lambda input: "Welcome " + input['postfix'], # Your AI system goes here
    data=dataset_name, # The data to predict and grade over
    evaluators=[exact_match], # The evaluators to score the results
    experiment_prefix="sample-experiment", # The name of the experiment
    metadata={
      "version": "1.0.0",
      "revision_id": "beta"
    },
)
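
Evaluators are plain functions that receive the run and the reference example and return a score, and you can pass several at once. A minimal sketch adding a second, hypothetical evaluator alongside exact_match (the non_empty name and check are illustrative, and the feedback key defaults to the function name):

# A second custom evaluator (hypothetical): checks that the prediction is non-empty.
def non_empty(run, example):
    return {"score": bool(run.outputs["output"].strip())}

evaluate(
    lambda input: "Welcome " + input["postfix"],  # same toy system as above
    data=dataset_name,
    evaluators=[exact_match, non_empty],  # both scores are recorded per example
    experiment_prefix="sample-experiment-multi-eval",
)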

LLMOps

LangSmith simplifies the LLM application development process, allowing you to focus on building and deploying high-quality AI applications.

