Step-by-Step Guide: Building an AI Agent for Earnings Call Summaries

Overview and Goals

In this guide, we’ll create an AI agent that automatically summarizes stock earnings call transcripts and highlights key financial points (like stock price and P/E ratio), then provides a brief recommendation.

The agent will use LangChain (for chaining LLM calls and tools), OpenAI’s LLM (for summarization and analysis), and Streamlit (for a simple web interface).

By the end, the agent will:

Summarize key points from an earnings call transcript (e.g., revenue, guidance, management commentary).
Fetch real-time data (such as the company’s current stock price and P/E ratio) via API calls to enrich the summary.
Highlight important metrics in the summary (emphasizing figures such as the stock price and P/E ratio) and provide a recommendation or insight based on the earnings performance and valuation.
Display the results in a web interface for easy reading.

Why this is useful: Earnings call transcripts can be very long, making it hard to extract the key takeaways. An AI agent can condense these transcripts into a concise summary and augment it with up-to-date stock data for deeper insight, saving analysts hours of work.

1. Project Setup and Dependencies

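First, ensure your environment is set up with the necessary tools: a recent Python 3 release (LangChain has not supported Python 3.7 for some time, so 3.9 or later is a safe choice), LangChain, access to OpenAI’s API, and Streamlit.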

Install the required libraries:

pip install langchain openai streamlit tiktoken yfinance requests

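This will install LangChain (which provides chaining and agent tools), the OpenAI client (for the LLM), Streamlit (for the UI), tiktoken (for token counting), plus yfinance and requests for fetching stock data. (yfinance is an easy way to get the stock price and P/E ratio, but any financial API of your choice will do.) The code in this guide uses LangChain’s classic import paths (e.g., langchain.chat_models); newer releases have moved some of these into separate packages such as langchain-openai, so pin an older version or adjust the imports if you see import errors.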

Configuration: Obtain an OpenAI API key and expose it through an environment variable (such as OPENAI_API_KEY), or let the user enter it in the Streamlit app; avoid hard-coding it in source files. If you use a financial data API (like Alpha Vantage, Financial Modeling Prep, etc.), get its API key as well and note the endpoints you need.
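As a minimal sketch of the configuration step, the helper below reads a key from the environment and fails with a clear message when it is missing. The function name load_api_key and the variable name OPENAI_API_KEY are assumptions chosen for illustration; use whatever names fit your setup.

```python
import os


def load_api_key(name: str, env=None) -> str:
    """Return the API key stored under `name`, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"Missing API key: set the {name} environment variable.")
    return key


# Example usage (assumes the variable was exported in your shell beforehand):
# OPENAI_API_KEY = load_api_key("OPENAI_API_KEY")
```

Failing early here gives a clearer error than an authentication failure deep inside an LLM call.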

This guide assumes you have an “earnings call API” (referred to as kreate-earning-call-api) which provides new transcripts daily; we will use it as the data source.

2. Fetching the Earnings Call Transcript

Next, retrieve the earnings call transcript from your data source. Given you have an API for earnings calls, you can use Python’s requests to get the latest transcript.

For example, if the API provides a JSON with the transcript content:

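import requests

# Placeholder endpoint: replace with your actual earnings call API
API_URL = "https://your-api-endpoint.com/latest_earnings_call?symbol=XYZ"
response = requests.get(API_URL, timeout=30)
response.raise_for_status()  # fail early on HTTP errors (4xx/5xx)
data = response.json()
transcript_text = data.get("transcript")  # assuming the JSON has a 'transcript' field
if not transcript_text:
    raise ValueError("No transcript found in the API response")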

Replace API_URL with your actual endpoint (e.g., your kreate-earning-call-api). The field name varies by provider: the example above assumes "transcript", while many earnings call APIs return the transcript text under a key like "content".

If your API requires specifying the company and quarter, provide those parameters. For instance, with FMP’s API you might call:

# Example for NVIDIA Q2 2024 using FMP API (if using that service)
url = "https://financialmodelingprep.com/api/v3/earning_call_transcript/NVDA?quarter=2&year=2024&apikey=YOUR_API_KEY"
results = requests.get(url, timeout=30).json()  # a list of matching transcripts
transcript_text = results[0]["content"] if results else ""  # empty if nothing matched

Ensure you handle the API response format according to your provider. Once you have transcript_text (which could be a very long string, often thousands of words), you may want to do some light preprocessing – e.g., removing excess whitespace or separating sections.
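The light preprocessing mentioned above could look like the following sketch. The function name clean_transcript and the specific rules (collapse repeated spaces, keep at most one blank line) are assumptions; adjust them to how your provider formats transcripts.

```python
import re


def clean_transcript(text: str) -> str:
    """Collapse repeated spaces/tabs and keep at most one blank line in a row."""
    text = text.replace("\r\n", "\n")        # normalize Windows line endings
    text = re.sub(r"[ \t]+", " ", text)      # collapse horizontal whitespace
    text = re.sub(r" ?\n ?", "\n", text)     # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)   # at most one blank line
    return text.strip()
```

Keeping paragraph breaks intact matters later, because separator-based text splitters rely on them to find chunk boundaries.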

Earnings call transcripts often have sections like Prepared Remarks and Q&A. You can split the text by a delimiter (like "Question-and-Answer") if needed to handle each part separately. However, for simplicity, we will feed the whole transcript (or chunks of it) into the summarizer.
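If you do want to handle the two parts separately, a sketch along these lines may help. The helper name split_sections is hypothetical, and the marker text is an assumption: check how your transcripts actually label the Q&A portion.

```python
import re


def split_sections(transcript: str, marker: str = "Question-and-Answer"):
    """Split a transcript into (prepared_remarks, qa) at the first marker.

    Returns (transcript, "") when the marker is not found.
    """
    match = re.search(re.escape(marker), transcript, flags=re.IGNORECASE)
    if match is None:
        return transcript.strip(), ""
    prepared = transcript[:match.start()].strip()
    qa = transcript[match.start():].strip()
    return prepared, qa
```

Each part can then be summarized on its own, which makes it easier to keep management’s prepared statements apart from analyst questions.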

3. Summarizing the Earnings Call with OpenAI (LangChain)

To condense the transcript, we leverage LangChain’s summarization capabilities with an OpenAI model (like GPT-4 or GPT-3.5).

Earnings call transcripts can be very long (several pages of text), so the key is to split the text into manageable chunks and summarize iteratively. LangChain provides utilities to do exactly this:

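Split the text: Use a text splitter to break the transcript into chunks (e.g., 1,000-2,000 characters each). LangChain’s CharacterTextSplitter can do this for you; note that it splits on a separator (a blank line by default), so a chunk can exceed the target size when the transcript has few paragraph breaks. Chunking keeps each request within the model’s token limit.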
Create documents: Wrap each chunk in a Document object (LangChain’s standard format for text data).
Run summarization chain: Use LangChain’s load_summarize_chain with a large language model to summarize the documents. A common choice is the "map_reduce" chain, which will summarize chunks and then summarize the summaries. This yields a concise summary of the entire transcript.

In code, this process might look like:

import os

from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document
from langchain.chains.summarize import load_summarize_chain
from langchain.chat_models import ChatOpenAI

# Initialize the OpenAI LLM; the key is read from the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, openai_api_key=os.environ["OPENAI_API_KEY"])

# Split the transcript into chunks
text_splitter = CharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
texts = text_splitter.split_text(transcript_text)

# Create Document objects for each chunk
docs = [Document(page_content=t) for t in texts]

# Load a summarization chain (using map-reduce strategy)
summarize_chain = load_summarize_chain(llm, chain_type="map_reduce")

# Generate the summary
summary = summarize_chain.run(docs)

This chain handles large texts by first summarizing each chunk (the map step) and then combining those summaries into a final summary (the reduce step). Note that the splitting and the Document wrapping happen in our own code before the chain runs; the chain only sees the list of documents passed to summarize_chain.run().

After this step, summary will contain a few paragraphs (or a set of bullet points) highlighting the key points from the call – for example, it might note “Revenue grew 10% year-over-year, beating estimates”, “Management raised full-year guidance”, “Analysts asked about supply chain issues in Q&A”, etc. This is already valuable on its own, condensing pages of transcript into the main takeaways.

Tip: To improve structure, you can prompt the LLM to organize the summary (e.g., as bullet points or sections like TL;DR, Outlook, Risks). A fixed structure makes it less likely that important details are missed. But even a straightforward summary will capture most key points.
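One way to apply this tip is to build the final summarization prompt from a fixed list of headings, as in the sketch below. The heading names and the function build_summary_prompt are illustrative assumptions. In classic LangChain versions, a template like this can typically be wrapped in a PromptTemplate and passed to load_summarize_chain as the combine prompt, but verify that against the version you have installed.

```python
SECTIONS = ("TL;DR", "Results", "Outlook", "Risks")  # illustrative headings


def build_summary_prompt(text: str, sections=SECTIONS) -> str:
    """Return a prompt asking for a bulleted summary under fixed headings."""
    headings = "\n".join(f"- {name}" for name in sections)
    return (
        "Summarize the earnings call notes below.\n"
        "Organize the answer under these headings, using short bullet points:\n"
        f"{headings}\n"
        "Only include figures that appear in the notes.\n\n"
        f"NOTES:\n{text}"
    )
```

Asking the model to stick to figures that appear in the notes is a small guard against invented numbers, which matters for financial summaries.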

4. Integrating Real-Time Stock Data (Price & P/E Ratio)

With the earnings call summarized, the next step is to highlight important financial metrics that put the earnings in context. Specifically, we want to fetch the company’s current stock price and Price-to-Earnings (P/E) ratio. These give immediate insight into how the market values the company post-earnings.

We’ll use an external API or library to get this data:

Stock Price: You can use Yahoo Finance data via the yfinance library. For example, yfinance.Ticker("XYZ").info.get("currentPrice") returns the latest quoted price for ticker XYZ when Yahoo provides that field (quotes may be delayed).
P/E Ratio: Many APIs provide the trailing P/E ratio. With yfinance, you can usually get it from Ticker(...).info.get("trailingPE"); the field can be missing for some tickers, so handle that case.
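Because these fields are not guaranteed to be present, it can help to read them defensively. The sketch below works on any dictionary shaped like the yfinance info mapping; the helper names read_metric and describe_metrics are hypothetical, and the key names are the ones used in this guide.

```python
from typing import Any, Mapping, Optional


def read_metric(info: Mapping[str, Any], key: str) -> Optional[float]:
    """Return info[key] as a float, or None if it is missing or not numeric."""
    value = info.get(key)
    try:
        return None if value is None else float(value)
    except (TypeError, ValueError):
        return None


def describe_metrics(ticker: str, info: Mapping[str, Any]) -> str:
    """Format price and trailing P/E, marking missing values as unavailable."""
    price = read_metric(info, "currentPrice")
    pe = read_metric(info, "trailingPE")
    price_text = f"${price:.2f}" if price is not None else "unavailable"
    pe_text = f"{pe:.2f}" if pe is not None else "unavailable"
    return f"{ticker}: price {price_text}, trailing P/E {pe_text}"
```

With yfinance installed and network access available, you would call it as describe_metrics("AAPL", yf.Ticker("AAPL").info).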

Using LangChain Tools

We can wrap these API calls into LangChain tools so the LLM agent can invoke them when needed. We can implement a custom tool by subclassing BaseTool or using the @tool decorator.

Here’s a conceptual example using the @tool decorator for brevity:

from langchain.agents import tool
import yfinance as yf

@tool("GetStockPrice")
def get_stock_price(ticker: str) -> str:
    """Returns the current stock price of the given company ticker."""
    try:
        price = yf.Ticker(ticker).info.get("currentPrice")
    except Exception:
        return f"Error fetching price for {ticker}."
    if price is None:
        return f"Price for {ticker} is unavailable."
    return f"{ticker} current price is ${price:.2f}"

@tool("GetPERatio")
def get_pe_ratio(ticker: str) -> str:
    """Returns the current P/E ratio of the given company ticker."""
    try:
        pe = yf.Ticker(ticker).info.get("trailingPE")
    except Exception:
        return f"Error fetching P/E ratio for {ticker}."
    if pe is None:
        return f"P/E ratio for {ticker} is unavailable."
    return f"{ticker} P/E ratio is {pe:.2f}"

Each tool has a name and a docstring that tells the agent what it does. We deliberately leave return_direct at its default of False: setting return_direct=True would make the agent stop and return the tool’s raw output as the final answer, which would prevent it from combining the price and P/E ratio into a single analysis. We used yfinance for convenience; you could instead call your own stock data API via requests inside these functions.

Note: Ensure the ticker symbol (e.g., AAPL for Apple) is known or provided. We might extract the ticker from the transcript or have the user specify it (perhaps as part of the input in the Streamlit app). For example, if summarizing Apple’s earnings call, we know the ticker is "AAPL".
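When the ticker comes from user input, a small normalization step avoids passing malformed symbols to the tools. This is only a sketch: the function name normalize_ticker is hypothetical, and the pattern is an assumption that covers common US-style symbols (such as AAPL or BRK.B), not every exchange’s format.

```python
import re

# Assumed pattern: 1-6 letters/digits starting with a letter, optional .X or -X suffix
_TICKER_RE = re.compile(r"[A-Z][A-Z0-9]{0,5}([.-][A-Z0-9]{1,3})?")


def normalize_ticker(raw: str) -> str:
    """Uppercase and validate a user-supplied ticker symbol."""
    symbol = raw.strip().upper().lstrip("$")  # tolerate inputs like " $aapl "
    if not _TICKER_RE.fullmatch(symbol):
        raise ValueError(f"Not a plausible ticker symbol: {raw!r}")
    return symbol
```

Rejecting bad input early also keeps arbitrary text out of the prompt that the agent later receives.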

5. Building the LangChain Agent with Tools

Now we combine everything: the LLM, the summary, and the tools into a LangChain Agent. An agent allows the LLM to decide if and when to use the tools to get additional information.

We will use OpenAI’s function-calling agent type, which fits well with our custom tools. We pass our tools and LLM to initialize_agent:

from langchain.agents import initialize_agent, AgentType

# Assume llm is our ChatOpenAI model from before, and tools are [get_stock_price, get_pe_ratio]
tools = [get_stock_price, get_pe_ratio]
agent = initialize_agent(
    tools, 
    llm, 
    agent=AgentType.OPENAI_FUNCTIONS, 
    verbose=True
)

Here we choose AgentType.OPENAI_FUNCTIONS to leverage OpenAI’s function calling. We set verbose=True to see the agent’s reasoning steps in the console (useful for debugging).

Now the agent is ready to take a prompt and act on it. We want the agent to output a final report that includes the summary, the fetched data, and a recommendation. We can prompt it with something like:

“Here is the summary of {Company}’s earnings call:
{summary}

Using the summary, provide a concise analysis. Include the company’s current stock price and P/E ratio (using the available tools), and then give a brief recommendation on the stock’s outlook based on the earnings performance and these metrics.”
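The prompt above can be assembled with a small helper, sketched here. The function name build_agent_prompt is hypothetical; passing the ticker explicitly is a design choice (an assumption on our part) so that the agent does not have to guess the symbol when calling the tools.

```python
def build_agent_prompt(company: str, ticker: str, summary: str) -> str:
    """Build the analysis prompt from the company name, ticker, and summary."""
    return (
        f"Here is the summary of {company}'s earnings call:\n"
        f"{summary.strip()}\n\n"
        "Using the summary, provide a concise analysis. "
        f"Include the current stock price and P/E ratio for {ticker} "
        "(using the available tools), and then give a brief recommendation "
        "on the stock's outlook based on the earnings performance and these metrics."
    )
```

The resulting string is what you would pass to the agent, for example agent.run(build_agent_prompt("Apple", "AAPL", summary)).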

When we run the agent on such a query, the LLM will read the summary and the instruction. If it needs the stock price or P/E, it will invoke our tools. The agent then incorporates that into its answer. Because we designed the tools to return a formatted string, the LLM can directly place them in the final response.

The final answer might read like:

“Apple’s Q3 earnings call summary: The company reported strong results, with revenue growing 10% and beating expectations. Management raised future guidance, citing robust iPhone sales. AAPL’s current price is $171.21 and its P/E ratio is 29.5, indicating a fairly high valuation. Recommendation: Given the earnings beat and positive outlook, the stock appears poised for upside, but the elevated P/E suggests much optimism is priced in. Investors should remain optimistic but watch valuation.”*

* (these figures are examples)

Notice how the agent inserted the fetched price and P/E into the analysis. This approach enriches the summary with current market data (as fresh as your data source provides). It then provides a recommendation taking into account both the earnings call content and the current market metrics.

6. Developing the Streamlit Web Interface

With the backend logic ready (summarization + agent), we build a simple Streamlit app to allow users to interact with the agent and display results. Streamlit makes it easy to create a form and display text or highlighted sections.

Key elements to implement in Streamlit:

Input Selection: Let the user choose or enter the company (ticker or name). Since you have an API providing daily transcripts, you might have a dropdown of available companies or simply always load the latest transcript for a given ticker. For example, st.text_input("Enter company ticker:", "AAPL").
Trigger Summarization: A button to run the summarization (and agent) once input is provided. You might wrap the logic in an st.button("Summarize").
Loading State: This may take a few seconds to run since it involves multiple LLM calls. Use st.spinner("Loading...") to indicate work.
Display Summary and Highlights: Once the agent returns the final output, display it. You can use st.write() or st.markdown() to show text. If the agent’s output is in Markdown (you can instruct it to use bold for important metrics), Streamlit will render that.

Here’s a pseudo-code layout of the Streamlit app:

import streamlit as st

st.title("📈 Earnings Call Summarizer AI")

ticker = st.text_input("Enter Stock Ticker:", value="AAPL")

if st.button("Summarize Latest Earnings Call"):
    with st.spinner("Summarizing and analyzing..."):
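        # Fetch transcript via API (fetch_transcript is your own helper
        # built on requests, as in step 2)
        transcript_text = fetch_transcript(ticker.strip().upper())

        # Summarize the transcript (split_transcript is your own helper that
        # returns the list of Document chunks, as in step 3)
        summary = summarize_chain.run(split_transcript(transcript_text))
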
        # Run the agent to get enriched summary with recommendation
        prompt = (f"Here is the summary of {ticker}'s earnings call:\n{summary}\n\n"
                  f"Provide an analysis with the company's current stock price and P/E ratio, "
                  f"then give a recommendation based on the performance.")
        
        result = agent.run(prompt)
    
    # Display the result
    st.markdown(result)

In the interface, you would simply enter “AAPL” (for example) and click Summarize. The app will then display something like:

Summary: (e.g., “Apple had a strong quarter, beating revenue estimates by 5% and raising guidance….”)

Highlighted Metrics: (e.g., “AAPL current price is $171.21 and P/E is 29.5”)

Recommendation: (e.g., “Outlook is positive given strong performance, though valuation is high – a cautious Buy.”)

You can improve the presentation (e.g., splitting into multiple st.write calls or using Markdown headings for “Summary” vs “Recommendation”).
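As a sketch of that idea, the helper below joins the three parts under Markdown headings and skips any part that is empty. It assumes you already have the summary, metrics, and recommendation as separate strings (for example, by asking the agent to label its sections); the function name render_report is hypothetical.

```python
def render_report(summary: str, metrics: str, recommendation: str) -> str:
    """Return a Markdown report with one heading per non-empty section."""
    sections = [
        ("Summary", summary),
        ("Highlighted Metrics", metrics),
        ("Recommendation", recommendation),
    ]
    parts = [
        f"### {title}\n\n{body.strip()}"
        for title, body in sections
        if body and body.strip()
    ]
    return "\n\n".join(parts)
```

In the app, the result can be shown with st.markdown(render_report(summary, metrics, recommendation)).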

7. Testing the AI Agent

It’s important to test the agent to ensure it works as expected. Run the Streamlit app (streamlit run your_app.py) and try it on a known earnings call.

The log (if verbose is True) will show the agent’s thought process in the console. You should see it calling the tools for price and P/E. The final output should correctly include those values and a coherent summary.

Make sure:

The transcript is successfully fetched from the API.
The summary is concise and captures key points.
The agent identifies the ticker symbol correctly for the tools.
The price and P/E ratio make sense (verify with an external source).
The recommendation is reasonable given the summary.
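Some of these checks can be automated. The sketch below flags obviously suspicious output; the function name check_report and all thresholds (the word limit and the plausible P/E range) are arbitrary assumptions, so tune them to your own data. It complements, rather than replaces, verifying the numbers against an external source.

```python
from typing import List, Optional


def check_report(summary: str, price: Optional[float], pe: Optional[float],
                 max_summary_words: int = 400) -> List[str]:
    """Return a list of warnings; an empty list means all checks passed."""
    problems = []
    words = len(summary.split())
    if words == 0:
        problems.append("summary is empty")
    elif words > max_summary_words:
        problems.append(f"summary has {words} words (limit {max_summary_words})")
    if price is None or price <= 0:
        problems.append("price is missing or not positive")
    if pe is None:
        problems.append("P/E is missing")
    elif not 0 < pe < 1000:
        problems.append(f"P/E of {pe} looks implausible")
    return problems
```

For example, check_report("Revenue grew 10%.", 171.21, 29.5) returns an empty list, while a missing price or a negative P/E produces a warning you can log or show in the app.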

If anything looks off, refine the prompts or chain parameters. For example, you can ask the LLM to output the summary in bullet points or to explicitly label sections. LangChain’s flexibility allows tweaking the chain or agent as needed.

8. Conclusion and Next Steps

You now have a working AI agent that combines LLM summarization with tool use for real-time data, presented through an easy UI.

This architecture can be extended in many ways: you could add more tools (e.g., a tool to fetch recent news or analyst ratings to provide even more context), incorporate memory to compare the latest call with previous quarters, or use a vector database to let the agent answer questions about the call in a chatbot style.

By following these steps, we demonstrated how to summarize an earnings call and enhance it with key financial metrics automatically. The agent effectively acts like a junior financial analyst: reading the transcript, pulling in the latest stock data, and giving an informed take on the results.

With Streamlit, this capability is just a click away for an end-user. Enjoy building and customizing your earnings-call AI agent!