Most AI sales agent tutorials stop at "make it respond to messages." That's the easy part. The hard part — and the part that determines whether your agent actually closes deals — is making it remember. Not just within a session, but across every conversation that customer has ever had with your product.

This guide walks through the complete architecture: what to store, how to extract it, and how to inject it so your agent walks into every conversation already knowing the customer.

Step 1: Design your customer profile schema

Before writing any code, you need to decide what your agent should know about a customer. The temptation is to store everything. Resist it. A bloated profile is as useless as no profile — your agent needs curated, actionable intelligence, not a transcript archive.

For a sales agent, the profile should capture:

CustomerProfile schema

- customer_id (string): Unique identifier, such as a phone number, email, or CRM ID. Used to fetch the right profile before each session.
- buying_triggers (string[]): What prompted them to reach out? What event, need, or deadline is driving the purchase? The most valuable field for personalization.
- budget (string): Budget range or ceiling, plus any constraints mentioned (e.g. "needs approval above $500").
- objections (string[]): Known objections and how they've been addressed. Critical for follow-up conversations: don't re-raise closed objections.
- preferences (string[]): Product preferences, styles, brands, and features they've shown interest in or explicitly rejected.
- personal_details (string[]): Rapport-building facts: family members, important dates, hobbies. Remembered and used appropriately, these build trust fast.
- channel_preference (string): How does this customer prefer to be reached? WhatsApp, email, phone? Some customers ghost on one channel and respond instantly on another.
- journey_stage (string): Where are they in the buying process? First touch, warm lead, near decision, post-purchase follow-up.
- last_interaction (string): Summary of the most recent conversation. Lets the agent pick up naturally without asking "what did we discuss last time?"

Keep the schema tight. Every field should answer a question your agent would otherwise ask out loud.
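As a concrete sketch, the schema above could be expressed as a Python dataclass. The field names come straight from the table; the defaults are an illustrative choice, not a requirement:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CustomerProfile:
    """Curated, actionable intelligence about one customer."""
    customer_id: str
    buying_triggers: List[str] = field(default_factory=list)
    budget: str = ""
    objections: List[str] = field(default_factory=list)
    preferences: List[str] = field(default_factory=list)
    personal_details: List[str] = field(default_factory=list)
    channel_preference: str = ""
    journey_stage: str = "first_touch"  # assumed default for brand-new customers
    last_interaction: str = ""
```

A typed schema like this also gives the extraction step (next section) a fixed structure to validate the LLM's JSON output against.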

Step 2: Extract profile updates after each conversation

After each conversation ends, you need to update the customer profile with new information. This is an LLM task — you pass the conversation transcript and the current profile to a model and ask it to produce an updated profile.

The extraction prompt is critical. It should instruct the model to merge new facts into the existing profile, resolve contradictions in favor of the latest statement, and preserve the schema:

# Pseudocode — extraction call

def extract_profile_update(customer_id, conversation, current_profile):
  prompt = f"""
  You are a customer intelligence extractor.

  Current profile:
  {current_profile}

  New conversation:
  {conversation}

  Return an updated profile JSON that:
  - Merges new facts with existing ones
  - Updates fields where the customer has shared new information
  - Resolves contradictions in favor of the latest statement
  - Keeps the same schema structure
  - Omits nothing from the current profile unless explicitly contradicted
  """

  response = llm.call(prompt)
  updated_profile = parse_json(response)

  return store_profile(customer_id, updated_profile)
Run extraction async: don't run the extraction step during the conversation, since it adds latency. Trigger it as a background job after the session ends. The next session will have the updated profile ready.
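One minimal way to run extraction in the background, sketched with Python's standard library. The LLM call is stubbed out and the profile store is an in-memory dict; a production system would use a task queue (e.g. Celery) and a database:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory store; a real system would persist to a database.
PROFILES = {}

def store_profile(customer_id, profile):
    PROFILES[customer_id] = profile

def extract_profile_update(customer_id, conversation, current_profile):
    # Stub: a real implementation would send the extraction prompt to an LLM
    # and parse its JSON response. Here we just record the last message.
    updated = dict(current_profile)
    updated["last_interaction"] = conversation[-1]["content"]
    store_profile(customer_id, updated)

# A small worker pool so the conversation thread is never blocked.
executor = ThreadPoolExecutor(max_workers=4)

def on_session_end(customer_id, conversation):
    """Fire-and-forget: queue extraction and return immediately."""
    current = PROFILES.get(customer_id, {})
    executor.submit(extract_profile_update, customer_id, conversation, current)
```

The key property is that `on_session_end` returns immediately; by the time the customer starts their next session, the background job has already written the updated profile.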

Step 3: Inject the profile at session start

Before your agent handles the customer's first message in a new session, fetch their profile and prepend it to the system prompt. This is where the magic happens — the agent walks into the conversation already knowing everything.

# Pseudocode — session initialization

def start_session(customer_id, incoming_message):
  # Fetch the profile (~400 tokens)
  profile = fetch_profile(customer_id)

  system_prompt = f"""
  You are a helpful sales agent.

  === Customer Profile ===
  {format_profile(profile)}
  =======================

  Use this profile to personalize your responses.
  Reference relevant details naturally — don't recite the profile.
  Update your approach based on known objections and preferences.
  """

  return llm.chat(
    system=system_prompt,
    message=incoming_message
  )

That's it. The agent now has full context in ~400 tokens. It knows the customer's budget, their last objection, their daughter's upcoming birthday, and that they prefer WhatsApp over email — all without asking a single question.
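The format_profile helper used above is left undefined. One minimal sketch (an assumption, not a prescribed implementation) renders only the non-empty fields as compact key-value lines, which is what keeps the injected context near that ~400-token budget:

```python
def format_profile(profile: dict) -> str:
    """Render a profile dict as compact 'key: value' lines, skipping empty fields."""
    lines = []
    for key, value in profile.items():
        if not value:
            continue  # empty fields would only waste prompt tokens
        if isinstance(value, list):
            value = "; ".join(value)
        lines.append(f"{key}: {value}")
    return "\n".join(lines)
```

Skipping empty fields matters most for new-ish customers, whose profiles may only have two or three populated fields.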

Step 4: Handle new customers gracefully

For customers with no prior profile, the agent should simply not have a profile section in its system prompt. First interactions are treated as a fresh conversation, and the extraction step after the session will create a new profile from scratch.

# Pseudocode — handle missing profile

profile = fetch_profile(customer_id)

if profile is None:
  system_prompt = base_system_prompt  # No profile section
else:
  system_prompt = base_system_prompt + format_profile(profile)

Using DeepRaven instead of building it yourself

The architecture above works, but building it properly — handling concurrent updates, profile conflicts, storage at scale, and a robust extraction pipeline — takes significant engineering time. DeepRaven implements all of this as a two-endpoint API:

# Ingest a conversation after it ends
POST https://api.deepraven.ai/v1/ingest
{
  "customer_id": "customer_123",
  "messages": [
    { "role": "user", "content": "..." },
    { "role": "assistant", "content": "..." }
  ]
}

# Fetch the profile before a new session
GET https://api.deepraven.ai/v1/profile/{customer_id}

# Response: compact profile ready to inject
{
  "customer_id": "customer_123",
  "buying_triggers": [...],
  "budget": "~$200",
  "objections": [...],
  "preferences": [...],
  "personal_details": [...],
  "channel_preference": "WhatsApp",
  "journey_stage": "warm_lead",
  "last_interaction": "Discussed rose gold necklace options..."
}
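Wiring these two endpoints into a session flow might look like the following standard-library sketch. It only builds the HTTP requests (sending and error handling are left to the caller), and the Authorization header name is an assumption; check DeepRaven's API docs for the actual auth scheme:

```python
import json
import urllib.request

API_BASE = "https://api.deepraven.ai/v1"

def build_ingest_request(customer_id, messages, api_key):
    """Build the POST /v1/ingest request for a finished conversation."""
    body = json.dumps({"customer_id": customer_id, "messages": messages}).encode()
    return urllib.request.Request(
        f"{API_BASE}/ingest",
        data=body,
        headers={
            "Content-Type": "application/json",
            # Header name is an assumption; confirm against DeepRaven's docs.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def build_profile_request(customer_id, api_key):
    """Build the GET /v1/profile/{customer_id} request for session start."""
    return urllib.request.Request(
        f"{API_BASE}/profile/{customer_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Send each request with `urllib.request.urlopen` (or your HTTP client of choice) and parse the JSON response body, then pass the profile straight into format_profile for injection.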
What you skip by using DeepRaven: extraction prompt engineering, conflict resolution logic, profile storage and versioning, concurrent write handling, schema evolution, and the ongoing maintenance of all of the above. Most teams underestimate this by 3–4× when building it in-house.

The result

An agent built with this pattern — whether you build the memory layer yourself or use DeepRaven — behaves fundamentally differently from a stateless agent. Every session feels like a continuation of a relationship, not a cold restart. Customers don't repeat themselves. Agents don't ask questions that have already been answered. Conversion rates improve because the agent spends time closing, not catching up.

The technical investment is modest. The business impact is not.