Captivate Chat

URL Crawler

Last updated 4 months ago

With the URL Crawler, you only need to enter one root URL (or main URL). Captivate Chat then uses a web crawler to gather all the URLs located inside that website, so there is no need to find them manually.

Use Direct Input if you have a specific set of URLs to ingest

If you want your AI Chatbot to ingest and learn from all available URLs within a website, the URL Crawler is a convenient tool to use!

However, you should use Import > Direct Input if you only have a specific set of URLs you want your AI Chatbot to ingest.


Using the URL Crawler

After you submit your root URL, the "Start deep crawl" pop-up window appears and lists all the URLs available under that main link.

Scroll down to choose which URLs you want your AI Chatbot to ingest, or click the checkbox beside the "URL" column name to select all the URLs listed in the crawl.


Wildcard Character

For a more precise crawl, you can add a wildcard character, the asterisk (*), to the end of your root URL.

The wildcard tells the Captivate Chat web crawler to retrieve only the URLs of a specific website layer. The Captivate Chat URL Crawler can retrieve from up to two (2) layers within a website. The two formats work as follows:

1 Wildcard Character: url.com/abc/*

If you only use 1 Wildcard Character or asterisk (*) in the format url.com/abc/* in the URL Crawler, then you should be able to retrieve all URLs one layer below your root URL.

As a more technical example, if your website has this kind of arrangement:

/abc/def/123

/abc/def/123/456

/abc/ghi/123/456

/abc/jkl/

Then, inputting /abc/* in the URL Crawler will only retrieve:

/abc/def/

/abc/ghi/

/abc/jkl/

This is because the crawl goes only one layer downward.
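
The one-layer behavior described above can be illustrated with a short Python sketch. This is not Captivate Chat's actual crawler code; the function name `one_layer_below` and the list-of-paths input are assumptions made purely for illustration.

```python
def one_layer_below(root: str, paths: list[str]) -> list[str]:
    """Return the distinct URLs exactly one layer below `root`.

    Hypothetical helper that mimics the documented effect of a single
    wildcard (root + "*"); it is not part of any Captivate Chat API.
    """
    prefix = root.rstrip("/") + "/"
    found: list[str] = []
    for path in paths:
        if not path.startswith(prefix):
            continue  # outside the root URL
        # Keep only the first path segment after the root.
        first_segment = path[len(prefix):].split("/", 1)[0]
        if not first_segment:
            continue  # the root itself, nothing below it
        child = prefix + first_segment + "/"
        if child not in found:
            found.append(child)
    return found


site = ["/abc/def/123", "/abc/def/123/456", "/abc/ghi/123/456", "/abc/jkl/"]
print(one_layer_below("/abc/", site))
# → ['/abc/def/', '/abc/ghi/', '/abc/jkl/']
```

Deeper pages such as /abc/def/123 collapse into their first-layer parent, which matches the list shown in the example.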

This is useful for retrieving all pages within a single website section, such as the subcategories within a category of products.

For instance, the URL Crawler can give you all brands (subcategories) of laptop (category) in an electronics store.


2 Wildcard Characters: url.com/abc/**

If you use 2 Wildcard Characters or asterisks (**) in the format url.com/abc/** in the URL Crawler, then you should be able to retrieve all the URLs under your root URL.

As a more technical example, if your website has this kind of arrangement:

/abc/def/123

/abc/def/123/456

/abc/ghi/123/456

/abc/jkl/

Then, inputting /abc/def/** in the URL Crawler will only retrieve:

/abc/def/123

/abc/def/123/456

This is because the URL Crawler looks for all pages under /abc/def/, including lower layers.
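
As a rough illustration of the two-wildcard behavior, the Python sketch below keeps every path nested under the root, at any depth. It is an assumption-based model of the documented result, not Captivate Chat's implementation; `all_layers_below` is a hypothetical name.

```python
def all_layers_below(root: str, paths: list[str]) -> list[str]:
    """Return every URL nested under `root`, including lower layers.

    Hypothetical helper that mimics the documented effect of two
    wildcards (root + "**"); it is not part of any Captivate Chat API.
    """
    prefix = root.rstrip("/") + "/"
    # Keep paths that start with the root and go at least one segment deeper.
    return [p for p in paths if p.startswith(prefix) and len(p) > len(prefix)]


site_paths = ["/abc/def/123", "/abc/def/123/456", "/abc/ghi/123/456", "/abc/jkl/"]
print(all_layers_below("/abc/def/", site_paths))
# → ['/abc/def/123', '/abc/def/123/456']
```

Unlike the single-wildcard case, nothing is collapsed to its parent layer: both /abc/def/123 and the deeper /abc/def/123/456 are kept.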

This is useful for retrieving the specific sub-types of a subcategory within a category of products.

For instance, the URL Crawler can give you all models (sub-type), or even the models of a specific year (deeper sub-type), of a brand (subcategory) of laptop (category) in an electronics store.


To use the URL Crawler feature, place a root URL in the box provided and click "Submit".

After choosing the URLs you want to ingest, click "Import Selected" to proceed.

If you click the "Import" button on the "Import Your Own Information" page in the AI Chatbot creation process, you can access the URL Crawler by clicking its tab. You can then insert a root URL for crawling.
To use the URL Crawler feature, insert the root URL or main URL you want to use as a source for the crawl. Captivate Chat will deploy a web crawler to look for all URLs nested under this source.
After clicking "Submit" under "Import Your Own Information" > Import > URL Crawler, Captivate Chat will list all URLs nested under your root URL. It will appear under the "Start deep crawl" pop-up window. You can then select the ones you want your AI Chatbot to ingest and click "Import Selected" to proceed.
Adding a wildcard character, or asterisk (*), after the last forward slash of a root URL in the URL Crawler makes the Captivate Chat web crawler look for all relevant URLs within that website layer.
Using only one wildcard character or asterisk (*) in the Captivate Chat URL Crawler will only retrieve URLs of one layer beneath the specified root URL.
Using two wildcard characters or asterisks (**) in the Captivate Chat URL Crawler will retrieve all URLs beneath the specified root URL, including lower layers.