Web Search
The Web Search module enables you to perform searches across the web, retrieve URLs, fetch page content, and extract structured insights such as snippets or keypoints. It provides a unified interface for integrating external search and content retrieval into your application.
Setup
Before using the Web Search SDK, ensure the following environment variables are configured:
export SMARTSEARCH_BASE_URL="https://your-smartsearch-endpoint"
export SMARTSEARCH_TOKEN="your-access-token"
Install the SDK and dependencies:
pip install smart-search-sdk
Initialize the client:
import asyncio
import json
import os
from dotenv import load_dotenv
from smart_search_sdk.web.client import WebSearchClient
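from smart_search_sdk.web.models import (
    GetWebSearchResultsRequest,
    GetWebSearchUrlsRequest,
    GetWebPageRequest,
    GetWebPageSnippetsRequest,
    GetWebPageKeypointsRequest,
    GetWebSearchMapRequest,
)

# Load SMARTSEARCH_* variables from a local .env file, if present
load_dotenv()
client = WebSearchClient(base_url=os.getenv("SMARTSEARCH_BASE_URL"))
# `await` needs an async context: run this inside an async function
# or a notebook cell that supports top-level await
await client.authenticate(token=os.getenv("SMARTSEARCH_TOKEN"))
1. Perform a Basic Web Search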
Retrieve search results containing snippets, keypoints, or summaries from the web.
Parameters
query (str): The text query to search for.
result_type (str): Type of output format. Supported values:
• "snippets" – short excerpts from web pages.
• "keypoints" – summarized highlights from results.
• "summary" – condensed summary across sources.
• "description" – descriptions of results.
size (int): Number of results to return.
site (list[str] / str): (Optional) URL or list of URLs to limit search results to specific sites or domains.
engine (str): (Optional) Search engine to use: "auto", "firecrawl", or "perplexity". Defaults to "auto".
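A basic search built from the parameters above might look like the following. This is a minimal sketch, not reference usage: it assumes GetWebSearchResultsRequest accepts the documented parameters as keyword arguments and that client.search_web() takes the request object. The helper build_search_kwargs and its default values are hypothetical.

```python
# Values documented for result_type in the parameter list above
RESULT_TYPES = {"snippets", "keypoints", "summary", "description"}


def build_search_kwargs(query, result_type="snippets", size=5):
    """Hypothetical helper: validate result_type and return request keyword arguments."""
    if result_type not in RESULT_TYPES:
        raise ValueError(f"unsupported result_type: {result_type!r}")
    return {"query": query, "result_type": result_type, "size": size}


async def basic_search(client, query):
    # Imported lazily so the helper above can be used without the SDK installed.
    from smart_search_sdk.web.models import GetWebSearchResultsRequest

    # Assumption: keyword-argument constructor and request-object call style.
    request = GetWebSearchResultsRequest(**build_search_kwargs(query))
    return await client.search_web(request)


search_kwargs = build_search_kwargs("solid-state battery research", size=3)
print(search_kwargs)
```

With an authenticated client, the search itself would be run as `await basic_search(client, "solid-state battery research")`.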
Search Results with Site Filter
To limit search results to a specific site or domain, use the site parameter in GetWebSearchResultsRequest:
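As a sketch of a single-site filter, the request below passes one URL as a plain string. The query and site values are illustrative, and the constructor and call style (keyword arguments, request object passed to client.search_web()) are assumptions.

```python
# Keyword arguments for a search limited to one site (values are illustrative)
single_site_kwargs = {
    "query": "asyncio task groups",
    "result_type": "snippets",
    "size": 5,
    "site": "https://docs.python.org",  # a single URL passed as a plain string
}


async def search_single_site(client):
    # Imported inside the function so this module loads without the SDK installed.
    from smart_search_sdk.web.models import GetWebSearchResultsRequest

    # Assumption: the request model accepts these keyword arguments.
    request = GetWebSearchResultsRequest(**single_site_kwargs)
    return await client.search_web(request)
```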
Search with Multiple Sites
You can search across multiple sites simultaneously by providing a list of URLs:
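One possible shape for a multi-site request is sketched below. Because site accepts either a string or a list, the hypothetical helper normalize_sites turns both forms into a clean list before the request is built. The request constructor and the client.search_web() call style are assumptions.

```python
def normalize_sites(site):
    """Hypothetical helper: accept a str or an iterable of str, return a de-duplicated list."""
    candidates = [site] if isinstance(site, str) else list(site)
    cleaned = (s.strip() for s in candidates)
    # dict.fromkeys keeps first-seen order while dropping duplicates
    return list(dict.fromkeys(s for s in cleaned if s))


sites = normalize_sites(
    ["https://docs.python.org", "https://peps.python.org", "https://docs.python.org"]
)
multi_site_kwargs = {
    "query": "structural pattern matching",
    "result_type": "snippets",
    "size": 5,
    "site": sites,
}


async def search_multiple_sites(client):
    from smart_search_sdk.web.models import GetWebSearchResultsRequest  # lazy import

    # Assumption: site accepts a list of URLs, as the parameter list describes.
    request = GetWebSearchResultsRequest(**multi_site_kwargs)
    return await client.search_web(request)
```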
Search with Specific Engine
Choose a specific search engine for your query:
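Engine selection could be sketched as follows. The three engine names come from the parameter list. The helper engine_kwargs is hypothetical, and the request constructor and client.search_web() call style are assumptions.

```python
ENGINES = ("auto", "firecrawl", "perplexity")  # engines named in the parameter list


def engine_kwargs(query, engine="auto", size=5):
    """Hypothetical helper: reject engine names the documentation does not list."""
    if engine not in ENGINES:
        raise ValueError(f"engine must be one of {ENGINES}, got {engine!r}")
    return {"query": query, "result_type": "snippets", "size": size, "engine": engine}


async def search_with_engine(client, query, engine):
    from smart_search_sdk.web.models import GetWebSearchResultsRequest  # lazy import

    # Assumption: engine is passed as a keyword argument on the request model.
    request = GetWebSearchResultsRequest(**engine_kwargs(query, engine))
    return await client.search_web(request)


perplexity_kwargs = engine_kwargs("history of the transistor", engine="perplexity")
print(perplexity_kwargs["engine"])
```

Validating the name client-side surfaces a typo before any network call is made.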
2. Retrieve Web Search Keypoints
Extract summarized keypoints directly from top web search results.
Parameters
Same as for the basic web search in section 1. Set result_type to "keypoints" to get summarized insights instead of raw snippets.
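The difference between a snippet search and a keypoint search can be sketched as a one-field change. The query is illustrative, and the request constructor and call style are assumptions.

```python
base_kwargs = {"query": "effects of remote work on productivity", "size": 5}

# Same query, two output formats: only result_type differs
snippet_kwargs = {**base_kwargs, "result_type": "snippets"}
keypoint_kwargs = {**base_kwargs, "result_type": "keypoints"}


async def search_keypoints(client):
    from smart_search_sdk.web.models import GetWebSearchResultsRequest  # lazy import

    # Assumption: the request model accepts these keyword arguments.
    request = GetWebSearchResultsRequest(**keypoint_kwargs)
    return await client.search_web(request)
```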
3. Get Web Search URLs
Return a list of URLs that match the given query.
Parameters
query (str): The text query to search for.
size (int): Number of URLs to return.
site (list[str] / str): (Optional) URL or list of URLs to limit search results to specific sites or domains.
engine (str): (Optional) Search engine to use: "auto", "firecrawl", or "perplexity".
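A URL-only search might be assembled like this. The sketch assumes GetWebSearchUrlsRequest takes the documented parameters as keyword arguments and that client.search_web_urls() accepts the request object. The helper and its default size are hypothetical.

```python
def build_url_search_kwargs(query, size=10, site=None, engine=None):
    """Hypothetical helper: include the optional parameters only when they are set."""
    kwargs = {"query": query, "size": size}
    if site is not None:
        kwargs["site"] = site
    if engine is not None:
        kwargs["engine"] = engine
    return kwargs


async def search_urls(client, query):
    from smart_search_sdk.web.models import GetWebSearchUrlsRequest  # lazy import

    # Assumption: keyword-argument constructor and request-object call style.
    request = GetWebSearchUrlsRequest(**build_url_search_kwargs(query))
    return await client.search_web_urls(request)


url_kwargs = build_url_search_kwargs("open source vector databases", size=20)
print(url_kwargs)
```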
Search URLs with Site Filter
To limit URL search results to a specific site or domain, use the site parameter in GetWebSearchUrlsRequest:
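A site-restricted URL search could look like the sketch below. The values are illustrative, and the constructor and the client.search_web_urls() call style are assumptions.

```python
# Keyword arguments for a URL search limited to one site (values are illustrative)
url_site_kwargs = {
    "query": "release notes",
    "size": 10,
    "site": "https://docs.python.org",
}


async def search_urls_on_site(client):
    from smart_search_sdk.web.models import GetWebSearchUrlsRequest  # lazy import

    # Assumption: the request model accepts these keyword arguments.
    request = GetWebSearchUrlsRequest(**url_site_kwargs)
    return await client.search_web_urls(request)
```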
Search URLs Across Multiple Sites
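A multi-site URL search passes a list for site. The sketch below also includes a hypothetical client-side check, urls_outside_sites, that flags any URL whose host is not one of the requested sites. It works on a plain list of URL strings because the response shape is not documented here. The request constructor and call style are assumptions.

```python
from urllib.parse import urlparse

SITES = ["https://docs.python.org", "https://peps.python.org"]
multi_site_url_kwargs = {"query": "type hints", "size": 10, "site": SITES}


def urls_outside_sites(urls, sites):
    """Hypothetical check: return URLs whose host is not one of the requested sites."""
    allowed = {urlparse(site).netloc for site in sites}
    return [url for url in urls if urlparse(url).netloc not in allowed]


async def search_urls_on_sites(client):
    from smart_search_sdk.web.models import GetWebSearchUrlsRequest  # lazy import

    # Assumption: site accepts a list of URLs on this request model too.
    request = GetWebSearchUrlsRequest(**multi_site_url_kwargs)
    return await client.search_web_urls(request)


# Sanity check on a hand-written list of URLs (not real search results)
sample = ["https://docs.python.org/3/library/typing.html", "https://example.com/typing"]
print(urls_outside_sites(sample, SITES))
```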
3.1 Map a Website
Discover and map the URL structure of a website.
Parameters
base_url (str): The base URL of the website to map.
size (int): Maximum number of URLs to return (1-1000).
include_subdomains (bool): Whether to include subdomains in the mapping (default: False).
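Mapping a site might be sketched as follows. The 1-1000 range for size comes from the parameter list. The helper build_map_kwargs is hypothetical, and the GetWebSearchMapRequest constructor and client.search_web_map() call style are assumptions.

```python
def build_map_kwargs(base_url, size=20, include_subdomains=False):
    """Hypothetical helper: enforce the documented 1-1000 range for size."""
    if not 1 <= size <= 1000:
        raise ValueError(f"size must be between 1 and 1000, got {size}")
    return {"base_url": base_url, "size": size, "include_subdomains": include_subdomains}


async def map_site(client, base_url):
    from smart_search_sdk.web.models import GetWebSearchMapRequest  # lazy import

    # Assumption: keyword-argument constructor and request-object call style.
    request = GetWebSearchMapRequest(**build_map_kwargs(base_url))
    return await client.search_web_map(request)


map_kwargs = build_map_kwargs("https://docs.python.org", size=50, include_subdomains=True)
print(map_kwargs)
```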
Map a Website with Query Filter
To filter mapped URLs by a search query, use the query parameter in GetWebSearchMapRequest:
Parameters for Query Filter
base_url (str): The base URL of the website to map.
page (int): Page number to return (default: 1).
size (int): Maximum number of URLs to return (1-1000), default is 20.
return_all_map (bool): Whether to return all mapped links (default: False).
include_subdomains (bool): Whether to include subdomains in the mapping (default: False).
query (str): Search query to filter URLs by keywords.
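A filtered, paginated map could be sketched like this. The hypothetical generator paged_map_kwargs yields one set of request arguments per page. Fetching each page with a separate client.search_web_map() call is an assumption, as is the keyword-argument constructor.

```python
def paged_map_kwargs(base_url, query, pages, size=20):
    """Hypothetical helper: yield request arguments for pages 1..pages."""
    for page in range(1, pages + 1):
        yield {
            "base_url": base_url,
            "query": query,
            "page": page,
            "size": size,
            "return_all_map": False,
            "include_subdomains": False,
        }


async def map_filtered(client, base_url, query, pages=2):
    from smart_search_sdk.web.models import GetWebSearchMapRequest  # lazy import

    responses = []
    for kwargs in paged_map_kwargs(base_url, query, pages):
        # Assumption: each page is fetched with its own search_web_map() call.
        responses.append(await client.search_web_map(GetWebSearchMapRequest(**kwargs)))
    return responses


page_requests = list(paged_map_kwargs("https://docs.python.org", "asyncio", pages=3))
print([kwargs["page"] for kwargs in page_requests])
```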
4. Fetch a Web Page
Retrieve the content of a specific web page, either as raw HTML or structured text.
Parameters
source (str): The URL of the web page to fetch.
json_schema (dict): (Optional) JSON schema for custom structured data extraction.
return_html (bool): Whether to return the page as raw HTML (True) or cleaned text (False).
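Fetching a page as cleaned text or raw HTML might look like the sketch below. The URL check in build_page_kwargs is a hypothetical client-side guard, not documented SDK behavior. The GetWebPageRequest constructor and client.fetch_web_page() call style are assumptions.

```python
from urllib.parse import urlparse


def build_page_kwargs(source, return_html=False):
    """Hypothetical helper: check the URL shape before building the request arguments."""
    parts = urlparse(source)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"source must be an absolute http(s) URL, got {source!r}")
    return {"source": source, "return_html": return_html}


async def fetch_page(client, source, return_html=False):
    from smart_search_sdk.web.models import GetWebPageRequest  # lazy import

    # Assumption: keyword-argument constructor and request-object call style.
    request = GetWebPageRequest(**build_page_kwargs(source, return_html))
    return await client.fetch_web_page(request)


text_kwargs = build_page_kwargs("https://example.com/article")
html_kwargs = build_page_kwargs("https://example.com/article", return_html=True)
```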
4.1 Fetch Page with JSON Schema Extraction
Extract structured data from a web page using a custom JSON schema:
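A hand-written schema could be attached to the page request as shown in this sketch. The schema fields (title, author, published_date) are illustrative only, and the request constructor and call style are assumptions.

```python
import json

# Illustrative schema: the field names are examples, not part of the SDK
article_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author": {"type": "string"},
        "published_date": {"type": "string", "description": "ISO 8601 date"},
    },
    "required": ["title"],
}

page_kwargs = {"source": "https://example.com/blog/post", "json_schema": article_schema}


async def fetch_structured(client):
    from smart_search_sdk.web.models import GetWebPageRequest  # lazy import

    # Assumption: json_schema is passed as a plain dict on the request model.
    request = GetWebPageRequest(**page_kwargs)
    return await client.fetch_web_page(request)


# The schema must be JSON-serializable to travel in a request body
print(json.dumps(article_schema, indent=2))
```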
Using Pydantic Models (Recommended):
Instead of manually writing JSON schemas, you can define Pydantic models and convert them automatically:
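The conversion can be sketched with Pydantic v2, whose model_json_schema() method returns the schema as a plain dict. The Article model and its fields are illustrative, and passing the result as json_schema assumes the service accepts the schema Pydantic generates.

```python
from typing import Optional

from pydantic import BaseModel


class Article(BaseModel):
    """Illustrative model: the fields are examples, not part of the SDK."""

    title: str
    author: Optional[str] = None
    published_date: Optional[str] = None


# Pydantic v2 generates the JSON schema as a plain dict
article_schema = Article.model_json_schema()
page_kwargs = {"source": "https://example.com/blog/post", "json_schema": article_schema}
print(article_schema["required"])
```

Only title is required because the other two fields have defaults. Keeping the model as the single source of truth avoids drift between the schema and the code that consumes the extracted data.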
5. Extract Snippets from a Web Page
Extract relevant text snippets from a single web page based on a query.
Parameters
query (str): The text to match against the web page content.
source (str): The URL of the web page.
size (int): Number of snippets to extract.
json_schema (dict): (Optional) JSON schema for custom extraction. Uses Firecrawl extract API.
snippet_style (str): Style of snippet extraction. Supported values:
• "paragraph" – extracts full paragraphs containing the match.
• "sentence" – extracts individual sentences related to the query.
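Snippet extraction from one page might be sketched as follows. The two snippet styles come from the parameter list. The helper and its defaults are hypothetical, and the GetWebPageSnippetsRequest constructor and client.get_web_page_snippets() call style are assumptions.

```python
SNIPPET_STYLES = ("paragraph", "sentence")  # styles named in the parameter list


def build_snippet_kwargs(query, source, size=3, snippet_style="paragraph"):
    """Hypothetical helper: validate snippet_style and return request keyword arguments."""
    if snippet_style not in SNIPPET_STYLES:
        raise ValueError(f"snippet_style must be one of {SNIPPET_STYLES}, got {snippet_style!r}")
    return {"query": query, "source": source, "size": size, "snippet_style": snippet_style}


async def page_snippets(client, query, source, snippet_style="sentence"):
    from smart_search_sdk.web.models import GetWebPageSnippetsRequest  # lazy import

    # Assumption: keyword-argument constructor and request-object call style.
    kwargs = build_snippet_kwargs(query, source, snippet_style=snippet_style)
    return await client.get_web_page_snippets(GetWebPageSnippetsRequest(**kwargs))


sentence_kwargs = build_snippet_kwargs(
    "refund policy", "https://example.com/terms", snippet_style="sentence"
)
```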
5.1 Extract Snippets with JSON Schema
Extract structured information from a web page:
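A schema-driven snippet request could look like this sketch, which asks for a list of pricing plans. The schema and URL are illustrative, and the request constructor and call style are assumptions.

```python
# Illustrative schema: a list of plans, each with a name and an optional price
pricing_schema = {
    "type": "object",
    "properties": {
        "plans": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "monthly_price": {"type": "number"},
                },
                "required": ["name"],
            },
        }
    },
    "required": ["plans"],
}

snippet_kwargs = {
    "query": "pricing plans",
    "source": "https://example.com/pricing",
    "size": 3,
    "json_schema": pricing_schema,
}


async def structured_snippets(client):
    from smart_search_sdk.web.models import GetWebPageSnippetsRequest  # lazy import

    # Assumption: json_schema is passed as a plain dict on the request model.
    request = GetWebPageSnippetsRequest(**snippet_kwargs)
    return await client.get_web_page_snippets(request)
```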
Using Pydantic Models (Recommended):
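For list-shaped data, nested Pydantic models can generate the schema. The sketch below assumes Pydantic v2. The Plan and Pricing models are illustrative, and whether the service resolves the $defs and $ref entries that Pydantic emits for nested models is an assumption worth verifying.

```python
from typing import List, Optional

from pydantic import BaseModel


class Plan(BaseModel):
    """Illustrative model for one pricing plan."""

    name: str
    monthly_price: Optional[float] = None


class Pricing(BaseModel):
    """Illustrative wrapper: the page is expected to list several plans."""

    plans: List[Plan]


# Nested models are emitted under "$defs" and referenced with "$ref"
pricing_schema = Pricing.model_json_schema()
snippet_kwargs = {
    "query": "pricing plans",
    "source": "https://example.com/pricing",
    "size": 3,
    "json_schema": pricing_schema,
}
print(sorted(pricing_schema["$defs"]))
```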
6. Extract Keypoints from a Web Page
Generate key points summarizing the content of a given web page.
Parameters
query (str): The focus topic for extracting keypoints.
source (str): The web page URL to analyze.
size (int): Number of keypoints to return.
json_schema (dict): (Optional) JSON schema for custom extraction.
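Keypoint extraction from a page might be sketched as below. The helper and its default size are hypothetical, and the GetWebPageKeypointsRequest constructor and client.get_web_page_keypoints() call style are assumptions.

```python
def build_keypoint_kwargs(query, source, size=5, json_schema=None):
    """Hypothetical helper: send json_schema only when one is supplied."""
    kwargs = {"query": query, "source": source, "size": size}
    if json_schema is not None:
        kwargs["json_schema"] = json_schema
    return kwargs


async def page_keypoints(client, query, source):
    from smart_search_sdk.web.models import GetWebPageKeypointsRequest  # lazy import

    # Assumption: keyword-argument constructor and request-object call style.
    request = GetWebPageKeypointsRequest(**build_keypoint_kwargs(query, source))
    return await client.get_web_page_keypoints(request)


keypoint_kwargs = build_keypoint_kwargs(
    "event loop", "https://docs.python.org/3/library/asyncio.html", size=3
)
print(keypoint_kwargs)
```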
6.1 Extract Keypoints with JSON Schema
Extract structured keypoint information:
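A structured keypoint request could pair each claim with its supporting evidence, as in this sketch. The schema fields are illustrative, and the request constructor and call style are assumptions.

```python
# Illustrative schema: each keypoint is a claim with optional evidence
keypoint_schema = {
    "type": "object",
    "properties": {
        "keypoints": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "claim": {"type": "string"},
                    "evidence": {"type": "string"},
                },
                "required": ["claim"],
            },
        }
    },
    "required": ["keypoints"],
}

structured_keypoint_kwargs = {
    "query": "main findings",
    "source": "https://example.com/report",
    "size": 5,
    "json_schema": keypoint_schema,
}


async def structured_keypoints(client):
    from smart_search_sdk.web.models import GetWebPageKeypointsRequest  # lazy import

    # Assumption: json_schema is passed as a plain dict on the request model.
    request = GetWebPageKeypointsRequest(**structured_keypoint_kwargs)
    return await client.get_web_page_keypoints(request)
```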
Using Pydantic Models (Recommended):
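Field descriptions on a Pydantic model are carried into the generated schema, which may help guide extraction. The sketch assumes Pydantic v2. The Keypoint and KeypointReport models are illustrative, and whether the service makes use of the descriptions is an assumption.

```python
from typing import List

from pydantic import BaseModel, Field


class Keypoint(BaseModel):
    """Illustrative model for one extracted keypoint."""

    claim: str = Field(description="One-sentence statement of the key point")
    evidence: str = Field(default="", description="Supporting quote or figure from the page")


class KeypointReport(BaseModel):
    """Illustrative wrapper holding all keypoints for a page."""

    keypoints: List[Keypoint]


keypoint_schema = KeypointReport.model_json_schema()

# Descriptions written on the model appear on the matching schema property
claim_description = keypoint_schema["$defs"]["Keypoint"]["properties"]["claim"]["description"]
print(claim_description)
```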
Summary
get_web_search_results | client.search_web() | Perform a web search and retrieve snippets, keypoints, or summaries.
get_web_search_urls | client.search_web_urls() | Retrieve a list of relevant web URLs.
get_web_search_map | client.search_web_map() | Map a website and discover its URL structure.
get_web_page | client.fetch_web_page() | Fetch the full content of a web page.
get_web_page_snippets | client.get_web_page_snippets() | Extract snippets from a specific web page.
get_web_page_keypoints | client.get_web_page_keypoints() | Generate keypoints summarizing a web page.
Full Code Snippet
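An end-to-end script combining setup, search, URL retrieval, and keypoint extraction might look like the sketch below. It is not the original snippet from this page. The helper load_settings is hypothetical, the queries and URL are illustrative, and the request constructors and call style are assumptions. Response objects are printed as-is because their shape is not documented here.

```python
import asyncio
import os


def load_settings(env=os.environ):
    """Read the two variables from the Setup section, failing early if one is missing."""
    try:
        return env["SMARTSEARCH_BASE_URL"], env["SMARTSEARCH_TOKEN"]
    except KeyError as exc:
        raise RuntimeError(f"missing environment variable: {exc.args[0]}") from None


async def main():
    # SDK imports live inside main() so this file can be imported without the SDK.
    from smart_search_sdk.web.client import WebSearchClient
    from smart_search_sdk.web.models import (
        GetWebPageKeypointsRequest,
        GetWebSearchResultsRequest,
        GetWebSearchUrlsRequest,
    )

    base_url, token = load_settings()
    client = WebSearchClient(base_url=base_url)
    await client.authenticate(token=token)

    # Assumption: each method takes its request object as the only argument.
    results = await client.search_web(
        GetWebSearchResultsRequest(query="python asyncio", result_type="snippets", size=3)
    )
    urls = await client.search_web_urls(GetWebSearchUrlsRequest(query="python asyncio", size=3))
    keypoints = await client.get_web_page_keypoints(
        GetWebPageKeypointsRequest(
            query="event loop",
            source="https://docs.python.org/3/library/asyncio.html",
            size=3,
        )
    )
    for label, response in (("results", results), ("urls", urls), ("keypoints", keypoints)):
        print(label, response)


def run():
    """Entry point: call this with the SDK installed and credentials set."""
    asyncio.run(main())
```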
