
f5xc-firecrawl

The f5xc-firecrawl plugin provides local self-hosted web scraping via the open-source firecrawl engine. No API keys, no subscriptions, no cloud dependency. All operations run against the local firecrawl instance on localhost:3002 inside the devcontainer.

v1.1.0 Productivity
/plugin install f5xc-firecrawl@f5xc-salesdemos-marketplace

Scrape a single URL and extract content as markdown.

/scrape https://docs.example.com/getting-started
/scrape https://example.com --format markdown,links --wait 2000
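Under the hood, the firecrawl-operator agent turns this into a curl + jq call against the local API. A minimal sketch (request field names taken from the open-source Firecrawl v1 scrape endpoint; the mapping of --format and --wait onto them is an assumption):

```shell
# Build the /v1/scrape request body; no auth header is needed locally.
payload=$(jq -n '{
  url: "https://example.com",
  formats: ["markdown", "links"],  # --format markdown,links
  waitFor: 2000                    # --wait 2000 (milliseconds before capture)
}')

# Scrape is synchronous: the response carries the content directly.
curl -s -X POST http://localhost:3002/v1/scrape \
  -H 'Content-Type: application/json' \
  -d "$payload" | jq -r '.data.markdown'
```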

Scrape multiple URLs at once.

/batch-scrape https://example.com https://example.org https://example.net
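Batch scraping is asynchronous: the POST returns a job id that is then polled. A sketch against the local v1 API (the GET status path mirroring the POST path follows the layout of the open-source endpoints):

```shell
# Build the /v1/batch/scrape request body from the URL list.
payload=$(jq -n '{urls: ["https://example.com", "https://example.org", "https://example.net"]}')

# Submit the batch job and capture its id.
id=$(curl -s -X POST http://localhost:3002/v1/batch/scrape \
  -H 'Content-Type: application/json' -d "$payload" | jq -r '.id')

# Poll until .status reports "completed", then read the per-URL results.
curl -s "http://localhost:3002/v1/batch/scrape/$id" | jq -r '.status'
```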

Crawl multiple pages from a starting URL.

/crawl https://docs.example.com --limit 20 --depth 2
/crawl https://docs.example.com --include /api/* --exclude /blog/*
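Crawl is also asynchronous. A sketch of the equivalent curl + jq sequence (maxDepth, includePaths, and excludePaths are the open-source v1 field names; the mapping from the --depth, --include, and --exclude flags is an assumption):

```shell
# Build the /v1/crawl request body.
payload=$(jq -n '{
  url: "https://docs.example.com",
  limit: 20,                 # --limit 20
  maxDepth: 2,               # --depth 2
  includePaths: ["/api/*"],  # --include /api/*
  excludePaths: ["/blog/*"]  # --exclude /blog/*
}')

# Submit the crawl; the POST returns immediately with a job id.
id=$(curl -s -X POST http://localhost:3002/v1/crawl \
  -H 'Content-Type: application/json' -d "$payload" | jq -r '.id')

# Poll job status; DELETE /v1/crawl/$id cancels a running crawl.
curl -s "http://localhost:3002/v1/crawl/$id" | jq -r '.status'
```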

Discover all URLs on a website.

/map https://docs.example.com
/map https://docs.example.com --search api --subdomains
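Map discovers URLs without scraping page content, so it runs synchronously. A sketch (search and includeSubdomains are the open-source v1 field names; the flag mapping is an assumption):

```shell
# Build the /v1/map request body.
payload=$(jq -n '{
  url: "https://docs.example.com",
  search: "api",           # --search api
  includeSubdomains: true  # --subdomains
}')

# The response lists discovered URLs under .links.
curl -s -X POST http://localhost:3002/v1/map \
  -H 'Content-Type: application/json' \
  -d "$payload" | jq -r '.links[]'
```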

Search the web and optionally scrape results.

/search "firecrawl web scraping" --limit 10
/search "AI tools 2026" --scrape --time month

LLM-powered structured data extraction from web pages.

/extract https://example.com "Extract the main heading and any links"
/extract https://example.com/pricing --schema '{"plans": [{"name": "string", "price": "string"}]}'
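Extract is asynchronous and relies on the configured LLM backend. A sketch of the schema variant (urls and schema field names, and the GET poll path, are taken from the open-source v1 API; the prompt variant would send a prompt field instead of a schema):

```shell
# Build the /v1/extract request body with the target schema.
payload=$(jq -n '{
  urls: ["https://example.com/pricing"],
  schema: {plans: [{name: "string", price: "string"}]}
}')

# Submit the extraction job and capture its id.
id=$(curl -s -X POST http://localhost:3002/v1/extract \
  -H 'Content-Type: application/json' -d "$payload" | jq -r '.id')

# Poll for the structured result.
curl -s "http://localhost:3002/v1/extract/$id" | jq '.data'
```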

Generate an llms.txt file for a site.

/llmstxt https://docs.example.com
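Per the protocol table, llms.txt generation is an async POST. A sketch of the call (the status poll path is an assumption, mirroring the other async endpoints):

```shell
# Build the /v1/llmstxt request body.
payload=$(jq -n '{url: "https://docs.example.com"}')

# Submit the generation job and capture its id.
id=$(curl -s -X POST http://localhost:3002/v1/llmstxt \
  -H 'Content-Type: application/json' -d "$payload" | jq -r '.id')

# Poll for completion (assumed path).
curl -s "http://localhost:3002/v1/llmstxt/$id" | jq -r '.status'
```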

Auto-activates when you ask to scrape a URL, crawl a website, map site URLs, search the web, extract structured data, generate llms.txt, batch scrape multiple URLs, or convert a web page to markdown. Delegates immediately to the firecrawl-operator agent.

Autonomous web scraping agent that executes curl + jq sequences against the local firecrawl API. Supports 11 protocols covering all v1 endpoints. Read-only agent (no Write, Edit, or Agent tools).

Protocol       Endpoint                   Type
HEALTH         GET /                      Sync
SCRAPE         POST /v1/scrape            Sync
BATCH_SCRAPE   POST /v1/batch/scrape      Async
CRAWL          POST /v1/crawl             Async
CRAWL_CANCEL   DELETE /v1/crawl/:id       Sync
CRAWL_ACTIVE   GET /v1/crawl/active       Sync
CRAWL_ERRORS   GET /v1/crawl/:id/errors   Sync
MAP            POST /v1/map               Sync
SEARCH         POST /v1/search            Sync
EXTRACT        POST /v1/extract           Async
LLMSTXT        POST /v1/llmstxt           Async

The plugin requires the firecrawl stack running in the devcontainer:

Component       Port              Purpose
Firecrawl API   3002              All scrape/crawl/map/search/extract endpoints
Playwright      3000              JavaScript rendering engine
Redis           6379              Job queue backend
PostgreSQL      socket            Crawl/batch job persistence
LiteLLM proxy   OPENAI_BASE_URL   LLM backend for extract (optional)

The stack starts automatically when ENABLE_FIRECRAWL=true (the default). A SessionStart hook checks that the API is reachable and warns if the service is down.
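A reachability probe equivalent to what the hook performs might look like this (the actual hook implementation is not shown here; this is a sketch of the HEALTH protocol, a plain GET against the API port):

```shell
# Probe the local firecrawl API; -f makes curl fail on HTTP errors.
if curl -sf --max-time 2 http://localhost:3002/ >/dev/null 2>&1; then
  status="up"
else
  status="down"   # the SessionStart hook would warn here
fi
echo "firecrawl API: $status"
```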

This plugin uses the self-hosted open-source version:

  • No authentication or API keys required for scraping
  • No credit limits or rate limiting
  • Uses v1 API endpoints (not v2)
  • Browser sessions and deep research not available
  • Extract uses your own LLM proxy instead of hosted models
  • Runs entirely within the local container network