Using Singapore mobile proxies with Firecrawl in 2026

Firecrawl’s core promise is clean, structured output that goes straight into an LLM pipeline without you spending three hours writing CSS selectors. That promise holds well for public, lightly guarded content. But the moment you point Firecrawl at anything that matters commercially, especially e-commerce platforms, price aggregators, app stores, or regional SaaS dashboards, you start seeing 403s, CAPTCHAs, and silent redirects to honeypot pages. The culprit is almost always the IP class. Firecrawl runs on cloud infrastructure, and in 2026, every major anti-bot vendor fingerprints ASN ranges at the connection layer before a single HTTP header is evaluated. You can rotate user agents all day; it will not matter as long as your egress IP resolves to an AWS or GCP prefix. Adding Singapore residential mobile IPs to the stack fixes this at the root.

why Firecrawl hits walls without residential mobile IPs

The most common Firecrawl workflow for data teams is automated crawling of product pages, search result sets, or review content on platforms like Shopee, Lazada, Carousell, or regional Google SERPs. These platforms have invested heavily in bot mitigation because the data on them is genuinely valuable. Cloudflare’s Bot Management, DataDome, and PerimeterX all score incoming requests on a combination of IP reputation, TLS fingerprint, HTTP/2 frame ordering, and behavioral signals. A datacenter IP coming from us-east-1, even one that has never been flagged before, carries a prior probability of being a bot that is orders of magnitude higher than a mobile carrier IP.

Firecrawl handles the browser automation layer competently: it manages headless Chromium, handles JavaScript rendering, and can deal with some bot challenges. It cannot change the ASN of its egress IP, though, and that ASN is the first thing evaluated. When you run Firecrawl against a well-protected site from a cloud IP, you often get a response that looks like a 200 but contains a challenge page or a stripped-down version of the site with no real data. This is harder to detect than a hard block because Firecrawl’s extraction pipeline will dutifully parse the challenge page and return empty or nonsensical fields.

The second problem is geofencing. A meaningful subset of Southeast Asian platforms serve different content, different prices, or different product catalogs depending on whether the visitor appears to be in Singapore or somewhere else. This is not hypothetical: Shopee SG and Shopee MY have separate inventory, promotions, and pricing. If your Firecrawl instance is egressing from a European datacenter, you may get redirected to a generic regional landing page or see prices in a different currency with no way to force the correct view through headers alone. A real SingTel or StarHub IP resolves this because the platform’s geolocation database classifies it as a Singapore mobile user, which is exactly the persona you need to see the data as a local would.
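Before pointing Firecrawl at a geofenced target, it is worth verifying that the proxy actually geolocates to Singapore. A minimal sketch using the standard library, where the ipinfo.io endpoint and the proxy URL format are illustrative assumptions, not part of any Firecrawl or SMP API:

```python
import json
from urllib.request import ProxyHandler, build_opener


def is_sg(geo_json: str) -> bool:
    """Return True if a geo-IP JSON payload reports Singapore."""
    return json.loads(geo_json).get("country") == "SG"


def check_proxy_geo(proxy_url: str) -> bool:
    # Route a request through the proxy and ask a geo-IP service
    # where the egress IP appears to be (endpoint is an example).
    opener = build_opener(ProxyHandler({"http": proxy_url, "https": proxy_url}))
    with opener.open("https://ipinfo.io/json", timeout=15) as resp:
        return is_sg(resp.read().decode())
```

Running `check_proxy_geo` once at pipeline startup catches misconfigured credentials before you burn a crawl budget on wrong-region data.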

setting up SMP credentials in Firecrawl

Firecrawl exposes a proxy configuration option both in its SDK and in its self-hosted deployment config. The SMP credential format is ip:port:username:password, which maps directly to a standard HTTP proxy auth string. Here is how you pass it when using the Firecrawl Python SDK against a scrape endpoint:

import os
from firecrawl import FirecrawlApp

# SMP credentials: host, port, username, password
SMP_HOST = "proxy.singaporemobileproxy.com"
SMP_PORT = 8080
SMP_USER = "your_username"
SMP_PASS = "your_password"

proxy_url = f"http://{SMP_USER}:{SMP_PASS}@{SMP_HOST}:{SMP_PORT}"

app = FirecrawlApp(api_key=os.environ["FIRECRAWL_API_KEY"])

result = app.scrape_url(
    url="https://www.shopee.sg/search?keyword=wireless+earbuds",
    params={
        "formats": ["markdown", "html"],
        "proxy": proxy_url,
        "waitFor": 3000,          # give JS time to render
        "timeout": 30000,
    }
)

print(result["markdown"])

A few things to note about this setup. First, the proxy parameter in Firecrawl’s scrape params is passed through to the underlying Playwright browser context, which means it applies to all requests the browser makes during the crawl, including subrequests for API calls that hydrate the page with product data. That is the behavior you want. Second, if you are using SOCKS5 instead of HTTP (SOCKS5 tends to perform better for long-lived sessions and has slightly lower detection surface), swap the scheme: socks5://user:pass@host:port. For a comparison of when each protocol is preferable, see HTTP vs SOCKS5 mobile proxies.
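The scheme swap can be captured in a small helper so callers choose the protocol explicitly. A sketch, with placeholder credentials:

```python
def build_proxy_url(scheme: str, user: str, password: str, host: str, port: int) -> str:
    """Build a proxy auth URL for Firecrawl's proxy param.

    scheme is "http" for standard proxying or "socks5" for
    long-lived sessions with a smaller detection surface.
    """
    if scheme not in ("http", "socks5"):
        raise ValueError(f"unsupported proxy scheme: {scheme}")
    return f"{scheme}://{user}:{password}@{host}:{port}"


# example with placeholder credentials:
# build_proxy_url("socks5", "your_username", "your_password",
#                 "proxy.singaporemobileproxy.com", 8080)
```

Centralizing this in one function also gives you a single place to switch protocols when benchmarking HTTP against SOCKS5 for a given target.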

If you are running the self-hosted version of Firecrawl rather than the managed API, the proxy configuration goes into your worker environment rather than the per-request params. You set PROXY_SERVER, PROXY_USERNAME, and PROXY_PASSWORD environment variables, which the Playwright worker picks up on startup. With SMP you will typically want per-request rotation rather than a single static proxy credential at the worker level, so the SDK approach above gives you more control in practice.
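For the self-hosted route, a sketch of wiring those environment variables before the worker starts. The variable names follow the text above; the split helper assumes the ip:port:username:password credential format and a password that contains no colons:

```python
import os


def smp_to_env(credential: str) -> dict:
    """Split an SMP ip:port:username:password credential into the
    PROXY_* environment variables the self-hosted worker reads."""
    host, port, user, password = credential.split(":")
    return {
        "PROXY_SERVER": f"http://{host}:{port}",
        "PROXY_USERNAME": user,
        "PROXY_PASSWORD": password,
    }


# apply before launching the Playwright worker (placeholder credential):
os.environ.update(
    smp_to_env("proxy.singaporemobileproxy.com:8080:your_username:your_password")
)
```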

rotating IPs per request or per session

The choice between rotating and sticky sessions is not about preference. It is about what the target site’s session model requires.

Rotating mode assigns a new IP for every connection. This is optimal for stateless scraping: fetching individual product pages, scraping SERPs one query at a time, or pulling public API responses where there is no login or cart state. Firecrawl’s default crawl mode, where it discovers and follows links from a seed URL, tends to work well with rotation because each page fetch is logically independent.

Sticky sessions pin you to the same IP for the duration of a session window (typically 10 to 30 minutes depending on your SMP plan). You need this whenever the target site ties any meaningful state to the IP: login flows, multi-step checkout pages you are monitoring for price changes, or any site that uses IP consistency as part of its fraud detection. Rotate IPs mid-session on a site like that and you will get logged out or hit with a step-up challenge on the next request.

import os
import time
from firecrawl import FirecrawlApp

SMP_HOST = "proxy.singaporemobileproxy.com"
SMP_PORT_ROTATING = 8080   # rotating endpoint
SMP_PORT_STICKY   = 8081   # sticky session endpoint
SMP_USER = "your_username"
SMP_PASS = "your_password"

app = FirecrawlApp(api_key=os.environ["FIRECRAWL_API_KEY"])

def scrape_with_rotation(url: str) -> dict:
    """Use for stateless pages: SERPs, product listings, public content."""
    proxy = f"http://{SMP_USER}:{SMP_PASS}@{SMP_HOST}:{SMP_PORT_ROTATING}"
    return app.scrape_url(url, params={"formats": ["markdown"], "proxy": proxy})

def scrape_with_sticky(url: str) -> dict:
    """Use for pages that require consistent IP: logged-in dashboards, multi-step flows."""
    proxy = f"http://{SMP_USER}:{SMP_PASS}@{SMP_HOST}:{SMP_PORT_STICKY}"
    return app.scrape_url(url, params={"formats": ["markdown"], "proxy": proxy})

# rotating: scrape a list of product pages independently
product_urls = [
    "https://www.carousell.sg/p/some-item-123/",
    "https://www.carousell.sg/p/another-item-456/",
]
for url in product_urls:
    data = scrape_with_rotation(url)
    print(data["markdown"][:200])
    time.sleep(1.5)  # be a polite crawler

# sticky: scrape a price-check page that requires a logged-in context
dashboard_data = scrape_with_sticky("https://example-sg-platform.com/dashboard/pricing")

SMP surfaces rotating and sticky sessions through separate port assignments rather than query parameters, which keeps the proxy URL format clean and avoids any ambiguity about which mode is active on a given request.

three real workflows where this combo wins

monitoring shopee and lazada pricing for SEA market research

Price intelligence on Shopee SG or Lazada SG is a common use case for teams doing competitive analysis or building repricing tools for SEA e-commerce sellers. Both platforms are aggressive about bot detection and serve geo-specific prices. A Firecrawl crawl against these sites from a US or EU datacenter IP will either get blocked outright or receive prices localized to the wrong region, making the data useless for SG market decisions.

With SMP, your Firecrawl requests arrive from real SingTel or StarHub mobile IPs that these platforms see every day from legitimate users. The prices returned are in SGD, the promotions are SG-specific, and the block rate drops dramatically. For large-scale crawls across thousands of product SKUs, you use the rotating endpoint and spread requests across a few seconds of jitter to avoid rate limits. The output feeds directly into an LLM pipeline to extract structured price and availability data from Firecrawl’s markdown.
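A sketch of the jittered rotating loop, with a naive SGD price extractor over Firecrawl's markdown output. The regex is illustrative and the `scrape_with_rotation` helper is assumed from the rotation example earlier; real extraction would typically go through your LLM pipeline rather than a regex:

```python
import random
import re
import time

# Matches prices like "S$29.90" or "S$1,299.00" in markdown (illustrative)
SGD_PRICE = re.compile(r"S\$\s?(\d+(?:,\d{3})*(?:\.\d{2})?)")


def extract_sgd_prices(markdown: str) -> list[float]:
    """Pull SGD price values out of Firecrawl markdown output."""
    return [float(m.replace(",", "")) for m in SGD_PRICE.findall(markdown)]


# jittered rotating crawl over a SKU list:
# for url in sku_urls:
#     page = scrape_with_rotation(url)            # rotating endpoint, as above
#     prices = extract_sgd_prices(page["markdown"])
#     time.sleep(1.0 + random.uniform(0.5, 3.0))  # jitter to stay under rate limits
```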

scraping Singapore SERP results for SEO tracking

Teams tracking keyword rankings in Google SG face a specific problem: Google personalizes and localizes SERP results by IP location, device type, and carrier. A desktop datacenter IP in Singapore gets different results than a mobile SingTel IP, and for any client whose customers are predominantly mobile, the mobile SERP is the ground truth. Firecrawl’s ability to render JavaScript is useful here because many SERP features (People Also Ask, shopping carousels, local pack results) are dynamically loaded.

By routing Firecrawl through SMP, the Google SERP you capture reflects what an actual SingTel mobile user in Singapore would see. This matters for mobile proxies for SEO research, where the gap between desktop datacenter rankings and real mobile carrier rankings can be significant enough to change strategic decisions. You get accurate rank positions, featured snippet content, and local pack data that reflects the actual search experience.
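A sketch of the scrape params for a mobile SERP capture. Whether your Firecrawl version accepts a `headers` override in scrape params should be checked against its docs; the user agent string is one example of an Android Chrome UA consistent with a carrier IP:

```python
# An Android Chrome UA consistent with a mobile carrier IP (example string)
MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 14; Pixel 8) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36"
)


def serp_params(proxy_url: str) -> dict:
    """Scrape params for a Google SG mobile SERP capture.

    Assumes the scrape endpoint accepts a headers override;
    verify against your Firecrawl version."""
    return {
        "formats": ["markdown"],
        "proxy": proxy_url,
        "headers": {"User-Agent": MOBILE_UA},
        "waitFor": 3000,  # let PAA boxes and carousels render
    }


# usage with the SDK from earlier:
# result = app.scrape_url(
#     "https://www.google.com/search?q=wireless+earbuds&gl=sg",
#     params=serp_params("http://user:pass@proxy.singaporemobileproxy.com:8080"),
# )
```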

extracting LLM training data from Singapore-specific platforms

A growing use case for Firecrawl in 2026 is sourcing structured text data for fine-tuning or retrieval-augmented generation pipelines. For teams building LLM products targeted at Singapore users, forums like HardwareZone, local news sites, or community platforms contain valuable Singapore-specific language patterns, local terminology, and domain knowledge. Many of these sites have started restricting access from known scraper IP ranges after a rise in AI training crawls.

SMP’s real carrier IPs are not in any commercial scraper IP blocklist because they are the same IPs used by ordinary Singapore mobile users. Firecrawl can access this content reliably where datacenter-based approaches have been shut out. The ethical framing matters here too: responsible data collection that respects rate limits and robots.txt is the standard to maintain, and ethical mobile proxy use covers the principles worth keeping in mind when building pipelines like this at scale.
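The robots.txt check can be done with the standard library before any URL reaches Firecrawl. A minimal gate, assuming you fetch and cache each site's robots.txt once:

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, path: str) -> bool:
    """Check a path against a site's robots.txt before queueing it for Firecrawl."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, path)


# In a pipeline: fetch https://<site>/robots.txt once (through the proxy),
# cache it, and filter your URL list with allowed_by_robots before
# handing anything to Firecrawl.
```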

common pitfalls

  • user agent and IP class mismatch: if you send a desktop Chrome user agent through a mobile carrier IP, some fingerprinting systems flag the inconsistency. SMP IPs are classified as mobile, so configure Firecrawl to use a mobile Chrome user agent string (Mozilla/5.0 (Linux; Android 14; Pixel 8) AppleWebKit/537.36...) when targeting mobile-specific content.
  • rotating IPs mid-session on stateful sites: if your Firecrawl crawl follows links across a site that uses IP-bound sessions, rotating between pages will log you out or trigger a re-authentication challenge. map your workflows to either fully stateless (rotating) or session-bound (sticky) before you start.
  • not accounting for proxy latency in timeout settings: mobile proxies have slightly higher latency than datacenter IPs because the traffic routes through real carrier infrastructure. the default Firecrawl timeout may be too aggressive. add 3 to 5 seconds of headroom to your timeout param.
  • geolocation headers overriding the IP signal: some Firecrawl configurations or downstream middleware inject X-Forwarded-For or CF-IPCountry headers that can contradict the real IP’s geolocation. make sure these headers are not being set to non-SG values, or strip them entirely and let the carrier IP speak for itself.
  • treating all 200 responses as success: as mentioned above, some anti-bot systems return 200 with a challenge page. add a validation step in your Firecrawl pipeline that checks for expected content markers (a specific CSS selector, a known string) before accepting the response as valid data.
  • ignoring robots.txt and rate limits: mobile IPs give you access, not permission to hammer a site. respect crawl delays and robots.txt directives, both because it is the right approach and because aggressive crawling from carrier IPs can get those IPs flagged, which degrades the pool for everyone.
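The soft-block pitfall can be handled with a small validation gate between Firecrawl and your storage layer. The marker strings here are illustrative; replace them with strings you know appear on real pages of your target:

```python
# Strings that suggest a challenge page rather than real content (examples)
CHALLENGE_MARKERS = ("verify you are human", "checking your browser", "cf-challenge")


def looks_like_real_page(markdown: str, expected_marker: str) -> bool:
    """Accept a 200 response only if it contains an expected content marker
    and none of the known challenge-page strings."""
    text = markdown.lower()
    if any(marker in text for marker in CHALLENGE_MARKERS):
        return False
    return expected_marker.lower() in text


# gate each scrape before storing it:
# if not looks_like_real_page(result["markdown"], "add to cart"):
#     retry_with_new_ip(url)  # hypothetical retry hook
```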

when Singapore IPs specifically matter

The value of a Singapore carrier IP is not just “residential.” It is specifically Singaporean, and that distinction matters for a meaningful subset of Firecrawl use cases. Platforms like Singpass-integrated services, SG government data portals, and local financial services sites perform strict geolocation checks that reject non-SG IPs regardless of whether they are residential or datacenter. A residential IP in Germany or a mobile IP in India will not pass these checks. An IP on SingTel’s mobile ASN will, because it is the same IP class that Singapore residents use every day to access these services.

Beyond hard geofencing, there is a softer but equally important dynamic: content localization. Google, Facebook, and most major platforms customize what they show based on inferred location, and mobile carrier IP is one of the strongest location signals available. For any Firecrawl workflow where you need to see Singapore as a Singapore user sees it, not as an approximation inferred from a VPN exit node or a datacenter in the right country, real SingTel, StarHub, and M1 IPs are the correct tool. If you want to understand more about how mobile proxies differ from other IP types at a technical level, what is a mobile proxy is a useful reference before setting up your first integration.

getting started

If you are ready to wire this up, the Singapore Mobile Proxy plans page has current pricing and endpoint details for both HTTP and SOCKS5, with options for rotating and sticky session pools. Start with the rotating endpoint for stateless scraping workflows and move to sticky sessions once you have a use case that requires session consistency. The credential format is straightforward and you can have Firecrawl running through SMP within a few minutes of provisioning. For teams already running Firecrawl in production, this is typically a one-line change to your scrape params, and the improvement in data quality on protected or geo-fenced sites is visible immediately.

ready to try Singapore mobile proxies?

2-hour free trial. no credit card required.

start free trial
message me on telegram