A Node.js/Express API that scrapes, deduplicates, and caches football news from BBC Sport, ESPN, Sky Sports, The Guardian, Goal, 90mins, OneFootball, and FourFourTwo — returning clean JSON on every call.
All sources are scraped in real time with Cheerio. Each item returns a title, URL, image (where available), and source label.
Each news source has its own endpoint. Hit /news/bbc, /news/espn, or /news/skysports independently if you only need one outlet.
/news/all runs all scrapers in parallel with Promise.allSettled, merges the results, and deduplicates by URL — one call, everything, no duplicates.
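The dedupe step can be sketched roughly as below. This is an illustration, not the project's actual code: `mergeAndDedupe` is a hypothetical name, and it assumes each scraper yields an array of items shaped like the response sample (title, url, image, source). The first item seen for a given URL is kept.

```javascript
// Hypothetical helper: flatten per-source arrays and keep the first item per URL.
function mergeAndDedupe(feeds) {
  const seen = new Set();
  const merged = [];
  for (const item of feeds.flat()) {
    // Skip malformed items and URLs that were already emitted.
    if (!item || !item.url || seen.has(item.url)) continue;
    seen.add(item.url);
    merged.push(item);
  }
  return merged;
}

// Example data (made up for the demo).
const bbc = [{ title: "A", url: "https://example.com/a", source: "BBC Sport" }];
const espn = [
  { title: "A (syndicated)", url: "https://example.com/a", source: "ESPN" },
  { title: "B", url: "https://example.com/b", source: "ESPN" },
];

console.log(mergeAndDedupe([bbc, espn]).map((i) => i.title)); // [ 'A', 'B' ]
```

Matching on the exact URL string is the simplest rule. Whether the real API normalises URLs first (for example, stripping tracking parameters) is not stated.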
/news/worldcup scrapes the dedicated World Cup pages on BBC, Sky Sports, and The Guardian, then filters the general feeds by WC 2026 keywords. It uses a 3-minute TTL for faster refresh.
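A keyword filter of this kind might look like the sketch below. The keyword list and the `isWorldCupItem` name are assumptions made for illustration. The API's actual keywords are not listed here, and it may match on more than the title.

```javascript
// Assumed keyword list, for illustration only.
const WC_KEYWORDS = ["world cup", "wc 2026", "fifa 2026"];

// Returns true when the item's title contains any keyword (case-insensitive).
function isWorldCupItem(item, keywords = WC_KEYWORDS) {
  const title = (item.title || "").toLowerCase();
  return keywords.some((kw) => title.includes(kw));
}

// Example general-feed items (made up for the demo).
const general = [
  { title: "World Cup 2026 draw: who could England face?", url: "https://example.com/1" },
  { title: "Arsenal confirm new signing ahead of UCL clash", url: "https://example.com/2" },
];

console.log(general.filter((item) => isWorldCupItem(item)).length); // 1
```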
FourFourTwo endpoints broken out by competition — EPL, La Liga, Champions League, Bundesliga — so you can target one league without client-side filtering.
Vercel KV caching with a 5-minute TTL on news feeds and a 3-minute TTL on the World Cup and unified feeds. Every response tells you whether it came from cache or a fresh scrape.
30 req/min per client enforced at router level. CORS enabled globally. All responses follow a consistent { success, data, cached } wrapper.
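To illustrate what a 30 req/min per-client limit means in practice, here is a minimal fixed-window limiter. It is a sketch under assumptions: the source does not say which library or algorithm the API uses, and `createRateLimiter` and the client key (an IP address here) are hypothetical.

```javascript
// Fixed-window limiter sketch: at most `limit` requests per `windowMs` per client key.
function createRateLimiter({ limit = 30, windowMs = 60_000 } = {}) {
  const windows = new Map(); // key -> { start, count }
  return function allow(key, now = Date.now()) {
    const w = windows.get(key);
    // Start a new window on first contact or once the old window has elapsed.
    if (!w || now - w.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= limit;
  };
}

const allow = createRateLimiter();
let accepted = 0;
for (let i = 0; i < 35; i++) {
  if (allow("203.0.113.7", 0)) accepted++;
}
console.log(accepted); // 30
console.log(allow("203.0.113.7", 60_000)); // true (a new window has started)
```

In an Express app, a check like this would typically sit in a middleware that answers with HTTP 429 when `allow` returns false.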
/health                        API health: uptime, version, timestamp
/news                          List of all active news sources
/news/all                      Merged, deduped feed from all sources (Main)
/news/worldcup                 World Cup 2026 specific news (WC 2026)
/news/bbc                      BBC Sport football
/news/espn                     ESPN football (includes images)
/news/skysports                Sky Sports football
/news/guardian                 The Guardian football section
/news/goal                     Goal.com (includes images)
/news/90mins                   90mins (si.com/soccer), includes images
/news/onefootball              OneFootball home feed
/news/fourfourtwo/epl          Premier League
/news/fourfourtwo/laliga       La Liga
/news/fourfourtwo/ucl          Champions League
/news/fourfourtwo/bundesliga   Bundesliga
Full live testing and response samples available on the RapidAPI listing.
Every endpoint returns the same wrapper. Each news item includes title, URL, image (where available), and the source it came from.
{
  "success": true,
  "cached": false,
  "data": [
    {
      "title": "Arsenal confirm new signing ahead of UCL clash",
      "url": "https://www.bbc.com/sport/football/articles/...",
      "image": "https://ichef.bbci.co.uk/...",
      "source": "BBC Sport"
    }
  ]
}
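On the client side, the wrapper above could be handled with a small helper like this one. It is a sketch: `unwrapFeed` is a hypothetical name, and it assumes that a failed call is signalled by `success` not being `true`, which the sample does not show explicitly.

```javascript
// Hypothetical client-side helper: validate the wrapper and return the items.
function unwrapFeed(body) {
  if (!body || body.success !== true || !Array.isArray(body.data)) {
    throw new Error("Unexpected response shape");
  }
  return { items: body.data, fromCache: Boolean(body.cached) };
}

// The sample response from above, as a parsed object.
const sample = {
  success: true,
  cached: false,
  data: [
    {
      title: "Arsenal confirm new signing ahead of UCL clash",
      url: "https://www.bbc.com/sport/football/articles/...",
      image: "https://ichef.bbci.co.uk/...",
      source: "BBC Sport",
    },
  ],
};

const { items, fromCache } = unwrapFeed(sample);
console.log(items[0].source, fromCache); // BBC Sport false
```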
In v1, the cache lookup happened after the scrape was already done — so every request was a cold scrape regardless. Fixed by moving cache-check first inside a unified scraperRoute wrapper that handles the entire flow.
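The cache-first ordering can be sketched as follows. This is not the actual `scraperRoute` code: an in-memory `Map` stands in for Vercel KV, and `cachedScrape`, the cache key, and the fake scraper are hypothetical.

```javascript
// In-memory stand-in for Vercel KV, for illustration only.
const store = new Map();

// Hypothetical cache-first wrapper: look in the cache before scraping.
async function cachedScrape(key, ttlMs, scrape, now = Date.now()) {
  const hit = store.get(key);
  if (hit && hit.expires > now) {
    return { success: true, cached: true, data: hit.data };
  }
  const data = await scrape(); // only runs on a miss or an expired entry
  store.set(key, { data, expires: now + ttlMs });
  return { success: true, cached: false, data };
}

// Fake scraper that counts how often it is actually invoked.
let scrapes = 0;
const fakeScraper = async () => {
  scrapes += 1;
  return [{ title: "Example", url: "https://example.com/a" }];
};

async function demo() {
  const first = await cachedScrape("news:bbc", 5 * 60_000, fakeScraper, 0);
  const second = await cachedScrape("news:bbc", 5 * 60_000, fakeScraper, 1_000);
  console.log(first.cached, second.cached, scrapes); // false true 1
}
demo();
```

The v1 bug corresponds to calling `scrape()` before the `store.get` check, which makes the cache lookup pointless.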
Upstream sources occasionally hang. Added timeout handling in fetchHTML via Axios so slow or unresponsive sources fail fast, especially important in /news/all where everything runs in parallel.
BBC, Sky Sports, and The Guardian all return 403s or empty pages for headless requests. Adding a realistic browser User-Agent in every fetchHTML call fixed the block immediately.
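A helper combining the timeout and the browser User-Agent might look like the sketch below. Note the swap: the project uses Axios, while this version uses the global `fetch` and `AbortSignal.timeout` available in Node 18+ so that it runs without dependencies. The timeout value and the User-Agent string are assumptions, not the values the API uses.

```javascript
// Assumed values for illustration; the source states neither of them.
const TIMEOUT_MS = 8000;
const USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36";

// Sketch of a fetchHTML-style helper built on global fetch instead of Axios.
async function fetchHTML(url, timeoutMs = TIMEOUT_MS) {
  const res = await fetch(url, {
    headers: { "User-Agent": USER_AGENT },
    // Aborts the request if it has not completed within timeoutMs.
    signal: AbortSignal.timeout(timeoutMs),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.text();
}
```

A caller would then do something like `const html = await fetchHTML("https://www.bbc.com/sport/football")` and hand the result to Cheerio. A slow source rejects after the timeout instead of holding up the whole request.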
In v1 the rate limiter was mounted too late in the middleware chain, so requests reached the routes before it could apply. It is now mounted before app.use("/api/v2", router), so it intercepts every request to the router.
CORS was scoped to individual routes in v1, missing several endpoints. Fixed by applying it globally at the app level so every route is accessible from browser clients without exceptions.
Promise.all rejects entirely if any scraper throws. Switched to Promise.allSettled so the unified feed returns whatever succeeded — partial data beats a 500 error every time.
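The difference is easy to see with one scraper that succeeds and one that fails. The sketch below uses made-up scrapers and a hypothetical `scrapeAll` helper. It only illustrates the `Promise.allSettled` pattern, not the API's real code.

```javascript
// Hypothetical scrapers: one succeeds, one fails.
const scrapers = {
  bbc: async () => [{ title: "A", url: "https://example.com/a", source: "BBC Sport" }],
  espn: async () => {
    throw new Error("upstream timeout");
  },
};

// Runs every scraper in parallel and keeps whatever succeeded.
async function scrapeAll(sources) {
  const names = Object.keys(sources);
  const results = await Promise.allSettled(names.map((name) => sources[name]()));
  const data = [];
  const failed = [];
  results.forEach((result, i) => {
    if (result.status === "fulfilled") data.push(...result.value);
    else failed.push(names[i]); // remember which source failed
  });
  return { data, failed };
}

scrapeAll(scrapers).then(({ data, failed }) => {
  console.log(data.length, failed); // 1 [ 'espn' ]
});
```

With `Promise.all`, the same input would reject with "upstream timeout" and the BBC items would be lost.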
News dashboards that want a unified feed from multiple outlets without building and maintaining their own scrapers.
Prediction and tracking apps that need a dedicated WC 2026 news stream separate from general football noise.
Telegram and Discord bots pushing league-specific or competition-specific headlines to football communities.
Apps targeting EPL, La Liga, UCL, or Bundesliga in isolation using the FourFourTwo league-specific endpoints.
Subscribe on RapidAPI. Free tier available. Start hitting endpoints in minutes.