From Script to Daemon: Architecting a Resilient AI News Radar on an 8GB Mac

How I scaled a simple crawler into a "Staff-Level" automated research assistant by overcoming flaky inputs, hardware limits, and API rot.
As engineers, we are expected to stay on top of everything: Netflix’s latest architecture, AWS updates, Go 1.23 releases. The fear of missing out (FOMO) is real, but the time to read is nonexistent.
I wanted to solve this. My goal was simple: Build an agent that reads engineering blogs for me and sends summaries to my phone.
But building it on my daily driver (an M1 MacBook Air with 8GB of RAM) forced me to evolve the design from a "script" into a "daemon." Here is the story of that evolution through four major bottlenecks.
Bottleneck #1: The Ingestion Dilemma (HTML is a Trap)
The "Just Scrape It" Mistake
My first instinct was to build a standard web scraper using Colly or GoQuery. I thought, "I'll just fetch the HTML and find the article links."
I immediately hit three walls:
The DOM Stability Problem: Tech blogs (especially Medium-based ones like Netflix's) use dynamic class names like <div class="x7-y8-z">. Every time they deployed a UI update, my crawler broke.
The JavaScript Wall: Many modern blogs (Uber, DoorDash) render content via React hydration. A simple http.Get returned an empty skeleton, forcing me to consider heavy tools like Selenium or Playwright.
The Resource Tax: Running a headless browser (like Chrome) to scrape 5 sites consumes ~1GB of RAM. On an 8GB machine, that's 12% of my total memory just to find a URL.
The Alternatives Analysis
| Strategy | Pros | Cons | Verdict |
| --- | --- | --- | --- |
| HTML Scraping | Can get everything | Brittle; breaks on UI changes | ❌ Too High Maintenance |
| Headless Browser | Renders JS perfectly | Heavy CPU/RAM usage; slow | ❌ Too Heavy for M1 Air |
| RSS / Atom Feeds | Standardized XML | Limited to feed content | ✅ The Winner |
The Solution: Boring is Better (RSS)
I pivoted to RSS Feeds.
Why: It is a standardised XML contract. It doesn't care about CSS classes, React, or ads.
Efficiency: Parsing 10 XML feeds takes milliseconds and kilobytes of RAM, compared to seconds and gigabytes for headless browsing.
Code: I swapped 200 lines of fragile HTML parsing for the robust gofeed library.
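The production code uses gofeed, but the "standardised XML contract" is simple enough that the core idea fits in a few lines of stdlib Go. Here is a minimal sketch using encoding/xml (the struct shapes and sample feed are illustrative, not gofeed's API):

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Minimal RSS 2.0 shapes: only the fields the agent actually needs.
type rssDoc struct {
	Channel struct {
		Items []Item `xml:"item"`
	} `xml:"channel"`
}

type Item struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
}

// parseRSS extracts article titles and links from a raw feed body.
// No CSS selectors, no JS rendering: the XML contract does the work.
func parseRSS(body []byte) ([]Item, error) {
	var doc rssDoc
	if err := xml.Unmarshal(body, &doc); err != nil {
		return nil, err
	}
	return doc.Channel.Items, nil
}

func main() {
	feed := []byte(`<rss version="2.0"><channel>
		<item><title>Scaling Search</title><link>https://example.com/a</link></item>
		<item><title>Go Profiling</title><link>https://example.com/b</link></item>
	</channel></rss>`)
	items, err := parseRSS(feed)
	if err != nil {
		panic(err)
	}
	for _, it := range items {
		fmt.Println(it.Title, "->", it.Link)
	}
}
```

In practice gofeed is the better choice because it also normalises Atom and RSS 1.0 into one item type, but the sketch shows why this path costs kilobytes instead of the gigabytes a headless browser needs.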
Bottleneck #2: The Hardware Reality Check
The "Hello World" Mistake
With the links secured, I tried to summarise them locally using Ollama and Llama 3 (8B).
The Crash:
My 8GB M1 Air immediately choked. The OS takes ~3GB, VS Code takes ~1GB. Loading an 8B parameter model (which needs ~4GB+ VRAM) left zero room for the Go compiler. My laptop turned into a heater, and the "summarisation" took 45 seconds per article.
The Solution: Cloud Delegation
I realized that Hardware Constraints dictate Architecture. I refactored the system to use the Strategy Pattern, allowing me to swap the "Brain" of the agent.
I moved from local inference to Google Gemini (Flash model).
Cost: $0 (Free tier).
Latency: 2 seconds (vs 45s).
RAM Usage: <50MB.
Bottleneck #3: The Rate Limit Wall (429s)
The "Too Fast" Mistake
With RSS (fast) and Gemini (cloud), my agent became too efficient. It grabbed 15 URLs and fired 15 concurrent requests to Gemini.
The Crash:
429 Too Many Requests. The free tier limits you to ~15 Requests Per Minute (RPM), and sometimes 5 RPM for newer models. My agent crashed instantly.
The Solution: Intelligent Pacing
I couldn't just "try again." I needed to design for the constraint.
Exponential Backoff: If the API says "Stop", we wait 2s, then 4s, then 8s.
The Speed Bump: I added a calculated delay in the main loop to mathematically guarantee compliance.
```go
// Staff-level resilience: don't just hammer the API.
cfg.RequestsPerMinute = 4 // ultra-safe limit, well under the free tier's 5 RPM
safeDelay := time.Minute / time.Duration(cfg.RequestsPerMinute)

for _, job := range jobs {
	go worker(job)
	time.Sleep(safeDelay) // the "Speed Bump": stagger goroutine starts
}
```
Bottleneck #4: "Model Rot" & Hardcoding
The "It Worked Yesterday" Mistake
I hardcoded the model string "gemini-1.5-flash" into my source code. One morning, I woke up to 404 Model Not Found. Google had deprecated the alias, and my binary was useless until I recompiled it.
The Solution: Dynamic Configuration
I learned that Dependencies change faster than Code. I refactored the initialization logic to pull the model version from Environment Variables (GEMINI_MODEL). Now, when a model is deprecated, I just update my .env file—no recompile needed.
The Final Architecture
Today, the system is a robust background daemon that I trust.
Inputs: RSS Feeds (Polled every 6 hours).
State: A simple history.json file prevents re-reading old articles.
Brain: Gemini Flash (configurable).
Output: Telegram Notifications.
Key Takeaways
Inputs Matter: Don't scrape HTML if an XML feed exists. Reliability > "Getting everything."
Respect Constraints: If you have 8GB RAM, you can't run Llama 70B. Move the compute.
Resilience > Speed: A slow crawler that never crashes is infinitely better than a fast one that dies on the 10th request.
You can check out the open-source code here: https://github.com/AkshayContributes/crawler-agent
#Go #SystemDesign #WebScraping #AI #Engineering #SideProject #Gemini