
From Script to Daemon: Architecting a Resilient AI News Radar on an 8GB Mac

Updated
4 min read

Developer from India.

How I scaled a simple crawler into a "Staff-Level" automated research assistant by overcoming flaky inputs, hardware limits, and API rot.


As engineers, we are expected to stay on top of everything: Netflix’s latest architecture, AWS updates, Go 1.23 releases. The fear of missing out (FOMO) is real, but the time to read is nonexistent.

I wanted to solve this. My goal was simple: Build an agent that reads engineering blogs for me and sends summaries to my phone.

But building it on my daily driver (M1 MacBook Air, 8GB RAM) forced me to evolve the design from a "script" to a "daemon." Here is the story of that evolution through four major bottlenecks.

Bottleneck #1: The Ingestion Dilemma (HTML is a Trap)

The "Just Scrape It" Mistake

My first instinct was to build a standard web scraper using Colly or GoQuery. I thought, "I'll just fetch the HTML and find the article links."

I immediately hit three walls:

  1. The DOM Stability Problem: Tech blogs (especially Medium-based ones like Netflix's) use dynamic class names like <div class="x7-y8-z">. Every time they deployed a UI update, my crawler broke.

  2. The JavaScript Wall: Many modern blogs (Uber, DoorDash) render content via React/Hydration. A simple http.Get returned an empty skeleton, forcing me to consider heavy tools like Selenium or Playwright.

  3. The Resource Tax: Running a Headless Browser (like Chrome) to scrape 5 sites consumes ~1GB of RAM. On an 8GB machine, that’s 12% of my total memory just to find a URL.

The Alternatives Analysis

| Strategy         | Pros                 | Cons                          | Verdict                  |
|------------------|----------------------|-------------------------------|--------------------------|
| HTML Scraping    | Can get everything   | Brittle; breaks on UI changes | ❌ Too high maintenance  |
| Headless Browser | Renders JS perfectly | Heavy CPU/RAM usage; slow     | ❌ Too heavy for M1 Air  |
| RSS / Atom Feeds | Standardized XML     | Limited to feed content       | ✅ The winner            |

The Solution: Boring is Better (RSS)

I pivoted to RSS Feeds.

  • Why: It is a standardised XML contract. It doesn't care about CSS classes, React, or ads.

  • Efficiency: Parsing 10 XML feeds takes milliseconds and kilobytes of RAM, compared to seconds and gigabytes for headless browsing.

  • Code: I swapped 200 lines of fragile HTML parsing for the robust gofeed library.

Bottleneck #2: The Hardware Reality Check

The "Hello World" Mistake

With the links secured, I tried to summarise them locally using Ollama and Llama 3 (8B).

The Crash:

My 8GB M1 Air immediately choked. The OS takes ~3GB, VS Code takes ~1GB. Loading an 8B-parameter model (which needs ~4GB+ of unified memory) left zero room for the Go compiler. My laptop turned into a heater, and the "summarisation" took 45 seconds per article.

The Solution: Cloud Delegation

I realized that Hardware Constraints dictate Architecture. I refactored the system to use the Strategy Pattern, allowing me to swap the "Brain" of the agent.

I moved from local inference to Google Gemini (Flash model).

  • Cost: $0 (Free tier).

  • Latency: 2 seconds (vs 45s).

  • RAM Usage: <50MB.

Bottleneck #3: The Rate Limit Wall (429s)

The "Too Fast" Mistake

With RSS (fast) and Gemini (cloud), my agent became too efficient. It grabbed 15 URLs and fired 15 concurrent requests to Gemini.

The Crash:

429 Too Many Requests. The free tier limits you to ~15 Requests Per Minute (RPM), and sometimes 5 RPM for newer models. My agent crashed instantly.

The Solution: Intelligent Pacing

I couldn't just "try again." I needed to design for the constraint.

  1. Exponential Backoff: If the API says "Stop", we wait 2s, then 4s, then 8s.

  2. The Speed Bump: I added a calculated delay in the main loop to mathematically guarantee compliance.

Go

// Staff-level resilience: don't just hammer the API.
cfg.RequestsPerMinute = 4 // ultra-safe, well under the free-tier limit
safeDelay := time.Minute / time.Duration(cfg.RequestsPerMinute)

var wg sync.WaitGroup
for _, job := range jobs {
    wg.Add(1)
    go func(j Job) {
        defer wg.Done()
        worker(j)
    }(job)
    time.Sleep(safeDelay) // the "Speed Bump": paces dispatch to 4 RPM
}
wg.Wait() // don't exit while summaries are still in flight

Bottleneck #4: "Model Rot" & Hardcoding

The "It Worked Yesterday" Mistake

I hardcoded the model string "gemini-1.5-flash" into my source code. One morning, I woke up to 404 Model Not Found. Google had deprecated the alias, and my binary was useless until I recompiled it.

The Solution: Dynamic Configuration

I learned that Dependencies change faster than Code. I refactored the initialization logic to pull the model version from Environment Variables (GEMINI_MODEL). Now, when a model is deprecated, I just update my .env file—no recompile needed.

The Final Architecture

Today, the system is a robust background daemon that I trust.

  • Inputs: RSS Feeds (Polled every 6 hours).

  • State: A simple history.json file prevents re-reading old articles.

  • Brain: Gemini Flash (Configurable).

  • Output: Telegram Notifications.

Key Takeaways

  1. Inputs Matter: Don't scrape HTML if an XML feed exists. Reliability > "Getting everything."

  2. Respect Constraints: If you have 8GB RAM, you can't run Llama 70B. Move the compute.

  3. Resilience > Speed: A slow crawler that never crashes is infinitely better than a fast one that dies on the 10th request.

You can check out the open-source code here: https://github.com/AkshayContributes/crawler-agent

#Go #SystemDesign #WebScraping #AI #Engineering #SideProject #Gemini