Posts from @faun
@faun shared a link, 1 week, 1 day ago

The Big LLM Architecture Comparison

Architectures since GPT-2 still ride transformers. They crank memory and performance with RoPE, swap GQA for MLA, sprinkle in sparse MoE, and roll sliding-window attention. Teams shift RMSNorm. They tweak layer norms with QK-Norm, locking in training stability across modern models. Trend to watch: In 2025,..

@faun shared a link, 1 week, 1 day ago

GPT-5 Thinking in ChatGPT (aka Research Goblin) is shockingly good at search

GPT-5's “thinking” model just leveled up. It's not just answering queries; it’s doing full-on research. Picture deep, multi-step Bing searches mixed with tool use and reasoning chains. It reads PDFs. Analyzes them. Suggests what to do next. Then actually does it. All from your phone. What’s changing: L..

@faun shared a link, 1 week, 1 day ago

Best Practices for High Availability of LLM Based on AI Gateway

Alibaba Cloud’s AI Gateway just got sharper. It now handles real-time overload protection and LLM fallback routing using passive health checks, first packet timeouts, and traffic shaping. It proxies both BYO and cloud LLMs (think PAI-EAS, Tongyi Qianwen) and redirects load spikes or failures on the fly. F..

@faun shared a link, 1 week, 1 day ago

Simplifying Large-Scale LLM Processing across Instacart with Maple

Instacart built Maple, a backend brain for handling millions of LLM prompts: fast, cheap, and shared across teams. It’s not just another service. Maple runs on Temporal, PyArrow, and S3, strips away provider-specific boilerplate, auto-batches prompts, retries failures, and slashes LLM costs by up t..

@faun shared a link, 1 week, 1 day ago

Hermes V3: Building Swiggy’s Conversational AI Analyst

Swiggy just gave its GenAI tool, Hermes, a serious glow-up. What started as a simple text-to-SQL bot is now a context-aware AI analyst that lives inside Slack. The upgrade? Not just tweaks, but an overhaul. Think: vector-based prompt retrieval, session-level memory, an Agent orchestration layer, and a SQL..

@faun shared a link, 1 week, 1 day ago

Why language models hallucinate

OpenAI sheds light on why hallucinations persist in language models: evaluation methods favor guessing over honesty, so fixing them requires a shift towards rewarding acknowledgment of uncertainty. High model accuracy does not equate to the eradication of hallucinations, as some questions are inherentl..

@faun shared a link, 1 week, 1 day ago

Sandboxed to Compromised: New Research Exposes Credential Exfiltration Paths in AWS Code Interpreters

Researchers poked holes in sandboxed Bedrock AgentCore code interpreters and found a way to leak execution role credentials through the MicroVM Metadata Service (MMDS). No outside network? Doesn’t matter. The exploit dodges basic string filters in requests and lets non-agentic code swipe AWS creds to ..

@faun shared a link, 1 week, 1 day ago

Deploy a containerized application with Kamal and Terraform

A Docker-first workflow combines Terraform and Kamal into a lean, Elastic Beanstalk-ish alternative, without the bloat. Terraform spins up a three-tier VPC and wires it to ECR. Kamal takes it from there, booting containers on a raw EC2 box: app, proxy, monitor. One script. Done...

@faun shared a link, 1 week, 1 day ago

Measuring Developer Productivity with Amazon Q Developer and Jellyfish

Amazon Q Developer now plugs into Jellyfish. Teams get a clearer view of how AI fits into the real flow of work—prompt usage, code adoption, PR throughput. Not just surface stats. The setup pipes data from AWS S3 straight into Jellyfish’s analytics engine. It tags AI users, tracks velocity gains, an..

@faun shared a link, 1 week, 1 day ago

AWS, Microsoft and Google unite behind Linux Foundation DocumentDB database to cut enterprise costs and limit vendor lock-in

Document databases are crucial for AI apps in the gen AI era. Microsoft's open-source DocumentDB project, based on PostgreSQL, is moving to the Linux Foundation, offering a vendor-neutral, open-source alternative to MongoDB. DocumentDB's compatibility with MongoDB drivers and open source governance ..