Posts from @kala
Link
@kala shared a link, 2 months, 3 weeks ago
FAUN.dev()

FinePDFs: Liberating 3T of the finest tokens from PDFs - a Hugging Face Space by HuggingFaceFW

Hugging Face introduces FinePDFs, a large open dataset built by extracting and cleaning text from millions of PDF documents, reaching trillions of tokens across many languages. The post explains how the pipeline handles messy PDF structure, layout noise, duplication, and low-quality content to produ.. read more  

Link
@kala shared a link, 2 months, 3 weeks ago
FAUN.dev()

Recursive Language Models: the paradigm of 2026

Prime Intellect dropped a fresh take on long-range LLM workflows with its Recursive Language Model (RLM) scaffold. It pulls off two smart moves: folds context to free up tokens and spins off sub-LLMs to handle chunkier tasks. Think persistent Python REPL meets lightweight agent swarm... read more

Link
@kala shared a link, 2 months, 3 weeks ago
FAUN.dev()

Reading across books with Claude Code

A custom LLM agent, built with Claude Code and some hard-working CLI tools, chewed through 100+ nonfiction books by slicing them into 500-word semantic chunks - and then threading excerpt trails by topic. Under the hood: Chunk-topic indexes lived in SQLite. Topic embeddings flowed through UMAP for clust.. read more

News FAUN.dev() Team
@kala shared an update, 2 months, 3 weeks ago
FAUN.dev()

Anthropic’s New "Economic Primitives" Reveal Who Uses Claude, for What, and How Well It Works

Anthropic's new Economic Index report introduces five "economic primitives" to measure *how* Claude is used: task complexity, user and AI skill level, use case (work, coursework, personal), autonomy, and task success - built from privacy-preserving classification of anonymized Claude.ai and first-party API transcripts from **November 2025**.

Link
@kala shared a link, 2 months, 4 weeks ago
FAUN.dev()

8 plots that explain the state of open models

At the start of 2026, Chinese companies dominate the open AI model scene, with Qwen leading in adoption metrics. Despite the rise of new entrants like Z.ai, MiniMax, Kimi Moonshot, and others, Qwen's position seems secure. DeepSeek's large models are showing potential to compete with Qwen, but the Ch.. read more

Link
@kala shared a link, 2 months, 4 weeks ago
FAUN.dev()

Build an AI-powered website assistant with Amazon Bedrock

AWS spun up a serverless RAG-based support assistant using Amazon Bedrock and Bedrock Knowledge Bases. It pulls in docs via a web crawler and S3, then stuffs embeddings into Amazon OpenSearch Serverless. Access is role-aware, locked down with Cognito. Everything spins up clean with AWS CDK... read more

Link
@kala shared a link, 2 months, 4 weeks ago
FAUN.dev()

Where good ideas come from (for coding agents)

A new way to build agents treats prompting as context navigation, steering the LLM through ideas like a pilot, not tossing it prompts and hoping for magic. It maps neatly onto Steven Johnson’s seven patterns of innovation. For coding agents to actually pull their weight, users need to bring more than.. read more

Link
@kala shared a link, 2 months, 4 weeks ago
FAUN.dev()

Agentic AI, MCP, and spec-driven development: Top blog posts of 2025

AI speeds up dev - but it’s a double-edged keyboard. It sneaks in subtle bugs and brittle logic that break under pressure. To keep things sane, teams are fighting back with guardrail patterns, AI-aware linters, and test suites hardened for hallucinated code... read more

Link
@kala shared a link, 2 months, 4 weeks ago
FAUN.dev()

Towards Generalizable and Efficient Large-Scale Generative Recommenders

The authors discuss their approach to scaling generative recommendation models from O(1M) to O(1B) parameters for Netflix tasks, improving training stability, computational efficiency, and evaluation methodology. They address challenges in alignment, cold-start adaptation, and deployment, proposing syst.. read more

News FAUN.dev() Team
@kala shared an update, 2 months, 4 weeks ago
FAUN.dev()

OpenAI Goes All-In on Healthcare: ChatGPT Health for Consumers, and a Suite for Hospitals

#ChatGPT  #HIPAA  #Healthc...  #AI  #OpenAI 
ChatGPT GPT-5.2

OpenAI introduces ChatGPT for Healthcare, offering HIPAA-compliant AI tools to enhance healthcare delivery. The suite includes ChatGPT Health, designed to integrate health information with AI for improved user navigation.
