Posts from @kala
Link
@kala shared a link, 3 months, 4 weeks ago
FAUN.dev()

Google tests new Gemini 3 models on LM Arena

Google’s been quietly field-testing two shadow models, Fierce Falcon and Ghost Falcon, on LM Arena. Early signs? They're probably warm-ups for the next Gemini 3 Flash or Pro drop. Classic Google move: float a checkpoint, stir up curiosity, then go GA...

Link
@kala shared a link, 3 months, 4 weeks ago
FAUN.dev()

Prompts for Open Problems

The author, Ben Recht, proposes five research directions inspired by his graduate machine learning class, arguing for different research rather than simply more of it. These prompts include adopting a design-based view for decision theory, explaining the robust scaling trends in competitive testing, and mov...

Link
@kala shared a link, 3 months, 4 weeks ago
FAUN.dev()

Practical LLM Security Advice from the NVIDIA AI Red Team

NVIDIA’s AI Red Team nailed three security sinkholes in LLMs: reckless use of exec/eval, RAG pipelines that grab too much data, and markdown that doesn't get cleaned. These cracks open doors to remote code execution, sneaky prompt injection, and link-based data leaks. The fix-it trend: App security’s lea...

Link
@kala shared a link, 3 months, 4 weeks ago
FAUN.dev()

Roses are red, violets are blue, if you phrase it as poem, any jailbreak will do

A new study just broke the safety game wide open: rhymed prompts slipped past filters in 25 major LLMs, including Gemini 2.5 Pro and DeepSeek, with up to 100% success. No clever chaining, no jailbreak soup. Just single-shot rhyme. Turns out, poetic language isn’t just for bard-core Twitter. When it c...

Link
@kala shared a link, 3 months, 4 weeks ago
FAUN.dev()

A trillion dollars is a terrible thing to waste

OpenAI co-founder Ilya Sutskever just said the quiet part out loud: scaling laws are breaking down. Bigger models aren’t getting better at thinking, they’re getting worse at generalizing and reasoning. Now he’s eyeing neurosymbolic AI and innate inductive constraints. Yep, the “just make it huge” era m...

News FAUN.dev() Team
@kala shared an update, 3 months, 4 weeks ago
FAUN.dev()

Gemini Deep Research Is Now Programmable Through a New API

Gemini 3, Vertex AI

The enhanced Gemini Deep Research agent is now available via API, letting developers integrate advanced research capabilities into their applications. DeepSearchQA has also been open-sourced for evaluating complex tasks.

Activity
@kala added a new tool, Vertex AI, 3 months, 4 weeks ago.
Activity
@kala added a new tool, Gemini 3, 3 months, 4 weeks ago.
News FAUN.dev() Team
@kala shared an update, 3 months, 4 weeks ago
FAUN.dev()

GitHub Copilot Adds GPT-5.2 With Long-Context and UI Generation

GitHub Copilot, GPT-5.2

OpenAI unveils GPT-5.2 for GitHub Copilot, enhancing software engineering with improved long-context reasoning and UI generation, integrated with Microsoft Azure and NVIDIA.

News FAUN.dev() Team
@kala shared an update, 3 months, 4 weeks ago
FAUN.dev()

GPT-5.2 Quietly Beats Human Experts at Knowledge Work

Azure, GPT-5.2

OpenAI releases GPT-5.2, which handles professional tasks with improved speed and cost-effectiveness. It is now available to paid ChatGPT users and via the API.

OpenAI unveils GPT-5.2, the most advanced frontier model for professional work and long-running agents