Daniel’s Substack
Accelerating Analytical Joins on Unstructured Data
Semantic joins over unstructured data have become essential to modern data analytics.
May 27
•
Daniel Kang
SODIUM: From Open Web Data to Queryable Databases
In research workflows using public data, answering a single analytical question requires collecting and organizing data from many different web sources.
May 19
•
Daniel Kang
Launching the CVE-Bench Leaderboard: A Public Arena of AI for Cybersecurity
Last year, we introduced CVE-Bench, a rigorous benchmark with real-world web vulnerabilities to evaluate the cyberoffensive capabilities of AI agents.
Feb 24
•
Daniel Kang
Claude 4.5 Opus Solves CORE-Bench — But Not REPRO-Bench
In our ACL 2025 paper, we introduced REPRO-Bench (GitHub), a benchmark designed to evaluate whether AI agents can accurately assess the reproducibility…
Dec 16, 2025
•
Daniel Kang
SafeSearch: Teaching LLM Search Agents to Be Both Smart and Safe
LLMs are rapidly expanding their built-in knowledge from training.
Nov 10, 2025
•
Daniel Kang
When Your Home Robot Turns Against You: BEATing Vision-Language Agents with Visual Backdoors
Household humanoid robots promise to assist everyone in daily life, with several exciting demos released recently (NEO, Figure 03, Tesla Optimus).
Nov 5, 2025
•
Daniel Kang
DRAMA: Enabling AI Agents to Collect Data to Support Data Science Workflows
Data science workflows generally include two major phases: data retrieval and data analysis.
Nov 3, 2025
•
Daniel Kang
CVE-Bench v2.0: Making Evaluation More Rigorous with ABC
This is the third post in the Agentic Benchmark Checklist (ABC) blog series. Written by Yuxuan Zhu, Antony Kellermann, and Daniel Kang.
Oct 30, 2025
•
Daniel Kang
My personal Substack