I ran a logging layer on my agent for 72 hours. 37% of tool calls had parameter mismatches — and none raised an error.

r/artificial
Generative AI

I've been running an AI agent that makes tool calls to various APIs, and I added a logging layer to capture exactly what was being sent vs. what the tools expected. Over 84 tool calls in 72 hours, 31 of them (37%) had parameter mismatches - and not a single one raised an error. The tools accepted the wrong parameters and returned plausible-looking but incorrect output. Here are the 4 failure categories I found: 1. Timestamp vs Duration - The agent passed a Unix timestamp where the API expected a duration string like "24h.