Best Practices for AI Agents in WMS: An Engineer's Field Notes

Last summer, on the hottest day in the warehouse, I crouched in the server room staring at the monitoring screen, sweat dripping onto the keyboard. The AI Agent's decision log was scrolling wildly. It had just tripled the replenishment quantity for a batch of room-temperature goods, claiming that "a sudden temperature drop in three days would cause a demand surge." But the weather forecast showed high temperatures for the whole of the next week. I quickly switched to emergency mode, but the inventory was already in disarray. That night I traced through the logs line by line and found that the Agent had weighted the correlation between "temperature drop" and "demand surge" too heavily while ignoring seasonal fluctuations.

TL;DR: Don't expect an AI Agent to manage your warehouse perfectly out of the box. In my experience, data cleaning takes 70% of the effort, choosing the right scenario matters more than flashy tech, and you must install a "brake": a human review fallback. Here are my engineering best practices.

Data Cleaning: The Foundation of an AI Agent

In the first week after deploying the Agent, it frequently suggested replenishing slow-moving items. I was puzzled until I discovered that some SKUs had empty "last outbound time" fields, and the Agent auto-filled them with the current time, mistakenly treating these items as fast-moving.

Data quality determines the ceiling of an AI Agent. I later built a data validation pipeline that automatically scans for anomalies every night. According to Fortune Business Insights^[1], data quality issues cause up to 60% of AI project failures.
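As a rough illustration of the kind of check such a nightly pipeline might run, here is a minimal sketch for the empty "last outbound time" problem. The SkuRecord type, its field names, and the split_by_completeness helper are hypothetical, not the actual pipeline. The point is that a missing timestamp gets flagged for review instead of being silently replaced with the current time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class SkuRecord:
    # Hypothetical record shape; real WMS exports will differ.
    sku: str
    last_outbound_at: Optional[datetime]  # None when the source field is empty


def split_by_completeness(
    records: List[SkuRecord],
) -> Tuple[List[SkuRecord], List[SkuRecord]]:
    """Separate usable records from ones that need human attention.

    A missing timestamp is reported, never filled in with "now".
    """
    clean, flagged = [], []
    for record in records:
        if record.last_outbound_at is None:
            flagged.append(record)
        else:
            clean.append(record)
    return clean, flagged


records = [
    SkuRecord("SKU-001", datetime(2025, 7, 1, 9, 30)),
    SkuRecord("SKU-002", None),
]
clean, flagged = split_by_completeness(records)
print("needs review:", [r.sku for r in flagged])
```

Only the clean list would be passed on to the Agent; the flagged list goes to whoever owns the master data.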

Three Major Sources of Dirty Data

Manual entry errors: e.g., confusing "box" and "piece", so the Agent believes inventory is 12 times higher than it really is.
System sync delays: ERP and WMS data can be 30 minutes out of sync, so the Agent makes decisions on stale data.
Legacy issues: Fields in old systems whose meaning changed without the documentation being updated.
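The first two sources lend themselves to mechanical guards. The sketch below is illustrative only: the 12-pieces-per-box pack size mirrors the 12x example above, the 10-minute staleness tolerance is an assumed value, and the names PIECES_PER_UNIT, to_pieces, and is_stale are made up for this example.

```python
from datetime import datetime, timedelta

# Assumed pack sizes (pieces per unit of measure); real values come from master data.
PIECES_PER_UNIT = {"piece": 1, "box": 12}

# Assumed tolerance: snapshots older than this are not used for decisions.
MAX_SYNC_LAG = timedelta(minutes=10)


def to_pieces(quantity: int, unit: str) -> int:
    """Convert a quantity to pieces, rejecting unknown units instead of guessing."""
    if unit not in PIECES_PER_UNIT:
        raise ValueError(f"unknown unit of measure: {unit!r}")
    return quantity * PIECES_PER_UNIT[unit]


def is_stale(synced_at: datetime, now: datetime,
             max_lag: timedelta = MAX_SYNC_LAG) -> bool:
    """True when the snapshot is older than the allowed sync lag."""
    return now - synced_at > max_lag


print(to_pieces(5, "box"))
print(is_stale(datetime(2025, 7, 1, 9, 0), datetime(2025, 7, 1, 9, 30)))
```

Legacy field drift is harder to catch in code; it usually needs someone who remembers what the field used to mean.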

Comparison: Before and After Cleaning

Metric	Before Cleaning	After Cleaning
Replenishment accuracy	62%	91%
Weekly false alarms	8.3	1.2
Inventory turnover days	45	32

Data from Flash WMS internal test environment, July to September 2025.

Scenario Selection: Don't Let AI Do Everything

At first, I was greedy: the Agent handled inventory prediction, replenishment, slotting optimization, and order assignment simultaneously. It juggled them poorly. For example, it would update the sales forecast but forget to adjust slotting, which led to longer picking routes.

AI Agents excel at single, high-frequency decisions with clear feedback. I later split its responsibilities into three independent modules, each focusing on one task.

Three Deployment Scenarios

1. Dynamic Safety Stock Calculation

We used to rely on a fixed formula: safety stock = average daily sales × 3 days. But during promotions, 3 days of cover wasn't enough. After integrating real-time sales data and promo calendars, the Agent can now warn of stockouts 48 hours in advance.
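To make the difference concrete, here is a minimal sketch comparing the fixed formula with one possible promo-aware variant. The per-day uplift factors and the simple hours-of-cover check are assumptions for illustration, not the Agent's actual model; all function names are hypothetical.

```python
from typing import Sequence


def fixed_safety_stock(avg_daily_sales: float, cover_days: int = 3) -> float:
    """The original rule: average daily sales times a fixed number of days."""
    return avg_daily_sales * cover_days


def promo_aware_safety_stock(avg_daily_sales: float,
                             uplift_by_day: Sequence[float]) -> float:
    """Assumed variant: one demand multiplier per day of the cover window,
    taken from a promo calendar (1.0 means a normal day)."""
    return sum(avg_daily_sales * uplift for uplift in uplift_by_day)


def needs_stockout_warning(on_hand: float, hourly_demand: float,
                           horizon_hours: float = 48) -> bool:
    """Warn when current stock would run out within the warning horizon."""
    if hourly_demand <= 0:
        return False  # no demand, no projected stockout
    return on_hand / hourly_demand <= horizon_hours


print(fixed_safety_stock(100))                        # normal week
print(promo_aware_safety_stock(100, [1.0, 2.5, 2.5]))  # promo on days 2 and 3
print(needs_stockout_warning(on_hand=400, hourly_demand=10))
```

With 100 units of average daily sales, the fixed rule holds 300 units while the promo-aware variant asks for 600 over the same three days, which is exactly the gap that caused shortfalls during promotions.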

2. Intelligent Slotting Recommendation

When new goods arrive, the Agent recommends the best shelf based on historical outbound frequency and cross-selling patterns. For example, chips and beer are often bought together, so they are placed in adjacent aisles.
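The cross-selling signal can be approximated by counting how often two items appear in the same order. This is a simplified sketch with made-up order data, not the Agent's recommendation logic; co_purchase_counts is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple


def co_purchase_counts(orders: Iterable[List[str]]) -> Counter:
    """Count how many orders contain each unordered pair of items."""
    counts: Counter = Counter()
    for items in orders:
        # sorted(set(...)) removes duplicates and gives each pair a stable order
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


# Made-up sample orders for illustration.
orders = [
    ["chips", "beer"],
    ["chips", "beer", "salsa"],
    ["beer", "water"],
]
counts = co_purchase_counts(orders)
top_pair: Tuple[Tuple[str, str], int] = counts.most_common(1)[0]
print(top_pair)
```

Pairs with the highest counts are candidates for neighboring slots; outbound frequency would then decide how close to the packing station the pair should sit.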

3. Exception Order Handling

For orders with incomplete addresses or out-of-stock items, the Agent generates three options for human review: partial shipment, substitution with similar items, or cancellation and refund. Handling efficiency for these orders tripled.
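One way to keep the human in charge of the final call is to model the three options as data and let only a reviewer pick one. The sketch below assumes hypothetical names (Resolution, ReviewTicket, resolve) and a made-up order ID; it shows the shape of the hand-off, not the production workflow.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Resolution(Enum):
    PARTIAL_SHIPMENT = "partial shipment"
    SUBSTITUTE = "substitute similar items"
    CANCEL_AND_REFUND = "cancel and refund"


@dataclass
class ReviewTicket:
    order_id: str
    problem: str
    # The Agent proposes all three options; it never fills in `chosen`.
    options: List[Resolution] = field(default_factory=lambda: list(Resolution))
    chosen: Optional[Resolution] = None


def resolve(ticket: ReviewTicket, choice: Resolution, reviewer: str) -> str:
    """Record the reviewer's decision; reject choices that were not offered."""
    if choice not in ticket.options:
        raise ValueError(f"{choice} was not offered for {ticket.order_id}")
    ticket.chosen = choice
    return f"{ticket.order_id}: {choice.value} (approved by {reviewer})"


ticket = ReviewTicket("ORD-1001", "item out of stock")
print(resolve(ticket, Resolution.PARTIAL_SHIPMENT, reviewer="alice"))
```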

Scenario Selection Comparison

Scenario	AI Agent Recommended?	Reason
Dynamic safety stock	Strongly recommended	Sufficient data, fast feedback, clear ROI
Slotting	Recommended	Requires historical data, high upfront cost
Order assignment	Cautious	Involves human scheduling, Agent may ignore human factors
Supplier negotiation	Not recommended	Requires relationships and game theory, AI not ready

Human-in-the-Loop: Install a "Brake" for the Agent

After the temperature prediction incident, I learned my lesson: risky Agent decisions must not be executed without human review. But not every decision needs human eyes, so we designed a three-tier approval mechanism.

Automation does not mean unmanned. A good AI Agent knows when to ask for help. According to McKinsey's operations insights^[2], 78% of AI projects fail due to over-automation and ignoring human-AI collaboration.

Three-Tier Approval Mechanism

Green channel: Low-risk decisions (e.g., routine replenishment), Agent executes directly, logs afterward.
Yellow alert: Medium risk (e.g., increasing safety stock by 20%), pushed to supervisor's phone for one-click confirmation.
Red alert: High risk (e.g., batch price changes, large returns), requires manual confirmation from manager.
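The three tiers above can be sketched as a small routing function. The action names, the 20% threshold for safety stock changes, and the choice to send unknown actions to the red tier are assumptions made for this example; the real rules would come from your own risk policy.

```python
from enum import Enum


class Tier(Enum):
    GREEN = "execute directly, log afterward"
    YELLOW = "supervisor one-click confirmation"
    RED = "manager manual confirmation"


# Assumed action names and threshold, for illustration only.
HIGH_RISK_ACTIONS = {"batch_price_change", "large_return"}
YELLOW_THRESHOLD_PCT = 20.0


def classify(action: str, change_pct: float = 0.0) -> Tier:
    """Map an Agent decision to an approval tier."""
    if action in HIGH_RISK_ACTIONS:
        return Tier.RED
    if action == "safety_stock_adjustment":
        # Large adjustments need a supervisor; small ones pass through.
        if abs(change_pct) >= YELLOW_THRESHOLD_PCT:
            return Tier.YELLOW
        return Tier.GREEN
    if action == "routine_replenishment":
        return Tier.GREEN
    # Fail safe: anything the policy does not recognize goes to a human.
    return Tier.RED


print(classify("routine_replenishment").name)
print(classify("safety_stock_adjustment", change_pct=20).name)
print(classify("batch_price_change").name)
```

Defaulting unknown actions to the strictest tier is a deliberate choice here: a new decision type should have to earn its way into the green channel.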

Implementation Results

After deploying this mechanism, the Agent's decision adoption rate increased from 40% to 85%, with zero major incidents.

Continuous Iteration: AI Agent Is Not a One-Time Deal

Many managers think deployment is the end. In reality, the Agent needs a continuous supply of new data. Every month, I pull the previous month's decision logs, compare them with actual results, and identify the causes of any deviation.
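A minimal sketch of that monthly comparison, assuming the decision log can be reduced to (SKU, predicted quantity, actual quantity) tuples. The 20% flagging threshold, the sample rows, and the deviation_report name are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

LogRow = Tuple[str, float, float]  # (sku, predicted_qty, actual_qty)


def deviation_report(log: Iterable[LogRow],
                     threshold_pct: float = 20.0
                     ) -> List[Tuple[str, Optional[float]]]:
    """Return SKUs whose prediction missed actual demand by more than the threshold.

    The error is reported as a percentage of actual demand; None marks rows
    where actual demand was zero and a percentage is undefined.
    """
    flagged: List[Tuple[str, Optional[float]]] = []
    for sku, predicted, actual in log:
        if actual == 0:
            if predicted != 0:
                flagged.append((sku, None))
            continue
        error_pct = abs(predicted - actual) / actual * 100
        if error_pct > threshold_pct:
            flagged.append((sku, round(error_pct, 1)))
    return flagged


# Made-up sample rows for illustration.
log = [
    ("SKU-001", 300, 100),  # predicted three times the actual demand
    ("SKU-002", 105, 100),  # within tolerance
    ("SKU-003", 10, 0),     # predicted demand that never materialized
]
print(deviation_report(log))
```

The flagged list is only the starting point; the real work is reading the decision log for each flagged SKU to find out why the Agent got it wrong.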

Maintenance costs for AI Agents are no less than development costs. According to Gartner's supply chain research^[3], companies should allocate 20% of their total AI budget annually for maintenance and optimization.

Iteration Checklist

Weekly: Check decision logs, flag anomalies
Monthly: Retrain models with new data
Quarterly: Evaluate scenario applicability, retire ineffective modules

Summary

Now my Agent has been stable for six months, reducing error rates by 40% and improving inventory turnover by 25%. But every time I upgrade the system, I still personally monitor the logs for a while. No matter how strong the technology, it can't replace respect for the business.

Key takeaways:

Data cleaning is the foundation; don't cut corners.

Choosing the right scenario matters more than flashy tech.

Always keep a human review "brake."

AI Agents need continuous feeding and iteration.


References

[1] Fortune Business Insights, Warehouse Management System Market Report. Cited for AI project failure rates attributed to data quality.
[2] McKinsey, Operations Insights. Cited for AI project failure rates attributed to over-automation.
[3] Gartner, Supply Chain Research. Cited for the recommended AI system maintenance budget.