Question

A robotic agent operates in a warehouse modeled as an N×M grid environment. The agent starts at a predefined loading dock and must deliver packages to multiple destinations marked on the grid while avoiding dynamically placed obstacles.
Your Tasks:
- Q1. Represent the warehouse as an N×M matrix. Place packages, drop-off points, and obstacles randomly.
- Q2. Implement a goal-based agent using a search algorithm (UCS, BFS, or DFS) to deliver all packages.
- Q3. Choose a random seed to reproduce results and output the final path, costs, rewards, penalties, and success stats.

2. What is a Goal-Based Agent?
A Goal-Based Agent selects actions by considering possible future outcomes and choosing the one that leads to its goal. It relies on:
- Perception of the current state
- Defined goals (e.g., deliver packages)
- Search and planning (e.g., Uniform Cost Search)
These agents differ from purely reactive agents in that they plan ahead using search algorithms (and, in informed variants, heuristics).
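The idea can be sketched as a minimal sense–plan–act loop. This is a hypothetical skeleton for illustration (class and method names are made up here), not the implementation used later in this article:

```python
# Minimal goal-based agent skeleton (hypothetical names for illustration).
class SimpleGoalBasedAgent:
    def __init__(self, goals):
        self.goals = list(goals)   # e.g., packages still to deliver
        self.plan = []             # sequence of actions produced by search

    def step(self, percept):
        # Re-plan whenever the current plan is empty and goals remain.
        if not self.plan and self.goals:
            self.plan = self.search(percept, self.goals[0])
        if self.plan:
            return self.plan.pop(0)
        return None  # all goals achieved

    def search(self, state, goal):
        # Placeholder: in this article, Uniform Cost Search fills this role.
        return [f"move-toward-{goal}"]
```

The key property is the re-planning step: the agent chooses actions by searching toward an explicit goal rather than reacting to the current percept alone.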
3. What Are Warehouse Problems in AI?

Warehouse problems simulate real-world logistics, focusing on:
- Efficient pathfinding through a constrained grid.
- Handling obstacles and variable movement costs.
- Delivering items in predefined sequences.
- Optimizing for minimum total cost and maximum delivery reward.
Applications include robotics, inventory systems, and smart logistics.
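A toy hand-built grid makes the representation concrete (the cell labels here are chosen to match the conventions used later in this article; real grids are generated randomly in Section 7):

```python
# A tiny 3x4 example grid: " " empty, "X" obstacle,
# "P1" a package, "D1" its drop-off location.
grid = [
    [" ", " ", "P1", " "],
    [" ", "X", " ",  " "],
    [" ", " ", " ",  "D1"],
]
rows, cols = len(grid), len(grid[0])
print(rows, cols)  # 3 4
```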
4. Uniform Cost Search (UCS) – Path Planning Strategy
Uniform Cost Search (UCS) is a classic search algorithm used when the goal is to find the least-cost path in a weighted graph. Unlike greedy or depth-first search, UCS always expands the least-cost node first, making it optimal and complete as long as step costs are non-negative.
In this warehouse problem, UCS is ideal because:
- Movement has a cost of 1 unit.
- Obstacles add an extra cost of 5 units.
- The agent must find cost-efficient paths while minimizing penalty.
Key UCS Concepts Applied:
- State Space: Each cell in the grid is a node.
- Cost Function: Total cost = movement cost + obstacle penalty.
- Priority Queue (Min-Heap): Used to always expand the path with the lowest cumulative cost.
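As a standalone illustration of these concepts, here is a compact UCS over a small weighted graph (the graph and node names are invented for this example; the warehouse code in Section 7 applies the same idea with grid cells as nodes):

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Return (path, cost) of the least-cost path, or (None, inf) if unreachable."""
    frontier = [(0, [start])]          # min-heap ordered by cumulative cost
    best = {start: 0}                  # cheapest known cost to reach each node
    while frontier:
        cost, path = heapq.heappop(frontier)
        node = path[-1]
        if node == goal:
            return path, cost          # first pop of the goal is optimal
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if neighbor not in best or new_cost < best[neighbor]:
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, path + [neighbor]))
    return None, float("inf")

# Toy graph: UCS prefers A->C->B->D (cost 4) over the direct A->B->D (cost 5).
graph = {
    "A": [("B", 4), ("C", 1)],
    "C": [("B", 2)],
    "B": [("D", 1)],
}
print(uniform_cost_search(graph, "A", "D"))  # (['A', 'C', 'B', 'D'], 4)
```

Because the heap always yields the lowest cumulative cost first, the first time the goal is popped its path is guaranteed optimal (given non-negative step costs).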
5. Thought Process and How to Approach the Problem
To design the solution:
- Randomly generate a warehouse grid (N×M) using a seed.
- Place packages, drop-off points, and obstacles with no overlaps.
- Use a Uniform Cost Search (UCS) algorithm to compute shortest paths:
- Obstacles incur an additional penalty of 5
- Normal movement costs 1 unit
- Create a Goal-Based Agent that:
- Moves to the next pickup point
- Picks up the package
- Plans to the respective drop location
- Delivers it, and repeats until all packages are delivered
6. Assumptions
- Grid size randomly chosen: 6 × 8
- Number of packages: 6
- Number of obstacles: 2
- Agent starts at: (0, 0)
- Penalty for stepping on an obstacle: +5 cost units (equivalently, −5 to the score)
- Reward for successful delivery: +10 units
- Agent uses Uniform Cost Search (UCS)
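Under these assumptions the final score is simple arithmetic: each of the 6 deliveries earns +10, and every unit of cost (movement plus obstacle penalties) subtracts 1. As a quick sanity check against the run shown in Section 8 (the total cost 44 is taken from that sample run):

```python
deliveries = 6
reward = deliveries * 10          # +10 per successful delivery
total_cost = 44                   # movement + obstacle cost from the sample run
score = reward - total_cost
print(reward, total_cost, score)  # 60 44 16
```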
7. Code Snippet
import random
import heapq
random.seed(54)
def generate_random_warehouse():
    """
    Generates a random warehouse grid and associated parameters, labeling:
      - Empty cells as " "
      - Obstacles as "X"
      - Packages as "P1", "P2", ...
      - Drop-off locations as "D1", "D2", ...
    Robot start is always (0, 0), which remains " ".
    Returns a dictionary containing:
      - rows, cols
      - grid (2D list of strings)
      - num_packages, num_obstacles
      - package_locations, drop_off_locations, obstacle_locations
      - robot_start (row, col)
    Conditions:
      1. 5 <= N, M <= 10
      2. 1 <= O <= 10 (number of obstacles)
      3. 2 <= P <= 6 (number of packages)
      4. Packages and drop-offs do not overlap each other or obstacles
      5. Robot start is always (0, 0) and remains free
    """
    # 1) Random warehouse dimensions: N x M
    N = random.randint(5, 10)
    M = random.randint(5, 10)
    # Initialize the grid to " " (empty space)
    grid = [[" " for _ in range(M)] for _ in range(N)]
    # Track used positions to avoid overlap; mark (0, 0) as used
    # for the robot start so nothing else is placed there
    used_positions = {(0, 0)}
    # 3) Random number of packages
    P = random.randint(2, 6)
    package_locations = []
    drop_off_locations = []
    # Place packages and drop-offs without overlap
    for i in range(P):
        # Find a free cell for the package
        while True:
            r = random.randint(0, N - 1)
            c = random.randint(0, M - 1)
            if (r, c) not in used_positions:
                grid[r][c] = f"P{i+1}"
                used_positions.add((r, c))
                package_locations.append((r, c))
                break
        # Find a free cell for the drop-off
        while True:
            r = random.randint(0, N - 1)
            c = random.randint(0, M - 1)
            if (r, c) not in used_positions:
                grid[r][c] = f"D{i+1}"
                used_positions.add((r, c))
                drop_off_locations.append((r, c))
                break
    # 2) Random number of obstacles
    O = random.randint(1, 10)
    # Place static obstacles (marked as "X")
    count_obstacles = 0
    obstacle_locations = []
    while count_obstacles < O:
        r = random.randint(0, N - 1)
        c = random.randint(0, M - 1)
        if (r, c) not in used_positions:
            grid[r][c] = "X"
            used_positions.add((r, c))
            obstacle_locations.append((r, c))
            count_obstacles += 1
    # 4) Robot start is always (0, 0) => keep as " "
    robot_start = (0, 0)
    return {
        "rows": N,
        "cols": M,
        "grid": grid,
        "num_packages": P,
        "num_obstacles": O,
        "package_locations": package_locations,
        "drop_off_locations": drop_off_locations,
        "obstacle_locations": obstacle_locations,
        "robot_start": robot_start,
    }
def show_grid(grid):
    for row in grid:
        print(row)
class GoalBasedAgent:
    def __init__(self, environment):
        self.environment = environment
        self.state = "FIND_PICKUP_POINT"
        self.location = environment["robot_start"]
        self.current_goal = (0, 0)
        self.path = []
        self.cost = 0
        self.reward = 0

    def perceive(self):
        if self.location in self.environment["obstacle_locations"]:
            return "OBSTACLE"
        elif self.location in self.environment["drop_off_locations"]:
            return "DROP POINT"
        elif self.location in self.environment["package_locations"]:
            return "PICKUP POINT"
        else:
            return "EMPTY"

    def move(self, new_location):
        self.location = new_location
        self.cost += 1

    def act(self):
        percept = self.perceive()
        if percept == "PICKUP POINT" and self.current_goal == self.location:
            print(f"Package picked from {self.location}.")
            self.environment["package_locations"].remove(self.location)
            self.state = "FIND_DROP_POINT"
        elif percept == "DROP POINT" and self.current_goal == self.location:
            print(f"Package dropped to {self.location}.")
            self.environment["drop_off_locations"].remove(self.location)
            self.reward += 10
            # If packages remain, go find another pickup; otherwise we are done
            if len(self.environment["package_locations"]) > 0:
                self.state = "FIND_PICKUP_POINT"
            else:
                self.state = "COMPLETED"
        elif percept == "OBSTACLE":
            print(f"Hit an obstacle at {self.location}. Penalty incurred: -5")
            self.cost += 5
            self.move(self.path[0])
            self.path = self.path[1:]
        else:
            # Skip any leading positions in the path that match the current location
            while len(self.path) > 0 and self.path[0] == self.location:
                self.path.pop(0)
            # If there is still something in the path, move there
            if len(self.path) > 0:
                print(f"Moving from {self.location} to {self.path[0]}")
                self.move(self.path[0])
                self.path.pop(0)
        self.compute_goal()
    def find_shortest_path(self, start, goal):
        start = tuple(start)
        goal = tuple(goal)
        directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        heap = []
        heapq.heappush(heap, (0, [start]))  # Heap element: (cumulative_cost, path)
        visited = {start: 0}
        if start == goal:
            return [start], 0
        while heap:
            cost, path = heapq.heappop(heap)
            current = path[-1]
            if current == goal:
                return path, cost
            for dx, dy in directions:
                nx, ny = current[0] + dx, current[1] + dy
                if 0 <= nx < self.environment["rows"] and 0 <= ny < self.environment["cols"]:
                    step_cost = 1  # Base movement cost
                    penalty = 0
                    if (nx, ny) in self.environment["obstacle_locations"]:
                        penalty = 5  # Additional penalty for stepping into an obstacle
                    new_cost = cost + step_cost + penalty
                    neighbor = (nx, ny)
                    if neighbor not in visited or new_cost < visited[neighbor]:
                        visited[neighbor] = new_cost
                        heapq.heappush(heap, (new_cost, path + [neighbor]))
        return None, float('inf')
    def compute_goal(self):
        if self.state == "FIND_PICKUP_POINT":
            start = self.location
            goal = self.environment["package_locations"][0]
            self.path, cost = self.find_shortest_path(start, goal)
            print(f"FIND_PICKUP_POINT: Current Location={start}, Package pickup location={goal}, Path={self.path}, Cost={cost}")
            self.current_goal = goal
            self.state = "MOVE_TO_PICKUP"
        elif self.state == "FIND_DROP_POINT":
            start = self.location
            goal = self.environment["drop_off_locations"][0]
            self.path, cost = self.find_shortest_path(start, goal)
            print(f"FIND_DROP_POINT: Package Location={start}, Package drop location={goal}, Path={self.path}, Cost={cost}")
            self.current_goal = goal
            self.state = "MOVE_TO_DROP"
# Example usage:
if __name__ == "__main__":
    # Q1: Initial Warehouse Configuration
    warehouse_data = generate_random_warehouse()
    print("Initial Warehouse Configuration\n")
    show_grid(warehouse_data["grid"])
    print("Grid Size:", warehouse_data["rows"], "x", warehouse_data["cols"])
    print("Number of Obstacles:", warehouse_data["num_obstacles"])
    print("Number of Packages:", warehouse_data["num_packages"])
    print("Package Locations:", warehouse_data["package_locations"])
    print("Drop-Off Locations:", warehouse_data["drop_off_locations"])
    print("Obstacle Locations:", warehouse_data["obstacle_locations"])
    print("Robot Start:", warehouse_data["robot_start"])
    agent = GoalBasedAgent(warehouse_data)
    while agent.state != "COMPLETED":
        agent.act()
    print(f"All deliveries are completed with total cost {agent.cost} and reward {agent.reward}")
    print(f"Final score is {agent.reward - agent.cost}")
Key functions:
- generate_random_warehouse() builds the grid and random placements.
- The GoalBasedAgent class drives the pickup/delivery cycle.
- UCS is implemented in find_shortest_path.
🧠 Pathfinding uses a min-heap and accounts for the extra obstacle cost.
8. Output
Initial Configuration:
Initial Warehouse Configuration
[' ', ' ', ' ', ' ', ' ', ' ', ' ', 'D3']
[' ', ' ', ' ', 'P3', ' ', ' ', ' ', 'P2']
[' ', ' ', ' ', ' ', ' ', ' ', 'D2', 'P1']
[' ', ' ', ' ', ' ', 'P6', ' ', 'D1', ' ']
['X', 'D4', 'D6', ' ', 'D5', 'P5', ' ', 'P4']
[' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ']
Grid Size: 6 x 8
Number of Obstacles: 2
Number of Packages: 6
Package Locations: [(2, 7), (1, 7), (1, 3), (4, 7), (4, 5), (3, 4)]
Drop-Off Locations: [(3, 6), (2, 6), (0, 7), (4, 1), (4, 4), (4, 2)]
Obstacle Locations: [(5, 4), (4, 0)]
Robot Start: (0, 0)
Execution Steps:
FIND_PICKUP_POINT: Current Location=(0, 0), Package pickup location=(2, 7), Path=[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (1, 7), (2, 7)], Cost=9
Moving from (0, 0) to (0, 1)
Moving from (0, 1) to (0, 2)
Moving from (0, 2) to (0, 3)
Moving from (0, 3) to (0, 4)
Moving from (0, 4) to (0, 5)
Moving from (0, 5) to (0, 6)
Moving from (0, 6) to (0, 7)
Moving from (0, 7) to (1, 7)
Moving from (1, 7) to (2, 7)
Package picked from (2, 7).
FIND_DROP_POINT: Package Location=(2, 7), Package drop location=(3, 6), Path=[(2, 7), (2, 6), (3, 6)], Cost=2
Moving from (2, 7) to (2, 6)
Moving from (2, 6) to (3, 6)
Package dropped to (3, 6).
FIND_PICKUP_POINT: Current Location=(3, 6), Package pickup location=(1, 7), Path=[(3, 6), (2, 6), (1, 6), (1, 7)], Cost=3
Moving from (3, 6) to (2, 6)
Moving from (2, 6) to (1, 6)
Moving from (1, 6) to (1, 7)
Package picked from (1, 7).
FIND_DROP_POINT: Package Location=(1, 7), Package drop location=(2, 6), Path=[(1, 7), (1, 6), (2, 6)], Cost=2
Moving from (1, 7) to (1, 6)
Moving from (1, 6) to (2, 6)
Package dropped to (2, 6).
FIND_PICKUP_POINT: Current Location=(2, 6), Package pickup location=(1, 3), Path=[(2, 6), (1, 6), (1, 5), (1, 4), (1, 3)], Cost=4
Moving from (2, 6) to (1, 6)
Moving from (1, 6) to (1, 5)
Moving from (1, 5) to (1, 4)
Moving from (1, 4) to (1, 3)
Package picked from (1, 3).
FIND_DROP_POINT: Package Location=(1, 3), Package drop location=(0, 7), Path=[(1, 3), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7)], Cost=5
Moving from (1, 3) to (0, 3)
Moving from (0, 3) to (0, 4)
Moving from (0, 4) to (0, 5)
Moving from (0, 5) to (0, 6)
Moving from (0, 6) to (0, 7)
Package dropped to (0, 7).
FIND_PICKUP_POINT: Current Location=(0, 7), Package pickup location=(4, 7), Path=[(0, 7), (1, 7), (2, 7), (3, 7), (4, 7)], Cost=4
Moving from (0, 7) to (1, 7)
Moving from (1, 7) to (2, 7)
Moving from (2, 7) to (3, 7)
Moving from (3, 7) to (4, 7)
Package picked from (4, 7).
FIND_DROP_POINT: Package Location=(4, 7), Package drop location=(4, 1), Path=[(4, 7), (4, 6), (4, 5), (4, 4), (4, 3), (4, 2), (4, 1)], Cost=6
Moving from (4, 7) to (4, 6)
Moving from (4, 6) to (4, 5)
Moving from (4, 5) to (4, 4)
Moving from (4, 4) to (4, 3)
Moving from (4, 3) to (4, 2)
Moving from (4, 2) to (4, 1)
Package dropped to (4, 1).
FIND_PICKUP_POINT: Current Location=(4, 1), Package pickup location=(4, 5), Path=[(4, 1), (4, 2), (4, 3), (4, 4), (4, 5)], Cost=4
Moving from (4, 1) to (4, 2)
Moving from (4, 2) to (4, 3)
Moving from (4, 3) to (4, 4)
Moving from (4, 4) to (4, 5)
Package picked from (4, 5).
FIND_DROP_POINT: Package Location=(4, 5), Package drop location=(4, 4), Path=[(4, 5), (4, 4)], Cost=1
Moving from (4, 5) to (4, 4)
Package dropped to (4, 4).
FIND_PICKUP_POINT: Current Location=(4, 4), Package pickup location=(3, 4), Path=[(4, 4), (3, 4)], Cost=1
Moving from (4, 4) to (3, 4)
Package picked from (3, 4).
FIND_DROP_POINT: Package Location=(3, 4), Package drop location=(4, 2), Path=[(3, 4), (3, 3), (3, 2), (4, 2)], Cost=3
Moving from (3, 4) to (3, 3)
Moving from (3, 3) to (3, 2)
Moving from (3, 2) to (4, 2)
Package dropped to (4, 2).
Final Metrics:
All deliveries are completed with total cost 44 and reward 60
Final score is 16
✅ All deliveries completed
🟢 Total Cost: 44
💰 Total Reward: 60
🏁 Final Score: 16
9. Future Work
- Introduce dynamic obstacles that move over time.
- Implement A* or Bidirectional Search for better efficiency.
- Optimize for multiple agents working simultaneously.
- Add priority levels to packages.
- Use reinforcement learning to learn obstacle avoidance and path planning.
GitHub link: Dynamic Goal-Based Agent for Warehouse Logistics Optimization.py
Comment and let me know what can be improved in this approach!