1. (25 pts.) Consider state-space search using A and an !-admissible heuristic function h(n). That is, n, h(n) ≤ h (n) + ! where h is the perfect heuristic function and ! is a fixed positive constant.

(a) (20pts.) Prove that in such a case, A is !-admissible. That is, A is guaranteed to return a path to a goal node m such that g(m) ≤ h (s) + !.

(b) (5 pts.) Comment on the potential advantages of an !-admissible search algorithm in problem domains in which near-optimal solutions are abundant.

2. (25 points) Consider a swarm of autonomous robotic vehicles which have to find their way to a goal location in an environment for which no roadmap is available. Each robot is capable of reading road signs etc. and learn the distance to closest reachable cities from the city where it is located. The robots, in general, do not have a way of obtaining the distance to the goal location from their current locations although in some instances they may be able to do so (e.g., through intelligence information provided by local informants). Each robot, if it is the first one to visit a particular city, leaves behind a sign which can be read by other robots that arrive there. Suppose this sign contains the h value (the estimated distance to the goal city). Each robot is able to update the sign based on information that it has gathered. Formulate a strategy that each robot can use independently (without any direct communication) to maximize the chances of the swarm of robots in achieving their objective. Some desirable characteristics of the solution include the following:

0 In a finite graph with positive arc costs, in which there exists a path from every node to a goal node, if the h values are initialized with nonnegative admissible estimates, the strategy is complete: i.e., eventually, some member(s) of the swarm will reach a goal.

0 Over repeated trials mounted by swarm(s) of robots, the heuristic estimates con-verge to their true values (unless no new arcs are introduced into the graph).

3. (25 points) Consider a version of the perceptron learning algorithm in which the learning rate η(t) can vary with each weight change step t (e.g., it might be diﬀerent at diﬀerent times of the day or it may be a function of the weather, or the learner’s mood, etc.). Prove the perceptron convergence theorem or provide a counter-example to show that the convergence theorem does not hold in the following scenario : 0 < A ≤ η(t) ≤ B where A and B are fixed lower and upper bounds respectively.

4. Consider a student’s dilemma of whether to stay up late to study for an exam. To help with this decision, the student (who shall remain nameless), constructed a Bayesian network with 4 boolean random variables: S (study hard); R (get enough sleep), K (know material), and G (graduate). The network consists of directed arrows from S to R and K and from both R and K to G. The relevant probabilities are: P (S) = 0.7 (and hence P (¬S) = 0.3). P (R|S) = 0.3; P (R|¬S) = 0.8; P (K|S) = 0.8; P (K|¬S) = 0.4. and P (G|R, K) = 0.9; P (G|R, ¬K) = 0.6; P (G|¬R, K) = 0.5; and P (¬R, ¬K) = 0.1 (where X stands for random variable X being True and ¬X stands for the random variable X being False).

2

(a) Draw the Bayesian network and specify all the relevant conditional probabilities associated with each node.

(b) Compute P (S, R, K, ¬G). Compute P (K|S, R).

5. (25 points.) Consider a binary classification task where the random variables rep-resenting the inputs the classifier include both discrete valued (nominal) variables as well as real-valued variables. Assume that the input variables are independent given the class. Further assume that each real-valued variable has a class conditional density function that is gaussian with a class-dependent mean. Derive from first principles an algorithm for determining the parameters of a discriminative Bayesian classifier.

6. (25 points.) Suppose you are an employee of a major AI consulting and software development firm. The list that follows represents a sample of specific projects that may be assigned to you by your employer. In each case, briefly outline your approach to the project. More specifically, identify in each case, the approach (e.g., perceptron algorithm, decision tree, Q learning, Bayesian networks, or their variants, or a combi-nation of approaches) that you would use, explain how you would use it to solve the problem of interest, and the reasons for your choice.

(a) Your client, NASA, has sent a manned mission to Mars. The mission requires the crew deploy a large army of robots to explore the surface of Mars. After landing, the crew realize that due to possible sabotage, all but a couple of the five hundred robots are missing a critical piece of hardware – a control unit that is essential for proper functioning of the robots. The crew is on a tight schedule and is not equipped to assemble additional control units even if had the design specifications. To complicate matters, the contractor that developed the control units for NASA is based in France and since the right wing fascist government came to power in France, the contractor has been prohibited from working with NASA. In any event, NASA is not in position to transport additional control units to Mars even if it had a way of getting its hands on the specifications. You get a call from NASA. The caller explains the situation to you and wonders if the function of the missing hardware control units might possibly be mimicked by a piece of software – if such a piece of software could be developed and installed on each of the robots on Mars in a matter of days. You ask a few questions and learn that each of the robots has a computer on board. Luckily, the inputs and the output of the control unit can be observed and recorded by the crew on Mars when one of the functioning robots is exploring Mars. Your task is to reverse engineer the functionality of the control unit for NASA.

(b) Your client, a major financial institution has accumulated a large collection of records that contain information about loan applicant’s employment history, in-come, credit history, loan approval, and repayment history. They are interested in automating the loan approval process in a large majority of cases. Your client had previously hired your competitor for this task and but was not entirely satisfied with the neural network solution that was provided to them because their em-ployees could not figure out what criteria were actually being used by the neural

network in its decision process. They are hoping that you will provide a system whose decision process is easy to comprehend for their employees.

(c) Your client, an automobile manufacturer is interested in optimizing their assembly process. The operation is such that partially assembled automobiles (supplied by subcontractors) may enter their plant at diﬀerent entry points. At any stage in the assembly process, they need to decide where to route the unit for the next step in the assembly process (given a finite set of possible destinations). Optimal choice would depend on the state of the plant which is periodically captured by measuring relevant variables (e.g., whether a station in the assembly plant is idle, whether a station has broken down, etc.) The outcome of a particular sequence of decisions is also captured (for instance, it is possible to know when the assembly of a particular unit is completed). You are asked to design a system that can adapt its routing policy over time so as to optimize the operation of the plant (e.g., in terms of its throughput).

(d) Your client, a processed food manufacturer has compiled a database of test results on food additives characterized in terms of a number of chemical properties that have been tested for their propensity for causing cancer. You have been asked to design a system that can estimate the probability that a new (hithertofore untested) food additive with known chemical properties is likely to cause cancer.

(a) (10 points)

For each of the following pairs of atomic sentences, state whether or not the pair can be unified. If so, give the most general unifier. If not, explain why unification fails. You may assume that uppercase letters stand for object, function or predicate constants and that lowercase letters stand for variables.

i. P (A, B, B), P (x, y, z)

ii. Q(y, G(A, B)), Q(G(x, x), y)

iii. O(F (y), y), O(F (x), J)

iv. K(F (y), y), K(x, x)

(b) (5 points)

Eliminate the existential quantifier in the FOPL sentence that follows using Skolemization:

x{ z¬B(x, z) { y z[O(x, y) ¬P (y, z)] y[¬O(x, y) ¬O(y, x)]}}

(c) (10 points)

Convert the following set of FOPL sentences into Conjunctive Normal Form (CNF):

0 x y(O(x, y) → A(x, y))

0 x y z(A(x, y) A(y, z) → A(x, z))

8. (25 points)

(a) (10 points)

Prove Above(B, T able) using the resolution by refutation guided by the set of support strategy on the basis of the following axioms (which are already in conjunctive normal form).

0 ¬On(u, v) Above(u, v)

0 ¬Above(x, y) ¬Above(y, z) Above(x, z)

0 On(B, A)

0 On(A, T able)

(a) (10 points)

Suppose Mr. Leibnitz, having uncovered the secret of time travel, decides to visit State College, Pennsylvania in the year 2017. As fate would have it, he runs into you at the Starbucks coﬀee shop on College Ave. He appears quite depressed. Being the nice person that you are, you try to find out why he is depressed so that you can cheer him up. Mr. Leibnitz tells you that despite years of hard work, he has not realized his dream of mechanizing reasoning. In particular, he says he is depressed because he has not been able to answer the following questions:

0 If a theorem is entailed by a given set of facts, is there an algorithm that is guaranteed to derive the theorem from the facts?

0 If a theorem is NOT entailed by a given set of facts, is there an algorithm that is guaranteed to terminate with the answer that the theorem does not follow from the axioms?

Remember that although Leibnitz might be a little behind the times in 2017, he is a pretty smart guy and he wants honest answers from you. How would you answer his questions? What are your chances of cheering him up?

(c) (5 points)

Indicate which of the following clauses are subsumed by P (F (x), y). Justify your answer.

i. P (F (A), F (X)) P (z, F (y))

ii. P (z, A) ¬P (A, z)

iii. P (F (F (x)), z)

iv. P (F (z), z) Q(x)

v. P (A, A) P (F (x), y)

9. (25 pts.) Let C be a set of logical assertions that capture general knowledge of the domain. You may assume that C is consistent. Let A be a set of ground sentences which represent possible assumptions. Unlike in the case of C, some of the assumptions may not be consistent with others. For instance, A may contain Cloudy and ¬Cloudy only one of which can be true in any given model. Let q be some logical assertion. We define an explanation of q to be a a minimal subset of A which is consistent with