Review Article

Tips on Stochastic Optimal Feedback Control and Bayesian Spatiotemporal Models: Applications to Robotics

[+] Author and Article Information
Jongeun Choi

Associate Professor
Mem. ASME
Department of Mechanical Engineering,
Department of Electrical and
Computer Engineering,
Michigan State University,
East Lansing, MI 48824
e-mail: jchoi@egr.msu.edu

Dejan Milutinović

Associate Professor
Mem. ASME
Computer Engineering Department,
University of California,
Santa Cruz,
Santa Cruz, CA 95064
e-mail: dejan@soe.ucsc.edu

1Corresponding author.

Contributed by the Dynamic Systems Division of ASME for publication in the JOURNAL OF DYNAMIC SYSTEMS, MEASUREMENT, AND CONTROL. Manuscript received February 14, 2014; final manuscript received September 22, 2014; published online October 21, 2014. Editor: J. Karl Hedrick.

J. Dyn. Sys., Meas., Control 137(3), 030801 (Oct 21, 2014) (10 pages) Paper No: DS-14-1068; doi: 10.1115/1.4028642 History: Received February 14, 2014; Revised September 22, 2014

This tutorial paper presents an exposition of stochastic optimal feedback control theory and Bayesian spatiotemporal models in the context of robotics applications. The material is self-contained so that readers can grasp the most important concepts and acquire the knowledge needed to jump-start their research. To facilitate this, we provide a series of educational examples from robotics and mobile sensor networks.

Copyright © 2015 by ASME

References

Bertsekas, D. P., 1995, Dynamic Programming and Optimal Control, Vol. 1, Athena Scientific, Belmont, MA.
Bryson, A. E., and Ho, Y.-C., 1975, Applied Optimal Control: Optimization, Estimation, and Control, Taylor & Francis, New York.
Pontryagin, L. S., 1962, The Mathematical Theory of Optimal Processes, Vol. 4, Interscience Publishers, New York.
Yong, J., and Zhou, X. Y., 1999, Stochastic Controls: Hamiltonian Systems and HJB Equations, Vol. 43, Springer, New York.
Kushner, H. J., and Dupuis, P., 2001, Numerical Methods for Stochastic Control Problems in Continuous Time, Vol. 24, Springer, New York.
Stengel, R. F., 1986, Optimal Control and Estimation, Dover, New York.
Oksendal, B., 2003, Stochastic Differential Equations: An Introduction With Applications, Springer, New York.
Davis, M. H., 1993, Markov Models and Optimization, Vol. 49, Chapman & Hall/CRC, Boca Raton, FL.
Koutsoukos, X. D., 2005, “Optimal Control of Stochastic Hybrid Systems Based on Locally Consistent Markov Decision Processes,” Proceedings of the 2005 IEEE International Symposium on Intelligent Control, Limassol, Cyprus, pp. 435–440.
Fleming, W. H., and Soner, H. M., 2006, Controlled Markov Processes and Viscosity Solutions, Vol. 25, Springer, New York.
Engel, Y., Mannor, S., and Meir, R., 2005, “Reinforcement Learning With Gaussian Processes,” Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany, pp. 201–208.
Engel, Y., Mannor, S., and Meir, R., 2003, “Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning,” Proceedings of the 20th International Conference on Machine Learning (ICML-2003), Washington, DC, pp. 154–161.
Deisenroth, M. P., Rasmussen, C. E., and Peters, J., 2009, “Gaussian Process Dynamic Programming,” Neurocomputing, 72(7), pp. 1508–1524.
Leonard, N. E., Paley, D. A., Lekien, F., Sepulchre, R., Fratantoni, D. M., and Davis, R., 2007, “Collective Motion, Sensor Networks, and Ocean Sampling,” Proc. IEEE, 95(1), pp. 48–74.
Lynch, K. M., Schwartz, I. B., Yang, P., and Freeman, R. A., 2008, “Decentralized Environmental Modeling by Mobile Sensor Networks,” IEEE Trans. Rob., 24(3), pp. 710–724.
Choi, J., Oh, S., and Horowitz, R., 2009, “Distributed Learning and Cooperative Control for Multi-Agent Systems,” Automatica, 45(12), pp. 2802–2814.
Xu, Y., Choi, J., and Oh, S., 2011, “Mobile Sensor Network Navigation Using Gaussian Processes With Truncated Observations,” IEEE Trans. Rob., 27(6), pp. 1118–1131.
Xu, Y., Choi, J., Dass, S., and Maiti, T., 2013, “Efficient Bayesian Spatial Prediction With Mobile Sensor Networks Using Gaussian Markov Random Fields,” Automatica, 49(12), pp. 3520–3530.
Le Ny, J., and Pappas, G., 2013, “Adaptive Deployment of Mobile Robotic Networks,” IEEE Trans. Autom. Control, 58(3), pp. 654–666.
Jadaliha, M., and Choi, J., 2013, “Environmental Monitoring Using Autonomous Aquatic Robots: Sampling Algorithms and Experiments,” IEEE Trans. Control Syst. Technol., 21(3), pp. 899–905.
Cao, Y., Yu, W., Ren, W., and Chen, G., 2013, “An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination,” IEEE Trans. Ind. Inf., 9(1), pp. 427–438.
Kalnay, E., 2003, Atmospheric Modeling, Data Assimilation, and Predictability, Cambridge University, New York.
Cortés, J., 2009, “Distributed Kriged Kalman Filter for Spatial Estimation,” IEEE Trans. Autom. Control, 54(12), pp. 2816–2827.
Graham, R., and Cortés, J., 2009, “Cooperative Adaptive Sampling of Random Fields With Partially Known Co-Variance,” Int. J. Robust Nonlinear Control, 22(5), pp. 504–534.
Graham, R., and Cortés, J., 2012, “Adaptive Information Collection by Robotic Sensor Networks for Spatial Estimation,” IEEE Trans. Autom. Control, 57(6), pp. 1404–1419.
Xu, Y., Choi, J., Dass, S., and Maiti, T., 2012, “Sequential Bayesian Prediction and Adaptive Sampling Algorithms for Mobile Sensor Networks,” IEEE Trans. Autom. Control, 57(8), pp. 2078–2084.
Xu, Y., and Choi, J., 2012, “Spatial Prediction With Mobile Sensor Networks Using Gaussian Processes With Built-In Gaussian Markov Random Fields,” Automatica, 48(8), pp. 1735–1740.
Xu, Y., and Choi, J., 2012, “Stochastic Adaptive Sampling for Mobile Sensor Networks Using Kernel Regression,” Int. J. Control Autom. Syst., 10(4), pp. 778–786.
Varagnolo, D., Pillonetto, G., and Schenato, L., 2012, “Distributed Parametric and Nonparametric Regression With On-Line Performance Bounds Computation,” Automatica, 48(10), pp. 2468–2481.
Samson, C., and Ait-Abderrahim, K., 1990, “Mobile Robot Control, Part 1: Feedback Control of a Nonholonomic Wheeled Cart in Cartesian Space,” Institut National de Recherche en Informatique et en Automatique, Report 1288, Le Chesnay, France.
Aicardi, M., Casalino, G., Bicchi, A., and Balestrino, A., 1995, “Closed Loop Steering of Unicycle Like Vehicles via Lyapunov Techniques,” IEEE Rob. Autom. Mag., 2(1), pp. 27–35.
Ren, W., and Beard, R., 2004, “Trajectory Tracking for Unmanned Air Vehicles With Velocity and Heading Rate Constraints,” IEEE Trans. Control Syst. Technol., 12(5), pp. 706–716.
Anderson, R., Bakolas, E., Milutinović, D., and Tsiotras, P., 2013, “Optimal Feedback Guidance of a Small Aerial Vehicle in a Stochastic Wind,” J. Guidance Control Dyn., 36(4), pp. 975–985.
Sutton, R. S., and Barto, A. G., 1998, Reinforcement Learning: An Introduction, Vol. 1, The MIT Press, Cambridge, MA.
Powell, W. B., 2007, Approximate Dynamic Programming: Solving the Curses of Dimensionality, Vol. 703, John Wiley & Sons, Hoboken, NJ.
Long, A. W., Wolfe, K. C., Mashner, M. J., and Chirikjian, G. S., 2012, “The Banana Distribution is Gaussian: A Localization Study With Exponential Coordinates,” Proceedings of Robotics: Science and Systems, Sydney, Australia, pp. 265–272.
Thrun, S., Burgard, W., and Fox, D., 2005, Probabilistic Robotics, The MIT Press, Cambridge, MA.
Anderson, R. P., and Milutinović, D., 2011, “A Stochastic Approach to Dubins Feedback Control for Target Tracking,” Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), San Francisco, CA, pp. 3917–3922.
Jadaliha, M., Lee, J., and Choi, J., 2012, “Adaptive Control of Multiagent Systems for Finding Peaks of Uncertain Static Fields,” J. Dyn. Syst. Meas. Contr., 134(5), p. 051007.
Cressie, N., 1986, “Kriging Nonstationary Data,” J. Am. Stat. Assoc., 81(395), pp. 625–634.
Rasmussen, C. E., and Williams, C. K. I., 2006, Gaussian Processes for Machine Learning, The MIT Press, Cambridge, MA.
Choi, J., Lee, J., and Oh, S., 2008, “Biologically-Inspired Navigation Strategies for Swarm Intelligence Using Spatial Gaussian Processes,” Proceedings of the 17th International Federation of Automatic Control (IFAC) World Congress, Seoul, Korea.
Choi, J., Lee, J., and Oh, S., 2008, “Swarm Intelligence for Achieving the Global Maximum Using Spatio-Temporal Gaussian Processes,” Proceedings of the 27th American Control Conference (ACC), Seattle, WA, pp. 135–140.
Shi, J. Q., and Choi, T., 2011, Gaussian Process Regression Analysis for Functional Data, CRC, Boca Raton, FL.
Abrahamsen, P., 1997, A Review of Gaussian Random Fields and Correlation Functions, Norsk Regnesentral/Norwegian Computing Center, Oslo, Norway.
Snelson, E. L., 2007, “Flexible and Efficient Gaussian Process Models for Machine Learning,” Ph.D. thesis, University College London, London, UK.
Zhang, B., and Sukhatme, G., 2007, “Adaptive Sampling for Estimating a Scalar Field Using a Robotic Boat and a Sensor Network,” 2007 IEEE International Conference on Robotics and Automation, Rome, Italy, Apr. 10–14, pp. 3673–3680.
Laut, J., Henry, E., Nov, O., and Porfiri, M., 2014, “Development of a Mechatronics-Based Citizen Science Platform for Aquatic Environmental Monitoring,” IEEE/ASME Trans. Mechatron., 19(5), pp. 1541–1551.
Xu, Y., and Choi, J., 2011, “Adaptive Sampling for Learning Gaussian Processes Using Mobile Sensor Networks,” Sensors, 11(3), pp. 3051–3066.
Bishop, C. M., 2006, Pattern Recognition and Machine Learning, Springer, New York.
Xu, Y., Choi, J., Dass, S., and Maiti, T., 2011, “Bayesian Prediction and Adaptive Sampling Algorithms for Mobile Sensor Networks,” Proceedings of the 2011 American Control Conference (ACC), San Francisco, CA, pp. 4095–4200.
Xu, Y., 2011, “Environmental Adaptive Sampling for Mobile Sensor Networks Using Gaussian Processes,” Ph.D. thesis, Michigan State University, East Lansing, MI.
Gaudard, M., Karson, M., Linder, E., and Sinha, D., 1999, “Bayesian Spatial Prediction,” Environ. Ecol. Stat., 6(2), pp. 147–171.
Rue, H., and Tjelmeland, H., 2002, “Fitting Gaussian Markov Random Fields to Gaussian Fields,” Scand. J. Stat., 29(1), pp. 31–49.
Cressie, N., and Verzelen, N., 2008, “Conditional-Mean Least-Squares Fitting of Gaussian Markov Random Fields to Gaussian Fields,” Comput. Stat. Data Anal., 52(5), pp. 2794–2807.
Hartman, L., and Hössjer, O., 2008, “Fast Kriging of Large Data Sets With Gaussian Markov Random Fields,” Comput. Stat. Data Anal., 52(5), pp. 2331–2349.
Le Ny, J., and Pappas, G. J., 2009, “On Trajectory Optimization for Active Sensing in Gaussian Process Models,” Proceedings of the 48th IEEE Conference on Decision and Control, Shanghai, China, pp. 6286–6292.
Rasmussen, C. E., and Nickisch, H., 2010, “Gaussian Processes for Machine Learning (GPML) Toolbox,” J. Mach. Learn. Res., 11, pp. 3011–3015.
Minka, T. P., 2001, “Expectation Propagation for Approximate Bayesian Inference,” Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, pp. 362–369.
Williams, C. K., and Barber, D., 1998, “Bayesian Classification With Gaussian Processes,” IEEE Trans. Pattern Anal. Mach. Intell., 20(12), pp. 1342–1351.

Figures

Fig. 1

Fixed-velocity two-wheel robot control problems: (a) minimum expected time control (P1); (b) distance keeping control (P2). (xx, xy)—the robot coordinates relative to the target, which is at the origin; r—distance between the robot and the target; φ—robot heading angle; α—bearing angle; and ν—velocity

Fig. 2

Solution of the minimum expected time problem (P1): (left panel) gray colored map of the value function V(r, α); black color at the absorbing boundary (A) indicates V(r, α) = 0 and the lighter shades depict longer expected times. The type of the boundary conditions is labeled by P—periodic, R—reflective, A—absorbing; (right panel) optimal feedback control; white u = 1 and gray u = –1.
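The value function in Fig. 2 is the kind of object produced by solving a minimum expected time problem on a discretized state space via a locally consistent Markov chain approximation and value iteration. The one-dimensional toy problem below is a hypothetical illustration of that computation (the grid, drift probability, and time step are made up; the paper's problem is two-dimensional in (r, α)):

```python
import numpy as np

# Toy minimum-expected-time problem: a controlled random walk on a 1-D grid,
# absorbing at x = 0 (V = 0 there, as at boundary A in Fig. 2) and reflective
# at the right end (boundary R). Control u in {-1, +1} biases the walk.
N = 50           # grid points
dt = 0.1         # time step of the approximating Markov chain
p_drift = 0.8    # probability of moving in the commanded direction

V = np.zeros(N)
for _ in range(10_000):                  # value iteration sweeps
    V_new = V.copy()
    for i in range(1, N):                # i = 0 stays absorbing, V = 0
        nxt = min(i + 1, N - 1)          # reflective right boundary
        prv = i - 1
        # expected cost-to-go (time) under each control
        cost = [dt + p_drift * V[prv] + (1 - p_drift) * V[nxt],   # u = -1
                dt + p_drift * V[nxt] + (1 - p_drift) * V[prv]]   # u = +1
        V_new[i] = min(cost)
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new
# V is zero at the absorbing boundary and grows with distance from it,
# mirroring the dark-to-light shading of the left panel of Fig. 2.
```

The optimal feedback control (right panel of Fig. 2) is simply the argmin recorded at each state during the final sweep.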

Fig. 3

Solution of the distance keeping control problem (P2): (left panel) gray colored map of the value function V(r, α); darker shades correspond to smaller values of V(r, α) and lighter shades depict its larger values. The types of the boundary conditions are labeled by P—periodic, R—reflective; (right panel) optimal feedback control, white u = 1 and gray u = −1.

Fig. 4

Each agent is driven by swarming and gradient ascent efforts based on its own recursively estimated field (with a model in Eq. (21)) via locally collected measurements by itself and its neighboring agents. The multiagent system locates peaks of an uncertain static field in a distributed and scalable manner [39].
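The combination of swarming and gradient ascent efforts in Fig. 4 can be sketched as follows. This is not the paper's algorithm: the recursive field estimator of Eq. (21) is replaced here by a finite-difference probe of a known synthetic field, and the gains and flock size are illustrative only:

```python
import numpy as np

def field(p):
    # Synthetic static field with a single peak at (1, 1); stands in for
    # the uncertain field each agent would estimate from measurements.
    return -np.sum((p - 1.0)**2, axis=-1)

def grad_est(p, h=1e-4):
    # Finite-difference gradient; a stand-in for the gradient of each
    # agent's recursively estimated field.
    e = np.eye(2)
    return np.array([(field(p + h * e[k]) - field(p - h * e[k])) / (2 * h)
                     for k in range(2)])

rng = np.random.default_rng(2)
pos = rng.uniform(-1, 0, size=(6, 2))          # six agents, random start
for _ in range(200):
    center = pos.mean(axis=0)
    for i in range(len(pos)):
        swarm = 0.05 * (center - pos[i])       # swarming (cohesion) effort
        climb = 0.10 * grad_est(pos[i])        # gradient ascent effort
        pos[i] = pos[i] + swarm + climb
# the flock converges near the field's peak at (1, 1)
```

In the paper's distributed setting, `center` would be computed from neighboring agents only, which is what makes the scheme scalable.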

Fig. 5

(a) Remotely controlled boat equipped with depth sensor and GPS and (b) experiment site with the sampling locations (shown as colored dots)

Fig. 6

Prediction with hyperparameters σf = 0.2, σ1 = 0.07, σ2 = 0.07, and σw = 0.03. (a) Estimated depth and (b) prediction error variance, with sampling positions shown as white crosses.

Fig. 7

Prediction with hyperparameters σf = 0.2, σ1 = 0.2, σ2 = 0.2, and σw = 0.03. (a) Estimated depth and (b) prediction error variance, with sampling positions shown as white crosses.
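The predictions and error variances of Figs. 6 and 7 come from Gaussian process regression. A minimal sketch is given below, assuming a squared-exponential kernel with signal scale σf, per-axis length scales (σ1, σ2), and i.i.d. noise σw; the data, sampling positions, and prediction points are synthetic stand-ins for the boat's depth measurements:

```python
import numpy as np

sigma_f, sigma_1, sigma_2, sigma_w = 0.2, 0.07, 0.07, 0.03  # Fig. 6 values

def kernel(A, B):
    # Squared-exponential covariance with separate length scales per axis
    d1 = (A[:, None, 0] - B[None, :, 0]) / sigma_1
    d2 = (A[:, None, 1] - B[None, :, 1]) / sigma_2
    return sigma_f**2 * np.exp(-0.5 * (d1**2 + d2**2))

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 2))              # sampling positions
y = np.sin(3 * X[:, 0]) * np.cos(3 * X[:, 1])    # synthetic "depth" data
Xs = rng.uniform(0, 1, size=(5, 2))              # prediction locations

K = kernel(X, X) + sigma_w**2 * np.eye(len(X))   # noisy training covariance
Ks = kernel(Xs, X)
mean = Ks @ np.linalg.solve(K, y)                # predictive mean, Fig. 6(a)
v = np.linalg.solve(K, Ks.T)
var = sigma_f**2 - np.sum(Ks.T * v, axis=0)      # prediction error variance, 6(b)
```

Shorter length scales (Fig. 6) make the variance rise quickly away from the sampling positions; the larger length scales of Fig. 7 smooth both the estimate and the variance map.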

Fig. 8

The posterior distribution of β, σf², σs, and σt at (a) t1 and (b) t20

Fig. 9

The prediction at (a) t1 and (b) t20 using the MCMC-based approach. The true fields are plotted in blue solid lines. The predicted fields are plotted in red dashed lines. The area between red dotted lines indicates the 95% confidence interval.
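The MCMC-based approach behind Figs. 8 and 9 draws posterior samples of the model hyperparameters and summarizes them with credible intervals. The toy random-walk Metropolis sampler below conveys the idea with a single scale parameter σ and Gaussian data; the paper's model has several hyperparameters (β, σf², σs, σt), so this one-parameter version is only a hypothetical stand-in:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.0, 2.0, size=200)       # data generated with true sigma = 2

def log_post(sigma):
    # Log-posterior of sigma under a flat prior on sigma > 0
    if sigma <= 0:
        return -np.inf
    return -len(y) * np.log(sigma) - np.sum(y**2) / (2 * sigma**2)

samples, sigma = [], 1.0
for _ in range(5000):
    prop = sigma + 0.1 * rng.normal()    # random-walk proposal
    # Metropolis accept/reject on the log scale
    if np.log(rng.uniform()) < log_post(prop) - log_post(sigma):
        sigma = prop
    samples.append(sigma)

post = np.array(samples[1000:])          # discard burn-in
lo, hi = np.quantile(post, [0.025, 0.975])   # 95% interval, as in Fig. 9
```

Histograms of `post` correspond to the posterior panels of Fig. 8, and plugging each sample into the predictive model and pooling the results yields the prediction bands of Fig. 9.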
