Lupine Publishers: Robotics Engineering Journal


Monday, 14 March 2022

Lupine Publishers| Maze Search Using Reinforcement Learning by a Mobile Robot

Lupine Publishers| Journal of Robotics & Mechanical Engineering

Abstract

This review presents research on applying reinforcement learning, together with new approaches, to course search in mazes that involve several kinds of multi-point passing. It is based on an agent's selective learning from multi-directional behavior patterns using Profit Sharing (PS). The behavior is selected stochastically from four kinds of behavior using PS with a Boltzmann distribution, with a scheme that inhibits invalid rules through a reinforcement function in the form of a geometric sequence. Moreover, a variable temperature scheme is adopted in this distribution: environmental identification is emphasized in the first stage of the search, and the emphasis shifts to convergence of learning as time passes. A SUB learning system and a multistage layer (hierarchical) system are proposed in this review, and their functions were examined through simulations and experiments using a mobile robot.
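The geometric reinforcement function mentioned above can be illustrated with a minimal sketch. It assumes the common form of Profit Sharing in which the rule fired just before the reward receives the full reward and each earlier rule receives a fixed fraction of its successor's credit. The function name, the state and action labels, the reward of 100 and the ratio of 0.5 are all hypothetical; the review does not state the values it used.

```python
from collections import defaultdict


def profit_sharing_update(weights, episode, reward, ratio=0.5):
    """Distribute `reward` backwards along an episode of (state, action) rules.

    The last rule gets the full reward; each earlier rule gets `ratio` times
    the credit of the rule after it, so the credits form a geometric sequence.
    `ratio` is an assumed value, not one taken from the review.
    """
    credit = reward
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= ratio
    return weights


# Hypothetical three-step episode that ends at the goal.
weights = defaultdict(float)
episode = [("s0", "forward"), ("s1", "left"), ("s2", "forward")]
profit_sharing_update(weights, episode, reward=100.0, ratio=0.5)
```

Rules far from the reward receive geometrically smaller credit, which is one way detours and loops can be kept from accumulating weight.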

Keywords: Autonomous; Mobile Robot; Learning; Agent; System Simulation

Introduction

Robots have begun to spread not only in industry but also into ordinary homes (cleaning robots, for example). Recently they have been required to achieve complex tasks and adapt to complex environments, which can be done by agents, a concept from distributed artificial intelligence that captures various robots abstractly. Conventionally, agent behavior has been controlled by rules designed as if-then rules, so a large number of rules were required to adapt to complex environments and achieve complex tasks. In practice, it is impossible for human designers to design individual rules for every environment.

Consequently, much attention has been paid to reinforcement learning research, e.g., Q-Learning (QL), Profit Sharing (PS), and Instance-Based (IB) methods. Reinforcement learning is a form of learning without supervision that attains the optimal task by learning the environment from the agent's own behavior, without prior knowledge of the objects and environment. Various application areas have been considered, such as maze search [1], optimal route search [2], and the design of a dynamic route navigation system using electronic maps [3]. In particular, for route search, new methods are being explored that integrate reinforcement learning with the A* algorithm, a shortest-route search algorithm that does not use learning. The advantage of integrating reinforcement learning with such a non-learning algorithm is that the agent can reach the target by trial and error even if only the target point is given and the environment is unknown (even if unknown dynamic changes exist).

Reinforcement learning is more effective than shortest-route search algorithms when the route is unknown, as in a maze, or when there are unknown dynamic changes. It is therefore necessary to choose a suitable field when deciding where to apply reinforcement learning. Basic Profit Sharing (PS) has been considered theoretically by Muraoka and Miyazaki [4]. More recently, Kawada proposed an efficient maze search method that improved the action selection mechanism and the learning mechanism of PS. It is an action-selection switching type with a premeditated action selection mechanism, combined with a method that does not reinforce rules more than necessary during learning.

Moreover, it has been pointed out that PS is more advantageous than QL in maze search because the number of steps converges with PS, as was found by comparing numerical experiments on PS and QL. There is also research in which the reward at the goal is paid not as a lump sum but in two installments.

The purpose of the agent in this research is to learn the actions or rules for obtaining the path from the start point to the goal point after acquiring a key at point k. Although many studies have linked autonomous agents' action decisions with maze learning [5], fixed-point passing problems, which set sub-goals in the middle of a maze, are of particular interest in agent maze learning because laboratory research on them can be applied to industry.

This review is based on a paper [6] from a Japanese conference. On the premise that intelligent agents move autonomously through mazes, and based on agents' selective learning of multidirectional behavior patterns using PS, that work treated, as an example of reinforcement learning, the problem of searching for a route along which a mobile robot moves to the goal point via two fixed points. The time-variant Boltzmann distribution used in QL is newly adopted for PS, giving a search strategy that emphasizes environmental identification in the initial stage of the search and convergence in the later stage. This review also proposes a multistage hierarchical learning system that realizes learning in a complex maze and a SUB learning system that realizes learning in a vast maze. Both aim to speed up learning: instead of paying the reward as a lump sum at the goal, the reward is paid in two installments, and the value between sub-goals is updated.
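A time-variant Boltzmann selection of this kind can be sketched as follows. The sketch is illustrative only: the linear cooling schedule, the four action names, and the function names are assumptions, and the start and end temperatures of 3 and 2 are borrowed from the values reported in the Results section rather than from a stated formula.

```python
import math
import random

# Assumed labels for the four movement directions of the agent.
ACTIONS = ["forward", "backward", "left", "right"]


def temperature(trial, n_trials, t_start=3.0, t_end=2.0):
    """Cool linearly from t_start to t_end over the trials (assumed schedule)."""
    frac = min(max(trial / max(n_trials - 1, 1), 0.0), 1.0)
    return t_start + (t_end - t_start) * frac


def boltzmann_probabilities(weights, temp):
    """Return selection probabilities proportional to exp(weight / temp)."""
    highest = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp((w - highest) / temp) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(weights, temp, rng=random):
    """Pick one of ACTIONS stochastically from its Boltzmann probability."""
    probs = boltzmann_probabilities(weights, temp)
    return rng.choices(ACTIONS, weights=probs, k=1)[0]


# Hypothetical rule weights for one state: early trials explore more widely.
rule_weights = [4.0, 1.0, 1.0, 1.0]
early = boltzmann_probabilities(rule_weights, temperature(0, 500))
late = boltzmann_probabilities(rule_weights, temperature(499, 500))
```

A higher temperature flattens the distribution, which favors identifying the environment; a lower temperature concentrates probability on the strongest rule, which favors convergence.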

First, this review proposes a SUB learning system to cope with two-point passing problems in a relatively large maze. In the SUB learning system, the basic algorithm inherits the conventional learning algorithm and learns the course to each fixed point by SUB learning, which helps the agent reach the goal early, so that learning efficiency is raised and learning time is shortened. Next, this review proposes a multistage hierarchical system to deal with cases such as duplicate passages in the maze. The multistage hierarchical system ultimately achieves a major goal by dividing the measures for achieving small goals among individual SUB learning systems. This research verifies these functions by simulation and by experiment using a mobile robot.

Condition Selection in Course Search

Fixed Point Passing Problem

The two-fixed-point passing problem can be categorized into several types according to the fixed-point passing order, the number of steps, the multiple-passing method, the profit-dividing method, and the method of dividing the search course. For example, the fixed-point passing order is classified as follows.

a) Fixed Order: A → B or B → A; the designer determines the order in advance.

b) Any Order: A → B or B → A; the designer does not need to determine the order.
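The difference between the two classes can be made concrete with a small check on a finished path. This is a hypothetical helper, not part of the reviewed method: it assumes a path is a list of cell labels in which the fixed points appear as "A" and "B".

```python
def satisfies_passing_rule(path, fixed_points=("A", "B"), fixed_order=True):
    """Check whether `path` passes the fixed points as required.

    With fixed_order=True the points must occur in the given order
    (A somewhere before B); otherwise any order is accepted.
    """
    if fixed_order:
        cells = iter(path)
        # `point in cells` consumes the iterator, so each fixed point must
        # be found after the previous one (a subsequence test).
        return all(point in cells for point in fixed_points)
    return all(point in path for point in fixed_points)


# Hypothetical paths from start S to goal G.
print(satisfies_passing_rule(["S", "A", "x", "B", "G"]))              # fixed order A -> B
print(satisfies_passing_rule(["S", "B", "A", "G"]))                   # wrong order
print(satisfies_passing_rule(["S", "B", "A", "G"], fixed_order=False))
```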

Because of limits on time and space, this review focuses only on comparing the conventional method with the proposed methods (the SUB learning system and the multistage hierarchical learning system). No comparative verification is carried out for the effect on learning performance of the fixed-point passing order, the number of steps at the time of fixed-point passing, the number of times a fixed point is passed, the profit distribution method, or the search course division method.

Agent Type and Movement Form

Various types of agent and movement form can be assumed, such as wheeled and walking types, 4-direction and 8-direction types, left and right turning types, and right/left backward-inclined swivel types. This research reports only the results for a wheeled agent with 4-way movement (forward, backward, and left and right turns), which is the easiest to handle.

Experimental Devices

Environment for Simulations and Experiments

The main components of a general maze are the passages through which the agent passes, walls, people, and other agents. Here, the component that heads toward a place is called the agent. Something that is fixed and need not be headed toward, such as a wall or a passage, is called a static omnidirectional object; something that moves on its own judgment and need not be headed toward, such as a person or another agent, is called a dynamic omnidirectional object. On the other hand, an object that the agent must head toward because it has work to do there is called a directional object. A directional object may be static, like a fixed point or a goal, or dynamic, as when handing things to people or other agents.

Figure 1: Map A.


Therefore, in this study, the aisles and walls are static omnidirectional objects, and the fixed points and goals are static directional objects. The mazes used as the simulation environment are shown in Figures 1 and 2. In these example mazes, black squares are walls (static omnidirectional objects). Map A and Map B are both mazes of 5×7 squares. A and B are the fixed points (directional objects); unless stated otherwise, the passing order is fixed as A → B. S/G is both the start and the goal: the task is to start from S/G, pass through each fixed point, and return to S/G. That is, the maze is of the circulation type. Other types include mazes that are entered from outside and exited to the outside, and mazes in which the agent travels from an internal start to a different internal goal.

Figure 2: Map B.

Khepera

The compact mobile robot Khepera (Figure 3) used in the experiments is a wheeled mobile robot system for research and development, developed at the Institute of Microcomputer and Interface of the Federal Institute of Technology, Lausanne, Switzerland (Table 1).

Table 1: Main hardware profile of Khepera.

Results and Discussion

Figure 4 shows the trace display of the simulation results on Map C, which differs from the map in Figure 1, for the reduction in the number of steps: when the number of trials increases from the 300s to the 500s, the number of steps required to reach the goal is almost halved. In the latter, convergence-emphasizing half, the temperature constant is reduced from 3 to 2. This approach is expected to facilitate course searches in two-fixed-point passing problems through the new combination of SUB learning and multistage hierarchical learning. Moreover, it is expected to reduce the number of steps required to reach the goal as learning progresses, and to be useful for course search by mobile robots in unknown plants or factories (Appendix 1).

Figure 3: A mobile robot Khepera.

Figure 4: Simulation results for reduction of step number in Map C.

Conclusion

Drawing on findings from applied research on reinforcement learning, this research pointed out that, as an advantage over general shortest-path search algorithms, reinforcement learning is effective not only in environments unknown to the agent but also when unknown dynamic changes occur partway through in an environment that was known at the start. When such an unknown dynamic change occurs, it is difficult to match the timing of the change between the simulation environment and the real environment, and an effective method for setting up the real environment must await future research. However, when the unknown environment is static, if only the result of the route search in the simulation environment is transferred to the real environment, the influence of the search time on work in the real environment can be minimized.

Therefore, this review set up simple cyclic-maze problems with only static obstacles and presented two examples of maze search by reinforcement learning using the small mobile robot Khepera, which can turn 90 degrees on the spot and detect neighboring walls with its 8 sensors. Finally, the simulation results are shown as two maps (first half and second half) traced by the mobile robot: the number of steps required to reach the goal is almost halved. In the latter, convergence-emphasizing half, the temperature constant is reduced to 2/3 of its earlier value.

Tuesday, 30 November 2021

Lupine Publishers| Modeling the Temperature of the Evacuation Chamber with Artificial Neural Networks

Lupine Publishers| Journal of Robotics & Mechanical Engineering

Abstract

This investigation applies artificial neural networks to the ore drying process in carbonate-ammonia leaching. To carry out this research, the main variables that characterize the process were identified, and data covering a whole month of the facility's operation were collected. A backward stepwise regression analysis was then performed, which showed that the linear correlation coefficient did not exceed 0,62. In addition, a two-layer feed-forward backpropagation neural network was identified to model the temperature. This network reached correlation coefficients of 0,97 in training and 0,95 in validation, as well as 0,87 in generalization.

Keywords: Artificial Neuronal Network; Regression; Feed-Forward Backpropagation; Mineral Drying

Introduction

Worldwide, modern control systems now play a fundamental role in developing solutions to problems in domestic and industrial applications. At the industrial level, they contribute chiefly to technological innovation, profitability, and maintainability of the controlled processes. Among the advanced control strategies under investigation for automating complex processes are adaptive control, model-based predictive control, robust control, and intelligent control, among others. Intelligent control relies on several techniques, such as fuzzy logic, evolutionary algorithms, and artificial neural networks. Artificial neural networks can be used effectively and accurately for modeling systems with complex dynamics, especially nonlinear processes that vary over time. The growing interest in neural networks is due to their great versatility and the continuous advances in network training algorithms and hardware [1-4].

Nickel-producing companies run continuous processes of great complexity that require automation to achieve greater production efficiency. In the ore preparation process, it is important to control the temperature at the outlet of the dryer evacuation chamber in order to dry the mineral to an established humidity level of 4 to 5,5 %. It must also be ensured that the temperature at the outlet of the electrofilters is above the dew point temperature, to prevent the deterioration of the electrofilters, which leads to high economic losses, from accelerating considerably. The research problem is the inefficient control of the outlet temperature of the dryer evacuation chamber in the ore preparation process; the objective is to obtain an artificial neural model of the outlet temperature based on the main input variables, using Matlab as the calculation tool.

Materials and Methods

Description of the Mineral Drying Process

The ore is dried in elongated cylinders consisting of a combustion chamber, where the hot gases that dry the ore are produced, and the cylinder (drum) in which the ore is dried. These drums (Figure 1) have internal lifting elements that enable heat transfer between the hot gas and the mineral. In addition, the dryer drum has a motor system coupled to its body that rotates it about its axis. Externally, the dryer drum rests on two wheels that have two pairs of rollers. Internally, near the combustion chamber, the dryer has guides or baffles welded to the drum body that direct the mineral toward the outside of the cylindrical part of the drum [5]. The mineral dryer is a complex object for physical-mathematical modeling, with a large number of input and output parameters in complex interdependence (Figure 2).

Figure 1: Schematic diagram of the dryer.

Figure 2: Structural diagram of the mineral drying process.

The Input Parameters in the Process are:

a) rpmAl - Feed motor speed [rpm].

b) rpmMp - Speed of the main motor [rpm].

c) corrAl - Feed motor current [A].

d) corrMp - Current of the main motor [A].

e) temGaEn - Temperature of the incoming gases [°C] (coming from the Reduction Furnaces Plant).

f) fluPe - Oil flow at the burner inlet [kg/h].

The Output Parameter is:

a) temGaSa - Outlet gas temperature [°C].

In addition to the input and output parameters, one specific disturbance that influences this process should be highlighted: minAl, the mineral fed to the dryer. Other parameters are known to be involved in the drying of the ore and, in turn, to influence the temperature of the exhaust gases in the evacuation chamber (granulometry of the incoming mineral, humidity of the incoming mineral, exact amount of mineral fed to the dryer), but because of the process itself they are not recorded. Thanks to the existing automation, the value of each process parameter is sensed by its corresponding instrument and the signal is sent to the computer in the process control office. The data, obtained over one month of operation, were recorded every 240 s and processed with the Statgraphics Plus V 5.1 software.

Artificial Neural Networks

The type of artificial neural network, the number of layers, and the number of neurons in each layer that best characterize the ore drying process were determined through a trial-and-error process that varied the number of neurons and the maximum permissible error. Using Matlab's toolbox (nnstart), the performance of the artificial neural models was evaluated with the mean square error and the correlation coefficient between the real values and those obtained by the network [6]. The objective was to give the network enough neurons in the hidden layer to learn the characteristics of the possible relationships in the sample data. Through this trial-and-error process, the feed-forward backpropagation structure was identified as providing the best results. The proposed network consists of two layers: a hidden layer and an output layer. The output layer has only one unit, which gives the value of the outlet gas temperature associated with each input vector presented to the network. The hidden layer has a variable number of neurons.
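The two evaluation measures named above can be written out in a few lines. The sketch below is plain Python rather than the Matlab tooling used in the study, and the sample values are hypothetical; it only illustrates why both measures are reported, since a prediction can correlate perfectly with the real values and still have a large squared error.

```python
import math


def mean_squared_error(real, predicted):
    """Average of the squared differences between real and predicted values."""
    return sum((r - p) ** 2 for r, p in zip(real, predicted)) / len(real)


def correlation_coefficient(real, predicted):
    """Pearson linear correlation coefficient between two equal-length series."""
    n = len(real)
    mean_r = sum(real) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(real, predicted))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


# Hypothetical values: the prediction is exactly twice the real value.
real = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
mse = mean_squared_error(real, predicted)      # large error
r = correlation_coefficient(real, predicted)   # perfect linear correlation
```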

Results and Discussion

Figure 3 shows the behavior of the exhaust gas temperature in the evacuation chamber over the month of work, between its minimum and maximum values of 79,59 and 130,51 °C, respectively. Once the database had been analyzed, the sample functions that evaluate the measures of central tendency and dispersion were determined through a descriptive statistical analysis (Table 1). The mathematical model that best represents the relationship between the analyzed variables was then sought: Table 2 shows the regression analysis for the output pulp density, where a correlation coefficient of 0,7 is observed.

Figure 3: Control chart for the dependent variable.

Table 1: Summary of the sample's descriptive statistical analysis for one month.

Table 2: Regression analysis summary.

Figure 4 shows the training behavior of the network during learning, with the training, validation, and test curves converging over the iterations to an error of 0,00026. Figure 5 shows the correlation coefficients for the training, validation, testing, and overall fit of the artificial neural network (here “nntemGaSa” denotes the artificial neural model of the outlet gas temperature in the ore drying process and “temGaSa” the real temperature). Figure 6 shows the generalization of the network with 1767 data points not presented during training, where a correlation coefficient of 0,87 is observed.

Figure 4: Behavior of training and validation of the neural network.

Figure 5: Correlation coefficients of the neural network.

Figure 6: Network Generalization.

Conclusion

The capacity of the feed-forward backpropagation network for the simulation of pulp sedimentation processes in the industry was demonstrated. The structure that best characterizes the behavior of the temperature of the exhaust gases in the evacuation chamber has two layers, with 50 neurons in the hidden layer and one in the output layer, trained with the Levenberg-Marquardt method (trainlm) and using the log-sigmoid (logsig) and hyperbolic tangent sigmoid (tansig) transfer functions.
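The shape of such a network can be sketched as a forward pass with six inputs, 50 hidden neurons, and one output. This is an untrained illustration in plain Python, not the Matlab model from the study. The text names both transfer functions but does not say which layer uses which, so the sketch assumes logsig in the hidden layer and tansig at the output; it also assumes inputs and output are normalized, and the random weights stand in for values that training would supply.

```python
import math
import random

N_INPUTS, N_HIDDEN = 6, 50  # six process inputs, 50 hidden neurons


def logsig(x):
    """Log-sigmoid transfer function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tansig(x):
    """Hyperbolic tangent sigmoid transfer function, output in (-1, 1)."""
    return math.tanh(x)


def init_network(seed=0):
    """Create random placeholder weights; a trained model would replace these."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
    b_out = rng.uniform(-1, 1)
    return w_hidden, b_hidden, w_out, b_out


def forward(inputs, network):
    """Map one normalized input vector to one normalized output value."""
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [
        logsig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return tansig(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Inputs ordered as rpmAl, rpmMp, corrAl, corrMp, temGaEn, fluPe (normalized, hypothetical).
network = init_network()
output = forward([0.1, -0.2, 0.3, 0.0, 0.5, -0.4], network)
```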

Monday, 11 October 2021

Lupine Publishers| Insight Looks to Soft (Continuum) Robotics

Lupine Publishers| Journal of Robotics and Mechanical Engineering

Opinion

Soft Robotics is an emerging sub-field of Robotics that is very useful in medicine, industry, space exploration, deep-sea exploration, nano-robotics, and many similar applications. The major benefit of Soft Robots compared with Rigid Robots is their excellent flexibility and adaptability in accomplishing tasks. Before moving further, I would like to define Soft or Continuum Robots: “Soft Robots are robots of small, medium, or large size, with various biological or non-biological body forms, made of ultra-soft and flexible materials that are engineered using Continuum Mechanics and Kinematics”. The big difference between conventional rigid robots and soft robots is this: in rigid robotics, intelligence is engineered using AI alone to control the robot body, whereas in soft robotics the materials from which the robots are made are themselves smart and provide intelligence, sensation, and actuation. Therefore, Soft Robots can also learn from the surrounding environment on their own, and they have greater flexibility in clutching, climbing, moving, defending, etc. Why does this happen? It is because Soft Robots are constructed from highly compliant materials similar to those found in living organisms on planet Earth. Soft Robots are thus built using material morphology and Continuum Mechanics, the mechanics that deals with the kinematics and mechanical behaviour of materials modeled as a continuous mass rather than as discrete particles; Soft Robotics is therefore also called “Continuum Robotics”. These robots are constructed using biological materials, biophotonic materials, conductive polymers, biochemical materials, nanomaterials, nanocomposites, synthetic biology, Shape Memory Alloys (SMA) and smart materials, DLC, carbon with a high Young's modulus, and so on.
In conclusion, it is fair to say that Smart Materials are the building blocks of Soft/Continuum Robots, where smart materials can be defined as “materials that have the ability to sense environmental stimuli, process them, and actuate (respond) accordingly”. Hence Soft Robots need less electronic AI than Rigid Robots, are less harmful to humans and the environment, and can mimic, learn, move, and adapt quickly from their surroundings (Figure 1). In Figure 1 I have depicted some successful Soft Robots, such as Octobot, the world's first ultra-soft and flexible Soft Robot, and the Soft Robot Fish.

Figure 1:

Conclusion

Soft or Continuum Robotics is a fresh subfield of robotics technology in which a lot of research is still needed to bring it to the next level. This branch of robotics has its own importance and utility alongside the conventional one and is very useful in deep-space, medical, industrial, and deep-sea research.

Monday, 30 August 2021

Lupine Publishers| Insight Virtual Humanoid Robotics Modeling: the Ultimate Level of Artificial Intelligence

Lupine Publishers| Advances in Robotics & Mechanical Engineering (ARME)

Abstract

I have no doubt in stating that Virtual Humanoid Robots (VHRs) are the ultimate level of Artificial Intelligence, one that will change the world and human technology. They would be applicable in all domains of technology, with Ultra Artificial Intelligence (UAI) as the common factor and with the ability to disappear and appear by some means. This would boost our civilization from Type-0 to at least Type-1 and would be a first step toward competing with alien technology, if it exists (hypothesis only). I would like to define the term Virtual Humanoid Robotics (VHR) as “Humanoid Robotics with UAI that has the ability to transform from Physical to Virtual through an Internal (Humanoid Self-Control) or External (Human-Control) mode activation mechanism”. VHR is a future technology that will use energy from the Sun (or space), the Internet of Things (IoT) with RFID USN, big data, and self-learning and self-healing mechanisms. In this short communication I would like to sketch this future utopia before your eyes with an initial modeling of the coined term VHR.

Keywords: Humanoid Robotics; Bionic Brain; UAI; Virtual Humanoid Robotics; Robotics Teleportation

Modeling to VHR

a) VHR-Basic Engineering Model

Figure 1:

In my first model, “VHR-Basic Engineering”, I depict that we need to extend the fundamental engineering of humanoid robotics into the Virtual Humanoid Robotics domain. The model is therefore divided into two broad chambers: the Humanoid chamber and, to give virtual ability, the Virtualization chamber. As the model shows, building a successful humanoid requires advanced humanoid robotics hardware linked to a Bionic Brain, which mimics the human brain in the form of UAI and in turn cascades to an Advanced Humanoid Operating System and Communication Interfaces. Once this first segment is engineered, a physical humanoid can be built. To reach the next level, that is, to convert the physical humanoid into a virtual one and back again, we do not need to modify the hardware, but we do need to extend what exists. The Virtualization chamber represents this in the model. It has two functional blocks to engineer: Advanced Physical-to-Virtual Mode Transfer Units and Light/Projection/Optical/Teleport Interfaces (Figure 1).

b) Physical-to-Virtual Modes Switch Model

Figure 2:

My second proposed model, the “Physical-to-Virtual Mode Switch” model, is one of the essential VHR engineering models; in other words, it expands and details the Virtualization chamber, the second part of my first model. To represent the concept clearly in the model diagram, I considered three possible modes, M1, M2, and M3; the number may increase in the future with technological advancement and new methods of virtualization. Mode M1 has the highest priority for implementing VHR: the humanoid hardware itself has the ability to appear and disappear under self-control (Internal Control), which is only a hypothesis right now. Mode M2, the second priority, is teleportation, which is considered possible and on which scholars at several leading universities and institutions are doing much research. The last mode, M3, is the easiest but not satisfactory: virtualization is engineered using virtual and augmented reality [1-10] (Figure 2).

Conclusion

I have discussed two models and, with their help, tried to explore one of the most promising and world-changing future technologies, “Virtual Humanoid Robotics”, in which a humanoid will in the near future not only look like a human but also have the ability to turn itself into an avatar. This would be very helpful for sending humanoids virtually into deep space, to stars and planets, to understand the universe more closely through teleportation or an internal humanoid mechanism. VHR is also the ultimate level of AI and hence might shift mankind on planet Earth from a Type-0 civilization to a Type-1 civilization, as shown in sci-fi movies.