Modeling the Behavior of Virtual Systems with Endogenously Shaping Purposes

The problem of constructing a choice model for an agent who endogenously shapes the purposes of his evolution is discussed. It is demonstrated that its solution requires extending well-known decision-making methods to take into account how the motivation for choosing an action mode depends on the agent's ambition to implement subjectively understood interests and on the environment state. The latter is considered in the form of a purposeful state situation model that exists only in the mind of the agent. This situation is the basis for the agent's ideas about the possible results of the selected action mode. The agent's ambition to build confidence in the feasibility of the action mode and in the possibility of achieving the desired state requires him to use procedures for forming a model-representation based on measured values of the environment state. This leads to a game approach to the choice problem, whose solution can be obtained on a set of trade-off alternatives.


Introduction
The development of the theory of multi-agent systems is currently aimed at solving a complex of problems centred on the phenomenon of subjective choice [13]. The formal theory of choice [7] developed by abstracting from subjective factors, which led to a normative theory of decision-making by a "perfect" subject. The logic of the choice problem has led to the need to study how and why, in real conditions, a subject departs from normative rationality [7,8]. The solution to this problem is currently associated with results obtained in the theory of reflexive games and in the theory of information management of systems possessing will and intelligence [1, 9-11]. However, despite the abundance of works in this direction [12], the problem remains open.
The motivation of agents' activity is associated with their interests and the striving to implement them. In [1] it is demonstrated that an agent's interests can be formally represented by two parameters: the specific value of the purposeful state situation based on the results and the specific value of the purposeful state situation based on the cost of performance. The motivation of choice is inherently associated with the striving to achieve goals with some "best result". This concept can be formally represented, for example, as the ratio of the specific value of the purposeful state situation based on the result to the specific value of the purposeful state situation based on the cost of performance [1]. The parameter values are measured on a scale that characterizes the agent's subjective attitude to the evolution of his state, depending on the selected action mode, as well as to the evolution of the system as a whole, given that the agent's interests can only be implemented within the system. The scale values depend on the agent's system of values and norms, which reflects the core of his interests, his emotional experiences and the extent of his commitment to implementing them. As stated in [3], the system of internal values can be considered a priori given and invariant only until there arises the possibility of the agent's death or of his obtaining evaluations of the purposeful state situation that he would not consider satisfactory. The system of internal values should be considered as being affected by the ethical system the agent accepts. Therefore the term "best result" depends on the ethical system through the system of internal values and norms, which in turn determines the structure of the agent's preferences. Note that in the normative theory of decision-making the structure of the agent's preferences is considered a priori given [2].
The scheme for formalizing an agent's interests makes it possible to ascertain his subjective estimates of the desirability and attainability of different specific values of the purposeful state situation and, through them, of the specific values of the quality-of-life indicators to which the agent aspires. Since these values are formed on the basis of subjective assessments of the value and feasibility of interests, the agent's interests and the corresponding purposes are endogenous, that is, generated within the system.

Assumptions
For a constructive formalization of the individual choice problem, we are to introduce the following set of assumptions.
1. The basic motivations of an agent's (subject's) activity are hunger, comfort, self-preservation. They define the structure of needs, are a basis of self-cognition and environment transformation aimed at building up the subjectively understood ideal.
2. Needs, the striving for cognition, and abilities determine the subject area in which the interests of a subject develop, i.e. his mission.
3. Implementation of interests depends on the system of values and norms the agent adheres to, and on restrictions imposed by an ethical system. They determine the structure of the agent's preferences when choosing a mode of action.
4. The structure of values and norms, as well as the structure of preferences, is not fixed; a subject can choose them. We shall call such choice options structural alternatives.
5. The purpose of a subject's activity is not specified but is formed by him.
6. Motivation of choice is determined by the subject's interest in his evolution towards a subjectively understood ideal.
The agent's interest in evolution expresses his relation to the state of the environment. It is expressed, by means of assessments, in his ideas about how far the observed state corresponds to the success of his interests, inclinations and intentions. Consequently, there is a set of variables, observed and perceived by the subject, with the help of which the agent characterizes the environment state. The agent's awareness of these variables is a necessary external condition prompting him to choose modes of action aimed at implementing his interests in the choice situation. Awareness is expressed qualitatively by verbal assessments and covers all state components, including the selection of the subject of choice. The idea of the environment state gives rise both to the attitude towards the state and to the attitude towards the possibility and effectiveness of implementing interests with "best results" through the modes of action.
Definition. We define the purposeful state situation as the qualitative characteristics that express the agent's attitude to the environment state, his assessment of the direction and possibilities of implementing his interests, the values of results, and the effectiveness of the required efforts, all in terms of his ideas about the state of the environment and as specified by his internal system of interests, values and norms as well as by his ethics.
The purposeful state situation exists in the consciousness of a subject and reflects his individual characteristics in modeling the environment state. Therefore, we are to speak about the relation of a subject to his idea-models. This is discussed in [4], where the authors propose to estimate ideas by a measure of their value in the course of implementing interests. A person therefore selects from a variety of options depending on his conviction of their usefulness for the transition to the desired state.
As the idea of the purposeful state situation is based on the subjective estimation of the environment state and of the subject himself, we can assume that the choice of action modes aimed at implementing interests with "best results" is a subjective rational choice, with the state of the environment being considered as an exogenous factor. The purposeful state situation, as a model, is an endogenous factor in the subject's choice model and determines his attitude to the observed state by means of a set of qualitative characteristics. These considerations allow us to offer the following scheme of the agent's action mode choice:
1. For the environment state S, an agent assigns the purposeful state situation x ∈ X, where X is a set of possible descriptions of the environment state. Its specific value is determined as an estimate of the degree to which interests are implemented. Then the degree of satisfaction with the situation is evaluated, as well as the degree to which the ideas about the situation, and their usefulness, conform to state S. If the conviction that state S is adequately reflected is below a preset threshold, the situation diagnostics procedure is applied.
2. If the satisfaction with the purposeful state situation is below a preset threshold, a mode of action is selected from the set of alternatives C that allows the agent to achieve the desired purposeful state situation, whose value either exceeds a threshold value or is optimal under the given opportunities. The value of the desired purposeful state situation is determined, and a tree of goals and action modes that instantiates the subject's interests is constructed to achieve the desired situation.
3. If execution of paragraph 2 does not lead to the desired state, the ability to achieve the stated goals by means of some structural alternatives is determined. If this is not possible, a problem is stated and a project to overcome it is drawn up by means of relevant research and development, with the aim of (a) extending the set of structural alternatives to G↑ and the set of action modes to C↑; (b) relaxation of restrictions, reduction of rates, etc. To do this, the tendency of evolution in the space of parameters and the respective step size are determined.
4. The implementation plan for the chosen structural alternatives g′ ∈ G↑ and the extension plan for the action mode set C′ ≡ C↑ are worked out.
5. The stimulating magnitude is determined to create a certain level of the agent's motivation.
6. Control actions in the forms specified in paragraphs 4-5 are carried out for the purposeful state situation.
We consider this scheme, which results from the assumptions postulating the motivation of purposeful behavior, as a control scheme of purposeful evolution, determined by the subject's drive to survive, to keep the level attained, or to dominate.
According to this scheme, a subject makes control decisions depending on two types of conditions: 1) exogenous (objective), generated by the environment dynamics and the object of interests; 2) endogenous (subjective), arising from the subject's interests.
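The first three steps of the scheme above can be sketched in code. This is only an illustrative sketch under stated assumptions: the function names (`diagnose`, `rediagnose`, `satisfaction`, `value`), the two thresholds and the outcome labels are hypothetical and are not taken from the paper; steps 4-6 (planning, stimulation, control actions) are left out.

```python
from typing import Hashable, NamedTuple, Optional


class Decision(NamedTuple):
    situation: Hashable
    action: Optional[Hashable]
    structure: Optional[Hashable]
    outcome: str


def choose_action_mode(state, diagnose, rediagnose, satisfaction, value,
                       actions, structures, current_structure,
                       conf_threshold=0.7, sat_threshold=0.5):
    """Hypothetical sketch of steps 1-3 of the choice scheme.

    diagnose(state)      -> (situation, confidence in [0, 1])
    rediagnose(state)    -> situation (more careful diagnostics)
    satisfaction(x)      -> satisfaction with the current situation x
    value(x, c, g)       -> value of the situation reached from x by action c
                            under structural alternative g
    actions(x)           -> feasible action modes in situation x
    """
    # Step 1: assign a situation to the state; re-diagnose if conviction is low.
    x, confidence = diagnose(state)
    if confidence < conf_threshold:
        x = rediagnose(state)
    if satisfaction(x) >= sat_threshold:
        return Decision(x, None, current_structure, "satisfied")

    def best_for(g):
        feasible = list(actions(x))
        if not feasible:
            return None, float("-inf")
        c = max(feasible, key=lambda a: value(x, a, g))
        return c, value(x, c, g)

    # Step 2: pick the best action mode under the current preference structure.
    c, v = best_for(current_structure)
    if v >= sat_threshold:
        return Decision(x, c, current_structure, "action selected")

    # Step 3: try the available structural alternatives.
    for g in structures:
        c, v = best_for(g)
        if v >= sat_threshold:
            return Decision(x, c, g, "structure changed")

    # No alternative reaches the desired situation: a problem is stated.
    return Decision(x, None, current_structure, "problem stated")
```

In this sketch a single threshold plays the role of both the satisfaction threshold and the value threshold of the desired situation; in a fuller model these would probably be separate quantities.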
It is evident that, for a purposeful choice of action mode to be possible, there must be given a set of alternatives C, a set of situations X on which the choice of alternatives depends, and preferences over the elements of the set C that allow an agent to compare alternatives and to choose, in some sense, the "best" of them. As shown in [1], for an agent the utility function means the evaluation of a specific value of the purposeful state situation. It is found that preferences in the case of subjective rational choice can be represented in a unique way, by evaluating the specific value of the purposeful state situation based on the result, Eϕ(•); these evaluations should be read as values of a utility function. This function represents the agent's a priori internal preferences over control alternatives, depending on the state and the idea, as well as on his values and norms set by the ethical system. It is defined on the action mode alternatives for situation x∈X and state s∈S. In addition, an agent is interested in selecting a preference structure from a given set of structural alternatives G. In this case the utility function will naturally also depend on the structural alternative g∈G, but only as a parameter.
From these considerations it follows that utility expresses a priori preferences over control alternatives c∈C. The conceptual meaning and representation of the utility function for agents with subjective rational behavior are considered in detail in [1]. If the choice is made under conditions of uncertainty, dynamics and weak structuring of the environment, the choice model involves assumptions and disambiguation rules, which are typically referred to as hypotheses of determinism. Different ways of eliminating uncertainty are considered in the works of D.A. Novikov [3].
Since the model of the purposeful state situation is the basis for the agent's choice, its possible versions should be considered as alternatives for diagnosing the environment state. The model is chosen by the criteria of utility and quality. The utility function Eϕ(•) can serve as a basis for determining the meaning and structure of the required criterion. Indeed, suppose that for each state s∈S there is a pair (x*, c*) ∈ X × C at which the utility function attains its maximum. Now suppose that, for some reason, a purposeful state situation model x∈X is chosen as the diagnostic result in state s∈S and a controlling action c∈C is selected accordingly. The diagnostic quality will depend on the state identification procedures used; they can be considered as a means of forming ideas. For example, when the uncertainty is stochastic, it is natural to use the mathematical expectation of the utility loss function as a diagnostics criterion. It is usually called "risk" [4]. Thus, in the purposeful choice scheme, the quality of control is to be described by a criterion with the meaning of the expected specific value of the purposeful state situation based on the results, and the quality of diagnostics by a criterion with the meaning of risk. The rules chosen will then be interdependent in a certain way. In these circumstances, the problem of choice has a game content, and its "best" solution is to construct a sustainable compromise ("equilibrium" [5]) between achieving the maximum expected utility and the minimum risk. The search for and use of such an equilibrium should be regarded as an internal aim in selecting the "best mode of action." Taking into account the considerations above and in the context of the assumptions introduced, we can extend the concept of choice with the following key provisions.
1. When making assumptions about the agent's behavior, one can consider observation and perception of the state to be a necessary but insufficient condition for making a choice.
2. The sufficient condition for making a choice is determined by specifying the relation of a subject to the state through the purposeful state situation.
3. Since only state parameters are accessible to direct observation, it is necessary to perform diagnostic procedures, whose purpose is to choose a situation model depending on the observation results.
4. The action mode choice is performed by the criterion of the expected specific value of the purposeful state situation.
5. The selection of diagnostics rules is carried out by the risk criterion.
6. The problem of choosing control and diagnostics rules has a game content; its "best" solution consists in constructing a sustainable compromise known as "equilibrium".
7. When one is choosing the "best mode of action", the construction and use of balanced rules of control and diagnostics is an inner aim.
The provisions formulated define the concept of choice made by agents with the endogenously motivated behavior.
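Provisions 4-6 can be illustrated by a small numerical sketch. The paper does not fix the equilibrium concept beyond the reference [5]; the sketch below assumes, purely for illustration, a pure-strategy equilibrium in which the control side maximizes expected utility for a given diagnostics rule and the diagnostics side minimizes risk for a given control rule. The tables `utility` and `risk` are made-up numbers, and how they would be computed from β(S) and Eϕ(•) is outside the sketch.

```python
from itertools import product


def pure_equilibria(utility, risk):
    """Find pairs (d, r) of rule indices that neither side wants to leave.

    utility[d][r]: expected utility when diagnostics rule d and control rule r
                   are used; the control side maximizes it over r.
    risk[d][r]:    risk for the same pair; the diagnostics side minimizes
                   it over d.
    """
    n_d, n_r = len(utility), len(utility[0])
    found = []
    for d, r in product(range(n_d), range(n_r)):
        # Control rule r is a best reply to diagnostics rule d.
        best_control = all(utility[d][r] >= utility[d][k] for k in range(n_r))
        # Diagnostics rule d is a best reply to control rule r.
        best_diagnostics = all(risk[d][r] <= risk[k][r] for k in range(n_d))
        if best_control and best_diagnostics:
            found.append((d, r))
    return found


# Hypothetical 2 x 2 example: rows are diagnostics rules, columns control rules.
utility = [[3.0, 5.0],
           [4.0, 2.0]]
risk = [[1.0, 0.5],
        [0.8, 2.0]]
equilibria = pure_equilibria(utility, risk)
```

With these numbers the pairs (0, 1) and (1, 0) are both stable, which shows that such a compromise need not be unique and that an additional selection rule may be needed.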

Decision-making with Regard to Structural Alternative
Let us determine the conditions under which decisions are to be made. To this end we introduce additional assumptions that refine the list given above.
According to the assumptions introduced, the object of our interest may be active and dynamic. If its evolution is described by a rule in which f is a fuzzy function, then the corresponding system is a fuzzy system: its state at moment t+1 is conditional on x_t, and x_t is a fuzzy set characterized by a membership function. Then we can assume that the evolution of the object is described by a fuzzy Markov process. Without loss of generality it can be assumed that the set of its states S is ordered in some way. For example, an a priori possibility distribution β(S) over the occurrence of particular states s∈S is assigned.
According to the conception of purposeful control, control alternatives are chosen in the purposeful state situation, which defines the relationship of a subject to a state. Let us denote the set of such situations by X and assume it is finite. The cardinality of X cannot exceed the cardinality of the state set, i.e. the condition |X| ≤ |S| should be satisfied.
According to the assumptions introduced, the motivation of choice is determined by a subject's interest in the progressive evolution of the object, which he pursues by selecting the modes of action and the structure of preferences; this determines two aspects of his interests. The conception of purposeful choice introduces a third aspect of a subject's interest, connected with the necessity to diagnose the situation depending on the observed state.
It is natural to assume that for each aspect of interests there are alternative implementation options, with appropriate sets of feasible alternatives being given. Then, following the conception of purposeful choice, we assume that for the aspect of interests related to state evolution control there is a set of alternative control actions C, or simply control alternatives. Since control alternatives are selected depending on situations x∈X, it is natural to assume that there are limits on the feasibility of control alternatives depending on the situation x∈X. It is natural to specify such limits by inclusions of the type C_x ⊆ C.
As shown above, the purposeful state situation exists in an agent's consciousness in the form of an environment model that depends on the state s∈S. Let us assume that the agent's awareness of the environment state varies in degree and is described by some set X_s. It is natural to assume that the condition X_s ∩ X ≠ ∅ holds for each s∈S.
Finally, according to the assumptions introduced, a subject is interested in choosing a structure of preferences from a given set G of alternative options called structural alternatives. It is clear that the evolution of an agent's state is determined by the chosen preference structure. At the same time, an agent seeks to implement a progressive dynamic evolution of the state, according to his own ideas. In this case, structural alternatives should be chosen as a parameter common to states and situations, which determines the pattern of state dynamics in one of two ways: either systematically, at each step, or so that the chosen structural alternative remains constant over the given horizon of the interest's existence.
A subject can monitor a state and make control decisions only at discrete points in time. Taking this into account, we will regard the choice process as a systematic decision-making process with discrete time.
Applying control actions at discrete instants of time generates a controlled Markov process with discrete time and a set of states S. The dynamics of this process is determined by a transition function defining the probability of a one-step transition on the set of states, depending on the choice of control alternatives c∈C. It also depends on the structural alternative g∈G, but as a parameter. Let us denote the transition function of the controlled process by q_g(S | S × C). According to the conception of choice, there must also be specified the utility function Eϕ(•) representing a priori preferences over control alternatives c∈C.
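The controlled process described above can be simulated in a few lines. The following is a minimal sketch, not the authors' implementation: the dictionary layout of the transition function `q`, the state, situation and action labels, and the `diagnose` and `control` rules are all hypothetical. It only shows the chain "state, then situation, then control alternative, then next state", with the structural alternative g acting as a parameter of the transition function.

```python
import random


def simulate(s0, diagnose, control, q, g, horizon, seed=0):
    """Simulate a controlled discrete-time Markov process.

    q[g][(s, c)] is a dict {next_state: probability}: a hypothetical
    tabular encoding of the transition function q_g(S | S x C).
    diagnose(s) maps a state to a situation x; control(x) maps a
    situation to a control alternative c.
    """
    rng = random.Random(seed)
    path = [s0]
    for _ in range(horizon):
        s = path[-1]
        x = diagnose(s)           # diagnostics: state -> situation
        c = control(x)            # control rule: situation -> action mode
        dist = q[g][(s, c)]       # one-step transition distribution
        next_states = list(dist)
        weights = [dist[n] for n in next_states]
        path.append(rng.choices(next_states, weights=weights, k=1)[0])
    return path


# Hypothetical two-state example.
def diagnose(s):
    return "unsatisfactory" if s == "low" else "satisfactory"


def control(x):
    return "act" if x == "unsatisfactory" else "wait"


q = {
    "g1": {
        ("low", "act"): {"high": 0.7, "low": 0.3},
        ("high", "wait"): {"high": 0.8, "low": 0.2},
    }
}
path = simulate("low", diagnose, control, q, "g1", horizon=10, seed=1)
```

Switching `g` at every step would correspond to a tactical structural alternative, while keeping it fixed over the whole horizon, as here, corresponds to a strategic one.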
These arguments refine the choice assumptions into the following conditions of control decision-making.
1. The evolution of an object of interest is described by a Markov process in the state space S. For the set S an a priori probability distribution β(S) is given.
2. A set of situations X is given; situations are qualitative characteristics representing the a priori relation of a subject to a state. They require diagnostics, which consists in choosing a situation from the set X depending on the state. Idea models are to satisfy the condition X_s ∩ X ≠ ∅, s ∈ S.
3. A set of control alternatives C is given. Control alternatives are chosen depending on the situation x∈X; the limits C_x ⊆ C on the feasibility of control alternatives depending on situations x∈X are given.
4. The structure of preferences is not fixed and can be chosen from a given set of structural alternatives G. They are chosen according to the system of values and norms and serve to ensure the agent's progressive evolution, as subjectively understood by him.
5. A structural alternative can be chosen either systematically at each moment n = 1, 2, ..., in which case it is tactical, or it may remain constant throughout a given time horizon, in which case it is strategic.
6. The implementation of the chosen action mode generates a controlled Markov process with discrete time and transition function q_g(S | S × C) from S × C to S, which depends on the structural alternative g∈G as a parameter.
7. A utility function Eϕ(•) is given, representing a priori preferences on the set of control alternatives C. The utility function depends on the structural alternative g∈G as a parameter.
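The set-theoretic conditions in this list are easy to check mechanically. The sketch below is an assumption-laden illustration: the class name `InformationStructure`, its field names and the string labels are hypothetical, and only the three constraints stated in the text (|X| ≤ |S|, X_s ∩ X ≠ ∅ for each s, and C_x ⊆ C for each x) are verified.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass
class InformationStructure:
    S: FrozenSet[str]                 # environment states
    X: FrozenSet[str]                 # situations
    C: FrozenSet[str]                 # control alternatives
    G: FrozenSet[str]                 # structural alternatives
    beta: Dict[str, float]            # a priori distribution on S
    X_s: Dict[str, FrozenSet[str]]    # ideas available in state s
    C_x: Dict[str, FrozenSet[str]]    # alternatives feasible in situation x

    def violations(self) -> List[str]:
        """Return a list of violated conditions (empty if all hold)."""
        problems = []
        if len(self.X) > len(self.S):
            problems.append("|X| exceeds |S|")
        if set(self.beta) != set(self.S):
            problems.append("beta is not defined on exactly S")
        for s in sorted(self.S):
            if not (self.X_s.get(s, frozenset()) & self.X):
                problems.append(f"X_s and X do not intersect for s={s}")
        for x in sorted(self.X):
            if not self.C_x.get(x, frozenset()) <= self.C:
                problems.append(f"C_x is not a subset of C for x={x}")
        return problems


structure = InformationStructure(
    S=frozenset({"low", "high"}),
    X=frozenset({"unsatisfactory", "satisfactory"}),
    C=frozenset({"act", "wait"}),
    G=frozenset({"g1", "g2"}),
    beta={"low": 0.4, "high": 0.6},
    X_s={"low": frozenset({"unsatisfactory"}),
         "high": frozenset({"satisfactory"})},
    C_x={"unsatisfactory": frozenset({"act"}),
         "satisfactory": frozenset({"act", "wait"})},
)
issues = structure.violations()
```

The transition function and the utility function are deliberately omitted from this container; they would be added as callables or tables once their form is fixed.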

Information Structures in Decision-making
The formal descriptions introduced determine not only the conditions of decision-making but also the carriers of a priori information. Together, they form the following set of formal objects: S is the set of environment states; β(S) is the a priori possibility distribution on the state set; X is the set of situations; X_s ∩ X ≠ ∅ is the limitation ensuring the presence of "right" ideas among the diagnostics alternatives, depending on states s∈S; C is the set of control alternatives; C_x ⊆ C is the limitation on the feasibility of control alternatives, depending on situations x∈X; G is the set of structural alternatives; Eϕ(•) is the utility function representing a priori preferences over alternatives c∈C, depending on states s∈S, situations x∈X and structural alternatives g∈G.
This set defines an a priori information structure which is to be set according to decision-making rules.
The peculiarity of this information structure is that it must specify both states and situations, while the control actions are chosen depending on situations which, being qualitative characteristics representing the relationship to a state, are inaccessible to direct observation and require prior diagnostics.
Under these conditions, the regularity of situation dynamics cannot be set a priori. Therefore, the decision-making rules require specifying only the laws of state dynamics, defined by the transition function q_g(S | S × C) from S × C to S. In this sense, the information given a priori is minimal. When a priori information is deficient, even this minimum structure can be incomplete. Then it is necessary to introduce plausible assumptions (in the form of a hypothesis set Γ) that allow the problem to be formulated as an approximation of the original task [6]. Let us assume, for example, that the transition function is not given in the basic information structure, but there is a set Γ of hypotheses about it. Then formally it can be assumed that the transition function depends on some parameter γ taking values in the given set Γ, while the true value of the parameter is unknown.
Obviously, this will also require choosing the, in some sense, "best" hypothesis about the transition function. At the same time, the completeness of the expanded information structure can be established only from the final results of studying the problem.
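The text does not say how the "best" hypothesis γ ∈ Γ is to be selected. One possible reading, given here only as an assumption, is to score each hypothetical transition function by the log-likelihood of the transitions actually observed and to keep the highest-scoring one. The hypothesis names and probabilities below are made up for illustration.

```python
import math


def log_likelihood(transitions, q_gamma):
    """Log-likelihood of observed (s, c, s_next) triples under one hypothesis.

    q_gamma[(s, c)] is a dict {next_state: probability}.
    """
    total = 0.0
    for s, c, s_next in transitions:
        p = q_gamma.get((s, c), {}).get(s_next, 0.0)
        if p == 0.0:
            # The hypothesis rules out something that was observed.
            return float("-inf")
        total += math.log(p)
    return total


def best_hypothesis(transitions, hypotheses):
    """Return the key gamma whose transition function fits the data best."""
    return max(hypotheses,
               key=lambda gamma: log_likelihood(transitions, hypotheses[gamma]))


# Hypothetical hypothesis set with two candidate transition functions.
hypotheses = {
    "optimistic": {("low", "act"): {"high": 0.9, "low": 0.1}},
    "pessimistic": {("low", "act"): {"high": 0.2, "low": 0.8}},
}
observed = [("low", "act", "high"), ("low", "act", "high"), ("low", "act", "low")]
chosen = best_hypothesis(observed, hypotheses)
```

Other criteria, for example a guaranteed-result (worst-case) choice over Γ, would fit the same interface by replacing the scoring function.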

Game Approach to Formalization of the Choice Problem
The assumptions of choice specify the existence of two aspects of an agent's interests, one determined by his interest in controlling the object's evolution in a desirable direction, and the other by the choice of a preference structure. The purposeful control concept defines the third aspect of interests, associated with the need to diagnose the situation depending on the observed state. In accordance with these three aspects, three sets of alternatives are to be assigned: the set C of action modes, the set G of structural alternatives and the set X of diagnostic alternatives. The utility function Eϕ(•) is also assumed to be assigned. Assigning these objects makes it possible to form a quality criterion for action mode choice which, as postulated by the situational control concept, has the meaning of expected utility, and a quality criterion for the choice of the purposeful state situation model, which has the meaning of risk. These criteria are obviously different and in a way interdependent. It is natural to presume that a corresponding quality criterion can also be introduced for choosing structural alternatives. It differs from the other criteria and in some way depends on the choice of the other alternatives. It is commonly known that in similar conditions the problem of mode choice has a game content [6]. Then each set of alternatives can be formally linked with a party concerned (a player), whose interests are related to the choice of alternatives from the corresponding set according to their individual quality criterion. Within its set of alternatives every party has freedom of choice. Since the interests of each party represent a certain component of the agent's interests, the parties are to comply with the agent's interests, which are common to them, when selecting alternatives. Therefore the problem of mode choice acquires a game content in relation to corporate interests [6], and the subject of interest plays the role of a center.
He can accept the proposed trade-off alternative if it can hardly be improved without infringing at least one component of interests. The compromise that meets this requirement will be called a "corporate stable equilibrium."
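The requirement that a trade-off cannot be improved without infringing at least one component of interests resembles Pareto non-dominance. The sketch below assumes this interpretation, which the paper does not state explicitly, and filters a hypothetical list of candidate triples (action mode, structural alternative, situation model) scored by three criteria. All scores are oriented so that larger is better, so risk enters with a minus sign.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere
    and strictly better somewhere (larger is better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def compromise_set(candidates):
    """Keep the alternatives that no other alternative dominates.

    candidates maps an alternative to a tuple
    (expected utility, -risk, structural quality).
    """
    return [
        key for key, score in candidates.items()
        if not any(dominates(other, score)
                   for other_key, other in candidates.items()
                   if other_key != key)
    ]


# Hypothetical scores for three candidate triples (c, g, x).
candidates = {
    ("c1", "g1", "x1"): (5.0, -0.5, 1.0),
    ("c2", "g1", "x1"): (4.0, -0.8, 1.0),
    ("c1", "g2", "x2"): (3.0, -0.2, 2.0),
}
acceptable = compromise_set(candidates)
```

Here the second triple is dropped because the first is at least as good on every criterion; the center would then choose among the remaining trade-offs.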

Conclusions
The problem of building a choice model of an agent who endogenously forms the aims of his evolution has been examined. It is shown that its solution requires developing methods that take into account the dependence of decision-making motivation on the agent's commitment to implement subjectively understood interests and on his understanding of the environment state. The latter is proposed for consideration in the form of a purposeful state situation model that exists only in the agent's mind. This model is the basis for forming ideas of the possible outcomes of the chosen action mode. The agent's striving to build confidence in the reliability of his action mode and in the possibility of achieving the desired state requires him to use procedures for forming ideas based on the results of measuring the environment state. It has been demonstrated that an agent has to choose the action mode by the criterion of the expected specific value of the purposeful state situation based on the result, and diagnostic procedures by a risk criterion, with the choice acquiring a game content and being exercised with the help of trade-off alternatives.