IMATT: An Integrated Multi-Agent Testing Tool for the Security of Agent-Based Web Applications

This paper presents IMATT, an integrated multi-agent testing tool. The tool comprises a static analyzer, a dynamic tester and an integrator of the two components, and it detects security vulnerabilities and errors in agent-based web applications written in Java. The static analysis component analyzes the source code of the web application to identify the locations of security vulnerabilities and displays them to the programmer. Dynamic testing of the web application is then carried out. Here, a temporal-based assertion language is introduced to help in detecting security violations (errors) in the underlying application. The proposed language has operators for detecting SQL injection and cross-site scripting (XSS) security errors. The dynamic tester consists of two components: an instrumentor (preprocessor) and a run-time agent. The instrumentor has several modules that have been implemented as software agents in Java under the control of a multi-agent framework. The agents of the instrumentor are: static analyzer agent, parser agent, and code converter agent. Moreover, an integrator for combining the static and dynamic analyses is employed. Finally, the implementation details of IMATT are reported.


Introduction
Web applications represent a considerable share of software products. Such applications are continuously enhanced using various software technologies. This enhancement has led to web applications that are based on multi-agent systems to provide: 1) user friendliness, 2) intelligent search and 3) better communications. Unfortunately, those web applications are subject to different attacks. This paper presents an Integrated Multi-Agent Testing Tool, IMATT, to facilitate static and dynamic testing procedures for finding security flaws, if any. The majority of software testing tools are generic [2,23,25] in the sense that they work independently of the style of the program under test. However, Centonze et al [2] have recently presented a tool named ACE for testing component-based programs, where the peculiarities of the program components are considered.
Here we go a step further in this direction: IMATT extends ACE and introduces an agent-based tool for testing large agent-based Web applications (which are beyond component-based programs) against security flaws. IMATT offers the following pragmatic advantages:
1. IMATT is homogeneous in the sense that both static and dynamic components are model based: the static analysis model is based on a set of grammar rules, while the dynamic analysis model is based on temporal logic assertions in addition to a set of behavioral dynamic responses.
2. Integration of static and dynamic analysis, via path concatenation, enables the discovery of both intra and inter vulnerabilities.
3. Web applications allow intervention; consequently, different scenarios for the same application can be generated. It is essential to check the liveness of each scenario in order to guarantee the application's ability to reach its goal. This is carried out by making use of temporal logic formalism.
Agent-based web applications (Figure 1) can be attacked, and consequently protected, at various levels. To be specific and to clarify the scope of IMATT, the levels of a multi-agent system (MAS) web application are pointed out as follows.
Figure 1. Agent-based web application (components shown: Web Application, GADE-Based Agent(s), Network Node)
1. Network node (site) level: where both attacks and protection mechanisms are network oriented; they are out of the scope of this paper.
2. MAS environment (GADE) level: where malicious agent(s) could be introduced to attack the web application. At that level, agent security is the responsibility of GADE-S, which provides authentication, authorization and integrity. Accordingly, IMATT is not involved.
3. Web application level: where IMATT is utilized to check out the underlying application.
Thus IMATT is a special-purpose security testing tool that offers:
• The close-fitting testing approach [1].
• Soundness (from static analysis), precision (from dynamic analysis) and flexibility, the latter by using a group of GADE agents to build the instrumentor.
• Easy inclusion in a continuous testing integration process [2,3,4,5,6], where iterating first one analysis, then the other is more powerful than performing either one in isolation [7].
There is common agreement that attacks aimed at web applications represent most of today's attacks [8]; therefore the major types of such attacks are considered here. Namely, SQL injection [9,10,11] and cross-site scripting, XSS [12,13,14], are adopted for their popularity; however, many other attacks could be illustrated in the same manner.
The rest of the paper is organized as follows. Section two is concerned with the related work, while section three is concerned with the proposed architecture of IMATT. The implementation and testing of the tool are discussed in section four. Section five is concerned with the conclusion.

Related Work
Currently, there are several generic tools such as NuSMV, FDR2, ITS4, CHESS and NESSUS that could be exploited for program (code) analysis. Although they are widely used, such tools will not be considered here because they lack integration and their application domain is different. To be specific, IMATT will be related only to the class of tools that:
• Combine both static and dynamic code analyses.
• Can be applied to Web applications written in Java or an equivalent language.
• Are devoted primarily to detecting security vulnerabilities.
• Perform either model checking or any other sound approach to reach a decision.
Centonze et al [2] have presented a proposal for combining static and dynamic analysis for automatic determination of database access control policies. Their tool can be applied to programs that are executed on stack-based access control systems such as Java. In their proposal the static analysis models the execution of the program, taking into account native methods, reflection and multi-threading. In addition, the dynamic analysis can refine the potentially conservative results of the static analysis. The authors have implemented their analysis framework in a tool called Access Content Explorer, ACE. This tool allows automatic and precise identification of access-right requirements and of library code locations that should be made privilege-asserting to prevent any client code from requiring extra access rights.
An extension to the well-known tainted-mode model has been presented by Petukhov et al [8] to enable the detection of inter-module vulnerabilities. The authors have applied their proposal to web applications using dynamic analysis with penetration testing. Their automatic analyzer avoids the drawbacks of the manual code review recommended by OWASP (Open Web Application Security Project). The main contributions of that analyzer are:
• Improvement of the classical tainted-mode model so that inter-module data flows can be checked.
• Automatic penetration testing that leverages information from the dynamic testing output.
Livshits et al [15] have exploited a Program Query Language to build a static analyzer for finding security flaws in Java applications. Moreover, the authors have extended their work to include both static and dynamic techniques to check the underlying queries. The static analyzer given by Livshits et al [15] finds the potential matches conservatively using a context-sensitive, flow-insensitive, inclusion-based pointer alias analysis. In addition, their dynamic analyzer instruments the source program to catch security violations when the program runs and to perform user-specified actions. By making use of these techniques, an analyzer has been designed and implemented to detect security flaws, resource leaks and violations of the predefined rules.
In their recent work Keromytis et al [6] have presented MINESTRONE, an architecture that integrates static analysis, dynamic confinement and code diversification techniques to enable the identification of vulnerabilities in third-party software. In its present form MINESTRONE is written in C/C++ and it seeks to:
• Enable the immediate deployment of new software, and
• Enable the protection of legacy software.
The authors' approach is to insert extensive security instrumentation while leveraging program analysis that is aided by runtime data. Diversification techniques are used as confinement mechanisms that may achieve software fault isolation.
The fundamental problem being addressed by MINESTRONE is finding vulnerabilities in the underlying software. Its key idea to realize this goal is to make use of the static analysis to allow reliable instrumentation, while runtime data provides a focus on portions of the code that are heavily exercised or otherwise considered security critical.
The tool Apollo has been discussed by Artzi et al in [16]. It aims at finding bugs in Web applications using dynamic testing and explicit-state model checking. The proposed technique generates tests automatically, runs the tests while capturing logical constraints on the inputs, and reduces the conditions on the inputs to failing tests [16]. Thus Apollo provides test inputs for the underlying application and validates that the output conforms to the predefined specification.
In all of the above mentioned tools no agents are considered or involved in either the Web application or the error-checker. In addition the integration process is always implicit.

Proposed Architecture of IMATT
This tool aims at finding both static and dynamic vulnerabilities in Web applications. Static vulnerabilities [9,12] include SQL injection and cross-site scripting (XSS), while dynamic vulnerabilities are checked via code coverage analysis using various metrics. The two approaches are similar in that they are model-based, i.e. in both of them vulnerability conditions are formally specified. The dynamic tool takes the locations of the vulnerabilities reported by the static tool and monitors whether security violations occur during execution of the web application (Figure 2).

Static Vulnerabilities
Once malicious data has entered a Web application an attacker can use one of the following techniques (among others) to accomplish the expected breach.

SQL Injection
It is one of the well-known security vulnerabilities found in Web applications. It is caused by unchecked user input being passed to a back-end database. The attacker may embed SQL commands into the data sent to the application.
Many SQL injections can be practically avoided with the use of better APIs. In addition, J2EE provides PreparedStatement (java.sql.PreparedStatement), which allows an SQL statement template to be specified with placeholders that indicate the statement parameters.
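As a minimal sketch of this difference, the following Java fragment contrasts a query assembled by string concatenation with the same query issued through java.sql.PreparedStatement. The class name QueryStyles, the table name employees and the column name firstname are illustrative assumptions and are not taken from IMATT.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper class; the table and column names are assumptions.
final class QueryStyles {
    // Parameterized template: the '?' marks a statement parameter.
    static final String TEMPLATE = "SELECT * FROM employees WHERE firstname = ?";

    // Vulnerable style: the user input becomes part of the SQL text itself.
    static String concatenated(String firstName) {
        return "SELECT * FROM employees WHERE firstname = '" + firstName + "'";
    }

    // Safer style: the input is bound as data and is not parsed as SQL.
    static int countMatches(Connection conn, String firstName) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(TEMPLATE)) {
            ps.setString(1, firstName);
            try (ResultSet rs = ps.executeQuery()) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }
}
```

With the concatenated style, an input containing a quote changes the structure of the statement, whereas the template keeps the statement fixed and only the bound value varies.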

Cross-Site Scripting, XSS
It occurs when dynamically generated Web pages display input that has not been properly validated [12]. An attacker may hide malicious JavaScript code in such pages. When executed on the user's machine, these scripts can compromise the user's account credentials. At the application level, echoing the application input back to the browser enables cross-site scripting.

Static Analysis
In its general form the static analysis problem should include the object propagation problem [18,19,20,21] with three types of descriptors: source descriptors, destination descriptors and derivation descriptors.
Source descriptors have the form <m,n,p> and specify ways in which user data can enter the program, where m is a source method, n is a parameter number and p is an access path to be applied to argument n to obtain the user-provided input. Destination descriptors have the same form, where m is a destination method, n is an argument number and p is an access path to be applied to that argument.
Derivation descriptors have the form <m,ns,ps,nd,pd> and specify how data propagates between the program objects. In this case, m represents a derivation method; a source object is given by argument number ns and access path ps, and a destination object is given by argument number nd and access path pd. Such a descriptor specifies that at a call to method m, the object obtained by applying pd to argument nd is derived from the object obtained by applying ps to argument ns. In the absence of derived objects, to detect potential vulnerabilities it is only necessary to know whether a source object is used at the destination.
Derivation descriptors are used to handle the semantics of Java strings. Because strings are immutable Java objects, string manipulation routines (concatenation in the underlying case) create new string objects whose contents are based on the original string objects. Most Java programs use built-in string libraries and consequently share the same set of derivation descriptors [18].
The needed generalization may be achieved by making use of a simple syntax analyzer (parser) for Datalog queries to allow users to express vulnerability patterns in a friendly manner. Therefore, that approach is relied upon in IMATT, as explained in the following.
It should be noted that the proposed approach does not replace the available Java security APIs and J2EE; instead it provides an effective extension to them to handle uncovered cases.
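The role of the three descriptor kinds can be illustrated with a deliberately simplified sketch. It assumes a straight-line trace of calls and drops argument numbers and access paths, so it is not the analysis used by IMATT; the class TaintSketch and the method names listed in the three sets are placeholders chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified illustration: argument numbers and access paths are omitted.
final class TaintSketch {
    // One call in a straight-line trace: method name, object passed in, object produced.
    static final class Call {
        final String method;
        final String input;
        final String output;

        Call(String method, String input, String output) {
            this.method = method;
            this.input = input;
            this.output = output;
        }
    }

    // Placeholder descriptor sets (assumptions, used only as labels).
    static final Set<String> SOURCES =
            new HashSet<>(Arrays.asList("HttpServletRequest.getParameter"));
    static final Set<String> DERIVATIONS =
            new HashSet<>(Arrays.asList("String.concat", "StringBuilder.append"));
    static final Set<String> DESTINATIONS =
            new HashSet<>(Arrays.asList("Statement.executeQuery"));

    // Returns the destination calls that receive an object derived from a source.
    static List<String> findFlows(List<Call> trace) {
        Set<String> tainted = new HashSet<>();
        List<String> flows = new ArrayList<>();
        for (Call c : trace) {
            if (SOURCES.contains(c.method)) {
                tainted.add(c.output); // user data enters the program
            } else if (DERIVATIONS.contains(c.method) && tainted.contains(c.input)) {
                tainted.add(c.output); // new string derived from a tainted one
            } else if (DESTINATIONS.contains(c.method) && tainted.contains(c.input)) {
                flows.add(c.method + "(" + c.input + ")"); // tainted data reaches a destination
            }
        }
        return flows;
    }
}
```

Without the derivation step, the concatenated query string would not be marked as tainted and the flow into the destination would be missed, which is why derivation descriptors are needed for immutable strings.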

Dynamic Analysis
In order to detect security violations during the execution of Web applications, an assertion language has been proposed. It is based on temporal logic to help in detecting security errors within a scope of the Web application. In addition, we have built a dynamic testing tool to instrument assert statements and detect security violations. In what follows the temporal assertion language is discussed.

Temporal Assertion Language
In order to detect the run-time security vulnerabilities and errors that occur in Web applications, we introduce a special language based on temporal logic. We describe this language using Backus-Naur Form (BNF).
In this language we use the temporal logic operators (Always, Next, Eventually, Until). The language also has two further operators for detecting security vulnerabilities (SQL, XSS).
As shown in (Figure 3), our Java-based assertion language has six temporal assert statements [Always, Eventually, Next, Until, XSS, SQL]. All of these assert statements (except Next) are coupled with end-assert statements, thus enabling the tester to control the scope of the assert statement. The semantics of a temporal assert statement is determined by the temporal operator chosen (Always, Next, Eventually, Until, SQL or XSS), and the choice depends on the type of error to be detected. For example, to ensure that some variables never equal zero within the scope of certain code, we use the Always operator; to check whether an input field contains an SQL injection, we use the SQL operator. The semantics of these operators is pointed out in the following.
1) Always (safety) properties: A temporal expression of the form // 1.1.A Assert [ ] (W) specifies that W is always true during the scope of the always assert statement. Note that the assert statement starts with a double slash, followed by a label, followed by the Assert keyword and finally the condition (W).
2) Eventually (liveness) properties: The eventually operator (~), of the form // 1.1.A Assert ~ (W), is used to test that a specific condition (W) is satisfied at least once during the scope of the eventually assert statement.
3) Precedence properties: The until (U) temporal operator, of the form // 1.1.A Assert T1 U T2, can be used to assert that task T1 will start when task T2 finishes. This property can be used to check race conditions.
4) SQL properties: The SQL temporal operator has the form // 1.2.A Assert SQL (variables). This property is used to ensure that the variables in the form are not injected with an SQL attack.
5) XSS properties: The XSS temporal operator has the form // 1.2.A Assert XSS (variables). This property is used to ensure that the variables in the form are not injected with an XSS attack.
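The scope-based meaning of the Always and Eventually operators can be sketched as a small run-time monitor. This is only an illustration of the semantics described above, under the assumption that the condition W is sampled at a number of points inside the scope; the class ScopeMonitor is hypothetical and is not the code that IMATT generates.

```java
// Hypothetical run-time monitor for one assert scope; not IMATT's generated code.
final class ScopeMonitor {
    private boolean alwaysHeld = true;      // stays true while every observation satisfies W
    private boolean eventuallyHeld = false; // becomes true once any observation satisfies W

    // Records one observation of the condition W inside the assert scope.
    void observe(boolean w) {
        alwaysHeld = alwaysHeld && w;
        eventuallyHeld = eventuallyHeld || w;
    }

    // Always [ ] (W): W held at every observed point of the scope.
    boolean always() {
        return alwaysHeld;
    }

    // Eventually ~ (W): W held at least once before the scope ended.
    boolean eventually() {
        return eventuallyHeld;
    }
}
```

A safety violation can be reported as soon as always() turns false, whereas a liveness verdict from eventually() is only meaningful once the end-assert statement has been reached.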

The Architecture of the dynamic testing tool
This section introduces the architecture of the dynamic tool. The programmer adds temporal assert statements to the source code of the agent-based web application at the positions where errors are expected. The agent-based instrumentor consists of a set of agents. The agents detect the assert statements in the web application under test and convert each one to the corresponding Java statements. The basic components of our dynamic testing tool are presented in (Figure 4).

Agent Based Lexical Analyzer
The agent-based lexical analyzer reads the Java source file, which has the temporal assert statements within the source code. This agent then tokenizes the file into a set of tokens, which are sent to the agent-based parser. The pseudo code of the lexical analyzer agent is shown in (Figure 5).
World Journal of Computer Application and Technology 1(2): 19-28, 2013

Agent Based Parser
The parser reads the tokens and then decides whether they form Java statements or assert statements. Java statements are written to the destination file, which contains only the Java source code without the temporal assertions. Otherwise, if a statement starts with a double slash followed by the Assert keyword and one of the temporal logic operators, source code is generated based on the kind of temporal operator. The pseudo code of the parser agent is shown in (Figure 6).
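The decision made by the lexical analyzer and the parser can be sketched as a line classifier. The regular expression below is an assumption derived from the statement forms quoted earlier (for example // 1.2.A Assert SQL (user)); the class AssertRecognizer and the returned kind names are hypothetical, and the Next operator is left out because its concrete syntax is not given in the text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative recognizer; the pattern is an assumption based on the quoted statement forms.
final class AssertRecognizer {
    // Double slash, a dotted label ending in a letter, the Assert keyword, then the rest.
    private static final Pattern ASSERT_LINE =
            Pattern.compile("^\\s*//\\s*(\\d+(?:\\.\\d+)*\\.[A-Za-z])\\s+Assert\\s+(.+)$");

    // Returns the operator kind for an assert line, or "JAVA" for ordinary code.
    static String classify(String line) {
        Matcher m = ASSERT_LINE.matcher(line);
        if (!m.matches()) {
            return "JAVA";
        }
        String body = m.group(2).trim();
        if (body.startsWith("[ ]") || body.startsWith("[]")) {
            return "ALWAYS";
        }
        if (body.startsWith("~")) {
            return "EVENTUALLY";
        }
        if (body.startsWith("SQL")) {
            return "SQL";
        }
        if (body.startsWith("XSS")) {
            return "XSS";
        }
        if (body.contains(" U ")) {
            return "UNTIL";
        }
        return "UNKNOWN";
    }

    // Returns the label (e.g. "1.2.A") of an assert line, or null for ordinary code.
    static String label(String line) {
        Matcher m = ASSERT_LINE.matcher(line);
        return m.matches() ? m.group(1) : null;
    }
}
```

Lines classified as "JAVA" would be copied unchanged to the destination file, while the other kinds would be handed to the code generator together with their label.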

Agent Based Code Generator
Depending on the temporal logic operator, this agent generates the code for each temporal assert statement. The pseudo code of the code generation agent is shown in (Figure 7).
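A possible shape for the operator-dependent generation step is sketched below. The emitted statements are assumptions, since the text does not reproduce the code produced by the agent; only the method names hasSQL() and hasXSS() come from the paper. Operators that need state across the scope (Eventually, Until, Next) are omitted from this sketch, and the class AssertCodeGen is hypothetical.

```java
// Hypothetical generator: the emitted statements are assumptions for illustration.
final class AssertCodeGen {
    // Builds the Java check to emit for one assert statement.
    static String generate(String label, String operator, String operand) {
        switch (operator) {
            case "ALWAYS":
                // The condition must hold at every checked point of the scope.
                return check("!(" + operand + ")", label, "always violated");
            case "SQL":
                return check("hasSQL(" + operand + ")", label, "SQL injection detected");
            case "XSS":
                return check("hasXSS(" + operand + ")", label, "XSS detected");
            default:
                throw new IllegalArgumentException("unsupported operator: " + operator);
        }
    }

    // Wraps a failure condition into a statement that raises an assertion error.
    private static String check(String failure, String label, String message) {
        return "if (" + failure + ") throw new AssertionError(\"" + label + ": " + message + "\");";
    }
}
```

Carrying the label into the error message is a design choice of this sketch; it lets a reported violation be traced back to the assert statement that produced it.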

Integration of Static and Dynamic Analyzers
Given a large program, it may be impractical to identify, manually, security failures. However, by integrating static and dynamic analyses [25], IMATT can soundly model the program behavior to identify the security vulnerabilities. Consequently, using the dynamic analysis would handle second order (indirect) run-time attacks.
While theoretically sound, in practice the static analysis may be unsound for the following reasons:
1) Multi-language code: A Java program may trigger the execution of methods written in C and executed directly on the operating system. A static analyzer for Java will not be able to model C functions. As a result the analysis will fail.
2) Reflection: a mechanism that enables code to dynamically manipulate fields and methods of loaded classes. Modeling reflection through static analysis is unsound since the type of an object obtained through reflection is only available at run time.
In fact neither static nor dynamic analysis can independently guarantee the identification of all security vulnerabilities. Dynamic analysis suffers from the fact that:
• It needs a set of functional or security rules that may be practically unavailable [22].
• It needs a set of attacks like those used in the real world. In addition it needs a collection of temporal information.
• It is destructive since it may perform attack execution.
IMATT integration (Fig. 8) consists of two analyzing modules, static and dynamic, where each analyzer is designed as a multi-agent subsystem. The static analyzer agents read the Java-based web application and analyze it to identify a list of security vulnerabilities. Based on this list, the user (programmer) inserts assert statements in the web application and creates a new web application file that contains Java statements and assert statements. The dynamic testing agent reads the new file and instruments it so that it can cover security violations at various levels. Finally, it displays any violations that are revealed during Web application execution.

In IMATT, integrating static and dynamic analyses is a must. This is because agents, especially mobile ones, make extensive use of reflection in their programming paradigm. Modelling reflection through static analysis is unsound, since the type of the objects obtained through reflection is identified only at run time. The dynamic analyzer, however, uses reflection to load classes, create objects and invoke the required methods. Accordingly, the process of creating a test case is automated (but not eliminated).
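The following minimal sketch shows why reflective targets are known only at run time: the class and method are selected by strings, so their types cannot be fixed by inspecting the source alone. The class ReflectiveCall is hypothetical; the reflection calls themselves are standard Java API.

```java
import java.lang.reflect.Method;

// Minimal illustration of loading a class, creating an object and invoking a method by name.
final class ReflectiveCall {
    // Invokes a public no-argument method on a fresh instance of the named class.
    static Object invoke(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className); // resolved only at run time
            Object target = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }
}
```

For example, ReflectiveCall.invoke("java.util.ArrayList", "size") creates a list and calls size() on it, although neither name appears as a type or method reference in the calling code.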
On the other hand, relying on pure dynamic analysis is not sufficient because of its dependency on the test cases. In practice, some execution paths, along with the privileged rights needed to execute those paths, may remain undiscovered until the code deployment phase. This yields incomplete coverage of the program under test; consequently, unsoundness arises from the absence of a formal cover that should be generated by the selected test cases. The IMATT integrator (Fig. 8) has several essential features:
• It tackles the reflection problem(s) by conservatively locating the suspected agent using the static analyzer; the dynamic analyzer is then employed to refine the conservative results, i.e. to extract the run-time rule violation(s).
• A Java temporal assertion language is implemented with well-defined semantics. This language combines, on a formal basis, temporal logic and application-oriented operators.
• One of the roles of the proposed integrator is to eliminate false alarms: when the static analyzer reports a false alarm (due to a security-sensitive action), the dynamic analyzer, which uses the coverage of the underlying program methods, can eliminate it.
For IMATT each solution is executed in three steps.
1. The static analyzer discovers the call that may cause a security vulnerability and determines its location (agent).
2. At run time the dynamic analyzer checks the vulnerability locations of the underlying agent to discover the method that can yield a breach. In addition, it logs the underlying operation in a special file that might be parsed for security holes.
3. From steps 1 and 2 the integrator (Fig. 8) exploits a continuous integration agent, which is coupled with both static and dynamic analyzers, in order to find the corrupted class that is responsible for the security violation.
Also, the security side effects can be discovered and detected. For convenience such details are deferred to Sec. 4, where IMATT's implementation is illustrated using several experimental examples.
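The way the integrator combines the two result sets can be sketched with plain set operations. This is an assumption about the bookkeeping only, not a reproduction of IMATT's agents: locations are represented as strings, and the class IntegratorSketch and its method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical integrator bookkeeping; IMATT's agents are not reproduced here.
final class IntegratorSketch {
    // Locations reported by the static analyzer and also violated at run time.
    static Set<String> confirmed(Set<String> staticFindings, Set<String> runtimeViolations) {
        Set<String> result = new TreeSet<>(staticFindings);
        result.retainAll(runtimeViolations);
        return result;
    }

    // Locations reported statically but never violated at run time. These are only
    // candidates for false alarms, provided the tests actually covered them.
    static Set<String> unconfirmed(Set<String> staticFindings, Set<String> runtimeViolations) {
        Set<String> result = new TreeSet<>(staticFindings);
        result.removeAll(runtimeViolations);
        return result;
    }
}
```

The distinction matters because an unconfirmed location is a false alarm only if the dynamic analyzer exercised it; an uncovered location remains a potential vulnerability.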

Tool Implementation and Testing
All agents of the testing tool are written in the Java programming language. In addition, JADE [24], a middleware that facilitates the development of multi-agent systems, is used to manage and run the agents of IMATT.

Code Generation for SQL Injection and XSS
SQL Code Generation Agent: The agent receives the source file, the destination file and a pointer into each file, together with the condition and the label of the assert statement. It first extracts the variables from the condition and then reads the source file from the pointer until it finds the label followed by the word "END". For each line read, if the line contains any of those variables, the agent writes the Java statement to the destination file and inserts after it a call to a run-time method called hasSQL(), which takes the variables as arguments and analyzes them to ensure that they contain no SQL injection; otherwise the agent simply writes the Java statement to the destination file. After the end of the assert statement is reached, control returns to the lexical analyzer, which continues reading the source file from where the code generator stopped. The procedure is repeated whenever the lexical analyzer agent encounters another temporal assert statement. The SQL temporal operator is used when SQL attacks are to be detected.
A similar XSS code generation agent can be obtained by replacing SQL by XSS.
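The scope-based insertion performed by the SQL code generation agent can be sketched as follows. The sketch assumes that the scope is closed by a comment line of the form // label END and that the variables have already been extracted from the condition; both the END line format and the exact form of the inserted call are assumptions, and the class SqlInstrumentor is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of the scope-based insertion; line formats are assumptions drawn from the prose.
final class SqlInstrumentor {
    // Copies the source lines and adds a hasSQL(...) call after each scoped line
    // that mentions one of the asserted variables.
    static List<String> instrument(List<String> source, String label, List<String> variables) {
        List<String> out = new ArrayList<>();
        boolean inScope = false;
        for (String line : source) {
            String trimmed = line.trim();
            if (trimmed.startsWith("// " + label + " Assert SQL")) {
                inScope = true;  // the assert statement opens the scope
                continue;        // assert lines are not copied to the destination
            }
            if (trimmed.startsWith("// " + label + " END")) {
                inScope = false; // the end-assert statement closes the scope
                continue;
            }
            out.add(line);
            if (inScope) {
                for (String v : variables) {
                    if (mentions(trimmed, v)) {
                        out.add("hasSQL(" + v + ");");
                    }
                }
            }
        }
        return out;
    }

    // Whole-word match so that "user" does not match "username".
    private static boolean mentions(String line, String variable) {
        return line.matches(".*\\b" + Pattern.quote(variable) + "\\b.*");
    }
}
```

Replacing hasSQL by hasXSS and the operator name SQL by XSS would give the corresponding sketch for the XSS code generation agent.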

Testing of Web Applications
To test a Web application, temporal assert statements are first inserted into it. The instrumentor part of IMATT then instruments the Web application, translating each temporal assert statement into Java statements according to the semantics of its temporal operator. The instrumented Web application is compiled and executed to detect any security attack. To clarify the nature of IMATT, examples concerned with the implementation details are given in what follows.
Example 1: Detection of SQL Injection using the SQL Operator
• The problem
Suppose we have a company Web application with a service that retrieves the information of an employee from the database given the employee's first name. If "John" is entered and the "submit" button is pressed, the information of the employee "John" is retrieved and displayed as shown in (Figure 9).
Assume an attacker would like to get the information of all employees in the company. The attacker enters John ' OR '1'='1 in the employee name field, so the query becomes select * from employees where firstname='John ' OR '1'='1'. Because the OR expression is always true, the information of all employees is retrieved and displayed as shown in (Figure 10). Using the same technique, attackers can inject other SQL commands that could extract, modify or delete data within the database.

• Solution of the problem
In order to detect the SQL injection, a temporal assert statement is inserted in the agent-based Web application to check the fields of the form. In the code of (Figure 11), the inserted temporal assert statement is // 1.2.A Assert SQL (user), where (user) is the data entered by the client or attacker.
The code of (Figure 11) is instrumented by the agents of the dynamic analyzer to generate pure Java code as shown in (Figure 12). The generated Java code contains a method called hasSQL() that takes the fields of the form as arguments and checks whether a field contains SQL attack characters.
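Since the body of hasSQL() is not reproduced in the text, the following is only a hypothetical sketch of such a run-time check. The pattern list is an illustrative assumption rather than IMATT's rule set, and a character-based heuristic of this kind will also flag some legitimate input (for example a surname containing an apostrophe).

```java
import java.util.regex.Pattern;

// Hypothetical check; the patterns are illustrative and are not IMATT's rule set.
final class SqlCheck {
    // Quote characters, comment markers, statement separators and a few SQL keywords.
    private static final Pattern SUSPICIOUS = Pattern.compile(
            "('|--|;|/\\*|\\b(union|select|insert|update|delete|drop)\\b)",
            Pattern.CASE_INSENSITIVE);

    // Returns true if any form field contains characters typical of SQL injection.
    static boolean hasSQL(String... fields) {
        for (String field : fields) {
            if (field != null && SUSPICIOUS.matcher(field).find()) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, the input John ' OR '1'='1 from Example 1 is flagged because of the quote characters, while the plain input John is accepted.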


• Executing the Web Application after instrumentation
After executing the program in (Figure 12) and entering (John ' OR '1'='1) in the employee field, we see in (Figure 13) that the assertion exception is raised after the SQL injection is detected.
Example 2: Detection of XSS using the XSS Operator
• The problem
Suppose a malicious user has signed up to a Web application such as the MySpace Web site and has added the following script to his profile page. Every time a visitor views the profile, the script is executed, which annoys the visitor. Now suppose that the problem gets bigger, where code has been added in the comments of the site as shown in the following statement. Every time users click on this link they visit a web site about cats, but they are also logged out of the web site, which is just as annoying.
The problem is worse if the attacker has injected a script that steals user cookies. Everyone who visits the guest book is then redirected to a page at the attacker's site, and the cookies from the MySpace browser session are transmitted to the attacker's web server as part of the URL. This allows the attacker to steal the password and username of the web site administrator and so give himself administrator access or start deleting content.
The most dangerous case arises if the attacker uses a JavaScript link to trick users into sending sensitive information to his server. If users click that link, as they probably often do, their session ID is transmitted to the attacker's server. (Figure 14) explains the problem.
Figure 14. The script to steal user session has been added
• Solution of the problem
In order to detect the XSS attack, a temporal assert statement // 1.2.R Assert XSS (name, email, comm) has been inserted to check the fields of that form as shown in (Figure 15); name, email and comm are the form fields. The above code is instrumented by the agents of the dynamic analyzer to generate pure Java code that contains a method called hasXSS(), as shown in (Figure 16). The data of the form fields are received and checked by the hasXSS() method during the Web application execution.
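As with hasSQL(), the body of hasXSS() is not given in the text, so the following is a hypothetical sketch of what such a check might look like. The patterns are illustrative assumptions only; a real filter would need a far more complete rule set, and this heuristic can flag harmless text that happens to resemble an event-handler attribute.

```java
import java.util.regex.Pattern;

// Hypothetical check; the patterns are illustrative and are not IMATT's rule set.
final class XssCheck {
    // Script and iframe tags, javascript: URLs and inline event handlers such as onclick=.
    private static final Pattern SUSPICIOUS = Pattern.compile(
            "(<\\s*script\\b|<\\s*iframe\\b|javascript\\s*:|\\bon\\w+\\s*=)",
            Pattern.CASE_INSENSITIVE);

    // Returns true if any form field contains markup typical of an XSS payload.
    static boolean hasXSS(String... fields) {
        for (String field : fields) {
            if (field != null && SUSPICIOUS.matcher(field).find()) {
                return true;
            }
        }
        return false;
    }
}
```

In the scenario above, the three form fields would be passed together, for instance hasXSS(name, email, comm), so that a payload hidden in the comment field is detected even when the name and email fields are harmless.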

• Testing the Web Application after instrumentation
The code in (Figure 16) has been compiled and executed. Input that contains an XSS attack has been entered, and the attack has been detected by the tool (Figure 17).
In order to emphasize the relative merits of IMATT, its performance should be compared in practice with that of similar analyzers. However, this task could not be accomplished due to the lack of published quantitative information on the performance of such products.

Conclusion
This paper presents IMATT as a special-purpose integrated multi-agent tester that integrates both static and dynamic testing components to check the security of agent-based Web applications. IMATT has been built using software agents.
The static component consists of a rule base and a code checker, while the dynamic component consists of an instrumentor and a run-time analyzer. So that this analyzer can handle different scenarios of the Web application, it makes use of temporal logic to examine the application under test. The integrator combines the results of both components to reach a decision for either intra or inter attacks. At present, the temporal assert statements are inserted manually in the Web application; in the future, it is planned to assign an intelligent agent that can insert such statements automatically.