Keyword Based Searching According to the Movie Names

Keyword-based queries are inherently ambiguous: given a set of keywords, the database search engine has only an uncertain guess about the informational need the query represents. The potentially high complexity of the data makes it extremely challenging to provide intelligent search results effectively. Databases enable users to express their informational needs precisely using structured queries. However, constructing a database query is a laborious and error-prone process that most end users cannot perform well. Keyword search alleviates this usability problem at the price of query expressiveness. Because keyword search algorithms do not differentiate between the possible informational needs represented by a keyword query, users may not receive adequate results. This paper presents Extended Incremental Query Processing (EIQP), a novel approach to bridging the gap between the usability of keyword search and the expressiveness of database queries. EIQP enables a user to start with an arbitrary keyword query and incrementally refine it into a structured query through an interactive interface. The enabling techniques of EIQP include: 1) a probabilistic framework for incremental query construction; 2) a probabilistic model to assess the possible informational needs represented by a keyword query; 3) an algorithm to obtain the optimal query construction process. This paper presents the detailed design of EIQP and demonstrates its effectiveness and scalability through experiments over real-world data and a user study. Extracting information from semi-structured documents is also a very hard task. Documents are often so large that the data set returned as the answer to a query may be too big to convey interpretable knowledge.
In this paper, we also describe an approach based on Tree-Based Association Rules (TARs): mined rules that provide approximate, intensional information on both the structure and the contents of XML documents. This mined knowledge is later used to provide a concise idea (the gist) of both the structure and the content of the XML document, and quick, approximate answers to queries.


Introduction
With the growth of structured information available on the Web and in online databases, it becomes increasingly difficult for users to find the exact data they seek. Structured queries are a powerful tool for describing a user's informational need exactly and retrieving the intended information from a database. However, formulating a structured query is a challenging task for an inexperienced user, as it requires precise knowledge of the database schema and the query language. A keyword query, on the other hand, is easy to pose and more familiar to users.
Keyword search allows ordinary users to search for information without any expert knowledge of the database schema. However, keyword search lacks the expressiveness to describe a user's informational need precisely, and may return irrelevant or incomplete results. To take advantage of both, i.e., the expressiveness of structured queries and the usability of keyword search, a query ranking approach has been introduced that translates a keyword query into a ranked list of structured queries, so that the user can select the query that best represents her informational need.
Such a query ranking approach has two limitations. First, because each keyword can occur in nearly any textual attribute of a database, the number of possible query interpretations grows sharply with the number of textual attributes and the size of the schema. Second, even a theoretically optimal ranking algorithm can, at best, rank the most common query interpretations highest, so users with less frequent informational needs may not receive adequate results. To overcome these limitations we introduce a new system, EIQP, designed to fill the gap between the usability of keyword search and the expressiveness of database queries. EIQP provides a query construction interface that enables users to create their own structured queries interactively, based on query construction options suggested automatically by the system. When the user accepts or rejects an option, the system uses this information to reduce the interpretation space of the user's keyword query and to suggest new options. The interaction between the user and EIQP continues until the user finds the desired interpretation of the keyword query and retrieves the corresponding search results. To generate the initial query construction options, EIQP makes use of the database schema and keyword occurrence statistics.
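The accept/reject interaction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Interpretation class, the probabilities, the attribute names such as actor.name, and the heuristic of suggesting the option whose probability mass is closest to one half are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    """One structured reading of a keyword query (hypothetical model)."""
    bindings: frozenset  # (keyword, attribute) pairs, e.g. ("hanks", "actor.name")
    prob: float          # assumed probability that the user means this reading


def normalise(space):
    """Rescale probabilities so the remaining interpretations sum to 1."""
    total = sum(i.prob for i in space)
    return [Interpretation(i.bindings, i.prob / total) for i in space]


def best_option(space):
    """Suggest the binding whose probability mass is closest to 0.5 (assumed heuristic)."""
    options = set().union(*(i.bindings for i in space))

    def mass(option):
        return sum(i.prob for i in space if option in i.bindings)

    # Bindings shared by all interpretations cannot narrow the space, so skip them.
    useful = [o for o in sorted(options)
              if any(o not in i.bindings for i in space)]
    return min(useful, key=lambda o: abs(mass(o) - 0.5)) if useful else None


def refine(space, option, accepted):
    """Keep only the interpretations consistent with the user's answer."""
    kept = [i for i in space if (option in i.bindings) == accepted]
    return normalise(kept)


# Hypothetical interpretation space for the keyword query "hanks 2001".
space = [
    Interpretation(frozenset({("hanks", "actor.name"), ("2001", "movie.year")}), 0.45),
    Interpretation(frozenset({("hanks", "director.name"), ("2001", "movie.year")}), 0.35),
    Interpretation(frozenset({("hanks", "actor.name"), ("2001", "movie.title")}), 0.20),
]
option = best_option(space)
after_reject = refine(space, ("hanks", "actor.name"), accepted=False)
```

In this toy example, rejecting the option that binds "hanks" to actor.name leaves a single interpretation, so one interaction is enough to reach a structured query. With a larger interpretation space the suggest-and-refine loop would simply be repeated.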
The enabling techniques of EIQP include: 1) a probabilistic framework for incremental query construction; 2) a probabilistic model to assess the possible informational needs represented by a keyword query; 3) an algorithm to obtain the optimal query construction process (MQCP). Using EIQP, a user can benefit from both a conventional ranking interface and a more controllable query construction interface. The former allows the user to immediately identify the most common interpretation of her query. The latter enables the user to clarify her search intent step by step, which is especially helpful when the intended query interpretation does not receive a good rank. We use the XAMPP tool in this paper to perform the various data processing techniques, thereby obtaining the results efficiently and accurately.
The goal of data mining is to extract, or mine, knowledge from large amounts of data. Consequently, data mining consists of more than collecting and managing data; it also includes analysis and prediction. The Extensible Markup Language (XML) has become a standard language for data representation and exchange. XML is a standard, flexible syntax for exchanging regular, structured data. Mining XML documents differs significantly from structured data mining and text mining. XML allows the representation of semi-structured and hierarchical data containing not only the values of individual items but also the relationships between data items. Due to the inherent flexibility of XML, in both structure and semantics, discovering knowledge from XML data brings new challenges as well as benefits. Mining structure along with content provides new insights into the process of knowledge discovery. As for query answering, query languages for semi-structured data rely on the document structure to convey its semantics, so for query formulation to be effective users need to know this structure in advance, which is often not the case. This limitation is a crucial problem that did not emerge in the context of relational database management systems. As a consequence, when a large dataset is accessed for the first time, gaining some general information about its main structural and semantic characteristics helps the investigation of more specific details.

Keyword Search in Databases
The continually growing amount of structured information is of little use without an efficient search function. Most relational databases provide users with a full-text search capability that is restricted to one table attribute and executes a simple query affecting only the entries of a single table column. Because of the complexity of the database and its schema, every attempt to pose a more intricate query becomes a challenging task. Users still need knowledge of the database schema and must use a structured query language (e.g. SQL) in order to satisfy their information need. In the age of the internet and search engines, ordinary users are better acquainted with keyword search. This kind of search is intuitive and convenient, as users only need to type some keywords into the search window to receive a ranked list of suggestions for their request. But what happens behind the keyword search process? How is a simple keyword query transformed into relevant individual results?
The following aspects are important during this process: 1. Data modeling 2. Structural ambiguity of the results 3. Ranking strategies
To enable database search, the properties and peculiarities of the database data and schema are first reflected in a data model. In the second step, a number of candidate results is generated on the basis of the selected model. These candidates can vary widely in their structural presentation. As users are interested only in a limited number of relevant results, the candidates are finally disambiguated and ranked.

Data Modeling
Many attempts have been made to support keyword search on relational databases. The first to mention are Discover and DBXplorer. As the keywords can be present in different attributes and tables, these systems pioneered the generation of matching rows obtained by joining several tables. For this purpose a database is modeled as a graph with tuples as nodes and primary-key to foreign-key relationships as edges between the nodes. But with the growing amount of information, the size of the databases increases and the relationships between the tables become more complex.
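The graph model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy movie tables, the tuple identifiers and the helper names build_graph, matches and join_path are hypothetical and are not taken from Discover or DBXplorer.

```python
from collections import defaultdict, deque

# Hypothetical toy data: (table, primary key) -> searchable text of the tuple.
tuples = {
    ("movie", 1): "Cast Away 2000",
    ("movie", 2): "The Terminal 2004",
    ("actor", 10): "Tom Hanks",
    ("acts", 100): "",  # join tuple linking actor 10 to movie 1
    ("acts", 101): "",  # join tuple linking actor 10 to movie 2
}
# Foreign-key references: referencing tuple -> referenced tuples.
foreign_keys = {
    ("acts", 100): [("actor", 10), ("movie", 1)],
    ("acts", 101): [("actor", 10), ("movie", 2)],
}


def build_graph(foreign_keys):
    """Tuples are nodes; each primary-key/foreign-key reference is an undirected edge."""
    graph = defaultdict(set)
    for src, targets in foreign_keys.items():
        for dst in targets:
            graph[src].add(dst)
            graph[dst].add(src)
    return graph


def matches(keyword):
    """Return the tuples whose text contains the keyword as a whole word."""
    return [node for node, text in tuples.items()
            if keyword.lower() in text.lower().split()]


def join_path(graph, start, goal):
    """Breadth-first search for a shortest chain of joins between two tuples."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = build_graph(foreign_keys)
# Keyword query "hanks terminal": connect one match per keyword through joins.
path = join_path(graph, matches("hanks")[0], matches("terminal")[0])
```

Here the two keywords match tuples in different tables, and the path through the acts tuple corresponds to the joined row that a keyword search system would return.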

Structural Ambiguity of the Results
During the search step, the suggestions that meet the users' information need are created. As the modeling of the dataset and the extraction algorithms used by the research community are very diverse, the structure of the results also varies. The EIQP query construction system uses query templates and fills them with keywords during the search phase. First, the smallest templates are used to generate partial interpretations. Next, during the expansion step, more complex queries, up to complete ones, are created. With the help of such a query hierarchy there is no need to generate the whole interpretation space. Appropriate options help to construct the query incrementally and ensure the scalability of the system.
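To make the notions of partial and complete interpretations concrete, the following Python sketch enumerates them for a two-keyword query. The occurrence index, the template names and the attribute names are hypothetical. The brute-force enumeration is shown only to illustrate how the interpretation space is formed; it is exactly the exhaustive step that the query hierarchy is meant to avoid.

```python
from itertools import product

# Hypothetical index: keyword -> textual attributes in which it occurs.
occurrences = {
    "hanks": ["actor.name", "director.name"],
    "terminal": ["movie.title", "movie.plot"],
}
# Hypothetical query templates: the attributes reachable through one join path.
templates = {
    "movie-actor": {"movie.title", "movie.plot", "actor.name"},
    "movie-director": {"movie.title", "movie.plot", "director.name"},
}


def partial_interpretations(keyword):
    """Smallest building blocks: one keyword bound to one attribute."""
    return [(keyword, attr) for attr in occurrences.get(keyword, [])]


def complete_interpretations(keywords):
    """Bind every keyword to an attribute and keep bindings that fit a template."""
    results = []
    for combo in product(*(partial_interpretations(k) for k in keywords)):
        attrs = {attr for _, attr in combo}
        for name, template_attrs in templates.items():
            if attrs <= template_attrs:
                results.append((name, combo))
    return results


complete = complete_interpretations(["hanks", "terminal"])
```

Even this tiny example yields four complete interpretations from two keywords with two candidate attributes each, which shows why the space grows quickly with the number of textual attributes.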

Ranking Strategies
There are two approaches to analyzing the results. The traditional methods generate and explore all possibilities, whereas the newer strategy is to rank structures. Let us first look at how the traditional methods proceed. At the beginning, tuple trees are extracted as the possible answers to a keyword query. In the next step, a ranking strategy is used to decide which results are the most relevant. The user can then explore the ranked list to identify the desired interpretations.
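As a minimal illustration of the traditional pipeline, the Python sketch below orders candidate tuple trees by size, so that answers requiring fewer joins come first. The size-based criterion and the sample candidates are assumptions chosen for simplicity; the text above does not prescribe a particular ranking function.

```python
def rank(candidates):
    """Order candidate tuple trees: fewer tuples (and thus fewer joins) first."""
    return sorted(candidates, key=lambda tree: (len(tree), tree))


# Hypothetical tuple trees, each a list of (table, primary key) nodes.
candidates = [
    [("actor", 10), ("acts", 100), ("movie", 1), ("acts", 102), ("actor", 11)],
    [("actor", 10), ("acts", 101), ("movie", 2)],
    [("movie", 2)],
]
ranked = rank(candidates)
top_two = ranked[:2]  # the user typically inspects only the first few results
```

A real system would combine such structural criteria with content-based scores, but the principle of generating all candidates first and sorting them afterwards stays the same.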

Faceted search in Databases
User-driven query disambiguation has been applied successfully in Information Retrieval in the context of faceted search. Faceted search engines, such as the product search engine of Google and the Clusty search engine, organize search results into meaningful groups, called facets, by applying clustering or categorization algorithms. Users can easily narrow the scope of the search by focusing on a small number of facets. Several navigational techniques have been proposed to support users in finding information in a hierarchy of faceted categories. The interface of EIQP is similar to a faceted interface in which each facet corresponds to a query construction option. SUITS is a faceted interface that enables users to disambiguate keyword queries interactively. However, SUITS lacks a theoretical foundation for verifying its effectiveness.

Incremental Query Construction
In the age of web search engines, ordinary users have become accustomed to the ease of use of a keyword query. Unfortunately, this simplicity does not imply high expressiveness or high quality of the retrieved results. The less knowledge about the intent behind a query is provided, the more effort is needed to extract satisfying information from the database. Databases, in contrast, are equipped with a powerful query language that allows even the trickiest and most unusual questions to be asked, but that is too complex and hard to understand for an ordinary user. Many efforts have been made to combine the high expressiveness of a database query language with the simplicity of a keyword query. Recent approaches make use of the database structure and give users the possibility to refine their information need and to participate actively in the information retrieval process.

Techniques
• In the existing system, a set of functions written in XQuery is proposed; these perform well on simple XML documents.
• XQuery is the language for querying XML data. It plays a role for XML similar to that of SQL for relational databases, is built on XPath expressions, and is supported by many major database systems.
• A later proposal is based on XMINE RULE, an operator for mining association rules from native XML documents.
• XMINE is based on the MINE RULE operator, which works on relational data only. This means that, after a pruning step that removes unnecessary information, the XML document is translated into the relational format.
• In another system, the designer must specify the structure of the rule to be extracted and then mine it.
• XQuery expressions can be used to retrieve useful information from the extracted sets of association rules. Such information can provide intensional answers to queries formulated as XQuery expressions.
• The overall querying process appears completely transparent to the user: if the actual dataset is not available, the intensional (approximate) answer is provided automatically by querying the rule set.

Disadvantages of Existing System
• XQuery is difficult to apply to complex XML documents with an irregular structure.
• The designer has to know the structure of the XML document in advance, which is an unreasonable requirement when the document does not have a DTD.
• In XQuery, we cannot get exact answers to queries containing aggregators.

Implementation Results
In this paper, we use a real-world example that is useful for investigations: incidents that happen in day-to-day life are stored in XML format, and these incidents can be updated every day. An investigator who needs information from this data can obtain it in three forms using PHP.

Get the gist
Get the gist retrieves summary information about a particular kind of incident once all the incidents have been recorded. The number of times such an incident occurs is its support, and the percentage of cases in which it occurs is its confidence.
Get the gist retrieves this information from the XML document using PHP (Hypertext Preprocessor). It provides intensional information about the XML document, extracted for a given support and confidence.
The input is an XML document, which is processed with PHP according to the user's requirements. The XML document is first loaded onto the web server after connecting to localhost. We use a WAMP server as the PHP environment to execute the programs, and HTML for the user interfaces.
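The support and confidence figures that get the gist reports can be illustrated with a small worked example. The sketch below is written in Python rather than the PHP used in this work, and the tag names type and location, the sample incidents and the helper functions are hypothetical. Support is taken as an occurrence count and confidence as the fraction of incidents matching the rule body that also match the rule head.

```python
import xml.etree.ElementTree as ET

# Hypothetical incidents document; the tag names are assumptions.
XML = """<incidents>
  <incident><type>theft</type><location>market</location></incident>
  <incident><type>theft</type><location>market</location></incident>
  <incident><type>theft</type><location>station</location></incident>
  <incident><type>fraud</type><location>market</location></incident>
</incidents>"""


def matches(incident, conditions):
    """True if every (tag, value) condition holds for this incident element."""
    return all(incident.findtext(tag) == value for tag, value in conditions.items())


def support(root, conditions):
    """Number of incidents that satisfy the conditions."""
    return sum(1 for inc in root.iter("incident") if matches(inc, conditions))


def confidence(root, body, head):
    """Fraction of incidents matching the body that also match the head."""
    body_count = support(root, body)
    if body_count == 0:
        return 0.0
    return support(root, {**body, **head}) / body_count


root = ET.fromstring(XML)
theft_support = support(root, {"type": "theft"})
theft_market_conf = confidence(root, {"type": "theft"}, {"location": "market"})
```

In this sample, theft occurs three times, and two of those three thefts happened at the market, so the rule "theft implies market" has a confidence of two thirds.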

Get the idea
Get the idea includes get the gist and additionally returns the original document so that the results can be compared. It uses the same setup as get the gist: the XML document is given as input, loaded onto the web server, and processed with PHP on a WAMP server, with HTML used for the user interfaces.
Get the idea shows the intensional information together with the original document, giving users the possibility to compare the two kinds of information and to verify the results against the original document.

Get the answers
Get the answers relies on the date tag in our XML document; XML allows such user-defined tags to be specified. From the date tag we can obtain the list of incidents that happened, together with their details. This involves date comparison functions in order to display accurate results.
Get the answers allows querying both the intensional knowledge and the original XML document. Users write an extensional query; when the query belongs to one of the classes we have analyzed, it is translated and applied to the intensional knowledge.
The incidents document has elements such as the incident type, when it was reported, the bullet case type, and so on. The input consists of the XML document and two dates, and the query returns the incidents that happened between those two dates. Not only the incident type but also when it was reported, where it happened, and all other recorded data can be retrieved.
We used three classes of queries to query the XML document, and various PHP functions to retrieve data from it. The $_POST array provides the data entered by the user in a text box. This value is stored in a variable and compared with values taken from the XML document. To get values from the XML document, the function getElementsByTagName() is called with the tag name as its parameter; it returns the elements with that tag.
The tag values can be stored in an array. We used a foreach loop to iterate over the array and compare its entries with the entered values, and the nodeValue property to read the tag values from the XML document. The strtotime() function converts the date entered in the text box into a Unix timestamp so that dates can be compared. The results are shown after testing is completed, in the form of screenshots that briefly illustrate the execution of the system.
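The date-range query described above can be sketched as follows. The sketch is in Python rather than PHP, but it deliberately mirrors the same steps: look up elements by tag name, read the node values, convert the dates, and compare. The tag names, the date format and the sample data are assumptions; Python's xml.dom.minidom is used because it offers getElementsByTagName() and nodeValue under the same names as the PHP DOM functions mentioned in the text.

```python
from datetime import datetime
from xml.dom.minidom import parseString

# Hypothetical incidents document; tag names and date format are assumptions.
XML = """<incidents>
  <incident><type>theft</type><date>2013-01-05</date></incident>
  <incident><type>fraud</type><date>2013-02-10</date></incident>
  <incident><type>arson</type><date>2013-03-20</date></incident>
</incidents>"""

DATE_FORMAT = "%Y-%m-%d"


def text_of(element, tag):
    """Read the text content of the first child element with the given tag."""
    node = element.getElementsByTagName(tag)[0]
    return node.firstChild.nodeValue.strip()


def incidents_between(xml_text, start, end):
    """Return (type, date) pairs for incidents dated within [start, end]."""
    lower = datetime.strptime(start, DATE_FORMAT)
    upper = datetime.strptime(end, DATE_FORMAT)
    document = parseString(xml_text)
    found = []
    for incident in document.getElementsByTagName("incident"):
        when = datetime.strptime(text_of(incident, "date"), DATE_FORMAT)
        if lower <= when <= upper:
            found.append((text_of(incident, "type"), text_of(incident, "date")))
    return found


# The two dates would come from the user's form input in the described system.
result = incidents_between(XML, "2013-01-01", "2013-02-28")
```

Converting both the entered dates and the stored dates to a comparable form before testing the range is the important step; comparing the raw strings would only work for formats that happen to sort chronologically.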

Conclusion
We presented Extended Incremental Query Processing (EIQP), a novel system that enables the construction of structured queries from keywords. We presented a conceptual framework for incremental query construction as well as a probabilistic model that enables consistent assessment of the probability of a query interpretation. We presented an algorithm for generating optimal query construction plans, which enables the user to obtain the intended structured query with a minimal number of interactions. Our experimental results and user study show that EIQP is highly helpful when the structured queries intended by the user cannot be found among the top-ranked results. EIQP can be used in place of any keyword search. The user is given options for finding the desired structured query, through which the intended results can be retrieved from the database instead of being searched for within the large result lists of ordinary keyword search. A ranking-based search places the most common query interpretations highest, so users with less frequent informational needs may not receive adequate results. Using EIQP, a user can benefit from both a conventional ranking interface and a more controllable query construction interface. The former allows the user to immediately identify the most common interpretation of her query. The latter enables the user to clarify her search intent step by step, which is especially helpful when the intended query interpretation does not receive a good rank.
This paper also implements query answering for retrieving information from XML documents. Retrieving information from XML documents is a hard task because they can store large amounts of data, and querying such documents is not easy to implement. We showed that it can be done using PHP: the required data can be extracted from XML documents with this scripting language. We collect incidents that happen in day-to-day life and store them in XML format. Because the data set is large, it is difficult to investigate or find information about a particular incident. The proposed system returns results about an incident based on the queries given by the user. Finally, we showed that there is an easy way to retrieve information from XML documents. In this paper we worked through examples for one dynamic XML document. A possible extension of this work is to produce such results for any type of XML document.