Natural Language mySQL Interface (NLSQL) Jamie Olson Amrit Tuladhar Final Project CS 357 Spring 2005 This file contains the following sections: 1. OVERVIEW 2. FILES 3. REQUIREMENTS 4. RUNNING NLSQL, FEATURES 5. DEVELOPMENT NOTES 6. LIMITATIONS AND KNOWN ISSUES 1. OVERVIEW: Natural Language mySQL Interface (NLSQL) allows users to interact with mySQL using natural language commands. Currently, it includes support for most variations of commands for displaying and deleting records, and limited support for sorting them. NLSQL was written in Perl and uses an Earley parser written in Perl for parsing sentences, and the Perl DBI and DBD::mySQL modules to interact with mySQL. 2. FILES: NLSQL consists of the following files: * nlsql.pl : Perl script that uses an Earley parser module to parse a sentence and then translates it into SQL commands. It also contains Perl DBI components to execute the SQL commands and display the results * simple-grammar : Grammar used to parse sentences * simple-lexicon : Lexicon used to parse sentences * Parser.pm : Earley parser used to parse sentences. It uses the following Perl modules: * Node.pm : Represents one node in the parse tree * PhraseStruct.pm : Represents one phrase structure (consisting of a left hand side phrase and an array of right hand side children * Production.pm : Represents one production from the production table produced by the Earley parser * SentenceParse.pm : Uses the above modules to produce parses * README.txt : This README file 3. REQUIREMENTS: * Perl interpreter * Perl DBI module * Perl DBD::mySQL module * A mySQL account and database is also needed. The user has to log into his / her mySQL account at the beginning of each session. 4. USING NLSQL, FEATURES: NLSQL can be started by typing 'perl nlsql.pl' or just 'nlsql.pl' (with the right permissions set). First, log into your mySQL account with your username and password. NLSQL will display a list of available databases. Choose the database you want to use. NLSQL will display a list of available tables from the database and then the NLSQL prompt: NLSQL> At this prompt, you can type a request in English to interact with tables in the database. In the current version, the request has to fulfill the following criteria: a. It cannot contain any punctuation except the final period or question mark. b. It should be in lowercase, except the fieldnames and values and the first word. c. Except fieldnames and values, which have to nearly match the values in the database, other words should be correctly spelled. Supported SQL commands: The following functions are supported: (In the examples below, we assume that the database contains a table called people with the fields FirstName, LastName, Age and Sex.) a. Displaying data: Example: Show me everyone in people SQL command: SELECT * FROM people Example: Display the lastname of everyone in people where age is greater than 5 SQL command: SELECT LastName FROM people WHERE Age > '5' Example: what are the firstname and lastname from people such that firstname is Amrit and age is less than 10? SQL command: SELECT FirstName, LastName FROM people WHERE FirstName = 'Amrit' AND Age < '10' Example: show me everything from people where firstname starts with s SQL command: SELECT * FROM people WHERE FirstName LIKE '%s' Simple transformations are also allowed: Example: Show me everything that has the firstname Jamie in people SQL command: SELECT * FROM people WHERE FirstName = 'Jamie' b. Deleting data: Example: Erase everyone in people SQL command: DELETE FROM people Example: Delete things from people where firstname is Bob SQL command: DELETE FROM people WHERE FirstName = 'Bob' Example: delete everything such that firstname contains a from people SQL command: DELETE FROM people WHERE firstname LIKE '%a%' c. Sorting data: Example: Sort everything in people by Age SQL command: SELECT * FROM people ORDER BY Age ASC Example: Arrange everything in people by FirstName in alphabetical order SQL command: SELECT * FROM people ORDER BY FirstName ASC Example: Sort everything in ascending order by LastName in people SQL command: SELECT * FROM People ORDER BY LastName ASC Running SQL commands directly: SQL commands can be run directly by preceding the command with a colon (:). Example: :LOAD DATA INFILE '/tmp/people.txt' INTO TABLE people Getting help: Type 'help' or '?' to display short help information. Quitting: Type 'exit' or 'bye' or 'quit' to quit NLSQL. 5. DEVELOPMENT NOTES: Initially, we had thought of using Phoenix to do the parsing, but the main sticking point with Phoenix was that we could not have a lexical category that could be any random word (which is needed to represent a table or field name or a value in this case). So, we decided to use an Earley parser, written by Jamie for an earlier assignment. 6. LIMITATIONS AND KNOWN ISSUES: i. The lexicon is fairly limited in this version and contains only the most common words that might be used for these tasks. ii. The parser is case sensitive, which means that all the words (except table and field name, values and first word in the sentence) have to be lower case. The lexicon includes both lowercase and uppercase forms for probable first words. iii. The parser does not handle punctuation other than a final period or question mark. iv. Only displaying, deleting and sorting data are currently supported. v. Multiple sorting orders (such as order by FirstName and then LastName) are not supported.