Tuesday, January 09, 2007

An SQL Introduction and Tutorial

If you need to freshen up your basics on SQL or if you need to train a new employee on a gradient approach to SQL, the following is a suggested training line-up:

First, clear up each of the following terms:

SQL
SELECT
FROM
AS
WHERE
LIKE
IN
BETWEEN
AND
OR
INTO
GROUP BY
ORDER BY
HAVING
COUNT

Next, go through this online interactive tutorial. This has the student walk through real examples using an online SQL tool so you can practice what you learn online without having to go to another computer where your database is.

An Interactive SQL Tutorial

In learning any new subject it is extremely important to start with the key words and to have these cleared up first, before carrying on with the material. This is covered in Hubbard's, Study Technology.

Here are good basic definitions for each of these SQL terms:

SQL
SQL stands for Structured Query Language and is a computer language that allows you to ask questions (query) or interact with your database in a structured manner.

SELECT
The SELECT statement indicates that you wish to query and retrieve information from a database. The select_list specifies the type of information (or column names) to retrieve. The keyword ALL or the wildcard character asterisk (*) could be used to signify all columns in the table. Also, the keyword DISTINCT could be used to discard duplicate records and retrieve only the unique records for the specified columns.

INTO
The INTO keyword indicates that you are inserting records into another table. For example in Microsoft Access you can create a new table and insert all records from another table by using the following: SELECT * INTO newtablename FROM tablename

FROM
The FROM clause is required in any SELECT statement. The FROM clause specifies the specific tables to retrieve data from.

AS
The AS keyword is used in the SELECT or the FROM clause when you want to refer to a table or column as something else. This is very handy if you have long table or column names and you want to use a shorter name to refer to them. This is commonly used when you are joining multiple tables together and do not want to rewrite the full table name each time you are referring to that table. For example:

SELECT a.column, b.column
FROM table_name_a AS a, table_name_b AS b
WHERE a.ID = b.ID

This is also commonly used to have a nicer display of your results. For example if you have a column called "field17" and you want to display it as "COUNTRY" you can use the following:

SELECT field17 as COUNTRY
FROM table_name

WHERE
The WHERE clause limits the results to those records (or rows) that meet some particular conditions (optional).

LIKE
LIKE is another SQL keyword that is used in the WHERE clause. LIKE allows you to search based on a pattern rather than specifying exactly what is desired The syntax for it is as follows: SELECT * FROM table WHERE column LIKE 'abc%'
The wildcard '%' (in Microsoft SQL and Oracle SQL) or '*' (in Access SQL) indicates any character and can be used in any location within the quotes. This will return any records that match 'abc' as the first three letters and then anything else after that such as 'abc123', 'abcdef', etc.

IN
The IN keyword is used in the WHERE clause when you want to select all records that have specific values such as if you wanted all customers that have either blue, green or hazel eyes you would use the following:

SELECT *
FROM customers
WHERE eye_color IN ('blue', 'green', 'hazel')

You can not use wildcards with the IN keyword.

BETWEEN
If you want to specify a range of values you can use the BETWEEN keyword. This will give you anything between value a and b values. For example, if you wanted any zip codes that were between '90210' and '91420' you would use the following:

SELECT *
FROM address_list
WHERE zip_code BETWEEN '90210' AND '91420'

AND
The AND keyword is used when you want to add additional criteria to your WHERE clause or as part of the BETWEEN clause.

OR
The OR keyword is used when you want either one criteria or another, but not both. Be very careful when using the OR keyword as if you have AND and OR in the same WHERE clause your result will probably not be what you expect, unless you use parentheses (). For example, if you want all people who with blue or hazel eyes that live in California, you would use the following:

SELECT *
FROM address_list
WHERE (eye_color = 'blue' OR eye_color = 'hazel') AND state = 'CA'

If you don't use the parentheses your query will return people in California, people with blue eyes or people with hazel eyes. So remember your math class where you learned to group parts of the equation to avoid confusion (a * b) + y ...

GROUP BY
The GROUP BY clause specifies if you are grouping (or aggregating) any of the columns in your SELECT statement. For example if you want to display a count of all addresses by city you would use the GROUP BY clause to group the results by City. This following example will give you a breakdown by city and sort it by most to least:

SELECT city, COUNT(*)
FROM table
GROUP BY city
ORDER BY COUNT(*) DESC

HAVING
The HAVING clause specifies the specific conditions to group by (optional). For example you may want to group your results by City but only display those Cities that have more than 10 matches. You would do this by saying "HAVING COUNT(*) > 10". HAVING always comes after the GROUP BY clause.

ORDER BY
The ORDER BY clause specifies whether to output the query result in ascending or descending order. This is often used to sort your results in alphabetical or numerical sequence.

COUNT
The COUNT keyword is used in the SELECT or HAVING clauses. This allows you to give a total count of something. This is also referred to as an "aggregate". This means you are grouping things together and displaying aggregated data on the group instead of listing each individual record. Other aggregate keywords are SUM, MIN, MAX, etc. COUNT gives you the count of records, SUM gives you the SUM of the values in a column, MIN gives you the minimum value, MAX gives you the maximum value. Here is an example of the COUNT usage:

SELECT COUNT(*)
FROM table_name
WHERE last_name = 'SMITH'

This will give you a count of all records with the last name of 'SMITH'. Notice the (*) after the keyword. This means you want to give a count of all records meeting the criteria in your WHERE clause. But you may also see COUNT(1) which also means give a total number of the records. The COUNT keyword is very useful in giving reports or breakdowns of various columns. For example if you want a breakdown of all records you have for each eye_color (if you have this column in your table) you could use the following:

SELECT eye_color, COUNT(1)
FROM table_name
GROUP BY eye_color
ORDER BY COUNT(1) desc

Notice the GROUP BY and ORDER BY keywords used as well. This will show you the count by eye_color and will sort it by the the highest number to the lowest.
Here is another simple introduction to SQL.