SELECT column_name FROM table_name
o retrieve information from a table
o each table is like a spreadsheet
o can select multiple columns like SELECT c1, c2 FROM table_1
o can select all the columns from a table like SELECT * FROM table_1
o However, SELECT * returns every column, which increases traffic between the database
server and the application and can slow down the retrieval of results.
o A semicolon denotes the end of a query
SELECT DISTINCT column FROM table
o You can write () around the column for readability, like
SELECT DISTINCT (column) FROM table. The () are optional: DISTINCT always
applies to the whole list of selected columns.
o Return unique values from that column
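A small sketch of DISTINCT, run through Python's built-in sqlite3 module as a stand-in database. The film table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (title TEXT, rating TEXT, rental_rate REAL)")
conn.executemany(
    "INSERT INTO film VALUES (?, ?, ?)",
    [("A", "PG", 0.99), ("B", "PG", 0.99), ("C", "PG", 2.99), ("D", "R", 0.99)],
)

# Unique values of a single column.
ratings = conn.execute(
    "SELECT DISTINCT rating FROM film ORDER BY rating"
).fetchall()

# With two columns, DISTINCT keeps each unique (rating, rental_rate) pair.
pairs = conn.execute(
    "SELECT DISTINCT rating, rental_rate FROM film ORDER BY rating, rental_rate"
).fetchall()

print(ratings)  # [('PG',), ('R',)]
print(pairs)    # [('PG', 0.99), ('PG', 2.99), ('R', 0.99)]
```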
SELECT COUNT(column) FROM table
o Returns the number of input rows that match the conditions of a query
o COUNT(*) counts every row. COUNT(column) counts only the rows where that
column is not NULL, so the two agree only when the column has no NULLs.
o COUNT needs () because it is a function and must be called on something
(a column or *)
o For columns without NULLs, the count is the same regardless of the column
chosen, as every column in a table has the same number of rows.
o Thus, COUNT(*) by itself simply returns the number of rows in a table
o SELECT COUNT(DISTINCT name) FROM table returns the number of distinct values
in that column. This returns just a number, not the unique values themselves.
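The difference between the three forms of COUNT can be sketched as follows, using Python's built-in sqlite3 module as a stand-in database. The customer table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [
        ("Ann", "a@x.com"),
        ("Bob", None),        # NULL email
        ("Ann", "ann2@x.com"),
        ("Cy", None),         # NULL email
    ],
)

# COUNT(*) counts rows, COUNT(email) skips NULLs,
# COUNT(DISTINCT name) counts unique non-NULL names.
total, with_email, unique_names = conn.execute(
    "SELECT COUNT(*), COUNT(email), COUNT(DISTINCT name) FROM customer"
).fetchone()

print(total, with_email, unique_names)  # 4 2 3
```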
WHERE
o Specify conditions on columns for the rows to be returned
o SELECT column1, column2 FROM table WHERE conditions;
o The WHERE clause appears immediately after the FROM clause of the SELECT
statement.
o The conditions are used to filter the rows returned from the SELECT statement.
o Comparison operators (=, >, <, >=, <=, <> or !=)
Compare a column value to something.
o Logical operators
Allow us to combine multiple comparison operators
AND, OR, NOT
o E.g.
SELECT name, choice FROM table WHERE name = 'David'
String literals are written in single quotes. String comparison is
case-sensitive, so 'David' does not match 'david'.
SELECT name, choice FROM table WHERE name = 'David' AND choice = 'Red'
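A minimal sketch of the WHERE examples above, run with Python's built-in sqlite3 module as a stand-in database. The survey table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (name TEXT, choice TEXT)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [("David", "Red"), ("David", "Blue"), ("david", "Red"), ("Erin", "Red")],
)

# Single condition: the lowercase 'david' row is not returned.
davids = conn.execute(
    "SELECT name, choice FROM survey WHERE name = 'David' ORDER BY choice"
).fetchall()

# Two conditions combined with AND.
david_red = conn.execute(
    "SELECT name, choice FROM survey WHERE name = 'David' AND choice = 'Red'"
).fetchall()

print(davids)     # [('David', 'Blue'), ('David', 'Red')]
print(david_red)  # [('David', 'Red')]
```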
ORDER BY
o Can be used to sort rows based on a column value, in either ascending or
descending order. Alphabetical for string-based columns or numerical order for
numeric columns.
o SELECT column_1,column_2 FROM table ORDER BY column_1 ASC/DESC
o ORDER BY is at the end of a query since we want to do selection and filtering first
before sorting.
o If neither ASC nor DESC is given after ORDER BY, ASC is used by default.
o You can also ORDER BY multiple columns, which makes sense when one column has
duplicate entries. E.g. SELECT company,name,sales FROM table ORDER BY
company,sales (sorts by company first, then by sales, both in ascending order).
o Can also be SELECT store_id,first_name,last_name FROM customer ORDER BY
store_id DESC, first_name ASC
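Sorting on several columns can be sketched like this, using Python's built-in sqlite3 module as a stand-in database. The sales_table name and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (company TEXT, name TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?, ?)",
    [
        ("Apple", "Ann", 300),
        ("Google", "Bob", 200),
        ("Apple", "Cy", 100),
        ("Google", "Dee", 500),
    ],
)

# Both columns ascending (the default): company first, then sales within company.
asc_rows = conn.execute(
    "SELECT company, name, sales FROM sales_table ORDER BY company, sales"
).fetchall()

# Mixed directions: company descending, sales ascending within each company.
mixed_rows = conn.execute(
    "SELECT company, name, sales FROM sales_table "
    "ORDER BY company DESC, sales ASC"
).fetchall()

print(asc_rows)
print(mixed_rows)
```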
LIMIT
o Allows us to limit the number of rows returned for a query.
o It goes at the very end of a query and is the last clause to be executed.
o SELECT * FROM payment ORDER BY payment_date DESC LIMIT 5;
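A short sketch of ORDER BY combined with LIMIT to fetch the most recent rows, using Python's built-in sqlite3 module as a stand-in database. The payment rows here are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment (payment_id INTEGER, amount REAL, payment_date TEXT)"
)
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?)",
    [
        (1, 2.99, "2007-02-14"),
        (2, 0.99, "2007-02-17"),
        (3, 4.99, "2007-02-15"),
        (4, 1.99, "2007-02-20"),
    ],
)

# Sort newest first, then keep only the first two rows of the sorted result.
latest_ids = [
    row[0]
    for row in conn.execute(
        "SELECT payment_id FROM payment ORDER BY payment_date DESC LIMIT 2"
    )
]

print(latest_ids)  # [4, 2]
```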
BETWEEN operator
o Can be used to match a value against a range of values, value BETWEEN low
AND high.
o value BETWEEN low AND high is the same as
value >= low AND value <= high
o value NOT BETWEEN low AND high is the same as
value < low OR value > high
o It can also be used with dates. Note that you need to format dates in the ISO
8601 standard format, which is YYYY-MM-DD. E.g. date BETWEEN '2007-01-01'
AND '2007-02-01'.
When the column also includes timestamp information, be careful with
BETWEEN: a bare date such as '2007-02-01' is read as 2007-02-01 0:00, so
rows later on that day fall outside the range. Using >= with the start
date and < with the day after the end date avoids this.
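The timestamp pitfall can be sketched as follows, using Python's built-in sqlite3 module as a stand-in database. SQLite has no dedicated timestamp type, so the values here are ISO 8601 text compared as strings. For the made-up rows below this gives the same result a real timestamp comparison would, but that is an assumption of this sketch and not a general rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (payment_id INTEGER, payment_date TEXT)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [
        (1, "2007-01-15 09:00:00"),
        (2, "2007-02-01 10:30:00"),  # on the end date, but after 0:00
        (3, "2007-02-02 08:00:00"),
    ],
)

# BETWEEN with bare dates: the upper bound acts as 2007-02-01 0:00,
# so payment 2 (10:30 on that day) is left out.
between_ids = [
    row[0]
    for row in conn.execute(
        "SELECT payment_id FROM payment "
        "WHERE payment_date BETWEEN '2007-01-01' AND '2007-02-01' "
        "ORDER BY payment_id"
    )
]

# Half-open range: >= start and < the day after the end date
# includes all of 2007-02-01.
half_open_ids = [
    row[0]
    for row in conn.execute(
        "SELECT payment_id FROM payment "
        "WHERE payment_date >= '2007-01-01' AND payment_date < '2007-02-02' "
        "ORDER BY payment_id"
    )
]

print(between_ids)    # [1]
print(half_open_ids)  # [1, 2]
```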
IN operator
o When you want to check for multiple possible value options, e.g. if a username
shows up IN a list of known names. We can use the IN operator to create a
condition that checks to see if a value is included in a list of multiple options.
o Value IN (option1,option2,…,option_n)
o E.g. SELECT color FROM table WHERE color IN ('red','blue')
o Can also be SELECT color FROM table WHERE color NOT IN ('red','blue')
o SELECT * FROM payment WHERE amount IN (0.00,1.98,1.99);
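A minimal sketch of IN and NOT IN, using Python's built-in sqlite3 module as a stand-in database. The shirt table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shirt (color TEXT)")
conn.executemany(
    "INSERT INTO shirt VALUES (?)",
    [("red",), ("blue",), ("green",), ("red",)],
)

# Rows whose color is one of the listed options.
in_colors = [
    row[0]
    for row in conn.execute(
        "SELECT color FROM shirt WHERE color IN ('red', 'blue') ORDER BY color"
    )
]

# Rows whose color is none of the listed options.
not_in_colors = [
    row[0]
    for row in conn.execute(
        "SELECT color FROM shirt WHERE color NOT IN ('red', 'blue')"
    )
]

print(in_colors)      # ['blue', 'red', 'red']
print(not_in_colors)  # ['green']
```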
LIKE operator
o Allows us to perform pattern matching against string data with the use of
wildcard characters:
o Percent %
Matches any sequence of characters
o Underscore _
Matches any single character
o E.g.
All names that begin with an 'A' -> WHERE name LIKE 'A%'
All names that end with an 'a' -> WHERE name LIKE '%a'
LIKE is case-sensitive whereas ILIKE (a PostgreSQL extension) is
case-insensitive.
o Underscore allows us to replace just a single character
E.g. title LIKE 'Mission Impossible _'
You can use multiple underscores, e.g. to match 'Version#A4' use WHERE value
LIKE 'Version#__'
o Can combine pattern matching operators
E.g. WHERE name LIKE '_her%' matches Cheryl, Theresa, Sherri
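The wildcard patterns above can be tried out with Python's built-in sqlite3 module as a stand-in database. One caveat: SQLite's LIKE ignores ASCII case by default, so it behaves more like ILIKE than like the case-sensitive LIKE described in these notes. The made-up rows below are chosen so that case does not affect the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?)",
    [("Cheryl",), ("Theresa",), ("Sherri",), ("Adam",), ("Anna",), ("Bob",)],
)
conn.execute("CREATE TABLE release (value TEXT)")
conn.executemany(
    "INSERT INTO release VALUES (?)",
    [("Version#A4",), ("Version#A",), ("Version#A45",)],
)


def names_like(pattern):
    """Return the names matching a LIKE pattern, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM person WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [row[0] for row in rows]


starts_with_a = names_like("A%")   # % matches any sequence of characters
ends_with_a = names_like("%a")
her_names = names_like("_her%")    # _ matches exactly one character

# Two underscores match exactly two characters after 'Version#'.
two_char_versions = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM release WHERE value LIKE 'Version#__'"
    )
]

print(starts_with_a)      # ['Adam', 'Anna']
print(ends_with_a)        # ['Anna', 'Theresa']
print(her_names)          # ['Cheryl', 'Sherri', 'Theresa']
print(two_char_versions)  # ['Version#A4']
```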
Aggregate functions
o Take multiple inputs and return a single output.
o Common aggregate functions:
AVG() – returns the average value as a floating point number, often with
many decimal places; use ROUND(value, number_of_decimals) to set the
precision after the decimal point
COUNT() – returns the number of rows; since it only counts rows, we can
just use COUNT(*)
MAX() – returns maximum value
MIN() – returns minimum value
SUM() – returns the sum of all values
o Aggregate function calls happen only in the SELECT clause or the HAVING clause
(and in ORDER BY when sorting by an aggregate).
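A short sketch of the common aggregate functions, using Python's built-in sqlite3 module as a stand-in database. The payment amounts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (amount REAL)")
conn.executemany("INSERT INTO payment VALUES (?)", [(1.0,), (2.0,), (4.0,)])

# Each aggregate collapses all three input rows into a single value.
row_count, lowest, highest, total, rounded_avg = conn.execute(
    "SELECT COUNT(*), MIN(amount), MAX(amount), SUM(amount), "
    "ROUND(AVG(amount), 2) FROM payment"
).fetchone()

# AVG alone would give 7/3 with many decimals; ROUND(..., 2) trims it.
print(row_count, lowest, highest, total, rounded_avg)
```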
GROUP BY
o Allow us to aggregate data and apply functions to better understand how data is
distributed per category.
o We need to choose a categorical column to GROUP BY. Categorical columns are
non-continuous. They can still be numerical like Class 1, Class 2 etc.
o SELECT category_col, AGG(data_col) FROM table GROUP BY category_col
o The GROUP BY clause must appear right after the FROM or WHERE clause.
o SELECT category_col, AGG(data_col) FROM table WHERE category_col != 'A'
GROUP BY category_col
o In the SELECT statement, columns must either have an aggregate function or be
in the GROUP BY call.
o WHERE should not refer to the aggregation result, because WHERE is applied
before the aggregation happens. Use HAVING to filter on an aggregate.
o SELECT company, SUM(sales) FROM finance_table GROUP BY company ORDER BY
SUM(sales). If you want to sort result based on the aggregate, make sure to
reference the entire function.
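The GROUP BY rules above can be sketched with Python's built-in sqlite3 module as a stand-in database. The finance_table rows are made up, and the HAVING query is only a brief illustration of filtering on an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE finance_table (company TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO finance_table VALUES (?, ?)",
    [
        ("Apple", 300),
        ("Google", 200),
        ("Apple", 100),
        ("Google", 500),
        ("Amazon", 50),
    ],
)

# Total sales per company, sorted by the aggregate itself.
totals = conn.execute(
    "SELECT company, SUM(sales) FROM finance_table "
    "GROUP BY company ORDER BY SUM(sales)"
).fetchall()

# WHERE filters individual rows before they are grouped.
without_amazon = conn.execute(
    "SELECT company, SUM(sales) FROM finance_table "
    "WHERE company != 'Amazon' GROUP BY company ORDER BY SUM(sales)"
).fetchall()

# HAVING filters the groups after aggregation.
big_sellers = conn.execute(
    "SELECT company, SUM(sales) FROM finance_table "
    "GROUP BY company HAVING SUM(sales) > 500"
).fetchall()

print(totals)          # [('Amazon', 50), ('Apple', 400), ('Google', 700)]
print(without_amazon)  # [('Apple', 400), ('Google', 700)]
print(big_sellers)     # [('Google', 700)]
```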