Sample Video Frame
Exercise 15: GROUP BY And Counts
For this exercise we'll need a large dataset to demonstrate how to group rows by data in their columns. Right now you only have a few pets and an owner in your database. What we need is a huge number of pets and owners. What I've done is created a dump of a fake dataset that you can load.
Once you have that file, you simply load it into your database using -init
and you can start playing with the GROUP BY
style of query.
What Is GROUP BY
The best way to describe GROUP BY
is by comparing it to what a sociologist would need to analyze a dataset. A sociologist isn't necessarily interested in what any single row contains. A sociologist doesn't care about each person's eye color, height, or survey answers. What matters most is the summary of those variables (columns) into numeric groups. These groups might be:
Counts of how many people answered "yes" vs. "no" to a question.
Average height of people by geographic region.
Counts of each gender and who they voted for in an election.
A sociologist wants the database to go through all the rows, and calculate a summary number based on separate groups of rows. Another way to put this is the sociologist wants to have the rows separated out into buckets by one column, and then the contents of those buckets counted or averaged. This is what GROUP BY
does for the sociologist.
In your database of pets we want to calculate some counts of these groups to look for patterns just like a sociologist would. We might want to know the following:
How many pets are dead vs. alive?
At what ages do pets die?
How many people own that are alive vs. dead?
We can also start calculating averages, but in this exercise we'll focus on basic counts and groups to understand this concept.
Register for Learn SQL the Hard Way
Register today for the course and get the all currently available videos and lessons, plus all future modules for no extra charge.