SQL for Data AnalysisAdetola AdeyaBlockedUnblockFollowFollowingMay 27SQL (Structured Query Language) pronounced ‘sequel’, is used to modify and access data or information from a storage area called database.
Here are a few things the SQL can do;SQL can execute queries against a databaseSQL can retrieve data from a databaseSQL can insert records in a databaseSQL can update records in a databaseSQL can delete records from a database.
SQL is a database computer language designed for the retrieval and management of data in a relational database.
It is the standard language for relational database management systems.
A relational database management system (RDBMS) is the basis for SQL, and for all modern database systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
The relational database system contains one or more objects called tables and data is stored in these tables.
Tables are comprised of rows and columns.
SQL is effective for performing the types of aggregations that are commonly executed on Excel, but over much larger datasets and on multiple tables at the same time.
I am going to be explaining how I solved some challenges in the SQL Curriculum on the Gitgirl platform.
We were asked to complete 54 challenges on SQL on the Hackerrank platform.
Here is a description of an exercise that I found quite interesting.
The PADSThe challenge was for us to generate the following two result sets:Query an alphabetically ordered list of all names in OCCUPATIONS, immediately followed by the first letter of each profession as a parenthetical (i.
: enclosed in parentheses).
For example: AnActorName(A), ADoctorName(D), AProfessorName(P), and ASingerName(S).
Query the number of occurrences of each occupation in OCCUPATIONS.
Sort the occurrences in ascending order, and output them in the following format: There are a total of [occupation_count] [occupation]s.
where [occupation_count] is the number of occurrences of occupation in OCCUPATIONS and [occupation] is the lowercase occupation name.
If more than one Occupation has the same [occupation_count], they should be ordered alphabetically.
SOLUTIONFirst, we put in the SELECT statement is used to select data from a database and the data extracted here is the concatenation of the name column and a substring of the occupation column.
The SUBSTR function is for extracting a substring from a string.
An alias ‘Names’ is then given to the whole statement:SELECT CONCAT(name,’(‘,SUBSTR(occupation,1,1),’)’) as NamesNext, we use the FROM clause that produces the table needed, which in this case is the OCCUPATIONS table:FROM OCCUPATIONSLastly, we will sort the result-set in ascending order using the ORDER BY clause and we order by Names and close the statement with a semicolon showing the end of the query:ORDER BY Names;note: The ORDER BY clause sorts the records in ascending order by default.
For the second query, the same CONCAT function is used to add a line of string and the Count of records in the population column.
SELECT CONCAT(‘There are a total of ‘, COUNT(occupation),’ ‘, LOWER(occupation),’s.
’)Next, we use the FROM clause that produces the table needed, which in this case is the OCCUPATIONS table:FROM OCCUPATIONSAfter that, the GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data: occupation, into groups.
GROUP BY occupationWe then sort the result-set in ascending order using the ORDER BY clause and we order by COUNT of population and the population column and close the statement with a semicolon indicating the end of the query:ORDER BY COUNT(occupation), occupation;A picture showing Queries 1 and 2Lastly, we run both queries together.
The output of both queriesIn this short tutorial, we were able to identify a few clauses and utilize them to retrieve, modify, sort and update data.
I assumed the understanding of the basic knowledge of SQL in this tutorial but if you want to learn more or refresh your memory check this out.
PS: I hope this tutorial helps you to get started with basic data analysis using any dataset you come across and practice reading, exploring, analysing, using SQL.
My Hackerrank profile can be found here and my Gitgirl projects can be found on my Github.
Please comment below if you have thoughts or questions concerning anything I have explained.
Till next time!.