Maybe this was the only way you knew how to solve this problem.
Take a moment and think about how you could solve this problem with the basic SQL tools you have.
Think about what the SQL statement is actually askingSELECT day ,SUM(Profit)FROM( Select t1.
Profit FROM f_daily_sales t1 JOIN f_daily_sales t2On t1.
day) t3Group by dayIn this case you can use the self join with a greater than or equal to statement creates a table like the one belowAs you can see you will be able to group on the t1.
day field and get a rolling sum effect.
One of the major downsides depends on the amount of data.
Because depending on how the data is structured and how far back you would like to go, this method could cause a system to run out of memory.
If by any chance this is a problem.
Here is another thought.
You could create a summary table that basically aggregates the total per day.
This depends on you are partitioning the rolling sum.
For instance, this will work well as long as you know you will constantly be reporting the rolling sum on a monthly and yearly basis.
This would lock your design in, but add both speed and a reduction of the computer when you actually query.
Again this is more of a design question.
What is the right answer depending on your overall goals?Joining consecutive eventsAnother question you might be asked is to provide insight into consecutive events for patients, customers, or other entities.
For instance, you might want to calculate the number of days between the time a customer visits a store, or your website.
Maybe you want a simple flag that dictates whether or not a patient is readmitted in x days after each their initial visit.
Well at a basic level, these have a similar goal.
You need to know how many days are between two different transactions.
Most transactions are lined up row by row, so you can’t simply go to the next row to figure out this information.
This is where an analytic function can work again.
Using the row number function you are able to partition the data by each customer and patient and add a row number to each event.
What would like below?Using this as your base data you can self join this table to the next row using a +1 on the row number row to join on the following row.
Select Customer_ID ,row_number() over (partition by Customer_ID order by date) r_num ,date ,total_paidInto #Customer_TransactionsFrom Customer_TransactionsSelect t1.
date) –this will calculate the difference in daysFrom #Customer_Transactions t1Join #Customer_Transactions t2Join t1.
Customer_ID = t2.
r_num = t2.
r_numWith this self-joined the table and each row can be looked at side by side you can tell more information.
You can answer questions like, how many days were in between each event on average, or how many times did customers return in less than 15 days.
This can be a great starting point into some analysis focused on retention or policy improvement.
You can start to see where your problem areas are, and possibly notice trends between stores or doctors based on they perform.
Another approach that some people like to take is looping through the data.
This can get a little more complex and is usually slower, but it is another solution we won’t go over in this post.
These were two examples of solving SQL problems.
We will continue to post various problems and different ways you can solve them.
We might even re-approach the same problems above.
That is what is great about SQL, depending on how your data is laid out and your limitations force you to use different solutions.
You could have worked somewhere with SQL Server and get accustomed to temporary tables that allow you to put indexes on them.
Then you go to the database that no longer allows temp tables and your performance goes way down.
We would love to come and help increase your teams analytical and SQL abilities.
Please feel free to reach out!.Also, please let us know if you have any questions or problems you are working through.
We would love to poke around on new problems!.