(relational) database performance for a date/time point/interval

Tags: performance, sql, ms-access

So I am doing a project in Access SQL and it has come along nicely. I have learned a lot about Access and VBA and this site has been helpful in the process.

Now I am facing a performance problem, and since I have little experience with this kind of SQL work, I have come here for some thoughts.

I have a relational database of about 20 tables covering around 100 sections, each of which represents part of a route. The Access database is essentially a map on which I drew several routes (as lines) that can be coloured dynamically: the colour is determined by the specific question and calculated from the database.

Here is a picture which explains it better. You cannot click on lines in Access, so the buttons are set to be identical in colour and width to the lines and are clickable for more information.

The user can choose a date and it will display the progress of the route according to the question asked. Up to now, these questions were always binary "yes or no" (green or red).

I have found that, because of the complexity of the queries, I pretty much have to prepare a temporary database for each query at startup; otherwise it is not possible to scroll through dates smoothly.

So anyway here is my specific problem:

Each section of the route can be in a different phase (think construction) at a certain date, from "phase 0" to "done".

A new line is to be implemented which represents phases of a project. There are around 8 possible phases for all sections, which can happen at different times and - here is the thing - in a different order for each section AND not all phases happen on all sections.

What I have in the database are only starting dates, not ending dates, for each phase. The order of the phases pretty much has to be determined by the order of the starting dates. At least each phase can only happen once for each section, so there is that. As you can see, this is a bad starting point for this kind of performance-centric program.

I am certain it will involve one or several temporary databases. My ideas:

1. Aggregate all dates into one row of a new table. Since the number of phases is fixed, there are columns for each phase: whether it is needed, when it starts and when it ends. A loop then goes through each phase and checks which one the user's date falls into. So: "SectionID - phase1needed phase1start phase1end ....."
   Advantage:
   - One can confirm the data manually and display it well in secondary forms
   - It keeps the database small
   Disadvantage:
   - The loop has to go through (at worst) all phases to find the correct one

2. Calculate a new database which is just "IdSection - Date - Phase", with a phase calculated for each section and EVERY day in an interval.
   Advantage:
   - This keeps the runtime calculations to one query per section
   - Access should work with large amounts of data
   Disadvantage:
   - I cannot manually check whether what I did was correct for all sections
   - It will take long at startup, like really long
   - It will take a lot of entries in that database

Which of these would you prefer, or is there a different method altogether? I cannot really change much about the data points I have.

In short, I have to display time intervals for different phases, but in the database I only have starting points in time and no complete order of the phases.

Thank you for your thoughts. Any experience with this sort of thing will help.

asked Dec 21 '12 07:12

IMA

1 Answer

If I understand you properly, you have a series of data similar to the form:

Section 1, Phase 7, Start Date = 11/07/2012
Section 1, Phase 2, Start Date = 12/14/2012
Section 1, Phase 3, Start Date = 12/28/2012
Section 2, Phase 1, Start Date = 11/04/2012
Section 2, Phase 9, Start Date = 12/30/2012
Section 3, Phase 4, Start Date = 11/19/2012
Section 3, Phase 5, Start Date = 12/06/2012
Section 3, Phase 3, Start Date = 12/11/2012

and you want to answer a question like "What phase is each section in on 12/15/2012?", is that correct?

The answer in this case should look something like the form:

Section 1, Phase 2
Section 2, Phase 1
Section 3, Phase 3

In order to do this, I'll assume you have a table called SECTION_PHASES with the following fields:

SECTION    Number
PHASE      Number
START_DATE Date/Time

What you need to do is figure out the maximum start date for each section that happened before your current input date, because that is the most recently active phase before the next phase change. Once you do that, you can join that information back into your main table to determine what the phase was after that date.

First, create a query named SECTION_MAX_DATES with the following code in its SQL View:

SELECT [SECTION_PHASES].SECTION, Max([SECTION_PHASES].START_DATE) AS target_date
FROM SECTION_PHASES
WHERE [SECTION_PHASES].START_DATE<#12/15/2012#
GROUP BY [SECTION_PHASES].SECTION
ORDER BY [SECTION_PHASES].SECTION;

Once you have that query saved, you can join it as a subquery back to your original table. Now, make another query SECTION_PHASE_AT_DATE which includes your original table and the previous query, then enter the following code in its SQL View:

SELECT SECTION_PHASES.SECTION, SECTION_PHASES.PHASE, SECTION_PHASES.START_DATE
FROM SECTION_MAX_DATES INNER JOIN SECTION_PHASES
  ON (SECTION_MAX_DATES.target_date=SECTION_PHASES.START_DATE)
  AND (SECTION_MAX_DATES.SECTION=SECTION_PHASES.SECTION)
ORDER BY SECTION_PHASES.SECTION;

That query will give you the result you are after, if I understand your question correctly. There is no need to calculate the end dates, assuming that a new start date in a section marks the end of whatever phase was current in that section until then.
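
If the intervals are still wanted, for example to check the data by hand or to show it in a secondary form, the end dates can be derived on the fly instead of being stored. The following is only a minimal sketch under the same SECTION_PHASES assumption as above; END_DATE is a column name made up for illustration. The `--` lines have to be removed before pasting, because the Access SQL View does not accept comments.

```sql
-- Sketch only: delete these comment lines before pasting into Access.
-- SECTION_PHASES is the table assumed earlier in this answer;
-- END_DATE is a hypothetical derived column, not a stored field.
SELECT cur.SECTION,
       cur.PHASE,
       cur.START_DATE,
       -- The end of a phase is the earliest later start date in the same section.
       (SELECT Min(nxt.START_DATE)
        FROM SECTION_PHASES AS nxt
        WHERE nxt.SECTION = cur.SECTION
          AND nxt.START_DATE > cur.START_DATE) AS END_DATE
FROM SECTION_PHASES AS cur
ORDER BY cur.SECTION, cur.START_DATE;
```

For the sample data above, Section 1, Phase 7 would run from 11/07/2012 to 12/14/2012. The latest phase of each section gets a Null END_DATE because nothing follows it yet. With roughly 100 sections and 8 phases the table stays small, so the correlated subquery should not be a concern at that size, though that is worth measuring rather than taking on faith.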

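You'll still have a few edge cases to work out, like what happens if a section doesn't have a phase registered yet prior to the given date (with the INNER JOIN above, that section simply drops out of the result). Note also that the first query compares with <, so a phase that starts exactly on the chosen date does not count as active yet; use <= if it should. I'll also leave it to you to figure out how to parameterize the date in the WHERE clause of the first of the two queries, which is probably trivial for you given the progress you have made already! However, I think this is the SQL structure you were looking for to solve the data/calculation part of your problem.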
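
One way those loose ends might be handled is sketched here as a single parameterized query. It assumes a table SECTIONS with one row per section and a SECTION column, and a parameter called pTargetDate; both names are hypothetical and not taken from the question. As with any Access query, the `--` lines must be removed before pasting into the SQL View.

```sql
-- Sketch only: delete these comment lines before pasting into Access.
-- Assumptions: SECTIONS (one row per section) and pTargetDate are
-- hypothetical names; SECTION_PHASES is the table assumed in this answer.
PARAMETERS [pTargetDate] DateTime;
SELECT s.SECTION, cur.PHASE, cur.START_DATE
FROM SECTIONS AS s
LEFT JOIN (
    -- For each section, keep the row whose start date is the latest one
    -- strictly before the requested date.
    SELECT sp.SECTION, sp.PHASE, sp.START_DATE
    FROM SECTION_PHASES AS sp
    WHERE sp.START_DATE = (
        SELECT Max(x.START_DATE)
        FROM SECTION_PHASES AS x
        WHERE x.SECTION = sp.SECTION
          AND x.START_DATE < [pTargetDate]
    )
) AS cur ON s.SECTION = cur.SECTION
ORDER BY s.SECTION;
```

Because of the LEFT JOIN, a section with no phase before the requested date still appears, with Null in PHASE and START_DATE, so the form can colour it as "not started". If two phases of one section share the same start date, both rows come back, and the data would need a tie-breaker.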

answered Nov 15 '22 03:11

Lluluien
