
I am running a database for log analysis. At the moment I use a MySQL database and the table for my analysis looks like this:

  • UUID
  • REQUEST_ID
  • REQUEST_TIMESTAMP
  • RESPONSE_TIMESTAMP
  • RUNTIME
  • SERVER_NAME
  • EXSIGHT_NAME

I use this table to create views for each entry, for a 5-minute aggregation, and for a daily aggregation. I am inserting about 400,000 entries a day. At the moment there are about 70 million rows in this table.
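
For illustration, the 5-minute aggregation view looks roughly like this (a simplified sketch; the view name, the bucketing expression, and the aggregated columns are illustrative, not the exact definition):

-- Simplified sketch of a 5-minute aggregation view over REQUEST_LOGS
CREATE VIEW REQUEST_LOGS_5MIN AS
SELECT
    -- round REQUEST_TIMESTAMP down to the start of its 5-minute bucket
    FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(REQUEST_TIMESTAMP) / 300) * 300) AS INTERVAL_START,
    SERVER_NAME,
    EXSIGHT_NAME,
    COUNT(REQUEST_ID)                       AS Anzahl,
    SUM(RUNTIME) / COUNT(REQUEST_ID) / 1000 AS dStime
FROM REQUEST_LOGS
GROUP BY
    FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(REQUEST_TIMESTAMP) / 300) * 300),
    SERVER_NAME,
    EXSIGHT_NAME;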

My actual problem is that my queries are getting slow: both my insert/update queries and my aggregation queries.

So I created a second table for my daily aggregation. Once a day a job runs to aggregate the previous day. A second job deletes all entries older than 30 days from the original table.
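
Roughly, the two jobs look like this (DAILY_REQUEST_STATS and its columns are placeholder names for the new summary table, not the exact DDL):

-- Nightly job: aggregate yesterday's rows into the summary table
-- (DAILY_REQUEST_STATS and its columns are placeholder names)
INSERT INTO DAILY_REQUEST_STATS
    (STAT_DATE, SERVER_NAME, EXSIGHT_NAME, REQUEST_COUNT, AVG_RUNTIME_S)
SELECT
    DATE(REQUEST_TIMESTAMP),
    SERVER_NAME,
    EXSIGHT_NAME,
    COUNT(REQUEST_ID),
    SUM(RUNTIME) / COUNT(REQUEST_ID) / 1000
FROM REQUEST_LOGS
WHERE REQUEST_TIMESTAMP >= CURDATE() - INTERVAL 1 DAY
  AND REQUEST_TIMESTAMP <  CURDATE()
GROUP BY DATE(REQUEST_TIMESTAMP), SERVER_NAME, EXSIGHT_NAME;

-- Second job: purge raw rows older than 30 days
DELETE FROM REQUEST_LOGS
WHERE REQUEST_TIMESTAMP < CURDATE() - INTERVAL 30 DAY;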

My question: Is this the right approach, or would a different table structure or even another database (e.g. NoSQL, a graph database, etc.) be better?

[EDIT]

Select for daily aggregation:

select 
        date_format(REQUEST_TIMESTAMP,'%Y-%m-%d 00:00:00') as INTERVAL_START,
        null as INTERVAL_END,
        count(REQUEST_ID) as Anzahl,
        FORMAT((sum(RUNTIME)/count(REQUEST_ID))/1000,0) as dStime 
    from REQUEST_LOGS 
        where EXSIGHT_NAME like (case when '<EXSIGHT>' = 'alle' then '%' else '<EXSIGHT>' end) 
        and SERVER_NAME like '<SERVER>' 
        and (REQUEST_TIMESTAMP between '<FILTERFROMDATE>' and '<FILTERTODATE>')
    group by INTERVAL_START
    order by INTERVAL_START desc

Indexes:

(screenshot of the table's index list, not reproduced here)

  • My first guess is that a few properly placed indexes will help you out better than changing engines. Could you post any existing indexes as well as what your aggregate query looks like and what it groups on? Commented Feb 14, 2017 at 14:06
  • Added information about indexes and an example query.
    – LStrike
    Commented Feb 15, 2017 at 15:01
  • Please provide SHOW CREATE TABLE; something is confusing in the Pos column of the index list.
    – Rick James
    Commented Feb 15, 2017 at 17:21
  • I'm confused. You say "daily aggregation" as if you do the same query each day. Yet there seem to be parameters passed to the query. I want to move you toward a constant set of aggregation queries each day; put the results in Summary table(s); then get the real aggregation from the summary (see the sketch after these comments). It will be a little more work, but so fast that you can build a web page to let users get whatever they want (within limits).
    – Rick James
    Commented Feb 15, 2017 at 17:29
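
A sketch of what the last comment suggests: run the filtered daily report against the summary table instead of REQUEST_LOGS (DAILY_REQUEST_STATS and its columns are the placeholder names from the sketch above, not an existing table):

-- Answer the daily report from the small summary table instead of the raw table
SELECT
    STAT_DATE          AS INTERVAL_START,
    SUM(REQUEST_COUNT) AS Anzahl,
    -- weighted average runtime across the selected servers/applications
    SUM(REQUEST_COUNT * AVG_RUNTIME_S) / SUM(REQUEST_COUNT) AS dStime
FROM DAILY_REQUEST_STATS
WHERE STAT_DATE BETWEEN '<FILTERFROMDATE>' AND '<FILTERTODATE>'
  AND SERVER_NAME LIKE '<SERVER>'
  AND ('<EXSIGHT>' = 'alle' OR EXSIGHT_NAME LIKE '<EXSIGHT>')
GROUP BY STAT_DATE
ORDER BY STAT_DATE DESC;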

1 Answer


You need several of my blogs:

How big is the table? Which Engine are you using? What is the value of innodb_buffer_pool_size? It should be about 70% of RAM.
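
For reference, the engine, the table size, and the buffer pool setting can all be checked with standard MySQL statements:

-- Table definition, including the storage engine and all indexes
SHOW CREATE TABLE REQUEST_LOGS;

-- Current buffer pool size, in GB
SELECT @@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gb;

-- Approximate on-disk size of the table (data + indexes), in GB
SELECT (DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024 / 1024 AS table_gb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'REQUEST_LOGS';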

  • At the moment I don't need the UUID. It is created in my Java class and not in the database. Additionally, it is not included in an index. I thought that my UUID would be processed like a common varchar, or am I wrong?
    – LStrike
    Commented Feb 15, 2017 at 15:03
  • When a table gets so big that many actions are I/O-bound, performance suffers. This applies regardless of the "database type" in use. Getting rid of the UUID would be a small step in that direction.
    – Rick James
    Commented Feb 15, 2017 at 17:39
