Instrument control development / INSTRM-2696

Identify source of updateTelStatus timeouts

    Details

    • Type: Task
    • Status: Open
    • Priority: Normal
    • Resolution: Unresolved
    • Component/s: None
    • Labels: None

      Description

      Although INSTRM-2686 and INSTRM-2695 alleviated it slightly, the updateTelStatus command is still sometimes slow enough to time out. The problem is in the opdb INSERT itself, but we haven't pinned it down beyond that. Turning on PostgreSQL logging and/or psycopg2/sqlalchemy logging might help, though either could cause trouble of its own.

      We did turn on log_min_duration_statement but got no hits. I'm not convinced that setting actually measures what we are looking for: the end-to-end wall time between the statement reaching the server process and leaving it. Still, I would try setting log_statement='mod' and lowering the log_min_duration_statement threshold in any case. Kiyoto Yabe?
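
A client-side counterpart to log_min_duration_statement could be sketched as below. It is only an illustration: the logger name, the default threshold, and the `timed_statement` helper are hypothetical and not part of any existing actor code. It wraps whatever call issues the INSERT and logs the wall time seen by the client, which includes driver and network time as well as server processing.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; use whatever the actor already logs to.
logger = logging.getLogger("opdb.timing")


@contextmanager
def timed_statement(label, threshold_s=0.5, log=logger):
    """Log the client-side wall time of the enclosed block if it is slow.

    threshold_s plays the role that log_min_duration_statement plays on
    the server, but the time is measured from the client side.
    """
    start = time.monotonic()
    try:
        yield
    finally:
        # Runs even if the enclosed statement raises (e.g. on a timeout).
        elapsed = time.monotonic() - start
        if elapsed >= threshold_s:
            log.warning("%s took %.3f s (threshold %.3f s)",
                        label, elapsed, threshold_s)


# Usage sketch (the wrapped call stands in for the real opdb INSERT):
#     with timed_statement("updateTelStatus INSERT"):
#         do_the_insert()
```

If the log handler's format includes timestamps, these lines can be lined up against the server log entries for the same statements.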

      WAL checkpointing (on a decently loaded server with basically one logical spindle) is a concern, both on general principle and from the logs. I just don't know exactly how that IO/buffering affects statement processing. You can certainly see the effect of observing activity, but the logs also show much longer delays than the ones at the times when we see failures in gen2, so I'm not sure.
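
One way to test the checkpoint hypothesis is to check whether the timeout timestamps actually fall inside checkpoint windows. The sketch below assumes the two sets of timestamps have already been extracted (for example from the "checkpoint starting" and "checkpoint complete" lines that PostgreSQL writes when log_checkpoints is on, and from the gen2 failure logs); the function name and the `slack` parameter are hypothetical.

```python
from datetime import datetime, timedelta


def timeouts_near_checkpoints(timeouts, checkpoints,
                              slack=timedelta(seconds=5)):
    """Split timeout timestamps by whether they fall in a checkpoint window.

    timeouts:    iterable of datetime, when updateTelStatus timed out.
    checkpoints: iterable of (start, end) datetime pairs.
    slack:       widens each window on both sides, to allow for clock
                 skew between the client and server logs.
    Returns (inside, outside), two lists of datetimes.
    """
    windows = [(start - slack, end + slack) for start, end in checkpoints]
    inside, outside = [], []
    for t in timeouts:
        hit = any(lo <= t <= hi for lo, hi in windows)
        (inside if hit else outside).append(t)
    return inside, outside
```

If most timeouts land in the `outside` list, checkpointing alone probably does not explain them.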

      It would be nice to be able to turn psycopg2/sqlalchemy logs on/off at runtime: correlating those client-side times with the server times could clarify things. Wilfred Gee?
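
For the SQLAlchemy side, a runtime toggle might look like the sketch below. SQLAlchemy emits its SQL statements on the standard "sqlalchemy.engine" logger (INFO for statements, DEBUG for result rows as well), so adjusting that logger's level is enough. The `set_sql_logging` helper is hypothetical, as is whatever command or signal would end up calling it.

```python
import logging

# The logger SQLAlchemy uses for emitted SQL statements.
SQL_LOGGER_NAME = "sqlalchemy.engine"


def set_sql_logging(enabled, verbose=False):
    """Turn SQLAlchemy statement logging on or off at runtime.

    enabled: True logs each SQL statement (INFO level).
    verbose: with enabled=True, also log result rows (DEBUG level).
    Returns the level that was set.
    """
    if enabled:
        level = logging.DEBUG if verbose else logging.INFO
    else:
        level = logging.WARNING
    logging.getLogger(SQL_LOGGER_NAME).setLevel(level)
    return level
```

Two caveats. SQLAlchemy only re-checks the logging level when a connection is procured from the pool, so a connection that is already checked out will not pick up the change until it is returned and a new one is obtained. And psycopg2 itself does not log through this logger; logging at that layer would probably need something like psycopg2.extras.LoggingConnection, which has to be chosen when the connection is created.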

              People

              • Assignee: Unassigned
              • Reporter: cloomis