Megatest

History Of Ticket 827909e4da935424995f21a683873114abb4bbcf

Artifacts Associated With Ticket 827909e4da935424995f21a683873114abb4bbcf

  1. Ticket change [9f47737748] (rid 15300) by bjbarcla on 2017-09-05 17:17:35:

    1. comment initialized to:
      I have observed that dashboard running on a non-homehost works until the database size reaches some threshold, at which point it freezes.  After a restart, it freezes again immediately.  I suspect the database size is the critical dimension -- syncing likely takes longer than the action loop in dashboard.
      
      In the instance where I see it, megatest.db is 5.4M with 4k tests and 11 runs.
      
      Running dashboard on homehost, there is no issue at all and dashboard responds very quickly.
      
      Here is the console log from dashboard when we see the freezing issue:
      
      
      Checking for user runconfigs config ...
      WARNING  lefkowit.runconfigs.config file ...doesn't exist
      do this if you care ### Add custom targets/overrides to global targets here to  ./configs/lefkowit.runconfigs.config
      
      Checking for required/common linktree path: /p/fdk/gwa/lefkowit/qa/libanaacatqa/links ...
      exists   /p/fdk/gwa/lefkowit/qa/libanaacatqa/links
      Defined disks:
      ------------------------------------
      exists   /p/fdk/gwa/lefkowit/qa/libanaacatqa/runs
      ------------------------------------
      Checking if megatest variables are already defined ...
      STARTING DASHBOARD ...
      WARNING: Current policy requires running dashboard on homehost: (10.38.45.46 . #f)
      INFO: Trying to start server (megatest -server 10.38.45.46 -m testsuite:libanaacatqa) ...
      INFO: (0) Starting server on 10.38.45.46, logfile is /nfs/pdx/disks/ch_icf_megatest_001/lefkowit/mtTesting/megatestqa/libanaacatqa/logs/server.log
      #======================================================================
      # NBFAKE logging command to: /nfs/pdx/disks/ch_icf_megatest_001/lefkowit/mtTesting/megatestqa/libanaacatqa/logs/server.log
      #      megatest -server 10.38.45.46 -m testsuite:libanaacatqa
      #======================================================================
      WARNING: problem with directory /p/fdk/gwa/lefkowit/fossil/ext/acatqa_ext/trunk/tests, dropping it from tests path
      WARNING: problem with directory /nfs/pdx/disks/ch_icf_megatest_001/lefkowit/mtTesting/megatestqa/relqa/tests, dropping it from tests path
      WARNING: problem with directory /nfs/pdx/disks/ch_icf_megatest_001/lefkowit/mtTesting/megatestqa/libanaacatqa/tests, dropping it from tests path
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 1000 to 2000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 2000 to 4000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 4000 to 8000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 8000 to 16000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 16000 to 32000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 32000 to 64000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 64000 to 128000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 128000 to 256000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 256000 to 512000
      ...
      NOTE: increasing poll interval from 1125899906842624000 to 2251799813685248000
      NOTE: updates are taking a long time, 3.0s elapsed.
      NOTE: increasing poll interval from 2251799813685248000 to 4503599627370496000
      NOTE: updates are taking a long time, 3.0s elapsed.
      
              Call history:
      
              db.scm:1972: call-with-current-continuation
              db.scm:1972: with-exception-handler
              db.scm:1972: ##sys#call-with-values
              db.scm:1972: k3436
              db.scm:1972: g3440
              dashboard.scm:795: current-seconds
              dashboard.scm:798: dboard:tabdat-allruns-by-id
              dashboard.scm:798: hash-table-set!
              dashboard.scm:803: debug:print
              common_records.scm:131: debug:debug-mode
              common_records.scm:132: with-output-to-port
              dashboard.scm:804: iup-base#attribute
              dashboard.scm:805: floor
              dashboard.scm:3551: k7615
              dashboard.scm:3551: g7619
              dashboard.scm:3551: print-call-chain            <--
      inexact number cannot be represented as an exact number
      Callback error in dashboard:runs-tab-updater
      Full condition info:
      ((exn (location inexact->exact) (call-chain (#(db.scm:1978: loop #f #f) #(db.scm:1978: loop #f #f) #(db.scm:1978: loop #f #f) #(db.scm:1972: call-with-current-continuation #f #f) #(db.scm:1972: with-exception-handler #f #f)
      #(db.scm:1972: ##sys#call-with-values #f #f) #(db.scm:1972: k3436 #f #f) #(db.scm:1972: g3440 #f #f) #(dashboard.scm:795: current-seconds #f #f) #(dashboard.scm:798: dboard:tabdat-allruns-by-id #f #f) #(dashboard.scm:798: hash-table-set! #f #f)
      #(dashboard.scm:803: debug:print #f #f) #(common_records.scm:131: debug:debug-mode #f #f) #(common_records.scm:132: with-output-to-port #f #f) #(dashboard.scm:804: iup-base#attribute #f #f) #(dashboard.scm:805: floor #f #f)))
      (arguments (9.00719925474099e+18)) (message inexact number cannot be represented as an exact number)) (type))
      NOTE: updates are taking a long time, 3.0s elapsed.
      <stack dump repeats regularly, probably in the dashboard event loop>
      
    2. foundin initialized to: "1.6429"
    3. login: "bjbarcla"
    4. private_contact initialized to: "9a900f538965a426994e1e90600920aff0b4e8d2"
    5. severity initialized to: "Important"
    6. status initialized to: "Open"
    7. title initialized to:
      dashboard running on non-homehost with large db can hang
      
    8. type initialized to: "Code_Defect"
  2. Ticket change [d8597228f5] (rid 15298) by bjbarcla on 2017-09-05 17:21:49:

    1. icomment:
      One interesting side note: running dashboard on a non-homehost as a different user triggers read-only mode, and dashboard works great in this mode.  Read-only mode starts an in-line "server" which talks to megatest.db directly, bypassing http-transport.  A potential approach would be to always force "read-only mode", intercept the rmt: calls that modify the database, and use http-transport only for those calls.
      
    2. login: "bjbarcla"
    3. mimetype: "text/x-fossil-plain"
    4. priority changed to: "Immediate"
    5. resolution changed to: "Open"
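
The console log in the first ticket change shows the poll interval doubling after every slow update, with no upper bound, until the value reaches 9.00719925474099e+18 and inexact->exact raises. As an illustration only, here is a minimal Python sketch of the same doubling with a ceiling applied. It is not Megatest's Scheme code: the function names, the 3.0s threshold parameter, and the 60000 ms ceiling are all assumptions made for the example.

```python
# Hypothetical ceiling for the poll interval, in milliseconds.
# This is an assumed value for illustration, not a Megatest setting.
MAX_POLL_MS = 60_000


def next_poll_interval(current_ms, elapsed_s, slow_threshold_s=3.0, max_ms=MAX_POLL_MS):
    """Double the interval after a slow update, but never beyond max_ms."""
    if elapsed_s >= slow_threshold_s:
        return min(current_ms * 2, max_ms)
    return current_ms


def simulate(start_ms, slow_updates, max_ms=MAX_POLL_MS):
    """Return the interval history after a run of consecutive slow updates."""
    interval = start_ms
    history = [interval]
    for _ in range(slow_updates):
        interval = next_poll_interval(interval, 3.0, max_ms=max_ms)
        history.append(interval)
    return history


if __name__ == "__main__":
    # Sixty slow updates in a row would overflow an unbounded doubling;
    # with the ceiling the interval settles at max_ms instead.
    print(simulate(1000, 60)[-1])
```

A ceiling like this would only keep the interval within a representable range. It would not address the slow sync that triggers the backoff in the first place.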