Changes In Branch v1.65-adjutant Through [241991700a] Excluding Merge-Ins
This is equivalent to a diff from e373dd2861 to 241991700a
2020-09-22
23:21 | Added call-with-environment-variables check-in: ae141f651e user: matt tags: v1.65-adjutant
23:09 | Added env and mttop stuff check-in: 241991700a user: matt tags: v1.65-adjutant
21:17 | Pulled in v1.65 check-in: dc83b9e0d8 user: matt tags: v1.65-adjutant
2020-09-15
22:44 | Job queue for adjutant check-in: 93bcd23d16 user: matt tags: v1.65-adjutant
2020-09-10
18:15 | change message from error to info check-in: 523dbf02c4 user: mmgraham tags: v1.65-experiment
2020-09-07
21:02 | merged with v1.65-cleanup check-in: e373dd2861 user: mmgraham tags: v1.6568, v1.65-experiment
19:24 | update version to v1.6568 ==11.35/1.8/WARN/1204/orion== ==27.1/2.5/WARN/1201/mars== Leaf check-in: 1251244af0 user: mmgraham tags: v1.65-cleanup, v1.6568
11:39 | Better flagging with LAUNCHING state. NOTE: itemwait subrun items are re-running when they perhaps should not. check-in: 5d2d0fddc3 user: matt tags: v1.65-experiment
Modified Makefile from [0dc94ad098] to [9cff203c98].
︙ | ︙ | |||
26 27 28 29 30 31 32 | process.scm runs.scm tasks.scm tests.scm genexample.scm \ http-transport.scm filedb.scm tdb.scm client.scm mt.scm \ ezsteps.scm lock-queue.scm sdb.scm rmt.scm api.scm \ subrun.scm portlogger.scm archive.scm env.scm \ diff-report.scm cgisetup/models/pgdb.scm # module source files | | | 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | process.scm runs.scm tasks.scm tests.scm genexample.scm \ http-transport.scm filedb.scm tdb.scm client.scm mt.scm \ ezsteps.scm lock-queue.scm sdb.scm rmt.scm api.scm \ subrun.scm portlogger.scm archive.scm env.scm \ diff-report.scm cgisetup/models/pgdb.scm # module source files MSRCFILES = adjutant.scm mutils.scm mttop.scm # ftail.scm rmtmod.scm commonmod.scm removed # MSRCFILES = ducttape-lib.scm pkts.scm stml2.scm cookie.scm mutils.scm \ # mtargs.scm commonmod.scm dbmod.scm adjutant.scm ulex.scm \ # rmtmod.scm apimod.scm GUISRCF = dashboard-context-menu.scm dashboard-tests.scm \ dashboard-guimonitor.scm gutils.scm dcommon.scm tree.scm \ |
︙ | ︙ |
Modified TODO from [da5eae4898] to [0885dee1e5].
︙ | ︙ | |||
14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | # # You should have received a copy of the GNU General Public License # along with Megatest. If not, see <http://www.gnu.org/licenses/>. TODO ==== WW15 . fill newview matrix with data, filter pipeline gui elements . improve [script], especially indent handling WW16 . split db into megatest.db (runs etc.) db/<something>.db . release basic newview implementation WW18 . release split db implementation . mtutil calls from dashboard (for remote control) . logs browser (esp. for surfacing mtutil related activities) WW19 . break command line into sections; all, run control, queries, utilities etc. . pull in ftfplan (not integrated, just code pulled in) WW20 . ./configure => ubuntu, sles11, sles12, rh7 | > > > > > > > < | 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 | # # You should have received a copy of the GNU General Public License # along with Megatest. If not, see <http://www.gnu.org/licenses/>. TODO ==== WW38 . Add test_rundat to no-sync ==> correction, put in <testdir>/.meta/test-run.dat . Add STATE/STATUS transitions to .meta/test-run.dat or similar . Swizzle update-test-rundat to operate on no-sync . Swizzle update-run-duration, -uname-host and cpuload-diskfree to no-sync . On state/status change update tests table with duration WW15 . fill newview matrix with data, filter pipeline gui elements . improve [script], especially indent handling WW16 . split db into megatest.db (runs etc.) db/<something>.db . release basic newview implementation WW18 . release split db implementation . mtutil calls from dashboard (for remote control) . logs browser (esp. for surfacing mtutil related activities) WW19 . break command line into sections; all, run control, queries, utilities etc. . pull in ftfplan (not integrated, just code pulled in) WW20 . ./configure => ubuntu, sles11, sles12, rh7 . 
Add output flushing in teamcity support . Switch to using simple runs query everywhere . Add end_time to runs and add a rollup call that sets state, status and end_time Future . Switch to scsh-process pipeline management for job execution/control . Use call-with-environment-variables more. |
︙ | ︙ |
Modified adjutant.scm from [7560fecb1c] to [d6c67b1549].
︙ | ︙ | |||
20 21 22 23 24 25 26 | (declare (unit adjutant)) (module adjutant * (import scheme chicken data-structures extras files) (import (prefix sqlite3 sqlite3:) posix typed-records srfi-18 srfi-69 | | | | > > > > > > > > > > > | 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | (declare (unit adjutant)) (module adjutant * (import scheme chicken data-structures extras files) (import (prefix sqlite3 sqlite3:) posix typed-records srfi-18 srfi-69 md5 message-digest matchable regex srfi-1) (define (adjutant-run host-type rmt:no-sync-take-job) (print "Running the adjutant!") (let loop ((wait-count 0)) (if (< wait-count 10) ;; 6 x 10 seconds = one minute (let* ((dat (rmt:no-sync-take-job host-type))) (match dat ((id ht vars exekey cmdline state event-time last-update) (system cmdline) (loop 0)) (else (thread-sleep! 10) (loop (+ wait-count 1))))) (print "I'm bored. Exiting.")))) ) |
Modified api.scm from [4fa67bb6bd] to [0134572f5d].
︙ | ︙ | |||
249 250 251 252 253 254 255 256 257 258 259 260 261 | ((tasks-get-last) (apply tasks:get-last dbstruct params)) ;; NO SYNC DB ((no-sync-set) (apply db:no-sync-set *no-sync-db* params)) ((no-sync-get/default) (apply db:no-sync-get/default *no-sync-db* params)) ((no-sync-del!) (apply db:no-sync-del! *no-sync-db* params)) ((no-sync-get-lock) (apply db:no-sync-get-lock *no-sync-db* params)) ;; ARCHIVES ;; ((archive-get-allocations) ((archive-register-disk) (apply db:archive-register-disk dbstruct params)) ((archive-register-block-name)(apply db:archive-register-block-name dbstruct params)) ;; ((archive-allocate-testsuite/area-to-block)(apply db:archive-allocate-testsuite/area-to-block dbstruct block-id testsuite-name areakey)) | > > > | | 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 | ((tasks-get-last) (apply tasks:get-last dbstruct params)) ;; NO SYNC DB ((no-sync-set) (apply db:no-sync-set *no-sync-db* params)) ((no-sync-get/default) (apply db:no-sync-get/default *no-sync-db* params)) ((no-sync-del!) (apply db:no-sync-del! 
*no-sync-db* params)) ((no-sync-get-lock) (apply db:no-sync-get-lock *no-sync-db* params)) ((no-sync-add-job) (apply db:no-sync-add-job *no-sync-db* params)) ((no-sync-take-job) (apply db:no-sync-take-job *no-sync-db* params)) ((no-sync-job-records-clean) (apply db:no-sync-job-records-clean *no-sync-db* params)) ;; ARCHIVES ;; ((archive-get-allocations) ((archive-register-disk) (apply db:archive-register-disk dbstruct params)) ((archive-register-block-name)(apply db:archive-register-block-name dbstruct params)) ;; ((archive-allocate-testsuite/area-to-block)(apply db:archive-allocate-testsuite/area-to-block dbstruct block-id testsuite-name areakey)) ;;====================================================================== ;; READ ONLY QUERIES ;;====================================================================== ;; KEYS ((get-key-val-pairs) (apply db:get-key-val-pairs dbstruct params)) ((get-keys) (db:get-keys dbstruct)) |
︙ | ︙ |
Modified common.scm from [e271de7a8e] to [7e9a56be1f].
︙ | ︙ | |||
565 566 567 568 569 570 571 | (lambda (a b) (< (hash-table-ref all-files a)(hash-table-ref all-files b)))) (- num-logs max-allowed)))) (for-each (lambda (file) (let* ((fullname (conc "logs/" file))) (if (directory? fullname) | | | 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 | (lambda (a b) (< (hash-table-ref all-files a)(hash-table-ref all-files b)))) (- num-logs max-allowed)))) (for-each (lambda (file) (let* ((fullname (conc "logs/" file))) (if (directory? fullname) (debug:print-info 0 *default-log-port* fullname " in logs directory is a directory! Cannot rotate it, it is best to not put subdirectories in the logs dir.") (handle-exceptions exn (debug:print-error 0 *default-log-port* "failed to remove " fullname ", exn=" exn) (delete-file* fullname))))) files) (debug:print-info 0 *default-log-port* "Deleted " (length files) " files from logs, keeping " max-allowed " files.")))))) |
︙ | ︙ | |||
3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 | ;; ;; [hosts] ;; arm cubie01 cubie02 ;; x86_64 zeus xena myth01 ;; allhosts #{g hosts arm} #{g hosts x86_64} ;; ;; [host-types] ;; general #MTLOWESTLOAD #{g hosts allhosts} ;; arm #MTLOWESTLOAD #{g hosts arm} ;; nbgeneral nbjob run JOBCOMMAND -log $MT_LINKTREE/$MT_TARGET/$MT_RUNNAME.$MT_TESTNAME-$MT_ITEM_PATH.lgo ;; ;; [host-rules] ;; # maxnload => max normalized load ;; # maxnjobs => max jobs per cpu ;; # maxjobrate => max jobs per second ;; general maxnload=1.1; maxnjobs=1.2; maxjobrate=0.1 ;; ;; [launchers] ;; envsetup general | > > > > | > | > > > | | | > > > > | | 3169 3170 3171 3172 3173 3174 3175 3176 3177 3178 3179 3180 3181 3182 3183 3184 3185 3186 3187 3188 3189 3190 3191 3192 3193 3194 3195 3196 3197 3198 3199 3200 3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236 3237 3238 3239 3240 3241 3242 3243 3244 3245 3246 3247 | ;; ;; [hosts] ;; arm cubie01 cubie02 ;; x86_64 zeus xena myth01 ;; allhosts #{g hosts arm} #{g hosts x86_64} ;; ;; [host-types] ;; C/M/A lets megatest know this launcher provides C cores, M bytes memory for architecture A ;; 2/2G/arm smart -cores 2 -memory 2G -arch arm ;; general #MTLOWESTLOAD #{g hosts allhosts} ;; arm #MTLOWESTLOAD #{g hosts arm} ;; nbgeneral nbjob run JOBCOMMAND -log $MT_LINKTREE/$MT_TARGET/$MT_RUNNAME.$MT_TESTNAME-$MT_ITEM_PATH.lgo ;; ;; NOTE: host-rules is ONLY used for MTLOWESTLOAD ;; ;; [host-rules] ;; # maxnload => max normalized load ;; # maxnjobs => max jobs per cpu ;; # maxjobrate => max jobs per second ;; general maxnload=1.1; maxnjobs=1.2; maxjobrate=0.1 ;; ;; [launchers] ;; envsetup general ;; xor/%/n 2/2G/arm ;; % nbgeneral ;; ;; [jobtools] ;; # if defined and not "no" flexi-launcher will bypass "launcher" unless no match. 
;; flexi-launcher yes ;; launcher nbfake ;; mode adjutant|normal (default is normal) ;; ;; ;; mode is 'normal (i.e. directly use launcher) or 'adjutant (i.e. use adjutant) ;; (define (common:get-launcher configdat testname itempath mode) (let ((fallback-launcher (configf:lookup configdat "jobtools" "launcher"))) (if (and (configf:lookup configdat "jobtools" "flexi-launcher") ;; overrides launcher (not (equal? (configf:lookup configdat "jobtools" "flexi-launcher") "no"))) (let* ((launchers (hash-table-ref/default configdat "launchers" '()))) (if (null? launchers) fallback-launcher (let loop ((hed (car launchers)) (tal (cdr launchers))) (let ((patt (car hed)) (host-type (cadr hed))) (if (tests:match patt testname itempath) ;; have a launcher match for this test (begin (debug:print-info 2 *default-log-port* "Have flexi-launcher match for " testname "/" itempath " = " host-type) (let ((launcher (configf:lookup configdat "host-types" host-type))) ;; find the actual launcher from the host-types table ;; if we are in adjutant mode then we want to return both host-type and launcher (if launcher (let* ((launcher-parts (string-split launcher)) (launcher-exe (car launcher-parts))) (if (equal? launcher-exe "#MTLOWESTLOAD") ;; this is our special case, we will find the lowest load and craft a nbfake commandline (let host-loop ((targ-host (common:get-least-loaded-host (cdr launcher-parts) host-type configdat)) (count 100)) (if targ-host (conc "remrun " targ-host) (if (> count 0) (begin (debug:print 0 *default-log-port* "INFO: Waiting for a host for host-type " host-type) (thread-sleep! 
(- 101 count)) (host-loop (common:get-least-loaded-host (cdr launcher-parts) host-type configdat) (- count 1))) (begin (debug:print 0 *default-log-port* "FATAL: Failed to find a host from #MTLOWESTLOAD for host-type " host-type) (exit))))) (case mode ((adjutant) (list host-type launcher)) (else launcher)))) (begin (debug:print-info 0 *default-log-port* "WARNING: no launcher found for host-type " host-type) (if (null? tal) fallback-launcher (loop (car tal)(cdr tal))))))) ;; no match, try again (if (null? tal) |
︙ | ︙ |
Modified dashboard.scm from [935bf4d2df] to [af7f57fa9a].
︙ | ︙ | |||
210 211 212 213 214 215 216 217 218 219 220 221 222 223 | ;; (define (dboard:common-set-tabdat! commondat tabnum tabdat) (hash-table-set! (dboard:commondat-tabdats commondat) tabnum tabdat)) ;; gets and calls updater list based on curr-tab-num ;; (define (dboard:common-run-curr-updaters commondat #!key (tab-num #f)) (if (dboard:common-get-tabdat commondat tab-num: tab-num) ;; only update if there is a tabdat (let* ((tnum (or tab-num (dboard:commondat-curr-tab-num commondat))) (updaters (hash-table-ref/default (dboard:commondat-updaters commondat) tnum | > > | 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 | ;; (define (dboard:common-set-tabdat! commondat tabnum tabdat) (hash-table-set! (dboard:commondat-tabdats commondat) tabnum tabdat)) (define *updater-running* #f) ;; move this into one of the structs ;; gets and calls updater list based on curr-tab-num ;; (define (dboard:common-run-curr-updaters commondat #!key (tab-num #f)) (if (dboard:common-get-tabdat commondat tab-num: tab-num) ;; only update if there is a tabdat (let* ((tnum (or tab-num (dboard:commondat-curr-tab-num commondat))) (updaters (hash-table-ref/default (dboard:commondat-updaters commondat) tnum |
︙ | ︙ | |||
238 239 240 241 242 243 244 | (curr-updaters (hash-table-ref/default (dboard:commondat-updaters commondat) tnum '()))) (hash-table-set! (dboard:commondat-updaters commondat) tnum (cons updater curr-updaters)))) ;; data for each specific tab goes here ;; | | > > | 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 | (curr-updaters (hash-table-ref/default (dboard:commondat-updaters commondat) tnum '()))) (hash-table-set! (dboard:commondat-updaters commondat) tnum (cons updater curr-updaters)))) ;; data for each specific tab goes here ;; (defstruct dboard:tabdat ;; flags ((already-running #f) : boolean) ;; the updater is already running. skip ;; runs ((allruns '()) : list) ;; list of dboard:rundat records ((allruns-by-id (make-hash-table)) : hash-table) ;; hash of run-id -> dboard:rundat records ((done-runs '()) : list) ;; list of runs already drawn ((not-done-runs '()) : list) ;; list of runs not yet drawn (header #f) ;; header for decoding the run records (keys #f) ;; keys for this run (i.e. target components) |
︙ | ︙ | |||
643 644 645 646 647 648 649 | ;; ;; NOTE: Yes, this is used ;; (define (dboard:get-tests-for-run-duplicate tabdat run-id run testnamepatt key-vals) (let* ((start-time (current-seconds)) (access-mode (dboard:tabdat-access-mode tabdat)) (num-to-get (string->number (or (configf:lookup *configdat* "setup" "num-tests-to-get") | | | 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 | ;; ;; NOTE: Yes, this is used ;; (define (dboard:get-tests-for-run-duplicate tabdat run-id run testnamepatt key-vals) (let* ((start-time (current-seconds)) (access-mode (dboard:tabdat-access-mode tabdat)) (num-to-get (string->number (or (configf:lookup *configdat* "setup" "num-tests-to-get") "60"))) ;; was 200, which is fine in a normal run area. (states (hash-table-keys (dboard:tabdat-state-ignore-hash tabdat))) (statuses (hash-table-keys (dboard:tabdat-status-ignore-hash tabdat))) (do-not-use-db-file-timestamps #t) ;; (configf:lookup *configdat* "setup" "do-not-use-db-file-timestamps")) ;; this still hosts runs-summary-tab (do-not-use-query-timestamps #t) ;; (configf:lookup *configdat* "setup" "do-not-use-query-timestamps")) ;; this no longer troubles runs-summary-tab (sort-info (get-curr-sort)) (sort-by (vector-ref sort-info 1)) (sort-order (vector-ref sort-info 2)) |
︙ | ︙ | |||
714 715 716 717 718 719 720 721 722 723 724 725 726 727 | ;; ;; (debug:print 0 *default-log-port* "got-all: " got-all " multi-get: " multi-get " num-to-get: " num-to-get " (length tmptests): " (length tmptests) " db-modified: " db-modified " db-mod-time: " db-mod-time " db-path: " db-path) (if got-all (begin (dboard:rundat-last-update-set! run-dat (- start-time 2)) (dboard:rundat-run-data-offset-set! run-dat 0)) (begin (dboard:rundat-run-data-offset-set! run-dat (+ num-to-get (dboard:rundat-run-data-offset run-dat))))) (for-each (lambda (tdat) (let ((test-id (db:test-get-id tdat)) (state (db:test-get-state tdat))) | > | 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 | ;; ;; (debug:print 0 *default-log-port* "got-all: " got-all " multi-get: " multi-get " num-to-get: " num-to-get " (length tmptests): " (length tmptests) " db-modified: " db-modified " db-mod-time: " db-mod-time " db-path: " db-path) (if got-all (begin (dboard:rundat-last-update-set! run-dat (- start-time 2)) (dboard:rundat-run-data-offset-set! run-dat 0)) (begin ;;; (thread-sleep! 0.25) ;; give the rest of the gui some time to update. <-- this did NOT help (dboard:rundat-run-data-offset-set! run-dat (+ num-to-get (dboard:rundat-run-data-offset run-dat))))) (for-each (lambda (tdat) (let ((test-id (db:test-get-id tdat)) (state (db:test-get-state tdat))) |
︙ | ︙ | |||
831 832 833 834 835 836 837 | ;; this calls dboard:get-tests-for-run-duplicate for each run ;; ;; create a virtual table of all the tests ;; keypatts: ( (KEY1 "abc%def")(KEY2 "%") ) ;; (define (dboard:update-rundat tabdat runnamepatt numruns testnamepatt keypatts) | > > | | 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 | ;; this calls dboard:get-tests-for-run-duplicate for each run ;; ;; create a virtual table of all the tests ;; keypatts: ( (KEY1 "abc%def")(KEY2 "%") ) ;; (define (dboard:update-rundat tabdat runnamepatt numruns testnamepatt keypatts) (dboard:tabdat-already-running-set! tabdat #t) (let* (;; (already-running (dboard:tabdat-already-running tabdat)) (access-mode (dboard:tabdat-access-mode tabdat)) (keys (dboard:tabdat-keys tabdat)) ;; (db:dispatch-query access-mode rmt:get-keys db:get-keys))) (last-runs-update (- (dboard:tabdat-last-runs-update tabdat) 2)) (allruns (rmt:get-runs runnamepatt numruns (dboard:tabdat-start-run-offset tabdat) keypatts)) ;;(allruns-tree (rmt:get-runs-by-patt (dboard:tabdat-keys tabdat) "%" #f #f #f #f)) (allruns-tree (rmt:get-runs-by-patt keys "%" #f #f #f #f 0)) ;; last-runs-update));;'("id" "runname") (header (db:get-header allruns)) (runs (db:get-rows allruns)) ;; RA => Filtered as per runpatt selected |
︙ | ︙ | |||
899 900 901 902 903 904 905 | (elapsed-time (- (current-seconds) start-time))) (if (null? all-test-ids) (hash-table-delete! (dboard:tabdat-allruns-by-id tabdat) run-id) (hash-table-set! (dboard:tabdat-allruns-by-id tabdat) run-id run-struct)) (if (or (null? tal) (> elapsed-time 2)) ;; stop loading data after 5 seconds, on the next call more data *should* be loaded since get-tests-for-run uses last update (begin | | > | | < < < | > | 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 | (elapsed-time (- (current-seconds) start-time))) (if (null? all-test-ids) (hash-table-delete! (dboard:tabdat-allruns-by-id tabdat) run-id) (hash-table-set! (dboard:tabdat-allruns-by-id tabdat) run-id run-struct)) (if (or (null? tal) (> elapsed-time 2)) ;; stop loading data after 5 seconds, on the next call more data *should* be loaded since get-tests-for-run uses last update (begin #;(when (> elapsed-time 2) (debug:print 0 *default-log-port* "NOTE: updates are taking a long time, " elapsed-time "s elapsed.") (let* ((old-val (iup:attribute *tim* "TIME")) (new-val (number->string (inexact->exact (floor (* 2 (string->number old-val))))))) (if (< (string->number new-val) 5000) (begin (debug:print 0 *default-log-port* "NOTE: increasing poll interval from "old-val" to "new-val) (iup:attribute-set! *tim* "TIME" new-val))))) (dboard:tabdat-allruns-set! tabdat new-res) maxtests) (if (> (dboard:rundat-run-data-offset run-struct) 0) (loop run tal new-res newmaxtests) ;; not done getting data for this run (loop (car tal)(cdr tal) new-res newmaxtests))))))) (dboard:tabdat-filters-changed-set! tabdat #f) (dboard:update-tree tabdat runs-hash header tb) (dboard:tabdat-already-running-set! tabdat #f))) (define *collapsed* (make-hash-table)) (define (toggle-hide lnum uidat) ; fulltestname) (let* ((btn (vector-ref (dboard:uidat-get-lftcol uidat) lnum)) (fulltestname (iup:attribute btn "TITLE")) (parts (string-split fulltestname "(")) |
︙ | ︙ | |||
2503 2504 2505 2506 2507 2508 2509 | sort-lb))) ) ;; insert extra widget here (if extra-widget extra-widget (iup:hbox)) ;; empty widget | < < < < | 2509 2510 2511 2512 2513 2514 2515 2516 2517 2518 2519 2520 2521 2522 | sort-lb))) ) ;; insert extra widget here (if extra-widget extra-widget (iup:hbox)) ;; empty widget ))) (let* ((status-toggles (map (lambda (status) (iup:toggle (conc status) #:fontsize 8 ;; btn-fontsz ;; "10" ;; #:expand "HORIZONTAL" #:action (lambda (obj val) |
︙ | ︙ | |||
3723 3724 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 | ;; removing the tabdat-values proc ;; ;; (define (tabdat-values tabdat) ;; runs update-rundat using the various filters from the gui ;; (define (dashboard:do-update-rundat tabdat) (dboard:update-rundat tabdat (hash-table-ref/default (dboard:tabdat-searchpatts tabdat) "runname" "%") (dboard:tabdat-numruns tabdat) (hash-table-ref/default (dboard:tabdat-searchpatts tabdat) "test-name" "%/%") ;; generate key patterns from the target stored in tabdat (let* ((dbkeys (dboard:tabdat-dbkeys tabdat))) (let ((fres (if (dboard:tabdat-target tabdat) | > > > > > > > | | | | | | | | | | > | 3725 3726 3727 3728 3729 3730 3731 3732 3733 3734 3735 3736 3737 3738 3739 3740 3741 3742 3743 3744 3745 3746 3747 3748 3749 3750 3751 3752 3753 3754 3755 3756 3757 3758 3759 3760 3761 3762 3763 3764 | ;; removing the tabdat-values proc ;; ;; (define (tabdat-values tabdat) ;; runs update-rundat using the various filters from the gui ;; (define (dashboard:do-update-rundat tabdat) ;; this seems like a good place to check for already running and skip if so ;; ;; (set! *updater-running* #t) ;;(if (dboard:tabdat-already-running tabdat) ;; (begin ;; (debug:print-info 0 *default-log-port* "Dashboard overloaded - updates will be slow, skipping update.") ;; (dboard:tabdat-target tabdat)) (dboard:update-rundat tabdat (hash-table-ref/default (dboard:tabdat-searchpatts tabdat) "runname" "%") (dboard:tabdat-numruns tabdat) (hash-table-ref/default (dboard:tabdat-searchpatts tabdat) "test-name" "%/%") ;; generate key patterns from the target stored in tabdat (let* ((dbkeys (dboard:tabdat-dbkeys tabdat))) (let ((fres (if (dboard:tabdat-target tabdat) (let ((ptparts (append (dboard:tabdat-target tabdat)(make-list (length dbkeys) "%")))) (map (lambda (k v)(list k v)) dbkeys ptparts)) (let ((res '())) (for-each (lambda (key) (if (not (equal? 
key "runname")) (let ((val (hash-table-ref/default (dboard:tabdat-searchpatts tabdat) key #f))) (if val (set! res (cons (list key val) res)))))) dbkeys) res)))) fres))) #;(set! *updater-running* #f)) (define (dashboard:runs-tab-updater commondat tab-num) (debug:catch-and-dump (lambda () (let* ((tabdat (dboard:common-get-tabdat commondat tab-num: tab-num)) (dbkeys (dboard:tabdat-dbkeys tabdat))) (dashboard:do-update-rundat tabdat) |
︙ | ︙ | |||
3799 3800 3801 3802 3803 3804 3805 | commondat (lambda () (dashboard:runs-tab-updater commondat 1)) tab-num: 2) (iup:callback-set! *tim* "ACTION_CB" (lambda (time-obj) | | > | | | | | | | > | > | | | | 3809 3810 3811 3812 3813 3814 3815 3816 3817 3818 3819 3820 3821 3822 3823 3824 3825 3826 3827 3828 3829 3830 3831 3832 3833 3834 3835 3836 3837 | commondat (lambda () (dashboard:runs-tab-updater commondat 1)) tab-num: 2) (iup:callback-set! *tim* "ACTION_CB" (lambda (time-obj) (if (not *updater-running*) (begin ;; (mutex-lock! (dboard:commondat-update-mutex commondat)) ;; (set! update-is-running (dboard:commondat-updating commondat)) ;;(if (not update-is-running) ;; (dboard:commondat-updating-set! commondat #t)) ;;(mutex-unlock! (dboard:commondat-update-mutex commondat)) ;;(if (not update-is-running) ;; we know that the update was not running and we now have a lock on doing an update ;; (begin (set! *updater-running* #t) (dboard:common-run-curr-updaters commondat) ;; (dashboard:run-update commondat) (set! *updater-running* #f) ;; (mutex-lock! (dboard:commondat-update-mutex commondat)) ;; (dboard:commondat-updating-set! commondat #f) ;; (mutex-unlock! (dboard:commondat-update-mutex commondat))) )) 1)))) (let ((th1 (make-thread (lambda () (thread-sleep! 1) (dboard:common-run-curr-updaters commondat 0) ;; force update of summary tab ) "update buttons once")) |
︙ | ︙ |
Modified db.scm from [bbc6b8f83a] to [2551c101d6].
︙ | ︙ | |||
1532 1533 1534 1535 1536 1537 1538 | id INTEGER PRIMARY KEY, test_id INTEGER, state TEXT DEFAULT 'new', status TEXT DEFAULT 'n/a', archive_type TEXT DEFAULT 'bup', du INTEGER, archive_path TEXT);"))) | | | 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 | id INTEGER PRIMARY KEY, test_id INTEGER, state TEXT DEFAULT 'new', status TEXT DEFAULT 'n/a', archive_type TEXT DEFAULT 'bup', du INTEGER, archive_path TEXT);"))) ;; (print "creating triggers from init") (db:create-triggers db) db)) ;; ) ;;====================================================================== ;; A R C H I V E S ;;====================================================================== |
︙ | ︙ | |||
2140 2141 2142 2143 2144 2145 2146 | (let* ((dbpath (db:dbfile-path)) (dbname (conc dbpath "/no-sync.db")) (db-exists (common:file-exists? dbname)) (db (sqlite3:open-database dbname))) (sqlite3:set-busy-handler! db (sqlite3:make-busy-timeout 136000)) (if (not db-exists) (begin | > | > > > > | | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 2140 2141 2142 2143 2144 2145 2146 2147 2148 2149 2150 2151 2152 2153 2154 2155 2156 2157 2158 2159 2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183 2184 2185 2186 2187 2188 2189 2190 2191 2192 2193 2194 2195 2196 2197 2198 2199 2200 2201 2202 2203 2204 2205 | (let* ((dbpath (db:dbfile-path)) (dbname (conc dbpath "/no-sync.db")) (db-exists (common:file-exists? dbname)) (db (sqlite3:open-database dbname))) (sqlite3:set-busy-handler! db (sqlite3:make-busy-timeout 136000)) (if (not db-exists) (begin (sqlite3:execute db "CREATE TABLE IF NOT EXISTS no_sync_metadat (var TEXT,val TEXT, CONSTRAINT no_sync_metadat_constraint UNIQUE (var));") (sqlite3:execute db "PRAGMA journal_mode=WAL;"))) ;; MOVE THIS TABLE CREATION TO THE (begin above in about six months (it is Sep 2020 right now). (sqlite3:execute db "CREATE TABLE IF NOT EXISTS jobs_queue (id INTEGER PRIMARY KEY, host_type TEXT, cores INTEGER, memory TEXT, vars TEXT, exekey TEXT, cmdline TEXT, state TEXT, event_time INTEGER, last_update INTEGER);") ;; not sure I'll use this next one. 
I prefer if tests simply append to a file: ;; last-update-seconds cpuload tmpspace rundirspace (sqlite3:execute db "CREATE TABLE IF NOT EXISTS test_extra_data (id INTEGER PRIMARY KEY, run_id INTEGER, test_id INTEGER, last_seen_running INTEGER);") (sqlite3:execute db "PRAGMA synchronous = 0;") db)) (define (db:no-sync-add-job db-in host-type vars-list exekey cmdline) (sqlite3:execute (db:no-sync-db db-in) "INSERT INTO jobs_queue (host_type,vars,exekey,cmdline,state,event_time,last_update) VALUES (?,?,?,?,?,?,?);" host-type vars-list exekey cmdline "waiting" (current-seconds)(current-seconds))) ;; find next job (waiting longest) that matches host-type - future, we'll find jobs that fit if no exact match (define (db:no-sync-take-job db-in host-type) (let* ((db (db:no-sync-db db-in)) (stmt1 "SELECT id,host_type,vars,exekey,cmdline,state,event_time,last_update FROM jobs_queue WHERE host_type=? AND state != 'taken' ORDER BY event_time ASC;") (stmt1h (sqlite3:prepare db stmt1)) (stmt2 "UPDATE jobs_queue SET state='taken',last_update=? WHERE id=?;") (stmt2h (sqlite3:prepare db stmt2)) (res (sqlite3:with-transaction db (lambda () (let* ((matching-jobs (sqlite3:fold-row (lambda (res . row) ;; id host-type vars exekey state event-time last-update) (cons row res)) '() stmt1h host-type))) (if (null? matching-jobs) #f (let ((choosen-one (let loop ((tal matching-jobs) (res #f)) ;; put bestest one in here (if (null? tal) res (let ((curr (car tal)) (rem (cdr tal))) curr) ;; here we will compare with res, if better candidate the loop with curr else loop with res )))) (if choosen-one ;; we need to mark it as taken (sqlite3:execute stmt2h (current-seconds) (car choosen-one))) choosen-one))))))) (sqlite3:finalize! stmt1h) ;; it'd be nice to cache these and finalize on exit. (sqlite3:finalize! stmt2h) res)) ;; clean out old jobs in queue, i.e. 
taken and event_time > 24 hrs ago ;; (define (db:no-sync-job-records-clean db) (sqlite3:execute (db:no-sync-db db) "DELETE FROM jobs_queue WHERE state='taken' AND event_time < ?;" (- (current-seconds)(* 24 3600)))) ;; if we are not a server create a db handle. this is not finalized ;; so watch for problems. I'm still not clear if it is needed to manually ;; finalize sqlite3 dbs with the sqlite3 egg. ;; (define (db:no-sync-db db-in) (mutex-lock! *db-access-mutex*) |
︙ | ︙ | |||
2165 2166 2167 2168 2169 2170 2171 | (define (db:no-sync-set db var val) (sqlite3:execute (db:no-sync-db db) "INSERT OR REPLACE INTO no_sync_metadat (var,val) VALUES (?,?);" var val)) (define (db:no-sync-del! db var) (sqlite3:execute (db:no-sync-db db) "DELETE FROM no_sync_metadat WHERE var=?;" var)) | | > | | 2213 2214 2215 2216 2217 2218 2219 2220 2221 2222 2223 2224 2225 2226 2227 2228 2229 | (define (db:no-sync-set db var val) (sqlite3:execute (db:no-sync-db db) "INSERT OR REPLACE INTO no_sync_metadat (var,val) VALUES (?,?);" var val)) (define (db:no-sync-del! db var) (sqlite3:execute (db:no-sync-db db) "DELETE FROM no_sync_metadat WHERE var=?;" var)) (define (db:no-sync-get/default db-in var default) (let ((db (db:no-sync-db db-in)) (res default)) (sqlite3:for-each-row (lambda (val) (set! res val)) (db:no-sync-db db) "SELECT val FROM no_sync_metadat WHERE var=?;" var) (if res |
︙ | ︙ | |||
3465 3466 3467 3468 3469 3470 3471 | (let* ((run-ids (db:get-all-run-ids mtdb))) (for-each (lambda (run-id) (let ((testrecs (db:get-all-tests-info-by-run-id mtdb run-id))) (db:prep-megatest.db-adj-test-ids (db:dbdat-get-db mtdb) run-id testrecs))) run-ids))) | | | 3514 3515 3516 3517 3518 3519 3520 3521 3522 3523 3524 3525 3526 3527 3528 | (let* ((run-ids (db:get-all-run-ids mtdb))) (for-each (lambda (run-id) (let ((testrecs (db:get-all-tests-info-by-run-id mtdb run-id))) (db:prep-megatest.db-adj-test-ids (db:dbdat-get-db mtdb) run-id testrecs))) run-ids))) ;; Get test data using test_id, run-id is not used - but it will be! ;; (define (db:get-test-info-by-id dbstruct run-id test-id) (db:with-db dbstruct #f ;; run-id #f (lambda (db) |
︙ | ︙ |
Modified docs/manual/megatest_manual.html from [dd2f6b61ae] to [18b25df6be].
︙ | ︙ | |||
3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 | <td class="tableblock halign-left valign-top" ><p class="tableblock monospaced"></p></td> <td class="tableblock halign-left valign-top" ><p class="tableblock monospaced">( key1 key2 … )</p></td> <td class="tableblock halign-left valign-top" ><p class="tableblock monospaced"></p></td> </tr> </tbody> </table> </div> </div> <div class="sect1"> <h2 id="_megatest_internals">Megatest Internals</h2> <div class="sectionbody"> <div class="imageblock graphviz"> <div class="content"> <img src="server.png" alt="server.png"> | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 3438 3439 3440 3441 3442 3443 3444 3445 3446 3447 3448 3449 3450 3451 3452 3453 3454 3455 3456 3457 3458 3459 3460 3461 3462 3463 3464 3465 3466 3467 3468 3469 3470 3471 3472 3473 3474 3475 3476 3477 3478 3479 3480 3481 3482 3483 3484 3485 3486 3487 3488 3489 | <td class="tableblock halign-left valign-top" ><p class="tableblock monospaced"></p></td> <td class="tableblock halign-left valign-top" ><p class="tableblock monospaced">( key1 key2 … )</p></td> <td class="tableblock halign-left valign-top" ><p class="tableblock monospaced"></p></td> </tr> </tbody> </table> </div> </div> <div class="sect1"> <h2 id="_test_plan">Test Plan</h2> <div class="sectionbody"> <div class="sect2"> <h3 id="_tests">Tests</h3> <div class="paragraph"><p>itemwait|33</p></div> <div class="paragraph"><p>rerun-downstream-item|20</p></div> <div class="paragraph"><p>rerunclean|20</p></div> <div class="paragraph"><p>fullrun|18</p></div> <div class="paragraph"><p>goodtests|18</p></div> <div class="paragraph"><p>kill-rerun|17</p></div> <div class="paragraph"><p>items-runconfigvars|16</p></div> <div class="paragraph"><p>ro_test|16</p></div> <div class="paragraph"><p>runconfig-tests|16</p></div> <div class="paragraph"><p>env-pollution|13</p></div> <div class="paragraph"><p>itemmap|11</p></div> <div 
class="paragraph"><p>testpatt_envvar|10</p></div> <div class="paragraph"><p>toprun|10</p></div> <div class="paragraph"><p>chained-waiton|8</p></div> <div class="paragraph"><p>skip-on-fileexists|8</p></div> <div class="paragraph"><p>killrun_preqfail|7</p></div> <div class="paragraph"><p>subrun|6</p></div> <div class="paragraph"><p>dependencies|5</p></div> <div class="paragraph"><p>itemwait-simple|4</p></div> <div class="paragraph"><p>rollup|4</p></div> <div class="paragraph"><p>end-of-run|3</p></div> <div class="paragraph"><p>killrun|3</p></div> <div class="paragraph"><p>listener|3</p></div> <div class="paragraph"><p>test2|3</p></div> <div class="paragraph"><p>testpatt|3</p></div> <div class="paragraph"><p>env-pollution-usecacheno|2</p></div> <div class="paragraph"><p>set-values|2 envvars|1 listruns-tests|1 subrun-usecases|1</p></div> </div> </div> </div> <div class="sect1"> <h2 id="_megatest_internals">Megatest Internals</h2> <div class="sectionbody"> <div class="imageblock graphviz"> <div class="content"> <img src="server.png" alt="server.png"> |
︙ | ︙ | |||
3459 3460 3461 3462 3463 3464 3465 | </div> </div> </div> <div id="footnotes"><hr></div> <div id="footer"> <div id="footer-text"> Version 1.5<br> | | | 3497 3498 3499 3500 3501 3502 3503 3504 3505 3506 3507 3508 3509 | </div> </div> </div> <div id="footnotes"><hr></div> <div id="footer"> <div id="footer-text"> Version 1.5<br> Last updated 2020-09-08 08:39:29 PDT 2020-08-22 12:47:36 MST </div> </div> </body> </html> |
Modified docs/manual/megatest_manual.txt from [69ab724537] to [cb5cc67576].
︙ | ︙ | |||
114 115 116 117 118 119 120 121 122 123 124 125 126 127 | // :leveloffset: 0 include::writing_tests.txt[] include::howto.txt[] include::reference.txt[] Megatest Internals ------------------ ["graphviz", "server.png"] ---------------------------------------------------------------------- include::server.dot[] | > > | 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 | // :leveloffset: 0 include::writing_tests.txt[] include::howto.txt[] include::reference.txt[] include::testplan.txt[] Megatest Internals ------------------ ["graphviz", "server.png"] ---------------------------------------------------------------------- include::server.dot[] |
︙ | ︙ |
Added docs/manual/testplan.txt version [2f7346adda].
// This file is part of Megatest.
//
// Megatest is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// Megatest is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with Megatest. If not, see <http://www.gnu.org/licenses/>.

// Copyright 2006-2020, Matthew Welland.

Test Plan
---------

Tests
~~~~~

itemwait|33

rerun-downstream-item|20

rerunclean|20

fullrun|18

goodtests|18

kill-rerun|17

items-runconfigvars|16

ro_test|16

runconfig-tests|16

env-pollution|13

itemmap|11

testpatt_envvar|10

toprun|10

chained-waiton|8

skip-on-fileexists|8

killrun_preqfail|7

subrun|6

dependencies|5

itemwait-simple|4

rollup|4

end-of-run|3

killrun|3

listener|3

test2|3

testpatt|3

env-pollution-usecacheno|2

set-values|2
envvars|1
listruns-tests|1
subrun-usecases|1
Modified launch.scm from [9ff8002b65] to [74747a8909].
︙ | ︙ | |||
                     (round (- (current-seconds) start-seconds)))))
         (kill-tries 0))
    ;; (tests:set-full-meta-info #f test-id run-id (calc-minutes) work-area)
    ;; (tests:set-full-meta-info test-id run-id (calc-minutes) work-area)
    (tests:set-full-meta-info #f test-id run-id (calc-minutes) work-area 10 update-db: #t)
    (let loop ((minutes   (calc-minutes))
               (cpu-load  (alist-ref 'adj-core-load (common:get-normalized-cpu-load #f)))
               (disk-free (get-df (current-directory)))
               (last-sync (current-seconds)))
      ;; (common:telemetry-log "zombie" (conc "launch:monitor-job -
      ;; top of loop encountered at "(current-seconds)" with
      ;; last-sync="last-sync))
      (let* ((over-time     (> (current-seconds) (+ last-sync update-period)))
             (new-cpu-load  (let* ((load  (alist-ref 'adj-core-load (common:get-normalized-cpu-load #f)))
                                   (delta (abs (- load cpu-load))))
                              (if (> delta 0.1) ;; don't bother updating with small changes
                                  load
                                  #f)))
             (new-disk-free (let* ((df (if over-time ;; only get df every 30 seconds
︙ | ︙ | |||
             (do-sync       (or new-cpu-load new-disk-free over-time))
             (test-info     (rmt:get-test-info-by-id run-id test-id))
             (state         (db:test-get-state test-info))
             (status        (db:test-get-status test-info))
             (kill-reason   "no kill reason specified")
             (kill-job?     #f))
        #;(common:telemetry-log "zombie" (conc "launch:monitor-job - decision time encountered at "(current-seconds)" with last-sync="last-sync" do-sync="do-sync" over-time="over-time" update-period="update-period))
        (cond
         ((test-get-kill-request run-id test-id)
          (set! kill-reason "KILLING TEST since received kill request (KILLREQ)")
          (set! kill-job? #t))
         ((and runtlim
               (> (- (current-seconds) start-seconds) runtlim))
          (set! kill-reason (conc "KILLING TEST DUE TO TIME LIMIT EXCEEDED! Runtime=" (- (current-seconds) start-seconds) " seconds, limit=" runtlim))
          (set! kill-job? #t))
         ((equal? status "DEAD")
          (tests:update-central-meta-info run-id test-id new-cpu-load new-disk-free (calc-minutes) #f #f update-db: #t)
          (rmt:set-state-status-and-roll-up-items run-id test-id 'foo "RUNNING" "n/a" "was marked dead; really still running.")
          ;;(set! kill-reason "KILLING TEST because it was marked as DEAD by launch:handle-zombie-tests (might indicate really overloaded server or else overzealous setup.deadtime)") ;; MARK RUNNING
          (set! kill-job? #f)))
        (debug:print 4 *default-log-port* "cpu: " new-cpu-load " disk: " new-disk-free " last-sync: " last-sync " do-sync: " do-sync)
        (launch:handle-zombie-tests run-id)
        (if do-sync ;; save meta data about the running of this test
            (tests:update-central-meta-info run-id test-id new-cpu-load new-disk-free (calc-minutes) #f #f))
        (if kill-job?
            (begin
              (debug:print-info 0 *default-log-port* "proceeding to kill test: "kill-reason)
              (mutex-lock! m)
              ;; NOTE: The pid can change as different steps are run. Do we need handshaking between this
              ;;       section and the runit section? Or add a loop that tries three times with a 1/4 second
              ;;       between tries?
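The update throttling in the monitor loop above can be sketched in isolation. This is an illustrative Python sketch, not code from the check-in: `changed_load` and `should_sync` are hypothetical names, the 0.1 threshold is taken from the hunk, and the disk-free threshold is left out because that part of the hunk is elided.

```python
from typing import Optional


def changed_load(previous: float, current: float,
                 threshold: float = 0.1) -> Optional[float]:
    """Return `current` only if it moved more than `threshold`, else None."""
    return current if abs(current - previous) > threshold else None


def should_sync(new_cpu_load: Optional[float],
                new_disk_free: Optional[float],
                now: float, last_sync: float, update_period: float) -> bool:
    """Sync when a reading changed noticeably or the update period elapsed."""
    over_time = now > last_sync + update_period
    # Mirrors (or new-cpu-load new-disk-free over-time): a reading of 0.0
    # still counts as "changed", so test against None rather than truthiness.
    return new_cpu_load is not None or new_disk_free is not None or over_time


if __name__ == "__main__":
    print(changed_load(1.0, 1.05))                 # small change: None
    print(should_sync(None, None, 100, 80, 30))    # nothing changed, not yet due
    print(should_sync(None, None, 111, 80, 30))    # update period elapsed
```

The point of the design is that the database is only written when something worth recording has happened, which keeps many concurrently running tests from hammering the server.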
︙ | ︙ | |||
            (begin
              (thread-sleep! 3) ;; (+ 3 (random 6))) ;; add some jitter to the call home time to spread out the db accesses
              (if (hash-table-ref/default misc-flags 'keep-going #f) ;; keep originals for cpu-load and disk-free unless they change more than the allowed delta
                  (loop (calc-minutes)
                        (or new-cpu-load cpu-load)
                        (or new-disk-free disk-free)
                        (if do-sync (current-seconds) last-sync)))))))
    (tests:update-central-meta-info run-id test-id (get-cpu-load) (get-df (current-directory))(calc-minutes) #f #f update-db: #t))) ;; NOTE: Checking twice for keep-going is intentional

(define (launch:execute encoded-cmd)
  (let* ((cmdinfo    (common:read-encoded-string encoded-cmd))
         (tconfigreg #f))
    (setenv "MT_CMDINFO" encoded-cmd)
    ;;(bb-check-path msg: "launch:execute incoming")
︙ | ︙ | |||
                              (db:test-get-host test-info)
                              (begin
                                (debug:print 0 *default-log-port* "ERROR: failed to find a record for test-id " test-id ", exiting.")
                                (exit))))
               (test-pid  (db:test-get-process_id test-info)))
          (cond
           ;; -mrw- I'm removing KILLREQ from this list so that a test in KILLREQ state is treated as a "do not run" flag.
           ((or (member (db:test-get-state test-info) '("INCOMPLETE" "KILLED" "UNKNOWN" "STUCK")) ;; prior run of this test didn't complete, go ahead and try to rerun
                (and (equal? (db:test-get-state test-info) "COMPLETED") ;; completed/abort => rerun if asked
                     (member (db:test-get-status test-info) '("ABORT"))))
            (debug:print 0 *default-log-port* "INFO: test is INCOMPLETE or KILLED, treat this execute call as a rerun request")
            ;; (tests:test-force-state-status! run-id test-id "REMOTEHOSTSTART" "n/a")
            (rmt:general-call 'set-test-start-time #f test-id)
            (rmt:test-set-state-status run-id test-id "REMOTEHOSTSTART" "n/a" #f)
            ) ;; prime it for running
           ((member (db:test-get-state test-info) '("RUNNING" "REMOTEHOSTSTART"))
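The widened rerun condition in this hunk reduces to a small predicate. The following Python sketch is illustrative only; `is_rerun_request` is a hypothetical name, and the state and status strings are the ones listed in the hunk.

```python
# States whose previous run did not finish cleanly, as listed in the hunk.
RERUN_STATES = frozenset({"INCOMPLETE", "KILLED", "UNKNOWN", "STUCK"})


def is_rerun_request(state: str, status: str) -> bool:
    """True when an -execute call should be treated as a rerun request."""
    aborted = state == "COMPLETED" and status == "ABORT"
    return state in RERUN_STATES or aborted


if __name__ == "__main__":
    for state, status in [("STUCK", "n/a"), ("COMPLETED", "ABORT"),
                          ("COMPLETED", "PASS"), ("KILLREQ", "n/a")]:
        print(state, status, is_rerun_request(state, status))
```

Note that `KILLREQ` is deliberately absent, matching the comment that a test in that state is treated as a "do not run" flag.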
︙ | ︙ | |||
          (else #f))))
    (when do-scan?
      (debug:print 1 *default-log-port* "INFO: search and mark zombie tests")
      (rmt:set-var key (current-seconds))
      (rmt:find-and-mark-incomplete run-id #f))))

(defstruct launch:ajt
  (vars      '())
  (exekey    #f)
  (host-type #f)
  (test-sig  #f)
  (cmdline   #f))

;; append vars
(define (launch:ajt-add-vars dat vars)
  (launch:ajt-vars-set! dat (append (launch:ajt-vars dat) vars)))

;; 1. look through disks list for disk with most space
;; 2. create run dir on disk, path name is meaningful
;; 3. create link from run dir to megatest runs area
;; 4. remotely run the test on allocated host
;;    - could be ssh to host from hosts table (update regularly with load)
;;    - could be netbatch
;;      (launch-test db (cadr status) test-conf))
(define (launch-test test-id run-id run-info keyvals runname test-conf test-name test-path itemdat params)
  (mutex-lock! *launch-setup-mutex*) ;; setting variables and processing the testconfig is NOT thread-safe, reuse the launch-setup mutex
  (let* (;; locking code removed from here commented out and pasted at end of file
         (item-path       (item-list->path itemdat))
         (contour         #f) ;; NOT READY FOR THIS (args:get-arg "-contour")))
         ;; launcher-mode will be 'adjutant or 'normal
         (launcher-mode   (string->symbol (or (configf:lookup *configdat* "jobtools" "mode") "normal")))
         (ajtdat          (make-launch:ajt)))
    (let loop ((delta        (- (current-seconds) *last-launch*))
               (launch-delay (configf:lookup-number *configdat* "setup" "launch-delay" default: 0)))
      (if (> launch-delay delta)
          (begin
            (if (common:low-noise-print 1200 "test launch delay") ;; every two hours or so remind the user about launch delay.
                (debug:print-info 0 *default-log-port* "NOTE: test launches are delayed by " launch-delay " seconds. See megatest.config launch-delay setting to adjust.")) ;; launch of " test-name " for " (- launch-delay delta) " seconds"))
            (thread-sleep! (- launch-delay delta))
            (loop (- (current-seconds) *last-launch*) launch-delay))))
    (change-directory *toppath*)
    (let ((var-list (append
                     (list
                      (list "MT_RUN_AREA_HOME" *toppath*)
                      (list "MT_TEST_NAME"     test-name)
                      (list "MT_RUNNAME"       runname)
                      (list "MT_ITEMPATH"      item-path)
                      (list "MT_CONTOUR"       contour)
                      )
                     itemdat)))
      ;; consolidate this code with the code in megatest.scm for
      ;; "-execute", *maybe* - the longer they are set the longer
      ;; each launch takes (must be non-overlapping with the vars)
      (alist->env-vars var-list)
      ;; save the var-list into the ajtdat adjutant record whether it is needed or not.
      (launch:ajt-add-vars ajtdat var-list))
    (let* ((tregistry       (tests:get-all)) ;; third param (below) is system-allowed
           ;; for tconfig, why do we allow fallback to test-conf?
           (tconfig         (or (tests:get-testconfig test-name item-path tregistry #t force-create: #t)
                                (begin
                                  (debug:print 0 *default-log-port* "WARNING: falling back to pre-calculated testconfig. This is likely not desired.")
                                  test-conf))) ;; force re-read now that all vars are set
           (useshell        (let ((ush (configf:lookup *configdat* "jobtools" "useshell")))
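The `launch:ajt` record introduced in this hunk is a plain accumulator: `launch-test` fills it in step by step so that a job description exists even when adjutant mode is not used. A rough Python equivalent, offered only as an illustration (the class name `AdjutantJob` and the sample values are hypothetical; the field names follow the `defstruct`):

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple

Var = Tuple[str, Optional[str]]  # (variable name, value)


@dataclass
class AdjutantJob:
    """Mirror of the launch:ajt record: everything needed to queue one test."""
    vars: List[Var] = field(default_factory=list)
    exekey: Optional[str] = None
    host_type: Optional[str] = None
    test_sig: Optional[str] = None
    cmdline: Optional[str] = None

    def add_vars(self, new_vars: Iterable[Var]) -> None:
        # Like launch:ajt-add-vars: append, keeping the order vars were set in.
        self.vars = self.vars + list(new_vars)


if __name__ == "__main__":
    job = AdjutantJob()
    job.add_vars([("MT_TEST_NAME", "demo"), ("MT_RUNNAME", "r1")])
    job.add_vars([("MT_TARGET", "a/b")])
    print([name for name, _ in job.vars])
```

Appending rather than replacing matters because the variables are collected in several passes (run-level, env-override, misc, and per-test), and the queued job needs all of them.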
︙ | ︙ | |||
           (subrun          (> (length (hash-table-ref/default tconfig "subrun" '())) 0)) ;; send a flag to process a subrun
           ;; (diskspace       (configf:lookup tconfig   "requirements" "diskspace"))
           ;; (memory          (configf:lookup tconfig   "requirements" "memory"))
           ;; (hosts           (configf:lookup *configdat* "jobtools"     "workhosts")) ;; I'm pretty sure this was never completed
           (remote-megatest (configf:lookup *configdat* "setup" "executable"))
           (run-time-limit  (or (configf:lookup tconfig     "requirements" "runtimelim")
                                (configf:lookup *configdat* "setup"        "runtimelim")))
           (local-megatest  (common:find-local-megatest))
           (launcher        (let ((l (common:get-launcher *configdat* test-name item-path launcher-mode)))
                              (if (string? l)
                                  (string-split l)
                                  l))) ;; some inhomogeneity here. '(cmd param1 param2 ...) OR '(host-type launcher)
           ;; (item-list->path itemdat))) ;; test-path is the full path including the item-path
           (test-sig        (conc (common:get-testsuite-name) ":" test-name ":" item-path))
           (work-area       #f)
           (toptest-work-area #f) ;; for iterated tests the top test contains data relevant for all
           (diskpath        #f)
           (cmdparms        #f)
           (fullcmd         #f) ;; (define a (with-output-to-string (lambda ()(write x))))
           (mt-bindir-path  #f)
           (testinfo        (rmt:get-test-info-by-id run-id test-id))
           (mt_target       (string-intersperse (map cadr keyvals) "/"))
           (debug-param     (append (if (args:get-arg "-debug") (list "-debug" (args:get-arg "-debug")) '())
                                    (if (args:get-arg "-logging")(list "-logging") '())
                                    (if (configf:lookup *configdat* "misc" "profilesw")
                                        (list (configf:lookup *configdat* "misc" "profilesw"))
                                        '()))))
      ;; save the test-sig in the ajtdat record
      (launch:ajt-test-sig-set! ajtdat test-sig)
      ;; go ahead and figure out if we have a host-type from the
      ;; launcher call above and save it in the ajtdat record
      (if (and (eq? launcher-mode 'adjutant)
               (list? launcher)
               (> (length launcher) 1))
          (launch:ajt-host-type-set! ajtdat (car launcher)))
      ;; (if hosts (set! hosts (string-split hosts)))
      ;; set the megatest to be called on the remote host
      (if (not remote-megatest)(set! remote-megatest local-megatest)) ;; "megatest"))
      (set! mt-bindir-path (pathname-directory remote-megatest))
      ;; (if launcher (set! launcher (string-split launcher))) ;; yuk!
      ;; set up the run work area for this test
      (if (and (args:get-arg "-preclean") ;; user has requested to preclean for this run
               (not (member (db:test-get-rundir testinfo)(list "n/a" "/tmp/badname")))) ;; n/a is a placeholder and thus not a real dir
          (begin
            (debug:print-info 0 *default-log-port* "attempting to preclean directory " (db:test-get-rundir testinfo) " for test " test-name "/" item-path)
            (runs:remove-test-directory testinfo 'remove-data-only))) ;; remove data only, do not perturb the record
︙ | ︙ | |||
                                                 (list 'target    mt_target)
                                                 (list 'contour   contour)
                                                 (list 'runtlim   (if run-time-limit (common:hms-string->seconds run-time-limit) #f))
                                                 (list 'env-ovrd  (hash-table-ref/default *configdat* "env-override" '()))
                                                 (list 'set-vars  (if params (hash-table-ref/default params "-setvars" #f)))
                                                 (list 'runname   runname)
                                                 (list 'mt-bindir-path mt-bindir-path))))))))
      ;; save the cmdparms in the ajtdat
      (launch:ajt-exekey-set! ajtdat cmdparms)
      ;; clean out step records from previous run if they exist
      ;; (rmt:delete-test-step-records run-id test-id)
      ;; if the dir does not exist we may have an itempath where individual variables are a path, launch anyway
      (if (common:file-exists? work-area)
          (change-directory work-area)) ;; so that log files from the launch process don't clutter the test dir
      ;; save the command line for adjutant mode (might never be needed but best to assemble it here)
      (launch:ajt-cmdline-set! ajtdat (string-intersperse
                                       (append
                                        (list remote-megatest "-m" test-sig "-execute" cmdparms)
                                        debug-param)))
      (cond
       (launcher
        (set! fullcmd (append launcher
                              (list remote-megatest "-m" test-sig "-execute" cmdparms)
                              debug-param)))
       (else
        (if (not useshell)(debug:print 0 *default-log-port* "WARNING: internal launching will not work well without \"useshell yes\" in your [jobtools] section"))
        (set! fullcmd (append (list remote-megatest "-m" test-sig "-execute" cmdparms)
                              debug-param
                              (list (if useshell "&" ""))))))
      (if (args:get-arg "-xterm")(set! fullcmd (append fullcmd (list "-xterm"))))
      (debug:print 1 *default-log-port* "Launching " work-area)
      ;; set pre-launch-env-vars before launching, keep the vars in prevvals and put the environment back when done
      (debug:print 4 *default-log-port* "fullcmd: " fullcmd)
      (set! *last-launch* (current-seconds)) ;; all that junk above takes time, set this as late as possible.
      (let* ((env-override-vars (hash-table-ref/default *configdat* "env-override" '()))
             (commonprevvals    (alist->env-vars env-override-vars))
             (misc-vars         (append (list (list "MT_TEST_RUN_DIR" work-area)
                                              (list "MT_TEST_NAME" test-name)
                                              (list "MT_ITEM_INFO" (conc itemdat))
                                              (list "MT_RUNNAME"   runname)
                                              (list "MT_TARGET"    mt_target)
                                              (list "MT_ITEMPATH"  item-path))
                                        itemdat))
             (miscprevvals      (alist->env-vars misc-vars));; consolidate this code with the code in megatest.scm for "-execute"
             (test-vars         (hash-table-ref/default tconfig "pre-launch-env-overrides" '()))
             (testprevvals      (alist->env-vars test-vars))
             ;; Launchwait defaults to true, must override it to turn off wait
             (launchwait        (if (equal? (configf:lookup *configdat* "setup" "launchwait") "no") #f #t))
             ;; BB: TODO: refactor this to examine return code of launcher, if nonzero, set state to launch failed.
             (launch-results-prev (if (eq? launcher-mode 'adjutant)
                                      '(#t 0) ;; just some fake data to fool downstream but non-applicable code
                                      (apply (if launchwait
                                                 process:cmd-run-with-stderr-and-exitcode->list
                                                 process-run)
                                             (if useshell
                                                 (let ((cmdstr (string-intersperse fullcmd " ")))
                                                   (if launchwait
                                                       cmdstr
                                                       (conc cmdstr " >> mt_launch.log 2>&1 &")))
                                                 (car fullcmd))
                                             (if useshell
                                                 '()
                                                 (cdr fullcmd)))))
             (success           (if launchwait (equal? 0 (cadr launch-results-prev)) #t))
             (launch-results    (if launchwait (car launch-results-prev) launch-results-prev)))
        (launch:ajt-add-vars ajtdat env-override-vars)
        (launch:ajt-add-vars ajtdat misc-vars)
        (launch:ajt-add-vars ajtdat test-vars)
        ;; if in adjutant mode we register the job in the jobs_queue
        ;; then fire off an adjutant runner
        ;;
        (if (eq? launcher-mode 'adjutant)
            (let* ((adjutant-runner-cmd (append (cdr launcher)
                                                (list remote-megatest
                                                      "-adjutant" (launch:ajt-host-type ajtdat)
                                                      "-start-dir" *toppath*)))
                   (adj-cmd             (conc (string-intersperse (map conc adjutant-runner-cmd) " ") "&")))
              (rmt:no-sync-add-job (launch:ajt-host-type ajtdat)
                                   (conc (launch:ajt-vars ajtdat))
                                   (launch:ajt-exekey ajtdat)
                                   (launch:ajt-cmdline ajtdat))
              (print "adj-cmd: " adj-cmd)
              (system adj-cmd)
              ))
        (if (not success)
            (tests:test-set-status! run-id test-id "COMPLETED" "DEAD" "launcher failed; exited non-zero; check mt_launch.log" #f)) ;; (if launch-results launch-results "FAILED"))
        ;; (rmt:no-sync-del! lock-key) ;; release the lock for starting this test
        (if (not launchwait) ;; give the OS a little time to allow the process to start
            (thread-sleep! 0.01))
        (with-output-to-file "mt_launch.log"
          (lambda ()
            (print "LAUNCHCMD: " (string-intersperse fullcmd " "))
            (if (list? launch-results)
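How the adjutant runner command is assembled from the `(host-type launcher ...)` list can be shown in a few lines. This Python sketch is illustrative only: `adjutant_runner_command` is a hypothetical name and the host type, launcher words and paths in the usage are made up.

```python
from typing import Sequence


def adjutant_runner_command(launcher: Sequence[str], megatest: str,
                            toppath: str) -> str:
    """Build the background command that starts an adjutant for a host type.

    `launcher` is assumed to be (host_type, launcher_word, ...), i.e. the
    '(host-type launcher) form mentioned in the diff.
    """
    host_type, *launch_prefix = launcher
    parts = [*launch_prefix, megatest,
             "-adjutant", host_type,
             "-start-dir", toppath]
    # As in the Scheme code, "&" is appended directly to the joined string.
    return " ".join(str(part) for part in parts) + "&"


if __name__ == "__main__":
    print(adjutant_runner_command(["gpu", "ssh", "node1"],
                                  "/opt/mt/megatest", "/work/area"))
```

The test itself is not started by this command. The job is registered in the queue first, and the runner started here pulls it (or any other job for the same host type) when it polls.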
︙ | ︙ | |||
            ;; but this hack will work! Thanks go to Alan Post of the Chicken email list
            ;; NB// Is this still needed? Should be safe to go back to "exit" now?
            (process-signal (current-process-id) signal/kill)
            ))
        (alist->env-vars miscprevvals)
        (alist->env-vars testprevvals)
        (alist->env-vars commonprevvals)
        ;; yes, really should mutex all the way to here. Need to put this entire process into a fork.
        ;; the unlock previously was further up. This seemed wrong as we should not proceed until the
        ;; vars have been reset.
        (mutex-unlock! *launch-setup-mutex*)
        launch-results))
    (change-directory *toppath*)
    (thread-sleep! (configf:lookup-number *configdat* "setup" "inter-test-delay" default: 0.0))))

;; recover a test where the top controlling mtest may have died
;;
(define (launch:recover-test run-id test-id)
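The set-then-restore pattern around the launch (apply variables, keep the previous values, put them back before releasing the mutex) can be sketched as follows. This is an illustration in Python, not the project's code; `set_env_vars` is a hypothetical name, and it assumes, as the `...prevvals` naming in the diff suggests, that `alist->env-vars` returns the values it replaced.

```python
import os
from typing import Iterable, List, Optional, Tuple

Var = Tuple[str, Optional[str]]  # value None means "variable not set"


def set_env_vars(pairs: Iterable[Var]) -> List[Var]:
    """Apply (name, value) pairs and return the values they replaced."""
    previous: List[Var] = []
    for name, value in pairs:
        previous.append((name, os.environ.get(name)))
        if value is None:
            os.environ.pop(name, None)   # restore "unset"
        else:
            os.environ[name] = str(value)
    return previous


if __name__ == "__main__":
    os.environ["MT_DEMO_B"] = "old"
    saved = set_env_vars([("MT_DEMO_A", "1"), ("MT_DEMO_B", "new")])
    print(os.environ["MT_DEMO_A"], os.environ["MT_DEMO_B"])
    set_env_vars(saved)                  # feeding the saved list back restores
    print(os.environ.get("MT_DEMO_A"), os.environ["MT_DEMO_B"])
```

Because the process environment is global state, the restore has to complete before another thread starts setting its own variables, which is the reason the diff moves the `mutex-unlock!` to after the three restore calls.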
︙ | ︙ | |||
                           (read-symbolic-link (conc "/proc/" pid "/cwd"))
                           #f)))
    ;; now wait on that process if all is correct
    ;; periodically update the db with runtime
    ;; when the process exits look at the db, if still RUNNING after 10 seconds set
    ;; state/status appropriately
    (process-wait pid)))

;;       (lock-key        (conc "test-" test-id))
;;       (got-lock        (let loop ((lock        (rmt:no-sync-get-lock lock-key))
;;                                   (expire-time (+ (current-seconds) 15))) ;; give up on getting the lock and steal it after 15 seconds
;;                          (if (car lock)
;;                              #t
;;                              (if (> (current-seconds) expire-time)
;;                                  (begin
;;                                    (debug:print-info 0 *default-log-port* "Timed out waiting for a lock to launch test " keyvals " " runname " " test-name " " test-path)
;;                                    (rmt:no-sync-del! lock-key) ;; destroy the lock
;;                                    (loop (rmt:no-sync-get-lock lock-key) expire-time)) ;;
;;                                  (begin
;;                                    (thread-sleep! 1)
;;                                    (loop (rmt:no-sync-get-lock lock-key) expire-time))))))
Modified megatest-version.scm from [3028461078] to [f253e5978b].
︙ | ︙ | |||
16 17 18 19 20 21 22 | ;; along with Megatest. If not, see <http://www.gnu.org/licenses/>. ;; Always use two or four digit decimal ;; 1.01, 1.02...1.10,1.11,1.1101 ... 1.99,2.00.. ;; (declare (unit megatest-version)) | | | 16 17 18 19 20 21 22 23 | ;; along with Megatest. If not, see <http://www.gnu.org/licenses/>. ;; Always use two or four digit decimal ;; 1.01, 1.02...1.10,1.11,1.1101 ... 1.99,2.00.. ;; (declare (unit megatest-version)) (define megatest-version 1.6569) |
Modified megatest.scm from [0e58f17e0f] to [c47e39d399].
︙ | ︙ | |||
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | (declare (uses tdb)) (declare (uses mt)) (declare (uses api)) (declare (uses tasks)) ;; only used for debugging. (declare (uses env)) (declare (uses diff-report)) ;; (declare (uses ftail)) ;; (import ftail) (define *db* #f) ;; this is only for the repl, do not use in general!!!! (include "common_records.scm") (include "key_records.scm") (include "db_records.scm") (include "run_records.scm") (include "megatest-fossil-hash.scm") (use (prefix sqlite3 sqlite3:) srfi-1 posix regex regex-case srfi-69 (prefix base64 base64:) | > > > > > > > > > > | | | 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 | (declare (uses tdb)) (declare (uses mt)) (declare (uses api)) (declare (uses tasks)) ;; only used for debugging. (declare (uses env)) (declare (uses diff-report)) (declare (uses mutils)) (import mutils) (declare (uses adjutant)) (import adjutant) (declare (uses mttop)) (import mttop) ;; (declare (uses ftail)) ;; (import ftail) (define *db* #f) ;; this is only for the repl, do not use in general!!!! 
(include "common_records.scm") (include "key_records.scm") (include "db_records.scm") (include "run_records.scm") (include "megatest-fossil-hash.scm") (use (prefix sqlite3 sqlite3:) srfi-1 posix regex regex-case srfi-69 (prefix base64 base64:) readline apropos json http-client directory-utils typed-records matchable http-client srfi-18 extras format) ;; Added for csv stuff - will be removed ;; (use sparse-vectors) ;; (require-library mutils) (define *usage-log-file* #f) ;; put path to file for logging usage in this var in the ~/.megatestrc file (define *usage-use-seconds* #t) ;; for Epoc seconds in usage logging change this to #t in ~/.megatestrc file ;; load the ~/.megatestrc file, put (use trace)(trace-call-sites #t)(trace function-you-want-to-trace) in this file ;; (let ((debugcontrolf (conc (get-environment-variable "HOME") "/.megatestrc"))) |
︙ | ︙ | |||
100 101 102 103 104 105 106 107 108 109 110 111 112 113 | version " megatest-version " license GPL, Copyright Matt Welland 2006-2017 Usage: megatest [options] -h : this help -manual : show the Megatest user manual -version : print megatest version (currently " megatest-version ") Launching and managing runs -run : run all tests or as specified by -testpatt -remove-runs : remove the data for a run, requires -runname and -testpatt Optionally use :state and :status, use -keep-records to remove only the run data. Use -kill-wait to override the 10 second per test wait after kill delay (e.g. -kill-wait 0). | > | 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 | version " megatest-version " license GPL, Copyright Matt Welland 2006-2017 Usage: megatest [options] -h : this help -manual : show the Megatest user manual -version : print megatest version (currently " megatest-version ") help : help for the new Megatest interface Launching and managing runs -run : run all tests or as specified by -testpatt -remove-runs : remove the data for a run, requires -runname and -testpatt Optionally use :state and :status, use -keep-records to remove only the run data. Use -kill-wait to override the 10 second per test wait after kill delay (e.g. -kill-wait 0). |
︙ | ︙ | |||
197 198 199 200 201 202 203 | -sync-to-megatest.db : pull data from cache files in /tmp/$USER to megatest.db -sync-to dest : sync to new postgresql central style database -update-meta : update the tests metadata for all tests -setvars VAR1=val1,VAR2=val2 : Add environment variables to a run NB// these are overwritten by values set in config files. -server -|hostname : start the server (reduces contention on megatest.db), use - to automatically figure out hostname | | | 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 | -sync-to-megatest.db : pull data from cache files in /tmp/$USER to megatest.db -sync-to dest : sync to new postgresql central style database -update-meta : update the tests metadata for all tests -setvars VAR1=val1,VAR2=val2 : Add environment variables to a run NB// these are overwritten by values set in config files. -server -|hostname : start the server (reduces contention on megatest.db), use - to automatically figure out hostname -adjutant host-type : start the server/adjutant with given host-type use 0,0 to auto use full machine -transport http|rpc : use http or rpc for transport (default is http) -log logfile : send stdout and stderr to logfile -list-servers : list the servers -kill-servers : kill all servers -repl : start a repl (useful for extending megatest) -load file.scm : load and run file.scm |
︙ | ︙ | |||
266 267 268 269 270 271 272 273 274 275 276 277 278 279 | Called as " (string-intersperse (argv) " ") " Version " megatest-version ", built from " megatest-fossil-hash )) ;; -gui : start a gui interface ;; -config fname : override the runconfigs file with fname ;; process args (define remargs (args:get-args (argv) (list "-runtests" ;; run a specific test "-config" ;; override the config file name "-append-config" "-execute" ;; run the command encoded in the base64 parameter | > > > > | 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 | Called as " (string-intersperse (argv) " ") " Version " megatest-version ", built from " megatest-fossil-hash )) ;; -gui : start a gui interface ;; -config fname : override the runconfigs file with fname (mttop-run (command-line-arguments) '("help")) ;; process args (define remargs (args:get-args (argv) (list "-runtests" ;; run a specific test "-config" ;; override the config file name "-append-config" "-execute" ;; run the command encoded in the base64 parameter |
︙ | ︙ | |||
Lines 925-956:
        (server:launch 0 transport-type)
        (set! *didsomething* #t)))

;; The adjutant is a bit different, it does NOT run (launch:setup) as it is not necessarily tied to
;; a specific Megatest area. Details are being hashed out and this may change.
;;
(if (args:get-arg "-adjutant")
    (let* ((host-type (args:get-arg "-adjutant")))
      (launch:setup) ;; dang it, wish this wasn't needed
      (print "Running the adjutant!")
      (let loop ((wait-count 0))
        (if (< wait-count 10) ;; give up after 10 idle polls, 10 seconds apart
            (let* ((dat (rmt:no-sync-take-job host-type)))
              (match dat
                ((id ht vars exekey cmdline state event-time last-update)
                 (call-with-environment-variables vars (lambda () (system cmdline)))
                 (loop 0))
                (else
                 (thread-sleep! 10)
                 (loop (+ wait-count 1)))))
            (print "I'm bored. Exiting.")))
      ;; (adjutant-run (args:get-arg "-adjutant") rmt:no-sync-take-job)
      (set! *didsomething* #t)))

(if (or (args:get-arg "-list-servers")
        (args:get-arg "-kill-servers"))
    (let ((tl (launch:setup)))
      (if tl ;; all roads from here exit
          (let* ((servers (server:get-list *toppath*))
︙ | ︙ |
Added mttop.scm version [0ba1c89f48].
;; Copyright 2006-2011, Matthew Welland.
;;
;; This program is made available under the GNU GPL version 2.0 or
;; greater. See the accompanying file COPYING for details.
;;
;; This program is distributed WITHOUT ANY WARRANTY; without even the
;; implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
;; PURPOSE.

;; This is from the perl world, a hash of hashes is a super easy way to keep a handle on
;; lots of disparate data
;;

(declare (unit mttop))

(module mttop
        *

(import chicken scheme
        ;; data-structures posix
        srfi-1
        ;; srfi-13
        srfi-69
        ports
        extras
        regex
        posix
        data-structures
        matchable
        )

(define (str-is-cmd cmd all-cmds)
  (let* ((rx (regexp (conc "^" cmd ".*")))
         (mx (filter string?
                     (map (lambda (x)
                            (let ((res (string-match rx x)))
                              (if res (car res) #f)))
                          all-cmds))))
    (if (eq? (length mx) 1) ;; have a command
        (car mx)
        #f)))

(define (mttop-run args all-cmds)
  ;; any path through this call must end in exit if it is NOT an old Megatest call
  (if (null? args)
      #f ;; continue on and do the old Megatest stuff
      (let ((cmd (str-is-cmd (car args) all-cmds)))
        (if cmd
            (begin
              (case (string->symbol cmd)
                ((help)(print "New help"))
                (else (print "Command " cmd " is not implemented yet.")))
              (exit)) ;; always exit here
            #f)))) ;; or continue on to Megatest old stuff here
)
Modified rmt.scm from [39d97c528a] to [cbca9c856c].
︙ | ︙ | |||
Lines 20-35:
(use format typed-records) ;; RADT => purpose of json format??

(declare (unit rmt))
(declare (uses api))
(declare (uses http-transport))
(include "common_records.scm")
(include "db_records.scm")

;; (declare (uses rmtmod))
;; (import rmtmod)

;;
;; THESE ARE ALL CALLED ON THE CLIENT SIDE!!!
;;
︙ | ︙ | |||
Lines 54-96:
        cinfo
        (if (server:check-if-running areapath)
            (client:setup areapath)
            #f))))

(define *send-receive-mutex* (make-mutex)) ;; should have separate mutex per run-id

(define *rmt-query-last-call-time* 0)
(define *rmt-query-last-rest-time* 0) ;; last time there was at least a 1/2 second rest - giving other processes access to the db

;; NOTE: This query rest algorithm will not adapt to long query times. REDESIGN NEEDED. TODO. FIXME.
;;
(define (rmt:query-rest)
  (let* ((now (current-milliseconds)))
    (cond
     ((> (- now *rmt-query-last-call-time*) 500) ;; it's been a while since last query - no need to rest
      (set! *rmt-query-last-rest-time* now)
      (set! *rmt-query-last-call-time* now))
     ((> (- now *rmt-query-last-rest-time*) 5000) ;; no natural rests have happened
      (debug:print 0 *default-log-port* "query rest needed. blocking for 1/2 second.")
      (thread-sleep! 0.5) ;; force a rest of a half second
      (set! *rmt-query-last-rest-time* now)
      (set! *rmt-query-last-call-time* now))
     (else ;; sufficient rests have occurred, just record the last query time
      (set! *rmt-query-last-call-time* now)))))

;; RA => e.g. usage (rmt:send-receive 'get-var #f (list varname))
;;
(define (rmt:send-receive cmd rid params #!key (attemptnum 1)(area-dat #f)) ;; start attemptnum at 1 so the modulo below works as expected

  #;(common:telemetry-log (conc "rmt:"(->string cmd))
                          payload: `((rid . ,rid)
                                     (params . ,params)))

  (if (not (equal? (configf:lookup *configdat* "setup" "query-rest") "no"))
      (rmt:query-rest))

  (if (> attemptnum 2)
      (debug:print 0 *default-log-port* "INFO: attemptnum in rmt:send-receive is " attemptnum))

  (cond
   ((> attemptnum 2) (thread-sleep! 0.05))
   ((> attemptnum 10) (thread-sleep! 0.5))
   ((> attemptnum 20) (thread-sleep! 1)))
︙ | ︙ | |||
Lines 546-573:
;; Just some syntactic sugar
(define (rmt:register-test run-id test-name item-path)
  (rmt:general-call 'register-test run-id run-id test-name item-path))

(define (rmt:get-test-id run-id testname item-path)
  (rmt:send-receive 'get-test-id run-id (list run-id testname item-path)))

;; run-id is NOT used - but it will be!
;;
(define (rmt:get-test-info-by-id run-id test-id)
  (if (number? test-id)
      (let* ((testdat  (rmt:send-receive 'get-test-info-by-id run-id (list run-id test-id)))
             (trundatf (conc (db:test-get-rundir testdat) "/.mt_data/test-run.dat")))
        ;; now we can update a couple fields from the filesystem
        (if (and (db:test-get-rundir testdat)
                 (file-exists? trundatf))
            (let* ((duration   (db:test-get-run_duration testdat))
                   (event-time (db:test-get-event_time testdat))
                   (last-touch (file-modification-time trundatf)))
              (db:test-set-run_duration! testdat (max duration (- last-touch event-time)))))
        testdat)
      (begin
        (debug:print 0 *default-log-port* "WARNING: Bad data handed to rmt:get-test-info-by-id run-id=" run-id ", test-id=" test-id)
        (print-call-chain (current-error-port))
        #f)))

(define (rmt:test-get-rundir-from-test-id run-id test-id)
  (rmt:send-receive 'test-get-rundir-from-test-id run-id (list run-id test-id)))
︙ | ︙ | |||
Lines 957-979:
  (rmt:send-receive 'no-sync-get/default #f `(,var ,default)))

(define (rmt:no-sync-del! var)
  (rmt:send-receive 'no-sync-del! #f `(,var)))

(define (rmt:no-sync-get-lock keyname)
  (rmt:send-receive 'no-sync-get-lock #f `(,keyname)))

(define (rmt:no-sync-add-job host-type vars-list exekey cmdline)
  (rmt:send-receive 'no-sync-add-job #f `(,host-type ,vars-list ,exekey ,cmdline)))

(define (rmt:no-sync-take-job host-type)
  (rmt:send-receive 'no-sync-take-job #f `(,host-type)))

(define (rmt:no-sync-job-records-clean)
  (rmt:send-receive 'no-sync-job-records-clean #f '()))

;;======================================================================
;; A R C H I V E S
;;======================================================================

(define (rmt:archive-get-allocations testname itempath dneeded)
  (rmt:send-receive 'archive-get-allocations #f (list testname itempath dneeded)))
︙ | ︙ |
Modified server.scm from [39f1590bf7] to [7b2af2dc7e].
︙ | ︙ | |||
Lines 583-603:
            (debug:print 0 *default-log-port* "INFO: syncer thread NOT sleeping ; maybe time-to-exit...")))

      (if (not (configf:lookup *configdat* "server" "disable-db-snapshot"))
          (common:snapshot-file mtdbfile subdir: ".db-snapshot"))
      (delete-file* staging-file)
      (let* ((start-time (current-milliseconds))
             (res (system sync-cmd))
             (dbbackupfile (conc mtdbfile ".backup"))
             (res2
              (cond
               ((eq? 0 res )
                (if (file-exists? dbbackupfile)
                    (delete-file* dbbackupfile)
                    )
                (if (eq? 0 (file-size sync-log))
                    (delete-file sync-log))
                (system (conc "/bin/mv " staging-file " " mtdbfile))

                (set! last-sync-seconds (/ (- (current-milliseconds) start-time) 1000))
                (set! off-time (calculate-off-time
                                last-sync-seconds
︙ | ︙ |
Modified tests.scm from [0094b671e6] to [af455125f4].
︙ | ︙ | |||
Lines 1942-1998:
         (lambda (count)
           (set! res count))
         tdb
         "SELECT count(id) FROM test_rundat;")
        res))
    0)

;;
(define (tests:update-central-meta-info run-id test-id cpuload diskfree minutes uname hostname #!key (update-db #f)(tmpfree #f))
  (if (get-environment-variable "MT_TEST_RUN_DIR")
      (let* ((dest-dir (conc (get-environment-variable "MT_TEST_RUN_DIR") "/.mt_data"))
             (or-dash  (lambda (instr)
                         (cond
                          ((not instr) "") ;; #f -> blank, indicates value unchanged since last measurement taken
                          ((string? instr)(if (string-search " " instr)
                                              (conc "\"" instr "\"")
                                              instr))
                          (else instr))))
             (file-new (not (directory-exists? dest-dir))))
        (if file-new
            (create-directory dest-dir #t))
        (let* ((outp (open-output-file (conc dest-dir "/test-run.dat") #:append)))
          (with-output-to-port outp
            (lambda ()
              (if file-new
                  (print "epoch_time,run_id,test_id,cpuload,diskfree,tmpfree,run_minutes,hostname,uname"))
              (print (current-seconds) ","
                     (or-dash run-id) ","
                     (or-dash test-id) ","
                     (or-dash cpuload) ","
                     (or-dash diskfree) ","
                     (or-dash tmpfree) ","
                     (or-dash minutes) ","
                     (or-dash hostname) ","
                     (or-dash uname)))) ;; put uname last as it has spaces in it
          (close-output-port outp)))
      (begin
        (rmt:general-call 'update-test-rundat run-id test-id (current-seconds) (or cpuload -1)(or diskfree -1) -1 (or minutes -1))))
  (if update-db
      (begin
        (if (and cpuload diskfree)
            (rmt:general-call 'update-cpuload-diskfree run-id cpuload diskfree test-id))
        (if minutes
            (rmt:general-call 'update-run-duration run-id minutes test-id))
        (if (and uname hostname)
            (rmt:general-call 'update-uname-host run-id uname hostname test-id)))))

;; This one is for running with no db access (i.e. via rmt: internally)
(define (tests:set-full-meta-info db test-id run-id minutes work-area remtries #!key (update-db #f))
;; (define (tests:set-full-meta-info test-id run-id minutes work-area)
;;   (let ((remtries 10))
  (let* ((cpuload  (get-cpu-load))
         (diskfree (get-df (current-directory)))
         (tmpfree  (get-df "/tmp"))
         (uname    (get-uname "-srvpio"))
         (hostname (get-host-name)))
    (tests:update-central-meta-info run-id test-id cpuload diskfree minutes uname hostname update-db: update-db tmpfree: tmpfree)))

;;======================================================================
;; A R C H I V I N G
;;======================================================================

(define (test:archive db test-id)
  #f)
︙ | ︙ |