Megatest: Changes On Branch d616d48e39958584

Changes In Branch v1.65-lazyqueue-items-rollup Through [d616d48e39] Excluding Merge-Ins

This is equivalent to a diff from 2769e4b7c9 to d616d48e39

2021-03-01
17:42		Manually patched in the new view check-in: f5206150ee user: mrwellan tags: v1.6569-new-view
2021-02-25
22:05		remove-mutex From: 022801133154585551c60feee4baf6758b91f993 User: matt Closed-Leaf check-in: d5bb9fca6e user: matt tags: v1.65-lazyqueue-items-rollup
22:05		add finalize of no-sync and re-enable the mutex From: 1775254e3f45db753fe942e0f142c3ec732dbcb6 User: matt check-in: d616d48e39 user: matt tags: v1.65-lazyqueue-items-rollup
2021-01-26
21:58		Added sleep to prevent high speed query stream from causing a locked db check-in: 82a0bf90a6 user: matt tags: v1.65-lazyqueue-items-rollup
14:00		Fix for the > crash. Maybe... Leaf check-in: 5a05fc04ff user: matt tags: v1.6569-gt-crash-fix
2021-01-25
12:03		rebased lazy-queue rollup check-in: 07ab120544 user: matt tags: v1.65-lazyqueue-items-rollup
2021-01-15
22:46		begin diet check-in: badd71f3b3 user: matt tags: v1.6569-diet
21:34		eval-string-in-environment if was disabled, re-enabled check-in: 9564772564 user: matt tags: v1.6569-reenable-eval-if
2021-01-08
11:42		enable custom value for max delay between archive time and test last update time Leaf check-in: 86a3d1148e user: pjhatwal tags: v1.6569-refactor
2020-11-25
12:00		Fixed issues in server gating code Leaf check-in: 063273e8cb user: mrwellan tags: v1.6569-server-gate-fix
2020-11-24
22:27		Added support for resetting run - allows to reload tests-paths to add tests to a run part way though. Just run megatest -clean-cache -runname $MT_RUNNAME Leaf check-in: 213021e02d user: mrwellan tags: v1.6596-reload-tests-paths
2020-10-13
16:46		Changed version from 69 to 76. No other changes. Will compile with chicken 13 check-in: 87ca35010f user: mmgraham tags: v1.65, v1.6576
2020-10-12
16:49		Reduced message from failed to info. Reverted a delay which seems to help pass full stack ext-tests. Leaf check-in: 9e35b1252c user: mrwellan tags: v1.65-minor-patch
10:18		Safe vector access in rmt. check-in: 58bb6d997a user: mrwellan tags: v1.65-side2
2020-10-11
22:46		Patched forward adjutant code. check-in: f936717bfa user: matt tags: v1.65-adjutant-again
2020-10-05
22:49		Do not exit on failure to create directory - race conditons on NFS cause false fail scenarios - just keep going and cross your fingers... (cherrypicked from v1.6572) check-in: 05b253a452 user: matt tags: v1.65-sidework
22:46		run duration testdat check-in: 4a0b43f3c6 user: matt tags: v1.65-test-rundat2
2020-09-21
15:36		merged in 1.65-test-rundat branch ==/FAIL/orion,mars/== check-in: cfd25d66e9 user: mmgraham tags: v1.6571, v1.65-failed-testdat
07:00		Added get-testsuite-name all over launch:setup and still not set when needed! This did NOT work. Closed-Leaf check-in: 2efe8ad422 user: mrwellan tags: v1.65-get-testsuitename
2020-09-19
04:21		Start moving test_rundat to no-sync db. ==/20/2/WARN/1203/mars/== check-in: abfabdb839 user: matt tags: v1.65-test-rundat
2020-09-18
17:30		added check for file existence before file delete ==/14/1.9/WARN/orion,mars/== NOTE: This is the last v1.65 before the split off. I.e code from before this point IS in the far future v1.65 branch. Code from this point to that branch might NOT be in the branch. check-in: 2769e4b7c9 user: mmgraham tags: v1.65, v1.6569
12:27		cherry picked 2 fixes, changed version to 1.6569 ==/7.2/2.0/PASS/1201/mars/== check-in: d145d0eb02 user: mmgraham tags: v1.65

Modified common.scm from [33c7316880] to [775d33a7fc].

Modified db.scm from [fb3a18f52f] to [d64b3f8e76].

Modified launch.scm from [d0067277fa] to [4f3065814f].

Modified rmt.scm from [39d97c528a] to [169f275811].

Modified runs.scm from [030b929939] to [01904d0a79].

︙			︙
2160 2161 2162 2163 2164 2165 2166 ~~2167~~ 2168 2169 ~~2170~~ 2171 2172 2173 2174 2175 2176 2177	(let ((db (db:open-no-sync-db))) (set! no-sync-db db) db)))) (mutex-unlock! db-access-mutex) res)) (define (db:no-sync-set db var val) ~~(sqlite3:execute (db:no-sync-db db) "INSERT OR REPLACE INTO no_sync_metadat (var,val) VALUES (?,?);" var val))~~ (define (db:no-sync-del! db var) ~~(sqlite3:execute (db:no-sync-db db) "DELETE FROM no_sync_metadat WHERE var=?;" var))~~ (define (db:no-sync-get/default db var default) (let ((res default)) (sqlite3:for-each-row (lambda (val) (set! res val)) (db:no-sync-db db)	> \| > > > \| > >	2160 2161 2162 2163 2164 2165 2166 2167 2168 2169 2170 2171 2172 2173 2174 2175 2176 2177 2178 2179 2180 2181 2182 2183	(let ((db (db:open-no-sync-db))) (set! no-sync-db db) db)))) (mutex-unlock! db-access-mutex) res)) (define (db:no-sync-set db var val) ;; (mutex-lock! db-access-mutex) (let ((res (sqlite3:execute (db:no-sync-db db) "INSERT OR REPLACE INTO no_sync_metadat (var,val) VALUES (?,?);" var val))) ;; (mutex-unlock! db-access-mutex) res)) (define (db:no-sync-del! db var) ;; (mutex-lock! db-access-mutex) (sqlite3:execute (db:no-sync-db db) "DELETE FROM no_sync_metadat WHERE var=?;" var) ;; (mutex-unlock! db-access-mutex) ) (define (db:no-sync-get/default db var default) (let ((res default)) (sqlite3:for-each-row (lambda (val) (set! res val)) (db:no-sync-db db)
︙			︙
3213 3214 3215 3216 3217 3218 3219 ~~3220 3221 3222 3223~~ 3224 3225 3226 3227 3228 3229 3230	(if newstatus (sqlite3:execute db "UPDATE tests SET status=? WHERE id=?;" newstatus test-id)) (if newcomment (sqlite3:execute db "UPDATE tests SET comment=? WHERE id=?;" newcomment ;; (sdb:qry 'getid newcomment) test-id)))))) (mt:process-triggers dbstruct run-id test-id newstate newstatus)) ;; NEW BEHAVIOR: Count tests running in all runs! ;; (define (db:get-count-tests-running dbstruct run-id fastmode) (let* ((qry (if fastmode "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND NOT (uname = 'n/a' AND item_path = '') LIMIT 1;" "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND NOT (uname = 'n/a' AND item_path = '');"))) (db:with-db dbstruct run-id #f (lambda (db) (let* ((stmth (db:get-cache-stmth dbstruct db qry))) (sqlite3:first-result stmth))))))	\| \| \| \|	3219 3220 3221 3222 3223 3224 3225 3226 3227 3228 3229 3230 3231 3232 3233 3234 3235 3236	(if newstatus (sqlite3:execute db "UPDATE tests SET status=? WHERE id=?;" newstatus test-id)) (if newcomment (sqlite3:execute db "UPDATE tests SET comment=? WHERE id=?;" newcomment ;; (sdb:qry 'getid newcomment) test-id)))))) (mt:process-triggers dbstruct run-id test-id newstate newstatus)) ;; NEW BEHAVIOR: Count tests running in all runs! ;; (define (db:get-count-tests-running dbstruct run-id) ;; fastmode) (let* ((qry ;; (if fastmode ;; "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND NOT (uname = 'n/a' AND item_path = '') LIMIT 1;" "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND NOT (uname = 'n/a' AND item_path = '');")) ;; ) (db:with-db dbstruct run-id #f (lambda (db) (let* ((stmth (db:get-cache-stmth dbstruct db qry))) (sqlite3:first-result stmth))))))
︙			︙
3243 3244 3245 3246 3247 3248 3249 ~~3250 3251 3252 3253~~ 3254 3255 3256 3257 3258 3259 3260	;; "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND run_id NOT IN (SELECT id FROM runs WHERE state='deleted') AND NOT (uname = 'n/a' AND item_path = '');") "SELECT count(id) FROM tests WHERE state in ('RUNNING','REMOTEHOSTSTART','LAUNCHED') AND run_id=?;" run-id)))) ;; NOT IN (SELECT id FROM runs WHERE state='deleted');") ;; NEW BEHAVIOR: Look only at single run with run-id ;; ;; (define (db:get-running-stats dbstruct run-id) (define (db:get-count-tests-running-for-run-id dbstruct run-id fastmode) (let* ((qry (if fastmode "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND run_id=? LIMIT 1;" "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND run_id=?;"))) (db:with-db dbstruct run-id #f (lambda (db) (let* ((stmth (db:get-cache-stmth dbstruct db qry))) (sqlite3:first-result stmth run-id))))))	\| \| \| \|	3249 3250 3251 3252 3253 3254 3255 3256 3257 3258 3259 3260 3261 3262 3263 3264 3265 3266	;; "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND run_id NOT IN (SELECT id FROM runs WHERE state='deleted') AND NOT (uname = 'n/a' AND item_path = '');") "SELECT count(id) FROM tests WHERE state in ('RUNNING','REMOTEHOSTSTART','LAUNCHED') AND run_id=?;" run-id)))) ;; NOT IN (SELECT id FROM runs WHERE state='deleted');") ;; NEW BEHAVIOR: Look only at single run with run-id ;; ;; (define (db:get-running-stats dbstruct run-id) (define (db:get-count-tests-running-for-run-id dbstruct run-id) ;; fastmode) (let* ((qry ;; (if fastmode ;; "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND run_id=? LIMIT 1;" "SELECT count(id) FROM tests WHERE state in ('RUNNING','LAUNCHED','REMOTEHOSTSTART') AND run_id=?;")) ;; ) (db:with-db dbstruct run-id #f (lambda (db) (let* ((stmth (db:get-cache-stmth dbstruct db qry))) (sqlite3:first-result stmth run-id))))))
︙			︙
3966 3967 3968 3969 3970 3971 3972 3973 3974 3975 3976 3977 3978 ~~3979~~ 3980 3981 3982 3983 3984 3985 3986	;; ;; (db:general-call dbdat 'state-status-msg (list state status msg test-id)) ;; ;; (db:general-call dbdat 'state-status (list state status test-id))) ;; (db:set-state-status-and-roll-up-items dbstruct run-id test-id #f state status msg) ;; ;; process the test_data table ;; (if (and test-id state status (equal? status "AUTO")) ;; (db:test-data-rollup dbstruct run-id test-id status)) ;; (mt:process-triggers dbstruct run-id test-id state status))) ;; state is the priority rollup of all states ;; status is the priority rollup of all completed statesfu ;; ;; if test-name is an integer work off that instead of test-name test-path ;; ~~(define (db:set-state-status-and-roll-up-items dbstruct run-id test-name item-path state status comment)~~ ;; establish info on incoming test followed by info on top level test ;; BBnote - for mode itemwait, linkage between upstream test & matching item status is propagated to run queue in db:prereqs-not-met (let* ((testdat (if (number? test-name) (db:get-test-info-by-id dbstruct run-id test-name) ;; test-name is actually a test-id (db:get-test-info dbstruct run-id test-name item-path))) (test-id (db:test-get-id testdat)) (test-name (if (number? test-name)	> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > \|	3972 3973 3974 3975 3976 3977 3978 3979 3980 3981 3982 3983 3984 3985 3986 3987 3988 3989 3990 3991 3992 3993 3994 3995 3996 3997 3998 3999 4000 4001 4002 4003 4004 4005 4006 4007 4008 4009 4010 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 4078 4079 4080 4081 4082 4083 4084 4085 4086 4087 4088 4089 4090 4091 4092 4093 4094 4095 4096 4097 4098 4099 4100 4101 4102 4103 4104 4105 4106 4107 4108 4109 4110	;; ;; (db:general-call dbdat 'state-status-msg (list state status msg test-id)) ;; ;; (db:general-call dbdat 'state-status (list state status test-id))) ;; (db:set-state-status-and-roll-up-items dbstruct run-id test-id #f state status msg) ;; ;; process the test_data table ;; (if (and test-id state status (equal? status "AUTO")) ;; (db:test-data-rollup dbstruct run-id test-id status)) ;; (mt:process-triggers dbstruct run-id test-id state status))) ;; NOT FINISHED (define (db:calc-state-status-toplevel state status tl-state tl-status) `(,state ,status)) ;; (match state ;; (("COMPLETED") ;; (match `(,tl-state ,tl-status) ;; (("COMPLETED" "PASS") `(,state ,status)) ;; (("COMPLETED" thestatus) ;; (case (string->symbol thestatus) ;; ((ABORT CHECK DEAD) ;; (if `("COMPLETED" ,thestatus)) ;; (match `(,thestatus ,status) ;; (("FAIL" "ABORT") '("COMPLETED" "ABORT")) ;; (("FAIL" "CHECK") '("COMPLETED" "CHECK")) ;; (("FAIL" "DEAD") '("COMPLETED" "DEAD")) ;; (("WARN" "FAIL") '("COMPLETED" "FAIL")) ;; (("WARN" "CHECK") '("COMPLETED" "CHECK")) ;; (("WARN" "DEAD") (define (db:set-state-status-and-roll-up-items dbstruct run-id test-name item-path state status comment) ;; establish info on incoming test followed by info on top level test ;; BBnote - for mode itemwait, linkage between upstream test & matching item status is propagated to run queue in db:prereqs-not-met (mutex-lock! db-transaction-mutex) ;; why do we need a mutex? (let* ((testdat (if (number? test-name) (db:get-test-info-by-id dbstruct run-id test-name) ;; test-name is actually a test-id (db:get-test-info dbstruct run-id test-name item-path))) (test-id (db:test-get-id testdat)) (test-name (if (number? test-name) (db:test-get-testname testdat) test-name)) (no-sync-db (db:no-sync-db #f)) (rollup-flag #f) (wait-flag #f) (rollup-lock-key (conc run-id "-rollup-" test-name)) (waiting-lock-key (conc run-id "-waiting-" test-name))) (db:test-set-state-status dbstruct run-id test-id state status #f) (if (and test-id state status (equal? status "AUTO")) (db:test-data-rollup dbstruct run-id test-id status)) (if (member state '("LAUNCHED" "REMOTEHOSTSTART")) (db:general-call dbstruct 'set-test-start-time (list test-id))) (if (not (equal? item-path "")) ;; only roll up IF incoming test is an item (begin ;; is there a rollup lock? If not, take it (sqlite3:with-transaction no-sync-db (lambda () ;; (debug:print 0 default-log-port "EXCEPTION: exn="exn) (let* ((rollup-lock-time (db:no-sync-get/default no-sync-db rollup-lock-key #f)) (waiting-lock-time (db:no-sync-get/default no-sync-db waiting-lock-key #f))) (if rollup-lock-time ;; someone is doing a rollup (if (not waiting-lock-time) ;; no one is waiting (begin (set! wait-flag #t) (set! rollup-flag #t) (db:no-sync-set no-sync-db waiting-lock-key (current-seconds)))) ;; we are going to wait (begin (set! rollup-flag #t) (db:no-sync-set no-sync-db rollup-lock-key (current-seconds))))))) (if wait-flag (let loop ((count 10)) ;; about 20 seconds (thread-sleep! 2) (if (and (not (db:no-sync-get/default no-sync-db waiting-lock-key #f)) (> count 0)) (loop (+ count 1)) (sqlite3:with-transaction no-sync-db (lambda () (db:no-sync-set no-sync-db rollup-lock-key (current-seconds)) (db:no-sync-del! no-sync-db waiting-lock-key)))))) ;; now the rollup (mutex-unlock! db-transaction-mutex) ;; why do we need a mutex? (if rollup-flag ;; put this into a thread (begin ;; (thread-start! (make-thread ;; (lambda () (db:roll-up-test-state-status dbstruct run-id test-name state status) (db:no-sync-del! no-sync-db rollup-lock-key)) ;; (conc "thread for run-id: " run-id " test-name: " test-name)))))))) )) (mutex-unlock! db-transaction-mutex) ;; why do we need a mutex? ))) ;; I'd like to remove the need for item-path - it is logically not needed here ;; for now we pass in state and status - NOTE: There is a possible race if a test ;; is rapidly re-run while an earlier run is waiting to rollup. ;; (define (db:roll-up-test-state-status dbstruct run-id test-name state status) (let* ((testdat (if (number? test-name) (db:get-test-info-by-id dbstruct run-id test-name) ;; test-name is actually a test-id (db:get-test-info dbstruct run-id test-name ""))) (test-id (db:test-get-id testdat)) (test-name (if (number? test-name) (db:test-get-testname testdat) test-name)) (tl-testdat (db:get-test-info dbstruct run-id test-name "")) (tl-test-id (db:test-get-id tl-testdat))) (db:with-db dbstruct #f #f (lambda (db) ;; NB// Pass the db so it is part fo the transaction ;; item-path is used in get-all-state-status counts to exclude current state/status of THIS test ;; but with the state/status being set earlier this is not needed any longer (let* ((state-status-counts (db:get-all-state-status-counts-for-testname dbstruct run-id test-name)) (state-statuses (if (null? state-status-counts) '() (db:roll-up-rules state-status-counts state status))) (newstate (if (null? state-statuses) state (car state-statuses))) (newstatus (if (null? state-statuses) status (cadr state-statuses)))) (if tl-test-id (db:test-set-state-status db run-id tl-test-id newstate newstatus #f)) ))) #t)) ;; state is the priority rollup of all states ;; status is the priority rollup of all completed statesfu ;; ;; if test-name is an integer work off that instead of test-name test-path ;; (define (db:set-state-status-and-roll-up-items-orig dbstruct run-id test-name item-path state status comment) ;; establish info on incoming test followed by info on top level test ;; BBnote - for mode itemwait, linkage between upstream test & matching item status is propagated to run queue in db:prereqs-not-met (let* ((testdat (if (number? test-name) (db:get-test-info-by-id dbstruct run-id test-name) ;; test-name is actually a test-id (db:get-test-info dbstruct run-id test-name item-path))) (test-id (db:test-get-id testdat)) (test-name (if (number? test-name)
︙			︙
3997 3998 3999 4000 4001 4002 4003 ~~4004~~ 4005 4006 4007 ~~4008 4009 4010~~ 4011 4012 4013 4014 4015 4016 4017 4018 4019 4020 4021 4022 4023 4024 4025 4026 4027 4028 4029 4030 4031 4032 4033 4034 4035 4036 4037 4038 4039 4040 4041 4042 4043 4044 4045 4046 4047 4048 4049 4050 4051 4052 4053 4054 4055 4056 4057 4058 4059 4060 4061 4062 4063 4064 4065 4066 4067 4068 4069 4070 4071 4072 4073 4074 4075 4076 4077 ~~4078 4079 4080~~ 4081 4082 ~~4083~~ 4084 4085 4086 4087 4088 4089 4090 4091 ~~4092~~ ~~4093~~ ~~4094~~ 4095 4096 ~~4097~~ 4098 4099 4100 4101 4102 4103 4104	(db:with-db dbstruct #f #f (lambda (db) (let ((tr-res (sqlite3:with-transaction db (lambda () ~~;; NB// Pass the db so it is part fo the transaction~~ (db:test-set-state-status db run-id test-id state status comment) ;; this call sets the item state/status (if (not (equal? item-path "")) ;; only roll up IF incoming test is an item (let* ((state-status-counts (db:get-all-state-status-counts-for-test dbstruct run-id test-name item-path state status)) ;; item-path is used to exclude current state/status of THIS test ~~(state-sta~~uses~~ (db:roll-up-rules state-status-counts state status)) (newstate (car state-stauses)) (newstatus (cadr state-stauses)))~~ (debug:print 4 default-log-port "BB> tl-test-id="tl-test-id" ; "test-name":"item-path" newstate="newstate" newstatus="newstatus" len(sscs)="(length state-status-counts) " state-status-counts: " (apply conc (map (lambda (x) (conc (with-output-to-string (lambda () (pp (dbr:counts->alist x)))) " \| ")) state-status-counts))); end debug:print (if tl-test-id (db:test-set-state-status db run-id tl-test-id newstate newstatus #f)) ;; we are still in the transaction - must access the db and not the dbstruct )))))) (mutex-unlock! db-transaction-mutex) (if (and test-id state status (equal? status "AUTO")) (db:test-data-rollup dbstruct run-id test-id status)) tr-res))))) (define (db:roll-up-rules state-status-counts state status) (let* ((running (length (filter (lambda (x) (member (dbr:counts-state x) common:running-states)) state-status-counts))) (bad-not-started (length (filter (lambda (x) (and (equal? (dbr:counts-state x) "NOT_STARTED") (not (member (dbr:counts-status x) common:not-started-ok-statuses)))) state-status-counts))) (all-curr-states (common:special-sort ;; worst -> best (sort of) (delete-duplicates (if (and state (not (member state common:dont-roll-up-states))) (cons state (map dbr:counts-state state-status-counts)) (map dbr:counts-state state-status-counts))) common:std-states >)) (all-curr-statuses (common:special-sort ;; worst -> best (delete-duplicates (if (and state status (not (member state common:dont-roll-up-states))) (cons status (map dbr:counts-status state-status-counts)) (map dbr:counts-status state-status-counts))) common:std-statuses >)) (non-completes (filter (lambda (x) (not (member x (cons "COMPLETED" common:dont-roll-up-states)))) all-curr-states)) (preq-fails (filter (lambda (x) (equal? x "PREQ_FAIL")) all-curr-statuses)) (num-non-completes (length non-completes)) (newstate (cond ((> running 0) "RUNNING") ;; anything running, call the situation running ((> (length preq-fails) 0) "NOT_STARTED") ((> bad-not-started 0) "COMPLETED") ;; we have an ugly situation, it is completed in the sense we cannot do more. ((> num-non-completes 0) (car non-completes)) ;; (remove (lambda (x)(equal? "COMPLETED" x)) all-curr-states))) ;; only rollup DELETED if all DELETED (else (car all-curr-states)))) (newstatus (cond ((> (length preq-fails) 0) "PREQ_FAIL") ((or (> bad-not-started 0) (and (equal? newstate "NOT_STARTED") (> num-non-completes 0))) "STARTED") (else (car all-curr-statuses))))) (debug:print-info 2 default-log-port "\n--> probe db:set-state-status-and-roll-up-items: " "\n--> state-status-counts: "(map dbr:counts->alist state-status-counts) "\n--> running: "running "\n--> bad-not-started: "bad-not-started "\n--> non-non-completes: "num-non-completes "\n--> non-completes: "non-completes "\n--> all-curr-states: "all-curr-states "\n--> all-curr-statuses: "all-curr-statuses "\n--> newstate "newstate "\n--> newstatus "newstatus "\n\n") ~~;; NB// Pass the db so it is part of the transaction (list newstate newstatus)))~~ (define (db:set-state-status-and-roll-up-run dbstruct run-id curr-state curr-status) ~~(mutex-lock! db-transaction-mutex)~~ (db:with-db dbstruct #f #f (lambda (db) (let ((tr-res (sqlite3:with-transaction db (lambda () (let* ((state-status-counts (db:get-all-state-status-counts-for-run dbstruct run-id)) ~~(state-sta~~uses~~ (db:roll-up-rules state-status-counts #f #f ))~~ ~~~~(newstate~~ (car state-sta~~uses~~))~~ ~~~~(newstatus~~ (cadr state-stauses)))~~ (if (or (not (eq? newstate curr-state)) (not (eq? newstatus curr-status))) (db:set-run-state-status dbstruct run-id newstate newstatus ))))))) ~~(mutex-unlock! db-transaction-mutex)~~ tr-res)))) (define (db:get-all-state-status-counts-for-run dbstruct run-id) (let* ((test-count-recs (db:with-db dbstruct #f #f (lambda (db)	\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| < \| \| \| \| > > \| > > \| \|	4121 4122 4123 4124 4125 4126 4127 4128 4129 4130 4131 4132 4133 4134 4135 4136 4137 4138 4139 4140 4141 4142 4143 4144 4145 4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159 4160 4161 4162 4163 4164 4165 4166 4167 4168 4169 4170 4171 4172 4173 4174 4175 4176 4177 4178 4179 4180 4181 4182 4183 4184 4185 4186 4187 4188 4189 4190 4191 4192 4193 4194 4195 4196 4197 4198 4199 4200 4201 4202 4203 4204 4205 4206 4207 4208 4209 4210 4211 4212 4213 4214 4215 4216 4217 4218 4219 4220 4221 4222 4223 4224 4225 4226 4227 4228 4229 4230 4231	(db:with-db dbstruct #f #f (lambda (db) (let ((tr-res (sqlite3:with-transaction db (lambda () ;; NB// Pass the db so it is part of the transaction (db:test-set-state-status db run-id test-id state status comment) ;; this call sets the item state/status (if (not (equal? item-path "")) ;; only roll up IF incoming test is an item (let* ((state-status-counts (db:get-all-state-status-counts-for-test dbstruct run-id test-name item-path state status)) ;; item-path is used to exclude current state/status of THIS test (state-statuses (db:roll-up-rules state-status-counts state status)) (newstate (car state-statuses)) (newstatus (cadr state-statuses))) (debug:print 4 default-log-port "BB> tl-test-id="tl-test-id" ; "test-name":"item-path" newstate="newstate" newstatus="newstatus" len(sscs)="(length state-status-counts) " state-status-counts: " (apply conc (map (lambda (x) (conc (with-output-to-string (lambda () (pp (dbr:counts->alist x)))) " \| ")) state-status-counts))); end debug:print (if tl-test-id (db:test-set-state-status db run-id tl-test-id newstate newstatus #f)) ;; we are still in the transaction - must access the db and not the dbstruct )))))) (mutex-unlock! db-transaction-mutex) (if (and test-id state status (equal? status "AUTO")) (db:test-data-rollup dbstruct run-id test-id status)) tr-res))))) (define (db:roll-up-rules state-status-counts state status) (let* ((running (length (filter (lambda (x) (member (dbr:counts-state x) common:running-states)) state-status-counts))) (bad-not-started (length (filter (lambda (x) (and (equal? (dbr:counts-state x) "NOT_STARTED") (not (member (dbr:counts-status x) common:not-started-ok-statuses)))) state-status-counts))) (all-curr-states (common:special-sort ;; worst -> best (sort of) (delete-duplicates (if (and state (not (member state common:dont-roll-up-states))) (cons state (map dbr:counts-state state-status-counts)) (map dbr:counts-state state-status-counts))) common:std-states >)) (all-curr-statuses (common:special-sort ;; worst -> best (delete-duplicates (if (and state status (not (member state common:dont-roll-up-states))) (cons status (map dbr:counts-status state-status-counts)) (map dbr:counts-status state-status-counts))) common:std-statuses >)) (non-completes (filter (lambda (x) (not (member x (cons "COMPLETED" common:dont-roll-up-states)))) all-curr-states)) (preq-fails (filter (lambda (x) (equal? x "PREQ_FAIL")) all-curr-statuses)) (num-non-completes (length non-completes)) (newstate (cond ((> running 0) "RUNNING") ;; anything running, call the situation running ((> (length preq-fails) 0) "NOT_STARTED") ((> bad-not-started 0) "COMPLETED") ;; we have an ugly situation, it is completed in the sense we cannot do more. ((> num-non-completes 0) (car non-completes)) ;; (remove (lambda (x)(equal? "COMPLETED" x)) all-curr-states))) ;; only rollup DELETED if all DELETED (else (car all-curr-states)))) (newstatus (cond ((> (length preq-fails) 0) "PREQ_FAIL") ((or (> bad-not-started 0) (and (equal? newstate "NOT_STARTED") (> num-non-completes 0))) "STARTED") (else (car all-curr-statuses))))) (debug:print-info 2 default-log-port "\n--> probe db:set-state-status-and-roll-up-items: " "\n--> state-status-counts: "(map dbr:counts->alist state-status-counts) "\n--> running: "running "\n--> bad-not-started: "bad-not-started "\n--> non-non-completes: "num-non-completes "\n--> non-completes: "non-completes "\n--> all-curr-states: "all-curr-states "\n--> all-curr-statuses: "all-curr-statuses "\n--> newstate "newstate "\n--> newstatus "newstatus "\n\n") ;; NB// Pass the db so it is part of the transaction (list newstate newstatus))) (define (db:set-state-status-and-roll-up-run dbstruct run-id curr-state curr-status) ;; (mutex-lock! db-transaction-mutex) (db:with-db dbstruct #f #f (lambda (db) (let ((tr-res (sqlite3:with-transaction db (lambda () (let* ((state-status-counts (db:get-all-state-status-counts-for-run dbstruct run-id)) (state-statuses (db:roll-up-rules state-status-counts #f #f )) (newstate (if (null? state-statuses) curr-state (car state-statuses))) (newstatus (if (null? state-statuses) curr-status (cadr state-statuses)))) (if (or (not (eq? newstate curr-state)) (not (eq? newstatus curr-status))) (db:set-run-state-status dbstruct run-id newstate newstatus ))))))) ;; (mutex-unlock! db-transaction-mutex) tr-res)))) (define (db:get-all-state-status-counts-for-run dbstruct run-id) (let* ((test-count-recs (db:with-db dbstruct #f #f (lambda (db)
︙			︙
4146 4147 4148 4149 4150 4151 4152 4153 4154 4155 4156 4157 4158 4159	(nonmatch-countrec-lambda (lambda (countrec) (not (match-countrec-lambda countrec)))) (unrelated-rec-list (filter nonmatch-countrec-lambda other-items-count-recs))) (cons updated-count-rec unrelated-rec-list))) ;; (define (db:get-all-item-states db run-id test-name) ;; (sqlite3:map-row ;; (lambda (a) a) ;; db ;; "SELECT DISTINCT state FROM tests WHERE item_path != '' AND state != 'DELETED' AND run_id=? AND testname=?" ;; run-id test-name))	> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >	4273 4274 4275 4276 4277 4278 4279 4280 4281 4282 4283 4284 4285 4286 4287 4288 4289 4290 4291 4292 4293 4294 4295 4296 4297 4298 4299 4300 4301 4302 4303 4304 4305 4306 4307 4308 4309 4310 4311 4312 4313 4314 4315 4316	(nonmatch-countrec-lambda (lambda (countrec) (not (match-countrec-lambda countrec)))) (unrelated-rec-list (filter nonmatch-countrec-lambda other-items-count-recs))) (cons updated-count-rec unrelated-rec-list))) ;; full count not including toplevel ;; (define (db:get-all-state-status-counts-for-testname dbstruct run-id test-name) (let* ((test-count-recs (db:with-db dbstruct #f #f (lambda (db) (sqlite3:map-row (lambda (state status count) (make-dbr:counts state: state status: status count: count)) db "SELECT state,status,count(id) FROM tests WHERE run_id=? AND testname=? AND item_path != '' GROUP BY state,status;" run-id test-name))))) test-count-recs)) ;; full count not including toplevel ;; (define (db:get-all-state-status-counts-for-testname dbstruct run-id test-name) (let* ((test-count-recs (db:with-db dbstruct #f #f (lambda (db) (sqlite3:map-row (lambda (state status count) (make-dbr:counts state: state status: status count: count)) db "SELECT state,status,count(id) FROM tests WHERE run_id=? AND testname=? AND item_path != '' GROUP BY state,status;" run-id test-name))))) test-count-recs)) ;; (define (db:get-all-item-states db run-id test-name) ;; (sqlite3:map-row ;; (lambda (a) a) ;; db ;; "SELECT DISTINCT state FROM tests WHERE item_path != '' AND state != 'DELETED' AND run_id=? AND testname=?" ;; run-id test-name))
︙			︙

︙			︙
319 320 321 322 323 324 325 ~~326~~ 327 328 329 330 331 332 333	;; Take advantage of a good place to exit if running the one-pass methodology (if (and (> (runs:dat-can-run-more-tests-count runsdat) 20) (args:get-arg "-one-pass")) (exit 0)) (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) ~~(let* ((num-running (rmt:get-count-tests-running run-id ~~#f)) ;; fastmode=no~~~~ (num-running-in-jobgroup (rmt:get-count-tests-running-in-jobgroup run-id jobgroup)) (job-group-limit (let ((jobg-count (configf:lookup configdat "jobgroups" jobgroup))) (if (string? jobg-count) (string->number jobg-count) jobg-count)))) (if (> (+ num-running num-running-in-jobgroup) 0) (runs:inc-can-run-more-tests-count runsdat)) ;; (set! runs:can-run-more-tests-count (+ runs:can-run-more-tests-count 1)))	\|	319 320 321 322 323 324 325 326 327 328 329 330 331 332 333	;; Take advantage of a good place to exit if running the one-pass methodology (if (and (> (runs:dat-can-run-more-tests-count runsdat) 20) (args:get-arg "-one-pass")) (exit 0)) (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (let* ((num-running (rmt:get-count-tests-running run-id)) (num-running-in-jobgroup (rmt:get-count-tests-running-in-jobgroup run-id jobgroup)) (job-group-limit (let ((jobg-count (configf:lookup configdat "jobgroups" jobgroup))) (if (string? jobg-count) (string->number jobg-count) jobg-count)))) (if (> (+ num-running num-running-in-jobgroup) 0) (runs:inc-can-run-more-tests-count runsdat)) ;; (set! runs:can-run-more-tests-count (+ runs:can-run-more-tests-count 1)))
︙			︙
1563 1564 1565 1566 1567 1568 1569 ~~1570~~ 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592	)) extras) extras) '()))) (waitons (delete-duplicates (append (tests:testqueue-get-waitons test-record) extra-waits) equal?)) (newtal (append tal (list hed))) (regfull (>= (length reg) reglen)) ~~(num-running (rmt:get-count-tests-running-for-run-id run-id ~~#t)) ;; fastmode=yes~~~~ (testdat (make-runs:testdat hed: hed tal: tal reg: reg reruns: reruns test-record: test-record test-name: test-name item-path: item-path jobgroup: jobgroup waitons: waitons testmode: testmode newtal: newtal itemmaps: itemmaps ;; prereqs-not-met: prereqs-not-met ))) (runs:dat-regfull-set! runsdat regfull) (if (> num-running 0) (set! last-time-some-running (current-seconds))) (if (> (current-seconds)(+ last-time-some-running (or (configf:lookup configdat "setup" "give-up-waiting") 36000))) (hash-table-set! max-tries-hash tfullname (+ (hash-table-ref/default max-tries-hash tfullname 0) 1)))	\| >	1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593	)) extras) extras) '()))) (waitons (delete-duplicates (append (tests:testqueue-get-waitons test-record) extra-waits) equal?)) (newtal (append tal (list hed))) (regfull (>= (length reg) reglen)) (num-running (rmt:get-count-tests-running-for-run-id run-id)) (testdat (make-runs:testdat hed: hed tal: tal reg: reg reruns: reruns test-record: test-record test-name: test-name item-path: item-path jobgroup: jobgroup waitons: waitons testmode: testmode newtal: newtal itemmaps: itemmaps ;; prereqs-not-met: prereqs-not-met ))) (thread-sleep! 0.2) ;; no point in blasting though this a thousand times a second (runs:dat-regfull-set! runsdat regfull) (if (> num-running 0) (set! last-time-some-running (current-seconds))) (if (> (current-seconds)(+ last-time-some-running (or (configf:lookup configdat "setup" "give-up-waiting") 36000))) (hash-table-set! max-tries-hash tfullname (+ (hash-table-ref/default max-tries-hash tfullname 0) 1)))
︙			︙
1829 1830 1831 1832 1833 1834 1835 ~~1836~~ 1837 1838 1839 1840 1841 1842 1843 1844 1845 ~~1846~~ 1847 1848 1849 1850 1851 1852 1853 1854 1855 ~~1856~~ 1857 1858 1859 1860 1861 1862 1863	(rmt:set-var (conc "lunch-complete-" run-id) "yes") ;; now if -run-wait we wait for all tests to be done ;; Now wait for any RUNNING tests to complete (if in run-wait mode) ;; (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (thread-sleep! 10) ;; I think there is a race condition here. Let states/statuses settle ~~(let wait-loop ((num-running (rmt:get-count-tests-running-for-run-id run-id ~~#t)) ;; fastmode=yes~~~~ (prev-num-running 0)) ;; (debug:print-info 13 default-log-port "num-running=" num-running ", prev-num-running=" prev-num-running) (if (and (or (args:get-arg "-run-wait") (equal? (configf:lookup configdat "setup" "run-wait") "yes")) (> num-running 0)) (begin ;; Here we mark any old defunct tests as incomplete. Do this every fifteen minutes ;; (debug:print 0 default-log-port "Got here eh! num-running=" num-running " (> num-running 0) " (> num-running 0)) (if (> (current-seconds)(+ last-time-incomplete 900)) ~~(let ((actual-num-running (rmt:get-count-tests-running-for-run-id run-id ~~#f))) ;; fastmode=no~~~~ (debug:print-info 0 default-log-port "Marking stuck tests as INCOMPLETE while waiting for run " run-id ". Running as pid " (current-process-id) " on " (get-host-name)) (set! last-time-incomplete (current-seconds)) ;; FIXME, this might be causing slow down - use of set! (rmt:find-and-mark-incomplete run-id #f) (debug:print-info 0 default-log-port "run-wait specified, waiting on " actual-num-running " tests in RUNNING, REMOTEHOSTSTART or LAUNCHED state at " (time->string (seconds->local-time (current-seconds)))))) ;; (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (thread-sleep! 5) ;; (if (>= num-running max-concurrent-jobs) 5 1)) ~~(wait-loop (rmt:get-count-tests-running-for-run-id run-id ~~#t) ;; fastmode=yes~~~~ num-running)))) ;; LET* ((test-record ;; we get here on "drop through". All done! ;; this is moved to runs:run-testes since this function is getting called twice to ensure everthing is completed. ;; (debug:print-info 0 default-log-port "Calling Post Hook") ;; (runs:run-post-hook run-id) (debug:print-info 1 default-log-port "All tests launched")))	\| \| \|	1830 1831 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 1849 1850 1851 1852 1853 1854 1855 1856 1857 1858 1859 1860 1861 1862 1863 1864	(rmt:set-var (conc "lunch-complete-" run-id) "yes") ;; now if -run-wait we wait for all tests to be done ;; Now wait for any RUNNING tests to complete (if in run-wait mode) ;; (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (thread-sleep! 10) ;; I think there is a race condition here. Let states/statuses settle (let wait-loop ((num-running (rmt:get-count-tests-running-for-run-id run-id)) (prev-num-running 0)) ;; (debug:print-info 13 default-log-port "num-running=" num-running ", prev-num-running=" prev-num-running) (if (and (or (args:get-arg "-run-wait") (equal? (configf:lookup configdat "setup" "run-wait") "yes")) (> num-running 0)) (begin ;; Here we mark any old defunct tests as incomplete. Do this every fifteen minutes ;; (debug:print 0 default-log-port "Got here eh! num-running=" num-running " (> num-running 0) " (> num-running 0)) (if (> (current-seconds)(+ last-time-incomplete 900)) (let ((actual-num-running (rmt:get-count-tests-running-for-run-id run-id))) (debug:print-info 0 default-log-port "Marking stuck tests as INCOMPLETE while waiting for run " run-id ". Running as pid " (current-process-id) " on " (get-host-name)) (set! last-time-incomplete (current-seconds)) ;; FIXME, this might be causing slow down - use of set! (rmt:find-and-mark-incomplete run-id #f) (debug:print-info 0 default-log-port "run-wait specified, waiting on " actual-num-running " tests in RUNNING, REMOTEHOSTSTART or LAUNCHED state at " (time->string (seconds->local-time (current-seconds)))))) ;; (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (thread-sleep! 5) ;; (if (>= num-running max-concurrent-jobs) 5 1)) (wait-loop (rmt:get-count-tests-running-for-run-id run-id) num-running)))) ;; LET* ((test-record ;; we get here on "drop through". All done! ;; this is moved to runs:run-testes since this function is getting called twice to ensure everthing is completed. ;; (debug:print-info 0 default-log-port "Calling Post Hook") ;; (runs:run-post-hook run-id) (debug:print-info 1 default-log-port "All tests launched")))
︙			︙