Megatest: Changes On Branch 5deb96c7f743aad7

Changes In Branch v1.65-side2 Through [5deb96c7f7] Excluding Merge-Ins

This is equivalent to a diff from 2769e4b7c9 to 5deb96c7f7

2021-03-01
17:42		Manually patched in the new view check-in: f5206150ee user: mrwellan tags: v1.6569-new-view
2021-01-26
14:00		Fix for the > crash. Maybe... Leaf check-in: 5a05fc04ff user: matt tags: v1.6569-gt-crash-fix
2021-01-25
12:03		rebased lazy-queue rollup check-in: 07ab120544 user: matt tags: v1.65-lazyqueue-items-rollup
2021-01-15
22:46		begin diet check-in: badd71f3b3 user: matt tags: v1.6569-diet
21:34		eval-string-in-environment if was disabled, re-enabled check-in: 9564772564 user: matt tags: v1.6569-reenable-eval-if
2021-01-08
11:42		enable custom value for max delay between archive time and test last update time Leaf check-in: 86a3d1148e user: pjhatwal tags: v1.6569-refactor
2020-11-25
12:00		Fixed issues in server gating code Leaf check-in: 063273e8cb user: mrwellan tags: v1.6569-server-gate-fix
2020-11-24
22:27		Added support for resetting run - allows to reload tests-paths to add tests to a run part way through. Just run megatest -clean-cache -runname $MT_RUNNAME Leaf check-in: 213021e02d user: mrwellan tags: v1.6596-reload-tests-paths
2020-10-13
22:17		Added delay that may help server starts Leaf check-in: 06179e8f43 user: mrwellan tags: v1.65-side2
16:46		Changed version from 69 to 76. No other changes. Will compile with chicken 13 check-in: 87ca35010f user: mmgraham tags: v1.65, v1.6576
15:52		Added info on post-hook for runs. check-in: 5deb96c7f7 user: mrwellan tags: v1.65-side2
2020-10-12
16:58		Merged minor change to v1.65 check-in: 60a665385a user: mrwellan tags: v1.65-side2
16:49		Reduced message from failed to info. Reverted a delay which seems to help pass full stack ext-tests. Leaf check-in: 9e35b1252c user: mrwellan tags: v1.65-minor-patch
10:18		Safe vector access in rmt. check-in: 58bb6d997a user: mrwellan tags: v1.65-side2
2020-10-11
22:46		Patched forward adjutant code. check-in: f936717bfa user: matt tags: v1.65-adjutant-again
2020-10-05
22:49		Do not exit on failure to create directory - race conditions on NFS cause false fail scenarios - just keep going and cross your fingers... (cherrypicked from v1.6572) check-in: 05b253a452 user: matt tags: v1.65-sidework
22:46		run duration testdat check-in: 4a0b43f3c6 user: matt tags: v1.65-test-rundat2
2020-09-21
15:36		merged in 1.65-test-rundat branch ==/FAIL/orion,mars/== check-in: cfd25d66e9 user: mmgraham tags: v1.6571, v1.65-failed-testdat
07:00		Added get-testsuite-name all over launch:setup and still not set when needed! This did NOT work. Closed-Leaf check-in: 2efe8ad422 user: mrwellan tags: v1.65-get-testsuitename
2020-09-19
04:21		Start moving test_rundat to no-sync db. ==/20/2/WARN/1203/mars/== check-in: abfabdb839 user: matt tags: v1.65-test-rundat
2020-09-18
17:30		added check for file existence before file delete ==/14/1.9/WARN/orion,mars/== NOTE: This is the last v1.65 before the split off. I.e code from before this point IS in the far future v1.65 branch. Code from this point to that branch might NOT be in the branch. check-in: 2769e4b7c9 user: mmgraham tags: v1.65, v1.6569
12:27		cherry picked 2 fixes, changed version to 1.6569 ==/7.2/2.0/PASS/1201/mars/== check-in: d145d0eb02 user: mmgraham tags: v1.65

Modified api.scm from [4fa67bb6bd] to [cc4c2bfc8f].

Modified common.scm from [33c7316880] to [2732dee33e].

Modified configf.scm from [b115fef76f] to [83ecc5b24c].

Modified docs/manual/megatest_manual.html from [a02a70016f] to [00d3df112f].

Modified docs/manual/reference.txt from [6aa04b6eea] to [530f6e150c].

Modified launch.scm from [d0067277fa] to [d79c56de3e].

Modified rmt.scm from [39d97c528a] to [23bba59d7b].

Modified runs.scm from [030b929939] to [f99424cdf8].


--- api.scm (lines 152-186)
+++ api.scm (lines 152-188)
     ((not (vector? dat)) ;; it is an error to not receive a vector
      (vector #f (vector #f "remote must be called with a vector")))
     ((> api-process-request-count 20) ;; 20)
      (debug:print 0 default-log-port "WARNING: api:execute-requests received an overloaded message.")
      (set! server-overloaded #t)
      (vector #f (vector #f 'overloaded))) ;; the inner vector is what gets returned. nope, don't know why. please refactor!
     (else
-     (let* ((cmd-in (vector-ref dat 0))
+     (let* ((cmd-in (common:safe-vector-ref dat 0 'nocmd))
             (cmd (if (symbol? cmd-in)
                      cmd-in
                      (string->symbol cmd-in)))
-            (params (vector-ref dat 1))
+            (params (common:safe-vector-ref dat 1 '()))
             (start-t (current-milliseconds))
             (readonly-mode (dbr:dbstruct-read-only dbstruct))
             (readonly-command (member cmd api:read-only-queries))
             (writecmd-in-readonly-mode (and readonly-mode (not readonly-command)))
             (foo (begin
-                   (common:telemetry-log (conc "api-in:"(->string cmd))
+                   #;(common:telemetry-log (conc "api-in:"(->string cmd))
                                          payload: `((params . ,params)))
                    #t))
             (res
              (if writecmd-in-readonly-mode
                  (conc "attempt to run write command "cmd" on a read-only database")
                  (case cmd
                    ;;===============================================
                    ;; READ/WRITE QUERIES
                    ;;===============================================
+                   ((nocmd) '(#f "All broken!"))
                    ((get-keys-write) (db:get-keys dbstruct)) ;; force a dummy "write" query to force server; for debug in -repl
                    ;; SERVERS
                    ((start-server) (apply server:kind-run params))
                    ((kill-server) (set! server-run #f))
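
The hunk above replaces bare `vector-ref` calls with `common:safe-vector-ref`, whose definition lives in common.scm and is not shown on this page. A minimal sketch of what such an accessor might look like, assuming it takes a vector, an index and a default value (the name `safe-vector-ref` and the exact checks are hypothetical, not the project's code):

```scheme
;; Hypothetical sketch: return DEFAULT instead of raising an error when
;; VEC is not a vector or IDX is out of range. The real
;; common:safe-vector-ref in common.scm may behave differently.
(define (safe-vector-ref vec idx default)
  (if (and (vector? vec)
           (integer? idx)
           (exact? idx)
           (>= idx 0)
           (< idx (vector-length vec)))
      (vector-ref vec idx)
      default))

;; A well-formed request yields the command symbol.
(safe-vector-ref (vector 'get-keys '()) 0 'nocmd)

;; A truncated request falls back to the default, which the
;; ((nocmd) ...) case branch can then handle.
(safe-vector-ref (vector) 0 'nocmd)
```

Under this assumption, the `'nocmd` default routes a malformed request into an ordinary `case` branch rather than an exception inside the server loop.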

--- api.scm (lines 357-394)
+++ api.scm (lines 359-396)
        ;; save all stats
        (let ((delta-t (- (current-milliseconds)
                          start-t)))
          (hash-table-set! db-api-call-time cmd
                           (cons delta-t (hash-table-ref/default db-api-call-time cmd '()))))
        (if writecmd-in-readonly-mode
            (begin
-             (common:telemetry-log (conc "api-out:"(->string cmd))
+             #;(common:telemetry-log (conc "api-out:"(->string cmd))
                                    payload: `((params . ,params)
                                               (ok-res . #t)))
              (vector #f res))
            (begin
-             (common:telemetry-log (conc "api-out:"(->string cmd))
+             #;(common:telemetry-log (conc "api-out:"(->string cmd))
                                    payload: `((params . ,params)
                                               (ok-res . #f)))
              (vector #t res))))))))

 ;; http-server  send-response
 ;;                 api:process-request
 ;;                    db:*
 ;;
 ;; NB// Runs on the server as part of the server loop
 ;;
 (define (api:process-request dbstruct $) ;; the $ is the request vars proc
   (set! api-process-request-count (+ api-process-request-count 1))
   (let* ((cmd ($ 'cmd))
          (paramsj ($ 'params))
          (params (db:string->obj paramsj transport: 'http)) ;; incoming data from the POST (or is it a GET?)
          (resdat (api:execute-requests dbstruct (vector cmd params))) ;; process the request, resdat = #( flag result )
-         (success (vector-ref resdat 0))
-         (res (vector-ref resdat 1))) ;; (vector flag payload), get the payload, ignore the flag (why?)
+         (success (common:safe-vector-ref resdat 0 #f))
+         (res (common:safe-vector-ref resdat 1 #f))) ;; (vector flag payload), get the payload, ignore the flag (why?)
     (if (not success)
         (debug:print 0 default-log-port "ERROR: success flag is #f for " cmd " with params " params))
     (if (> api-process-request-count max-api-process-requests)
         (set! max-api-process-requests api-process-request-count))
     (set! api-process-request-count (- api-process-request-count 1))
     ;; This can be here but needs controls to ensure it doesn't run more than every 4 seconds
     ;; (rmt:dat->json-str


--- docs/manual/megatest_manual.html (lines 2418-2431)
+++ docs/manual/megatest_manual.html (lines 2418-2446)
 <div class="listingblock">
 <div class="content monospaced">
 <pre>[setup]
 # this will automatically kill the test if it runs for more than 1h 2m and 3s
 runtimelim 1h 2m 3s</pre>
 </div></div>
 </div>
+<div class="sect4">
+<h5 id="_post_run_hook">Post Run Hook</h5>
+<div class="paragraph"><p>This runs script to-run.sh after all tests have been completed. It is not necessary to use -run-wait as each test will check for other running tests on completion and if there are none it will call the post run hook.</p></div>
+<div class="paragraph"><p>Note that the output from the script call will be placed in a log file in the logs directory with a file name derived by replacing / with _ in post-hook-<target>-<runname>.log.</p></div>
+<div class="listingblock">
+<div class="content monospaced">
+<pre>[runs]
+post-hook /path/to/script/to-run.sh</pre>
+</div></div>
+</div>
 </div>
 </div>
 <div class="sect2">
 <h3 id="_tests_browser_view">Tests browser view</h3>
 <div class="paragraph"><p>The tests browser (see the Run Control tab on the dashboard) has two views for displaying the tests.</p></div>
 <div class="olist arabic"><ol class="arabic">
 <li>
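
The Post Run Hook text added above says the log file name is derived by replacing / with _ in post-hook-<target>-<runname>.log. A small sketch of that naming rule, assuming the substitution applies to the whole assembled name; the helper names and the sample target and run name are hypothetical and are not taken from the Megatest sources:

```scheme
;; Hypothetical helper: replace every "/" in STR with "_".
(define (slashes->underscores str)
  (list->string
   (map (lambda (c) (if (char=? c #\/) #\_ c))
        (string->list str))))

;; Hypothetical helper: assemble the log name described in the manual
;; text, then flatten any slashes so it is a single file name.
(define (post-hook-log-name target runname)
  (slashes->underscores
   (string-append "post-hook-" target "-" runname ".log")))

(post-hook-log-name "ubuntu/nfs/none" "w49/run1")
;; => "post-hook-ubuntu_nfs_none-w49_run1.log"
```

Targets in Megatest are slash-separated key values, which is why the flattening step matters for producing a usable file name in the logs directory.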

--- docs/manual/megatest_manual.html (lines 3482-3493)
+++ docs/manual/megatest_manual.html (lines 3497-3508)
 </div>
 </div>
 </div>
 <div id="footnotes"><hr></div>
 <div id="footer">
 <div id="footer-text">
 Version 1.5<br>
-Last updated 2020-09-08 08:39:29 PDT
+Last updated 2020-10-12 20:12:01 PDT
 </div>
 </div>
 </body>
 </html>


--- rmt.scm (lines 52-73)
+++ rmt.scm (lines 52-94)
             cinfo
             (if (server:check-if-running areapath)
                 (client:setup areapath)
                 #f))))

 (define send-receive-mutex (make-mutex)) ;; should have separate mutex per run-id

+(define rmt-query-last-call-time 0)
+(define rmt-query-last-rest-time 0) ;; last time there was at least a 1/2 second rest - giving other processes access to the db
+
+;; NOTE: This query rest algorythm will not adapt to long query times. REDESIGN NEEDED. TODO. FIXME.
+;;
+(define (rmt:query-rest)
+  (let* ((now (current-milliseconds)))
+    (cond
+     ((> (- now rmt-query-last-call-time) 500) ;; it's been a while since last query - no need to rest
+      (set! rmt-query-last-rest-time now)
+      (set! rmt-query-last-call-time now))
+     ((> (- now rmt-query-last-rest-time) 5000) ;; no natural rests have happened
+      (debug:print 0 default-log-port "query rest needed. blocking for 1/2 second.")
+      (thread-sleep! 0.5) ;; force a rest of a half second
+      (set! rmt-query-last-rest-time now)
+      (set! rmt-query-last-call-time now))
+     (else ;; sufficient rests have occurred, just record the last query time
+      (set! rmt-query-last-call-time now)))))
+
 ;; RA => e.g. usage (rmt:send-receive 'get-var #f (list varname))
 ;;
 (define (rmt:send-receive cmd rid params #!key (attemptnum 1)(area-dat #f)) ;; start attemptnum at 1 so the modulo below works as expected

   #;(common:telemetry-log (conc "rmt:"(->string cmd))
                         payload: `((rid . ,rid)
                                    (params . ,params)))

+  (if (not (equal? (configf:lookup configdat "setup" "query-rest") "no"))
+      (rmt:query-rest))
   (if (> attemptnum 2)
       (debug:print 0 default-log-port "INFO: attemptnum in rmt:send-receive is " attemptnum))

   (cond
    ((> attemptnum 2) (thread-sleep! 0.05))
    ((> attemptnum 10) (thread-sleep! 0.5))
    ((> attemptnum 20) (thread-sleep! 1)))
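
In the hunk above, `rmt:send-receive` calls `rmt:query-rest` unless looking up `query-rest` in the `setup` section returns exactly "no". A configuration fragment that should therefore disable the forced half-second rest, assuming `configdat` is loaded from the area's main Megatest configuration file (the hunk does not show where it comes from):

```
[setup]
# skip the forced rest between bursts of queries (any value other than "no" leaves it enabled)
query-rest no
```

With the setting absent, the lookup does not return "no", so the rest logic stays active by default.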

--- rmt.scm (lines 367-401)
+++ rmt.scm (lines 388-422)
                         (vector #t '())) ;; should always get a vector but if something goes wrong return a dummy
                       (if (and (vector? v)
                                (> (vector-length v) 1))
                           (let ((newvec (vector (vector-ref v 0)(vector-ref v 1))))
                             newvec) ;; by copying the vector while inside the error handler we should force the detection of a corrupted record
                           (vector #t '())))) ;; we could also check that the returned types are valid
                   (vector #t '())))
-         (success (vector-ref resdat 0))
-         (res (vector-ref resdat 1))
+         (success (common:safe-vector-ref resdat 0 #f)) ;; (vector-ref resdat 0))
+         (res (common:safe-vector-ref resdat 1 #f)) ;; (vector-ref resdat 1))
          (duration (- (current-milliseconds) start)))
     (if (and read-only qry-is-write)
         (debug:print 0 default-log-port "ERROR: attempt to write to read-only database ignored. cmd=" cmd))
     (if (not success)
         (if (> remretries 0)
             (begin
               (debug:print-error 0 default-log-port "local query failed. Trying again.")
               (thread-sleep! (/ (random 5000) 1000)) ;; some random delay
               (rmt:open-qry-close-locally cmd run-id params remretries: (- remretries 1)))
             (begin
               (debug:print-error 0 default-log-port "too many retries in rmt:open-qry-close-locally, giving up")
               #f))
         (begin
           ;; (rmt:update-db-stats run-id cmd params duration)
           ;; mark this run as dirty if this was a write, the watchdog is responsible for syncing it
           (if qry-is-write
               (let ((start-time (current-seconds)))
                 (mutex-lock! db-multi-sync-mutex)
-/                (set! db-last-access start-time) ;; THIS IS PROBABLY USELESS? (we are on a client)
+                 (set! db-last-access start-time) ;; THIS IS PROBABLY USELESS? (we are on a client)
                 (mutex-unlock! db-multi-sync-mutex)))))
     res))

 (define (rmt:send-receive-no-auto-client-setup connection-info cmd run-id params)
   (let* ((run-id (if run-id run-id 0))
          (res (handle-exceptions
                   exn