Changes In Branch v1.65-experiment Through [a160c138d8] Excluding Merge-Ins
This is equivalent to a diff from 0cbf1a0b26 to a160c138d8
2020-09-08
| ||
10:12 | Added testplan section to manual. NOTE: Passes ext-tests without any re-runs, seems like a sweet spot. ==24.9/2.2/1201/WARN/mars== check-in: 0d4dd9a19f user: mrwellan tags: v1.65 | |
2020-09-07
| ||
23:29 |
Added query throttle.
Fully tested and passes ext-tests on sles11 and Ubuntu. Closed-Leaf check-in: 70a65ade2a user: matt tags: v1.65-rmt-call-throttle-orig | |
11:39 | Better flagging with LAUNCHING state. NOTE: itemwait subrun items are re-running when they perhaps should not. check-in: 5d2d0fddc3 user: matt tags: v1.65-experiment | |
2020-09-05
| ||
23:59 | Cache testdat. Not sure yet this is a good idea but it sure cuts down on queries that seem unnecessary. check-in: a160c138d8 user: matt tags: v1.65-experiment | |
21:51 | Merged the prereq attempt to rate gate check-in: 9dfe6cbfa1 user: matt tags: v1.65-experiment | |
13:41 | Try reduced frequency queries for prereq not met. ==/3.5/0.83/PASS/1201/mars/== check-in: 275adb0d10 user: matt tags: v1.65-prereq-qry-freq | |
11:17 | Merged cleanup branch back to v1.65 ==9.4/2.2/1201/WARN/mars== check-in: 0cbf1a0b26 user: matt tags: v1.65 | |
2020-09-04
| ||
16:34 | Change ERROR to INFO on directories in logs dir NOTE: Use this version as baseline: ext-tests gives full PASS, obfusc-skel test gives full PASS and no exn= in logs dir (but there are exn= in tests). check-in: 5c063e3452 user: mrwellan tags: v1.65-cleanup, v1.6568, v1.6565a | |
2020-08-25
| ||
15:08 | Added get-intercept and get-delay functions for load monitoring ==/FAIL/orion,zeus/== check-in: 85b652d5a1 user: jmoon18 tags: v1.65 | |
Modified db.scm from [2f649dc1fb] to [a8d9328753].
︙ | ︙ | |||
3045 3046 3047 3048 3049 3050 3051 | -1 "-" "-")) ;; ;; 1. cache tests-match-qry ;; 2. compile qry and store in hash ;; 3. convert for-each-row to fold ;; | | | 3045 3046 3047 3048 3049 3050 3051 3052 3053 3054 3055 3056 3057 3058 3059 | -1 "-" "-")) ;; ;; 1. cache tests-match-qry ;; 2. compile qry and store in hash ;; 3. convert for-each-row to fold ;; #;(define (db:get-tests-for-run-state-status dbstruct run-id testpatt) (db:with-db dbstruct run-id #f (lambda (db) (let* ((res '()) (stmt-cache (dbr:dbstruct-stmt-cache dbstruct)) (stmth (let* ((sh (db:hoh-get stmt-cache db testpatt))) (or sh |
︙ | ︙ |
Modified runs.scm from [030b929939] to [f581c02f6f].
︙ | ︙ | |||
58 59 60 61 62 63 64 | (last-load-check-time 0) (last-jobs-check-time 0) ) (defstruct runs:testdat hed tal reg reruns test-record test-name item-path jobgroup | | > > > > | 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 | (last-load-check-time 0) (last-jobs-check-time 0) ) (defstruct runs:testdat hed tal reg reruns test-record test-name item-path jobgroup waitons testmode newtal itemmaps (prereqs-not-met '()) (last-update 0) ;; ) ;; look in the $MT_RUN_AREA_HOME/.softlocks directory for key-host-pid.softlock files ;; - remove any that are over 3600 seconds old ;; - if there are any that are younger than 10 seconds ;; * sleep 10 seconds ;; * touch my key-host-pid.softlock file ;; * return |
︙ | ︙ | |||
827 828 829 830 831 832 833 834 835 836 837 838 839 840 841 842 | ;; => review of a previously seen test is higher priority of never visited test ;; reg - list of previously visited tests ;; tal - list of never visited tests ;; prefer next hed to be from reg than tal. (define runs:nothing-left-in-queue-count 0) ;;====================================================================== ;; runs:expand-items is called by runs:run-tests-queue ;;====================================================================== ;; ;; return value of runs:expand-items is passed back to runs-tests-queue and is fed to named loop with this signature: ;; (let loop ((hed (car sorted-test-names)) ;; (tal (cdr sorted-test-names)) ;; (reg '()) ;; registered, put these at the head of tal ;; (reruns '())) | > > > > > > > > > > > > > > > > > > > > > > | > > | | | | 831 832 833 834 835 836 837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 | ;; => review of a previously seen test is higher priority of never visited test ;; reg - list of previously visited tests ;; tal - list of never visited tests ;; prefer next hed to be from reg than tal. (define runs:nothing-left-in-queue-count 0) (define (runs:lazy-get-prereqs-not-met testdat run-id waitons hed item-path #!key (mode '(normal))(itemmaps #f)) ;; mode: testmode itemmaps: itemmaps) (if (< (- (current-seconds) (runs:testdat-last-update testdat)) 10) ;; only refresh for this test if it has been at least 10 seconds (begin ;; (debug:print 0 *default-log-port* "last-update=" (runs:testdat-last-update testdat) "(current-seconds)=" (current-seconds)) (runs:testdat-prereqs-not-met testdat)) ;; (rmt:get-prereqs-not-met 46 '("r1") "y1" "" mode: '(itemmatch) itemmaps: #f) (let* ((res (let ((res (rmt:get-prereqs-not-met run-id waitons hed item-path mode: mode itemmaps: itemmaps))) (debug:print 4 *default-log-port* "Get prereqs for " hed ", have " (length res) " prereqs. last-update=" (runs:testdat-last-update testdat) " current-seconds=" (current-seconds) " delta=" (- (current-seconds) (runs:testdat-last-update testdat))) (if (list? res) res (begin (debug:print 0 *default-log-port* "ERROR: rmt:get-prereqs-not-met returned non-list!\n" " res=" res " run-id=" run-id " waitons=" waitons " hed=" hed " item-path=" item-path " testmode=" mode " itemmaps=" itemmaps) '()))))) (runs:testdat-prereqs-not-met-set! testdat res) (runs:testdat-last-update-set! testdat (current-seconds)) res))) ;;====================================================================== ;; runs:expand-items is called by runs:run-tests-queue ;;====================================================================== ;; ;; return value of runs:expand-items is passed back to runs-tests-queue and is fed to named loop with this signature: ;; (let loop ((hed (car sorted-test-names)) ;; (tal (cdr sorted-test-names)) ;; (reg '()) ;; registered, put these at the head of tal ;; (reruns '())) (define (runs:expand-items hed tal reg reruns regfull newtal jobgroup max-concurrent-jobs run-id waitons item-path testmode test-record can-run-more items runname tconfig reglen test-registry test-records itemmaps testdat) (let* ((loop-list (list hed tal reg reruns)) (prereqs-not-met (runs:lazy-get-prereqs-not-met testdat run-id waitons hed item-path mode: testmode itemmaps: itemmaps)) #;(let ((res (rmt:get-prereqs-not-met run-id waitons hed item-path mode: testmode itemmaps: itemmaps))) (if (list? res) res (begin (debug:print 0 *default-log-port* "ERROR: rmt:get-prereqs-not-met returned non-list!\n" " res=" res " run-id=" run-id " waitons=" waitons " hed=" hed " item-path=" item-path " testmode=" testmode " itemmaps=" itemmaps) '()))) (have-itemized (not (null? (lset-intersection eq? testmode '(itemmatch itemwait))))) ;; (prereqs-not-met (mt:lazy-get-prereqs-not-met run-id waitons item-path mode: testmode itemmap: itemmap)) (fails (runs:calc-fails prereqs-not-met)) (prereq-fails (runs:calc-prereq-fail prereqs-not-met)) (non-completed (runs:calc-not-completed prereqs-not-met)) (runnables (runs:calc-runnable prereqs-not-met)) (unexpanded-prereqs (filter (lambda (testname) |
︙ | ︙ | |||
1440 1441 1442 1443 1444 1445 1446 | ;; every time though the loop increment the test/itempatt val. ;; when the min is > max-allowed and none running then force exit ;; (define *max-tries-hash* (make-hash-table)) (define (runs:pretty-long-list lst) | | > > | 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 | ;; every time though the loop increment the test/itempatt val. ;; when the min is > max-allowed and none running then force exit ;; (define *max-tries-hash* (make-hash-table)) (define (runs:pretty-long-list lst) (if (> (length lst) 8)(append (take lst 3)(list "...")) lst)) (define *runs-testdat-cache* (make-hash-table)) ;; full/testname => testdat ;;====================================================================== ;; runs:run-tests-queue is called by runs:run-tests ;;====================================================================== ;; ;; test-records is a hash table testname:item_path => vector < testname testconfig waitons priority items-info ... > (define (runs:run-tests-queue run-id runname test-records keyvals flags test-patts required-tests reglen-in all-tests-registry) |
︙ | ︙ | |||
1564 1565 1566 1567 1568 1569 1570 | extras) extras) '()))) (waitons (delete-duplicates (append (tests:testqueue-get-waitons test-record) extra-waits) equal?)) (newtal (append tal (list hed))) (regfull (>= (length reg) reglen)) (num-running (rmt:get-count-tests-running-for-run-id run-id #t)) ;; fastmode=yes | > > > > > > > > > > > > > > > > | | | | | | | | | | | | | | | > > | 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620 1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 | extras) extras) '()))) (waitons (delete-duplicates (append (tests:testqueue-get-waitons test-record) extra-waits) equal?)) (newtal (append tal (list hed))) (regfull (>= (length reg) reglen)) (num-running (rmt:get-count-tests-running-for-run-id run-id #t)) ;; fastmode=yes (testdat (let ((oldtestdat (hash-table-ref/default *runs-testdat-cache* tfullname #f))) (if oldtestdat (begin (runs:testdat-hed-set! oldtestdat hed) (runs:testdat-tal-set! oldtestdat tal) (runs:testdat-reg-set! oldtestdat reg) (runs:testdat-reruns-set! oldtestdat reruns) (runs:testdat-test-record-set! oldtestdat test-record) (runs:testdat-newtal-set! oldtestdat newtal) (if (not (equal? (runs:testdat-waitons oldtestdat) waitons)) (debug:print 0 *default-log-port* " waitons changed for runs:testdat")) (if (not (equal? (runs:testdat-itemmaps oldtestdat) itemmaps)) (debug:print 0 *default-log-port* " itemmaps changed for runs:testdat")) oldtestdat) (let ((newtestdat (make-runs:testdat hed: hed tal: tal reg: reg reruns: reruns test-record: test-record test-name: test-name item-path: item-path jobgroup: jobgroup waitons: waitons testmode: testmode newtal: newtal itemmaps: itemmaps ;; prereqs-not-met: prereqs-not-met ))) (hash-table-set! *runs-testdat-cache* tfullname newtestdat) newtestdat))))) (runs:dat-regfull-set! runsdat regfull) (if (> num-running 0) (set! last-time-some-running (current-seconds))) (if (> (current-seconds)(+ last-time-some-running (or (configf:lookup *configdat* "setup" "give-up-waiting") 36000))) (hash-table-set! *max-tries-hash* tfullname (+ (hash-table-ref/default *max-tries-hash* tfullname 0) 1))) |
︙ | ︙ | |||
1713 1714 1715 1716 1717 1718 1719 | ;; wait for load here (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (loop-can-run-more (runs:can-run-more-tests runsdat run-id jobgroup max-concurrent-jobs) (- remtries 1))))))) ))))) ;; I'm not clear on why prereqs are gathered here TODO: verfiy this is needed | > > | | 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 | ;; wait for load here (if (runs:dat-load-mgmt-function runsdat)((runs:dat-load-mgmt-function runsdat))) (loop-can-run-more (runs:can-run-more-tests runsdat run-id jobgroup max-concurrent-jobs) (- remtries 1))))))) ))))) ;; I'm not clear on why prereqs are gathered here TODO: verfiy this is needed (runs:lazy-get-prereqs-not-met testdat run-id waitons hed item-path mode: testmode itemmaps: itemmaps) ;; I'm not clear on why we'd capture running job counts here TODO: verify this is needed (runs:dat-can-run-more-tests-set! runsdat (runs:can-run-more-tests runsdat run-id jobgroup max-concurrent-jobs)) (let ((loop-list (runs:process-expanded-tests runsdat testdat))) ;; in process-expanded-tests ultimately run:test -> launch-test -> test actually running (if loop-list (apply loop loop-list)))) |
︙ | ︙ | |||
1782 1783 1784 1785 1786 1787 1788 | ;; if items is a proc then need to run items:get-items-from-config, get the list and loop ;; - but only do that if resources exist to kick off the job ;; EXPAND ITEMS ((or (procedure? items)(eq? items 'have-procedure)) (debug:print-info 4 *default-log-port* "cond branch - " "rtq-4") (let ((can-run-more #f)) ;; (runs:can-run-more-tests runsdat run-id jobgroup max-concurrent-jobs))) | | | | 1832 1833 1834 1835 1836 1837 1838 1839 1840 1841 1842 1843 1844 1845 1846 1847 1848 | ;; if items is a proc then need to run items:get-items-from-config, get the list and loop ;; - but only do that if resources exist to kick off the job ;; EXPAND ITEMS ((or (procedure? items)(eq? items 'have-procedure)) (debug:print-info 4 *default-log-port* "cond branch - " "rtq-4") (let ((can-run-more #f)) ;; (runs:can-run-more-tests runsdat run-id jobgroup max-concurrent-jobs))) (if (not can-run-more) #;(and (list? can-run-more) ;; IDEA, this mechanism may have had some value, make it configurable to test pros/cons TODO (car can-run-more)) (let ((loop-list (runs:expand-items hed tal reg reruns regfull newtal jobgroup max-concurrent-jobs run-id waitons item-path testmode test-record can-run-more items runname tconfig reglen test-registry test-records itemmaps testdat))) ;; itemized test expanded here (if loop-list (apply loop loop-list) (debug:print-info 4 *default-log-port* " -- Can't expand hed="hed) ) ) ;; if can't run more just loop with next possible test (loop (car newtal)(cdr newtal) reg reruns)))) |
︙ | ︙ |