Megatest

Changes On Branch v1.7001-multi-db-nohome
Login

Changes In Branch v1.7001-multi-db-nohome Excluding Merge-Ins

This is equivalent to a diff from 02526c166e to 760892c0d7

2022-05-15
05:06
Merging the v1.7001-multi-db-nohome branch into single commit to rebase forward Closed-Leaf check-in: f5e182b504 user: matt tags: v1.7001-multi-db-nohome-for-rebase
04:56
Merged all v1.7001-multi-db changes into one commit to rebase forward Closed-Leaf check-in: d9f5072bcb user: matt tags: v1.7001-multi-db-for-rebase
04:52
Remove short circuit for no homehost in common:get-homehost Closed-Leaf check-in: 760892c0d7 user: matt tags: v1.7001-multi-db-nohome
2022-05-09
19:43
common:get-homehost returns '(#f . #f) if there is no homehost file. check-in: 4db396b0c0 user: matt tags: v1.7001-multi-db-nohome
14:41
No homehost, the beginning. check-in: ae88a2163a user: mrwellan tags: v1.7001-multi-db-nohome
14:12
Allow stealing db lock rather than just failing Closed-Leaf check-in: 02526c166e user: mrwellan tags: v1.7001, v1.7001-multi-db-rb01
07:30
merged fork check-in: 782400400d user: matt tags: v1.7001-multi-db-rb01

Modified TODO from [da5eae4898] to [539b258f86].

14
15
16
17
18
19
20





































21
22
23
24
25
26
27
# 
#     You should have received a copy of the GNU General Public License
#     along with Megatest.  If not, see <http://www.gnu.org/licenses/>.

TODO
====






































WW15
. fill newview matrix with data, filter pipeline gui elements
. improve [script], especially indent handling

WW16
. split db into megatest.db (runs etc.) db/<something>.db
. release basic newview implementation







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
# 
#     You should have received a copy of the GNU General Public License
#     along with Megatest.  If not, see <http://www.gnu.org/licenses/>.

TODO
====

No-homehost
-----------

Server side

    1. Add invocation type to -m [DONE]
    
    2. Switch starting of servers to look at .homehost, if it exists, respect it,
       otherwise start on current machine. [DONE]
       
    3. On start server drops a packet into .meta after starting the http server,
       pkt includes:
        a. host
        b. port
        c. Invocation type of process that started the server
        d. D card (packet create card)
        e. Process id of the server process
	
    4. Server will stay alive if it receives calls
    
    5. Server touches the pkt file every ten seconds
    
    6. On exiting the server removes its pkt file
    
Client side

    1. If no pkts in .meta start a server, wait 5-10 seconds and look again
    
    2. Read all pkts in .meta dir
    
    3. Sort servers by (take left most)
        a. Invocation type: dboard -> runner -> other -> exec
        b. Run duration: shortest -> longest
        c. Tie breaker is the shar1 hash for the pkt
	
    4. Ping the server and continue as before
    
WW15
. fill newview matrix with data, filter pipeline gui elements
. improve [script], especially indent handling

WW16
. split db into megatest.db (runs etc.) db/<something>.db
. release basic newview implementation

Modified common.scm from [fd321f41c8] to [3b9840f92c].

1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
  (message-digest-string (md5-primitive) str))

;;======================================================================
;; E X I T   H A N D L I N G
;;======================================================================

(define (common:run-sync?)
    (and (common:on-homehost?)
	 (args:get-arg "-server")))

(define (common:human-time)
  (time->string (seconds->local-time (current-seconds)) "%Y-%m-%d %H:%M:%S"))


(define (std-signal-handler signum)







|







1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
  (message-digest-string (md5-primitive) str))

;;======================================================================
;; E X I T   H A N D L I N G
;;======================================================================

(define (common:run-sync?)
    (and (common:on-homehost?) ;; huh? isn't this by definition both true?
	 (args:get-arg "-server")))

(define (common:human-time)
  (time->string (seconds->local-time (current-seconds)) "%Y-%m-%d %H:%M:%S"))


(define (std-signal-handler signum)
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337

1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350



1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
      (if (and (getenv "MT_ITEMPATH")
               (not (equal? (getenv "MT_ITEMPATH") "")))
          (getenv "MT_TEST_NAME")
          (conc (getenv "MT_TEST_NAME") "/" (getenv "MT_ITEMPATH")))
      #f))

;;======================================================================
;; logic for getting homehost. Returns (host . at-home)
;; IF *toppath* is not set, wait up to five seconds trying every two seconds
;; (this is to accomodate the watchdog)
;;
(define (common:get-homehost #!key (trynum 5))
  ;; called often especially at start up. use mutex to eliminate collisions

  (mutex-lock! *homehost-mutex*)
  (cond
   (*home-host*
    (mutex-unlock! *homehost-mutex*)
    *home-host*)
   ((not *toppath*)
    (mutex-unlock! *homehost-mutex*)
    (launch:setup) ;; safely mutexed now
    (if (> trynum 0)
	(begin
	  (thread-sleep! 2)
	  (common:get-homehost trynum: (- trynum 1)))
	#f))



   (else
    (let* ((currhost (get-host-name))
	   (bestadrs (server:get-best-guess-address currhost))
	   ;; first look in config, then look in file .homehost, create it if not found
	   (homehost (or (configf:lookup *configdat* "server" "homehost" )
			 (handle-exceptions
			     exn
			     (if (> trynum 0)
				 (let ((delay-time (* (- 5 trynum) 5)))
				   (mutex-unlock! *homehost-mutex*)
				   (debug:print 0 *default-log-port* "ERROR: ["(common:human-time)"] Failed to read .homehost file, delaying "
						delay-time " seconds and trying again, message: "  ((condition-property-accessor 'exn 'message) exn)
						", exn=" exn)
				   (thread-sleep! delay-time)
				   (common:get-homehost trynum: (- trynum 1)))
				 (begin
				   (mutex-unlock! *homehost-mutex*)
				   (debug:print 0 *default-log-port* "ERROR: ["(common:human-time)
						"] Failed to read .homehost file after trying five times. Giving up and exiting, message: "
						((condition-property-accessor 'exn 'message) exn))
				   (exit 1)))
			   (let ((hhf (conc *toppath* "/.homehost")))
			     (if (common:file-exists? hhf)
				 (with-input-from-file hhf read-line)
				 (if (file-write-access? *toppath*)
				     (begin
				       (with-output-to-file hhf
					 (lambda ()
					   (print bestadrs)))
				       (begin
					 (mutex-unlock! *homehost-mutex*)
					 (car (common:get-homehost))))
				     #f))))))
	   (at-home  (or (equal? homehost currhost)
			 (equal? homehost bestadrs))))
      (set! *home-host* (cons homehost at-home))
      (mutex-unlock! *homehost-mutex*)
      *home-host*))))

;;======================================================================
;; am I on the homehost?
;;
(define (common:on-homehost?)
  (let ((hh (common:get-homehost)))
    (if hh







|





>
|
|
|
|
|
|
|
|
|
|
|
|
|
>
>
>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
<










|
|
|
|
|
|







1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375

1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
      (if (and (getenv "MT_ITEMPATH")
               (not (equal? (getenv "MT_ITEMPATH") "")))
          (getenv "MT_TEST_NAME")
          (conc (getenv "MT_TEST_NAME") "/" (getenv "MT_ITEMPATH")))
      #f))

;;======================================================================
;; logic for getting homehost . Returns (host . at-home)
;; IF *toppath* is not set, wait up to five seconds trying every two seconds
;; (this is to accomodate the watchdog)
;;
(define (common:get-homehost #!key (trynum 5))
  ;; called often especially at start up. use mutex to eliminate collisions
  (let ((hhf (conc *toppath* "/.homehost")))
    (mutex-lock! *homehost-mutex*)
    (cond
     (*home-host*
      (mutex-unlock! *homehost-mutex*)
      *home-host*)
     ((not *toppath*)
      (mutex-unlock! *homehost-mutex*)
      (launch:setup) ;; safely mutexed now
      (if (> trynum 0)
	  (begin
	    (thread-sleep! 2)
	    (common:get-homehost trynum: (- trynum 1)))
	  `(#f . #f)))
     #;((and (configf:lookup *configdat* "server" "no-homehost")
	   (not (file-exists? hhf)))
      `(#f . #f))  ;; NEW METHOD, DO NOT USE A HOMEHOST - nope, not doing it this way.
     (else
      (let* ((currhost (get-host-name))
	     (bestadrs (server:get-best-guess-address currhost))
	     ;; first look in config, then look in file .homehost, create it if not found
	     (homehost (or (configf:lookup *configdat* "server" "homehost" )
			   (handle-exceptions
			       exn
			       (if (> trynum 0)
				   (let ((delay-time (* (- 5 trynum) 5)))
				     (mutex-unlock! *homehost-mutex*)
				     (debug:print 0 *default-log-port* "ERROR: ["(common:human-time)"] Failed to read .homehost file, delaying "
						  delay-time " seconds and trying again, message: "  ((condition-property-accessor 'exn 'message) exn)
						  ", exn=" exn)
				     (thread-sleep! delay-time)
				     (common:get-homehost trynum: (- trynum 1)))
				   (begin
				     (mutex-unlock! *homehost-mutex*)
				     (debug:print 0 *default-log-port* "ERROR: ["(common:human-time)
						  "] Failed to read .homehost file after trying five times. Giving up and exiting, message: "
						  ((condition-property-accessor 'exn 'message) exn))
				     (exit 1)))

			     (if (common:file-exists? hhf)
				 (with-input-from-file hhf read-line)
				 (if (file-write-access? *toppath*)
				     (begin
				       (with-output-to-file hhf
					 (lambda ()
					   (print bestadrs)))
				       (begin
					 (mutex-unlock! *homehost-mutex*)
					 (car (common:get-homehost))))
				     `(#f . #f))))))
	     (at-home  (or (equal? homehost currhost)
			   (equal? homehost bestadrs))))
	(set! *home-host* (cons homehost at-home))
	(mutex-unlock! *homehost-mutex*)
	*home-host*)))))

;;======================================================================
;; am I on the homehost?
;;
(define (common:on-homehost?)
  (let ((hh (common:get-homehost)))
    (if hh

Modified server.scm from [6d65c175e8] to [b26c3818fa].

109
110
111
112
113
114
115









116
117
118
119
120
121
122
    ((rpc)  (db:obj->string (vector success/fail query-sig result)))
    ((http) (db:obj->string (vector success/fail query-sig result)))
    ((fs)   result)
    (else 
     (debug:print-error 0 *default-log-port* "unrecognised transport type: " *transport-type*)
     result)))










;; Given an area path,  start a server process    ### NOTE ### > file 2>&1 
;; if the target-host is set 
;; try running on that host
;;   incidental: rotate logs in logs/ dir.
;;
(define  (server:run areapath) ;; areapath is *toppath* for a given testsuite area
  (let* ((curr-host   (get-host-name))







>
>
>
>
>
>
>
>
>







109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
    ((rpc)  (db:obj->string (vector success/fail query-sig result)))
    ((http) (db:obj->string (vector success/fail query-sig result)))
    ((fs)   result)
    (else 
     (debug:print-error 0 *default-log-port* "unrecognised transport type: " *transport-type*)
     result)))

(define (server:what-type-of-invocation)
  (cond
   ((args:get-arg "-run")       "run")
   ((args:get-arg "-server")    "server")
   ((args:get-arg "-execute")   "execute")
   ((or (args:get-arg "-remove-runs")) "run-related")
   ((string-search (car (argv)) "dboard") "dboard")
   (else (conc "other:"(string-intersperse (command-line-arguments) "_")))))

;; Given an area path,  start a server process    ### NOTE ### > file 2>&1 
;; if the target-host is set 
;; try running on that host
;;   incidental: rotate logs in logs/ dir.
;;
(define  (server:run areapath) ;; areapath is *toppath* for a given testsuite area
  (let* ((curr-host   (get-host-name))
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
	 (profile-mode (or (configf:lookup *configdat* "misc" "profilesw")
			   ""))
	 (cmdln (conc (common:get-megatest-exe)
		      " -server " (or target-host "-") (if (equal? (configf:lookup *configdat* "server" "daemonize") "yes")
							   " -daemonize "
							   "")
		      ;; " -log " logfile
		      " -m testsuite:" testsuite
		      " " profile-mode
		      )) ;; (conc " >> " logfile " 2>&1 &")))))
	 (log-rotate  (make-thread common:rotate-logs  "server run, rotate logs thread")) ;; why are we rotating logs here? This is a sensitive location with a lot going on!?
         (load-limit  (configf:lookup-number *configdat* "jobtools" "max-server-start-load" default: 3.0)))
    ;; we want the remote server to start in *toppath* so push there
    (push-directory areapath)
    (debug:print 0 *default-log-port* "INFO: Trying to start server (" cmdln ") ...")







|







140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
	 (profile-mode (or (configf:lookup *configdat* "misc" "profilesw")
			   ""))
	 (cmdln (conc (common:get-megatest-exe)
		      " -server " (or target-host "-") (if (equal? (configf:lookup *configdat* "server" "daemonize") "yes")
							   " -daemonize "
							   "")
		      ;; " -log " logfile
		      " -m testsuite:"testsuite":"(server:what-type-of-invocation)
		      " " profile-mode
		      )) ;; (conc " >> " logfile " 2>&1 &")))))
	 (log-rotate  (make-thread common:rotate-logs  "server run, rotate logs thread")) ;; why are we rotating logs here? This is a sensitive location with a lot going on!?
         (load-limit  (configf:lookup-number *configdat* "jobtools" "max-server-start-load" default: 3.0)))
    ;; we want the remote server to start in *toppath* so push there
    (push-directory areapath)
    (debug:print 0 *default-log-port* "INFO: Trying to start server (" cmdln ") ...")