View Issue Details

ID: 0006230
Project: Taler
Category: deployment and operations
View Status: public
Last Update: 2021-08-24 16:23
Reporter: buckE
Assigned To: buckE
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: closed
Resolution: fixed
Target Version: 0.8
Fixed in Version: 0.8
Summary: 0006230: Inferred from Christian: buildbot with weblatetest
Description:
> You access buildbot under buildbot.taler.net for the Web interface, and
> using
>
> $ git clone git+ssh://git@git.taler.net/deployment.git
>
> (and git commit/push) to configure (buildbot/master.cfg).
>
> You can use the 'checker-worker' to add additional tasks for experiments.
>
> Maybe better: why not use the weblatetest@taler.net user (which AFAIK is
> now useless, right?) to configure/setup your own buildslave and add that
> to the slave list?
Tags: No tags attached.

Activities

buckE

2020-05-06 08:43

reporter   ~0015859

> You access buildbot under buildbot.taler.net for the Web interface

Right but I'm asking what credentials to use?

Christian Grothoff

2020-05-06 12:21

manager   ~0015860

Those credentials are also in the Git. But that Web UI is not really used for anything, all the real stuff is done via Git.

buckE

2020-05-11 05:14

reporter   ~0015876

Okay I'll look online for some buildbot howto instructions so I can start learning this.

buckE

2020-05-11 08:27

reporter   ~0015877

I'm estimating 32 hours of research/testing/virtual machine work before I'm ready to create a test project. But if there is someone willing to provide some doc on how we handle this, that goes down to about two or three hours. If I'm to understand deployment.git/README.md and deployment.git/buildbot/README, I add 24 hours to my estimate.

Christian Grothoff

2020-05-11 09:48

manager   ~0015881

http://buildbot.net/ really is the documentation for buildbot. The buildbot/README merely says which Debian packages you need for the features of Buildbot we use. The top-level README only describes the overall structure of deployment.git, but you really only need the stuff in buildbot/master.cfg to get started. Basically, that file already contains 'workers' with 'passphrases', so setting up a new buildslave for testing is basically adding another entry to the c["workers"] array and running the buildbot tool to configure and launch the buildslave under the respective account.

buckE

2020-05-26 08:22

reporter   ~0015963

When I know what buildbot is and how we use it, I am sure that comment will make a lot of sense. :) I will begin my research tomorrow.

buckE

2020-05-27 09:11

reporter   ~0015972

Small note: this is starting to make sense.

buckE

2020-05-29 06:43

reporter   ~0015978

Maybe it's still making sense. Without info it is extremely slow going but I continue to read and grep and guess.

buckE

2020-05-29 09:58

reporter   ~0015982

> Basically, that file already contains 'workers' with 'passphrases'

Does it? I don't see those "workers" anywhere else.

> , so setting up a new buildslave for testing is basically adding another entry to the c["workers"] array

Well, I can copypasta with new words like "testing" but I don't know what exactly that would be doing.

> and running the buildbot tool

What buildbot tool? How do I run this?

> to configure and launch the buildslave under the respective account.

What account? I don't know what you mean. I think there must be some secrets I am not aware of. I've read the Buildbot Manual and the Buildbot Tutorial, and these statements and master.cfg do not indicate steps I can take based on what I have learned from those docs.

Christian Grothoff

2020-05-30 12:09

manager   ~0015984

To add a new buildslave, you:
1) Add a line like:
      worker.Worker("lcov-worker", "lcov-pass"),
   to the c["workers"] array. This specifies the $NAME and $PASS for the worker. This is how the buildbot *client*
   will authenticate to the buildmaster. We don't really care about security here, as the worker simply 'volunteers' to
   run code on behalf of the master. These are NOT the username/password you would find in /etc/passwd.
2) Log in (say via SSH) as some username (say weblatetest). Here, you use
      $ buildbot-worker create-worker DIRECTORY http://buildbot.taler.net:9989/ $NAME $PASS
   to create a buildbot worker directory using $NAME and $PASS to log into the buildmaster on port 9989
   (or localhost, whatever is applicable).
3) Use
      $ buildbot-worker start
   to launch the buildslave.
4) Configure which jobs the buildslave should run (and when) in the master.cfg.

Not sure how any of the above CANNOT be documented in the buildbot documentation, we got it from there.
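
The worker-side commands in steps 2 and 3 can be sketched as argument lists, for instance when scripting the setup. This is only a sketch: the helper names are hypothetical, nothing is executed, and the base directory, master address, and credentials are placeholders. The positional order (base directory, master, name, password) follows the create-worker usage shown above; the master is given here as host:port, which is an assumption based on the "or localhost" remark in step 2.

```python
import shlex


def create_worker_argv(basedir, master, name, password):
    """Build the argv for `buildbot-worker create-worker` (hypothetical helper).

    Positional order: base directory, master address, worker name, password.
    """
    return ["buildbot-worker", "create-worker", basedir, master, name, password]


def start_worker_argv(basedir):
    """Build the argv for `buildbot-worker start` on a given base directory."""
    return ["buildbot-worker", "start", basedir]


if __name__ == "__main__":
    # Placeholder values; $NAME and $PASS must match the entry in c["workers"].
    argv = create_worker_argv("DIRECTORY", "localhost:9989", "lcov-worker", "lcov-pass")
    print(shlex.join(argv))
    print(shlex.join(start_worker_argv("DIRECTORY")))
```

Printing the commands before running them makes it easy to compare the name and password against the master.cfg entry.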

buckE

2020-06-01 09:17

reporter   ~0016217

Aha! (2) actually helps illuminate things a lot. Thank you!

buckE

2020-06-01 10:54

reporter   ~0016218

Here is the result of "$ buildbot-worker create-worker DIRECTORY http://buildbot.taler.net:9989/ $NAME $PASS" when run as "weblatetest" user on taler.net:

weblatetest@gv:~$ buildbot-worker
Traceback (most recent call last):
  File "/usr/local/bin/buildbot-worker", line 6, in <module>
    from buildbot_worker.scripts.runner import run
ModuleNotFoundError: No module named 'buildbot_worker'
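
The traceback says that the interpreter behind /usr/local/bin/buildbot-worker cannot import the buildbot_worker package. One way to confirm this, sketched here with a hypothetical helper name, is to ask a Python interpreter whether it can locate the module at all. Checking twisted as well is an assumption on my part (the worker logs to twistd.log, so Twisted is presumably required too).

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # buildbot_worker is the module the traceback reports as missing.
    for mod in ("buildbot_worker", "twisted"):
        print(mod, "found" if module_available(mod) else "MISSING")
```

This only tells you about the interpreter that runs the check, so it should be run with the same python3 that the buildbot-worker script uses.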

buckE

2020-06-02 11:08

reporter   ~0016224

Progress:

buildbot-worker create-worker buildbot-weblatetest buildbot.taler.net:9989 buildbot-weblatetest <password>
mkdir /home/weblatetest/buildbot-weblatetest
mkdir /home/weblatetest/buildbot-weblatetest/info
Creating info/admin, you need to edit it appropriately.
Creating info/host, you need to edit it appropriately.
Not creating info/access_uri - add it if you wish
Please edit the files in /home/weblatetest/buildbot-weblatetest/info appropriately.
worker configured in /home/weblatetest/buildbot-weblatetest

buckE

2020-06-03 08:47

reporter   ~0016225

There is still something missing. Using for reference: https://docs.buildbot.net/latest/tutorial/firstrun.html#creating-a-worker

When I start the new worker (that I created above) with "buildbot-worker start buildbot-weblatetest", it fails. I am supposed to look at "twistd.log" but it is not in /var/log or in the /home/weblatetest directory.

If you see https://docs.buildbot.net/latest/tutorial/firstrun.html#creating-a-master, there are two locations being discussed but I do not necessarily know what these are. I learned above that one of them (where I create the worker) is the /home/weblatetest directory on taler.net. The only other location I have access to is my own local terminal. But I am looking for what is referenced here:

 "Meanwhile, from the other terminal, in the master log (twistd.log in the master directory)"

I really just need a brief run-down on the setup. It would save so much time and I could focus on buildbot, instead of learning our setup.
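
For reference, buildbot normally writes twistd.log into the base directory itself, so for the worker created above that would be /home/weblatetest/buildbot-weblatetest rather than /var/log or the home directory. A minimal sketch for reading the end of that log follows; the function name is hypothetical and the path is the one printed by create-worker earlier in this issue.

```python
from collections import deque
from pathlib import Path


def tail_twistd_log(basedir, lines=20):
    """Return the last `lines` lines of twistd.log inside `basedir`.

    Returns an empty list when the log does not exist (yet).
    """
    log = Path(basedir) / "twistd.log"
    if not log.is_file():
        return []
    with log.open(encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the final lines while reading.
        return [line.rstrip("\n") for line in deque(fh, maxlen=lines)]


if __name__ == "__main__":
    # Base directory as reported by create-worker earlier in this issue.
    for line in tail_twistd_log("/home/weblatetest/buildbot-weblatetest"):
        print(line)
```

The same helper should work for the master's log if it is pointed at the master's base directory instead.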

Christian Grothoff

2020-06-03 13:06

manager   ~0016227

Two issues:
1) buildmaster had trouble re-loading its configuration because of the python3.7/3.8 migration
2) your config had buildbot.taler.net, but the master.cfg only binds to loopback (127.0.0.1) for security reasons;

I've fixed those. However, https://bugzilla.redhat.com/show_bug.cgi?id=1557687 also affects Debian, so when restarting the buildmaster I had to disable the Web interface. This should ideally be fixed in Debian, but as a work-around we could pip install buildbot locally under buildbot-master@taler.net. I've added Buck's SSH key to that account so he can read the logs there and bring back the Web interface.
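
Since the master only listens on loopback, a worker on the same host has to connect to 127.0.0.1 rather than the public hostname. A small sketch for checking which address accepts connections on the worker port (9989, as used above); the function name is hypothetical and the check is a plain TCP connect, not a buildbot handshake.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Run this on the host where the buildmaster is running.
    for host in ("127.0.0.1", "buildbot.taler.net"):
        state = "reachable" if can_connect(host, 9989) else "not reachable"
        print(f"{host}:9989 {state}")
```

A successful connect only shows that something is listening; whether the worker is accepted still depends on the name and password in master.cfg.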

buckE

2020-06-04 08:57

reporter   ~0016228

2) Excellent. I only used the hostname because of your instructions, and I was concerned about the difference between this and the current docs (https://docs.buildbot.net/latest/tutorial/firstrun.html#creating-a-worker). Probably you built on an older version, before localhost was the correct address. One question answered.

Regarding buildbot-www, do I need the web interface? I was getting the impression it was not necessary from your earlier comments.

The worker still does not load. I do not expect you to debug but if you understand the error please tell me. Also I've posted a follow-up to a similar existing buildbot bug report. Here is the twistd.log output:

2020-06-04 08:15:42+0200 [Broker,client] ReconnectingPBClientFactory.failedToGetPerspective
2020-06-04 08:15:42+0200 [Broker,client] While trying to connect:
        Traceback from remote host -- builtins.RuntimeError: rejecting duplicate worker
        
2020-06-04 08:15:42+0200 [-] Stopping factory <buildbot_worker.pb.BotFactory object at 0x7f6016beac40>
2020-06-04 08:15:42+0200 [-] Main loop terminated.
2020-06-04 08:15:42+0200 [-] Server Shut Down.
2020-06-04 08:23:34+0200 [-] Loading buildbot.tac...
2020-06-04 08:23:34+0200 [-] Loaded.
2020-06-04 08:23:34+0200 [-] twistd 18.9.0 (/usr/bin/python3 3.8.3) starting up.
2020-06-04 08:23:34+0200 [-] reactor class: twisted.internet.epollreactor.EPollReactor.
2020-06-04 08:23:34+0200 [-] Starting Worker -- version: 2.7.0
2020-06-04 08:23:34+0200 [-] recording hostname in twistd.hostname
2020-06-04 08:23:34+0200 [-] Starting factory <buildbot_worker.pb.BotFactory object at 0x7f778f699ca0>
2020-06-04 08:23:34+0200 [Broker,client] ReconnectingPBClientFactory.failedToGetPerspective
2020-06-04 08:23:34+0200 [Broker,client] While trying to connect:
        Traceback from remote host -- builtins.RuntimeError: rejecting duplicate worker
        
2020-06-04 08:23:34+0200 [-] Stopping factory <buildbot_worker.pb.BotFactory object at 0x7f778f699ca0>
2020-06-04 08:23:34+0200 [-] Main loop terminated.
2020-06-04 08:23:34+0200 [-] Server Shut Down.
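
The relevant line in that output is the remote RuntimeError; everything after it is the worker shutting down. A sketch for pulling such lines out of a twistd.log, assuming the message text stays exactly as shown above (the function name is hypothetical):

```python
def find_rejections(log_lines):
    """Return the lines that mention a 'rejecting duplicate worker' error."""
    needle = "rejecting duplicate worker"
    return [line.strip() for line in log_lines if needle in line]


if __name__ == "__main__":
    # Three lines copied from the log excerpt above.
    sample = [
        "2020-06-04 08:23:34+0200 [Broker,client] ReconnectingPBClientFactory.failedToGetPerspective",
        "2020-06-04 08:23:34+0200 [Broker,client] While trying to connect:",
        "        Traceback from remote host -- builtins.RuntimeError: rejecting duplicate worker",
    ]
    for hit in find_rejections(sample):
        print(hit)
```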

Christian Grothoff

2020-06-04 09:58

manager   ~0016229

Eh, I had already started the slave under the weblatetest account. So you're getting this message because somehow you are trying to start/connect a *second* one (with the same credentials).

The Web interface might have made this easier for you to understand, that's why it is "needed". It's not strictly needed for us to use the buildbot (and certainly not to configure it), but it can be convenient to simply see which workers are running without having to read logs...

buckE

2020-06-05 06:56

reporter   ~0016232

> Eh, I had already started the slave under the weblatetest account.

How did you do that? Is it automatic or did you type the same command I typed?

Can you disable it? Should you disable it? I'm trying to learn how the pieces fit together. If it's normal for you to start it, great. Although I'm not sure how I would control it then when I want to make changes?

Christian Grothoff

2020-06-05 17:29

manager   ~0016233

I don't know which command you used, I used:

weblatetest@gv:~$ cd buildbot-weblatetest/
weblatetest@gv:~/buildbot-weblatetest$ buildbot-worker start
Following twistd.log until startup finished..
The buildbot-worker appears to have (re)started correctly.

You can disable it with 'stop'. Integration with systemd was not done (yet), as this is just for testing.
Now, once it is running, you don't usually ever do anything with the buildslave (modulo integration with systemd to start it on system boot), all actions are controlled via the buildmaster (via deployment.git).
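
To avoid starting a second copy by accident, one can first check whether a worker is already running from a given base directory. The sketch below assumes the worker records its process ID in twistd.pid inside the base directory (the usual twistd behaviour, but treat it as an assumption for this setup) and that the host is a POSIX system; the function name is hypothetical.

```python
import os
from pathlib import Path


def worker_pid(basedir):
    """Return the PID from basedir/twistd.pid if that process is alive, else None."""
    pidfile = Path(basedir) / "twistd.pid"
    try:
        pid = int(pidfile.read_text().strip())
    except (OSError, ValueError):
        return None  # no pid file, or its content is not a number
    try:
        os.kill(pid, 0)  # signal 0 performs an existence check only
    except ProcessLookupError:
        return None  # stale pid file
    except PermissionError:
        return pid  # process exists but belongs to another user
    return pid


if __name__ == "__main__":
    pid = worker_pid("/home/weblatetest/buildbot-weblatetest")
    print("worker running, pid", pid) if pid else print("no running worker found")
```

This only detects a worker started from the same base directory; a worker started elsewhere with the same name would still be rejected by the master as a duplicate.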

buckE

2020-06-08 08:33

reporter   ~0016236

Okay great. What I'm asking though is why you started it instead of me, since I was testing it. That means that there is some other process/step (whatever you were involved in) that I did not know about, and should know about, to proceed. Of course I can stop | start it but why did this happen in the first place, and will it happen again? And should it, or was it a mistake, etc.?

buckE

2020-06-09 04:09

reporter   ~0016237

Last edited: 2020-06-09 07:37

I've just run "pip install buildbot-www" and "buildbot.www" (not the full buildbot, just the -www and .www packages) on the buildbot-master user account. I did not do "pip install buildbot" as you suggested because I think that will create two buildbot executables (the Debian one and the pip one), which we probably want to avoid. Please correct me if I am wrong.

Lines 1019-1029 ("www" definition) of master.cfg uncommented and re-committed to deployment.git.

I do not see the web interface at https://buildbot.taler.net/ yet, and netstat does not show use of the port in master.cfg. But I assume it will restart with a fresh buildbot cycle automatically? I still do not understand our config, including where and how often the automatic reads/runs are configured, but I think this is correct.

There is also the possibility that the webUI fails because we are using python2.7: https://bugs.gnunet.org/view.php?id=6369

I did decide to take the risk and run "buildbot restart" but the error was:

buildbot-master@gv:~$ buildbot restart
error reading '/home/buildbot-master/buildbot.tac': No such file or directory
invalid buildmaster directory '/home/buildbot-master'

Maybe I could symlink buildbot-master to master/ (or just specify "master" as the command line option) and re-run the restart but that is way too far to go without checking first that it is ok. But if you confirm OK I will restart buildbot.
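
The error indicates that buildbot looked for buildbot.tac in the current directory and did not find it there. A sketch for locating the directory that does contain it, so that it can be passed as the base directory argument; the function name is hypothetical and only the given directory and its immediate subdirectories are searched.

```python
from pathlib import Path


def find_basedirs(root):
    """Return `root` and/or its direct subdirectories that contain buildbot.tac."""
    root = Path(root)
    candidates = [root] + sorted(p for p in root.iterdir() if p.is_dir())
    return [p for p in candidates if (p / "buildbot.tac").is_file()]


if __name__ == "__main__":
    # Home directory of the buildbot-master account, as shown in the error above.
    for basedir in find_basedirs("/home/buildbot-master"):
        print(basedir)
```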

buckE

2020-06-09 05:52

reporter   ~0016238

weblatetest job starts | stops fine. I'm not sure yet where to add scripts/instructions but we have cleared the recent problems.

buckE

2020-06-09 05:57

reporter   ~0016239

weblatetest job starts | stops fine. I'm not sure yet where to add scripts/instructions but we have cleared the recent problems (with the weblatetest build worker)

buckE

2020-06-09 06:16

reporter   ~0016240

(also did pip install buildbot-waterfall-view buildbot-console-view)

buckE

2020-06-09 08:04

reporter   ~0016241

Update: after hours of waiting for webUI to show up, I realized there must be a problem with the master.cfg reloading. So I tried it manually ("buildbot restart master") and saw that the webUI authentication was also commented out. Now the master reloads, and the webUI is working: https://buildbot.taler.net/#/

buckE

2020-06-09 08:05

reporter   ~0016242

The original task (and also the task of setting up the WebUI) has been completed.

Christian Grothoff

2020-06-09 11:56

manager   ~0016245

Last edited: 2020-06-09 11:57

https://bugs.gnunet.org/view.php?id=6230#c16239 => it's just your playground worker, so I think that one we can leave for experiments without too much documentation or scripting.

For 'real' jobs, we should have the worker launched as a systemd user service and create additional, worker-specific accounts. That said, you surely could use that worker to test out new CI/CD jobs.

You already got your first ones: Web site link checking and generating alerts when there are warnings building the docs.git Sphinx documentation (which you may want to directly integrate with the existing doc-builder).

Issue History

Date Modified Username Field Change
2020-05-04 09:04 buckE New Issue
2020-05-04 09:04 buckE Assigned To => buckE
2020-05-04 09:04 buckE Status new => assigned
2020-05-06 08:43 buckE Note Added: 0015859
2020-05-06 12:21 Christian Grothoff Note Added: 0015860
2020-05-11 05:14 buckE Note Added: 0015876
2020-05-11 08:27 buckE Note Added: 0015877
2020-05-11 09:48 Christian Grothoff Note Added: 0015881
2020-05-26 08:22 buckE Note Added: 0015963
2020-05-27 09:11 buckE Note Added: 0015972
2020-05-29 06:43 buckE Note Added: 0015978
2020-05-29 09:58 buckE Note Added: 0015982
2020-05-30 12:09 Christian Grothoff Note Added: 0015984
2020-06-01 09:17 buckE Note Added: 0016217
2020-06-01 10:54 buckE Note Added: 0016218
2020-06-02 11:08 buckE Note Added: 0016224
2020-06-03 08:47 buckE Note Added: 0016225
2020-06-03 13:06 Christian Grothoff Note Added: 0016227
2020-06-04 08:57 buckE Note Added: 0016228
2020-06-04 09:58 Christian Grothoff Note Added: 0016229
2020-06-05 06:56 buckE Note Added: 0016232
2020-06-05 17:29 Christian Grothoff Note Added: 0016233
2020-06-08 08:33 buckE Note Added: 0016236
2020-06-09 04:09 buckE Note Added: 0016237
2020-06-09 04:48 buckE Note Edited: 0016237
2020-06-09 05:37 buckE Note Edited: 0016237
2020-06-09 05:52 buckE Note Added: 0016238
2020-06-09 05:57 buckE Note Added: 0016239
2020-06-09 06:16 buckE Note Added: 0016240
2020-06-09 07:37 buckE Note Edited: 0016237
2020-06-09 08:04 buckE Note Added: 0016241
2020-06-09 08:05 buckE Status assigned => resolved
2020-06-09 08:05 buckE Resolution open => fixed
2020-06-09 08:05 buckE Note Added: 0016242
2020-06-09 11:56 Christian Grothoff Note Added: 0016245
2020-06-09 11:57 Christian Grothoff Note Edited: 0016245
2020-07-24 11:56 Christian Grothoff Target Version => 0.8
2020-07-24 11:56 Christian Grothoff Fixed in Version => 0.8
2021-08-24 16:23 Christian Grothoff Status resolved => closed