CoreOS cannot pull Docker container on boot

While following the systemd guide, I ran into a problem with this example configuration:

    [Unit]
    Description=MyApp
    After=docker.service
    Requires=docker.service

    [Service]
    TimeoutStartSec=0
    ExecStartPre=-/usr/bin/docker kill busybox1
    ExecStartPre=-/usr/bin/docker rm busybox1
    ExecStartPre=/usr/bin/docker pull busybox
    ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"

    [Install]
    WantedBy=multi-user.target

If I run "systemctl start hello" while the system is up, the container starts. But if I enable "hello.service" so that systemd starts it at boot and then reboot the machine, I get this error:

    -- Reboot --
    Mar 15 01:17:44 general systemd[1]: Starting MyApp...
    Mar 15 01:17:47 general docker[510]: Error response from daemon: No such container: busybox1
    Mar 15 01:17:47 general docker[510]: time="2015-03-15T01:17:47Z" level="fatal" msg="Error: failed to kill one or more containers"
    Mar 15 01:17:47 general docker[637]: Error response from daemon: No such container: busybox1
    Mar 15 01:17:47 general docker[637]: time="2015-03-15T01:17:47Z" level="fatal" msg="Error: failed to remove one or more containers"
    Mar 15 01:17:47 general systemd[1]: hello.service: control process exited, code=exited status=1
    Mar 15 01:17:47 general systemd[1]: Failed to start MyApp.
    Mar 15 01:17:47 general systemd[1]: Unit hello.service entered failed state.
    Mar 15 01:17:47 general systemd[1]: hello.service failed.
    Mar 15 01:17:47 general docker[673]: Pulling repository busybox
    Mar 15 01:17:47 general docker[673]: time="2015-03-15T01:17:47Z" level="fatal" msg="Get https://index.docker.io/v1/repositories/library/busybox/images: dial tcp: lookup index.docker.io: connection refused"

Any clue as to what I'm doing wrong?

If you want to run it at boot, the unit needs to depend on network-online.target. You have to make sure networking is up before the docker pull command runs. See the "Running Services After the Network is up" section of the systemd documentation for more information.

The following works for me:

    [Unit]
    Description=MyApp
    Requires=docker.service network-online.target
    After=docker.service network-online.target

    [Service]
    TimeoutStartSec=0
    ExecStartPre=-/usr/bin/docker kill busybox1
    ExecStartPre=-/usr/bin/docker rm busybox1
    ExecStartPre=/usr/bin/docker pull busybox
    ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"

    [Install]
    WantedBy=multi-user.target

I tried the suggested lines:

    Requires=docker.service network-online.target
    After=docker.service network-online.target

But that didn't work. I got:

    -- Reboot --
    Feb 09 23:20:24 coreos-512mb-fra1-01 systemd[1]: Starting New Relic Linux Server Monitor (nrsysmond)...
    Feb 09 23:20:24 coreos-512mb-fra1-01 docker[782]: Failed to kill container (nrsysmond): Error response from daemon: Cannot kill container nrsysmond: Container c608c10f44c06c550492e872773b0d5a59a7b86e639f63487f6186983a4f786b is not running
    Feb 09 23:20:25 coreos-512mb-fra1-01 docker[787]: nrsysmond
    Feb 09 23:20:25 coreos-512mb-fra1-01 docker[794]: Pulling repository docker.io/newrelic/nrsysmond
    Feb 09 23:20:25 coreos-512mb-fra1-01 docker[794]: Error while pulling image: Get https://index.docker.io/v1/repositories/newrelic/nrsysmond/images: dial tcp: lookup index.docker.io: Temporary failure in name resolution
    Feb 09 23:20:25 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Control process exited, code=exited status=1
    Feb 09 23:20:25 coreos-512mb-fra1-01 systemd[1]: Failed to start New Relic Linux Server Monitor (nrsysmond).
    Feb 09 23:20:25 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Unit entered failed state.
    Feb 09 23:20:25 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Failed with result 'exit-code'.

Adding Restart and RestartSec solved the problem:

    # Restart after crash
    Restart=always
    # Give the service 10 seconds to recover after the previous restart
    RestartSec=10s

It's not elegant, but it works. Here is the log:

    Feb 09 23:23:57 coreos-512mb-fra1-01 systemd[1]: Starting New Relic Linux Server Monitor (nrsysmond)...
    Feb 09 23:23:57 coreos-512mb-fra1-01 docker[792]: Failed to kill container (nrsysmond): Error response from daemon: Cannot kill container nrsysmond: Container 31fb78809
    Feb 09 23:23:57 coreos-512mb-fra1-01 docker[797]: nrsysmond
    Feb 09 23:23:57 coreos-512mb-fra1-01 docker[804]: Pulling repository docker.io/newrelic/nrsysmond
    Feb 09 23:23:57 coreos-512mb-fra1-01 docker[804]: Error while pulling image: Get https://index.docker.io/v1/repositories/newrelic/nrsysmond/images: dial tcp: lookup ind
    Feb 09 23:23:57 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Control process exited, code=exited status=1
    Feb 09 23:23:57 coreos-512mb-fra1-01 systemd[1]: Failed to start New Relic Linux Server Monitor (nrsysmond).
    Feb 09 23:23:57 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Unit entered failed state.
    Feb 09 23:23:57 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Failed with result 'exit-code'.
    Feb 09 23:24:08 coreos-512mb-fra1-01 systemd[1]: newrelic.service: Service hold-off time over, scheduling restart.
    Feb 09 23:24:08 coreos-512mb-fra1-01 systemd[1]: Stopped New Relic Linux Server Monitor (nrsysmond).
    Feb 09 23:24:08 coreos-512mb-fra1-01 systemd[1]: Starting New Relic Linux Server Monitor (nrsysmond)...
    Feb 09 23:24:08 coreos-512mb-fra1-01 docker[869]: Failed to kill container (nrsysmond): Error response from daemon: Cannot kill container nrsysmond: No such container:
    Feb 09 23:24:08 coreos-512mb-fra1-01 docker[875]: Failed to remove container (nrsysmond): Error response from daemon: No such container: nrsysmond
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: latest: Pulling from newrelic/nrsysmond
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: 6ffe5d2d6a97: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: f4e00f994fd4: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: e99f3d1fc87b: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: a3ed95caeb02: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: a3ed95caeb02: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: 65cdb07f703d: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: a3ed95caeb02: Already exists
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: Digest: sha256:c184f97452321baa0b0ee4ee402e3aaa204f041beb7a71a347db6c4efecba07f
    Feb 09 23:24:10 coreos-512mb-fra1-01 docker[883]: Status: Image is up to date for newrelic/nrsysmond:latest
    Feb 09 23:24:10 coreos-512mb-fra1-01 systemd[1]: Started New Relic Linux Server Monitor (nrsysmond).
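The Restart/RestartSec workaround restarts the whole unit when the pull fails. A narrower variant, offered only as a sketch and not tested on CoreOS, retries just the failing command. The `retry` function name, its argument order, and the script path in the comment are all hypothetical:

```shell
#!/bin/sh
# retry MAX DELAY CMD [ARGS...]
# Run CMD until it succeeds, trying at most MAX times and sleeping
# DELAY seconds between attempts. Returns 0 on success, 1 on giving up.
# (Hypothetical helper; not part of docker or systemd.)
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while ! "$@"; do
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            echo "giving up after $retry_attempt attempts: $*" >&2
            return 1
        fi
        retry_attempt=$((retry_attempt + 1))
        sleep "$retry_delay"
    done
}

# Assumed usage, if the function were saved in a script that ends with
# `retry "$@"` at a path such as /opt/bin/retry.sh (hypothetical):
#   ExecStartPre=/opt/bin/retry.sh 10 3 /usr/bin/docker pull busybox
```

With this approach the unit would only enter the failed state after every attempt has been used up, so Restart=always could still be kept as a fallback.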

Since I hit this problem on DigitalOcean CoreOS, I looked for help in their resources. There is a thread related to this issue. It suggests this:

    After=early-docker.service systemd-networkd-wait-online.service
    Requires=early-docker.service systemd-networkd-wait-online.service
    Before=early-docker.target

But those lines made Docker hang. I had to kill all the processes related to my service and then restart the Docker service to make it responsive again.


Docker version:

    $ docker -v
    Docker version 1.10.0, build e21da33