Skip to content

Dead plugin+systemd = dead docker #813

@bboreham

Description

@bboreham

I am developing a Docker Network plugin, so from time to time it crashes and/or I stop it running. If, during this time, I restart the host, Docker will often get completely hosed.

It seems this is a result of Docker waiting 15 seconds per endpoint to try to talk to my plugin and systemd only waiting a limited time for Docker to get started. Also it appears to be timing out three times for the same endpoint (at least in this example):

Docker should provide some way for installations to recover from this situation - maybe a --force-network-cleanup flag ? Or do the retrying in the background after getting the main daemon up and running.

Dec 10 10:16:35 systemd[1]: Starting Docker Application Container Engine...
Dec 10 10:16:35 docker[1274]: time="2015-12-10T10:16:35.392673377Z" level=info msg="API listen on /var/run/docker.sock"
Dec 10 10:16:35 docker[1274]: time="2015-12-10T10:16:35.423531946Z" level=info msg="Firewalld running: false"
Dec 10 10:16:35 docker[1274]: time="2015-12-10T10:16:35.463752856Z" level=error msg="getNetworkFromStore for nid d5a93b9cbfa323a53eae7808358d67ebd6470b801cf5de33b5991a7bcd5c79bf failed while trying to build sandbox for cleanup: network d5a93b9cbfa323a53eae7808358d67ebd6470b801cf5de33b5991a7bcd5c79bf not found"
Dec 10 10:16:35 docker[1274]: time="2015-12-10T10:16:35.464265535Z" level=warning msg="Failed getting network for ep 72270e49642470d5676acd7f19ead85dd7e460030f71c7dc66436f1836107fa5 during sandbox cbbec073b80c886dc1f6dddbfd32f91ad5999638afdf4cf9a61bbdf29a1ba19a delete: network d5a93b9cbfa323a53eae7808358d67ebd6470b801cf5de33b5991a7bcd5c79bf not found"
Dec 10 10:16:35 docker[1274]: time="2015-12-10T10:16:35.464430396Z" level=error msg="failed to delete sandbox cbbec073b80c886dc1f6dddbfd32f91ad5999638afdf4cf9a61bbdf29a1ba19a while trying to cleanup: could not cleanup all the endpoints in container 0317189ada28cb6a1f5c16c670c8ff744e7f6510db73035bcf991f02cf9ca531 / sandbox cbbec073b80c886dc1f6dddbfd32f91ad5999638afdf4cf9a61bbdf29a1ba19a"
Dec 10 10:16:35 docker[1274]: time="2015-12-10T10:16:35.484161832Z" level=warning msg="Unable to locate plugin: weave, retrying in 1s"
Dec 10 10:16:36 docker[1274]: time="2015-12-10T10:16:36.485561219Z" level=warning msg="Unable to locate plugin: weave, retrying in 2s"
Dec 10 10:16:38 docker[1274]: time="2015-12-10T10:16:38.486948113Z" level=warning msg="Unable to locate plugin: weave, retrying in 4s"
Dec 10 10:16:42 docker[1274]: time="2015-12-10T10:16:42.487467103Z" level=warning msg="Unable to locate plugin: weave, retrying in 8s"
Dec 10 10:16:50 docker[1274]: time="2015-12-10T10:16:50.487936016Z" level=warning msg="Unable to locate plugin: weave, retrying in 1s"
Dec 10 10:16:51 docker[1274]: time="2015-12-10T10:16:51.488902480Z" level=warning msg="Unable to locate plugin: weave, retrying in 2s"
Dec 10 10:16:53 docker[1274]: time="2015-12-10T10:16:53.490038312Z" level=warning msg="Unable to locate plugin: weave, retrying in 4s"
Dec 10 10:16:57 docker[1274]: time="2015-12-10T10:16:57.491067983Z" level=warning msg="Unable to locate plugin: weave, retrying in 8s"
Dec 10 10:17:05 docker[1274]: time="2015-12-10T10:17:05.491327232Z" level=error msg="getEndpointFromStore for eid 35b6ce7a721a46901c6f15828899801ce144c97da50758de83da3b7e756a7e5c failed while trying to build sandbox for cleanup: could not find endpoint 35b6ce7a721a46901c6f15828899801ce144c97da50758de83da3b7e756a7e5c: datastore not found for scope "
Dec 10 10:17:05 docker[1274]: time="2015-12-10T10:17:05.491416646Z" level=warning msg="Unable to locate plugin: weave, retrying in 1s"
Dec 10 10:17:06 docker[1274]: time="2015-12-10T10:17:06.491579074Z" level=warning msg="Unable to locate plugin: weave, retrying in 2s"
Dec 10 10:17:08 docker[1274]: time="2015-12-10T10:17:08.491737246Z" level=warning msg="Unable to locate plugin: weave, retrying in 4s"
Dec 10 10:17:12 docker[1274]: time="2015-12-10T10:17:12.491906061Z" level=warning msg="Unable to locate plugin: weave, retrying in 8s"
Dec 10 10:17:20 docker[1274]: time="2015-12-10T10:17:20.492103311Z" level=warning msg="Unable to locate plugin: weave, retrying in 1s"
Dec 10 10:17:21 docker[1274]: time="2015-12-10T10:17:21.493062542Z" level=warning msg="Unable to locate plugin: weave, retrying in 2s"
Dec 10 10:17:23 docker[1274]: time="2015-12-10T10:17:23.493250821Z" level=warning msg="Unable to locate plugin: weave, retrying in 4s"
Dec 10 10:17:27 docker[1274]: time="2015-12-10T10:17:27.493425280Z" level=warning msg="Unable to locate plugin: weave, retrying in 8s"
Dec 10 10:17:35 docker[1274]: time="2015-12-10T10:17:35.493618436Z" level=warning msg="Failed detaching sandbox eacfbde4430e0e618097962072c92a1fba5b46f0613134b74bedf29114fbfcb7 from endpoint 35b6ce7a721a46901c6f15828899801ce144c97da50758de83da3b7e756a7e5c: failed to get endpoint from store during leave: could not find endpoint 35b6ce7a721a46901c6f15828899801ce144c97da50758de83da3b7e756a7e5c: datastore not found for scope \n"
Dec 10 10:17:35 docker[1274]: time="2015-12-10T10:17:35.493672618Z" level=warning msg="Unable to locate plugin: weave, retrying in 1s"
Dec 10 10:17:36 docker[1274]: time="2015-12-10T10:17:36.493849667Z" level=warning msg="Unable to locate plugin: weave, retrying in 2s"
Dec 10 10:17:38 docker[1274]: time="2015-12-10T10:17:38.494796847Z" level=warning msg="Unable to locate plugin: weave, retrying in 4s"
Dec 10 10:17:42 docker[1274]: time="2015-12-10T10:17:42.494940516Z" level=warning msg="Unable to locate plugin: weave, retrying in 8s"
Dec 10 10:17:50 docker[1274]: time="2015-12-10T10:17:50.495108328Z" level=warning msg="Unable to locate plugin: weave, retrying in 1s"
Dec 10 10:17:51 docker[1274]: time="2015-12-10T10:17:51.495295240Z" level=warning msg="Unable to locate plugin: weave, retrying in 2s"
Dec 10 10:17:53 docker[1274]: time="2015-12-10T10:17:53.496253373Z" level=warning msg="Unable to locate plugin: weave, retrying in 4s"
Dec 10 10:17:57 docker[1274]: time="2015-12-10T10:17:57.496416010Z" level=warning msg="Unable to locate plugin: weave, retrying in 8s"
Dec 10 10:18:05 docker[1274]: time="2015-12-10T10:18:05.497531014Z" level=warning msg="Failed deleting endpoint 35b6ce7a721a46901c6f15828899801ce144c97da50758de83da3b7e756a7e5c: failed to get endpoint from store during Delete: could not find endpoint 35b6ce7a721a46901c6f15828899801ce144c97da50758de83da3b7e756a7e5c: datastore not found for scope \n"
Dec 10 10:18:05 systemd[1]: docker.service start operation timed out. Terminating.
Dec 10 10:18:05 systemd[1]: docker.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 10 10:18:05 systemd[1]: Failed to start Docker Application Container Engine.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions