Edge managed node cannot connect to cloud leader

I am attempting to set up an Edge managed node on a server at home to connect to Cribl Cloud using a Free license. I have Stream working using Syslog, but I am now trying to run Cribl in managed-edge mode.

I used the setup script copied from the Edge “Add/Update Edge Node” > “Update existing” > CLI. That worked like a charm.

After starting Cribl (./cribl start), I see the following in cribl.log:

{"time":"2022-05-31T00:21:27.102Z","cid":"api","channel":"output:DistWorker","level":"info","message":"attempting to connect","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:27.102Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"will retry to connect","nextConnectTime":1653956491408}
{"time":"2022-05-31T00:21:27.102Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"connecting","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:27.171Z","cid":"api","channel":"input:DistMaster","level":"debug","message":"opened connection","src":"44.236.94.xxx:4200"}
{"time":"2022-05-31T00:21:27.171Z","cid":"api","channel":"output:DistWorker","level":"info","message":"connected","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:27.171Z","cid":"api","channel":"output:DistWorker","level":"info","message":"flushing buffer backlog","count":1,"totalSize":304}
{"time":"2022-05-31T00:21:37.174Z","cid":"api","channel":"output:DistWorker","level":"error","message":"connection error","error":"This socket has been ended by the other party"}
{"time":"2022-05-31T00:21:37.185Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"will retry to connect","nextConnectTime":1653956499338}
{"time":"2022-05-31T00:21:39.255Z","cid":"api","channel":"output:DistWorker","level":"warn","message":"sending is blocked","since":1653956498,"elapsed":1,"endpoint": {"host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}}
{"time":"2022-05-31T00:21:39.346Z","cid":"api","channel":"output:DistWorker","level":"info","message":"attempting to connect","host":"logstream.xxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:39.346Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"will retry to connect","nextConnectTime":1653956503652}
{"time":"2022-05-31T00:21:39.346Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"connecting","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:39.414Z","cid":"api","channel":"input:DistMaster","level":"debug","message":"opened connection","src":"44.236.94.xxx:4200"}
{"time":"2022-05-31T00:21:39.414Z","cid":"api","channel":"output:DistWorker","level":"info","message":"connected","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:39.415Z","cid":"api","channel":"output:DistWorker","level":"info","message":"flushing buffer backlog","count":1,"totalSize":4088}
{"time":"2022-05-31T00:21:39.416Z","cid":"api","channel":"output:DistWorker","level":"info","message":"sending unblocked","since":1653956499,"endpoint":{"host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}}
{"time":"2022-05-31T00:21:42.084Z","cid":"api","channel":"output:DistWorker","level":"error","message":"connection error","error":"This socket has been ended by the other party"}
{"time":"2022-05-31T00:21:42.093Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"will retry to connect","nextConnectTime":1653956504246}
{"time":"2022-05-31T00:21:43.150Z","cid":"api","channel":"output:DistWorker","level":"warn","message":"sending is blocked","since":1653956502,"elapsed":1,"endpoint":{"host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}}
{"time":"2022-05-31T00:21:44.158Z","cid":"api","channel":"output:DistWorker","level":"warn","message":"sending is blocked","since":1653956502,"elapsed":2,"endpoint":{"host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}}
{"time":"2022-05-31T00:21:44.249Z","cid":"api","channel":"output:DistWorker","level":"info","message":"attempting to connect","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:44.250Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"will retry to connect","nextConnectTime":1653956508556}
{"time":"2022-05-31T00:21:44.250Z","cid":"api","channel":"output:DistWorker","level":"debug","message":"connecting","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:44.319Z","cid":"api","channel":"input:DistMaster","level":"debug","message":"opened connection","src":"44.236.94.xxx:4200"}
{"time":"2022-05-31T00:21:44.320Z","cid":"api","channel":"output:DistWorker","level":"info","message":"connected","host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}
{"time":"2022-05-31T00:21:44.320Z","cid":"api","channel":"output:DistWorker","level":"info","message":"flushing buffer backlog","count":1,"totalSize":304}
{"time":"2022-05-31T00:21:44.320Z","cid":"api","channel":"output:DistWorker","level":"info","message":"sending unblocked","since":1653956504,"endpoint":{"host":"logstream.xxxxxxxxxx.cribl.cloud","port":4200,"tls":false}}

What I notice is that the connection is reported as blocked and then unblocked every few milliseconds.
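To put numbers on that pattern, the log can be summarized with a short script. This is only a rough sketch in Python, not a Cribl tool: summarize_connections is a hypothetical helper, and it assumes the field names seen in the excerpt above (time, channel, message, tls). It pairs each "connected" event with the next "connection error" and collects the tls values the node reported.

```python
import json
from datetime import datetime


def summarize_connections(lines):
    """Return (session_lengths_in_seconds, tls_values) for output:DistWorker events.

    Assumes one JSON object per line, shaped like the cribl.log excerpt above.
    """
    sessions = []
    tls_values = set()
    connected_at = None
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and anything that is not a JSON event
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if event.get("channel") != "output:DistWorker":
            continue
        if "tls" in event:
            tls_values.add(event["tls"])
        stamp = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
        if event.get("message") == "connected":
            connected_at = stamp
        elif event.get("message") == "connection error" and connected_at is not None:
            sessions.append((stamp - connected_at).total_seconds())
            connected_at = None
    return sessions, tls_values
```

Calling summarize_connections(open("cribl.log")) on a log like the one above would return the lifetime of each connection before the remote end closed it, plus the set of tls settings that were attempted.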

I checked that I can curl https://cdn.cribl.io/telemetry/ and receive cribl /// living the stream!, so the required anonymous telemetry should be getting through, but I cannot tell. In the Cribl Edge cloud console, I do not see any indications that the Cribl Edge Managed Node is able to communicate with the leader node for the fleet in Cribl Cloud.

Am I missing something simple, like managed nodes are not supported under Free license?


Hi @chiefgeek157, you can manage up to 100 managed Edge nodes in the Cribl Cloud free license tier. https://cribl.io/cribl-cloud.

I’m noticing in your logs that the Edge node is attempting to connect to your tenant, but it is not using TLS (which is enabled and required for Cloud Leader connectivity). Can you modify the $CRIBL_HOME/local/_system/instance.yml file to add the following to the master stanza?

distributed:
  mode: managed-edge
  master:
    # ... other settings here
    tls:
      disabled: false
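
If it helps to confirm from the node itself that the Leader port completes a TLS handshake, here is a minimal sketch using only the Python standard library. It is not part of Cribl; probe_tls is a hypothetical helper name, and the host and port would be the ones from your own logs.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Return the negotiated TLS version string, or None if the handshake fails.

    None covers every failure mode: DNS errors, refused connections, timeouts,
    a peer that does not speak TLS, and certificate verification errors.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()
    except OSError:
        # ssl.SSLError, socket.gaierror and timeouts are all OSError subclasses.
        return None
```

For example, probe_tls("logstream.xxxxxxxxxx.cribl.cloud", 4200) with your real tenant hostname should return a version string if the port accepts TLS. A None result only says the handshake did not complete; it does not say why.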

Thank you @bdalpe. I made the change as suggested and noted a couple of effects. Unfortunately I still do not see the Edge node registering with default_fleet.

Here is my _system/instance.yml. All settings are taken directly from the “Add Node” pop-up with the addition of “not disabling” TLS as noted below.

distributed:
  mode: managed-edge
  master:
    host: logstream.xxxxxxxxxx.cribl.cloud
    port: 4200
    authToken: "<token>"
	tls:
	  disabled: false
  group: default_fleet

First, in cribl.log I no longer see the long chains of connect/reset messages. That is very good. In fact, there is only the following over several minutes:

cribl.log:

{"time":"2022-06-02T22:24:50.640Z","cid":"api","channel":"PeriodicScheduler","level":"info","message":"loading jobs","jobs":[],"group":"default"}
{"time":"2022-06-02T22:24:50.641Z","cid":"api","channel":"PeriodicScheduler","level":"info","message":"loaded jobs","scheduled":[]}
{"time":"2022-06-02T22:24:50.653Z","cid":"api","channel":"Executors","level":"info","message":"updated functions list","enabled":3,"all":3}
{"time":"2022-06-02T22:25:49.762Z","cid":"api","channel":"preview","level":"info","message":"setting cpu profile ttl","ttl":1800000}

I note there are now two processes: server -r WORKER and server -r LEADER. I do not recall checking that that was the case before. However, I do now get two worker logs for worker 0 and worker 1 whereas I only recall seeing worker 0 before. Perhaps that was because the other worker never started. I didn’t expect to see a LEADER since I am trying to run in managed-edge mode, but that may just be my lack of understanding.

Looking at worker/0/cribl.log:

{"time":"2022-06-02T22:41:05.095Z","cid":"w0","channel":"server","level":"info","message":"_raw stats","inEvents":0,"outEvents":0,"inBytes":0,"outBytes":0,"starttime":1654209600,"endtime":1654209660,"activeCxn":0,"openCxn":0,"closeCxn":0,"rejectCxn":0,"abortCxn":0,"droppedEvents":0,"tasksStarted":0,"tasksCompleted":0,"activeEP":1,"blockedEP":0,"cpuPerc":0.15,"mem":{"heap":97,"ext":8,"rss":171}}
{"time":"2022-06-02T22:41:55.546Z","cid":"w0","channel":"clustercomm","level":"info","message":"metric sender","total":630,"dropped":0}

Looking at worker/1/cribl.log:

{"time":"2022-06-02T22:42:05.097Z","cid":"w1","channel":"server","level":"info","message":"_raw stats","inEvents":0,"outEvents":0,"inBytes":0,"outBytes":0,"starttime":1654209660,"endtime":1654209720,"activeCxn":0,"openCxn":0,"closeCxn":0,"rejectCxn":0,"abortCxn":0,"droppedEvents":0,"tasksStarted":0,"tasksCompleted":0,"activeEP":1,"blockedEP":0,"cpuPerc":0.15,"mem":{"heap":97,"ext":8,"rss":168}}
{"time":"2022-06-02T22:42:55.616Z","cid":"w1","channel":"clustercomm","level":"info","message":"metric sender","total":630,"dropped":0}

So far as I can tell these are in round-robin.

Hi @chiefgeek157, did you copy and paste the config directly from your box? I noticed there is a literal tab character (\t) used as the indentation for the tls lines. I'm not sure whether this is a forum formatting issue or truly a tab character. Can you make sure that the tls line starts with 4 space characters and not a tab? YAML is very particular about formatting: tabs are not allowed for indentation. I reproduced the same behavior on a test instance and resolved it by changing the tab to spaces. Here’s a clean version with spaces:

distributed:
  mode: managed-edge
  master:
    host: logstream.xxxxxxxxxx.cribl.cloud
    port: 4200
    authToken: "<token>"
    tls:
      disabled: false
  group: default_fleet
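
Tabs like this are hard to spot by eye, so a small check can help. The following is just an illustrative Python sketch (find_tab_indents is a made-up helper name, not a Cribl utility) that reports every line whose leading whitespace contains a tab:

```python
def find_tab_indents(text):
    """Return (line_number, line) pairs whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Isolate the run of spaces and tabs at the start of the line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append((number, line))
    return hits
```

Reading $CRIBL_HOME/local/_system/instance.yml into a string and passing it to find_tab_indents should return an empty list for the clean version above, and the two tls lines for a file indented with tabs.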

Bingo!! I should have checked my YAML for tabs. Thank you for finding that very subtle text file error.

The edge node has now appeared in the default_fleet.


Excellent! Have fun with your Cribl Edge deployment 🙂