mDNS doesn't seem to work

@merck, I’m testing the Bridge you sent me (ZZCC63031, FW v2.10.8). It doesn’t seem to respond to mDNS requests. I’m not an expert in mDNS, but the code I’m using does discover Google, Shelly, and Sonoff devices.

Could the country (AU) be the cause?

Looking for _bond._tcp.local. with scantime 00:00:10
Scanning on iface WiFi, idx 13, IP: 192.168.1.83
Attempting to bind to 0.0.0.0:5353 on adapter WiFi
Bound to 0.0.0.0:5353
Bound to multicast address
About to send on iface WiFi
Sent mDNS query on iface WiFi
IP: 192.168.1.57, Name: Chromecast-Ultra-fc4fa8f59ee317262fcbb0e08132eae0, Bytes: 400, IsResponse: True
IP: 192.168.1.57, Name: Chromecast-Ultra-fc4fa8f59ee317262fcbb0e08132eae0, Bytes: 400, IsResponse: True
IP: 192.168.1.57, Name: Chromecast-Ultra-fc4fa8f59ee317262fcbb0e08132eae0, Bytes: 385, IsResponse: True
IP: 192.168.1.135, Name: Google-Home-Mini-7593a88f11f13f50a5f3a669bda20a21, Bytes: 394, IsResponse: True
IP: 192.168.1.135, Name: Google-Home-Mini-7593a88f11f13f50a5f3a669bda20a21, Bytes: 379, IsResponse: True
IP: 192.168.1.2, Name: Google-Home-Mini-e35ff60a2f0d4edc88ae33d3af9f4665, Bytes: 400, IsResponse: True
IP: 192.168.1.2, Name: Google-Home-Mini-e35ff60a2f0d4edc88ae33d3af9f4665, Bytes: 385, IsResponse: True
IP: 192.168.1.2, Name: Google-Cast-Group-09b8a7e12ffb45709dc56599cbfd26de, Bytes: 399, IsResponse: True
IP: 192.168.1.2, Name: Google-Cast-Group-09b8a7e12ffb45709dc56599cbfd26de, Bytes: 384, IsResponse: True
IP: 192.168.1.156, Name: Google-Home-Mini-1e3c6bde40881e3aeee60bc4283ebb3f, Bytes: 398, IsResponse: True
IP: 192.168.1.156, Name: Google-Home-Mini-1e3c6bde40881e3aeee60bc4283ebb3f, Bytes: 383, IsResponse: True
IP: 192.168.1.30, Name: _http, Bytes: 457, IsResponse: True
IP: 192.168.1.30, Name: shellyswitch25-BA8F94, Bytes: 457, IsResponse: True
IP: 192.168.1.2, Name: googlerpc-2, Bytes: 180, IsResponse: True
IP: 192.168.1.156, Name: googlerpc, Bytes: 178, IsResponse: True
IP: 192.168.1.156, Name: googlerpc, Bytes: 178, IsResponse: True
IP: 192.168.1.135, Name: googlerpc-1, Bytes: 180, IsResponse: True
IP: 192.168.1.2, Name: googlerpc-2, Bytes: 180, IsResponse: True
IP: 192.168.1.135, Name: googlerpc-1, Bytes: 180, IsResponse: True
IP: 192.168.1.2, Name: googlerpc-2, Bytes: 180, IsResponse: True
IP: 192.168.1.2, Name: e35ff60a-2f0d-4edc-88ae-33d3af9f4665, Bytes: 276, IsResponse: True
IP: 192.168.1.135, Name: 7593a88f-11f1-3f50-a5f3-a669bda20a21, Bytes: 226, IsResponse: True
IP: 192.168.1.57, Name: fc4fa8f5-9ee3-1726-2fcb-b0e08132eae0, Bytes: 226, IsResponse: True
IP: 192.168.1.57, Name: fc4fa8f5-9ee3-1726-2fcb-b0e08132eae0, Bytes: 226, IsResponse: True
IP: 192.168.1.156, Name: googlerpc, Bytes: 178, IsResponse: True
IP: 192.168.1.156, Name: 1e3c6bde-4088-1e3a-eee6-0bc4283ebb3f, Bytes: 280, IsResponse: True
IP: 192.168.1.156, Name: 1e3c6bde-4088-1e3a-eee6-0bc4283ebb3f, Bytes: 280, IsResponse: True
IP: 192.168.1.2, Name: e35ff60a-2f0d-4edc-88ae-33d3af9f4665, Bytes: 276, IsResponse: True
IP: 192.168.1.135, Name: 7593a88f-11f1-3f50-a5f3-a669bda20a21, Bytes: 226, IsResponse: True
The thread 0x23b0 has exited with code 0 (0x0).
Done Scanning

[Update] Eventually it did find the device, but the mDNS record properties are empty.

We just use mDNS for discovery of Bond ID and IP Address. What are you expecting to see under “properties”?

You are right, Chris. That should be sufficient. Other companies put some info in the service properties, mostly device type. Sonoff also uses the mDNS record to report minimal status data (256 bytes), which eliminates the need for polling.

OK, the main issue is that discovery is slow. I’ll collect more stats. Google and Sonoff devices always reply within 2-3 seconds.

As I said before, it’s not much of an issue when an mDNS service (e.g. Avahi, Bonjour) is collecting the devices in the background. But for direct broadcast it’s a problem.

I think it makes sense to return the device type (even if it’s only one Bridge for now) and the protocol version in the mDNS service properties. That would make it ready for the future, when you add more devices.
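
For reference, here is a minimal sketch of how such properties would be carried on the wire. DNS-SD (RFC 6763) stores them in the TXT record as length-prefixed `key=value` strings. The key names `type` and `proto` and their values are made up for illustration; they are not something Bond currently publishes.

```python
def encode_txt(properties: dict) -> bytes:
    """Encode key/value pairs as DNS-SD TXT record data (RFC 6763, section 6)."""
    out = bytearray()
    for key, value in properties.items():
        entry = f"{key}={value}".encode("utf-8")
        if len(entry) > 255:
            # Each entry is prefixed by a single length byte.
            raise ValueError(f"TXT entry too long: {key}")
        out.append(len(entry))
        out += entry
    # An empty TXT record is represented by a single zero byte.
    return bytes(out) if out else b"\x00"


# Hypothetical properties, only to show the encoding.
txt = encode_txt({"type": "bridge", "proto": "2"})
print(txt.hex())
```

Two short entries like these add under 20 bytes to the response, so a client could tell device types apart without an extra HTTP request.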

It took half an hour to discover the Bridge…

[Update]
Interesting finding: when I use your Python CLI (python.exe -m bond discover), it always prints an empty table.

But if my mDNS discovery is running at the same time, it finds the device straight away.

Interesting observation. It probably means that the Bridge sends some response to your Python app which your Python app doesn’t understand, but my app receives the response.

Hope it helps?

I think I found the solution. After digging into both Zeroconf implementations (your Python one and my C# one), I found the difference: the Python code sends the mDNS question with Class.IN (the Internet), but the C# implementation sends ANY = 255 (any class). That supposedly should work better, but apparently it doesn’t work with your mDNS implementation (a bug?).

[EDIT] Sending the mDNS question with Class.IN in my Zeroconf code works.
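
To make the difference concrete, here is a rough sketch that builds the two query packets by hand, using only the standard library. It is not taken from either Zeroconf library; the helper names `encode_name` and `build_query` are mine. The constants come from RFC 1035 (PTR = 12, IN = 1, ANY = 255).

```python
import struct

QTYPE_PTR = 12    # pointer record, used for service browsing
QCLASS_IN = 1     # the Internet
QCLASS_ANY = 255  # any class


def encode_name(name: str) -> bytes:
    """Encode a DNS name as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out.append(len(raw))
        out += raw
    out.append(0)
    return bytes(out)


def build_query(service: str, qclass: int) -> bytes:
    """Build a one-question mDNS PTR query for the given service."""
    # Header: id=0, flags=0 (standard query), 1 question, no other records.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(service) + struct.pack("!2H", QTYPE_PTR, qclass)
    return header + question


q_in = build_query("_bond._tcp.local.", QCLASS_IN)
q_any = build_query("_bond._tcp.local.", QCLASS_ANY)

# The two packets are identical except for the final two bytes (the class).
print(q_in[-2:].hex(), q_any[-2:].hex())
```

So the only thing the Bridge sees differently between the two clients is that trailing class field, which fits with the responder matching on class IN only.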

But why doesn’t your Python code find the Bridge?

OK, you’re not alone in reporting mDNS weirdness with Bond, so apparently there’s something misconfigured or a bug in our stack, but we have not noticed it on the client platforms/stacks we use here at Olibra.

If there’s anyone familiar with ESP32 development, I’ll post our full config here. Zermatt uses ESP-IDF (ESP32) v3.3 and the built-in mdns.h header. Here’s our full mDNS initialization:


static esp_err_t bwifi_event_handler_esp32(void *ctx, system_event_t * event)
{
  switch (event->event_id) {
     /* ... */
  }
  /* Forward every system event to the mDNS stack. */
  mdns_handle_system_event(ctx, event);
  return ESP_OK;
}

void bwifi_esp32_mdns_init()
{
  /* Start the mDNS responder. */
  esp_err_t err = mdns_init();
  if (err) {
    SysError(BE_500_BWIFI_ESP32_MDNS_INIT, "%s", esp_err_to_name(err));
    return;
  }
  /* The hostname is the Bond serial number. */
  err = mdns_hostname_set(BONDID_Get_Serial());
  if (err) {
    SysError(BE_500_BWIFI_ESP32_MDNS_HOSTNAME_SET, "%s", esp_err_to_name(err));
    return;
  }
  /* Advertise _bond._tcp on port 80: NULL instance name, no TXT items (NULL, 0). */
  err = mdns_service_add(NULL, "_bond", "_tcp", 80, NULL, 0);
  if (err) {
    SysError(BE_500_BWIFI_ESP32_MDNS_SERVICE_ADD, "%s", esp_err_to_name(err));
    return;
  }
  SysInfo("mdns started");
}

Hmm, we’ve never tried it on Windows.

@alexbk66 a Wireshark capture of the mDNS traffic could be revealing.

As I said, eventually your device replies to mDNS, but sometimes it takes half an hour. So if you are running a separate Zeroconf service (Avahi or Bonjour), the service will catch the reply and keep it (for the TTL, I think; after the TTL it should clear the record). From your Zeroconf client’s point of view it will look like an immediate response, because the client doesn’t talk to the device but gets the reply from the service.
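
Here is a toy model of what I mean by the service keeping the reply. It is only a sketch of the caching idea, not how Avahi or Bonjour are actually implemented. The class name `RecordCache`, the address 192.168.1.50 and the 120-second TTL are invented for the example.

```python
import time


class RecordCache:
    """Toy model of the record cache a background mDNS service keeps."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}  # name -> (address, expiry time)

    def store(self, name, address, ttl):
        """Remember an answer until its TTL (in seconds) runs out."""
        self._records[name] = (address, self._clock() + ttl)

    def lookup(self, name):
        """Return the cached address, or None if it is missing or expired."""
        entry = self._records.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self._clock() >= expires:
            del self._records[name]  # expired: drop the record
            return None
        return address


# Simulated clock so the example does not have to wait in real time.
now = [0.0]
cache = RecordCache(clock=lambda: now[0])

# The service hears one (late) reply from the Bridge and caches it.
cache.store("ZZCC63031.local.", "192.168.1.50", ttl=120)

now[0] = 60.0
print(cache.lookup("ZZCC63031.local."))  # answered from the cache, no network

now[0] = 121.0
print(cache.lookup("ZZCC63031.local."))  # TTL elapsed, record is gone
```

A client asking the service gets the cached answer instantly, while a client that queries the network directly still has to wait for the Bridge itself to reply.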

Or you can test on Windows without Bonjour, then install Bonjour.
Also, I use ServiceBrowser.exe from https://marknelson.us/posts/2011/10/25/dns-service-discovery-on-windows.html

I guess the main clue is that your device doesn’t reply to an mDNS request with Class.ANY, but does reply to one with Class.IN.

But again, I’m not an expert in mDNS.

Is it here https://docs.espressif.com/projects/esp-idf/en/latest/api-reference/protocols/mdns.html?

Well, specifically the version Zermatt (and SBB) is on as of v2.10.x firmware: https://docs.espressif.com/projects/esp-idf/en/v3.3/api-reference/protocols/mdns.html

The older Bond Bridge HW uses a different mDNS stack from OpenWRT.

It’s hard to tell whether changing the request to Class.IN is a good solution or just a temporary workaround (requiring a fix in your firmware).

On one hand, the change makes it work perfectly.
On the other hand, the C# code I’m using is a standard Zeroconf library which worked for Google devices, Sonoff and Shelly. So it should be right?

I wish I had more understanding of Zeroconf.

Actually, I’m sure your mDNS should be fixed: even your Python code doesn’t locate devices on Windows.

Neither does the third-party ServiceBrowser.

Can you debug your Python example to find out the problem (at least on Windows)?