Hacker News
DNS Explained – How Domain Names Get Resolved
jedberg
The most egregious of course is ISPs rewriting TTLs (or resolvers that just ignore them). But there are other implementation issues too, like caching records that shouldn't be cached, or caching them incorrectly. I've seen resolvers that cache a CNAME and the A record it resolves to with the TTL of the CNAME (which is wrong: each record carries its own TTL).
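To make the CNAME case concrete, here is a minimal sketch of a cache that expires every record on its own TTL rather than on the TTL of whatever pointed at it. The `Record` and `Cache` names, the hostnames, and the TTL values are all made up for illustration; this is not how any particular resolver is implemented.

```python
import time
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    rtype: str   # "CNAME" or "A" in this sketch
    value: str
    ttl: int     # seconds, as sent by the authoritative server


class Cache:
    """Toy cache: each record expires on its own TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, rtype) -> (Record, expiry time)

    def put(self, record):
        expires_at = self._clock() + record.ttl
        self._entries[(record.name, record.rtype)] = (record, expires_at)

    def get(self, name, rtype):
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        record, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller has to re-query upstream.
            del self._entries[(name, rtype)]
            return None
        return record
```

With a CNAME cached for 3600 seconds and its target A record cached for 60 seconds, the A record should fall out of this cache after a minute while the CNAME stays. A resolver that stamps both with 3600 would keep serving the old address for an hour.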
I'm also very concerned about the "WHY DNS MATTERS FOR SYSTEM DESIGN" section. While everything there is correct enough, it doesn't dive into the implication of each and how things go wrong.
For example, using DNS for round-robin balancing is an awful idea in practice, because Comcast will cache one IP of three, and all of a sudden 60% of your traffic is going to one IP. There's a similar issue with regional IPs. There are so many ways for the wrong IP to get into a cache.
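One way a figure like 60% can come about, as back-of-the-envelope arithmetic: assume (purely for illustration) that a single large resolver serves 40% of your users and has cached one address, while everyone else spreads evenly across all three. The function name and the 40% share are assumptions, not measurements.

```python
def traffic_shares(num_ips, pinned_share):
    """Per-IP traffic fractions when one big resolver pins IP 0.

    pinned_share is the fraction of all clients behind a resolver that
    cached a single address; the rest are assumed to spread evenly.
    """
    if num_ips < 1:
        raise ValueError("need at least one IP")
    if not 0.0 <= pinned_share <= 1.0:
        raise ValueError("pinned_share must be between 0 and 1")
    even = (1.0 - pinned_share) / num_ips
    shares = [even] * num_ips
    shares[0] += pinned_share  # the pinned clients all land on one IP
    return shares


# Three IPs, one resolver covering 40% of users stuck on a single address.
shares = traffic_shares(3, 0.4)
```

Under those assumptions the pinned IP ends up with roughly 60% of the traffic and the other two with about 20% each, which is nowhere near the even split round robin is supposed to give you.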
There is a reason we say "it's always DNS".
progbits
For round-robin, I've actually had it work reasonably well for API usage. Of course it's not ideal, but when I wanted to roll out new things slowly over several days and could not use a load balancer or reverse proxy, it kind of worked. I think most API users are just running with a reasonable resolver and not residential ISP ones.
jedberg
But after two months, about 1% was still going to the old server (I had set it up as a proxy for the cutover). Most of that traffic looked like crawlers that were written in things like Python or Ruby and had probably hard coded the IP or done something where it just didn't know what a TTL was.
So at that point I just shut down the old server.
You're probably right about API clients using better resolvers though. I was talking about consumer facing things where a lot of people would be on ISP DNS.
soneil
Propagation might be a useful way to visualise it, but doesn't match reality unless every cache is a warm cache.
YesThatTom2
It’s accurate to say that a user is waiting for the change to propagate if they are sitting there clicking re-try as they wait for the cascading cache expirations to do their thing.
thomascountz
And check out their Mess with DNS playground!
torh
This used to be true until virtual hosting came along, allowing for several domains to point to the same IP address, but only for non-HTTPS traffic. Then a bit later we got SNI (Server Name Indication) that did the same thing for HTTPS.
I remember having web servers with 10-12 public IP addresses when I started working. The number of IPv4 addresses needed has been greatly reduced since.
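For anyone who hasn't seen name-based virtual hosting up close, a rough sketch of the idea: several names resolve to the same IP, and the server picks the site from the `Host` header the client sends. The `SITES` mapping, the paths, and the function name are hypothetical.

```python
# Hypothetical hostname -> document root mapping; all names are illustrative.
SITES = {
    "example.com": "/srv/www/example.com",
    "example.org": "/srv/www/example.org",
}
DEFAULT_ROOT = "/srv/www/default"


def pick_document_root(host_header):
    """Choose a site from the HTTP Host header value."""
    # Drop an optional ":port" suffix and normalise case. IPv6 literals
    # such as "[::1]:8080" are not handled in this sketch.
    hostname = host_header.split(":", 1)[0].strip().lower()
    return SITES.get(hostname, DEFAULT_ROOT)
```

For HTTPS the server has to choose a certificate before it can read the `Host` header, which is the gap SNI fills by carrying the hostname in the TLS handshake.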
petemilly
Noticeably faster as in just loading a website? Or in some script where small differences add up? I thought a typical DNS lookup was sub-100ms, but I've never tried switching my resolver, so I'm curious.
nerdsniper
pastage
I have been broken for three decades and I still don't understand DNS. It is a simple protocol but people use it in complicated manners.
cyberax
It's the most baroque protocol that is still somehow surviving from the initial Internet. There are so many weird limitations, like not being able to use a CNAME at the zone apex. Or the entire DNSSEC fiasco.
iberator
chasd00
IPs can change without warning.
stevekemp
(Assuming a typical home connection, your router is _probably_ not a DNS server with a local cache; it is probably a DHCP server which will hand out the upstream/ISP's nameservers.)
jdsnape
stevekemp
Nowadays I'm in Finland and the router definitely runs no DNS service; the DHCP service advertises the ISP resolvers.
Probably depends on the region/ISP I guess, but I had no expectation that it would be the more common option.
stackskipton
RegnisGnaw
direwolf20
whalesalad
dnsmasq is the de facto tool on these embedded devices for dhcp+dns. probably a billion deployments. it's up there with sqlite for most used tech.
chasd00
criticalfault
would really be happy to have had these explanations before I had to figure it out for myself.
then you have these guys who reached the next level
mbreese
What I think is missing is a bit more of the “in practice” side. If the author was surprised about TTL values, I doubt they have much experience with some of the other pitfalls, so I’m not surprised (not a knock on the author). But there is a reason why the phrase “It’s always DNS” exists.
As an example, it could be helpful to mention that ISP DNS resolvers (or any caching resolver in the path) could decide to ignore the TTL. In this case, a record with a 360-second TTL might not get refreshed for an hour or a day or longer. This can be infuriating to troubleshoot.
A section on troubleshooting might also be beneficial. But this mainly consists of checking results from different resolvers in your path: does it work with a local resolver? Your ISP's DNS? The authoritative server?
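As a rough illustration of that "ask each resolver and compare" step, here is a standard-library-only sketch that sends a plain A-record query over UDP to a resolver of your choice and returns the addresses with the TTLs that resolver reports. The function names are my own. It skips a lot on purpose: no TCP fallback, no check of the response ID, no CNAME or AAAA handling, so treat it as a stand-in for `dig @resolver name`, not a replacement.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Encode a minimal A-record query in DNS wire format (RFC 1035)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(data, offset):
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = data[offset]
        if length & 0xC0 == 0xC0:  # 2-byte compression pointer ends the name
            return offset + 2
        if length == 0:
            return offset + 1
        offset += 1 + length


def parse_a_records(data):
    """Extract (address, ttl) pairs for A records in the answer section."""
    qdcount, ancount = struct.unpack("!HH", data[4:8])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(data, offset) + 4  # skip QTYPE and QCLASS
    answers = []
    for _ in range(ancount):
        offset = _skip_name(data, offset)
        rtype, _rclass, ttl, rdlength = struct.unpack(
            "!HHIH", data[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:
            answers.append((socket.inet_ntoa(data[offset:offset + 4]), ttl))
        offset += rdlength
    return answers


def query(resolver_ip, name, timeout=3.0):
    """Ask one specific resolver and return its A records with TTLs."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver_ip, 53))
        data, _addr = sock.recvfrom(512)
    return parse_a_records(data)


def compare(name, resolver_ips):
    """Map each resolver IP to the answers it gives for the same name."""
    return {ip: query(ip, name) for ip in resolver_ips}
```

Calling `compare("example.com", [...])` with your router, your ISP's resolver, a public resolver, and the zone's authoritative server makes disagreements easy to spot. A TTL that never counts down, or one larger than what the authoritative server hands out, is a hint that something in the path is not honoring it.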
chasd00
not exactly the same but it reminds me of MTU. You can set it to whatever you want but it only takes one router that you don't control in the path to undo what you're trying to accomplish.
tallanvor
The biggest pain of DNS for most people is if someone has set the TTL to an absurdly large number, or if a resolver isn't respecting TTL. And once you get into advanced configurations, SOAs and delegation certainly create their own headaches!