Chinese version: Debugging and Analyzing a Tailscale Exit Node with No Network Access – Frank’s Weblog
As mentioned in an earlier post, I used Tailscale to create a mesh network that connects all of my devices, and I used a cloud server located in AliCloud Beijing as an exit node, in order to access geographically restricted internet services.
However, I noticed that I could not access the Internet at all when using that exit node. I thought it was a network connectivity issue with the relays, so I didn’t worry too much about it. But afterward, I noticed some other services on that server stopped functioning, so I looked into it and found out that the problem was not that simple.
First I noticed that I couldn’t reach the Internet by hostname from the server itself, although curl to a raw IP address still worked, which pointed to a problem with DNS resolution. resolvectl status showed that there were two DNS servers. Because their IPs started with 100.100[1], I assumed they were the DNS servers for the Tailscale internal network (they are not, as I will explain later):
Link 2 (eth0)
......
Current DNS Server: 100.100.2.136
DNS Servers: 100.100.2.136
100.100.2.138
I ran dig @100.100.2.136 baidu.com to check the response from the DNS server and got "connection timed out: no servers could be reached". After shutting down Tailscale, the same command returned a normal response, so Tailscale was probably interfering with DNS resolution on the system.
Workaround
Changing the DNS configuration on the server works around this problem. Edit /etc/netplan/99-netcfg.yaml and add a public DNS server to the nameservers section under eth0:
network:
  version: 2
  renderer: networkd
  ethernets:
    eth0:
      dhcp4: yes
      dhcp6: no
      nameservers:
        addresses: [114.114.114.114]
Note: 114.114.114.114 is a public DNS server that is widely used in China, similar to 8.8.8.8.
Run sudo netplan apply to apply the changes; dig baidu.com then returns the correct response.
Switching to a public DNS server restores Internet access, but many services inside AliCloud still require internal DNS resolution, for example AliCloud’s internal apt mirror (mirrors.cloud.aliyuncs.com) and products such as cloud databases. Pointing the apt sources at public mirrors can work around the apt mirror issue.
Locating the Issue
To locate the problem, we need to find out why the IP address 100.100.2.136 is unreachable. I had assumed these two DNS servers were IPs in Tailscale’s internal network, but they could not be reached by any means. After some searching, I found that 100.100.2.136 and 100.100.2.138 are actually internal DNS servers provided by AliCloud. Some other AliCloud internal services use similar IPs, for example the apt mirror at 100.100.2.148, which curl could not connect to either.
We can therefore draw a preliminary conclusion that Tailscale somehow affected access to the 100.100.x.x IP range.
Possibilities
Routing
My first thought was that Tailscale was routing the entire 100.100.x.x IP range. However, according to the Tailscale documentation, Tailscale only routes the assigned IP addresses, not the entire CIDR. ip route list confirms this:
ip route list table 52
100.69.x.x dev tailscale0
100.90.x.x dev tailscale0
100.96.x.x dev tailscale0
100.98.x.x dev tailscale0
100.100.100.100 dev tailscale0
100.104.x.x dev tailscale0
100.121.x.x dev tailscale0
100.127.x.x dev tailscale0
ip route get 100.100.2.136 returns the following result, which shows that the packet is routed to the eth0 interface. The routing table is therefore correct, and routing is not the problem.
100.100.2.136 via 172.24.63.253 dev eth0 src 172.24.4.100 uid 0
cache
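The same check could be scripted. The following is a minimal sketch, not part of the original investigation: route_dev is a hypothetical helper name, and the sample line is copied from the output above so that the snippet runs without a live network. On a real host one would pipe ip route get 100.100.2.136 into the function instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the outgoing interface named after the
# "dev" keyword in `ip route get` output read from stdin.
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Sample line copied from the output shown above.
sample='100.100.2.136 via 172.24.63.253 dev eth0 src 172.24.4.100 uid 0'
dev=$(printf '%s\n' "$sample" | route_dev)

if [ "$dev" = "tailscale0" ]; then
  echo "routed through Tailscale"
else
  echo "routed through $dev, so the routing table is not the culprit"
fi
```

If the function printed tailscale0 for an AliCloud address, routing would be the first thing to investigate; here it does not.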
iptables
Another thing that may interfere with packet delivery is iptables. iptables -S reveals the following entries related to Tailscale:
-A ts-forward -i tailscale0 -j MARK --set-xmark 0x40000/0xffffffff
-A ts-forward -m mark --mark 0x40000 -j ACCEPT
-A ts-forward -s 100.64.0.0/10 -o tailscale0 -j DROP
-A ts-forward -o tailscale0 -j ACCEPT
-A ts-input -s 100.92.187.56/32 -i lo -j ACCEPT
-A ts-input -s 100.115.92.0/23 ! -i tailscale0 -j RETURN
-A ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP
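The last of these rules drops every incoming packet whose source address is in the 100.64.0.0/10 CIDR unless it arrives on tailscale0. Replies from AliCloud’s 100.100.x.x services arrive on eth0, so they were being dropped. The problem was solved after removing the rule using iptables -D.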
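Because iptables -S prints each rule in the same form that iptables -D accepts, the delete command can be derived from the listing. The sketch below is illustrative only: cgnat_drop_deletes is a hypothetical helper name, the sample rules are copied from the listing above, and the script only prints the command rather than running it. It also assumes the rule text matches exactly what is shown above; Tailscale may recreate its rules when it restarts, so a deletion like this should be treated as temporary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `iptables -S` output on stdin and print an
# `iptables -D` command for the ts-input rule that drops 100.64.0.0/10
# traffic arriving on any interface other than tailscale0.
cgnat_drop_deletes() {
  grep -E '^-A ts-input -s 100\.64\.0\.0/10 ! -i tailscale0 -j DROP$' |
    sed 's/^-A /iptables -D /'
}

# Sample rules copied from the listing above.
rules='-A ts-input -s 100.92.187.56/32 -i lo -j ACCEPT
-A ts-input -s 100.115.92.0/23 ! -i tailscale0 -j RETURN
-A ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP'

# Print (do not execute) the delete command for review.
printf '%s\n' "$rules" | cgnat_drop_deletes
```

On a live host the input would come from sudo iptables -S, and the printed command would be run with root privileges after checking it.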
After some searching, I found that the same problem had already been reported in issues posted earlier this year.
Conclusion
To sum up, the problem was caused by a firewall rule that Tailscale sets to drop traffic from the 100.64.0.0/10 CIDR that does not arrive through its own interface. Some services on AliCloud’s internal network were blocked because they reside in this IP range. According to the Tailscale CLI documentation, adding the --netfilter-mode=off parameter when starting Tailscale prevents this rule from being set. However, this poses some security risks.
Tailscale sets this rule because the IP range it uses for the Tailscale network (100.64.0.0/10)[1] is reserved for Carrier-Grade NAT (CGNAT) and was assumed not to be used by private networks. However, AliCloud uses this IP range for its internal services, which causes the conflict.
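To see the overlap concretely, one can test whether an address falls inside 100.64.0.0/10. The following is a minimal sketch in plain bash arithmetic; ip_to_int and in_cgnat are hypothetical helper names, and the addresses are the ones that appear earlier in this post.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed (exit status 0) if the address lies inside 100.64.0.0/10.
in_cgnat() {
  local ip net mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int 100.64.0.0)
  mask=$(( (0xFFFFFFFF << 22) & 0xFFFFFFFF ))  # /10: top 10 bits set
  (( (ip & mask) == net ))
}

for addr in 100.100.2.136 100.100.2.148 114.114.114.114 172.24.4.100; do
  if in_cgnat "$addr"; then
    echo "$addr is inside 100.64.0.0/10"
  else
    echo "$addr is outside 100.64.0.0/10"
  fi
done
```

The two AliCloud addresses fall inside the range, while the public DNS server and the server's own eth0 address do not, which is consistent with only the internal services being affected.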