Geekzone: technology news, blogs, forums


terrbear

15 posts

Geek


#307129 22-Sep-2023 14:54

I’ve been trying to get a Chorus tech out here, but haven’t had any luck yet, because the sporadic packet loss I see isn’t easy to reproduce without some patience on the RSP’s end.

 

The gist is, at various points during the day, I’ll see some packet loss. I ran speedtest from the CLI 100 times, and 10 of those runs showed packet loss of between 0.5% and 2%.

 

I’ve been trying to figure this out for a few months now. I thought at first it was my modem. I was on 2degrees and using their FritzBox. I changed to another modem/router (Asus GT-AX11000), same issue. 

 

So I’d thought maybe it was some unluckily routed MTU blackhole. Turned MTUs down everywhere to 1280, no dice.

 

I gave up on 2degrees because their hold times were really long, and had the hope that maybe having a different route out of NZ would be helpful. Switched to OneNZ. Same issue.

 

I can hook a laptop directly to the ONT and run mtr to the gateway, and still see 50% packet loss. I realize that some ICMP will get dropped, but it’s so consistent that I think it’s a symptom of the same thing causing the other packet drops I see during the day.

 

I have a static IP, and if I run a mtr from outside to my IP, I see 0% packet loss until the last hop, which is my house :)

 

So, I’ve tried 2 ISPs and 3 modems (FritzBox, Asus, Macbook), and no luck. I told One I’d be happy to pay the no-fault fee for Chorus, so they scheduled a tech, but warned that because they didn’t see any problems Chorus would likely not show up. It seems that is true; nobody’s shown up.

 

At the ONT, I see the sleeve for the fibre cable is damaged (I can see all the inside cables). I’m told that’s not a problem because traffic is passing.

 

What would be the best way to get a Chorus tech out here? It’s entirely possible I’m being dumb. I’m happy to throw money at this problem, but I really need reliable TCP connections.

 

I made a mistake early on and told the support team that if I run traffic through Wireguard, the problem is mitigated. I think that’s because WG UDP retransmits a lot more aggressively than a normal TCP connection. But I’m not running my traffic through Wireguard normally.

 

Oh, the traffic in question is mostly SSH/HTTPS. Like, doing git things with Github and doing API calls to AWS. The rest of the family notices this problem when pages don’t load or buttons don’t seem to click the first time and they resubmit. 

 

Thanks in advance, smarter people here :)


terrbear

15 posts

Geek


  #3130533 22-Sep-2023 15:18

Update: I think I got a real fault lodged and someone will come out Monday. Clearly I just needed to appeal to the geekzone gods :)


 
 
 

raytaylor
3835 posts

Uber Geek

Trusted

  #3136695 29-Sep-2023 19:59

I have had a similar problem in the past. 

 

The slightly damaged fibre cable is unlikely to be the problem: fibre either works or it doesn't. It doesn't scale back speed or capacity. And the beauty of GPON is that the links don't establish and fail quickly enough for you to call it a packet loss issue; the link would go down for several seconds before coming back up again.

 

   

 

If you have two ISPs on the same ONT and only one has packet loss, then it's likely to be a problem with the SFP module at one of the ISP's ends.
I have had this issue in the past, where an IT customer was having a problem with their ISP, who had a bad SFP module in the local exchange on either their end or the Chorus end of the handover link.
But being a small ISP, they didn't have many customers who noticed it as a problem and weren't very interested in fixing it.
I put in a secondary circuit to our handover in the same exchange, next rack over, and found no issue at all.

 

Chorus does have an allowance for a certain level of packet loss in their residential bitstream connections with their original UFB government contracts but in the real world they have pretty much zero packet loss due to the lack of congestion.     

 

  

 

You do mention that you have tried different ISPs, so that does lead me to believe it's more on the Chorus side of things.





Ray Taylor

There is no place like localhost

Spreadsheet for Comparing Electricity Plans Here


terrbear

15 posts

Geek


  #3143597 6-Oct-2023 13:15

I think I've got it all worked out!

 

 

 

1) Chorus came out after I found a really helpful support person at One. He said the line was terrible and was surprised neither RSP had told me as such. He replaced the line and ONT.

 

 

 

Things were better but still unreliable, and then something seems to have happened at One that made things go really really wonky, like UDP streams were dropping all over. I saw a Reddit thread where people had noticed packet loss w/ One in the last few weeks, so at that point I decided I'd try another RSP.

 

 

 

I switched to Orcon (yeah, same as 2d, but this was a hail mary) and got their router and paid for premium support.

 

 

 

Cutover happened, same issue. I could do an `ssh -vvv host ls` and see it hang (when it hangs) at the same spot. Also, this had been going on for a few months, but Disney+ only worked on one of our Samsung TVs if IPv6 was off (the other TVs don't have IPv6 so didn't have a problem).

 

 

 

Frustrated, I tried setting the MTU on the dest host for SSH to 576 and ... voila. Fixed! 

 

 

 

A bit of manual binary searching found 950 to be the magic value. So now I've got the MTU on the router set to 950 and now Disney+ works over IPv6. Woot!
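The manual binary search can be sketched roughly as follows. This is only an illustration: `probe` is a hypothetical stand-in for whatever real check is used (for example a ping with DF set), and the search assumes every size at or below some limit passes while every size above it fails, which a real path may not honour.

```python
from typing import Callable


def largest_passing_size(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe is monotonic: True up to some limit, False above it.
    Returns low - 1 if even `low` fails.
    """
    best = low - 1
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid       # mid works, so look for something larger
            low = mid + 1
        else:
            high = mid - 1   # mid fails, so look for something smaller
    return best


# Hypothetical probe: pretend the path passes everything up to 950 bytes.
def fake_probe(size: int) -> bool:
    return size <= 950


print(largest_passing_size(fake_probe, 576, 1500))
```

With a real probe, each step costs one test, so the range 576 to 1500 needs about ten tries.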

 

 

 

But, still having connection woes on my Linux machine. Turns out by default MTU probing is disabled on Ubuntu, so I flipped that on and now it seems like everything is magically working. 
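For reference, the Linux setting involved is the `net.ipv4.tcp_mtu_probing` sysctl. The sketch below reads the current value and prints what it means, using the three values described in the kernel's ip-sysctl documentation. The helper names are made up for illustration, and the post doesn't say which non-zero value was chosen.

```python
from pathlib import Path

# Meanings of net.ipv4.tcp_mtu_probing per the Linux kernel ip-sysctl docs.
MODES = {
    0: "disabled",
    1: "disabled by default, enabled when an ICMP black hole is detected",
    2: "always enabled",
}


def describe_mtu_probing(value: int) -> str:
    """Translate the sysctl value into a human-readable description."""
    return MODES.get(value, f"unknown value {value}")


def read_mtu_probing(path: str = "/proc/sys/net/ipv4/tcp_mtu_probing") -> int:
    """Read the current setting from procfs (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        current = read_mtu_probing()
        print(current, "->", describe_mtu_probing(current))
    except (OSError, ValueError):
        print("not a Linux host, or the sysctl is unavailable")
```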

 

 

 

I'm not sure why SSH on OSX doesn't MTU probe. Haven't gotten that far yet.




terrbear

15 posts

Geek


  #3143599 6-Oct-2023 13:26

On OSX, it seems you need to set net.inet.tcp.pmtud_blackhole_mss. Mine was set to 1200, which wouldn't catch the 950 MTU.

 

 

 

That said, BSD docs say (https://man.freebsd.org/cgi/man.cgi?query=tcp&sektion=4):

 

 

 

> pmtud_blackhole_detection Enable automatic path MTU blackhole detection. In case of retransmits of MSS sized segments, the OS will lower the MSS to check if it's an MTU problem. If the current MSS is greater than the configured value to try (net.inet.tcp.pmtud_blackhole_mss and net.inet.tcp.v6pmtud_blackhole_mss), it will be set to this value, otherwise, the MSS will be set to the default values (net.inet.tcp.mssdflt and net.inet.tcp.v6mssdflt). Settings:
> 0 Disable path MTU blackhole detection.
> 1 Enable path MTU blackhole detection for IPv4 and IPv6.
> 2 Enable path MTU blackhole detection only for IPv4.
> 3 Enable path MTU blackhole detection only for IPv6.

 

 

 

So I'm not sure if I'm misreading that or what.
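One way to read the quoted paragraph is as a two-step fallback, sketched below. This follows only the wording of the man page, not the actual kernel code, and OSX may behave differently. The function names are made up, the default MSS of 536 is an assumed value for net.inet.tcp.mssdflt, and the 40 bytes of headers assume IPv4 and TCP with no options.

```python
TCP_IPV4_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def blackhole_fallback_mss(current_mss: int, blackhole_mss: int, default_mss: int) -> int:
    """Next MSS to try after retransmits, per the quoted man page wording."""
    if current_mss > blackhole_mss:
        return blackhole_mss   # first step: drop to the configured value
    return default_mss         # already at or below it: drop to the default


def fits(mss: int, mtu: int) -> bool:
    """True if a full-sized segment fits in one packet of the given MTU."""
    return mss + TCP_IPV4_HEADERS <= mtu


first = blackhole_fallback_mss(1460, 1200, 536)
second = blackhole_fallback_mss(first, 1200, 536)
print(first, fits(first, 950))
print(second, fits(second, 950))
```

Under these assumptions the first step (1200) is still too big for a 950-byte path, and only the second step down to the default would fit.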

 

 


RunningMan
7960 posts

Uber Geek


  #3143600 6-Oct-2023 13:26

Orcon should accept up to 1500 for the MTU, so something seems off there.

 

https://help.orcon.net.nz/hc/en-us/articles/900001248426-Setting-up-your-own-Fibre-modem 


terrbear

15 posts

Geek


  #3143608 6-Oct-2023 14:02

I don't think it's Orcon, because I'd noticed this with One also; seems somewhere in between here and California (I assume through Sydney).


RunningMan
7960 posts

Uber Geek


  #3143611 6-Oct-2023 14:13

It was more in relation to this comment:

 

terrbear: [snip] So now I've got the MTU on the router set to 950 and now Disney+ works over IPv6. Woot!

 

You may have established a workaround, but it's just that, a workaround.




terrbear

15 posts

Geek


  #3143616 6-Oct-2023 14:24

Ahh, yeah, agreed. I would rather 1500 MTU for sure :D

 

 

 

Interestingly, my first hop after the Orbi:

 

 

 

❯ ping -s 899 -M do 60.234.8.50
PING 60.234.8.50 (60.234.8.50) 899(927) bytes of data.
907 bytes from 60.234.8.50: icmp_seq=1 ttl=63 time=2.79 ms
907 bytes from 60.234.8.50: icmp_seq=2 ttl=63 time=2.77 ms
^C
--- 60.234.8.50 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 2.774/2.782/2.791/0.008 ms
❯ ping -s 900 -M do 60.234.8.50
PING 60.234.8.50 (60.234.8.50) 900(928) bytes of data.
908 bytes from 60.234.8.50: icmp_seq=155 ttl=63 time=3.80 ms

 

 

 

I haven't been able to get 928 bytes thru (well, 1/155). So my empirical testing was not quite on the mark I think :)
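The two numbers ping prints, such as 899(927), are the ICMP payload and the full IPv4 packet size. A small sketch of the arithmetic, assuming an IPv4 header with no options:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ip_packet_size(payload: int) -> int:
    """Full IPv4 packet size for a ping with the given -s payload."""
    return payload + ICMP_HEADER + IPV4_HEADER


def payload_for_mtu(mtu: int) -> int:
    """Largest ping -s value that fits in one packet of the given MTU."""
    return mtu - ICMP_HEADER - IPV4_HEADER


print(ip_packet_size(899))    # 927
print(ip_packet_size(900))    # 928
print(payload_for_mtu(1500))  # 1472
```

The "907 bytes from" in the replies is the payload plus the 8-byte ICMP header only, which is why it differs from the 927 in the first line.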

 

 

 

I pinged Orcon about this but I expect there'll be some back and forth with them before they stop giving me the form reply about placing my wireless router in the right spot and setting MTU to 1500. 


raytaylor
3835 posts

Uber Geek

Trusted

  #3143676 6-Oct-2023 17:46

terrbear:

 

1) Chorus came out after I found a really helpful support person at One. He said the line was terrible and was surprised neither RSP had told me as such. He replaced the line and ONT.

 

 

I am pretty sure retailers can only see the signal level as "Acceptable" or "Not acceptable" in the Chorus assurance system. The Chorus technicians, however, can see the actual signal levels through a system they use.

 

I don't know what Chorus considers to be an acceptable signal level, but normally for a GPON system it's between -8 dBm and -26 dBm, though Chorus typically runs a 1:16-way split, so the best you would see is about -12 dBm. But even if your signal is down at -26 dBm you would still get full gigabit speed, which is why I say it would be cutting out completely if it was a signal/line issue.

 

If you were to ping a remote server and the signal level was down at -28 dBm, then there would indeed be intermittent blocks of reply failures. There would be several successful replies and then it would drop out for about 10 seconds, maybe more, while the ONT renegotiates with the OLT at the exchange, before you got any responses again.

If you're getting one or two successes, then a few failures, and then it's working again without a big block of failures, then I doubt it would be a line or signal issue.
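As a rough check on the 1:16 split figure: an ideal 1:N optical splitter costs 10·log10(N) dB. The sketch below only does that arithmetic. It ignores connector and fibre losses, and the OLT launch power isn't given in the post, so it says nothing exact about the received level.

```python
import math


def ideal_split_loss_db(ways: int) -> float:
    """Loss in dB of an ideal 1:N splitter that divides power evenly."""
    return 10 * math.log10(ways)


for ways in (2, 16, 32):
    print(ways, round(ideal_split_loss_db(ways), 2))
```

For 16 ways this comes to roughly 12 dB, which lines up with the "best you would see" figure if the launch power is somewhere near 0 dBm (an assumption).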





Ray Taylor

There is no place like localhost

Spreadsheet for Comparing Electricity Plans Here


terrbear

15 posts

Geek


  #3143683 6-Oct-2023 18:10

Ooh good to know. The drops are definitely not all traffic, just what I thought were rando packets, but seems to be big(-ger) ones.

 

 

 

Like, if I set TCP timeouts for clients I'm using to be say 2 seconds, I'll see a drop and retry but everything works itself out on the reconnect. Definitely not 10 second drops.


terrbear

15 posts

Geek


  #3143904 7-Oct-2023 12:00

I've done some more testing (thanks @Cxf for your help!) and ... this is the weirdest thing. It seems that some packet lengths are just blackholed from my house. And only even numbered ones. If I send pings for odd-numbered packet lengths, eg 901 or 899, I see 0% packet loss. I tested every size from 100-927 (my current MTU) with DF set, and get this:

 

 

 

size: 100 (128), sent: 4, recv: 2, loss: 50.00
size: 132 (160), sent: 4, recv: 1, loss: 75.00
size: 148 (176), sent: 4, recv: 2, loss: 50.00
size: 164 (192), sent: 4, recv: 2, loss: 50.00
size: 180 (208), sent: 4, recv: 2, loss: 50.00
size: 196 (224), sent: 4, recv: 1, loss: 75.00
size: 212 (240), sent: 4, recv: 2, loss: 50.00
size: 228 (256), sent: 4, recv: 0, loss: 100.00
size: 244 (272), sent: 4, recv: 1, loss: 75.00
size: 260 (288), sent: 4, recv: 2, loss: 50.00
size: 516 (544), sent: 4, recv: 0, loss: 100.00
size: 452 (480), sent: 4, recv: 0, loss: 100.00
size: 276 (304), sent: 4, recv: 2, loss: 50.00
size: 596 (624), sent: 4, recv: 0, loss: 100.00
size: 484 (512), sent: 4, recv: 0, loss: 100.00
size: 500 (528), sent: 4, recv: 3, loss: 25.00
size: 612 (640), sent: 4, recv: 0, loss: 100.00
size: 372 (400), sent: 4, recv: 2, loss: 50.00
size: 564 (592), sent: 4, recv: 0, loss: 100.00
size: 580 (608), sent: 4, recv: 0, loss: 100.00
size: 324 (352), sent: 4, recv: 2, loss: 50.00
size: 628 (656), sent: 4, recv: 0, loss: 100.00
size: 292 (320), sent: 4, recv: 2, loss: 50.00
size: 660 (688), sent: 4, recv: 0, loss: 100.00
size: 116 (144), sent: 4, recv: 3, loss: 25.00
size: 420 (448), sent: 4, recv: 1, loss: 75.00
size: 388 (416), sent: 4, recv: 2, loss: 50.00
size: 404 (432), sent: 4, recv: 1, loss: 75.00
size: 308 (336), sent: 4, recv: 1, loss: 75.00
size: 532 (560), sent: 4, recv: 0, loss: 100.00
size: 644 (672), sent: 4, recv: 0, loss: 100.00
size: 548 (576), sent: 4, recv: 0, loss: 100.00
size: 676 (704), sent: 4, recv: 0, loss: 100.00
size: 724 (752), sent: 4, recv: 0, loss: 100.00

 

 

 

Pinging 1.1.1.1. 

 

 

 

I can reproduce this with a machine plugged directly into the ONT, so I don't think it's the Orbi :)

 

 

 

I think the aggressive MTU is alleviating the issue by forcing smaller packets that don't hit those bad numbers as often.
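To look for a pattern in the lossy sizes, the result lines can be parsed and the gaps between the total packet sizes compared. The sketch below is illustrative only: the type and function names are made up, and `SAMPLE` holds just five of the lines pasted above.

```python
import re
from functools import reduce
from math import gcd
from typing import List, NamedTuple


class Probe(NamedTuple):
    payload: int  # ICMP payload bytes (the ping -s value)
    total: int    # payload plus 28 bytes of ICMP and IPv4 headers
    sent: int
    recv: int


LINE = re.compile(r"size: (\d+) \((\d+)\), sent: (\d+), recv: (\d+)")


def parse(text: str) -> List[Probe]:
    """Turn 'size: N (M), sent: S, recv: R, ...' lines into Probe records."""
    return [Probe(*(int(g) for g in m.groups())) for m in LINE.finditer(text)]


def common_step(probes: List[Probe]) -> int:
    """Greatest common divisor of the gaps between each total size and the first."""
    first = probes[0].total
    return reduce(gcd, (abs(p.total - first) for p in probes[1:]), 0)


SAMPLE = """\
size: 100 (128), sent: 4, recv: 2, loss: 50.00
size: 116 (144), sent: 4, recv: 3, loss: 25.00
size: 228 (256), sent: 4, recv: 0, loss: 100.00
size: 516 (544), sent: 4, recv: 0, loss: 100.00
size: 724 (752), sent: 4, recv: 0, loss: 100.00
"""

probes = parse(SAMPLE)
print(common_step(probes))
print(all(p.total % 16 == 0 for p in probes))
```

In the lines pasted above, every total size is a multiple of 16 bytes, which is a narrower pattern than just "even". Whether every multiple of 16 is affected isn't shown by this output.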

freitasm
BDFL - Memuneh
76351 posts

Uber Geek

Administrator
ID Verified
Trusted
Geekzone
Lifetime subscriber

  #3143923 7-Oct-2023 12:34

terrbear:

 

I don't think it's Orcon, because I'd noticed this with One also; seems somewhere in between here and California (I assume through Sydney).

 

 

If this was the case, we would have a constant stream of people complaining - from many different ISPs.






 

 

 

 

 

 


terrbear

15 posts

Geek


  #3143926 7-Oct-2023 12:39

freitasm:

 

terrbear:

 

I don't think it's Orcon, because I'd noticed this with One also; seems somewhere in between here and California (I assume through Sydney).

 

 

If this was the case, we would have a constant stream of people complaining - from many different ISPs.

 

 

 

 

Yep, you're right. Seems it's something between me and the RSP handover.


BMarquis
374 posts

Ultimate Geek

Trusted
Chorus
Lifetime subscriber

  #3143933 7-Oct-2023 13:01

What is your speedtest result to your RSP’s speedtest.net server?
Please also run an mtr to 1.1.1.1 and 8.8.8.8.

If you can run those three tests at both your 980 MTU and at 1500 MTU, please.



terrbear

15 posts

Geek


  #3144179 7-Oct-2023 23:01

I’m fixed! Well, my internet connection at least.

Huge thanks to @Cxf for sifting through my ramblings to figure out where to look for the issue :)





