Geekzone: technology news, blogs, forums


jb5


27 posts

Geek


#317704 7-Nov-2024 14:45

I've been seeing a bit of packet loss since about 10 pm on the 24th of Oct. I only started looking because my work VPN was dropping out, though I'm not sure whether that's related to the loss. The packet loss seems to start at the Voyager edge hop.
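For anyone reading an MTR-style trace the same way, here is a minimal sketch of the usual rule of thumb: loss at a hop only matters if it carries through to every later hop, since a router may simply rate-limit its own ICMP replies. The hop names, counters and the 0.5% threshold are hypothetical and are not taken from the screenshots in this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    name: str
    sent: int
    received: int

    @property
    def loss_pct(self) -> float:
        """Percentage of probes to this hop that got no reply."""
        return 100.0 * (self.sent - self.received) / self.sent


def first_persistent_loss(hops: List[Hop], threshold_pct: float = 0.5) -> Optional[Hop]:
    """Return the earliest hop from which every later hop also shows loss."""
    candidate = None
    for hop in hops:
        if hop.loss_pct >= threshold_pct:
            if candidate is None:
                candidate = hop
        else:
            candidate = None  # loss did not carry through, so reset
    return candidate


# Hypothetical counters for illustration only.
trace = [
    Hop("home-router", 1000, 1000),
    Hop("isp-edge", 1000, 980),
    Hop("isp-core", 1000, 981),
    Hop("destination", 1000, 979),
]

hop = first_persistent_loss(trace)
print(hop.name if hop else "no persistent loss")
```

With these made-up numbers the loss begins at the second hop and continues to the destination, which is the pattern that points at that hop rather than at something further along the path.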

 

 

 

WinMTR to voyager.nz from my Win 11 desktop

 

 

 

 

The graphs below are from a SmokePing container on my Proxmox host

 

Google

 



TradeMe

 

 

 

 

Prior to the 24th I didn't have any loss.

 

 

 

Edit:

 

Here's where the loss began

 

 

 

 

 


Linux
11287 posts

Uber Geek

Trusted
Lifetime subscriber

  #3306631 7-Nov-2024 15:10
 
 
 

Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3306660 7-Nov-2024 15:36

Hey @jb5,

 

Just dropping a message to acknowledge this. We did do some planned work on the evening of the 24th to correct an issue we were seeing for Hyperfibre connections coming into Christchurch. The maintenance corrected that issue. I'm looking over some things to see whether it may have had unintended consequences. Bear with me and I'll come back to you.

 

 

 

Regards,

 

~H


Psilan
856 posts

Ultimate Geek


  #3306743 7-Nov-2024 21:36

Thanks Firebirdnz. I'm in Chch with similar issues. Definitely became much worse on the 24th.





Voyager referral - https://refer.voyager.nz/68QKJ8XKK




Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3306882 8-Nov-2024 11:09

Just to provide an update. We have identified the likely cause of the packet loss. I am now trying to get to the bottom of why it's happening and how to resolve it. Will provide updates.


jb5


27 posts

Geek


  #3306884 8-Nov-2024 11:12

Legend. Thank you.


Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3306970 8-Nov-2024 14:03

Issue is super odd. Network into our Christchurch Broadband termination is done over multiple links for redundancy and capacity. Those links are connected to multiple upstream devices, once again for redundancy and capacity. We are seeing drops on a single link in a single direction. If I take that link out of service we then see drops on another link. It's only ever a single link affected in a single direction. The rate of drops seems to decrease if I balance the traffic more evenly over the links. I have done this and loss has quartered. I'll do some more digging and come up with a plan but at least for you guys it's better. Would appreciate continued reports from you all. Sorry for any impact this is causing. Appreciate the report and patience while I fix it.
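One way to picture why evening out the traffic can reduce drops is a toy model in which a link only drops whatever it is offered above some threshold. This is purely an assumption for illustration: the threshold, the link loads and the idea that drops behave this way are all hypothetical, and the actual cause had not been identified at this point in the thread.

```python
def dropped_pps(link_loads, drop_threshold):
    """Packets per second above the threshold on each link count as dropped."""
    return sum(max(0, load - drop_threshold) for load in link_loads)


# Hypothetical figures chosen only to show the shape of the effect.
THRESHOLD = 1_000_000  # pps a link carries before it starts dropping
uneven = [1_000_600, 700_000, 700_000, 599_400]
even = [sum(uneven) // len(uneven)] * len(uneven)  # same total, spread evenly

print(dropped_pps(uneven, THRESHOLD))  # 600
print(dropped_pps(even, THRESHOLD))    # 0
```

In this model the total traffic is unchanged; only the busiest link ever crosses the threshold, so spreading the load pulls it back under.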

 

~H


jb5


27 posts

Geek


  #3306976 8-Nov-2024 14:16


I appreciate the continued effort to resolve it. From here it looks like things improved a lot since around 23:00 yesterday, and since then it has been mostly good with just a few blips of loss.




Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3307918 12-Nov-2024 08:48

Yesterday I started testing a probable long-term solution. So far it's looking good. I need to tick off a few more things before scheduling a window to put it into production. At this stage that window will likely be in the week of the 25th. Thanks again for your continued patience.

 

~H


toejam316
1459 posts

Uber Geek

Trusted
Lifetime subscriber

  #3307921 12-Nov-2024 09:01

Not sure how much info you'd be allowed to share, but I'd be super interested in hearing about this fault and what the RCA and resolution were, once you're done @Firebirdnz





Anything I say is the ramblings of an ill informed, opinionated so-and-so, and not representative of any of my past, present or future employers, and is also probably best disregarded.


jb5


27 posts

Geek


  #3307922 12-Nov-2024 09:05

Thanks! Sounds good.

 

I just had a look at my Smokeping graphs and to be fair I don't see any packet loss since those initial changes you made.

 

I would also be interested in learning a bit more about the cause and fix if you are able to share.


Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3307971 12-Nov-2024 10:59

I'm glad it seems to be looking better. The balancing I did looks to have reduced the loss by about 75%. We're still seeing loss on our graphs:

 

As you can see, it's only about 150pps being dropped at peak, which is something like sub 0.005% loss. So you'll possibly still be getting drops; it's just much less likely.
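To put those two figures side by side, a quick back-of-the-envelope calculation: if 150pps of drops is under 0.005% loss, the traffic at peak has to be at least a few million packets per second. The function names below are just for illustration; only the 150pps and 0.005% figures come from the post.

```python
def loss_pct(dropped_pps: float, total_pps: float) -> float:
    """Loss as a percentage of all packets forwarded."""
    return 100.0 * dropped_pps / total_pps


def implied_total_pps(dropped_pps: float, max_loss_pct: float) -> float:
    """Smallest total packet rate for which the drops stay under max_loss_pct."""
    return dropped_pps / (max_loss_pct / 100.0)


# 150 pps dropped at peak, quoted as "sub 0.005%" loss.
print(f"{implied_total_pps(150, 0.005):,.0f} pps")
```

Because 0.005% is an upper bound on the loss, the result is a lower bound on the traffic: roughly 3 million packets per second or more at peak.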

 

Root cause looks to point towards a shared port buffer issue with a particular device. The plan currently is to rearchitect some of the setup to remove reliance on that device. This was something we were already looking at anyway: directly connecting our BNGs into our Provider Core network, which would give us more scalability for our broadband services. I'll see if I am able to share more information when it's all said and done. :)


Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3313713 28-Nov-2024 10:24

Hey all,

 

Just an update on this one. I have been testing an architectural change that has been shown to fix the issues. I have created a maintenance window for next Wednesday 4 December to implement the changes. Notification can be seen here: https://status.voyager.nz/incidents/3bjysd9t78t0

 

 

 

Please only read past here if you want a technical explanation of what's been going on.

 

Voyager use a technology called pseudowire headend termination to aggregate and provide subscriber services. We utilise Nokia 7750 SR routers to complete this termination process. Within the Nokia operating system there are two primary ways to complete this function.

The first is I/O interface binding. Here a single interface (or in our case a single LAG) uses the hardware resources of the interface into the box to also do the subscriber termination. This is a simple and scalable way to do this kind of subscriber termination. The LAG on the Nokia side connects to multiple upstream devices, and this is where the issue stems from. For some reason these upstream devices are dropping some traffic at a low level. We have hit a dead end in understanding why, so the changes we are making remove these devices from the traffic flow.

To do this we have to change from our current termination type, I/O interface binding, to the second type, PXC termination. With this method you assign dedicated resources to the termination, and these must be separate from the interfaces used to go into and out of the device. We have to change because the new devices we will be connecting to aren't suitable for a cross-device single LAG (MC-LAG). This change was being considered before these issues arose, as we have wanted to connect our Nokia 7750 SRs directly into our Core Provider ("P") network, which also isn't suitable for MC-LAG. This maintenance window completes about 95% of what is required to make this happen.
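For readers less familiar with LAGs, the sketch below shows the general idea of how a link aggregation group typically spreads traffic: each flow is hashed onto one member link so its packets stay in order. This is a generic illustration, not Voyager's or Nokia's actual hashing; the member names, the flows (documentation address ranges) and the use of SHA-256 are all assumptions, and real hardware uses much cheaper hash functions.

```python
import hashlib
from collections import Counter


def pick_member(flow, members):
    """Map a flow's 5-tuple to one LAG member so the flow's packets stay in order."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return members[int.from_bytes(digest[:4], "big") % len(members)]


# Hypothetical member links and flows: (src IP, dst IP, protocol, src port, dst port).
members = ["link-1", "link-2", "link-3", "link-4"]
flows = [("198.51.100.7", "203.0.113.10", 6, 50000 + i, 443) for i in range(1000)]

# Count how many flows land on each member link.
usage = Counter(pick_member(f, members) for f in flows)
print(usage)
```

Under this kind of scheme a problem on a single member link only touches the flows that happen to hash onto it, which is one reason loss confined to one link can show up as small, intermittent loss for individual users.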

 

Next year at some point there will be further work to complete the same kind of changes to our Wellington and Auckland broadband networks.

 

 

 

Appreciate your patience while we work through this issue and I hope come this time next week it'll all be behind us!

 

~H


jb5


27 posts

Geek


  #3313724 28-Nov-2024 10:35

Thank you @Firebirdnz. I've noticed this week the packet loss has been worse, at least according to smokeping. Thankfully I have not noticed any issues with my connection in day to day use.

 

Appreciate your explanation too, even though I only have a vague understanding of those words :)


Firebirdnz
35 posts

Geek

Trusted
Voyager

  #3313745 28-Nov-2024 10:56

@jb5,

 

Have you got a graph of the last week that I can correlate to my data?

 

 

 

Regards,

 

~H


jb5


27 posts

Geek


  #3313748 28-Nov-2024 11:00

@Firebirdnz I'll just give you the link so you can zoom the graph as you need: https://smokeping.a-website.net/

