Best Practices for Large-Scale Connectors


We define a large-scale connector strategy as one that produces over 5000 Work Units in BWX per sync request.

A work unit in BWX is one combination of a file, a workflow step, and a language pair. So a project with 5 files, 2 workflow steps (Translation + Review), and 5 languages will result in 5 × 2 × 5 = 50 work units. You can see how it’s not that difficult to go over 5000 Work Units when you begin to multiply these variables.
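The multiplication above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the larger sync figures (300 files, 10 languages) are assumptions made for the example, not BWX internals or real customer data.

```python
def count_work_units(files: int, workflow_steps: int, language_pairs: int) -> int:
    """Work units = files x workflow steps x language pairs."""
    return files * workflow_steps * language_pairs


# The example from the text: 5 files, 2 workflow steps, 5 languages.
small_project = count_work_units(files=5, workflow_steps=2, language_pairs=5)

# A hypothetical larger sync: 300 files, 2 workflow steps, 10 languages.
consolidated = count_work_units(files=300, workflow_steps=2, language_pairs=10)

# The same content split into one project per language pair.
per_locale = count_work_units(files=300, workflow_steps=2, language_pairs=1)

print(small_project)  # 50
print(consolidated)   # 6000, above the 5000 threshold
print(per_locale)     # 600 in each single-locale project
```

Splitting by language pair removes one factor from the product, which is why each per-locale project stays well under the threshold in this made-up case.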

It’s tempting to want to centralize everything in a single project, but our experience shows that with large-scale connectors, grouping projects by language pair is the best way to go.

This may seem like simply a matter of how you choose to define a project, but it goes much deeper than that. In this article, we will dissect the impact of this decision in terms of:


1. Message Brokering

Message volume multiplies quickly with large-scale connectors. We have seen connectors that create over 1 million messages per synchronization request. This results in tremendous server activity and performance issues, which can be mitigated by dividing projects per locale.


2. Problem-solving

This is directly tied to risk mitigation. In localization, we often see issues that are restricted to a single locale and affect the parsing, segmentation, and pre/post-processing of a given set of files. By separating projects per locale, you can create locale-specific regular expressions, processing rules, and segmentation, which makes the overall architecture far more flexible. Rather than being restricted to fixes that must work across the board, you can iterate per locale and ultimately reach more mature and predictable behavior.
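As a rough illustration of locale-specific rules, the sketch below keeps one segmentation pattern per locale and falls back to a default. The rule table, the patterns, and the `segment` function are hypothetical and deliberately simplistic; they are not BWX's segmentation engine.

```python
import re

# Hypothetical per-locale segmentation rules (illustrative patterns only).
SEGMENT_RULES = {
    # Split on whitespace that follows Western sentence punctuation.
    "default": re.compile(r"(?<=[.!?])\s+"),
    # Japanese has no space after the full stop, so split right after it.
    "ja-JP": re.compile(r"(?<=[。！？])"),
}


def segment(text: str, locale: str) -> list[str]:
    """Split text into segments using the locale's rule, else the default."""
    rule = SEGMENT_RULES.get(locale, SEGMENT_RULES["default"])
    return [part for part in rule.split(text) if part.strip()]


print(segment("Hello world. How are you?", "en-US"))
print(segment("こんにちは。元気ですか？", "ja-JP"))
```

With one project per locale, a fix to the `ja-JP` rule can be iterated on without retesting every other locale against the same pattern.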


3. Risk-mitigation

Decoupling projects into one per locale mitigates management risk: if something goes wrong in one locale, the entire pull-request/delivery mechanism is not compromised. You can isolate and naturally compartmentalize issues. This may not seem important during normal operations, but when unexpected issues come up (and they always do in localization), you will be grateful for having built a brick house rather than one made of straw.
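The isolation idea can be sketched as a loop that syncs each locale independently and records failures instead of aborting the whole delivery. `sync_locale` is a hypothetical stand-in for a per-locale sync request, and the simulated pt-BR failure is invented for the example.

```python
def sync_locale(locale: str) -> str:
    """Hypothetical stand-in for one per-locale sync request."""
    if locale == "pt-BR":
        # Simulated locale-specific failure, for illustration only.
        raise RuntimeError("parser failed on a pt-BR resource file")
    return f"{locale}: delivered"


def sync_all(locales):
    """Sync every locale; a failure in one does not block the others."""
    delivered, failed = {}, {}
    for locale in locales:
        try:
            delivered[locale] = sync_locale(locale)
        except Exception as exc:
            failed[locale] = str(exc)
    return delivered, failed


delivered, failed = sync_all(["de-DE", "pt-BR", "ja-JP"])
print(sorted(delivered))  # ['de-DE', 'ja-JP']
print(failed)             # {'pt-BR': 'parser failed on a pt-BR resource file'}
```

In a single consolidated project, the same error would surface as one failed delivery for every locale.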


4. Queuing/performance

Instead of lining up 150,000 elements for processing, for instance, you can line up 15,000 elements ten times. This may not seem like a big difference, since you ultimately have to process the same 150,000 elements, but being able to process the queues serially, in parallel, or opportunistically as desired gives you far more flexibility and performance headroom.
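A minimal sketch of that choice, assuming ten per-locale queues of 15,000 elements each: the same queues can be drained one after another or handed to a worker pool. `process_batch` and the locale names are placeholders, and the pool size of 4 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def process_batch(locale, elements):
    """Hypothetical placeholder for processing one locale's queue."""
    return locale, len(elements)


# Ten locales with 15,000 elements each: the same 150,000 elements in total.
queues = {f"locale-{i:02d}": range(15_000) for i in range(10)}


def run_serial(queues):
    """Process the per-locale queues one after another."""
    return dict(process_batch(loc, items) for loc, items in queues.items())


def run_parallel(queues, workers=4):
    """Process the per-locale queues concurrently with a small worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_batch, loc, items)
                   for loc, items in queues.items()]
        return dict(future.result() for future in futures)


print(sum(run_serial(queues).values()))    # 150000
print(sum(run_parallel(queues).values()))  # 150000
```

A single 150,000-element queue offers only the first option; the per-locale split is what makes the second one available.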


5. Automation potential

Project decisions and workflows are typically asymmetrical across locales. Separating projects per locale gives you far greater long-term automation potential, because each locale can have entirely different parameters and data sets instead of sharing one consolidated configuration.


6. Management Ease

This one is also counterintuitive. From a management perspective, consolidation is typically the best practice for better governance, but with large-scale connectors the opposite is true. Projects are naturally filtered per locale, so different project managers can own different parts of the program more easily, fewer filters are needed to generate reports, and it is simpler to track what is going on in each locale.


7. Scalability

With large-scale connectors, you will reach a point where grouping all work units into a single project simply becomes unmanageable. Separating per locale lays the framework for a program that is easier to scale long-term. Remember that work units and messages multiply: by separating per locale, you eliminate one of the largest multiplying variables, which lets you scale up much more easily.


Conclusion

There is an illusion that consolidation is better, and that one pull request per project makes everyone’s life easier. While that holds true for small-scale connectors, it falls apart for large-scale ones. Our goal is always to deliver the most elegant and reliable solutions to our clients, and we have seen again and again that in large-scale situations, divide and conquer is the way to go.

Consolidating in a Single Project              | Separating per Locale
-----------------------------------------------|--------------------------------------------------------
Entire project has a 0 or 1 status             | Status can be stratified per locale
Single parsing, filtering, and RegEx framework | Flexible framework per locale
Single-lane queuing per project                | Flexible queuing that naturally distributes processing
Filtering per locale within the project        | One less filtering step
Automation rules that work across the board    | Locale-specific automation parameters and data
All eggs in one basket from a risk perspective | Risk distributed across locales
Complex to isolate and troubleshoot            | One less huge variable, making it easier to troubleshoot