The road to product release is forked, twisted, winding, and anything but straight. That’s because modern product development is increasingly multi-disciplinary, rapid, iterative, and geographically distributed. Art, design, prototyping, manufacturing, programming, and so forth all use different tools yet need to work together more closely than ever before. Critical to success, therefore, is having what Agile devotees call a "single source of truth": one (and only one) place where all of that product-development content is stored, revised, secured, and synchronised, even when contributors are spread around the world.
Git has long been popular among developers, but making it work for the enterprise is neither easy nor obvious. The purpose of this article is to explain a few of the more salient challenges and their workarounds, identify the pattern that emerges, and outline the shape of a better solution.
File Limitations
Arguably the largest challenge Git faces in the enterprise stems from its own limitations and performance problems when dealing with large numbers of files or very large files. Git repositories become so slow and unwieldy as they grow that the largest practical size is broadly recognised as being between 1 and 2 GB of content.
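As a rough way to see how close a project is to that range, the following Python sketch sums the size of a working tree and compares it with a budget. The 2 GB figure is simply the upper end of the range quoted above, not a limit Git enforces, and the function names are illustrative. It counts every file under the directory (tracked or not) except Git's own `.git` folder.

```python
import os

# Assumption: 2 GB is the upper end of the practical range quoted in the
# article, not a hard limit enforced by Git itself.
PRACTICAL_LIMIT_BYTES = 2 * 1024**3


def working_tree_size(root: str) -> int:
    """Return the total size in bytes of regular files under root, skipping .git."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git so Git's own object store is not counted.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Ignore symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def exceeds_practical_limit(root: str, limit: int = PRACTICAL_LIMIT_BYTES) -> bool:
    """Return True when the working tree is larger than the given budget."""
    return working_tree_size(root) > limit
```

Calling `exceeds_practical_limit(".")` from the top of a checkout gives a quick yes/no answer; history size is not included, so the real repository may be larger still.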
This is precisely what drives the phenomenon known as "Git sprawl," or the tendency to break what really should be one large repository into dozens, hundreds, or even thousands of smaller repos. Managing so many repos is not easy and has unsurprisingly given rise to a variety of tools and mechanisms (e.g., Git submodules or Google's repo tool) to tame such complexity.
But this isn’t the only workaround. Another is to store the larger files outside the repository itself, using a tool like the Git Large File Storage extension. The idea is to store only a small "pointer" file in the actual repository yet retrieve the large data when needed from a completely different system.
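To make the pointer idea concrete, here is a minimal Python sketch of the small text format that Git LFS pointer files use: a version line, a SHA-256 object ID, and the size in bytes. The function names are illustrative and not part of Git LFS; in practice the git-lfs client creates and resolves these files for you.

```python
import hashlib

# Version line that begins a Git LFS pointer file.
LFS_SPEC_VERSION = "https://git-lfs.github.com/spec/v1"


def make_lfs_pointer(content: bytes) -> str:
    """Build the pointer text that stands in for a large file's content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_VERSION}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def parse_lfs_pointer(text: str) -> dict:
    """Split pointer text into its key/value fields."""
    fields = {}
    for line in text.splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields
```

However large the original file is, the pointer stays at roughly a hundred and thirty bytes, which is why the repository itself remains small while the real data lives on a separate server.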
Hosting and Security
Another set of challenges centres on hosting so many repositories and protecting their content. Git includes its own daemon for easy hosting, but such hosting is completely open. It performs no authentication or authorisation, so anyone who can reach the server can read everything it exports, and write to it as well if pushing has been enabled. That works nicely for a variety of use cases, but the only thing it’s good for in the enterprise is producing ulcers.
The underlying reason is that Git concerns itself only with authentication, leaving authorisation to the file system. In other words, Git provides tools that can be used to ensure commits are cryptographically signed by the individuals making them, but it offers no built-in way to lock down particular files, folders, or branches via the usual access control lists or other mechanisms.
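For a sense of what that missing layer has to do, the Python sketch below checks whether a user may write to a path against a small rule table. Everything here is hypothetical (the group names, the patterns, and the `can_write` helper); it illustrates the kind of path-based access control a hosting product or custom server-side hook must supply, not anything Git ships with.

```python
from fnmatch import fnmatchcase

# Hypothetical access rules: (group, path pattern, permission).
# Note that "*" in fnmatch patterns also matches "/" characters.
RULES = [
    ("everyone", "*", "read"),
    ("artists", "assets/*", "write"),
    ("release-managers", "release/*", "write"),
]


def can_write(user_groups: set, path: str, rules=RULES) -> bool:
    """Return True if any of the user's groups is granted write on path."""
    return any(
        group in user_groups
        and permission == "write"
        and fnmatchcase(path, pattern)
        for group, pattern, permission in rules
    )
```

With these rules, a member of `artists` may change `assets/textures/wall.png` but not `src/main.c`, while someone in `everyone` alone may only read.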
These needs explain the explosion of third-party hosting tools and services such as GitHub, GitLab, Atlassian Stash, etc.—all of which involve trade-offs. Free hosting in the cloud usually means limited space or privacy. Free local hosting behind your own firewall usually means limited features and IT headaches. As with so many things in life, you get what you pay for, but even the paid offerings can’t magically dispel the limitations of the underlying tool.
A Pattern Emerges
These are only some of the challenges facing Git adoption in the enterprise, but they’re enough to establish a pattern, one that is overlooked surprisingly often. Each limitation of Git unavoidably burdens the enterprise with the "weight" of a corresponding workaround.
So it doesn’t take long before that metaphorical weight meets or exceeds that of the original problems a version control system (VCS) was supposed to solve. Storing many files or large files means you need to split your repos, or perhaps host the large files externally. The need for secure hosting means you need to embrace a third-party solution or build your own. These needs—as well as those for more granular access control, synchronisation, scalability, high availability, disaster recovery, etc.—all add more extensions, tools, and processes to the picture in very short order.
The result is effectively another kind of Git sprawl, what we might call "IT sprawl," forcing IT departments that embrace Git to learn, adopt, and support a broad variety of elements from its larger ecosystem as well. This increases complexity significantly and steepens the learning curve for a tool not praised for its simplicity.
A Prettier Picture
Modern product development requires us to store, version, secure, and synchronise content across teams. Any VCS whose limitations require you to store content outside it, divide content artificially, and/or augment the system with a large number of other tools and processes misses the mark for the enterprise.
The shape of a better solution should already be pretty clear. It should let you store any number of files of any size and type. It should make it easy to version that content and synchronise it across multiple teams, no matter where they’re located. It should be flexible in its hosting configuration yet preferably offer the benefits of centralised administration, making it easy to define groups, users, and their permissions with fine-grained access control. In addition, it should, of course, be open and flexible enough to adapt to different workflows, getting out of the way as much as possible and letting artists, designers, programmers, and other contributors develop content using their favourite tools.
By itself, Git traces only a small portion of that shape’s outline. Adding multiple extensions and tools brings more of it into focus. But for the enterprise, there comes a point where the workarounds, with their additional demands, complexity, and limitations, begin to cause as many problems as they solve. This is the point where you realise you needed an enterprise solution all along.
A Better Solution
In contrast, Perforce Helix meets enterprise needs without forcing you to embrace a host of unsupported extensions or burn precious development cycles on home-brewed tools to fill gaps. It completes the picture we’ve been framing right out of the box, from a single vendor renowned for great support, and all of it is available via native Git and the full ecosystem of Git tools developers know and love.
For example, Helix can handle any number of digital assets in addition to source code: any type of data, stored in files of any size. Helix supports tens of thousands of concurrent users pushing millions of transactions each day. And with its federated architecture, clustering, and high-availability options, it can automatically synchronise all that content to remote teams located around the world and keep users working even when hardware is down. This is possible only because the Helix Versioning Engine has had decades of careful development and tuning, all aimed at maximum scalability, performance, and reliability.
What’s more, Helix isn’t just a great Git solution for developers; it’s a much broader platform for all the stakeholders in the enterprise. It offers its own native distributed version control system (DVCS) features, granular security right down to the file level, locking features for digital assets that can’t be merged, collaboration and review, analytics and insight into the complete production pipeline, and more. The Helix Threat Detection component even leverages advanced behavioural analytics to detect, categorise, assess, and report potential risks to your IP before it walks out the door.
Conclusion
Choosing the right VCS is all about empowering your teams to build better products faster and more cheaply. Git may be free, but making Git work for the enterprise isn’t free, particularly in terms of the hidden costs associated with lost productivity, the complexity and burdens of various workarounds, and climbing the learning curves for it and its tools/extensions. Organisations should consider all the relevant factors before making a final decision. In short, if your VCS doesn’t easily and intrinsically address all the reasons why it’s needed, then it’s clearly not an ideal solution. You should look elsewhere.