Darcs vs Git: mathematician versus engineer

Nablaone summarizes (in Polish) today’s presentations at TechAula about distributed version control systems. Specifically, he describes the main difference between Darcs and Git as the difference between science and engineering:

Darcs represents what’s best in science: beautiful ideas and a loooooong wait for the reply, approaching infinity. Git, on the other hand, represents engineering: down-to-earth, mundane, hairy, duct-tape-driven architecture that responds within seconds.

The choice between darcs and git is simple. We study darcs, we use git. :-)

Having used both of these systems (Darcs extensively, though I have now mostly switched to Git), I can’t help but agree.

7 thoughts on “Darcs vs Git: mathematician versus engineer”

  1. Thanks for your translation of his summary! I’d be interested to learn more about your experiences mostly switching to git, and what darcs could learn from that.

    It doesn’t have to be this way, of course, this dichotomy between science and engineering. Good engineering can be built on good science, right?

    The darcs community is taking some time to optimise and work harder on the performance problems. It may take a while, but we think there are lots of things we can do to make life better for folks. We’re already starting to make a little bit of progress (http://lists.osuosl.org/pipermail/darcs-users/2008-December/016757.html), and we have some ideas (and prototype implementations) for the future.

    Thanks!

    • Of course good engineering may be based on good science, and most of the time it actually is, but on yesterday’s science, not tomorrow’s. An engineer’s goals are different from a scientist’s. A scientist wants to build an elegant model and explanation of things; an engineer wants a working, robust solution, not necessarily a beautiful one. If the working solution involves science, that’s good, because science explains why it works and how it will continue to work; but if science gives only a vague hint at the answers, and a few rolls of duct tape make things work, duct tape wins.

      As for my switch, I got bitten by darcs a few times. Its conceptual model and theory are just great¹, it has a cool and usable terminal UI, and a nice learning curve. I make some repos-branches, I push, pull, send and apply patches around, and it just does what I want it to, as long as I have a small repo with a small history and what I want to do is simple. When things get non-trivial, and I have already invested time and effort in using darcs in some project, the repo grows, the history grows, I may need to check in some binaries along with my text files (like images for a Web site, or ODT documents accompanying TeX-based documentation, and so on), and suddenly… poof! An exponential merge here, a mysteriously failing darcs apply there, a 90-minute checkout over LAN somewhere else: different bugs, in different places, occurring at the least predictable and most inconvenient moments. And this is not only my own experience.

      And there’s more. Darcs is monolithic and written in Haskell, which I don’t know well, and the Haskell environment is not as ubiquitous as Git’s C/Perl, or even Mercurial’s and bzr’s Python; I had a lot of trouble trying to get ghc working and compiling itself from packages on PLD Linux, and finally gave up and used the darcs binary. Git is scriptable in any language, including shell scripting, and exposes its guts in a very convenient way. I don’t think something like git-svn (which is a really great way to cooperate with those still trapped in centralised version control) would be easily done in darcs. Darcs doesn’t have a decent GUI, which makes it harder to work with non-geeks; git, besides the included Tk-based git-gui, has GitX. All of these are just details, but those details matter, especially when there are so many of them.
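      To illustrate the scriptability point, here is a minimal sketch (not from the original discussion) of driving git's plumbing from Python. It assumes `git` is on the PATH and relies on the default `git ls-tree` line layout of mode, type, object id, a tab, then the path. The names `TreeEntry`, `parse_ls_tree` and `ls_tree` are hypothetical helpers made up for this example.

```python
import subprocess
from typing import List, NamedTuple


class TreeEntry(NamedTuple):
    """One line of `git ls-tree` output (hypothetical helper type)."""
    mode: str
    kind: str
    sha: str
    path: str


def parse_ls_tree(output: str) -> List[TreeEntry]:
    """Parse lines shaped like '<mode> <type> <object>\\t<path>'.

    Assumption: paths are plain; git quotes unusual path names unless
    `-z` is used, and this sketch does not handle that case.
    """
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        meta, path = line.split("\t", 1)
        mode, kind, sha = meta.split(" ")
        entries.append(TreeEntry(mode, kind, sha, path))
    return entries


def ls_tree(rev: str = "HEAD", repo: str = ".") -> List[TreeEntry]:
    """Ask git for the top-level tree of `rev` and parse the result."""
    result = subprocess.run(
        ["git", "-C", repo, "ls-tree", rev],
        check=True, capture_output=True, text=True,
    )
    return parse_ls_tree(result.stdout)
```

      Calling `ls_tree()` inside a checkout returns the top-level entries of HEAD; the parsing is kept separate from the subprocess call so it can be exercised without a repository.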

      I hear from time to time that darcs is getting better; it probably is. I still tend to use it for smaller projects, or for those which I know won’t contain any binaries or a large number of files, or won’t undergo really heavy development (like, say, CL-OpenID). In bigger projects I use git; it has a steeper learning curve, it’s duct-tape-driven, it names things funny (index?), but it just works, scales well, fails early if it fails, has a GUI, and is scriptable. Darcs may be the future of source code management, but I have some sources to control right now.

      ¹ I can’t, however, stand the “quantum” thing; darcs’ theory itself is cool, but it is built upon operator calculus, which is only used by quantum mechanics and is actually linear algebra in the space of functions (or, in darcs’ case, of patches). Waving the “quantum patch theory” term around makes it seem like it was named this way only because it sounds cool, when a more accurate name would be “patch algebra” or “patch calculus”.
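      For readers unfamiliar with the “patch algebra” idea, the following toy sketch shows what commuting two patches means. It is an illustration under strong assumptions, not darcs' actual algorithm: patches here only insert single lines, so they always commute, whereas real darcs patches can depend on each other and fail to commute. `Insert` and `commute` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Insert:
    """Toy patch: insert `text` as a new line at 0-based index `pos`."""
    pos: int
    text: str

    def apply(self, lines: List[str]) -> List[str]:
        # Return a new list; the input is left untouched.
        return lines[:self.pos] + [self.text] + lines[self.pos:]


def commute(a: Insert, b: Insert) -> Tuple[Insert, Insert]:
    """Given `a` followed by `b`, return (b2, a2) such that applying
    b2 and then a2 produces the same file as applying a and then b."""
    if b.pos <= a.pos:
        # b lands at or before a's line, so a's line shifts down by one.
        return Insert(b.pos, b.text), Insert(a.pos + 1, a.text)
    # b lands after a's line; without a applied, b sits one line higher.
    return Insert(b.pos - 1, b.text), Insert(a.pos, a.text)


base = ["alpha", "beta", "gamma"]
a = Insert(1, "A-line")
b = Insert(3, "B-line")
b2, a2 = commute(a, b)

original_order = b.apply(a.apply(base))
commuted_order = a2.apply(b2.apply(base))
assert original_order == commuted_order
```

      The point of the sketch is only that the order of two independent changes can be swapped by adjusting their positions, which is the intuition behind patches that “float” rather than sit in a fixed history chain.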

  2. I’m sorry to hear you and others have been having so much trouble. Indeed, darcs is getting better and will continue to get better as we devote more of our energy to engineering issues (having worked so hard on improving the theory for darcs 2).

    I should begin by saying that darcs 2 (first released in 2008-04) is a much better darcs and is completely backward compatible. There are some things you can do to take advantage of it, things which subsequent releases of darcs now do automatically (so users don’t have to read documentation to find out about them). The rest of this comment is just advertising the many little things we have done and are continuing to do to make life better.

    First: darcs supports a new darcs-2 format (since 2008-04) that avoids the exponential merge problem for most cases. So we encourage people to use the darcs-2 format for any new repositories.

    Second: hashed formats (there is a hashed darcs 1 format for backward compatibility, and the darcs 2 format is always hashed) provide much better robustness (no mystery bugs).

    Third: using hashed formats (i.e. either with --hashed or using repositories created with the --darcs-2 format) plus the new cache feature (from 2008-04) allows for much faster checkouts (when you are retrieving a repository a second time, which I often do). It also allows for the new --lazy way of fetching repositories, which is quite fast and more robust than the old --partial flag (which we now discourage).

    Later releases of darcs do some of these things automatically. For example, since darcs 2.1 (2008-11) we create new repositories using the darcs 2 format by default. The 2009-01-15 release which we are planning will enable the global cache by default as well, so we hope that these practical things that few users may know about will start benefiting a wider audience.

    Darcs doesn’t have a decent GUI (or a free host) yet, and it may take a while for us to get there. But we are making steps in that direction. For example, the 2009-01-15 release of darcs will now expose all of its modules as a library. We don’t have an API yet, but we hope to evolve one over time. Meanwhile, we think this will help people who want to build 3rd party tools.

    I’m sorry to hear you’re having GHC trouble. I’m sure the darcs-users list or the #haskell channel would be delighted to help.

    Finally, we’ve noticed the links to quantum mechanics and are sorry for them. David, the inventor of darcs, is a physicist, and he couldn’t resist explaining darcs the way he did to his colleagues. We removed mentions of quantum mechanics from our documentation long ago.

    Thanks!

    • Thank you for the detailed response, and for debunking some of the myths I spread ;) I know most of these things are fixed; I’ve just been bitten by some of those bugs and limitations at the most inconvenient moments, and currently I’m a bit afraid that later in the life of a project I would run into some new bug that would require a PhD to even understand what causes it. That said, I still use darcs for smaller projects, and I haven’t totally given up on it. I find darcs’ model of floating patches much better than git’s explicit history chain; it is just that at this moment git wins as a practical tool for larger projects.

  3. Just for another perspective, I’ve been using darcs since 1.0, and have found the performance of it to be sufficiently good. By contrast, I find the duct-tape architecture of git regularly slows me down when I use it.

    My sense is that I would personally lose more time wrestling with git for my projects than I lose to darcs’ occasionally slower response time.

    • It may depend on usage; with darcs 1.x (maybe 2.x is better), I’ve already seen 90-minute checkouts caused by just one 2M binary file that actually belonged in the repo: the repo contained documentation, mainly in TeX, and the binary was one .odt document that was part of the same collection. For small projects and small repos darcs’ response times were and are good, but I was bitten by edge cases, and the deciding factor is not the speed itself, but the fear that darcs may unexpectedly fail me again when I have other problems to handle.

Comments are closed.