Group items tagged git

Daniel Jomphe

An introduction to git-svn for Subversion/SVK users and deserters - 0 views

  • This article is aimed at people who want to contribute to projects which are using Subversion as their code-wiki
  • Subversion users can skip SVK and move straight onto git-svn with this tutorial.
  • People who are responsible for Subversion servers and are converting them to git in order to lay them down to die are advised to consider the one-off git-svnimport, which is useful for bespoke conversions where you don't necessarily want to leave SVN/CVS/etc breadcrumbs behind. I'll mention bespoke conversions at the end of the tutorial, and the sort of thing that you end up doing with them.
  • ...77 more annotations...
  • A lot of this tutorial is dedicated to advocacy, sadly necessary. Those who would rather just cut to the chase will probably want to skip straight to
  • Another way of looking at it is to say that it's really a content-addressable filesystem, used to track directory trees.
  • we've got a simple and efficient filesystem which competes with RevML but is XML free
  • Subversion added nothing to CVS' development model.
  • Yes, it's a bunch of small programs that do one thing and do it well, get over it, they're being unified
  • There's also a pure Java implementation.
  • I used to push strongly for SVK, but got brow-beaten by people who were getting far more out of their version control system than I knew possible until I saw what they were talking about.
  • SVK could easily use git as a backing filesystem and drop the dependency on Subversion altogether. So could bzr or hg.
  • The repository model (see right) is also simple enough that there are complete git re-implementations you can draw upon, in a variety of languages.
  • git is first and foremost a toolkit for writing VCS systems
  • Writing a tool to do something that you want is often quite a simple matter of plugging together a few core commands. It's simple enough that once a few basic concepts are there, you begin to feel comfortable knowing that the repository just can't wedge, changes can be discarded yet not lost unless you request them to be cleaned up, etc.
  • I really haven't seen a nicer tool than gitk for browsing a repository.
  • gitk does some really cool things but is most useful when looking at projects that have cottoned onto feature branches (see feature branches, below). If you're looking at a project where everyone commits largely unrelated changes to one branch it just ends up a straight line, and not very interesting.
  • You can easily publish your changes for others who are switched on to git to pull. At a stretch, you can just throw the .git directory on an HTTP server somewhere and publish the path.
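      A minimal sketch of that (hypothetical host and paths; note that "dumb" HTTP serving also needs git-update-server-info to have been run in the repository, typically from the post-update hook):
        server$ cp -r project/.git /var/www/project.git
        server$ cd /var/www/project.git && git update-server-info
        client$ git clone http://www.example.com/project.git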
  • There's the git-daemon for more efficient serving of repositories (at least, in terms of network use), and gitweb.cgi to provide a visualisation of a git repository.
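      A sketch of serving a directory of repositories with git-daemon (hypothetical path):
        $ git daemon --base-path=/srv/git --export-all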
  • With Subversion, everyone has to commit their changes back to the central wiki, I mean repository, to share them.
  • With Git (actually this is completely true for other distributed systems), it's trivial to push and pull changes between each other. If what you're pulling has common history then git will just pull the differences.
  • If the person publishes their repository as described above, using the git-daemon(1), http or anything else that you can get your kernel to map to its VFS, then you can set it up as a "remote" and pull from it
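      For example (hypothetical names throughout):
        $ git remote add alice git://alice.example.org/project.git
        $ git fetch alice              # transfers only what you don't already have
        $ git merge alice/master       # or equivalently: git pull alice master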
  • Most people say "but I don't want branches". But users of darcs report that they never knew how much they really did want branches until darcs made it so easy. In essence every change can behave as a branch, and this isn't painful.
  • Because you can easily separate your repositories into stable branches, temporary branches, etc, then you can easily set up programs that only let commits through if they meet criteria of your choosing.
  • Because you can readily work on branches without affecting the stable branch, it is perfectly acceptable for a stable branch to be updated by a single maintainer only
  • Some repositories, for instance the Linux kernel, run a policy that no commit may break the build. What this means is that if you have a problem, you can use bisection to work out which patch introduced the bug.
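      A sketch of bisection in practice (v1.0 stands in for a known-good tag):
        $ git bisect start
        $ git bisect bad               # the current HEAD exhibits the bug
        $ git bisect good v1.0         # the last version known to work
        (build and test the commit git checks out, then repeat
         git bisect good / git bisect bad until the culprit commit is found)
        $ git bisect reset             # return to where you started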
  • You might use a continual integration server that is responsible for promoting branches to trunk should they pass the strictures that you set.
  • There is an awful lot less to keep in your head, and you don't have to do things like plan branching in advance.
  • Good feature branches mean you end up prototyping well-developed changes; the emphasis shifts away from making atomic commits. If you forgot to add a file, or made some other little mistake, it's easy to go back and change it. If you haven't even pushed your changes anywhere, that's not only fine, but appreciated by everyone involved. Review and revise before you push is the counter-balance to frequent commits.
  • Not only is the implementation fast locally, it's very network efficient, and the protocol for exchanging revisions is also very good at figuring out what needs to be transferred quickly. This is a huge difference - one repository hosted on Debian's Alioth SVN server took 2 days to synchronise because the protocol is so chatty. Now it fits in 3 megs and would not take that long to synchronise over a 150 baud modem.
  • Disk might be cheap, but my /home is always full - git has a separate step for compacting repositories, which means that delta compression can be far more effective. If you're a compression buff, think of it as having an arbitrarily sized window, because when delta compressing git is able to match strings anywhere else in the repository - not just the file which is the notional ancestor of the new revision. This space efficiency affects everything - the virtual memory footprint in your buffercache while mining information from the repository, how much data needs to be transferred during "push" and "pull" operations, and so on. Compare that to Subversion, which even when merging between branches is incapable of using the same space for the changes hitting the target branch. The results speak for themselves - I have observed an average of 10 to 1 space savings going from Subversion FSFS to git.
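      The compaction step itself, as a sketch:
        $ git gc                       # pack loose objects and prune unreachable cruft
        $ git repack -a -d             # repack everything into one delta-compressed pack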
  • Perhaps somebody has already made a conversion of the project and put it somewhere
  • But people who use git are used to treating their repositories as a revision data warehouse which they use to mine useful information when they are trying to understand a codebase.
  • importing the whole repository from Subversion
  • If you like, you can skip early revisions using the -r option to git-fetch.
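      For example, to skip ancient history (hypothetical URL and revision number):
        $ git svn clone -r 12000:HEAD http://svn.example.org/repo/trunk project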
  • make a local branch for development
  • The name "foo" is completely private; it's just a local name you're assigning to the piece of work you're doing. Eventually you will learn to group related commits onto branches, called "topic branches", as described in the introduction.
  • Say you want to take a project, and work on it somewhere else in a different direction, you can just make a copy using cp or your favourite file manager. Contrast this with Subversion, where you have to fiddle around with branches/ paths, svn cp, svn switch, etc
  • Each of those copies is fully independent, and can diverge freely. You can easily push and pull changes between them without tearing your hair out.
  • Each time you have a new idea, make a new branch and work in that.
  • git svn init
    • Daniel Jomphe
       
      I used git svn clone instead (with the same parameters)
  • git svn fetch
    • Daniel Jomphe
       
      this wasn't needed b/c of clone instead of init
  • But anyway, that copying was too slow and heavy. We don't want to copy 70MB each time we want to work on a new idea. We want to create new branches at the drop of a hat. Maybe you don't want to copy the actual repository, just make another checkout. We can use git-clone again
  • The -l option to git-clone told git to hardlink the objects together, so not only are these two sharing the same repository but they can still be moved around independently. Cool. I now have two checkouts I can work with, build software in, etc.
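      The shape of the command (hypothetical directory names):
        $ git clone -l project project-idea2   # -l hardlinks the objects instead of copying them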
  • But all that's a lot of work and most of the time I don't care to create lots of different directories for all my branches. I can just make a new branch and switch to it immediately with git-checkout:
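      For example (hypothetical branch name):
        $ git checkout -b new-idea     # create the branch and switch to it in one step
        $ git branch                   # the asterisk marks the branch you are on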
  • Once you have some edits you want to commit, you can use git-commit to commit them. Nothing (not even file changes) gets committed by default; you'll probably find yourself using git-commit -a to get similar semantics to svn commit.
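      A minimal sketch (hypothetical file name); note that brand-new files still need an explicit add, since -a only picks up modifications to files git already tracks:
        $ git add new-file.c
        $ git commit -a -m "describe the change"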
  • There is also a GUI for preparing commits in early (but entirely functional) stages of development.
  • People used to darcs or SVK's interactive commit will like to try git add -i
  • correcting changes in your local branch
  • If it's the top commit, you can just add --amend to your regular git-commit command to, well, amend the last commit. If you explored the git-gui interface, you might have noticed the "Amend Last Commit" switch as well.
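      For example, folding a forgotten file into the last commit (hypothetical file name):
        $ git add forgotten-file.c
        $ git commit --amend           # re-records the last commit with the extra staged change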
  • You can also uncommit. The command for this is git-reset
  • HEAD~1 is a special syntax that means "one commit before the reference called HEAD". HEAD^ is a slightly shorter shorthand for the same thing. I could have also put a complete revision number, a partial (non-ambiguous) revision number, or something like remotes/trunk. See git-rev-parse(1) for the full list of ways in which you can specify revisions.
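      A sketch of the common git-reset variants:
        $ git reset --soft HEAD~1      # uncommit, keeping the changes staged
        $ git reset HEAD~1             # uncommit, keeping the changes in the working tree
        $ git reset --hard HEAD~1      # drop the commit and its changes entirely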
  • I sometimes write commands like `gitk --all $(git-fsck | awk '/dangling commit/ {print $3}')` to see all the commits in the repository, not just the ones with "post-it notes" (aka references) stuck to them.
  • In this scenario, we'll assume that what I'm currently working on isn't finished, either - and I don't want to have to finish it first. It's not ready. I'm just going to call it "WIP".
  • "Another" way to revise commits is to make a branch from the point a few commits ago, then make a new series of commits that is revised in the way that you want. This is the same scenario as before.
  • I've introduced a new command there - git-cherry-pick. This takes a commit and tries to copy its changes to the branch you've currently got checked out. This technique is called rebasing commits. There is also a git-rebase command which probably would have been fewer commands than the above. But that's my way.
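      A sketch of the cherry-pick route (branch name and commit ids are hypothetical):
        $ git checkout -b revised HEAD~3     # start a branch from three commits back
        $ git cherry-pick a1b2c3d            # copy over each commit you want, in order
        $ git cherry-pick e4f5a6b
      or, letting git do the bookkeeping in one go:
        $ git rebase -i HEAD~3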
  • Using Git opens the door to a bazaar of VCS tools rather than sacrificing your projects at the altar of one.
  • keep your local branch up to date with Subversion
  • The recommended way to do this for people familiar with Subversion is to use git-svn rebase.
  • Note: before you do this, you should have a "clean" working tree - no local uncommitted changes. You can use git-stash (git 1.5.3+) to hide away local uncommitted changes for later.
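      The typical sequence then looks like this (a sketch):
        $ git stash                    # set uncommitted changes aside
        $ git svn rebase               # replay local commits onto the latest SVN revisions
        $ git stash apply              # bring the uncommitted changes back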
  • This command is doing something similar to the above commands that used git-cherry-pick; it's copying the changes from one point on the revision tree to another
  • Better still is to bunch up your in-progress working copy changes into a set of unfinished commits, using git add -i (or git-gui / git-citool). Then try the rebase. You'll end up this time with more commits on top of the SVN tree than just one, so using Stacked Git you can "stg uncommit -n 4" (if you broke your changes into 4 commits), then use "stg pop" / "stg push" to wind around the stack (as well as "stg refresh" when finished making changes) to finish them - see
  • in my experience stg is the best tool for rebasing
  • Once you grok that, you'll only need to use stg and git-svn fetch.
  • Ok, so you've already gone and made the commits locally that you wanted to publish back to the Subversion server. Perhaps you've even made a collection of changes, revising each change to be clearly understandable, making a single small change well such that the entire series of changes can be easily reviewed by your fellow project contributors. It is now time to publish your changes back to Subversion. The command to use is git svn dcommit. The d stands for delta
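      A minimal sketch of publishing back to Subversion:
        $ git svn rebase               # pick up any new SVN revisions first
        $ git svn dcommit              # replay each local commit as an SVN commit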
  • git-svn won't let the server merge revisions on the fly; if there were updates since you fetched / rebased, you'll have to do that again.
  • People are not used to this, thinking somehow that if somebody commits something to file A, then somebody else commits something to file B, the server should make a merged version with both changes, despite neither of the people committing actually having a working tree with both changes. This suffers from the same fundamental problem that darcs' patch calculus does - that just because patches apply 'cleanly' does not imply that they make sense - such a decision can only be automatically made with a dedicated continual integration (smoke) server.
  • This is normally what I use in preference to rebase.
  • This will merge all the commits that aren't in your ancestry, but are in the ancestry of the branch trunk (try setting rightmost drop-down in gitk to 'ancestor' and clicking around to get a feel for what this means), and make a new commit which has two parents - your old HEAD, and whatever commit trunk is up to.
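      For example:
        $ git svn fetch                # update remotes/trunk from the Subversion server
        $ git merge trunk              # new commit with two parents: your HEAD and trunk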
  • there are many shortcomings in git.
  • Sadly, this model is in use by virtually every Subversion hosted project out there. And that is going to be hard to undo.
  • [Two figures: left, what darcs thinks when you start committing without marking tag points; right, Subversion has a somewhat smaller brain...]
  • It is possible to use git in this way (see the figure to the right) - but it's not trivial, and not default. In fact git itself is developed in this way, using feature branches, aka topic branches.
  • bzr comes with some great utilities like the Patch Queue Manager which helps show you your feature branches. With PQM, you just create a branch with a description of what you're trying to do, make it work against the version that you branched off, and then you're done. The branch can be updated to reflect changes in trunk, and eventually merged and closed.
  • Windows support is good. Consistent implementation. Experience with the distributed development model. Friendly and approachable author and core team.
  • Actually the models of git and bzr are similar enough that bzr could be fitted atop of the git repository model
  • Mercurial is missing the lightweight branches that make git so powerful, and there is no content hashing, so it doesn't really do the whole "revision protocol" thing like git.
  • If you're on Windows it's probably a lot easier to get going.
  • git-svn fetch
    • Daniel Jomphe
       
      was not needed because I used git svn clone
  • Good at introducing concepts; not really great examples
Daniel Jomphe

The Thing About Git - 0 views

  • Version control systems have traditionally required a lot of up-front planning followed by constant interaction to get changes to the right place at the right time and in the right order. And woe unto thee if a rule is broken somewhere along the way, or you change your mind about something, or you just want to fix this one thing real quick before having to commit all the other crap in your working copy.
  • You can work on five separate logical changes in your working copy – without interacting with the VCS at all – and then build up a series of commits in one fell swoop. Or, you can take the opposite extreme and commit really frequently and mindlessly, returning later to rearrange commits, annotate log messages, squash commits together, tease them apart, or rip stuff out completely. It’s up to you, really. Git doesn’t have an opinion on the matter.
  • I’ve personally settled into a development style where coding and interacting with version control are distinctly separate activities. I no longer find myself constantly weaving in and out due to the finicky workflow rules demanded by the VCS. When I’m coding, I’m coding. Period. Version control - out of my head. When I feel the need to organize code into logical pieces and write about it, I switch into version control mode and go at it. I’m not saying this is the Right Way to use Git: in the end, it all goes to the same place. I’m saying that this is the way I seem naturally inclined to develop software, and Git is the first VCS I’ve used that accommodates the style.
  • ...20 more annotations...
  • Taking Control of Your Local Workflow
  • Git means never having to say, “you should have”
  • The big problem here is models.rb - it’s “tangled” in the sense that it includes modifications from two different logical changes. I need to tease these changes apart into two separate commits, somehow. This is the type of situation that occurs fairly regularly (to me, at least) and that very few VCS’s are capable of helping out with. We’ll call it, “The Tangled Working Copy Problem.”
  • The Index is also sometimes referred to as The Staging Area, which makes for a much better conceptual label in this case. I tend to think of it as the next patch: you build it up interactively with changes from your working copy and can later review and revise it. When you're happy with what you have lined up in the staging area, which basically amounts to a diff, you commit it. And because your commits are no longer bound directly to what’s in your working copy, you're free to stage individual pieces on a file-by-file, hunk-by-hunk basis. Once you've wrapped your head around it, this seemingly simple and poorly named layer of goo between your working copy and the next commit can have some really magnificent implications on the way you develop software.
  • We want to commit all of the changes to synchronize-bookmarks and some of the changes to models.rb, so let’s add them to the staging area:
  • $ git add bin/synchronize-bookmarks
  • $ git add --patch models.rb
  • Stage this hunk [y/n/a/d/j/J/?]?
  • I run into The Tangled Working Copy Problem so often that I've devised a manual process for dealing with it under VCS’s that punt on the problem. For instance, if I were using Subversion, I might go at it like this:
  • The magic is in the --patch argument to git-add(1). This instructs Git to display all changes to the files specified on a hunk-by-hunk basis and lets you choose one of the following options for each hunk:
      y – stage this hunk
      n – do not stage this hunk
      a – stage this and all the remaining hunks in the file
      d – do not stage this hunk nor any of the remaining hunks in the file
      j – leave this hunk undecided, see next undecided hunk
      J – leave this hunk undecided, see next hunk
      k – leave this hunk undecided, see previous undecided hunk
      K – leave this hunk undecided, see previous hunk
      s – split the current hunk into smaller hunks
  • I like to review that the changes in the staging area match my expectations before committing: $ git diff --cached [diff of changes in staging area]
  • I also like to verify that my unstaged / working copy changes are as I expect: $ git diff [diff of changes in working copy that are not in the staging area]
  • Everything looks good, so I commit the staged changes: $ git commit -m "fix bookmark sucking problems"
  • git add --patch is actually a shortcut to features in git add --interactive, a powerful front-end for managing all aspects of the staging area. The git-add(1) manual page is a treasure trove of worthwhile information that’s often passed over due to the traditional semantics of VCS “add” commands. Remember that git-add(1) does a lot more than just add stuff – it’s your interface for modifying the staging area.
  • git commit --amend takes the changes staged in the index and squashes them into the previous commit. This lets you fix a problem with the last commit, which is almost always where you see the technique prescribed, but it also opens up the option of a commit-heavy workflow where you continuously revise and annotate whatever it is you're working on. See the git-commit(1) manual page for more on this.
  • And then there’s git rebase --interactive, which is a bit like git commit --amend hopped up on acid and holding a chainsaw – completely insane and quite dangerous but capable of exposing entirely new states of mind. Here you can edit, squash, reorder, tease apart, and annotate existing commits in a way that’s easier and more intuitive than it ought to be. The “INTERACTIVE MODE” section of the git-rebase(1) manual page is instructive but Pierre Habouzit’s demonstration is what flipped the light on for me.
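      A taste of it (the depth is arbitrary): running
        $ git rebase -i HEAD~5
      opens your editor with the last five commits listed, each prefixed with an action: pick keeps a commit as-is, edit stops so you can revise it, squash folds it into the previous one. Reordering the lines reorders the commits.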
  • There’s a section of the Git User’s Manual called The Workflow that describes, at a fairly low level, the various interactions between the working copy, the index, and the object database.
Daniel Jomphe

Setting up your Git repositories for open source projects at GitHub « Insoshi... - 0 views

  • In setting up the repositories for Insoshi, I’ve applied the version control experience I gained at Discover, where I was technical lead for the software configuration management (SCM) team.
  • Except for that interaction, everyone works within their own repository and on their own schedule. There’s no process waiting to be completed that blocks you from moving on to whatever you need/want to do next. And you’re not forcing anyone to drop what they’re doing right now to handle your request.
  • One of the major benefits of a distributed version control system like Git is that each repository is on an equal footing; in particular, we would like every fork to have the same master branch, so that if the “official” Insoshi repository should ever be lost there would be plenty of redundant backups.
  • ...14 more annotations...
  • it’s a bad idea in general to work on the master branch; experienced Git users typically work on separate development branches and then merge those branches into master when they’re done
  • Your local repository: The “right” way. Keeping the big picture in mind, here are the commands I’ve run to set up my local repository (using the GitHub id long):
      $ git clone git://github.com/insoshi/insoshi.git
      $ cd insoshi
      $ git branch --track edge origin/edge
      $ git branch long edge
      $ git checkout long
      $ git remote add long git@github.com:long/insoshi.git
      $ git fetch long
      $ git push long long:refs/heads/long
      $ git config branch.long.remote long
      $ git config branch.long.merge refs/heads/long
  • You should note that the Git URL for the clone references the official Insoshi repository and not the URL of my own fork
  • Insoshi also has an ‘edge’ branch for changes that we want to make public but may require a bit more polishing before we’d consider them production-ready (in the past this has included migrating to Rails 2.1 and Sphinx/Ultrasphinx).  Our typical development lifecycle looks something like development -> edge -> master
  • I’m resisting the temptation to immediately start working on the local ‘master’ and ‘edge’ branches. I want to keep those in sync with the official Insoshi repository. I’ll keep my changes separate by creating a new branch ‘long’ that’s based off edge and checking it out
  • I’m starting my changes off of ‘edge’ since that contains all the latest updates and any contribution I submit a pull request for will be merged first into the official Insoshi ‘edge’ branch to allow for public testing before it’s merged into the ‘master’.
  • I’m finally adding the remote reference to my fork on GitHub
  • We should run a fetch immediately in order to sync up the local repository with the fork
  • I’m pushing up my new local branch up to my fork. Since it’ll be a new branch on the remote end, I need to fully specify the remote refspec
  • Now that the new branch is up on my fork, I want to set the branch configuration to track it
  • Setting the remote lets me just simply use $ git push to push changes on my development branch up to my fork
  • I’ve got a shell script for you.
  • The extra work is worth the effort, because with this configuration: (1) my changes will be easily identifiable in my named branch; (2) I can easily get updates from the main Insoshi repository; (3) any updates I’ve pulled into master and edge are automatically pushed up to my fork on GitHub. The last one is a bonus because the default refspec for remotes is refs/heads/*:refs/heads/*. This means that the simple ‘git push’ command will push up changes for all local branches that have a matching branch on the remote. And if I make it a point to pull in updates to my local master and edge but not work directly on them, my fork will match up with the official repository.
  • So what is the benefit of all this to open source projects like Insoshi? (1) The easier it is for the contributor to pull in updates, the more likely it will be that the pull request will be for code that merges easily with the latest releases (with few conflicts). (2) You can tell if someone is pulling updates by looking at their master and edge branches and seeing if they match up with the latest branches on the main repository. (3) By getting contributors in the habit of working on branches, you’re going to get better organized code contributions. Basically, the less effort that’s required to bring in code via a pull request, the sooner it can be added to the project release. And at the end of the day, that’s really what it’s all about.
Daniel Jomphe

'Re: clarification on git, central repositories and commit access lists' - MARC - 0 views

  • I certainly agree that almost any project will want a "central" repository in the sense that you want to have one canonical default source base that people think of as the "primary" source base. But that should not be a *technical* distinction, it should be a *social* one, if you see what I mean.
  • Release management: you often want the central "development" repository to be totally separate from the release management tree. Yes, you approximate that with branches
  • Yes, you can branch in a truly centralized model too, but it's generally a "big issue"
  • ...20 more annotations...
  • the branches are globally visible
  • The other problem is the "permission from maintainers" thing: I have an ego the size of a small planet, but I'm not _always_ right, and in that kind of situation it would be a total disaster if everybody had to ask for my permission to create a branch to do some re-architecting work.
  • the "globally visible" part means that if you're not sure this makes sense, you're much less likely to begin a branch - even if it's cheap, it's still something that everybody else will see, and as such you can't really do "throwaway" development that way.
  • So you absolutely need *private* branches, that can become "central" for the people involved in some re-architecting, even if they never ever show up in the "truly central" repository.
  • and you need permission from the maintainers of the centralized model too.
  • So it's not strictly true that there is a single "central" one, even if you ignore the stable tree (or the vendor trees). There are subsystems that end up working with each other even before they hit the central tree - but you are right that most people don't even see it. Again, it's the difference between a technical limitation, and a social rule: people use multiple trees for development, but because it's easier for everybody to have one default tree, that's obviously what most people who aren't actively developing do.
  • I'm literally talking about things like some people wanting to use the "stable" tree, and not my tree at all, or the vendor trees. And they are obviously *connected*, but it doesn't have to be a totally central notion at all.
  • There are lots of kernel subprojects that are used by developers - exactly so that if you report a bug against a particular driver or subsystem, the developer can tell you to test an experimental branch that may fix it.
  • In the KDE group, for example, there really is no reason why the people who work on one particular application should ever use the same "central" repository as the people who work on another app do. You'd have a *separate* group (that probably also maintains some central part like the kdelibs stuff) that might be in charge of *integrating* it all, and that integration/core group might be seen to outsiders as the "one central repository", but to the actual application developers, that may actually be pretty secondary, and as with the kernel, they may maintain their own trees at places like ftp.linux-mips.org - and then just ask the core people to pull from them when they are reasonably ready. See? There's really no more "one central place" any more. To the casual observer, it *looks* like one central place (since casual users would always go for the core/integration tree), but the developers themselves would know better. If you wanted to develop some bleeding edge koffice stuff, you'd use *that* tree - and it might not have been merged into the core tree yet, because it might be really buggy at the moment!
  • This is one of the big advantages of true distribution: you can have that kind of "central" tree that does integration, but it doesn't actually have to integrate the development "as it happens". In fact, it really really shouldn't. If you look at my merges, for example, when I merge big changes from somebody else who actually maintains them in a git tree, they will have often been done much earlier, and be a series of changes, and I only merge when they are "ready". So the core/central people should generally not necessarily even do any real development at all: the tree that people see as the "one tree" is really mostly just an integration thing. When the koffice/kdelibs/whatever people decide that they are ready and stable, they can tell the integration group to pull their changes. There's obviously going to be overlap between developers/integrators (hopefully a *lot* of overlap), but it doesn't have to be that way (for example, I personally do almost *only* integration, and very little serious development).
  • Yes, you want a central build-bot and commit mailing list. But you don't necessarily want just *one* central build-bot and commit mailing list. There's absolutely no reason why everybody would be interested in some random part of the tree (say, kwin), and there's no reason why the people who really only do kwin stuff should have to listen to everybody elses work. They may well want to have their *own* build-bot and commit mailing list! So making one central one is certainly not a mistake, but making *only* a central one is. Why shouldn't the groups that do specialized work have specialized test-farms? The kernel does. The NFS stuff, for example, tends to have its own test infrastructure.
  • So we do commit mailing lists from kernel.org, but (a) that doesn't mean that everything else should be done from that central site and (b) it also doesn't mean that subprojects shouldn't do their *own* commit mailing lists. In fact, there's a "gitstat" project (which tracks the kernel, but it's designed to be available for *any* git project), and you can see an example of it in action at http://tree.celinuxforum.org/gitstat
  • So centralized is not at all always good. Quite the reverse: having distributed services allows *specialized* services, and it also allows the above kind of experimental stuff that does some (fairly simple, but maybe it will expand) data-mining on the project!
  • So I do disagree, but only in the sense that there's a big difference between "a central place that people can go to" and "ONLY ONE central place". See? Distribution doesn't mean that you cannot have central places - but it means that you can have *different* central places for different things. You'd generally have one central place for "default" things (kde.org), but other central places for more specific or specialized services! And whether it's specialized by project, or by things like the above "special statistics" kind of thing, or by usage, is another matter! For example, maybe you have kde.org as the "default central place", but then some subgroup that specializes in mobility and small-memory-footprint issues might use something like kde.mobile.org as _their_ central site, and then developers would occasionally merge stuff (hopefully both ways!)
  • different sub-parts of the kernel really do use their own trees, and their own mailing lists. You, as a KDE developer, would generally never care about it, so you only _see_ the main one.
  • You don't see how those lieutenants have their own development trees, and while the kernel is fairly modular (so the different development trees seldom have to interact with each others), they *do* interact. We've had the SCSI development tree interact with the "block layer" development tree, and all you ever see is the end result in my tree, but the fact is, the development happened entirely *outside* my tree. The networking parts, for example, merge the crypto changes, and I then merge the end result of the crypto _and_ network changes. Or take the powerpc people: they actually merge their basic architecture stuff to me, but their network driver stuff goes through Jeff Garzik - and you as a user never even realize that there was another "central" tree for network driver development, because you would never use it unless you had reported a bug to Jeff, and Jeff might have sent you a patch for it, or alternatively he might have asked if you were a git user, and if so, please pull from his 'e1000e' branch.
  • The fact that anybody can create a branch without me having to know about it or care about it is a big issue to me: I think it keeps me honest. Basically, the fundamental tool we use for the kernel makes sure that if I'm not doing a good job, anybody else can show people that they do a better job, and nobody is really "inconvenienced". Compare that to some centralized model, and something like the gcc/egcs fork: the centralized model made the fork so painful that it became a huge political fight, instead of just becoming an issue of "we can do this better"!
  • you can use your old model if you want to. git doesn't *force* you to change. But trust me, once you start noticing how different groups can have their own experimental branches, and can ask people to test stuff that isn't ready for mainline yet, you'll see what the big deal is all about. Centralized _works_. It's just *inferior*.
  • you do a single commit in each submodule that is atomic to that *private* copy of that submodule (and nobody will ever see it on its own, since you'd not push it out), and then in the supermodule you make *another* commit that updates the supermodule to all the changes in each submodule. See? It's totally atomic.
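      A sketch of that two-step commit (hypothetical paths):
        $ cd lib/submodule
        $ git commit -a -m "fix inside submodule"      # atomic to this private copy
        $ cd ../..
        $ git add lib/submodule                        # record the submodule's new commit id
        $ git commit -m "update supermodule to the new submodule state"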
  • Git actually does perform fairly well even for huge repositories (I fixed a few nasty problems with 100,000+ file repos just a week ago), so if you absolutely *have* to, you can consider the KDE repos to be just one single git repository, but that unquestionably will perform worse for some things (notably, "git annotate/blame" and friends). But what's probably worse, a single large repository will force everybody to always download the whole thing. That does not necessarily mean the whole *history* - git does support the notion of "shallow clones" that just download part of the history - but since git at a very fundamental level tracks the whole tree, it forces you to download the whole "width" of the tree, and you cannot say "I want just the kdelibs part".
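      For example (hypothetical URL):
        $ git clone --depth 1 git://git.example.org/huge-project.git   # truncated history, but still the whole tree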
Daniel Jomphe

Some thoughts on Git vs complexity | Javalobby - 0 views

  • Git is "simple" but hard. Subversion is "easy", but eventually complex.Git is *a lot* of features in one tool (think of the 100+ git plumbing commands). Each feature is simple, but learning to use them together is good bit of work. As soon as you've understood the model and you get that Eureka-moment, the tool never fails you, and you find it more and more fun to use the more you learn.
  • Git, on the other hand, is simple: It all boils down to three kinds of objects in a graph (commits, blobs, trees), and the rest of it is algorithms that work on this data structure.
Daniel Jomphe

'Re: clarification on git, central repositories and commit access lists' - MARC - 0 views

  • Btw, to see this in another light: as an example of a git tree that merges those same branches, but *before* they are ready, just look at the -mm tree. Now, Andrew actually ends up exposing the end result not as a git tree, but as patches, but what he actually *does* is to:
      - get my git tree
      - merge in about 30-40 other git trees from other developers (not all of which necessarily have actual development on them at any particular time)
      - then merge in his own patch list
      - expose it all as the -mm patch series
    So this is an example of how you actually have a totally separate, and still fairly central (the -mm tree is certainly not unknown outside of the core developer circles) tree, and where git is a big part in making a central "experimental" tree that is separate from my own central "development" tree. Also, it's an example of why centralization is bad: different people and entities have different intents. You could *not* reasonably do something like this with a centralized SCM like SVN.
Daniel Jomphe

'Re: clarification on git, central repositories and commit access lists' - MARC - 0 views

  • With that I mean that in KWord I tend to work on a feature by first writing a unit test and committing that. After that I start to implement the feature until it actually passes the unit test and after that I add UIs etc etc etc. In other words; it takes me a week with 30 commits before I finish this new feature. And finish naturally doesn't mean bug-free. During this time I will surely find bugs in other pieces of code, or simple little features to add there. I commit those separately. All the above goes into one git tree and depending on how much I work with others on the features I publish that git tree. But the small fixes will be committed to the 'release' tree (aka svn) as soon as possible. At the end of the week when my feature is 'done' I will also push that up to the release tree. So, what IMHO you, as an svn user, will end up with is a trunk that doesn't have half-finished features that mess everything up. You will still see the current development (mostly) but not the dirty work-in-progress-excuse-the-mess versions. As an example; in Krita we have this excellent GSoC project for color mixing, the author programs in trunk and thus commits everything there. We have had a couple of times when his commits made it hard for others to try out his work. I.e. it was quite broken. I'm all for experimentation so I'm not complaining. But at the same time I do see it as a great opportunity for Git where I can imagine him committing his work-in-progress that is known to create regressions and publishing that for other interested people to see. And only after a week of hacking commit his updated version to the release tree so everyone can enjoy it, leaving the release tree free from major regressions.
  • And I believe that Git actually helps by allowing others to see a more representative version of the software instead of one that is constantly in flux.
  • All the workflows you are used to are still there, there just are more workflow possibilities and thus more ways to get productive. So, I really don't think it is in any way an extra barrier. It actually tears down several barriers.
Daniel Jomphe

'Re: clarification on git, central repositories and commit access lists' - MARC - 0 views

  • Well, it is working in the sense that we can use git without disrupting things for those who wish to continue working with svn. No, it isn't optimal for those who wish to completely convert to using git, but it is better than no git at all I guess.
Daniel Jomphe

Git changes how developers manage source code versions - SD Times On The Web - 0 views

  • It decouples the notion of checkpointing from publishing. In Subversion, those are the same thing
  • Git lets developers experiment with the whole of a project, without worrying about breaking things or losing work
  • With Git running locally, developers push commits whenever they feel like and can fork and experiment to their hearts’ content. Those changes won't bother anyone “until you share [them],” said Vilain. “You share at the end of the day, after a day's work, when you've gone over the code a bit more.”
  • ...1 more annotation...
  • “The reason that most of us use it is that it helps productivity,” said Scott Chacon, a developer at GitHub. “It provides cheap local branching. It works for really big repositories and really small repositories. And it’s free, and it’s easy to get.”
Daniel Jomphe

'Re: clarification on git, central repositories and commit access lists' - MARC - 0 views

  • Another option is to look at git-svnserver which would allow a git repository backbone, but could talk svn over the wire which these tools could use...
Olivier Valuet

ONJava.com -- Sussman on DVCS, Van Zyl using GIT+SVN - 0 views

  • Sussman cautions that the world isn't ready for wide distributed version control adoption while a few high profile open source projects move towards adopting it. Maven is trying out GIT for external collaboration, and Sun is deploying Mercurial. What do you think?
Daniel Jomphe

'Re: clarification on git, central repositories and commit access lists' - MARC - 0 views

  • The biggest difference is that with git (or any other SCM that can _actually_ merge) multiple people can have their branches and merge with each other and after merging "annotate" and the like still work. That is not the case with svn and svnmerge.
Daniel Jomphe

Linus Torvalds: I Have Never Really Talked To Microsoft! - 0 views

  • I think it is much improved over the early drafts, and I don't think it's a horrible licence. I just don't think it's the same kind of 'great' licence that the GPLv2 is. So in the absence of the GPLv2, I could see myself using the GPLv3. But since I have a better choice, why should I? That said, I try to always be pragmatic, and the fact that I think the GPLv3 is not as good a licence as the GPLv2 is not a 'black and white' question. It's a balancing act. And if there are other advantages to the GPLv3, maybe those other advantages would be big enough to tilt the balance in favour of the GPLv3. Quite frankly, I don't really see any, but if Solaris really is to be released under the GPLv3, maybe the advantage of avoiding unnecessary non-compatible licence issues could be enough of an advantage that it might be worth trying to re-license the Linux kernel under the GPLv3 too. Don't get me wrong -- I think it's unlikely. But I do want to make it clear that I'm not a licence bigot, per se. I think the GPLv2 is clearly the better licence, but licences aren't everything. After all, I use a lot of programs that are under other licences. I might not put a project I start myself under the BSD (or the X11-MIT) licence, but I think it's a great licence, and for other projects it may well be the right one.
  • I like making strong statements, because I find the discussion interesting. In other words, I actually tend to 'like' arguing. Not mindlessly, but I certainly tend to prefer the discussion a bit more heated, and not just entirely platonic. And making strong arguments occasionally ends up resulting in a very valid rebuttal, and then I'll happily say: "Oh, ok, you're right." But no, that didn't happen on SVN/CVS. I suspect a lot of people really don't much like CVS, so I didn't really even expect anybody to argue that CVS was really anything but a legacy system. And while I've gotten a few people who argued that I shouldn't have been quite so impolite against SVN (and hey, that's fair -- I'm really not a very polite person!), I don't think anybody actually argued that SVN was 'good'. SVN is, I think, a classic case of 'good enough'. It's what people are used to, and it's 'good enough' to be used fairly widely, but it's good enough in exactly the sense DOS and Windows were 'good enough'. Not great technology, just very widely available, and it works well enough for people and looks familiar enough that people use it. But very few people are 'proud' of it, or excited about it. Git, on the other hand, has some of the 'UNIX philosophy' behind it. Not that it is about UNIX, per se, but like original UNIX, it had a fundamental idea behind it. For UNIX, the underlying philosophy was/is that "Everything is a file." For git, it's "Everything is just an object in the content-addressable database."
  • I think there's both a licence issue, and a community and personality issue. The BSD licences always encouraged forking, but also meant that if somebody gets really successful and makes a commercial fork, you cannot necessarily join back. And so even if that doesn't actually happen (and it did, in the BSD cases -- with BSDi), people can't really 'trust' each other as much. In contrast, the GPLv2 also encourages forking, but it not only encourages the branching off part, it also encourages (and 'requires') the ability to merge back again. So now you have a whole new level of trust: you 'know' that everybody involved will be bound by the licence, and won't try to take advantage of you. So I see the GPLv2 as the licence that allows people the maximum possible freedom within the requirement that you can always join back together again from either side. Nobody can stop you from taking the improvements to the source code.
  • ...2 more annotations...
  • So is the BSD licence even more 'free'? Yes. Unquestionably. But I just wouldn't want to use the BSD licence for any project I care about, because I not only want the freedom, I also want the trust so that I can always use the code that others write for my projects. So to me, the GPLv2 ends up being a wonderful balance of 'as free as you can make it', considering that I do want everybody to be able to trust so that they can always get the source code and use it. Which is why I think the GPLv3 ends up being a much less interesting licence. It's no longer about that trust about "getting the source code back"; it has degenerated into a "I wrote the code, so I should be able to control how you use it." In other words, I just think the GPLv3 is too petty and selfish. I think the GPLv2 has a great balance between 'freedom' and 'trust'. It's not as free as the BSD licences are, but it gives you peace of mind in return, and matches what I consider 'tit-for-tat': I give source code, you give me source code in return. The GPLv3 tries to control the 'use' of that source code. Now it's, "I give you my source code, so if you use it, you'd better make your devices hackable by me." See? Petty and small-minded, in my opinion.
  • I don't really believe in the 'browser OS', because I think that people will always want to do some things locally. It might be about security, or simply about privacy reasons. And while connectivity is widely available, it certainly isn't 'everywhere'. So I think the whole 'Web OS' certainly is part of the truth, but another part that people seem to dismiss is that operating systems have been around for decades, and it's really a fairly stable and well-known area of endeavour. People really shouldn't expect the OS to magically change: it's not like people were 'stupid' back in the 60s either, or even that hardware was 'that' fundamentally different back then! So don't expect a revolution. I think OSs will largely continue to do what they do, and while we'll certainly evolve, I don't think they'll change radically. What may change radically are the interfaces and the things you do on top of the OS (and certainly the hardware beneath the OS will continue to evolve too), and that's what people obviously care about. The OS? It's just that hidden thing that makes it all possible. You really shouldn't care about it, unless you find it very interesting to know what is really going on in the machine.
Daniel Jomphe

SVN 1.5 - 0 views

  • that's not the whole picture. First and foremost you forgot the disconnected part. You can't commit to Subversion unless you can reach the repository, which is often in a server over the Internet. Also, each developer isn't restricted to one branch. He very often has a lot of them. Right now I have 28 separate branches of Qt in my workstation: they range from previous stable releases of Qt (to test regressions and fixes with), to branches I created to start working on fixing tasks, to research projects.
  • And that's just my private branches. When I am collaborating with other people in projects, I have more branches. For one project right now in Qt, we are tracking 4 or 5 different branches, each with a different "theme": optimisations, new features, animations, etc. And there's an extra branch which is the merger of all those "theme branches", so that we can get a feel of what it will be when it's done.
  • Finally, you're also forgetting the ability to undo, redo, and modify your work. Once you commit to Subversion, it's there for life. Removing something from the repository means dumping and reloading it. With Git, you can undo your commits, change them, squash them together without problems. (You can do that after you've published them, technically, but you shouldn't.)
  • ...1 more annotation...
  • people are asked to only commit stuff to trunk when they are done with it. Which for refactors and bigger things means it may be a week or more before you can commit it. And due to that requirement, thiago's post becomes very relevant. Those tools are essential to a scalable workflow.
Daniel Jomphe

Discipline... - 0 views

  • And then git... If I look at what happens in the linux kernel development community, I see the best developers being permanently busy with merging, merging, merging and merging and very little else. And, actually, a scheme where we only seldom integrate, where every developer at one, infrequent, moment publishes their branch for merging sounds like a recipe for integration hell, no matter how developer-friendly and good at branching and merging a tool might be.
  • The Linux kernel development process is not very similar to KDE's of course. The overhead of what's going in is higher, it's quite formal and bound to specific persons that decide it all. KDE is quite different in that it's team-based, so potentially making it easier to share code in those teams and merge it upstream as it becomes interesting for others until finally bugs have been shaken out and it's ready to go into trunk.
Daniel Jomphe

[Kde-scm-interest] Distributed model VS accountability - 0 views

  • Distributed development needs a network of trust. For small projects, things are very simple: There's (say) one person, and that person reviews all incoming changes, and publishes (pushes) the result to a known location. Users need only trust that single point of distribution. If someone else publishes a different version of the same project, it's the user's problem to trust that someone else. For a larger project things become more involved. It starts by a single point of distribution, which only one or few people have push access to. Those people must be trustworthy. Let's call them BD. (Of course, nobody is forced to trust them.) But since it cannot be expected that they can review all incoming changes, they need other people, "lieutenants", whom they in turn trust. That is, BD will pull changes from the lieutenants without detailed review and push them out to the single point of distribution. The lieutenants themselves are possibly again backed by "major contributors" whom they trust. (At this point the fields of responsibility are likely small enough that incoming changes can be reviewed in detail by the lieutenant or major contributors.)
  • At the infrastructure level, you need means that the chains of trust cannot be broken. At the lowest level, git provides a chain of trust by the SHA1 signatures. At a higher level, trust must be implemented such that each person can change (basically) only his own repository. For this reason, it is good that in your earlier proposal access rights to certain repositories were very limited. BD and lieutenants will only pull from such restricted repositories. Major contributors will review all incoming changes and push to their own trustable repositories and ask the lieutenants to propagate the changes upstream. In this context, "incoming" will mean mostly changes pushed to the publicly accessible repository tree by random contributors, and the users' repositories. To come back to your example: I will trust *you*. And will blindly pull from *your* repository, no matter whose name the commit carries.