May 4, 2012
Before getting deep into history, I want to leave a quick up-front warning for frontend/chrome developers. Every scope, including JSMs, now has its own compartment. So objects that formerly touched each other directly now interact via wrappers. This should be pretty transparent, since chrome->chrome wrappers just forward everything along. Here’s the rub though: when chrome accesses content objects, it gets a special Xray (formerly known as XPCNativeWrapper) view on the underlying object, which bypasses content expandos and gives chrome a fresh slate. This wrapper also has a special expando holder to store expandos that chrome wants to put on the object. Xray wrappers are per-compartment. So previously there would only be one Xray wrapper (and thus one holder object) for a given content object, but now there can be several. This means that different chrome scopes no longer share expandos for content objects. You have been warned.
Back to the story.
Most importantly though, it made everything under the hood make much, much more sense. I use this phrase broadly to refer to both understandability (how easy is it for the programmer to grok what’s happening) and power (how well do the abstractions we use map to the things we want to do). The value of this can’t be overstated; it’s the difference between “we could fix that – I guess – it would probably take a month to implement and another month to track down all the regressions” and “sure, I can probably write and land that by the end of the day”. The more sense the code makes, the easier it is to diagnose and fix bugs, add new features, and make things fast. And as a rule, more compartments make the code make more sense.
So why didn’t we do Compartment-Per-Global from the beginning? There are a number of reasons, but two stand out. First, compartments at that time imposed a much larger memory overhead, and so we didn’t want to be creating them willy-nilly. More importantly though, ensuring that same-origin code shared compartments was a good way to avoid having too much stuff break. Objects in different compartments have a membrane between them that can only be crossed by Cross-Compartment-Wrappers. These wrappers are mostly transparent, but in practice there are a few ways in which they change behavior in subtle ways. Fortunately for us at the time, the security model of the web places pretty strong barriers between code running from different origins. So in general, code wasn’t expecting to do intimate and subtle things with objects from other origins. Therefore, by introducing compartments on a per-origin basis, we could avoid breaking too much existing code. Privileged browser code (aka chrome) was a particularly big beneficiary of this strategy. Chrome code all runs with the same principal (System Principal), so most of it saw no change.
So Firefox 4 landed with a splash, and everyone was happy. For a while. As time went on, it became apparent that more and more stuff that we wanted to do required a one-to-one mapping between compartments and global objects. A lot of these dependency chains are pretty long. For example, IonMonkey won’t work until we get rid of JSStackFrames, but we can’t get rid of JSStackFrames until we change our security architecture to pull principals directly off object and context compartments, but we can’t do that until the compartment principal exactly matches that of the scope (i.e. until we have CPG). And in general, there are lots of operations that can be made much faster with the help of CPG.
So about a year ago, Luke started looking into what it would take to make this a reality. A number of dependent bugs had to be fixed first, and other work got in the way. By early January, he had patches that would at least launch the browser. But there were lots of things that were broken. So I took the reins for a while, and pounded away at XPConnect, the JS Engine, the DOM, and the frontend to get things green. Some fixes were simple. Others were harder. Some had a very high comment-to-code ratio. But almost everything was landed as dependent fixes, meaning that the final patches that flipped the switch were quite manageable.
The final landing was small, but its effects rippled throughout the code. So for every few days it was delayed, something new would be checked into the tree that broke it. This made the last stretch pretty strenuous. But thankfully, I wasn’t alone. Luke did a ton of work (and in fact holds authorship on eight of the nine patches that landed on Thursday). Mano, Mak, and zpao were hugely helpful in diagnosing browser-chrome issues. Blake was on the hook for reviewing all the really tricky stuff. Boris provided much-needed insight, and took some of my other work so that I could finish this stuff up. Johnny greased the wheels when they needed it. Kyle heckled. And Ed Morley helped out at the last minute to avert a backout. Go team.
December 13, 2011
tl;dr – You can now pass typed arrays to XPCOM methods that expect XPIDL arrays.
These days I work mostly on XPConnect, which serves as the bridge between SpiderMonkey and the rest of Gecko. For those not steeped in Mozilla lore, XPConnect has a reputation for being one of the nastiest and most incomprehensible parts of the platform. Unfortunately, it’s also a central bottleneck through which most important things must pass.
So while many people desperately need features and bug fixes for XPConnect, very few people have the knowledge and fortitude to hack on the code. Those who do, folks like Blake Kaplan, Boris Zbarsky, and Peter Van Der Beken, tend to be very busy people.
I started working on XPConnect in large part to alleviate this problem. Unfortunately, when word got out that there was a fresh face in the cartel, I too quickly became a very busy person. Thus, when a bug was filed to implement support for passing typed arrays to XPCOM methods, I expressed skepticism that I’d have the bandwidth for it any time in the near future.
So imagine my surprise and joy when one of the requesters (Antti Haapala, ztane on IRC) expressed interest in taking a crack at it. And imagine my utter disbelief when a working patch, with tests, appeared a day later:
A few review iterations later, the patch landed on mozilla-central.
The force is strong in this one – let’s hope he sticks around.
October 23, 2010
This summer I went on a quest to improve my workflow. I wasn’t really happy with the standard Mercurial/mq approach used by most Mozilla developers. I spent a while experimenting with alternative ways of using Mercurial, and even did a fair amount of hacking on hg itself to fix some bugs and shortcomings. I wrote quite a long blog post about all of this and almost published it, but in the end I decided that it still wasn’t as good as I’d like it to be.
To its credit, Mercurial’s extension model made all this very doable, and I probably could have continued to cobble together a workflow that did what I wanted. However, I was pretty sure that Git did exactly what I wanted out of the box. So I gave it a shot, and it works even better than I’d hoped.
Brief Aside – A Ten-Second Introduction to Git
Git and Mercurial are quite similar – both use SHA-1 hashes to identify commits. The primary user-facing difference between Git and Mercurial is that Git branches are extremely lightweight. Git is essentially a user-space filesystem, where each commit is represented as a file named by its SHA-1 hash. A branch is nothing more than a smart alias for a hash identifier. So a Git repository consists of 3 primary things (this is a bit of an oversimplification, but it’s fine for our purposes):
- An objects directory, which contains a soup of commit files, bucketed into sub-directories.
- A refs/heads directory, which contains one file for each named branch. So if I have a branch called collectunderpants whose latest commit is 7bc99958bc164028b94ec47dbf1fb1ad9034c580, there’s a file called refs/heads/collectunderpants whose contents is simply 7bc99958bc164028b94ec47dbf1fb1ad9034c580. That’s all git needs.
- A file called HEAD containing the name of the current branch. This is important, because when I make a commit, Git needs to know which branch should be scooted forward to point to the new commit.
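To make the three pieces above concrete, here is a minimal sketch that builds a throwaway repository and reads the two pointer files directly. The temporary directory, the file plan.txt, and the identity passed to git config are made up for illustration; the layout shown assumes git's default on-disk ref storage, where a fresh branch is a plain file under refs/heads.

```shell
set -e
repo=$(mktemp -d)                       # scratch repository; the path is arbitrary
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity so commits work anywhere
git config user.email "example@example.invalid"
git checkout -q -b collectunderpants    # name the (still empty) branch

echo "step 1" > plan.txt
git add plan.txt
git commit -q -m "Collect underpants"

head_file=$(cat .git/HEAD)                             # names the checked-out branch
branch_file=$(cat .git/refs/heads/collectunderpants)   # the hash that branch points to
tip=$(git rev-parse HEAD)                              # what git itself resolves HEAD to

echo "HEAD file:   $head_file"
echo "branch file: $branch_file"
echo "rev-parse:   $tip"
```

The HEAD file holds a reference to the branch rather than a hash, and the branch file holds nothing but the hash of its latest commit, which is why creating or moving a branch is so cheap.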
Suspend your disbelief for the time being and assume that I have a git repository called /files/mozilla/link that contains an up-to-date mirror of mozilla-central in git form (I’ll explain how this is done later).
$ cd /files/mozilla
$ git clone link src
After waiting a few moments, I now have a full git repository named src. The default branch is master, which I can see immediately because of a neat shell prompt trick (works best when put in ~/.profile):
$ export PS1='\u@\h \w$(__git_ps1 " (%s)") $ '
/files/mozilla/src (master) $ echo w00t!
So I’m on master. Unfortunately, I check TBPL and it looks like the tree is burning as a result of another Jonas Sicking push-and-run. The last green commit was 5 changesets back, so I want to base my work off of that.
(master) $ git checkout -b master-stable master~5
This makes a new branch called master-stable based 5 commits back from the commit pointed to by master, and switches the working directory to it.
I make a .mozconfig, set the objdir to /files/mozilla/build/main, make -f client.mk, and go shoot some nerf darts at dolske. A short while later, I’ve got a full build waiting for me in /files/mozilla/build/main.
Let’s run some simple diff queries:
(master-stable) $ git diff HEAD # Diffs against the current head
(master-stable) $ git diff master # Inverse of Sicking's bad push
(master-stable) $ git diff HEAD^^^ # workdir vs 3-commits-back
(master-stable) $ git diff HEAD~3 # Same as above
(master-stable) $ git diff master-stable~3 # Same as above
The ability to reference revisions symbolically (relative to either heads or branches) is really nice, and is something that I missed with Mercurial. Edit: bz points out in the comments that this is actually possible with Mercurial.
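As a quick check that the three spellings used in those diff commands really name the same commit, here is a small sketch in a scratch repository. The four throwaway commits and file.txt are invented for the example; only the revision syntax is the point.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"
git checkout -q -b master-stable

# Four commits, so that "three back from the tip" is the very first one.
for n in 1 2 3 4; do
    echo "$n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

a=$(git rev-parse 'HEAD^^^')            # parent of parent of parent
b=$(git rev-parse HEAD~3)               # shorthand for the same walk
c=$(git rev-parse master-stable~3)      # same walk, starting from the branch name
subject=$(git log -1 --format=%s "$a")

echo "$a"
echo "$b"
echo "$c"
echo "subject: $subject"
```

All three hashes printed are identical, and the subject line is that of the first commit.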
Now suppose I get an idea for a quick one-off patch, and hack on a few files. To save this work (along with its ancestry), I create a branch off the current head:
(master-stable) $...hack hack hack...
(master-stable) $ git checkout -b oneoff
(oneoff) $ git commit -a
The first command creates a new branch called oneoff that points to the same commit as master-stable. The second creates a new commit containing the changes in the working directory. The reason for the -a option has to do with a git feature called the “index”, which is a staging area between your working directory and full-blown commits. I don’t want to digress too much, but you should definitely read more about it.
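For the curious, here is a minimal sketch of what the index changes in practice. The file names a.txt and b.txt are made up; the example only contrasts a plain git commit, which records what has been staged, with git commit -a, which first stages every tracked file that has been modified.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"

echo one > a.txt
echo one > b.txt
git add a.txt b.txt
git commit -q -m "base"

echo two > a.txt                        # modify both files...
echo two > b.txt
git add a.txt                           # ...but stage only a.txt
git commit -q -m "only a"               # no -a: commits just what is in the index

pending=$(git diff --name-only)         # working directory vs. index
echo "still uncommitted: $pending"

git commit -q -a -m "the rest"          # -a: stage all tracked changes, then commit
after=$(git diff --name-only)
echo "left over: [$after]"
```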
Remember that branches are just aliases to SHA-1 identifiers, which in turn are used to locate the actual commit in the soup. So oneoff is an alias for a SHA-1 identifier which points to the new commit. That commit knows the hash of its parent, which is the same hash pointed to by master-stable. Git commits are immutable, since their names are a cryptographic function of their contents (so if a commit changes, it’s really just a new commit). Furthermore, git is garbage collected when you call git gc. So objects in git are just like immutable objects in a garbage-collected language. For example, suppose we want to modify that commit we just made:
(oneoff) $ ...more hacking...
(oneoff) $ git commit -a --amend
Normally git commit makes a new child of the previous commit. However the --amend option makes a new sibling that combines the previous commit with any working changes, and points the branch and head to it. The old commit is still there, but is now orphaned, and will eventually be pruned by git gc (once it has also expired from the reflog).
I use one branch per bug, and one commit per patch. This allows me to model my patches as a DAG, where patches are descendants of work they depend on. Contrast this with the MQ model, where a linear ordering is forced upon possibly unrelated patches.
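The "new sibling" behaviour of --amend can be observed directly. The sketch below uses a scratch repository with a made-up file f.txt; it records the hash before and after amending and checks that both commits share a parent and that the old one is still sitting in the object store.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"
git checkout -q -b oneoff

echo base > f.txt
git add f.txt
git commit -q -m "base"

echo v1 >> f.txt
git commit -q -a -m "one-off patch"
old=$(git rev-parse HEAD)               # hash of the commit we are about to amend

echo v2 >> f.txt
git commit -q -a --amend -m "one-off patch"
new=$(git rev-parse HEAD)               # different content, therefore a different hash

old_parent=$(git rev-parse "$old^")
new_parent=$(git rev-parse "$new^")
old_type=$(git cat-file -t "$old")      # the orphan can still be looked up by hash

echo "old: $old (parent $old_parent)"
echo "new: $new (parent $new_parent)"
echo "old object is still a: $old_type"
```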
Suppose I’m doing some architectural refactoring in a bug called substrate, and using the clean new architecture in a feature bug called bling. Initially, I start work on bling as follows:
(substrate) $ git checkout -b bling
(bling) $ ...hack commit hack commit hack....
But then I think of something else that would be useful for bling that should really go in substrate. So I stash away my uncommitted changes, and go add another patch to substrate:
(bling) $ git stash
(bling) $ git checkout substrate
(substrate) $ ...hack hack...
(substrate) $ git commit -a
(substrate) $ git checkout bling
At this point, I’d really like to get back to working on bling, but unfortunately bling isn’t yet based on the latest patch in substrate. To fix this, we need to rebase:
(bling) $ git rebase --onto substrate bling~3 bling
This tells git to take all the changesets in the range (bling~3, bling] and apply them incrementally as commits on top of substrate. If there are conflicts, I’m given the opportunity to resolve them, or to abort the whole endeavor. Once the rebase is complete, the branch bling is updated to point to the new, rebased tip. Now I can reapply my work-in-progress and get back to business:
(bling) $ git stash pop
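Here is a self-contained sketch of the same maneuver in a scratch repository, with two commits on bling instead of three. The helper function commit and the file names are hypothetical, and the rebase is spelled with the upstream and the branch as two separate arguments, which is the form git rebase documents for --onto.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"

# Hypothetical helper: one new file per commit, so nothing can conflict.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git checkout -q -b substrate
commit substrate-1
git checkout -q -b bling                # bling starts from substrate-1
commit bling-1
commit bling-2
git checkout -q substrate
commit substrate-2                      # substrate moves ahead without bling
git checkout -q bling

# Replay the commits in the range (bling~2, bling] on top of substrate.
git rebase -q --onto substrate bling~2 bling

base=$(git merge-base substrate bling)
substrate_tip=$(git rev-parse substrate)
ahead=$(git rev-list --count substrate..bling)
echo "bling now forks from $base and is $ahead commits ahead"
```

After the rebase, the fork point of bling is the tip of substrate, and the bling commits sit on top of it.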
My code is always perfect the first time I write it, but suppose for the sake of argument that Joe gets a bee in his bonnet and I have to alter patch 7 of 18 in bigbug to appease him. I could do it the long way:
(bigbug) $ git checkout -b _tmp HEAD~11
(_tmp) $ ...appease appease...
(_tmp) $ git commit -a --amend
(_tmp) $ git rebase --onto _tmp bigbug~11 bigbug
(bigbug) $ git branch -d _tmp
This gets tedious after a while though. Thankfully, there’s a better way:
(bigbug) $ git rebase --interactive HEAD~12
This fires up an editor, which allows me to select which parts of the history I want to modify:
pick f901b35 patch 7
pick 613cb9e patch 8
pick db26bd3 patch 9
pick 678b170 patch 18
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
So if I change the pick on the first line to edit (or just e), git brings me to that revision, lets me edit it, and does all the rebasing for me. Huzzah!
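The same edit can be scripted, which is handy for trying it out without an editor popping up. This is only a sketch: it uses the GIT_SEQUENCE_EDITOR environment variable to rewrite the todo list with sed instead of by hand, on a made-up three-patch branch.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"
git checkout -q -b bigbug

echo base > base.txt
git add base.txt
git commit -q -m "base"
for n in 7 8 9; do                      # three stand-in patches, one file each
    echo "patch $n" > "patch$n.txt"
    git add "patch$n.txt"
    git commit -q -m "patch $n"
done

# Turn the first "pick" in the todo list into "edit", as one would in the editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i HEAD~3

stopped_at=$(git log -1 --format=%s)    # git has paused on the commit marked edit
echo "appeased" >> patch7.txt
git commit -q -a --amend --no-edit      # fold the fix into that patch
GIT_EDITOR=true git rebase --continue   # replay the remaining patches on top

final=$(git log -1 --format=%s)
total=$(git rev-list --count HEAD)
echo "stopped at: $stopped_at, finished at: $final, $total commits"
```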
Pushing to Bugzilla
One nice bonus of git is an add-on developed by Owen Taylor called git-bz. I’ve made some modifications to it to make it more mozilla-friendly, and haven’t yet found the time to make them upstreamable. So in the meantime, I’d recommend that you grab my fork, git-bz-moz.
While it does a lot of things, my favorite part of git-bz is pushing to bugzilla. For credentials, git-bz uses the login cookie of your most recently opened Firefox profile – so if you’re already logged into BMO things should work seamlessly. Let’s say I want to attach all 18 patches of bigbug to bug 513681. I run:
(bigbug) $ git bz attach --no-add-url -e 513681 HEAD~18..HEAD
And then I’m presented with a sequence of 18 files to edit in my editor, each of which looks like the following:
# Attachment to Bug 513681 - Eliminate repeated code with image decoder superclass
Commit-Message: Bug 513681 - Eliminate repeated code with image decoder superclass.
#Obsoletes: 470931 - patch v2
# Please edit the description (first line) and comment (other lines). Lines
# starting with '#' will be ignored. Delete everything to abort.
# To obsolete existing patches, uncomment the appropriate lines.
This pulls the relevant data from the bug, and lets me do a lot in one edit. I can set the patch description, add a comment in the bug, edit the commit message (for facilitating hg qimport), obsolete other patches in the bug, flag for review, and grant self-review. I’ve found this to be a massive timesaver when working on many-part bugs.
When I want to push, I just qimportbz from the bug. This gives me an incentive to make sure that the patches committed are the ones on bugzilla.
Aside – I haven’t done much active development since the end of August, and git-bz just choked on the cookie database of a recent nightly when I tried it. A 3.6 profile still works fine though.
Edit – dwitte points out in the comments that this is due to a change in the sqlite database format, and should be fixed by upgrading to sqlite 3.7.x.
Multiple Working Directories
The ability to multitask is crucial to being productive in the Mozilla ecosystem. I can be waiting on tryserver results for one patch, guidance from bz on a second, review from Jeff on a third, and a dependent patch from Joe for a fourth. I need to be able to work on multiple patches at once, and context-switch quickly.
In theory, multitasking with git is quite simple: just do a git checkout of the branch you want to work on. However, some code changes require significant rebuilding. For example, if I have a patch that modifies nsDocument.h, context-switching between that patch and any other patch incurs a massive recompilation burden.
I’ve heard through the grapevine that bz manages this problem by having 8 different mercurial repositories (each with its own object directory), and economizing on space via hardlinks. This eliminates the recompilation burden, but doesn’t allow work to be easily shared between repositories. For example, I might want to give both bling and substrate separate object directories, but still be able to rebase bling on top of new code in substrate.
Thankfully, git allows me to get the best of both worlds with multiple working directories.
/files/mozilla/src (blah) $ mkdir ../proj
/files/mozilla/src (blah) $ cd ../proj
/files/mozilla/proj/ $ git-new-workdir ../src a
This gives me a full working directory and a lightweight repository that is composed mostly of symlinks to files in ../src/.git/. Everything is shared seamlessly between them, and just about the only thing private to the new repository is the HEAD file, which specifies the checked-out branch. I can then make a .mozconfig pointing to a new object directory in /files/mozilla/build/a, and build away.
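git-new-workdir is a contrib script and may not be on your path. Newer versions of git ship a built-in git worktree command that provides a similar shared-repository, separate-checkout setup, so the sketch below uses that instead; it is a swapped-in technique, not what the commands above ran. The directory layout mirrors the src and proj/a names used here, under a temporary directory.

```shell
set -e
top=$(mktemp -d)                        # stands in for /files/mozilla
mkdir "$top/src" "$top/proj"
cd "$top/src"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"
git checkout -q -b substrate
echo base > f.txt
git add f.txt
git commit -q -m "base"

# Second working directory, checked out on a new branch called bling.
git worktree add -b bling ../proj/a >/dev/null 2>&1

cd ../proj/a
echo shiny > bling.txt
git add bling.txt
git commit -q -m "bling work"           # committed from the second directory

cd "$top/src"
seen=$(git log -1 --format=%s bling)    # visible from the first directory right away
here=$(git rev-parse --abbrev-ref HEAD)
there=$(git -C ../proj/a rev-parse --abbrev-ref HEAD)
echo "src is on $here, proj/a is on $there, bling tip: $seen"
```

Each directory keeps its own checked-out branch, while commits and branches are shared, so rebasing bling onto new work in substrate needs no pushing or pulling between them.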
Earlier in this post I promised to explain where /files/mozilla/link came from.
I initially started using git with a mirror maintained by Julien Rivaud. Unfortunately, there was some flakiness with the cron job, and the repository would often stop updating from mozilla-central. So I decided to generate my own mirror. Edit: Julien mentions in the comments that the repository should be reliable now. Give it a shot!
Long story short: don't use hg-git. It chokes miserably on mozilla-central. Instead, use hg-fast-export. Let it run overnight, and it should be done in the morning. Incremental updates are also very fast (roughly linear in the number of new commits), so I don't ever find myself waiting for it.
- From a general zippiness standpoint, git seems about 5 times faster than Mercurial. Your mileage may vary.
- Overall, I really like the garbage-collection model of git. With Mercurial, rewriting history involves stripping entries out of the repository, which can be very slow. With git, unwanted objects go away just by redirecting pointers, and they're still recoverable (with careful munging) until git gc eventually prunes them.
- I've found that I'm spending a lot less time dealing with merge conflicts than I did when I was using hg/mq. Git seems to be pretty smart about these things, and I think it uses 3 lines of context internally. In contrast, it's standard to use 8 lines of context for mq patches so they can be easily exported to bugzilla. I've modified git-bz to generate 8 lines of context when posting to bugzilla, which allows me to be more efficient locally while still sharing my work in the appropriate format.
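The "careful munging" mentioned in the garbage-collection point usually goes through the reflog, git's local record of where HEAD has been. Here is a minimal sketch, with made-up file names and a made-up branch name rescued, that throws a commit away with a hard reset and then gets it back.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"          # placeholder identity
git config user.email "example@example.invalid"

echo keep > f.txt
git add f.txt
git commit -q -m "keep"
echo work > g.txt
git add g.txt
git commit -q -m "precious"
lost=$(git rev-parse HEAD)

git reset -q --hard HEAD~1              # the branch pointer moves back one commit
remaining=$(git rev-list --count HEAD)  # "precious" is no longer on the branch

found=$(git rev-parse 'HEAD@{1}')       # where HEAD was before the reset
git branch rescued "$found"             # point a new branch at it
rescued_subject=$(git log -1 --format=%s rescued)
echo "recovered: $rescued_subject"
```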
There's lots more to say about git, but I think that this is enough for now. Share your experiences in the comments!
August 17, 2009
This weekend I had the opportunity to give a talk about the Mozilla community to a pretty diverse audience. The focus of the event was on open/participatory models of cooperation, so I tried to tilt the talk in that direction. The response was pretty amazing. Despite having only one or two programmers in the audience, there was a ton of engagement, with lots of questions and discussion. I’d originally slated the talk to last about an hour, but it ended up taking 2 and a half hours with all of the audience participation. A lot was discussed about whether the “Mozilla Model” could be applied to other aspects of society, or whether the forkability and patchability of code (that is to say, the ease of experimenting with mutually exclusive solutions simultaneously) makes software a special case. In the end I think that everybody, including myself, learned a lot.
Note to self – don’t recommend Adblock when there might be web publishers in the audience.
I put together some slides (pdf, keynote) for the occasion that I based on Mike Beltzner’s 2009 intern brownbag slides (keynote). I was pretty happy with how they came out, and I’d encourage anyone interested in talking about Mozilla (including beltzner!) to make use of them.
June 4, 2009
Sometime in early 2009, it became clear that lcms as a module wasn’t really working out for us, and we needed to write our own color management implementation (mostly) from scratch. Since I was a full-time student at the time, I didn’t have it in me to lead the effort and get things done in time for 3.5. Thankfully, Jeff Muizelaar stepped in and took up the task, resulting in qcms. I’d encourage everyone to go check out his post.
Also, worry not – most of the work I did on lcms (mainly the performance optimizations) is still there in qcms, so things should be as fast as ever, or faster.
December 11, 2008
December 8, 2008
Checking code into mozilla-central is a serious time commitment. Among other things, the person pushing the patch must watch the Tinderbox for several hours after committing to make sure that no unforeseen problems show up. This is enough of a pain that many of Mozilla’s most senior developers prefer to let other people push patches on their behalf.
It’s much easier to land a patch after hours, since there are fewer other people landing at the same time and so you’re less likely to get caught in someone else’s mess. However, I don’t always want to spend my evenings refreshing the tinderbox page. As such, I’ve always dreamed of having a convenient notification system that I could configure to send me an SMS message when the tree changed status. This would mean that I’d only need to be near a computer, rather than sitting at one, while I was watching the tree.
This evening I was talking to Wolf on IRC and had the idea of piggy-backing on firebot’s tree notification system. Currently, firebot sits on #developers, checking the tree periodically, and speaks up any time the tree changes status. However, there was still quite a ways to go from the IRC channel to my cell phone. I use Colloquy as my IRC client, which has nice support for the Growl notification framework (all Mac only, sorry). Growl, in turn, has a notification option called “MailMe”, which uses your default Mail.app account to send a notification to the address of your choice. Finally, my cellular provider (AT&T) provides a service by which you can email [phone number] AT txt.att.net, and the message will be delivered to that number. So by putting a Colloquy watch-word on “Success”, “Failed”, and “Burning” and setting the Growl notification event JVChatMentioned to use the MailMe plugin, I had firebot sending me text messages. (Side note – Growl does in fact have an SMS plugin, but it requires having an account with a commercial service and seems to cost money.)
The MailMe plugin has a slight problem though: it appends a long footer string to every message. This is fine over email, but it caused all my SMS notifications to span multiple messages, which can get pretty expensive if all the boxes start burning at once. Furthermore, this message was hard-coded into the binary, so there was no easy configuration file to modify. As a result, I grabbed the source, hacked it to silence both the footer and the subject, and made a replacement plugin that you can download here. Just delete the existing MailMe.growlView in /Library/PreferencePanes/Growl.prefPane/Contents/Resources/GrowlHelperApp.app/Contents/PlugIns, drop this one in instead, and you should be good to go. I’m not sure how portable it is, but I can’t imagine any problems if you’re using up-to-date versions of Leopard and Growl.
If you decide to give this a try, let me know how it works out.
Edit: It looks like I forgot to include the source of the (minor) modifications to Growl. I’ve since deleted the directory I was working in, so I just re-did the changes as I remember them and posted them here. I haven’t tested the new patch beyond compilation, but it’s trivial enough that I didn’t feel like it.