Cool German weather guy

As a little follow-up to our previous post on weather forecasts, here is a short video of the weather announcer on the German regional public TV channel for Rhineland-Palatinate (Rheinland-Pfalz) elegantly handling an intruder without missing a beat.

On a personal note, I’m currently in the process of moving across London, but more regular posting will pick up soon. The new place is a 12-minute walk from my workplace, which should save me well over an hour of commuting time a day. The gain is mostly earmarked for blogging and the Eggcorn Database (which needs it).

How would you like your temperatures today? French or imperial?

In late October last year, I travelled to North America. One leg of the trip was a flight from San Francisco (think sun, palm trees, T-shirt weather) to Toronto. This was my second trip to Toronto, and the first had taken place during the coldest week of 2007, so this time I wasn’t going to be caught unprepared: I consulted the weather forecast beforehand. Online, of course.

I used two services, weather.com and Yahoo! Weather.

Online weather services have to deal with the problem that some visitors will prefer temperatures displayed in degrees Fahrenheit while others are used to thinking in degrees Celsius. Weather sites typically make an initial guess, maybe based on the visitor’s IP address, and then provide a little link with a bit of JavaScript behind it, so that the temperature scale can be changed with one click. [There are better ways than going by IP address, but that’s for another post.]

Here is how Yahoo! handles this internationalization task:

Temperature scale selection on Yahoo! Weather

Nothing special about this, though I’d maybe have expected °C and °F. The surprise was weather.com’s temperature selector:

Temperature scale selection on weather.com

English vs metric? Huh. Turns out, “metric” is not entirely off-base: we may be thinking of units of length, area, volume, mass and weight, but the degree Celsius is indeed part of the original metric system. It is not the SI base unit of temperature, though: that would be the kelvin (not “degrees Kelvin”, btw). There’s simply no justification for “English”, however. Maybe they originally used “imperial”, and someone pointed out that degrees Fahrenheit aren’t considered imperial units either.
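For what it's worth, the arithmetic behind such a one-click toggle is tiny. Here is a minimal sketch in Python; the function names and the idea of storing everything in degrees Celsius are assumptions made for illustration, not anything taken from either site.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


def celsius_to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to kelvins (no degree sign for K)."""
    return celsius + 273.15


def format_temperature(celsius: float, scale: str) -> str:
    """Render a temperature stored in Celsius in the scale the visitor chose."""
    if scale == "C":
        return f"{celsius:.0f} °C"
    if scale == "F":
        return f"{celsius_to_fahrenheit(celsius):.0f} °F"
    if scale == "K":
        return f"{celsius_to_kelvin(celsius):.2f} K"
    raise ValueError(f"unknown temperature scale: {scale!r}")


if __name__ == "__main__":
    for scale in ("C", "F", "K"):
        print(format_temperature(20, scale))
```

Labelling the choices °C and °F, as suggested above, means the selector needs no names for systems of units at all.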

The bottom line: don’t complicate matters when simple does just fine.

What the Zune debacle shows us

When the news trickled through on New Year’s Eve that all (?) 30GB Zune MP3 players worldwide had collectively locked up around midnight Pacific Standard Time, speculation abounded: was the issue related to the changeover of the year to 2009, as the moniker Z2K9 suggests? Or maybe to the fact that a leap second was to be added at midnight GMT (4pm PST)? Neither cause looked likely to me, but who knew. At the same time, I hoped it had something to do with 2008 being a leap year. Leap year bugs may well show up on the last day of the year if the code is unprepared for a 366th day.

The joy I felt when my wild guess turned out to be correct was not just schadenfreude, but related to a passage I recently read in a book, The Mythical Man-Month by Fred Brooks. This is a classic in software project management, and relevant to the Open University’s postgraduate computing course I’m enrolled in. Brooks’s essays lay down a number of now widely accepted principles: time (months) and people are not interchangeable, adding programmers to a late project tends to make it even later, etc.

The passage I was thinking of is from the chapter on the “second-system effect”: the idea that the second time a team builds the same type of system, the product is likely to be bloated, for the team will succumb to the temptation to implement all the nifty features that occurred to them while working on the previous system, but that they didn’t find the time to include. IBM’s mainframe operating system OS/360 was, according to Brooks, who had been one of the project’s managers in the 1960s, a typical second system. And to support his claim that it was overblown and wasteful, he offers this example:

For example, OS/360 devotes 26 bytes of the permanently resident date-turnover routine to the proper handling of December 31 on leap years (when it is Day 366). That might have been left to the operator.

26 bytes is what it takes, in the most commonly used encodings for plain text (ASCII, any ISO-8859 variant, UTF-8), to store the bit that’s a different colour at the start of this paragraph. But my goal here is not to point out how far computers have come in the last 40+ years. My point is about how we all are, in the most mundane situations, computer operators now. The mind-boggling part is not the 26 bytes figure. It is the “leave it to the operator” attitude, the idea that anyone’s time would be reasonably spent, the night before New Year’s Eve, manually carrying out some procedure in order to ensure date calculations don’t fall over.

Some final disconnected notes:

1/ Collective experience of adversity spawns linguistic innovation. In addition to the ubiquitous Z2K9, we got Zunicide and Zuneocalypse, and a whole lot of bad puns.

2/ Kudos to the Microsoft support engineer Corey Gouker, who was among those who posted unofficial DIY unbricking instructions. I don’t know if he is connected to the Zune team, but as someone who runs a technical product support team in my day job, I sympathize with the poor support guys.

3/ The code of the third-party clock driver is available, and the bug is all too elementary (scroll down into the comments for a step-by-step analysis). It’s a banal infinite loop in a straightforward piece of code.

4/ As for the Fred Brooks text, I’m finding it less inspiring than I expected. Sure, reading the original instead of one of the numerous summaries of his ideas always adds a lot of little extra insights, and a good chunk of history, and this is why I’m not regretting the read. But the tone can be grating. As an example, here is how he introduces the “second-system effect”:

An architect’s first work is apt to be spare and clean. He knows he doesn’t know what he’s doing, so he does it carefully and with great restraint.

As he designs the first work, frill after frill and embellishment after embellishment occur to him. These get stored away to be used “next time.” Sooner or later the first system is finished, and the architect, with firm confidence and a demonstrated mastery of that class of systems, is ready to build a second system.

This second is the most dangerous system a man ever designs.

Even in the 1970s not everyone wrote with so much pathos. My mind’s eye invariably sees a bunch of very serious manly men, highly aware of the sacred nature of their roles.

5/ I just can’t think of the book’s title without being reminded of Elizabeth Bishop’s wonderful poem The Man-Moth.
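Coming back to note 3/: the shape of that infinite loop is easy to reproduce. What follows is a minimal Python sketch modelled on the loop as it was widely described at the time, not the driver's actual code. The function names, the 1980 origin year, the 1-based day count and the iteration guard (added so the demonstration terminates) are all assumptions made for illustration.

```python
import calendar
from datetime import date

ORIGIN_YEAR = 1980  # assumed epoch: day 1 is 1 January 1980


def days_since_origin(day: date) -> int:
    """1-based count of days from 1 January of ORIGIN_YEAR up to `day`."""
    return day.toordinal() - date(ORIGIN_YEAR, 1, 1).toordinal() + 1


def year_from_days_buggy(days: int, max_iterations: int = 10_000):
    """Peel whole years off a day count; gets stuck on day 366 of a leap year."""
    year = ORIGIN_YEAR
    iterations = 0
    while days > 365:
        if calendar.isleap(year):
            if days > 366:
                days -= 366
                year += 1
            # When days == 366 nothing changes, so the loop never exits.
        else:
            days -= 365
            year += 1
        iterations += 1
        if iterations > max_iterations:  # guard for the demo only
            raise RuntimeError("loop is not terminating")
    return year, days


def year_from_days_fixed(days: int):
    """Same idea, but compares against the actual length of each year."""
    year = ORIGIN_YEAR
    while True:
        year_length = 366 if calendar.isleap(year) else 365
        if days <= year_length:
            return year, days
        days -= year_length
        year += 1


if __name__ == "__main__":
    last_day = days_since_origin(date(2008, 12, 31))
    print(year_from_days_fixed(last_day))
    try:
        year_from_days_buggy(last_day)
    except RuntimeError as error:
        print("buggy version:", error)
```

On any day but the 366th of a leap year the two versions agree, which is presumably why this kind of bug can sit unnoticed for years.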

Of boys, girls, goats and probabilistic intuition

Oh, well done, CodingHorror! Jeff Atwood posted a statistics puzzle and its solution, and his readers’ reaction was just about identical to what happened in the probability & statistics course I attended 15+ years ago, when a mathematically very similar problem was set on our weekly exercise sheet. Jeff’s two posts have drawn more than 1500 comments in about 3 days; people are vociferously defending their reasoning and accusing “the other side” of idiocy, lack of maths skills, and boneheadedness; some careful and constructive chaps have written simulations… a nice mayhem.

Back in my statistics class, getting us students all riled up and passionate about solving an exercise was of course the intended effect.

Jeff’s puzzle was formulated this way: “Let’s say, hypothetically speaking, you met someone who told you they had two children, and one of them is a girl. What are the odds that person has a boy and a girl?”

(My old statistics exercise was given in a form that’s harder to grasp intuitively: You’re on a game show, three closed doors in front of you. You know that behind one of them is a car and behind the others are goats. If the door you’ll be opening has a car behind it, you get to keep the car; if it’s a goat, you lose. You pick a door, but don’t open it yet. Then, the game show host opens a different door, one you haven’t picked, and reveals that there is a goat behind it. You now have a choice between sticking to your original selection and switching to the other still closed door. What should you do? We didn’t have Wikipedia then, or Google.)
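For the goat version, a simulation of the kind some commenters wrote takes only a few lines. This is a minimal sketch in Python, assuming the host knows where the car is and always opens a goat door that the contestant did not pick. The function names, the trial count and the fixed random seed are arbitrary choices made for illustration.

```python
import random


def play_round(rng: random.Random, switch: bool) -> bool:
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of games won when always switching or always sticking."""
    rng = random.Random(seed)
    wins = sum(play_round(rng, switch) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stick: ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Over many rounds, sticking wins about one game in three and switching about two in three.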

Intuition of probabilities is a tricky thing. There are studies that show, for example, that people’s intuitive judgments are much more often correct if they are asked to reason about frequencies (how often something will happen, out of a number of repeated tries) than in terms of probabilities (a number between 0 and 1) — even though both are mathematically equivalent.

It’s the same for Jeff’s problem. My reaction to reading it was “well 50% of course… one second… there’s something fishy about it… YOU’RE DOING IT WRONG… “. Turns out, how easy it is to see the correct answer depends on what sort of interpretation you put on the original formulation. So let’s take this apart and develop the puzzle into a longer narrative:

  1. “You’re a teacher, and overhear a father enrolling his two children, Sam and Robin. You didn’t catch all of the conversation, so you don’t know the sex of the kids. However, the father is picking up some leaflets for after-school activities and you hear him exclaim ‘You have a girl geek computer camp? Great, I’ve got someone at home who will love this!’ What are the odds that this father has a boy and a girl?” The solution is straightforward: One of the two, Sam or Robin, is a (computer-loving) girl, and for the other we don’t know — they may be male, or dislike computers, or the wrong age for the group, or not interested for any other reason. The possibilities are: Sam is a girl, and Robin is also a girl; Sam is a girl and Robin is a boy; Sam is a boy and Robin is a girl. As the probability of a random child being male or female is roughly 50%, these three combinations are equally probable. Two of the three combinations are mixed sex. Therefore, the odds of the father having a boy and a girl are 67%, approximately. This is the intended correct interpretation and solution of the puzzle.
  2. A lot of readers, however, form a different mental representation of the problem, which translates into a scenario that’s subtly different from the above: What you hear the father exclaim is “You have a girl geek computer camp? Great, my Sam will love this!” In this case, you know which of the children can be identified as known to be female, so the possible combinations are reduced to: Sam female – Robin male, and Sam female – Robin female. 50% odds of the family having a boy and a girl. However, nothing in the original formulation indicates that you know which kid is female, only that (at least) one of them is. This is why this solution is incorrect.
  3. There is yet another, “common sense, normal English” interpretation: If I meet someone and they tell me, outright, they have two children and one of them is a girl, I can pretty much assume that the other one isn’t. This, of course, would make this not a maths puzzle, but a trick question, with 100% odds that the family has two kids of different sex. Correct or incorrect? You decide. (I’m leaning towards “correct”.)
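The first two interpretations can be checked by brute-force enumeration. Below is a minimal Python sketch, assuming boys and girls are equally likely and the two children are independent; the names are made up for illustration, with position 0 standing for Sam and position 1 for Robin.

```python
from fractions import Fraction
from itertools import product

# The four equally likely families, written as (Sam, Robin).
FAMILIES = list(product("BG", repeat=2))


def probability_mixed(condition) -> Fraction:
    """P(one boy and one girl), given that the family satisfies `condition`."""
    matching = [family for family in FAMILIES if condition(family)]
    mixed = [family for family in matching if set(family) == {"B", "G"}]
    return Fraction(len(mixed), len(matching))


# Interpretation 1: we only know that at least one child is a girl.
at_least_one_girl = probability_mixed(lambda family: "G" in family)

# Interpretation 2: we know that Sam, specifically, is a girl.
sam_is_a_girl = probability_mixed(lambda family: family[0] == "G")

if __name__ == "__main__":
    print("at least one girl:", at_least_one_girl)
    print("Sam is a girl:    ", sam_is_a_girl)
```

The only difference between the two calls is the condition, which is exactly where the two mental pictures diverge: three families survive "at least one girl" (two of them mixed), but only two survive "Sam is a girl" (one of them mixed).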

Our frequently wrong intuition about problems like this, even among people with mathematical training and jobs that require formal reasoning, is an interesting feature of cognitive psychology. The Wikipedia article about the “Monty Hall problem” (the one with the goats) contains and links to fascinating material. Jeff’s commenters also reveal the pitfall in their mapping of the formal problem to a mental picture:

So are you actually saying that once you’ve had one girl child the odds of having a boy increase????

This example is rubbish. The question is simply “What is the probability that my other child is a boy”. The answer is 50%. The sex of the first child has no bearing on the sex of the second.

If the person has two children, there are three possible combinations of gender: BB BG GG. If we rule out BB that leaves us with two options — BG and GG — which results in 50% chance of the other kid being a boy.

Solution is flat out wrong, it assumes a dependency that does not exist. If I flip a coin, its a 50% chance of being heads, and a 50% chance of being tails. This is independent of, and completely regardless of the outcome of a previous coin toss. This is exactly the same. Having one child that is a girl has absolutely no bearing on the sex of the other child, its still the same completely random 50% chance for each sex.

I’ve just written some code to test this out, and in the procces (and result) I’ve come to the 2/3 chance (25% no girls, 50% boy+girl, 25% boy boy) conclusion. […] I’m still quite confused by the implications. It feels like I’m being told that past events influence future events.

I’d say it’s 50%. If you have two children, you have BB, BG, GB, or GG. One of them is a girl then it’s not BB. When you mention that one of the children is a girl it’s 25% chance you are talking about a girl in BG, 25% a girl in GB and 50% (25%+25%) a girl in GG. (Four girls and you’re talking about one of them with equal probability.) Then chance to have a girl and a boy is 50% because it’s either BG or GB, not GG.

Fundamentally, analysing a statistical problem in terms of equally likely, atomic combinations is not something that comes naturally to us.

The state of the beast

This site is slowly getting close to being ready to launch. I just upgraded it to WordPress 2.7, and to my dismay the upgrade didn’t go entirely smoothly. The first problem, PHP errors all over the place, seems to have been caused by an incomplete FTP transfer. The second, after carefully re-uploading all parts, was more severe: suddenly the non-ASCII characters, such as the ß in the blog’s name, were broken. I fixed it the same way that also fixed a similar issue that appeared for many blogs with the upgrade to 2.2: by removing this line from wp-config.php:

define('DB_CHARSET', 'utf8');

What makes it rather worrying is that this blog started out as a WP 2.6 install with correctly set up UTF-8 tables. It’s too late in the day to investigate further, though, and I’m glad we seem to be at least visually ok now.
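Whether this is what happened here is unverified, but one common way for characters like ß to break is a mismatch between the encoding the bytes were written in and the one they are read back in. A minimal Python sketch of that general effect, assuming UTF-8 bytes being misread as Windows-1252 (the variable names are just for illustration):

```python
original = "ß"

# UTF-8 stores ß as two bytes.
utf8_bytes = original.encode("utf-8")

# Reading those two bytes as Windows-1252 yields two unrelated characters.
garbled = utf8_bytes.decode("cp1252")

# As long as nothing was lost, reversing the wrong step restores the text.
repaired = garbled.encode("cp1252").decode("utf-8")

if __name__ == "__main__":
    print(utf8_bytes, garbled, repaired)
```

If the garbled form gets written back to the database, every further round trip piles on more damage, which is a good reason to stop and check the table and connection settings before republishing anything.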

Bilingualism FAIL.

Is it futile to blog a phenomenon already noted by BoingBoing and Language Log? Probably, but does it matter?

As reported by the BBC, Swansea council neglected one of the basic principles of multilingual publishing: employ competent proofreaders for each of the languages you’re publishing in. Even if you have an in-house translation service, as Swansea council does.

When officials asked for the Welsh translation of a road sign, they thought the reply was what they needed. Unfortunately, the e-mail response to Swansea council said in Welsh: “I am not in the office at the moment. Please send any work to be translated”.

The English is fine; the Welsh is an out-of-office reply

The only similar example I’ve recently seen was the Chinese dining hall (located on the Beijing-Taiyuan expressway) that was advertised as “Translate server error” on a billboard.

The Chinese reads "dining hall"

Here in the West, we like to make fun of the sometimes misguided Chinese efforts to adopt English in public signage alongside the local language. And face it, they are funny. What we may be forgetting is how easy it is to fall into the same trap if you have similar requirements, as is the case for public officials in bilingual areas, who are likely to have a legal duty to promote languages they may not, themselves, master.

Nudibranch and emperor

Malcolm Hey: Reclining emperor shrimp

While I was looking through a stack of postcards, this image from London’s Natural History Museum jumped out, so I’ll be using it today to write to a friend. The picture was taken by Malcolm Hey, is entitled “Reclining emperor shrimp” and won a Wildlife Photographer of the Year award in 2005.

The textures, the colour, and generally the calm and sense of whimsy that emanate from it are what make this piece of photography so attractive. But what triggered my posting this right now is the explanatory paragraph on the back of the card, which starts as follows:

Twirling and whirling in a crimson leotard and white tutu, the Spanish dancer (a large nudibranch, or seaslug) emerges to feed at night. Sometimes it has a passive partner, an emperor shrimp, tucked in the frilly folds of its gills. The tiny shrimp (about a centimetre – 0.4 inches long) turns red to blend in with its host’s costume.

Nudibranch is a great word.

(So is seaslug.)

Quote of the day: Polaroid ad on technology

From a 1972 marketing film for the Polaroid SX-70 camera. The promotional video was made by Charles and Ray Eames:

you can look at technology as a living tree: the trunk bearing branches, the branches leafing out. or you can see it as a net: each knot tying up threads from many sides. but the human reality is more intricate than either one. we have been looking at one invention that began pretty purely out of the conception of a need: the hope to change the person who takes pictures from a harried, off-stage observer into someone who is a natural part of the event. no single thread wove this invention. not lens, not moving mirror, not film chemistry, not clever circuits. they are coordinate: parts of a single strategy, working together to protect and fulfill the original hope. this invention is finally a system. call it a system of novelties.

but even that is not enough. the camera enters the real world only once it is precisely manufactured in quantity. that process, too, reflects a civilized concern. it has its visual beauty. it rewards skill and care with immediate feedback. in the end, it links the inventors, the engineers, the workers, the distributors into one chain of craftsmanship. the user is the final link. the device helps meet the universal need to do things well. it offers as a matter of course a tool for supplying a rich texture to memory. more than that, thoughtful use can help reveal meaning in the flood of images which makes up so much of human life. We hope the user will fully complete the chain […].

The excerpt starts at 7:41. The length of the film is 10:51. The entire thing is well worth watching, as a study in (granted, promotional) technology communication.

Hat tip: Jeff Shaumeyer on Bearcastle Blog.