Birch Street Computing

about me

John M is a Linux fan in Lowell, MA.

I work at a company writing software. I fool around with Free and Open Source software for fun & profit.

I was big into Last.fm and you can still see some of what I listen to there. I can also be found on GitHub, more recently on SourceHut, and (historically) on Bitbucket. I don't care for most popular social media sites. If I have an account on one or the other, it's probably either old and unused or was created just for tinkering.

promo

Links to things I like, use, or otherwise feel are worth sharing. Things I'd like to see get more popular.

Why Thee Kay

Well, Python 3000 is done and 3.0 has been released. I like a lot of the changes that have been made to the language. My favorite is probably the change to make print a function: I've run into many situations where I would have liked to create a class with a print(...) method but had to settle for a less convenient name.

But today, I'm not here to be positive. I'm going to gripe! Or at least express a little bit of skepticism. So far there are two things about the change that I don't quite like. First, there is the removal of the callable(...) function. This is minor, but I really don't get it. Maybe I'm weird in that I was using it pretty often. But it's easy to deal with: I can just create my own utility function that does the same thing.
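A utility function along those lines might look something like the sketch below. The name is_callable and the Greeter class are made up for illustration, and this is only one way to approximate the old builtin: it checks whether the object's type (or one of its base classes) defines __call__. The builtin itself was later restored in Python 3.2, so a helper like this only matters on 3.0 and 3.1.

```python
def is_callable(obj):
    """Rough stand-in for the callable() builtin that Python 3.0 removed.

    Hypothetical helper: it looks for __call__ in the class hierarchy of
    obj, which is where Python itself looks when you write obj(...).
    """
    return any("__call__" in vars(klass) for klass in type(obj).__mro__)


class Greeter:
    """Illustrative class whose instances can be called like functions."""

    def __call__(self):
        return "hello"
```

With this in place, is_callable(len) and is_callable(Greeter()) are true, while is_callable(42) is false.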

Less easy for me to just deal with is the change to unicode strings everywhere. What bugs me isn't that unicode is used as the default, but that "regular" byte strings have been sort-of deprecated. sys.argv, sys.stdin and sys.stdout are now "text" based, which really doesn't make sense on Linux/Unix systems. How the hell does the language know that I'm not piping in binary data? I wouldn't be bothered as much if there were a bytes equivalent open by default (rawstdin or some such). I guess I can do something like "os.fdopen(sys.stdin.fileno(), 'rb')", but that seems quite clunky. The old model was simple on unix-like systems; it was obvious and straightforward.
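For what it's worth, the text streams in Python 3 are wrappers around a binary stream, which is reachable through the wrapper's buffer attribute (sys.stdin.buffer, sys.stdout.buffer). That is not quite the rawstdin wished for above, but it avoids the fdopen dance. Here is a minimal sketch of a bytes-in, bytes-out filter built on it. The helper names binary_view, copy_bytes and main are invented for this example, and the fallback in binary_view assumes that a stream without a buffer attribute is already binary.

```python
import sys


def binary_view(stream):
    """Return the underlying bytes stream of a text stream.

    Assumption: a stream with no .buffer attribute is already binary,
    so it is returned unchanged.
    """
    return getattr(stream, "buffer", stream)


def copy_bytes(src, dst, chunk_size=65536):
    """Copy raw bytes from src to dst without decoding anything.

    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty bytes object means end of input
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def main():
    # A "stupid" cat: whatever bytes arrive on stdin go to stdout as-is.
    out = binary_view(sys.stdout)
    copy_bytes(binary_view(sys.stdin), out)
    out.flush()
```

Calling main() from a script gives a filter that passes arbitrary binary data through a pipe untouched, whatever the locale's encoding happens to be.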

I hadn't thought much about the issue until I read some messages by Matt Mackall, the creator of Mercurial, who was complaining about the change. Personally, I would've gone with byte strings that always have an attached encoding, defaulting to UTF-8 everywhere. :-) The fact that there is more than one way to represent unicode "text" makes it hard to know what the original byte stream was. This is too bad, because it's nice to be able to write stupid programs.

Every blog page or article on this site is available under the CC-BY-SA license unless otherwise noted.