I have often become confused, angry or cynical over the past few years when seeing self-professed “open source users” with Macs on their desks, or using R under Windows. I once had a discussion with a Linux user group about which laptop to buy: when many had said my laptop was “under-powered” I pressed them and found out that they meant it would have been slow running Windows. Contributors to help forums and on IRC have often assumed that my machines dual-boot Windows and GNU/Linux: “Can you see the partition when you boot into Windows?” I have also seen the insistence, or mere suggestion, of calling the operating system I’m using “GNU/Linux,” instead of Linux, dismissed as “zealotry,” or “mere semantics.” I became angry because I assumed that everyone in these situations had heard of the values of freedom embodied by the GNU project and had rejected them as unimportant. How could freedom possibly be unimportant? What could be more important to Americans, other than money?
There was another possibility that I only considered for a few seconds at a time, but it’s now becoming clear that this possibility is more feasible: these people have never heard of the GNU Project, or the Four Freedoms, or Richard Stallman. They have never heard of the true benefits of software freedom, the dangers of proprietary software, or the full breadth of freedom that is possible. If they have heard of it, perhaps they did dismiss it without thinking it was possible: perhaps software freedom is, to most people, an urban legend. This seems strange, since I came to free software by reading about it on Wikipedia and gnu.org and my interest was primarily motivated by (a) freedom and (b) the possibility of having a Unix-like system to work on. The fact that it was free to download and install merely removed the barriers to enacting those freedoms.
The barrier to my own belief that people have just never heard of freedom is that it seems to me that all systems (in fact all things) are imperfect. We all know how imperfect Windows is, and I got annoyed as hell using a Mac, so as much as its devotees attest to its perfection, it’s not perfect for everybody. However, people complain the most about the imperfections of Linux[sic]. Perhaps this is because they can, as in if they complain, someone will do something about it eventually. With Windows and MacIntyre, you have to get fifty million corporate employees to complain, whereas with free operating systems, you can be just one guy and raise a huge stink about how the buttons on the top of the windows are arranged all wrong (of course, the other advantage is that somebody can explain to you how that’s your fault). Despite the lowered barriers to complaints, I always had the feeling that people were complaining because they feel like GNU/Linux is just not “professional,” or “slick” because it’s not purveyed by a huge corporation. Therefore they complain about all kinds of things that really aren’t important to me.
Nevertheless, you still get people promoting the hell out of Linux[sic]. I could never understand why. Take NixiePixel for example, a YouTube personality who promotes primarily Ubuntu and Linux Mint. I really thank her for doing so, because whether she likes it or not, she’s promoting freedom: better that people have it and not know it than not have it at all. However, she never says why she’s promoting these alternatives. Why is it better to use Ubuntu than Windows, particularly if there aren’t the same games available for it? She even has a new series called OSAlt where she discusses and rates “open source” alternatives to non-free programs. Again the question is why? Is “open source” inherently better for users somehow? I suppose in some ways it is, but how?
This is so puzzling because for me, without freedom, everything comes down to your personal choices. No computer operating system, no anything, is going to work well, or even comfortably for anybody. Life just doesn’t work that way: nothing “just works.” So why promote one alternative over another? Freedom is the only motivator to use GNU/Linux that stands that test. The freedom leads to a lot of nice by-products, but freedom is the prime mover. Some users may not have a choice of what to use; they may have to use a proprietary system at work, and not have time to learn to use something else at home. Additionally, some users like NixiePixel will be unwilling to embrace a campaign for freedom because considerations of freedom are intensely personal at the same time as “political” and the possibility for insulting people is pretty high. There is also a lot of angry, cynical behavior in the open source and free software worlds. That’s bound to happen whenever a community is composed of human beings instead of marketing personnel.
This is why it’s so crucial to let people know about their freedom at every possible opportunity, i.e. every time you mention the system. I know that “GNU/Linux” is a mouthful, but it’s too easy for people to hear about “Linux” and not know there’s anything special about it except that nerds like it. I myself had heard of “Linux” for years before I knew that it was free of charge, much less free-as-in-freedom (FAIF). There’s too much possibility that people will hear of “Linux” and just think it is another operating system. Or, they may get sucked into using non-free software by the “nerd-allure” of it.
Take Android for example: Android is a Linux system, but it only took me a few minutes of using my dad’s Samsung phone to see that Android is not a freedom-respecting system. None of the values of the free software movement were respected in its interface or its operation. There weren’t even subsidiary values (those by-products I mentioned), like organization, clarity and standards. There was an avenue for spam and advertising that was pretty well-lubricated, but the only reason I saw for using the Linux kernel was that it’s adaptable to many devices. After playing Angry Birds for a few minutes, it became clear to me why it’s important to call the system I’m using now GNU/Linux: it’s accurate, and it promotes a mission that is in line with my values. As often as I can inform people of their possibility for freedom in technology, I will do my best.
For more on these issues, you can read The GNU/Linux FAQ
Recently I decided that maintaining my homepage in HTML was getting too laborious: the primary problem was things like lists and hierarchies. I have used Emacs’ Org Mode for my daily agenda for almost five years, and decided that it was the right tool for organizing these structures. Org Mode allows you to view “your life in plain text,” which is, of course, the most versatile way to do so. What Org can also do is export your hierarchical documents to HTML, LaTeX and many other formats (including formatted ascii, which is very nice). Along with this is the feature org-publish that uses Tramp to transfer a set of exported HTML files (and other files) to another location.
Read the Org Manual’s section on org-publish: you can find a simple example there. A single variable called `org-publish-project-alist’ configures all the stuff you need to publish an entire website. Here’s mine:
(setq org-publish-project-alist '(("mysite" :base-directory "~/Documents/web/" :base-extension "org" :recursive t :section-numbers nil :table-of-contents nil :publishing-directory "/ssh:email@example.com:~/public_html" :style "") ("imgs" :base-directory "~/Documents/web/imgs/" :base-extension "jpg\\|gif\\|png" :publishing-directory "/ssh:firstname.lastname@example.org:~/public_html/imgs" :publishing-function org-publish-attachment :recursive t) ("etc" :base-directory "~/Documents/web/" :base-extension "css\\|bib\\|el" :publishing-directory "/ssh:email@example.com:~/public_html" :publishing-function org-publish-attachment) ("docs" :base-directory "~/Documents/web/docs/" :base-extension "html\\|tex\\|bib" :publishing-directory "/ssh:firstname.lastname@example.org:~/public_html/docs" :publishing-function org-publish-attachment) ("thewholedamnshow" :components ("mysite" "imgs" "etc" "docs"))))
After a few days of having this in my .emacs I decided this needed its own file, which I called “project.el” and placed in the home directory of my project.
Each one of the members of this list is a “project.” Projects can include other projects by using the “:components” property. Suppose my website’s files are in the directory “~/Documents/web/”. This is where I keep the actual org-mode files, css files and any other files I want to publish. The property “:publishing-directory” puts the exported files in the specified location, which is a tramp url. The trick is really the property “:publishing-function,” which tells `org-publish’ how to treat the files. If left blank, this will translate the files into HTML. For .css files and other stuff you might link to (e.g. my .bib or tex .files, or images) you can use the function `org-publish-attachment’, which does no translation.
The crucial part of this variable is then the last “project,” which has only a “:components” property. This includes all the other projects, and hence when I publish “thewholedamnshow” using `org-publish’ my entire set of files is exported and uploaded.
Now I have all the sources for my website in one directory. Before I had used a highly hierarchical setup that made links very complicated. After realizing that I didn’t have actually that much content, I now have all the org files in the toplevel directory, with two subdirectories: one for images and one for special documents that are not in Org Mode. These are essays or LaTeX documents that are already finished works and I do not expect to change them.
I keep all the Org Mode source files in Bazaar. This greatly simplifies things. With project.el included along with the website, I can work on this on any machine as long as I evaluate that variable before I upload using `org-publish’.
A huge advantage is that now everything (including my CV, publications, and my ever-expanding academic FAQ) is in Org Mode. This means that changes are super-easy, even structural changes that I wouldn’t have attempted with HTML. So now when I need to update my CV, or add an FAQ, all I do is edit in Org Mode, something that I am very familiar with because I do it most of the day every day. I actually just categorized my FAQ using Org Mode in a matter of minutes. Linking with Org Mode is also incredibly easy, and the exporter knows how to handle links to files, headlines within files and internet urls. Also, since these documents are now in Org Mode, if someone wants a PDF version, all it takes is a few keystrokes to produce it.
The weirdest thing is how easy this is once I figured it all out. After only a week of tinkering, I now have a website that I can update or make major changes to in a matter of minutes. It looks better, is easier to maintain and easier to configure.
I had another interesting breakthrough yesterday with regard to how I think about programming, or rather creating applications using programming. I’ve learned over the past couple years of creating an application that seemed simple at the outset (a simple number-crunching program!) that the really hard parts of programming are not the parts that people typically write about, and the really challenging parts are things that are obvious, but nobody talks about those things, perhaps because they are so challenging. This brings me to another question about how I should handle those really hard parts.
Yesterday I toyed again with the idea of learning C++, and decided against it yet again. I’d heard that C++ had a number of tools that are good for numerical programs, like vectors, and I recently heard of a new matrix library called Eigen. However, I’ve avoided C++ because I still believe object-oriented programming is one of those bad habits people pick up in programming classes, and it didn’t seem to offer any advantages over C. C is good; I mean C is The Right Thing.
There was still something else nagging me, however. When you read the books with the trivial examples that you could do easily with a pocket calculator, they don’t match up with the way the language is designed. Even books on Haskell don’t seem to be saying as much as they should about what’s really important in designing an application. That was the realization: what you’re supposed to do with a programming language is build an application. The “program” part (that is, the algorithm) is really immaterial. After exploring many programming languages, I have found that with few exceptions there are very few that really differ in their offerings for completing algorithms. You can write the same algorithm in almost any language and have it perform pretty well on most hardware these days. So what’s missing? What are all the manuals full of? Why does every programming language have a preoccupation with strings?
Let me use an analogy: I play the banjo, and took over a year of lessons, read tons of books and have probably spent over 3,000 hours playing and practicing, and even after getting in a really good band, having great people to jam with, and practicing really well, there was still something about playing that was so difficult. I just kept saying “I don’t know what to play,” or “I can’t make the notes fit there!” After I started graduate school and my second son was born, I needed to shift back to listening and if playing, playing a quieter instrument, so I started doing things I’d never done with my banjo using a guitar: playing scales, picking out melodies, and listening very carefully to my favorite guitar players. Listening to Steve Stevens, Jerry Garcia, David Gilmour and Kurt Cobain, I noticed something: these guys don’t play notes, they play phrases.
Why had absolutely no one mentioned playing phrases to me? Was I not listening? Did no one just say “Melodies, counter-melodies, rhythms, etc., i.e. music (dude!) is composed of phrases. You can construct phrases in many ways, but the key is punctuation.” When I learned to play the banjo, I learned the punctuation marks (licks). I learned how to move my fingers, and I learned chord formations. But I never learned the fundamental thing about music is phrasing. After I figured this out my brother told me how a famous drummer sat him down at a workshop and pointed his finger saying “One thing is important: phrasing.” Luckily this was when my brother was fifteen. Since I’m not a pro like him, I can understand why I didn’t get that opportunity, but still come on! This is hugely important. Why did nobody mention it?
And why has nobody mentioned, in any programming book that I’ve ever found that the crucial thing — the hard thing — about designing a program is the user interface. There are books about user interface, certainly, but they are concerned with superficialities of user interface, like what color the frame should be. Who cares? The difficult part is deciding how your program should interact with its user. Eric Raymond does spend a whole chapter on this, but he doesn’t start with it. I’d like to read a book that starts with “You can figure out all that stuff about your algorithms: you have the equations, you have the data structures, you know what it’s going to do; spend time thinking about how a user would get it to do that well.”
So my realization yesterday is that the reason the C standard library is full of string functions, the reason Lisp programmers are so concerned with reading program text and the reason that there are so many programming languages and libraries and plugins is that the really hard part is between the user and the algorithm. My inclination is to say that the simplest interface is best. The simplest interface would be “stick something in and see what comes out.” That’s called Unix. Even in Unix you can’t just do that: you have to mediate somehow between the algorithm in a world of numbers, and the user who lives in a world of text. This is easiest on Unix, but it’s still not easy.
There are other schools of thought: your user interface should be a pane full of buttons and pretty colors to dazzle your user into thinking they’re doing something useful, or a monolithic shell that does everything with the computer. I don’t really buy either of those things, because I know how to use stream editing and Make to tie things together. However, sometimes I need a program that I don’t have to re-run all the time. I would like something in between: something where I can run a simulation, look at the results, then tweak it a little and run it again, then set it up into batch mode to produce a huge pile of results that I can analyze. There’s no reason that all has to be in one huge program, it could be several, but the point is that the algorithm contained in there would be the same for all those steps. There are languages like this, such as R, Octave and Scilab. However, I don’t like programming in any of their languages. Maybe I can come to like it since they make the hard parts easy.
The approach I should take with my next program is “How do I write a language for running a simulation?”
This weekend I’ve made trips to two events that really got me thinking about who we should promote free software to. The first stop was the Durham Farmer’s Market, and the second was a benefit concert for a cooperative preschool in Chapel Hill. I have been thinking for a long time about the “organic food crowd,” particularly because I’m a biology graduate student, and most of my fellow graduate students buy organic food or shop at farmer’s markets. They seem to have values in common with me, yet few of them use free operating systems. A lot of my fellow graduate students know about certain free software, like Firefox, R and Python. However, mostly they use Window$ or McIntyre operating systems.
I really think somebody needs to get the idea of free operating systems to people at the Durham Farmer’s Market, Whole Foods and events like the concert I just attended. Obviously that could be me, and I could just go and talk to the vendors at the farmer’s market. That would be easy. There are a few problems, chief among them the assumptions I’m making. I automatically assume that these people who I seem to have a lot in common with are very different from me. I assume that they are making their decisions from a fashion-inspired reflex. I think I feel this way because I have come to my own values my own way, and not because of fashion. However, I know my conclusion is not justified. I don’t actually have good data about the “organic food people,” and probably at least ten percent of them do indeed use free software. Probably more than ninety percent of them at least know about Firefox, even if they don’t know what’s actually good about “open source.” I do know that pretty often I see cars like the one I saw driving back from Chapel Hill: bumper stickers saying “When words fail, music sings,” alongside an Apple sticker.
The other problem is just what I would say to them? Would I recommend a particular distro? Would I recommend that they read GNU Philosophy? Would I recommend that they learn about the issues on Wikipedia? These were all helpful things for me. However, it’s best to get across the ethical essence of the idea by simply giving people a persuasive argument. That almost always gets people’s attention, but you need to give them at least a step to get going. Another good first step is to recommend the film Revolution OS, but that’s starting to seem a bit dated. Perhaps it’s time for another documentary, like Patent Absurdity.
The third challenge is to remember is that promoting freedom is not a race to get the most users. People in the software press seem to always be concerned about numbers, about “desktop share,” and about “killer apps.” That’s really not the point. The point is to demonstrate that ethical motivation is enough to create a working operating system. In other words, whether the GNU/Linux operating system was created for freedom and fun, it was not created for money. Often the first thing people tell me when I give them my persuasive argument is “but programmers have to make money!” as if money were the only reason that anybody ever does anything. The point of free software (and Wikipedia) is to show skeptics that there are people who have different values.
Ultimately, I believe, that ethical motivation will prevail and one way or another, whether they know it or not, people will end up using ethics-promoting software. It doesn’t matter how many Windows users we “convert” or how many Mac users we tell the truth about much of the software they’re using. It doesn’t matter that we “conquer the world” or anything like that. What matters is that those of us who care about our freedom now do what we can to continue to improve our ability to live our lives without using ethics-compromising software. The more we can do that, the better demonstration we make to people who finally decide that they want to make the effort to preserve their freedoms. We will do our best, and others will see it and make their decision.
Most people think “programming is for programmers,” and by “programmers” they mean people who earn a living writing software, i.e. “end-user” software: software that people will buy, or that will be used in some big company. However, recently I’ve overheard a lot of talk from people in the business world about what those large companies do, and much of it sounds like it could be done by simple computer programs. The problem is that people don’t learn programming, nor do they learn to think of their problems as amenable to programming. I surmise that for most people, a programming solution doesn’t even enter into their thinking.
At a recent breakfast conversation, my brother told me that at his company most of the problems that come up result from people not thinking of something if a notification doesn’t come up on their computer screens and ask them. Even if they know there’s a problem, they won’t do anything about it if they don’t see it right there in front of their faces. They won’t even get up and walk five feet over to the guy in charge of that problem to ask him. These people and their tasks could be replaced with simple programs. He also told me that the corporation he works for uses none of the operations research or systems design theory that he learned in business school. Everything is just left up to guessing at the best solution and sticking with it for years, regardless of how much it costs or the alternatives.
I also sat next to some people in the airport who were using Excel and mentioned SAP (which my brother tells me is basically a money-milking machine; the companies who buy it are the cows). One of these people said her current project was “organizing [inaudible] into categories based on characteristics, including horsepower…they weren’t labeled with horsepower.” She was doing it “by hand.” One of my missions in my last job and in graduate school is to intercede whenever I hear the phrase “by hand.” We have computers. “By hand” should be a thing of the past. This young woman apparently didn’t think of her task algorithmically. Why would she when it’s unlikely any of her education included the word “algorithm?”
These patterns highlight the current failings of commercial computing. Commercial computing has one goal: sell computers. For most of the history of computing, this approach has been focused on hardware, but now people mostly see it as software. Commercial computing’s current goals are to sell people software as if it were hardware and then walk away, wagging your finger when the customer comes back complaining that it doesn’t work. Eric Raymond calls this the “manufacturing delusion.” Programs aren’t truly manufactured because they have zero marginal costs (it costs only as much to make a billion copies of a program as it does to make one copy). Commercial computing focuses on monolithic hardware and software, i.e. programs that try to do everything the user might need, and funneling everyone’s work through that program. That doesn’t work.
Academic computing, on the other hand, has the perspective that if something doesn’t work the way you need it to work, you rewire it, you combine it with something else, or build onto it so that it will accomplish a specific task. People literally rewired computers up until twenty-five years ago, when it became cheaper to buy a new machine (if anyone can correct me on that date, please let me know). Similarly for software, if the software you have doesn’t do the job you need, you write the software to do the job you need. If you have several programs that decompose the problem, you tie them together into a workflow. Suppose you have a specific problem, even one that you will only do once, and might take you one day to program — potentially saving you a week of “by hand” — then you write a program for it. Then if you ever have to do it again, you already have a program. You might also have a new problem that is very similar. So you broaden the scope of the previous program. Recently I wrote a script that inserted copyright notices with the proper licenses into a huge number of files. I had to insert the right notice, either for the GPL or All Rights Reserved based on the content of those files. On the other hand, if you have a program that generally does what you want, e.g. edits text, and you want it to do something specific, you extend that program to do what you need.
Basically I see a dichotomy between the thinking that certain people should make money, and solving problems only to the extent that solving their problems makes those people a lot of money, versus actually solving problems. If you disagree that this dichotomy exists, let me know and I’ll show you academic computing in action.
The solution for all these problems is teaching people to think algorithmically. Algorithmic thinking is inherent in the use of certain software and therefore that software should be used to teach algorithmic thinking. Teaching people algorithmic thinking using Excel is fine, but Excel is not free software, and thus should not be considered “available” to anyone. Teaching these skills in non-computer classes will get the point across to people that they will be able to apply these skills in any job. Teaching this to high school students will give them the skills that they need to streamline their work: they will be able to do more work, do the work of more people, communicate better and think through problems instead of just sitting there. People will also know when someone else is bullshitting them, trying to sell them something that they don’t need. Make no mistake, I’m not saying that teaching programming will get rid of laziness, but it might make it a lot harder to tolerate laziness. If you know that you can replace that lazy person with a very small shell script then where will the lazy people work?
If you teach biology, or any field that is not “computer science,” then I urge you to start teaching your students to handle their problems algorithmically. Teach them programming! I am going to try to create a project to teach this to undergraduate students. I have in mind a Scheme interpreter tied to a graphics engine, or perhaps teaching people using R, since it has graphics included. Arrgh…Scheme is just so much prettier. Teaching them the crucial ideas behind Unix and Emacs will go a long way. Unix thinking is workflow thinking. Unix (which most often these days is actually GNU/Linux) really shines when you take several programs and link them together, each doing its task to accomplish a larger goal. Emacs thinking is extension-oriented thinking. Both are forms of algorithmic thinking.
If you are a scientist, then stop procrastinating and learn a programming language. To be successful you will have to learn how to program a computer for specific tasks at some point in your career. Every scientist I know spends a huge amount of time engaged in programming. Whenever I visit my grad student friends, their shelves and desks are littered with books on Perl, MySQL, Python, R and Ruby. I suggest learning Scheme, but if you have people around you programming in Python, then go for it. I also suggest learning the basics of how to write shell-scripts: a lot of people use Perl when they should use shell-scripts. Learn to use awk, sed and grep and you will be impressed with what you can do. The chapters of Linux in a Nutshell should be enough to get you going. Classic Shell Scripting is an excellent book on the subject. Use Emacs and you’ll get a taste of just how many things “text editing” can be.
Every profession today is highly data-oriented. Anybody hoping to gain employment in any profession will benefit from this sort of learning. Whether people go into farming, business, science or anything else, they will succeed for several reasons. There are the obvious benefits of getting more work done, but there are also social reasons we should teach people algorithmic thinking. The biggest social reason is that teaching algorithmic thinking removes the divide between “customers” and “programmers.” This is why it bothers me to hear “open source” commentators constantly referring to “enterprise” and “customers” and “consumers.” If a farmer graduates from high school knowing that he can program his way to more efficient land use, then he will not be at the mercy of someone who wants to sell him a box containing the supposed secret. Again, you can teach algorithmic thinking using Excel, but then you already have the divide between Microsoft and the user. With free software that divide just doesn’t exist.