Sunday, October 12, 2008

Ideas for Swarm

For this episode, a little less marketing, a little more geek.

Ian Clarke has some interesting results with a new distributed programming language called Swarm. As the founder of Freenet and other large-scale computing systems, he has opinions on the subject that are not idle thoughts.

He does a fine job of explaining Swarm with preliminary results, so I'll limit myself here to a pile of off-the-cuff reactions:

  1. If keys in the store are an MD5 or SHA1 of the content they are storing, caching data across nodes becomes trivial. If stack state is stored like this as well (compute only on send?) and chained (one stack frame per hashed data element, each with a "pointer" to the hash of the next stack frame), much of the transmission of continuations can also be avoided, because a node never needs to be sent a frame it already holds.

  2. The graphics are the compelling thing. It's what separates this from other projects. Emotional? Yes, but adherence to a programming language is as much emotion as anything else. I say build visualization into the language itself. Make it trivial to produce. This is what everyone's going to share and talk about.

  3. Use the JVM model; don't invent a new VM. But you need full control. So, start with a JVM interpreter written in Java (see joeq or jikes or write your own using ASM) which you could then modify. Then your new language inherits all existing Java libraries. You have to inherit a bunch of libraries, at least at first!

    This also opens the door for interesting optimizations, like identifying short stretches of bytecode where break-for-continuation is not permitted, breaking those out into dynamically-written subroutines, and allowing the underlying JVM to JIT it.

  4. Scala might be fun to learn, but if this project gets going it will be hard enough to root out the bugs without the underlying language also being riddled with bugs!  Not to mention the extra barrier to entry ("Wait, I have to learn Scala first?").

  5. How does user interaction work when the execution is moved? Even something as simple as a command-line, much less a GUI. Doesn't this imply that at some point in the execution stack you have to return to the original machine?

    (More reason to use Java directly -- bridge between distributed-mode and local-mode for the non-distributed part of the work.)

  6. Same question with external resources.  File system is easy, but what about a TCP connection or a database connection?  How are those shared across machines?  Or do you need a way to say "Send the execution to this specific node, the one that houses this resource?"  Maybe with an instruction that says "When this routine completes, redistribute this execution."  Maybe that instruction has a back-pointer to the original executing node, not requiring you to return there (i.e. what if that node is now overloaded?) but suggesting it, since that node does have all the necessary data cached.

  7. In Java some critical, tight, high-performance routines are in C; in Swarm perhaps tight routines can be in Java!  Java Annotations might be a way to specify "don't distribute" on a method.

  8. If you base on the JVM and use Annotations, perhaps existing code could be ported with no alteration! Or you can mix Swarm and plain Java with one line of code. This "easy to revert back" attribute is critical for adoption because people don't like lock-in.

  9. How does synchronization work?  Locks-held need to be part of the continuation.  But are there other subtle issues?

  10. You'll need your own synchronization of course.  Please please please use deadlock-detection, throwing an exception instead of just locking up.  It's not hard to implement.

  11. Suggestion that MapReduce be the next thing that is implemented because it's the hot thing in distributed computation and folks are convinced that many useful things can be expressed that way. Demoing efficiency (and pretty pictures) here would be compelling to many people.

  12. Fault tolerance. Probably don't have to have this at first, but need a thought-experiment-level concept of how to handle.

  13. Computational caching. With SHA of input and full knowledge of bytecode, you could perhaps automatically cache the results of computation!  Think of algorithms where you should use functional programming. Or even just dynamic webpages where the home page doesn't change that often.

  14. Consider JavaSpaces for object transfer?  Might solve some issues with fault tolerance.
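
To make item 1 concrete, here is a minimal sketch in Python (not anything Swarm actually does). `ContentStore`, `push_frame`, and `frames_to_send` are hypothetical names invented for illustration, and JSON stands in for whatever serialization a real continuation would need. The idea it shows: because each frame's key is the hash of its content, including the hash of its parent, a node that already holds a frame must hold the same ancestors, so the sender can stop walking the chain there.

```python
import hashlib
import json


class ContentStore:
    """Toy store whose keys are the SHA-1 of the bytes they hold."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        key = hashlib.sha1(data).hexdigest()
        self._blobs.setdefault(key, data)  # identical content is stored once
        return key

    def get(self, key):
        return self._blobs[key]

    def __contains__(self, key):
        return key in self._blobs


def push_frame(store, local_vars, parent_key):
    """Store one stack frame that points at its parent frame by hash."""
    frame = {"locals": local_vars, "parent": parent_key}
    return store.put(json.dumps(frame, sort_keys=True).encode("utf-8"))


def frames_to_send(store, top_key, remote_store):
    """List the frames the remote node lacks, newest first.

    Assumption: a remote node that holds a frame also holds its ancestors,
    so the walk stops at the first frame the remote already has.
    """
    missing = []
    key = top_key
    while key is not None and key not in remote_store:
        missing.append(key)
        key = json.loads(store.get(key))["parent"]
    return missing


local, remote = ContentStore(), ContentStore()
root = push_frame(local, {"n": 10}, None)
middle = push_frame(local, {"i": 3}, root)

# First migration: the remote node has nothing, so both frames travel.
first_transfer = frames_to_send(local, middle, remote)
for frame_key in first_transfer:
    remote.put(local.get(frame_key))

# Second migration, one call deeper: only the new top frame travels.
top = push_frame(local, {"x": 1}, middle)
second_transfer = frames_to_send(local, top, remote)
```

In this toy run the second migration ships one frame instead of three, which is the saving item 1 is after.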
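
On item 10, to back up "it's not hard to implement": a rough Python sketch of deadlock detection using a wait-for chain. `LockManager` and `DeadlockError` are made-up names, and the sketch assumes each task blocks on at most one lock at a time and makes no other requests while blocked. It is a single-process illustration of the check, not a distributed lock service.

```python
class DeadlockError(Exception):
    """Raised instead of blocking when a lock request would close a cycle."""


class LockManager:
    def __init__(self):
        self._owner = {}        # lock name -> task currently holding it
        self._waiting_for = {}  # task -> lock name it is blocked on

    def acquire(self, task, lock):
        """Return True if granted, False if the task has to wait.

        Raises DeadlockError if waiting would create a cycle of waiters.
        """
        holder = self._owner.get(lock)
        if holder is None or holder == task:
            self._owner[lock] = task
            self._waiting_for.pop(task, None)  # a running task waits on nothing
            return True
        # Follow the chain: who holds the lock, and what are they blocked on?
        current = holder
        while current is not None:
            if current == task:
                raise DeadlockError(f"{task} waiting on {lock} would deadlock")
            blocked_on = self._waiting_for.get(current)
            current = self._owner.get(blocked_on) if blocked_on is not None else None
        self._waiting_for[task] = lock
        return False

    def release(self, task, lock):
        if self._owner.get(lock) == task:
            del self._owner[lock]


manager = LockManager()
got_first = manager.acquire("A", "lock1")   # A holds lock1
got_second = manager.acquire("B", "lock2")  # B holds lock2
a_waits = manager.acquire("A", "lock2")     # A must wait for B

try:
    manager.acquire("B", "lock1")           # B waiting for A closes the cycle
    deadlock_detected = False
except DeadlockError:
    deadlock_detected = True
```

The check runs before the wait is recorded, so the chain of waiters never contains a cycle and the walk always terminates.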
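
Item 13 can be sketched the same way. This is a loose Python analogy, assuming the function is pure and its arguments are JSON-serializable; `cache_key` and `cached_call` are invented names, and Python's code object stands in for "full knowledge of bytecode." The key is a SHA-1 over the function's bytecode, constants, referenced names, and input, so the same code on the same input hits the cache no matter who asks.

```python
import hashlib
import json

_result_cache = {}


def cache_key(fn, args):
    """Hash the function's compiled code together with its input."""
    code = fn.__code__
    digest = hashlib.sha1()
    digest.update(code.co_code)
    # Constants and names live outside the raw bytecode, so hash them too.
    digest.update(repr(code.co_consts).encode("utf-8"))
    digest.update(repr(code.co_names).encode("utf-8"))
    digest.update(json.dumps(args, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()


def cached_call(fn, *args):
    """Run fn(*args) once per distinct (code, input) pair."""
    key = cache_key(fn, args)
    if key not in _result_cache:
        _result_cache[key] = fn(*args)
    return _result_cache[key]


calls = []  # records real executions, only so the demo can count them


def slow_square(n):
    calls.append(n)
    return n * n


first = cached_call(slow_square, 12)   # computed
second = cached_call(slow_square, 12)  # served from the cache
third = cached_call(slow_square, 13)   # new input, computed again
```

The catch is the purity assumption: anything that reads a clock, a file, or mutable global state would be cached incorrectly, so a real system would need a way to know which routines are safe.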

Giving advice and asking questions is easy. Hopefully some brave souls will do the real work of getting Swarm up and running. Good luck Ian!

Wednesday, October 8, 2008

Giving it away

In The Coldest Call, Gerry Cullen gives us a pithy rule of sales:

If you can't give it away for free,
you can't sell it.

It sounds tautological at first, but it helps you create products that are easy to sell. Here's how.

Because of Smart Bear and this blog, I hear new company ideas all the time. When I start asking about new products, the conversation invariably looks like this:

Me: Would you get customers if your software were free?

Confident Entrepreneur: Of course! Why not take it if it's free?

Me: That's what I'm asking -- are there reasons people still wouldn't take your software even if it were free?

Confident Entrepreneur: Free is free. Of course they'd take it.

Not so fast there, pardner.

Let's say it's 1998 and you've invented a corporate-wide spam filter. Great timing -- the web is exploding, everyone has an email account, spam is choking in-boxes and wasting time. You've invented a box that sits in front of the mail server, tossing the garbage before it hits your server, much less your workstations and laptops.

So couldn't you give away a free spam filter?

Well. What happens when the filter accidentally marks something as spam when in fact it's a real email? Will we lose productivity as people get confused or spend time digging through a massive spam dumping ground looking for the message? Does email recovery require an administrator? Will he be drowned in requests? Will we have to hire a spam depository admin? Operating this system clearly costs time and money.

How much training is required to get people to use the new system? How many spam-filter-related questions will hit our internal help-desk? Support activities are expensive.

What if the spam filter box gets overloaded with too much mail? If there's a bug, is it possible to lose an email completely? What happens if the spam filter box crashes -- does email cease across the entire company? Losing email is unacceptable.

These concerns are so scary and costly that the spam filter might not be worth it, even for free. And if you're ambivalent about taking it for free, you're certainly not going to pay for it.

So how do you design a product that passes Gerry's test? Ask yourself brutal questions to root out how your product might cause more pain than it solves. Here are some to get you started:

If your product fails catastrophically, what's the impact?

Good answers include:
  • Because the product is completely independent of any other system, in the worst case you're back to how things were before you bought our product.
  • We'll show you how to configure other systems to silently and automatically route around the failure. During the trial period you can test this yourself.
  • We have built-in support for switching back to the way you were doing it before.
  • Administrators are instantly alerted of the failure.
  • You can use your existing monitoring/alerting system to detect failures.
  • We support live-redundancy and continuous backup.

Is it easy to rip this out if I don't like it?

Good answers include:
  • Since this system is completely independent of all other systems, you can just turn it off.
  • All data in the system can be exported at any time in a standard, human-readable format (e.g. XML, CSV). (You can also use this for backup.)
  • Because we handle catastrophic failure gracefully, you can literally pull the plug and everything else continues to work.

How much training does this require?

Good answers include:
  • Our website has pre-recorded training presentations. We give you the source materials for free so you can customize for internal training classes.
  • We have tutorials and screenshots showing how to do common tasks.
  • We have excellent in-product help, as well as a printed manual.
  • Accomplishing typical tasks is obvious.

Can my end users inadvertently break the product or prevent other users from using the product?

Good answers include:
  • Each workstation is separate so it cannot break other people's workstations.
  • The server has quotas, permissions, and other administrator-controlled limits to prevent excessive or improper use.
  • We support running inside a virtual server so our software failures are isolated.
  • We blast our software with load-testing, failure-case testing, and intrusion-testing, so we know that users can't break it with normal use.

If your company goes out of business, what's the impact on me?

Good answers include:
  • Because you own the software/hardware and you host it yourself, inside your firewall, you're not affected.
  • Because your license code is good forever -- only upgrades require you to give us more money -- the software continues to work.
  • Although we charge a monthly fee, the license agreement states that if we go out of business you can continue using the software without charge.
  • We'll put our software in escrow so if we cease support you have the ability to maintain the product yourself. (In this case it's reasonable to require the customer to pay all escrow costs.)
  • Our software is open-source and licensed such that you can continue using it and changing it. (This works if you're selling professional services.)

With these questions in mind, here are some ideas for tweaking the corporate spam filter product:

  • The filter runs as a plug-in to your existing mail server. The email admin therefore has full control over when it runs, making it trivial to disable.
  • If the plug-in fails it makes a log in the mail system which can then be monitored by the same tool that already monitors the mail system.
  • Because it's a plug-in, it scales as your mail server scales.
  • Users get one email per day summarizing the mail that was marked spam. They can glance over it looking for things that are not spam, and use a link next to each one to recover it, in which case it's instantly delivered to their inbox. Thus they can help themselves most of the time without burdening email admins.
  • The summary email is clear enough that most people will understand it without training classes.
  • Spam emails are stored in a special folder in the mail system, not in a proprietary format. Then data access and backup can be done with any email client, even if you uninstall the spam filter, even if the supplying company vanishes.

Does your product pass Gerry's test? Want to brainstorm about it? Leave a comment and let me know.

Saturday, October 4, 2008

Customers over Employees

Alex Kjerulf articulates why customers can't always come first.

Let me get this straight: The company will side with petulant, unreasonable, angry, demanding customers instead of with me, its loyal employee? And this is meant to lead to better customer service?

Everyone says "put customers first."  They pay the bills, they're who the company exists to serve, they're the ones who must be satisfied, in their hands rests word-of-mouth, the most powerful force of marketing.

But what about employees? The ones who you'd like to be motivated to serve these customers, day in and day out. Where do they fit in the customer service model? When it comes down to employee happiness versus customer happiness, what do you do? And yes, it can come down to it.

Some customers are so poisonous to your poor employees that it's your duty to get rid of them.  Some you should wish on your competitors.  Sometimes the customer isn't right.

Maybe 1% of your customers are problematic, but they're a vocal and time-sucking and morale-draining 1%.  Is 1% more business worth it?

Saturday, September 27, 2008

Peer code review interview at GeekAustin

Lynn Bender is getting really good at interviewing!

He just posted a recent interview with me about why peer code review works and how it fits with modern software development.

If you haven't been to one of his mixers, you should stop by the GeekAustin 8th Anniversary party this Tuesday.  He always throws a great party.

Tuesday, September 23, 2008

Joshua Bloch dripping with wisdom

When Joshua Bloch speaks, the ground shakes, oceans part, and Java developers fall to their knees and tremble.  At least, I do.  I think animals get nervous before Josh writes articles.


His latest article on API design is a must-read.  I don't think he's written a single word I don't agree with, and this is no exception.

P.S. If you're a Java developer and you don't have Effective Java within arm's reach, consider it a critical-severity bug in your personal development.  The Puzzlers are less practical but fun.

Sunday, September 7, 2008

Software Quality Mortgage

Your code is a mess. Years of squeezing in "must-have" features for big customers have stretched the code beyond its original design. Core modules are riddled with landmines. Tacit assumptions shared by the two founders aren't obvious to the next ten hires.


When companies are new and unknown, still seeking their niche, the most important thing is to get the software out the door, bugs and all.

It's the right thing for the company and the right thing for the software, but there comes a day when your emphasis shifts from "time-to-market" to reducing tech support calls and not pissing off tens of thousands of existing users with a dud release.

But now you have all this crappy code.

I call this phenomenon the "Quality Mortgage." The analogy to a home mortgage is instructive.

A responsible, hard-working person still cannot afford to purchase a home outright and therefore enters into debt. If shippable, salable software is the house you want today, your debt is the quality and maintainability of your code. Sure you could build a close-enough-to-bug-free application if given ten years to work on it, but you're taking out a mortgage to build v1.0 in six months.

But eventually you have to pay back the debt. With interest. You pay interest in the form of bugs. Bugs everywhere, many preventable had you taken the time for unit tests, good design, manual tests, and use-cases. And fixing those surface bugs doesn't fix the underlying problems in the code. This is perfectly analogous to those first years of the mortgage where you're paying interest without reducing the principal. But this is still the right choice at first -- fix the most heinous bugs and keep going.

Over time you can pay back the principal, slowly. You can refactor one file while adding a feature. You can add complete unit test coverage for a handful of core methods. You can write a manual test plan for a particularly complex dialog box. This is all good! But at this rate it's still going to take ten years to pay it back.

Or maybe you'll never pay it back. Because unlike a house, your software is constantly expanding with new features and being reused for purposes beyond its original conception. Without fixing the underlying mess or the process that brought you that mess, you'll never catch up. It's more like an interest-only mortgage.

At some point you can't tolerate this anymore. It's time to pay down the principal in earnest. But this requires allocating time for major rework.

Winning the right to refactor can be tough politically, especially with non-technical stakeholders. Here's how to combat the common arguments against spending time refactoring:

  1. Clean-up is invisible to users; we need to add new features.
    The bugs constantly produced by messy code are visible to users too. Time spent fixing those bugs could have been spent adding features. The longer we stay in quality debt, the more time it takes to add each new feature.

  2. We don't have time for clean-up.
    You'd rather spend your time fixing bugs generated by the problem than fixing the problem itself? That's like whacking weeds every weekend instead of pulling them up by the roots. Prevention is sixteen times more valuable than cure.

  3. Developers got themselves into this mess; they should get themselves out of it on their own time.
    Had developers not gotten releases out the door as fast as they did, had they not responded so swiftly to early adopter feedback, even when the product morphed into a beast quite different from its original conception, we wouldn't have our current customers and revenue. We'd be working for another company, not complaining about the software we built.

    Attention CEOs: Finger-pointing impedes resolution. Instead, challenge your developers to reduce bug reports. This is easily measured, so you can track time versus results. Remember, developers prefer implementing new features to fixing bugs, so if they're begging for time to fix bugs, it's serious.

A fine line separates debt as a lever for acceleration and an insurmountable drag. The quality mortgage is a necessary evil in early software development, despite its eventual problems. Just plan on paying it back.

P.S. After writing this I found Martin Fowler making the same point.

Wednesday, August 20, 2008

Avatar Marketing

Addressing your entire customer base at once is tough, but it's exactly what your web page has to do. Unfortunately most companies approach this in exactly the wrong way.

Examples of our struggle:

  • We want managers to see that they'll get metrics and reports, but we want end users to see that they'll save time and busywork.
  • We want to look professional so big-company managers are comfortable choosing us, but not so aloof that small-company developers think we're too corporate and can't relate.
  • We want to highlight our configurable workflows that allow large customers to apply one tool for all groups, but we need small customers to realize that you can turn all that off so it doesn't slow you down.

The usual response to this conundrum is to cast a wide net. The worry is that if you hit one type of customer on the head, another type will feel excluded and might look elsewhere. So you use generic messages like "The Power to Know."

This is dangerous thinking. Generalized messaging has no power, no emotional connection, no interest. If a phrase like "The Power to Know" is equally useful for business intelligence software, buying decision analysis, and theosophical treatises, it's not exactly hitting the nail on the head.

Let me suggest a completely opposite approach. Start by describing a perfect customer. Give her a name (Carol). Pick a concrete company that she works for, a company similar to one of your existing, thrilled customers. What's her official title and what does she do? If your potential market includes a wide variety of company types and positions, just pick one in particular. Whatever problems your product solves, Carol has all those problems. Write those down from her point of view, the way she would describe them if complaining to a friend over lunch. Whatever advantages you have over your competitors, Carol needs exactly those things. List them.

Carol is literally custom-built to be blown away by your product.

Now the question is: What would a web page / Google ad / print ad / tradeshow booth / postcard be like such that Carol would immediately understand that you are her savior? Remember, you get only 3 seconds to grab her attention and another 5-10 to convince her that your product is the second coming.

Can you make it clear in a picture? Maybe a before/after she can relate to? Will describing three features make it plain? Will pointing out your best competitive advantage make her weep for joy? Can you ask a provocative question, something she identifies with? Is there a phrase she'd laugh out loud at because "that's so true"?

You only get a few seconds, so a paragraph won't do. You have to communicate in a picture and a few words. The good news is you have to please only Carol, and you know Carol. You even know she'll honestly be thrilled to find you.

If your ad can't grab Carol's attention -- your perfect customer -- why do you think it will grab anyone else's attention?

If you still say it's impossible to communicate your message in 5-10 seconds, no one in the world will get your message.

This isn't just an academic exercise; your ad will work on non-Carols too! In fact, non-Carols might not be as "non" as you think:

So-called "large company managers" might be running small agile groups; you might do well to appeal to that side of them. Software development managers might like metrics, but it's wrong to think they are unconcerned with their developers' quality of life. Yes, big companies like to choose "stable" vendors, but small companies with strong products are in vogue now, and even IBM admits that people can be fired for buying IBM.

When your message is powerful, Carol and anyone remotely like Carol will notice. If your message is weak, no one will notice.