Another Google scraping fail

Are you concerned about disease prevention – but not any specific disease that you can name? Perhaps you’ve searched Google for something like [disease prevention] and seen this:

Text of Google’s answer card:

These images are in the public domain and are thus free of any copyright restrictions. As a matter of courtesy, we request that the content provider be credited …

More text:

  • Disease Prevention
  • Exercise & Activity
  • First Aid
  • Home and Family …

They include links to the CDC and to Medicinenet, but it’s hard to imagine searchers clicking on them after seeing such lousy information.

This strikes me as another case of Google doing an extremely bad job at providing direct answers to users in search result pages, as a result of being far too aggressive at scraping content websites. In this case, searchers would have been far better served by a simple list of ten blue links.

Seth Roberts thinks it was Nick Szabo

Seth Roberts thinks that Nick Szabo created bitcoin and he gives a few good reasons to back up his claim.

One question, though: if Nick Szabo created bitcoin, why did he use the name “Satoshi Nakamoto”?

Possible answers:

  1. Nick Szabo had never heard of the Satoshi Nakamoto who lives in California, and “Satoshi Nakamoto” is a completely random gibberish name and means nothing at all to Nick Szabo.
  2. Nick Szabo had never heard of the Satoshi Nakamoto who lives in California, but “Satoshi Nakamoto” has some kind of personal meaning to Nick Szabo.
  3. Nick Szabo had heard of Satoshi Nakamoto, and intentionally used his name in order to pin responsibility for bitcoin on him.

None of these sounds convincing to me at all.

There’s another possibility, of course. Satoshi Nakamoto might have been the creator of bitcoin, while Nick Szabo may have worked on the original paper with him – perhaps as an equal collaborator, sharing ideas and critiquing each other’s work, or perhaps just as an editor.

Newsweek got it right

Like a lot of people, I got pretty excited on Thursday to read that a Newsweek reporter had figured out the identity of Satoshi Nakamoto, the creator of bitcoin. I found the article, to which I’m not linking because Newsweek’s website sucks, so compelling that I read it immediately that morning instead of paying attention during my company’s 401K seminar for employees.

To summarize, it’s been assumed for years that “Satoshi Nakamoto” was a pseudonym and that the “Satoshi Nakamoto” who created bitcoin may actually have been a group of people working together, rather than a single person – perhaps even the NSA or the CIA.

What Goodman learned is that there is a man named Satoshi Nakamoto who lives in California, who is extremely nerdy and skilled at mathematics and engineering (means), who is a libertarian and frequently does international monetary transactions (motive) and who was apparently not employed full time during the years when bitcoin seems to have been developed (opportunity).

Goodman contacted this Nakamoto and asked him whether he created bitcoin. Nakamoto said, “I’m no longer involved in that and I cannot discuss it,” while she was being prevented from accosting him at his home by police officers, who overheard the exchange.

My reaction to all of this was a bit of thrill and a bit of disappointment. The thrill was because Satoshi Nakamoto was not living under any real deep cover at all – he was hiding in plain sight, having changed his name legally to Dorian Prentice Satoshi Nakamoto – and had simply chosen no longer to involve himself in bitcoin’s activities and development. The disappointment was because I anticipated an ensuing personalization of the bitcoin project, community and product in a way that would inevitably harm bitcoin (for what it’s worth, I have never mined, traded or received bitcoin).

I assumed this case was basically closed, but I was wrong. Almost immediately, Nakamoto himself began issuing lukewarm and lame denials that he’d said what he said, or perhaps that he’d meant what he said. Unfortunately for him, the police officers who’d been at his home verified Goodman’s account, confirming that Nakamoto had in fact replied that he was “no longer involved in” bitcoin. Not that he had never been involved in bitcoin; not that he didn’t know what it was: that he was no longer involved in it.

Dorian Prentice Satoshi Nakamoto has a lot of good reasons to deny that he is the Satoshi Nakamoto.

The Satoshi Nakamoto who created bitcoin, by virtue of having been involved in mining at the very beginning, controls a stash worth approximately $400,000,000. If that person’s identity became public, he’d immediately become a very attractive target to con artists, thieves and hackers, which is not an appealing prospect. Someone could do some harm to him and hold something of his for ransom, like what happened to Naoki Hiroshima.

There are also a lot of unresolved legal issues with bitcoin that could expose not just bitcoin’s founder, but anyone involved in the bitcoin ecosystem to federal money laundering charges. And for the guy behind it all, the cherry on top of that money laundering might be a nice fat conspiracy charge to make sure he spends the rest of his life behind bars.

Additionally, if Nakamoto created bitcoin using any kind of technical knowledge that he received while working for the government, or for a government contractor, he might be in violation of his non-disclosure agreements. Or he might think this is a possibility. Or he might think that some government lawyer thinks this is a possibility.

Besides that, the intelligence, cryptographic, finance, libertarian and privacy communities would all be on top of him constantly, demanding that he appear as a speaker and panelist, writing papers, giving interviews and weighing in on important matters. He’d get calls to appear on the Sunday morning talk shows and he’d be nominated for all kinds of awards. But he is a private person – not a recluse, but content to live his life and create his creations away from prying eyes. Perhaps he’s just not interested in changing his lifestyle to play the role of Mr. Bitcoin.

Nakamoto’s denial was weak and uncompelling:

Several times during the interview with AP, Nakamoto mistakenly referred to the currency as “bitcom,” and as a single company, which it is not. He said he’s never heard of Gavin Andresen, a leading bitcoin developer. Andresen had told Newsweek he’d worked closely with the person or entity known as “Satoshi Nakamoto” in developing the system, but said he never met Nakamoto in person or spoke on the phone.

Since the creator of bitcoin, whether he is Dorian Prentice Satoshi Nakamoto, a different Satoshi Nakamoto, another person or a group of people, would definitely have very strong incentives to deny any involvement, we can safely disregard Nakamoto’s silliness about not understanding what bitcoin is or what it’s called, which is nakedly self-serving – and anyone who doubts the incentive to deny involvement must explain why bitcoin’s creator hasn’t come forward to share his identity until now. On top of that, Nakamoto himself has personal reasons not to want any public attention: he’s a nerdy, shy, very private libertarian who lives with his mother and has some unfortunate health issues.

The reddit-style backlash to Goodman’s article in Newsweek that I saw on twitter – particularly in the feed of my former CNET colleague Declan McCullagh, but certainly not limited to him – was pretty underwhelming. He shared a rapgenius page that attempted to debunk the Newsweek article, which I will sample:

you’d imagine this article would have gone out of its way to provide airtight proof.

Not really. I’d imagine that the article would provide a lot of evidence – as much evidence as possible to convince the readership and make a splash. Airtight proof would be unlikely in any case, and impossible if bitcoin’s creator had destroyed the proof and wasn’t willing to confirm his identity.

Instead, the evidence presented for this proposition is extraordinarily thin and does not demonstrate [list of technical proofs]:

But as I stated above, both the Satoshi Nakamoto who created bitcoin and the Satoshi Nakamoto in the Newsweek article have a lot of incentive to deny creating bitcoin. So the fact that neither one has confirmed the story by offering technical proof is not convincing.

it seems highly implausible that a person who was a world-class expert in cryptography — and who went great lengths to cover his tracks and remain pseudonymous — would be so silly as to use his middle name and real last name as a cover identity.

Unless he never intended for his bitcoin creation to become the humongous spectacle that it did become. Or unless he has just a tiny bit of vanity that did indeed lead him to use his real name.

the invasion of privacy of an elderly citizen with no evidence tying him to Bitcoin’s ownership

Dorian Prentice Satoshi Nakamoto’s privacy was not invaded, at least by Leah Goodman. She reported a newsworthy story backed up by a wealth of evidence. If bitcoin were a crime, I believe this would be enough evidence for a jury to convict him, though I don’t believe it’s a slam dunk case.

Nakamoto is also not a victim here: he is the one who apparently created bitcoin and used his real name to do it.

Yet it was not retracted, and indeed many journalists are still repeating the claims therein.

Why would it be retracted? Nothing in it has been proven incorrect. Nakamoto’s denials are predictable and easily disregarded. Newsweek has done the right thing in standing by their reporter.

An account that was previously known to be used by the actual author(s) of Bitcoin (well before s/he became world famous) in turn denied that he is Dorian Nakamoto:

D.P.S. Nakamoto has incentive to deny that he is the Satoshi Nakamoto who created bitcoin.

Writing samples of Dorian and Satoshi do not match

This is probably the only strong point that the deniers have made. I don’t have a very good rebuttal to this, which is why I think Goodman’s case is overwhelmingly likely to be true, but not to the point that it can be accepted as obviously proven.

With that in mind, it is a fact that both Dorian Prentice Satoshi Nakamoto and Satoshi Nakamoto are extraordinarily smart – certainly smart enough to maintain two separate writing styles, one for formal work with the bitcoin community and another for casual things like emails. It’s also the case that, as an immigrant to the United States, Nakamoto likely isn’t the best writer in English and may write English poorly when the subject is unimportant, but invest a lot of effort into composing really tight, excellent prose for critical communications.

Requisite level of technical expertise does not match

Satoshi Nakamoto of the internet created bitcoin. Satoshi Nakamoto of Temple City, California, is highly skilled at math and engineering, has worked for the defense industry, and is a nerdy libertarian who collects and modifies model trains. While there’s no proof that the latter has the exact technical expertise necessary to create bitcoin, can you seriously claim that he’s unlikely ever to have looked into cryptography in his spare time?

Most importantly, Dorian has not demonstrated access to the private keys associated (a) with early mined BTC blocks or (b) with the known PGP public key of Bitcoin’s author.

But why would he? The whole point is that he doesn’t want people to know that it’s him.

Dorian Nakamoto evidently worked as an engineer on classified military projects in the past. The penalties for violating confidentiality in those circumstances are severe. It makes total sense that he would be worried about getting into trouble for talking to a reporter about his prior work — particularly a reporter who had previously approached him to talk about model trains before switching the topic to his work as an engineer.

If that were the case, then Nakamoto could simply have said, “I signed a non-disclosure agreement and I have nothing more to say.” Or he could have said, “I have nothing to say about my professional career working for the government.”

The reporter claims he tacitly acknowledged “his role in the Bitcoin project”, but note that even in the quote below, on which this entire story rests, he does not actually say “Bitcoin”.

That’s because he’s not a robot and doesn’t need to keep repeating a proper noun after the subject of his statement has been made clear.

Four million people hold top secret security clearances. It is not that exceptional. This kind of verbal suggestion typifies the article.

But how many of those people are named Satoshi Nakamoto? How many are libertarians? How many got their security clearances by being engineers who were extremely gifted in math? Just one.

The first name Satoshi and last name Nakamoto are moderately common in Japan, certainly more so than in the US. Satoshi is the 69th most popular first name in Japan and Nakamoto is the 492nd most popular surname. The combination is indeed infrequent in the same way the combination of any two names is infrequent, but it’s not necessarily “distinctive” in the way the author is implying.

There are only a few people in the world named Satoshi Nakamoto. Only one of them has had the skills, interests and availability to create bitcoin.

OK. When it comes to math, let’s stipulate for the sake of argument that Mr. Dorian Nakamoto is above the population average in mathematical ability [followed by longwinded digression that attempts to argue that D. P. S. Nakamoto is good at math, but not that good at math].

Here’s the problem with this: it’s the dumbest kind of nitpicking I’ve ever seen.

Albert Einstein, before he spent a year coming up with four theories that overturned physics, may have been considered good at math, but not that good at math. Once that happened, however, and once the skeptics had been satisfied that yes, in fact, this guy named Albert Einstein had done these miraculous things, he was recognized in retrospect as being that good at math. Why couldn’t that be true for Dorian Prentice Satoshi Nakamoto? Why must we believe that someone who was good at math couldn’t in fact be that good at math and that the world just didn’t understand or appreciate him until now?

Dorian did classified work. There are penalties for violating security clearances. This doesn’t mean he is hiding anything about Bitcoin.

But it definitely implies that if he denies involvement in bitcoin, there could be several good explanations besides a genuine lack of involvement in bitcoin.

Note that the article nowhere says that his brother suggested Dorian may have created Bitcoin, but the quote “He’ll never admit to starting Bitcoin.” does of course leave the impression that his brother now thinks that Dorian did.

Putting aside what his brother believes, which isn’t relevant, his brother says that he won’t acknowledge having created bitcoin – and that is very relevant.

If you needed to, then you still have that need.
Seeing how you treated the faux Satoshi, the real one is not likely to meet you any time soon.

Petty and amateurish. I’m going to stop critiquing this critique and get on with my life, secure in what I’m pretty sure has been definitively demonstrated: Dorian Prentice Satoshi Nakamoto of Temple City, California, is the same Satoshi Nakamoto who created bitcoin.

To summarize the beliefs of the people who disagree:

  1. The whole story wasn’t newsworthy.
  2. It may in fact be newsworthy to write a story about the man who created bitcoin, but in this case, a man’s personal and private information was revealed against his will.
  3. It may in fact be fine to reveal the identity of the man who created bitcoin, but Leah Goodman got it wrong.

To address these point by point:

  1. This is definitely a newsworthy story and the media are justified in making a big event out of it. Bitcoin has the potential to revolutionize financial transactions in a way that hasn’t been seen since the creation of fiat money. And bitcoin is way better than fiat money. On the other hand, bitcoin may be such a threat to the government and to finance that it’s made illegal, or that its users are compelled to stop using it by being accused of money laundering. Either way, this is a huge story.
  2. The creator of bitcoin is one of the most important men – or groups of men – alive today, all the more so if bitcoin really takes off and if the government fails to stop it (or if the government decides to embrace it). His identity shouldn’t matter in the sense that it changes what we think of bitcoin, but it certainly does and should matter for other reasons – just like the identity of anyone who creates something ingenious matters.
  3. For Leah Goodman and Newsweek to have gotten this wrong, think of all the crazy coincidences that must be true. Now think of how unlikely they are all to be true. Now think of how likely it is that she was right about Satoshi Nakamoto.

By the way, I do hold out the possibility that Dorian Prentice Satoshi Nakamoto is being set up by some secret cabal of libertarian cryptographer programmers who discovered that he had a security clearance and that he hadn’t been working steadily for a few years, and decided that he’d be the perfect fall guy. This hasn’t actually been disproven. But conspiracy theories like this tend to open up a whole other can of worms. In this case, the obvious question would be: why did the conspirators choose to pick on Dorian Prentice Satoshi Nakamoto and why did they conceal it by using the name “Satoshi Nakamoto” when they could have used “Dorian Nakamoto” or “Dorian P. S. Nakamoto” or something that would have let investigators find this guy and blame him sooner and easier?

Similarly, if the person who created bitcoin simply made up the name “Satoshi Nakamoto” to use as his pseudonym, then why did he choose it? What does it mean to him? Why use a given name and family name at all? Why not pick a name with more overt significance for his magnum opus? Or why not present himself to the world with a randomly generated string of alphanumeric characters as his “name”?

One more point on denials: imagine for a minute that you have the very rare name Satoshi Nakamoto (which you’ve modified to Dorian Prentice Satoshi Nakamoto). Imagine also that there’s another Satoshi Nakamoto who did something significant, which started off obscure but then, over a period of several years, gained a lot of traction and got a lot of people talking about it. Imagine that someone with your profile – middle-aged man, libertarian, gifted at math, professional engineer, top secret security clearance, nerdy – is almost 100% guaranteed to have heard of this thing, just by virtue of who you are, what your name is and what your interests and skills are.

Doesn’t it stand to reason that D. P. S. Nakamoto would certainly and obviously have heard of bitcoin before Leah Goodman of Newsweek started calling him? Of course he would have.

And imagine that you, this other Satoshi Nakamoto, heard about the amazing creation of the mysterious Satoshi Nakamoto. What would your response be? What would your denial be like when people asked if it was you?

You’d probably say something like…

  • “That’s amazing that we share the same very rare name!”
  • “That isn’t me, but I WISH it was me!”
  • “Is someone playing a joke on me?!”

How likely is it that your response would actually go like this:

“I am no longer involved in that and I cannot discuss it,” he says, dismissing all further queries with a swat of his left hand. “It’s been turned over to other people. They are in charge of it now. I no longer have any connection.”

Not likely at all.

More new SERP answer cards

Back in January, I was interested and a bit excited to see that Google had rolled out some new answer cards in search results, and I wrote about them a few times:

Now I’ve seen a couple more kinds that I thought I’d also point out.

The first is a YouTube music player card, accompanied by information about the song:

It’s so big, taking up so much space on my computer screen, that I had to zoom out in order to get it all in the screenshot.

The image is a link to the YouTube page for this song, Till the Morning Comes by the Grateful Dead. The URL has the parameter &feature=kp, but I don’t know what that might mean.

The second was for the query sous vide salmon temperature. It was an answer card that scraped the text “Our favorite temperature is 45 °C / 113 °F, which is rare. But you can go up as high as 52 °C / 126 °F for a firm texture,” from Modernist Cuisine.

Unfortunately, I didn’t take a screenshot of this one in time, as the answer card has disappeared. It appears that this result has changed, with Modernist Cuisine now ranking at the bottom of the first page, potentially ineligible for the answer card.

Instapaper update

I recently blogged about using both Pocket and Instapaper and wishing that Instapaper would get some of Pocket’s features so I could go back to using it. There was a big Instapaper update today, so I decided to take a look at what had been done to see if it’s ready for me:

  • Instapaper Daily is now integrated into the “Browse” section. Irrelevant.
  • Send to Kindle functionality. Irrelevant.
  • AirPlay support. Relevant, but unimportant because Instapaper is still terrible at parsing video.
  • Back gestures with pagination enabled. Irrelevant.
  • Auto-renewable subscriptions. Relevant, but very minor.

So they’re wasting months of development time working on features that don’t matter, and ignoring basic things that would really help growth.

Backblaze vs. Crashplan

Late last year, after reading on Lifehacker about how to use Amazon Glacier to back up a computer, I got excited to try it out on my home computer and several terabytes of media stored on an external hard drive. Glacier is designed to be a cheap cloud-based system that’s used for backups, and Arq is a Mac app that uses Glacier as a backend.

As I got started and looked into pricing, however, I soon realized that Glacier wasn’t the right solution for me: Arq alone costs $40, and at $0.01/GB per month for backup to Amazon Glacier, I’d be looking at a total of $400 for the first year.
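
The arithmetic behind that estimate can be sketched roughly as follows. The 3,000 GB figure is an assumption, back-calculated from the $400 total (the post only says "several terabytes"), and the sketch ignores any request or retrieval fees.

```python
# Rough first-year cost of backing up to Amazon Glacier through Arq.
ARQ_LICENSE_USD = 40.00          # one-time price quoted in the post
GLACIER_USD_PER_GB_MONTH = 0.01  # storage rate quoted in the post
DATA_GB = 3000                   # ASSUMPTION: ~3 TB, inferred from the $400 total


def first_year_cost(data_gb, months=12):
    """Return license plus storage cost, ignoring request and retrieval fees."""
    storage = data_gb * GLACIER_USD_PER_GB_MONTH * months
    return ARQ_LICENSE_USD + storage


print(f"${first_year_cost(DATA_GB):.2f}")
```

Storage dominates the total: under these assumptions the license is only a tenth of the first-year bill, and the storage charge recurs every year after that.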

At prices like that, I’d be better off just hacking something together myself with BitTorrent Sync by buying a couple of large external hard drives, giving them to friends in other cities, and syncing my computer’s home directory and media drive to those friends’ drives over the internet. But that’s a hassle and requires coordination and reliance on friends of questionable tech savvy to keep the system running.

Enter Crashplan and Backblaze.

Both of these services have a similar idea: you run a program on your computer and it backs up whatever you want – presumably everything – to their servers over the internet for a monthly fee. The main appeal to me of both was that they’d offer unlimited data for a few dollars a month. Because Lifehacker liked Crashplan the best, I decided to install it and start using it to see how it went.

Here’s how it went: slowly. Really slowly. Really, really slowly, in fact. After a couple months of letting Crashplan run on my computer all the time, I still had an estimated eight to ten months remaining in my initial backup (of my computer’s home directory and media drive, a few terabytes of total data), which fluctuated as low as five or six months and as high as longer than a year. Finally I removed my media drive from the backup to let Crashplan focus on just uploading my computer’s home directory, and even then it took a few weeks to complete. But I was left without my media drive backed up, which was critical because that data lived in only one place.
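
To put those estimates in perspective, here is a back-of-the-envelope sketch of the sustained upload rate they imply. The 3 TB backup size is an assumption (the post only says "a few terabytes"), and a month is approximated as 30 days.

```python
# Average upload rate implied by a backup size and a completion estimate.
SECONDS_PER_DAY = 24 * 60 * 60


def implied_rate_mbit_per_s(data_tb, months, days_per_month=30):
    """Sustained megabits per second needed to upload data_tb in the given months."""
    bits = data_tb * 1e12 * 8  # decimal terabytes -> bits
    seconds = months * days_per_month * SECONDS_PER_DAY
    return bits / seconds / 1e6


DATA_TB = 3  # ASSUMPTION: the post only says "a few terabytes"
for months in (6, 8, 10, 12):
    rate = implied_rate_mbit_per_s(DATA_TB, months)
    print(f"{months:>2} months -> {rate:.2f} Mbit/s")
```

Under those assumptions, a ten-month estimate works out to a bit under 1 Mbit/s sustained.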

So I tried Backblaze and, to my surprise, it went a lot faster than Crashplan: backing up just my computer was done in less than a week. As a result, I decided to stick with Backblaze. But I didn’t use it for a couple of weeks while I was setting up my home NAS…

Enter Drobo.

After a Drobo 5N came into my life and I spent some time getting it set up and configured, and re-set up and reconfigured, I signed into Backblaze to add my Drobo share that housed all the media I’d copied over from the external hard drive that I had been using to store it. But Backblaze wouldn’t let me locate my Drobo, no matter how many times I hit the refresh button or restarted the Drobo or the computer. I just couldn’t get it to work. Eventually I realized why it wouldn’t work: Backblaze doesn’t support NAS.

Unfortunately, that made me unable to use Backblaze, leaving me back where I started.

Price

Crashplan costs $60 per year and Backblaze costs $50 per year (with a coupon code). It doesn’t seem like an accident that the prices are so close. I’d gladly pay a bit more than that for a superior product, but not an order of magnitude higher.

Speed

Backblaze goes about as fast as I’d expect, but Crashplan was exceedingly slow.

Software

Do you love the look and feel and general UX of Windows, and wish that more desktop apps were written in Java so you could replicate that Windows experience from work on your Mac at home? If so, you’ll get a kick out of using Crashplan. But if you like the Mac way of doing things, you’ll love the very Mac-like Backblaze, which actually operates as a preference pane in System Preferences – exactly how it should be.

Restore

I fortunately haven’t yet had to restore from either Crashplan or Backblaze. Since I also back up with Time Machine to a share on my Drobo, I’d only need to use a Crashplan/Backblaze restore in the event of a catastrophe. They both have an option to ship me a drive with my backed up data for less than $200, so that’s what I’d probably do.

The fine print

Here’s why Backblaze says they won’t back up my Drobo:

Backblaze can technically backup a network drive, but for business reasons do not allow it. Backing up mounted or network drives can easily be abused. A user could mount the 10 or 20 computers in their home or small business and back them all up to one account for $5/month.
At this time, we do not have any available service to back up network storage devices.

I agree that this has the potential for abuse, but I’m not sure what they gain by banning it outright (what they lose, however, is clear: my business). I think they should let users back up a home NAS and just isolate the few who choose to abuse the service, and ask those people to pay more. Alternatively, I think they should continue charging $5 per month and let people pay an additional $1 or $2 per month to back up a NAS. A small additional charge strikes me as perfectly fair.

Crashplan offers a seeded backup service in which, for an extra fee, they’ll ship you an external hard drive and you can copy your data to it and ship it back to them, and they’ll use that as the basis for your backup to their server. It’s an interesting idea, but at $125 it’s also priced exorbitantly. It also still takes plenty of time – the span from placing the order to having your data in their hands and included in your online backup could be two weeks. They also only let you do it once with a single 1 TB drive. And, let’s face it, this is actually just an admission that their real service is really, really slow. So no.

Backblaze seems a lot more cavalier than Crashplan about getting rid of your old data. It’s hard to quantify this, but just from browsing around their sites a lot, I keep getting the impression that Crashplan is committing to keeping all the data in my entire backup stored for as long as I’m a Crashplan customer, while Backblaze doesn’t want to keep any versions of any files longer than absolutely necessary, and if for any reason I stop backing up with Backblaze for a few weeks, I risk losing everything.

Other options

Besides Amazon Glacier via Arq, I did look into a few other options:

  • Carbonite: charges $60 per year for their basic plan, which allows unlimited data but doesn’t even support an external hard drive (!!). Their other plans are only available on Windows, not on Mac! They strike me as not a serious company and I can’t imagine giving them my business.
  • Mozy: at $6 per month for 50 GB or $10 per month for 125 GB, they are not even in the league of the kind of data that I’m thinking about backing up.
  • Dropbox: could work, and I love using it for other things, but again, pricing. 500 GB costs $500 per year. Their business pricing is a little more appealing: $15 per month for unlimited storage – but only for a minimum of five users – so actually $75 per month.
  • Cycling my own physical drives and storing them in a secure location: it’s true that I could get a safe deposit box at a bank in a convenient location and two external hard drives, keeping one in the safe deposit box, being safe, and one at home, being up to date, switching them every week. But logistically this plan sucks, and the biggest easily available hard drives are around 4 TB, while I need to think bigger than that.
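
For a quick side-by-side, the prices quoted in this post can be annualized in a short sketch like the one below. The figures are the ones mentioned above; the plan labels are my own shorthand, and the comparison ignores storage limits other than those named in the labels.

```python
# Annualized prices for the services discussed in this post, as quoted above.
MONTHS = 12

annual_usd = {
    "Crashplan (unlimited)": 60,
    "Backblaze (unlimited, with coupon)": 50,
    "Carbonite basic (unlimited)": 60,
    "Mozy 50 GB": 6 * MONTHS,
    "Mozy 125 GB": 10 * MONTHS,
    "Dropbox 500 GB": 500,
    "Dropbox business (5-user minimum)": 15 * 5 * MONTHS,
}

# Print the plans from cheapest to most expensive per year.
for name, cost in sorted(annual_usd.items(), key=lambda item: item[1]):
    print(f"{name:<36} ${cost:>4}/year")
```

Seen this way, the capped plans cost more per year than either unlimited service while storing far less than a few terabytes.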

The solution

There isn’t one, I’m afraid.

I am back to using Crashplan, for the time being, and I’ve followed their advice in Speeding Up Your Backup, though I can’t say that it’s been very helpful.

I also found this interesting blog post claiming that your specific Crashplan server location might be slowing you down. In his case, switching to their Atlanta server was a great benefit, since he’s in Atlanta. In my case, I happened already to be on their Atlanta server, so I’ve trashed and restarted everything, only to end up back on their Atlanta server, so there’s no benefit to me.

A bunch of people on the web have argued that tinkering with Crashplan’s de-duping settings will get Crashplan to go much faster (1, 2, 3). I’m trying this out as well, but these posts are old so I doubt what they say is still valid.

Update your Asus router

The Wirecutter:

If you own an Asus router, like the RT-N56U, N66U, or AC68U, you should update your firmware immediately to prevent attackers from accessing USB storage devices attached to your router.

But if you own an Asus router, why would you be running the stock firmware in the first place? It’s 2014 and your router should be working for you, not being crappy like a router from the 1990s. Here are some useful resources: