Categories
tech

Search Innovations

Read/Write Web posts:

There are an abundance of new search engines (100+ at last count) – each pioneering some innovation in search technology. Here is a list of the top 17 innovations that, in our opinion, will prove disruptive in the future. These innovations are classified into four types: Query Pre-processing; Information Sources; Algorithm Improvement; Results Visualization and Post-processing.

House Bill to Protect Bloggers as Journalists

Ars Technica highlights a new amendment to the Free Flow of Information Act of 2007 extending source-protection rights to bloggers. Rep. Rick Boucher (D-VA) has a good reputation among the freedom of information/open access crowd for siding with users. He also sponsored the Fair Use Act of 2007, which would protect libraries and other users of copyrighted materials.

And speaking of open access, a couple quick searches at OpenCongress show that both bills are still in committee. Lobbying time…

Harvard Law Prof: "Protect Harvard from the RIAA"

Harvard Law School Professor Charles Nesson writes:

The RIAA has already requested that universities serve as conduits for more than 1,200 “pre-litigation letters.” Seeking to outsource its enforcement costs, the RIAA asks universities to point fingers at their students, to filter their Internet access, and to pass along notices of claimed copyright infringement.

But these responses distort the University’s educational mission. They impose financial and non-monetary costs, including compromised student privacy, limited access to genuine educational resources, and restricted opportunities for new creative expression.

With colleges and universities under increasing pressure from the record labels’ lobby, now is the time to push back. The educational mission is a more vital interest to our schools than collaboration with the entertainment industry to prop up their obsolete revenue model.

[via Slashdot]

HD-DVD Encryption Authority Vows to Fight Key Leak

Ars Technica has the latest in the HD-DVD encryption key leak story (see my previous post). The encryption method in question is called AACS and it’s managed by the AACS Licensing Authority.

“If the local neighborhood gang is throwing rocks at your house, some people might tell you not to call the police because they will just throw bigger rocks,” [AACS LA chairman Michael] Ayers said.

But the bigger point is what happens when you “call the police,” to continue with his metaphor. Yes, the cops can stop people from throwing rocks at your house, so you’ve got to take that risk knowing that those same kids might retaliate next week. But AACS isn’t a house, and encryption keys aren’t rocks. Can “the cops” stop a 16-byte number from existing online? We can peer into the future and see the answer because history is, in fact, repeating itself.

The article goes on to draw the natural parallel between the HD-DVD encryption hack and DeCSS, the 1999 DVD encryption hack. The two situations will end in the same result: the code will continue to be available online. Unfortunately, the AACS LA seems determined to harass a bunch of people with lawsuits before bowing to the inevitable.

Project Honey Pot: New Tools

Project Honey Pot captures the addresses of spam bots it attracts, and allows site administrators to block those addresses. Recently, they have introduced some cool new tools.

I just installed the WordPress plugin for http:BL, which makes it really easy to tap into Project Honey Pot’s ban list to keep the bad bots off this site. You can also easily create your own honey pot and turn in offenders.
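
Roughly how an http:BL lookup works, as far as the published API describes it: you reverse the visitor's IP octets, prepend your access key, and query the dnsbl.httpbl.org zone over DNS. The sketch below only builds the query name and decodes an answer, with no network access. The access key is a made-up placeholder, and the field meanings (days since last activity, threat score, visitor-type bit flags) are assumptions based on my reading of the API documentation, so check them against the docs before relying on them.

```python
import ipaddress

# Visitor-type bit flags as I understand the http:BL docs (assumption).
VISITOR_FLAGS = {1: "suspicious", 2: "harvester", 4: "comment spammer"}


def build_query(access_key: str, visitor_ip: str) -> str:
    """Return the DNS name to look up for an IPv4 visitor address."""
    octets = str(ipaddress.IPv4Address(visitor_ip)).split(".")
    return ".".join([access_key, *reversed(octets), "dnsbl.httpbl.org"])


def decode_answer(answer_ip: str) -> dict:
    """Decode a 127.<days>.<threat>.<type> answer into named fields."""
    first, days, threat, visitor_type = (int(p) for p in answer_ip.split("."))
    if first != 127:
        raise ValueError("not a valid http:BL answer")
    if visitor_type == 0:
        # For search engines the third octet means something else;
        # this sketch ignores that detail.
        kinds = ["search engine"]
    else:
        kinds = [name for bit, name in VISITOR_FLAGS.items() if visitor_type & bit]
    return {"days_since_seen": days, "threat_score": threat, "kinds": kinds}


def should_block(answer_ip: str, max_threat: int = 25) -> bool:
    """Hypothetical policy: block non-search-engines above a threat threshold."""
    info = decode_answer(answer_ip)
    return "search engine" not in info["kinds"] and info["threat_score"] > max_threat
```

In a real plugin the name from `build_query` would be resolved with something like `socket.gethostbyname`, and a failed lookup would mean the address is not listed. The threshold of 25 is arbitrary.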

I’ve been using Akismet to filter spam comments (thanks for the recommendation, Emily S.!), which is why you don’t see hundreds of ads for drugs and porn in the comments on this site. So far it has caught everything and hasn’t flagged any legitimate comments. But blocking spammers here is only defensive; I’m glad to be able to help crack down on them overall.

Refining Google

Via Digg, I found an interesting article on Google’s attempts to prevent people from “gaming” its search results. The details of Google’s ranking are secret, but its PageRank algorithm is known to weigh the number and quality of incoming links to a site. In effect, PageRank embodies working models of reputation and trust.
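
To make “number and quality of incoming links” concrete, here is a minimal sketch of the PageRank idea as it was originally published, not Google's production ranking, which is not public. The four-page link graph, the damping factor of 0.85, and the iteration count are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> pages it links to.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "hub"; "hub" links only to "a".
toy_web = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
ranks = pagerank(toy_web)
```

In this toy graph "hub" ranks first because it has the most incoming links, but "a" comes in a close second with only one incoming link, because that link comes from the highest-ranked page. That is the "quality" half of the model, and it is also what link sellers try to exploit.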

In the article, Carsten Cumbrowski leans heavily on jargon, and the writing becomes elliptical and dense at times, but the information he presents and links to makes a very good background on the issues with PageRank. He analyzes the nofollow link attribute, an attempt to reduce the credence given to paid or otherwise less meaningful links. He also covers improvements to PageRank’s trust model:

It is like with people. You do not trust anybody you just have met. How quickly you trust somebody is less a time factor, but has to do with what the person is doing or is not doing and how much it does match what the person says about himself, his intentions and his plans.

Therefore the age of a site is a poor proxy for trustworthiness, and PageRank’s naive reliance on it was faulty. As I’ve posted before, an extreme amount of time and effort goes into reverse-engineering search algorithms, along a whole spectrum from benign “search engine optimization” to malicious exploitation of flaws. It’s an arms race in which the complexity of the system is determined as much by competitive pressure from its exploiters as by the desire for more useful search results.

Remember that the next time you rely on a search algorithm — or build a web service that relies on one.

Digg Gets Caught in HD-DVD Encryption Fight

As I posted a while back, a method was found for extracting the encryption keys for HD-DVDs and Blu-ray discs, allowing unauthorized decryption. Basically the problem is that in order to let legitimate users play their movies, you have to give them both the locked version and the key. It’s just a matter of time before someone takes that key and unlocks something else. In the case of HD-DVDs, it’s even worse because the encryption scheme depends on a single master key.
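
A toy sketch of that structural problem, assuming nothing about how AACS actually works: the cipher below is a deliberately simple XOR stream built from SHA-256, and the key, the movie bytes, and the `Player` class are all invented for illustration. The point is only that any device able to play the content must hold a key that decrypts it, and that the same key opens every other disc locked with it.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from `key` (toy construction, not secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


class Player:
    """Hypothetical licensed player: it ships with the key inside it."""

    def __init__(self, embedded_key: bytes):
        self._key = embedded_key

    def play(self, disc: bytes) -> bytes:
        return xor_cipher(self._key, disc)


master_key = bytes(range(16))  # made-up 16-byte key
disc_one = xor_cipher(master_key, b"Movie one")
disc_two = xor_cipher(master_key, b"Movie two")

player = Player(master_key)
leaked_key = player._key  # whoever can inspect the player can copy the key
```

Once `leaked_key` is out, `xor_cipher(leaked_key, disc_two)` recovers the second movie without any player at all. No amount of obfuscation inside `Player` changes the fact that the key has to be there.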

Next came some twists that speak volumes about the current state of “intellectual property” and its radical opposition to free speech. Yesterday the story that the key had been found went around Digg, the social bookmarking and discussion site. Because the master key was so short, the original poster included it in the title of their post.

Digg received a DMCA takedown notice and decided to comply. Users went nuts. They flooded the site with posts containing the key code and lobbied Digg’s management to fight back. Finally, at the end of the day, founder Kevin Rose posted his decision: Digg would side with its users and stop deleting the posts.

First, this case highlights the fact that even if you have a good method of securing information, there is no reasonable level of lockdown that will prevent this type of “leak”. This is a tangent, but I think an enlightening one: the same thing applies to security from terrorism.

There is no such thing as a tradeoff of freedom for security, because security is an illusion. Inmates in maximum-security prisons still manage to murder each other. Unless you’re willing to impose restrictions on the public greater than those on prisoners, you can’t make violence impossible. If violence is possible, you can’t be secure — only more or less likely to be attacked, more or less likely to live through it.

Any amount of freedom in a society, which I hope we can agree is a good thing, brings with it “security holes”. The fact that we are not constantly at war with each other is a social construct based on alternative effective methods of conflict resolution. We are free to be violent, but mostly choose not to.

Back to freedom of information. Freedom of speech does not exist in a vacuum. Any amount of freedom of expression in a society depends upon the public’s ability to use and repurpose the expressions of others. No reasonable amount of restriction of free expression will achieve full control over access to information. And the crucial point is that with the internet, one lapse in access control leads to full publicity. Therefore the idea of trading freedom of expression for information security, in the public context, is an illusion.

Second, Digg’s reaction is a miniature version of the process that society as a whole is going through to readjust its beliefs and policies to internet technology. Because of the internet, that encryption key is now a piece of public information. And the public is getting tired of corporate interests manipulating the legal system, trying to put the cat back in the bag.

Digg recognizes that its value is dependent upon its social nature. Placing restrictions on users will dry up that well of participation and cause Digg to fail. It’s precisely the many-to-many nature of the internet that makes explicit the radical dependence of content and service providers on the good will of their users.

Without waiting for the market to sort out the problem, Digg listened directly to its users and made the right decision. In a sense, actually, the users made the decision. That’s a fundamental shift in the way things work, which is making its way through every institution — though mostly at a slower pace than Digg’s one-day turnaround.

Joost Readies Launch

Joost is a peer-to-peer (P2P) streaming video system. Basically you watch TV on your computer; I’ve been using the beta and it works really well. The limitations are (1) you can’t save the videos, only stream them, and (2) your favorite show is probably not on Joost (yet).

Ars Technica reports that Joost will come out of beta later this month with content from CNN and other Turner networks, in addition to existing deals with CBS and Viacom.

While other net video services like YouTube and Google Video have been hugely successful with short clips, TV networks want more control and more DRM. I’m sure they’re also happy to save on high-bandwidth streaming servers and pass that task to users’ network connections. Joost is a few steps ahead and seems well-positioned to give the networks what they want.

I do think it’s cool that the old media are trying out new distribution methods. But they seem to be sacrificing some key features of web services — searching, repurposing, linking, and layering content via standard, open protocols and APIs. They’ve rebuilt the one-to-many TV network model on top of the many-to-many internet. This may be necessary to secure participation of the old media players but let’s not stop demanding full functionality — which means open interoperability — of these new services.

Supreme Court Reins in Obvious Patents

Ars Technica reports on today’s Supreme Court ruling rejecting the standard for obviousness that the Court of Appeals for the Federal Circuit has been using to determine patentability.

[T]he Federal Circuit had adopted a higher standard, ruling that those challenging a patent had to show that there was a “teaching, suggestion, or motivation” tying the earlier inventions together.

[The Supreme Court found] that the Federal Circuit had failed to apply the obviousness test. “The results of ordinary innovation are not the subject of exclusive rights under the patent laws,” Justice Anthony Kennedy wrote for the Court. “Were it otherwise, patents might stifle rather than promote the progress of useful arts.”

I’m not quite ready to forgive Kennedy, but it’s nice to see the Court upholding the obviousness test. As my previous posts illustrate, lately a lot of stupidly obvious inventions have been given patents. Especially in software, the result has stifled good ideas far more than it has promoted them.

Deep Notes

Free software plug: I really like Deep Notes, a simple outlining app. When working on longer writing, I need to see the structure in outline form while being able to quickly dig down into that structure and look at the exact wording.

Deep Notes represents hierarchical structure with expand/condense buttons, like list view in the OS X Finder, giving complete control over the level of detail you see for every item. It’s easy to move items around in the hierarchy; it’s a dynamic model, which is exactly what I need when sorting out conceptual structures.
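
A rough sketch of the kind of data model that could sit behind this behavior. It is not Deep Notes's actual implementation, and the `Item` class and its method names are hypothetical. Each item keeps its children and an expanded flag, moving an item is just re-parenting it, and rendering skips the descendants of collapsed items.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare items by identity, not by field values
class Item:
    text: str
    children: List["Item"] = field(default_factory=list)
    expanded: bool = True
    parent: Optional["Item"] = field(default=None, repr=False)

    def add(self, child: "Item") -> "Item":
        """Attach `child` under this item, detaching it from any previous parent."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
        return child

    def render(self, depth: int = 0) -> List[str]:
        """Return the visible lines; collapsed items hide their descendants."""
        marker = "-" if self.expanded or not self.children else "+"
        lines = ["  " * depth + f"{marker} {self.text}"]
        if self.expanded:
            for child in self.children:
                lines.extend(child.render(depth + 1))
        return lines


essay = Item("Essay")
intro = essay.add(Item("Introduction"))
body = essay.add(Item("Argument"))
point = intro.add(Item("Key claim"))
body.add(point)        # move "Key claim" under "Argument"
body.expanded = False  # hide the detail under "Argument"
outline = essay.render()
```

Here `outline` shows "Essay", "Introduction", and a collapsed "Argument" marked with "+". Setting `body.expanded = True` and rendering again brings "Key claim" back into view at the deepest indent.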

If anyone has seen this functionality in a web app, please let me know!