The Information Silk Road

In this note I argue that recursive auction markets are not a good model for the global market in public information. Instead I suggest an agent-based model where data merchants compete to be suppliers of information to clients based almost entirely on the value they add above and beyond the value of the information itself. I ignore issues of data ownership and data privacy.

The original version of this note was sent to e$@thumper.vmeng.com on Friday, 29 Aug 1997 at 13:18:43 EDT. I realize that e$ is perhaps not the best forum for this message and welcome redirection to a more appropriate venue.

Mike Duvos suggested on 29Aug1997 in "A Distributed Network Cache Service" an idea for using micropayments to pay servers to store data. There was a follow-up on 31Aug1997 called "Adding Memory to the Net" which amplified further on this motif. I've saved some of these messages (thanks to RAH's e$pam) in [5].

A follow-up message was sent to e$@thumper.vmeng.com on Tuesday, 7 Oct 1997 at 08:39:27 EDT on maintaining data persistently. Its content has been incorporated into this document.

The Information Silk Road

I would like to draw a distinction between trading in digital services and trading information. There are interesting proposals to use market-based mechanisms to manage computer resources such as memory and CPU cycles[1] or network bandwidth[2]. Attempts have been made to extend this idea to trading information, such as the recursive auction market[3]. While the similarities between computer resources and the material world are close enough to make market-based ideas quite successful, I don't think the same is true for markets in information. Differences stem from the fact that information can be copied with perfect fidelity and essentially zero cost while both physical commodities and computing resources are strictly limited in quantity.

From another point of view, the problem is that the responsibilities of ownership are different. Market-based mechanisms work by clearly identifying an owner for each resource. The owner (in computational contexts this is usually a software agent) then tries to earn enough income from its asset to pay expenses. The job of the owner is then to avoid overuse (the tragedy of the commons problem) and underuse. However, with information, overuse is impossible (maximizing the use of information is actually beneficial to the general good) and only underuse is a risk. Underuse of information can even lead to a problem not shared with other limited commodities: the information may be lost altogether.

Practically speaking, there is another difference between computer resource commodities and information products. For example, any memory page can be sold to a software agent, and numerous agents will compete for a fixed pool of identical pages. However, while I can endlessly replicate a piece of information, each customer won't need more than one copy of each datum. Different agents may occasionally request the same piece of information, but this is rare. Partly this is because even a small cache near the customers means that the server sees a request only on each cache miss, not on each customer request. Therefore, an information seller must stock a huge selection of data and expect only a very tiny fraction of them to be in high demand.

Still, I would like to see both the digital silk road (markets for computer services) and the information silk road (markets for data) built. I am particularly interested in the design of the data merchants that will be the traders on the information silk road.

Recursive Auction Market

A useful contribution of the recursive auction market idea is that it denigrates the concept of information ownership. While ownership of information may still be useful to prevent valuable information from disappearing altogether (by being replaced in every supplier's inventory by more profitable data), it has nowhere near the standing of ownership of finite commodities.

The recursive auction market idea assumes that data in high demand would propagate from suppliers or intellectual property producers, through intermediate distributors, to consumers as fast as the network could carry it. At each stage the data is auctioned to the highest bidder who must factor into his bid price the network transfer charges[2] and his ability to further resell it. As the data is distributed, the rarity premium available to early sellers disappears, leaving only the basic network delivery costs.

These auctions require multiple simultaneous buyers. However, this would seem to be a very unusual case. Except for a few extreme cases, the market for data will consist of multiple buyers spread out in time. This means that a significant expense for the seller is the storage of the data between buyers. This storage rental must be factored into the data's price. It is a hard cost that depends on the interval between requests for each piece of data. Rarity, and the premium it adds to the price, is usually an ephemeral, even self-defeating, property of data. This suggests that an auction is not a good general paradigm to use for trading information.
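The dependence of the storage cost on the interval between requests can be made concrete with a small sketch. The function name, the flat rent per megabyte per day, and the sample figures are all assumptions chosen for illustration; nothing here comes from the recursive auction market proposal itself.

```python
def storage_cost_per_sale(size_mb, rent_per_mb_day, days_between_requests):
    """Storage rent that accrues while one datum waits for its next buyer.

    Assumes a flat rental rate (hypothetical). The result is a floor on the
    price: a sale that brings in less than this loses money on storage alone.
    """
    return size_mb * rent_per_mb_day * days_between_requests


if __name__ == "__main__":
    # Same datum, same rent; only the gap between buyers changes.
    for days in (1, 30, 365):
        floor = storage_cost_per_sale(1.0, 0.01, days)
        print(f"{days:>3} days between requests -> price floor {floor:.2f}")
```

The floor grows linearly with the gap between buyers, so rarely requested data carries the highest storage burden per sale, which is the opposite of the rarity premium an auction relies on.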

Data Merchants

A better model of the information silk road has a large population of data merchants spread throughout the network. Their mission is to maintain an inventory of data in anticipation of being able to resell it profitably at a later time. Viewed individually, these data merchants provide a data caching service. Viewed collectively, the information silk road is a replicated data storage facility[4].

So what services can a data merchant offer which customers might be willing to pay for? In other words, how can they add value?

Locality. Assuming network transport charges are sensitive to distance (as the packets fly), data will be cheaper from a local supplier than from a distant one. This means that a merchant can pay the transport charges to obtain the distant data once and attract local buyers by charging less than it would cost them to fetch it from a remote source. As long as there are multiple local buyers and the data storage costs incurred between requests are small compared to the differential between remote and local network transport charges, the local vendor can turn a profit.

Authoritativity. For small data, where the cost of transferring the data is comparable to the cost of requesting it, it will be useful for a client to pass its requests to a single merchant and expect a high probability of obtaining the data. Without consulting an authoritative source the client will potentially have to contact many merchants before finding the data it seeks. Unless the data is large this search cost may dwarf the actual transfer cost. Being an authoritative source does not have to be a global property, but might apply to narrowly defined types of data: hostname to IP address mappings, Perl scripts, or Cypherpunks messages. Being authoritative depends upon the data merchant maintaining a persistent reputation. Interestingly, an authoritative source doesn't actually have to store any data; it may operate purely as a locator service.

Speed. Some data merchants may strive to provide data rapidly by spending more on rotating media, doing more aggressive prefetch and caching, and having bigger servers and network connections. For some customers and some data, paying a premium for speed could be quite desirable.

Anonymity. Guaranteeing privacy for a client's requests could be an attractive option. This is another feature that would depend upon the merchant maintaining a good reputation.

Persistence. Offering to store data for a customer in exchange for a fee would be a valuable service. In this case the data merchant is renting disk space instead of acting as a cache.
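The profit condition described under Locality can be sketched as a few lines of arithmetic. This is only an illustration under stated assumptions: the function name, the idea that the merchant fetches the datum when the first local buyer asks for it, the fixed storage cost per gap between buyers, and the fixed undercut are all hypothetical, and the amounts are in arbitrary units.

```python
def local_merchant_profit(remote_fetch, local_delivery, storage_per_gap,
                          buyers, undercut):
    """Profit from fetching a datum once and reselling it to nearby buyers.

    Assumptions (hypothetical, for illustration only):
      - each buyer's alternative is to pay remote_fetch to get the datum
        from the distant source;
      - each buyer pays local_delivery for transport from the merchant;
      - the merchant prices so that a buyer's total cost is `undercut`
        below the remote alternative;
      - the merchant fetches on the first request, then pays
        storage_per_gap while waiting for each later buyer.
    """
    if buyers < 1:
        return 0
    price = remote_fetch - local_delivery - undercut
    revenue = price * buyers
    costs = remote_fetch + storage_per_gap * (buyers - 1)
    return revenue - costs


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "buyer(s):", local_merchant_profit(10, 2, 1, n, 1))
```

With a single buyer the merchant cannot recover the remote fetch, which matches the requirement for multiple local buyers. Raising storage_per_gap toward the gap between remote_fetch and local_delivery erases the profit from each additional sale.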

Data Persistence

Mike Duvos suggested on 29Aug1997 in "A Distributed Network Cache Service" an idea for using micropayments to pay servers to store data. There was a follow-up on 31Aug1997 called "Adding Memory to the Net" which amplified further on this motif. I've saved some of these messages (thanks to RAH's e$pam) in [5].

I have two basic problems with Mike's idea. The first is the payment model, and I think I have a good solution for this. The second is that I don't see any financial incentive for data merchants to maintain data that no one is requesting. This seems to me to throw a bit of a wrench into the whole eternity file service idea.

I originally saw two basic approaches for a customer to pay a data merchant to hold his data. Either pay in advance; the customer must trust the merchant not to take the money and drop the data. Or pay to get the data back; the customer must trust the merchant to retain the data and the merchant must trust the customer to still be around when the storage period is over. Clearly there are also half-now-half-later variants. None of these seemed very sound.

Taking a cue from Nick Szabo's Smart Contracts[6] and Robert Hettinga's Digital Bearer Bonds[7], it occurred to me to construct a sort of bond to attach value to the persistence of the data. To do this, deposit some money with an escrow agent in exchange for a digital instrument which encodes a hash value and a date. The instrument entitles the holder to present the data matching the hash value after the date and receive the money in exchange. The escrow agent has instructions for what to do when the data is returned; perhaps it keeps it for a short while pending pick-up or e-mails it to a specified address.
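A minimal sketch of such an instrument and the escrow agent's redemption check might look as follows. Everything named here is an assumption for illustration: the class and method names, the use of SHA-256 as the hash, the serial number, amounts in cents, and treating the maturity date itself as redeemable. The sketch also omits what a real bearer instrument would need, such as signatures by the escrow agent and a transfer protocol, and it does not model the instructions for returning the data to its owner.

```python
import hashlib
import itertools
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PersistenceBond:
    """Hypothetical instrument: a hash, a date, and the money it unlocks."""
    serial: int        # distinguishes otherwise identical bonds
    data_hash: str     # hex SHA-256 digest of the data to be preserved
    maturity: date     # earliest date on which the bond can be redeemed
    face_value: int    # amount held in escrow, in cents


class EscrowAgent:
    """Holds deposits and pays out when matching data is presented."""

    def __init__(self):
        self._serials = itertools.count(1)
        self._outstanding = {}  # serial -> bond still awaiting redemption

    def issue(self, data: bytes, maturity: date, deposit: int) -> PersistenceBond:
        """Accept a deposit and mint a bond tied to the hash of `data`."""
        digest = hashlib.sha256(data).hexdigest()
        bond = PersistenceBond(next(self._serials), digest, maturity, deposit)
        self._outstanding[bond.serial] = bond
        return bond

    def redeem(self, bond: PersistenceBond, data: bytes, today: date) -> int:
        """Pay the face value if the bond is live, mature, and the data matches."""
        if self._outstanding.get(bond.serial) != bond:
            raise ValueError("unknown or already redeemed bond")
        if today < bond.maturity:
            raise ValueError("bond has not matured")
        if hashlib.sha256(data).hexdigest() != bond.data_hash:
            raise ValueError("data does not match the bond's hash")
        del self._outstanding[bond.serial]
        return bond.face_value


if __name__ == "__main__":
    agent = EscrowAgent()
    bond = agent.issue(b"some data worth keeping", date(1998, 10, 7), 200)
    print(agent.redeem(bond, b"some data worth keeping", date(1998, 10, 8)))
```

The holder is paid only by producing the exact bytes, so whoever holds the bond has a direct financial reason to keep the data intact until maturity.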

Anyone wanting to preserve some data can create such a bond and sell it, with the data, to a data merchant at a discount from the face value. To first order, this discount represents the cost of storing the data until the bond matures. Normally the market value of the bond would increase as time passes until the date of maturity arrives, when the bond, with the data, can be redeemed at face value.

The data merchant can, of course, sell the data alone to customers requesting it. If he perceives a large resale market he may offer to buy the bond at a reduced discount, since retailing the data will defray some of his storage costs. The data merchant may also decide to resell the bond itself, using an e$-like protocol, to some other data merchant. If there are few requests for the data it may be more profitable to sell the bond to a data archiving service that doesn't keep data on rotating media but instead keeps it on tertiary storage, sorted by maturity date, along with other bonds. These tertiary media represent future cash and can be reloaded after the collective maturity date and redeemed with the appropriate escrow agents. Presumably the cost of such off-line storage would be considerably lower than for online storage, so the discount to face value would be smaller and the selling data merchant could reap a small profit on the sale.

This mechanism assures, with fair confidence, that data can be cast onto the network and be reliably recovered in the future. Of course, it doesn't guarantee that the data will not be lost or suppressed; it merely establishes the cost of such an action. If I dearly desire that the data be retained I can create a bond with a large face value. Since the cost of storing the data is pretty much unaffected by the face value, the discount is similarly unaffected. The main difference is that the bond holder has more at risk if the data is lost in a disk crash or other calamity. However, nothing stops a censorious attacker from buying the bond and discarding it, except lack of funds.

There is presumably some art in selecting the face value of a bond. Clearly the value needs to reflect the storage costs of the data, because, after discounting, the value still needs to be positive. Further, if the discounted value is too small the data merchant has little incentive, at least at first, to avoid losing or even discarding the data. For example, if the cost to store a megabyte of data for a year is $1, then a $1.10 bond would sell initially for a dime. This may or may not be enough to convince the data merchant to hold on to it for a year. After 11 months, though, it would be worth about $1.02 ($1*11/12 + $0.10). Perhaps a $2 face value would make more sense: the initial and final values would differ by no more than a factor of two. On the other hand, a $20 face value would sell initially for about $19, so its value to data merchants is large and approximately constant for the life of the bond. The limit on face value is that the premium the data merchant has to pay for insurance against disk crashes is proportional to the face value. Data assurance costs will represent an additional discount from the bond's face value, but after all, improving the safety of the data is why the large face value was selected in the first place.
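The arithmetic in this example follows from the first-order rule that a bond is worth its face value less the storage still to be paid for. The sketch below applies that rule to the three face values discussed; the function name is hypothetical, and insurance premiums, interest, and any resale income from the data are deliberately left out.

```python
def bond_value(face_value, storage_cost_per_year, years_remaining):
    """First-order market value: face value minus the storage yet to be paid.

    Ignores insurance against data loss, interest, and resale income;
    these are simplifying assumptions, not part of the original scheme.
    """
    return face_value - storage_cost_per_year * years_remaining


if __name__ == "__main__":
    storage = 1.00  # dollars to store one megabyte for one year
    for face in (1.10, 2.00, 20.00):
        initial = bond_value(face, storage, 1.0)        # at issue
        late = bond_value(face, storage, 1.0 / 12.0)    # after 11 months
        print(f"face ${face:5.2f}: initial ${initial:5.2f}, "
              f"after 11 months ${late:6.3f}, "
              f"face/initial = {face / initial:.2f}")
```

The ratio of face value to initial value shows how strongly the merchant's stake grows over the bond's life: it is large for the $1.10 bond, exactly two for the $2 bond, and close to one for the $20 bond.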

If the original owner of the data has lots of money to spend it may make more sense to mint additional bonds (probably with different escrow agents) with smaller face values and sell them to different data merchants. There is still no guarantee that the data will be preserved, but at least the cost to suppress the data is quantifiable.

References

[1] Mark S. Miller and K. Eric Drexler, "The Agorics Papers", http://www.webcom.com/~agorics/agorpapers.html.
[2] Norman Hardy and Eric Dean Tribble, "The Digital Silk Road", http://www.webcom.com/~agorics/dsr.html.
[3] "Email collected from e$ & e$pam", RecursiveAuctionMarket.txt.
[4] "Eternity Service", http://www.dcs.ex.ac.uk/~aba/eternity/.
[5] Mike Duvos, "Distributed Network Cache Service and Adding Memory to the Net", DistributedNetworkCacheService.txt.
[6] Nick Szabo, "Formalizing and Securing Relationships on Public Networks", http://www.firstmonday.dk/issues/issue2_9/szabo/index.html. (I haven't yet read this reference, but I've read earlier material on the same subject).
[7] Robert Hettinga, "e$: What's a Digital Bearer Bond?", http://www.shipwright.com. I know Bob has a rant on this in there somewhere. Or try http://www.tiac.net/users/rah/dbb.html.