<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>JoeCascio.net &#187; distributed, microblogging, twitter</title>
	<atom:link href="http://joecascio.net/joecblog/tag/distributed-microblogging-twitter/feed/" rel="self" type="application/rss+xml" />
	<link>http://joecascio.net/joecblog</link>
	<description>Everyone is entitled to my opinion</description>
	<lastBuildDate>Fri, 06 Apr 2012 13:19:29 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Distributed Twitter &#8211; the hard bits</title>
		<link>http://joecascio.net/joecblog/2008/05/06/distributed-twitter-the-hard-bits/</link>
		<comments>http://joecascio.net/joecblog/2008/05/06/distributed-twitter-the-hard-bits/#comments</comments>
		<pubDate>Tue, 06 May 2008 18:32:23 +0000</pubDate>
		<dc:creator>JoeC</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[distributed]]></category>
		<category><![CDATA[distributed, microblogging, twitter]]></category>
		<category><![CDATA[microblogging]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://peeps.3greeneggs.com/joecblog/?p=11</guid>
		<description><![CDATA[If you don&#8217;t read all of this admittedly long post, please do skip to the end and check out the BarCampBoston info. I&#8217;ll be holding a session there on the topic of Distributed Microblogging. Ok, so let&#8217;s talk about the hard bits of doing Distributed Microblogging. It&#8217;s easy to envision a multitude of servers exchanging [...]]]></description>
			<content:encoded><![CDATA[<p><strong>If you don&#8217;t read all of this admittedly long post, </strong>please do skip to the end and check out the <a title="BarCampBoston home page" href="http://www.barcampboston.org/" target="_blank">BarCampBoston </a>info. I&#8217;ll be holding a session there on the topic of Distributed Microblogging.</p>
<p>Ok, so let&#8217;s talk about the hard bits of doing Distributed Microblogging. It&#8217;s easy to envision a multitude of servers exchanging microblog posts, and a UI that simply arranges the posts in chronological order. By the way, chronological order is easy to do if all the servers are synched with a time standard like <a title="National Institute of Standards and Technology" href="http://tf.nist.gov/timefreq/service/its.htm" target="_blank">nist.gov, </a>and most are. But the hard bit is making it work in a way that performs well and scales to large populations of both microblog posters and readers. I&#8217;ve been thinking of some different alternatives for this, which I&#8217;ll lay out here. As always, your thoughts are welcome.</p>
<h3>Performance</h3>
<p>So, what do I mean by &#8220;performs well&#8221;? Well, microblog (i.e., Twitter) updates happen much more frequently than what we&#8217;d consider traditional blog posts, but not quite as fast as instant messenger or chat updates. And microblogs don&#8217;t have the notion of presence attached to them. You don&#8217;t worry about whether a Twitter poster is &#8220;online&#8221; at any particular time, although you may deduce that from the rapidity of their responses to you.</p>
<p>To put a number on it, I&#8217;d say that a microblog post notification should be transmitted in less than a minute, ideally in a few seconds, although a delay of up to 5 minutes is not that objectionable. I frequently will ignore my Twitter feed for many minutes, sometimes hours. I have no expectation that I will see someone&#8217;s posts immediately (i.e., less than a second) nor do I care to. Microblogging is a river of updates. You don&#8217;t expect to see every single one.</p>
<h3>Size</h3>
<p>Now let&#8217;s talk numbers of followers and following. A typical Twitter user follows 100 or fewer people, and has a similar number following them. But the edge cases are far bigger. Someone like Robert Scoble (@scobleizer) or Chris Brogan (@ChrisBrogan) follows literally thousands of other users, and has even more following them.</p>
<p>Ok, so we need to be able to update thousands of users nominally in less than a minute, and at the extreme in less than five minutes. If you don&#8217;t buy that, please leave a comment explaining why these limits are not realistic.</p>
<h3>Alternatives</h3>
<p><strong>RSS feed polling</strong></p>
<p>The simplest solution would be polling, which is reader initiated. Followers would poll the feeds of the microblogs they were interested in. It seems to me that polling has to be discarded as a solution except for occasional or retrospective uses. I think a solution should include RSS feeds, but for someone with 100 friends to poll those RSS feeds every minute, or ideally faster, just because one of them <strong>might</strong> have posted something is a stupendous waste of bandwidth.</p>
<p>On the sending side, if that same person had 100 followers, each of those followers would be polling the person&#8217;s RSS feed every few seconds causing a very high server load. Now multiply that by however many people&#8217;s accounts are hosted on that server and the problem quickly blows up.</p>
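<p>To put rough numbers on the polling cost, here is a back-of-the-envelope sketch in Python. The function names, the one-minute interval, and the figure of 1,000 hosted accounts are illustrative assumptions, not measurements; only the 100 followers comes from the discussion above.</p>
<pre>
```python
# Rough estimate of polling load. All names and figures here are hypothetical.

def polls_per_day(feeds, interval_seconds):
    """Requests issued per day when each feed is polled at a fixed interval."""
    seconds_per_day = 24 * 60 * 60
    return feeds * (seconds_per_day // interval_seconds)

def server_hits_per_day(accounts, followers_each, interval_seconds):
    """Incoming polls per day for a server hosting many polled accounts."""
    return accounts * polls_per_day(followers_each, interval_seconds)

# One reader following 100 microblogs, polling each once a minute.
reader_requests = polls_per_day(feeds=100, interval_seconds=60)

# One server hosting 1,000 accounts (assumed), each with 100 polling followers.
server_requests = server_hits_per_day(1000, 100, 60)

print(reader_requests)   # 144000
print(server_requests)   # 144000000
```
</pre>
<p>Every one of those requests is made whether or not anything new was posted, which is the waste that notification is meant to avoid.</p>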
<p><strong>Notification</strong></p>
<p>A much more efficient solution would be for senders (microblog posters) to notify followers when a new post has been made, and perhaps to proactively send the new content in the notification message. Then, resources are used only when there is actual traffic to send. Both senders and receivers are quiescent when no one is posting. So what are the possibilities for implementing notification? I see the following.</p>
<ol>
<li>RSS cloud API &#8211; a little-known part of the RSS specification, the cloud element allows a feed to publish a web-service address that readers of the feed can register with to be notified of changes using a SOAP or XML-RPC call.</li>
<li>Jabber (XMPP) channels between DMB servers to carry notifications and content.</li>
<li>UDP notification with http callback. UDP is lightweight for both senders and receivers.  No open connections are required between senders and receivers. It&#8217;s sort of like RSS cloud, but narrowly and specifically designed for DMB, as opposed to generalized RSS.</li>
</ol>
<p><a title="RSS cloud spec" href="http://cyber.law.harvard.edu/rss/soapMeetsRss.html" target="_blank"><strong>RSS cloud API</strong></a></p>
<p>The cloud API was designed specifically for this purpose of notifying readers of content updates. Its original intent, judging from the RSS 2.0 spec, was to allow actual client feed readers to register with the cloud. In the case of DMB, it would be cooperating servers that would register for notification with each other.</p>
<p>The problem with RSS cloud is overhead. Microblog entries are tiny and frequent compared with blog entries or traditional site updates. Requiring a follower server to read an entire RSS document to get 140 characters of content, every few minutes whenever the poster updates, would be inefficient to say the least. In addition, there is <a title="Forum article on the RSS cloud API" href="http://lists.apple.com/archives/syndication-dev/2006/Jan/msg00032.html" target="_blank">experience with the cloud API</a> that indicates that just the HTTP session overhead for notifying many users becomes intolerable, although this was from the perspective of actual clients being notified as opposed to clients&#8217; servers being notified.</p>
<p><strong>Jabber (XMPP)</strong></p>
<p>Jabber is a very tempting candidate for this application, and has been getting quite a bit of discussion in the development community of late. Here&#8217;s <a title="Twitter application of Jabber pubsub" href="http://www.process-one.net/en/blogs/article/introducing_the_xmpp_application_server/" target="_blank">an example</a>. The advantage to Jabber is that it maintains open sessions between servers, which eliminates the session setup/teardown overhead, and allows for almost instantaneous notification of all &#8220;following&#8221; parties.</p>
<p>But this may also be a disadvantage in situations where there are hundreds or thousands of &#8220;followers&#8221; for a single sender. IM or chat is typically one-to-one, or one to a few, but microblogging is frequently one to hundreds or one to thousands. I am not familiar enough with Jabber servers in actual practice to know what their performance or connection limitations are. <strong>Anyone with Jabber implementation or operational experience is strongly encouraged to comment.</strong></p>
<p>Jabber&#8217;s concept of presence could be used to keep the number of messages down by requesting message updates only when a user is actually logged into his/her microblog system. What&#8217;s interesting about this notion is what &#8220;logged in&#8221; means. Microblogging, at least the way Twitter works, does not really require the concept of presence. For instance, you can be &#8220;logged in&#8221; to Twitter, but dormant for hours, not getting any updates until you request a page refresh.</p>
<p>In fact, one key aspect of microblogs that differentiates them from IM or chat is that they don&#8217;t typically &#8220;auto-update&#8221;. And rather than this being a disadvantage, I, and I think many others, find this on-demand update to be much more useful than a streaming IM or chat window. It&#8217;s really more like reading small blog posts. I read when <strong>I </strong>want to, not when someone else decides to say something. So, using the presence capability without auto-updating will require a little clever UI design to, in a sense, auto-logoff the user when their web page hasn&#8217;t been refreshed in a certain period of time. This, then, will also require that followers be able to query the senders they follow for &#8220;back posts&#8221;, so they can see what happened in the past without keeping a client logged in all the time to save all the posts.</p>
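<p>The auto-logoff and back-post idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, the 15-minute timeout is an arbitrary value, and posts are modeled as simple (number, text) pairs.</p>
<pre>
```python
# Hypothetical sketch: infer presence from page refreshes, then catch up
# on back posts. Nothing here is part of Jabber or any existing protocol.

IDLE_TIMEOUT = 15 * 60  # seconds without a refresh before auto-logoff (assumed)

class FollowerSession:
    def __init__(self):
        self.last_refresh = None   # time of the most recent page refresh
        self.last_seen_post = 0    # highest post number already displayed

    def refresh(self, now):
        self.last_refresh = now

    def is_present(self, now):
        """True while the user has refreshed within the idle timeout."""
        if self.last_refresh is None:
            return False
        return IDLE_TIMEOUT >= now - self.last_refresh

def back_posts(posts, session):
    """Posts the follower has not seen yet, in their original order."""
    return [post for post in posts if post[0] > session.last_seen_post]
```
</pre>
<p>A sending server would push updates only while <code>is_present</code> holds, and a returning follower would call something like <code>back_posts</code> to fill in what was missed.</p>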
<p><strong>UDP for notification</strong></p>
<p>In the past year, I did some work on a revamped email protocol I called IMTP that uses small UDP messages to notify receiving servers that the sender has some traffic for them. The receiver then calls back to the sender using TCP to get the message body. This was based on Prof. Daniel Bernstein&#8217;s <a title="Internet Mail 2000 page" href="http://cr.yp.to/im2000.html" target="_blank">Internet Mail 2000 </a>proposal several years ago. He proposed, quite rightly in my opinion, that mail senders should bear the burden of storing the message contents, not the receivers, and that mail content should only be sent when a receiver actually wants to read it.</p>
<p>The advantage to UDP is that it is very lightweight for both the sender and the receiver, not requiring any session overhead or setup/teardown. If used for microblogging, UDP notification messages could take the place of the continuously open TCP sessions that Jabber employs, thus reducing the session resource allocation load on both ends.</p>
<p>Now, of course, the issue with UDP is that it is not guaranteed to be delivered. The IMTP service we built would retry the UDP message at some frequency until the receiver called back to either fetch or reject the message.</p>
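<p>As a rough illustration of that retry behavior, here is a Python sketch of the sender-side bookkeeping. It is not the IMTP code; the class name, method names, and 30-second interval are assumptions made for the example, and no real network I/O is performed.</p>
<pre>
```python
# Hypothetical sender-side retry bookkeeping for UDP notifications.

RETRY_INTERVAL = 30  # seconds between retries (assumed value)

class PendingNotifications:
    def __init__(self):
        self._pending = {}  # (receiver, message_number) maps to last send time

    def sent(self, receiver, message_number, now):
        """Record that a notification datagram was just sent."""
        self._pending[(receiver, message_number)] = now

    def called_back(self, receiver, message_number):
        """The receiver fetched or rejected the message, so stop retrying."""
        self._pending.pop((receiver, message_number), None)

    def due_for_retry(self, now):
        """Notifications with no callback yet whose interval has elapsed."""
        return sorted(key for key, last in self._pending.items()
                      if now - last >= RETRY_INTERVAL)
```
</pre>
<p>A periodic timer would call <code>due_for_retry</code>, resend those datagrams, and record each resend with <code>sent</code>.</p>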
<p>Conceptually, using UDP messages with a 140 character payload and a message number would fit a microblogging application very well. Because microblogging has no expectation or requirement of presence or real-time delivery like chat and IM do, a dropped UDP message is not a tragedy. If the sender retried even once or twice per minute, that would be plenty of timeliness for microblogging. Plus, if the UDP messages carry a monotonically increasing message number, the receiver can know if it has missed a message and simply call back to the sender to get it. The microblogging UI can reconstruct the sequence easily when the missing pieces, if any, finally come through.</p>
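<p>Here is a minimal Python sketch of such a message and of the gap check on the receiving side. The wire layout (a 4-byte big-endian message number followed by UTF-8 text) and all function names are assumptions made for illustration, not a proposed format.</p>
<pre>
```python
import struct

# Hypothetical wire layout: 4-byte big-endian message number, then the
# UTF-8 text of the post. This is an assumption, not a defined format.
HEADER = struct.Struct("!I")

def pack_update(message_number, text):
    """Build the datagram payload for one microblog update."""
    return HEADER.pack(message_number) + text.encode("utf-8")

def unpack_update(datagram):
    """Split a received payload into (message_number, text)."""
    (message_number,) = HEADER.unpack_from(datagram)
    return message_number, datagram[HEADER.size:].decode("utf-8")

def missing_numbers(last_seen, received):
    """Message numbers skipped between the last one seen and the new one."""
    return list(range(last_seen + 1, received))
```
</pre>
<p>If the receiver last saw message 4 and message 7 arrives, <code>missing_numbers(4, 7)</code> yields 5 and 6, which are the ones it would call back to the sender to fetch.</p>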
<p>A notification system using UDP would seem to minimize the resource requirements on both senders and receivers, and can perform the same kind of message fan-in/fan-out optimization that Jabber could. In other words, a given microblog update body needs to be sent only once to any server, regardless of how many followers there may be on that destination server.</p>
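<p>The fan-out optimization amounts to grouping followers by destination server before sending. A small Python sketch follows, assuming follower addresses look like <code>user@server</code>; that addressing convention and the function name are hypothetical.</p>
<pre>
```python
from collections import defaultdict

def group_by_server(followers):
    """Map each destination server to the followers hosted there.

    Addresses are assumed to look like 'user@server', an illustrative
    convention rather than part of any defined protocol.
    """
    servers = defaultdict(list)
    for address in followers:
        user, _, server = address.rpartition("@")
        servers[server].append(user)
    return dict(servers)

followers = ["ann@one.example", "bob@two.example", "cy@one.example"]
deliveries = group_by_server(followers)
# Two notifications go out, one per server, instead of three.
```
</pre>
<p>Each receiving server is then responsible for fanning the single update back out to its local followers.</p>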
<p><strong>Comments, please!</strong></p>
<p>I am very interested in what others think and have to say about these issues. This problem of efficient and <em>timely-enough</em> notification, it seems to me, is the tough nut to crack for a good solution to microblogging.</p>
<p>By the way, I am going to do a session on Distributed Twitter (or Microblogging) at <a title="BarCampBoston home page" href="http://www.barcampboston.org/" target="_blank">BarCampBoston3 on May 17, 18. </a>We are going to try to bring in some mobile technology folks to discuss the other really interesting issue with Distributed Twitter, the SMS connection.</p>
<p>Also, at BarCampBoston, I am going to try to organize some kind of group to try implementing a Distributed Microblogging application going forward.</p>
]]></content:encoded>
			<wfw:commentRss>http://joecascio.net/joecblog/2008/05/06/distributed-twitter-the-hard-bits/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Distributed Twitter &#8211; Overview</title>
		<link>http://joecascio.net/joecblog/2008/05/03/distributed-twitter-overview/</link>
		<comments>http://joecascio.net/joecblog/2008/05/03/distributed-twitter-overview/#comments</comments>
		<pubDate>Sun, 04 May 2008 01:49:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[distributed, microblogging, twitter]]></category>

		<guid isPermaLink="false">http://peeps.3greeneggs.com/joecblog/?p=10</guid>
		<description><![CDATA[I want to quickly set down a high-altitude view of how I see a Distributed Twitter working. This should give you the basic concept, which I&#8217;ll then elaborate in more detail in subsequent posts. First of all, let&#8217;s call it something more generic. I like Distributed MicroBlogging or DMB. The &#8220;Distributed&#8221; part is really the [...]]]></description>
			<content:encoded><![CDATA[<p>I want to quickly set down a high-altitude view of how I see a Distributed Twitter working. This should give you the basic concept, which I&#8217;ll then elaborate in more detail in subsequent posts.</p>
<p>First of all, let&#8217;s call it something more generic. I like Distributed MicroBlogging or DMB. The &#8220;Distributed&#8221; part is really the key. Unlike a centralized, proprietary walled garden system, DMB would be spread out over hundreds or thousands of different servers over the internet.</p>
<p>Just like email or Jabber, anyone could run a DMB server. People would register on a particular server with their <a title="OpenID Foundation" href="http://openid.net" target="_blank">OpenID </a>and create or contribute to <strong>microblogs </strong>that other people could follow, &#224; la Twitter. Note that <strong>people</strong> are different from <strong>microblogs</strong> as entities in the architecture. This is somewhat different from Twitter, in which there are only user accounts. This allows a form of the long-sought groups feature, implemented as microblogs that many people can contribute to.</p>
<p>So, <strong>people </strong>contribute to <strong>microblogs </strong>that are <strong>followed</strong> by other people. When someone updates a microblog, anyone on any server that is following (i.e., subscribing to) that microblog will get the update in whatever client they have running <em>when the client fetches it</em>. A Twitter-like client will display the posts from many different users interleaved in chronological order. Some clients could maintain a &#8220;live&#8221; real-time update, while other clients could update only on demand from the user, like the Twitter.com home page.</p>
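<p>One way to picture the split between people and microblogs is as two separate record types. The Python sketch below is purely illustrative: the class names, field names, and example URLs are hypothetical and not part of any DMB specification.</p>
<pre>
```python
from dataclasses import dataclass, field

# Hypothetical data model. The names are illustrative only.

@dataclass
class Person:
    openid: str  # OpenID URL the person registered with

@dataclass
class Microblog:
    name: str
    contributors: list = field(default_factory=list)  # people allowed to post
    followers: list = field(default_factory=list)     # people subscribed to it

    def can_post(self, person):
        return person in self.contributors

# A group is simply a microblog with more than one contributor.
ann = Person("https://ann.example/")
bob = Person("https://bob.example/")
team = Microblog("team-updates", contributors=[ann, bob], followers=[ann])
```
</pre>
<p>A Twitter-style personal stream is then just the special case of a microblog with exactly one contributor.</p>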
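<p>The chronological interleaving a client performs is a simple merge of streams that are each already sorted by time. Here is a short Python sketch, assuming each post is a (timestamp, microblog, text) tuple; the tuple layout and the sample data are made up for illustration.</p>
<pre>
```python
import heapq

def interleave(*streams):
    """Merge per-microblog streams, each already sorted by timestamp."""
    return list(heapq.merge(*streams, key=lambda post: post[0]))

# Posts are (unix_timestamp, microblog, text) tuples. The data is made up.
alice = [(100, "alice", "first"), (160, "alice", "third")]
team = [(130, "team", "second")]
timeline = interleave(alice, team)
```
</pre>
<p>This relies on the servers agreeing closely enough on the time for the timestamps to be comparable.</p>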
<p>That&#8217;s another important point. The DMB architecture, like Jabber and SMTP, does not specify any particular user interface. How the microblogs are presented is up to a UI designer. The architecture only specifies what data are interchanged, not how it is presented.</p>
<p>So, that&#8217;s a very simple explanation of how I see Distributed MicroBlogging working. There could be large public servers like Google or small private servers for individual companies or groups of people. A server could host hundreds or thousands of microblogs and users, or just one microblog with a single user.<br />
I can envision a given individual&#8217;s domain delegating its microblogging functions to a larger server, much as an individual&#8217;s home site can delegate its OpenID functions to a large identity service company.</p>
<p>Next, I&#8217;ll talk about the single most challenging implementation problem for DMB &#8211; notification. How does a DMB server notify other following servers that a change has taken place on one of its microblogs?</p>
]]></content:encoded>
			<wfw:commentRss>http://joecascio.net/joecblog/2008/05/03/distributed-twitter-overview/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Distributed Twitter</title>
		<link>http://joecascio.net/joecblog/2008/04/20/distributed-twitter/</link>
		<comments>http://joecascio.net/joecblog/2008/04/20/distributed-twitter/#comments</comments>
		<pubDate>Sun, 20 Apr 2008 21:15:09 +0000</pubDate>
		<dc:creator>JoeC</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[distributed, microblogging, twitter]]></category>

		<guid isPermaLink="false">http://peeps.3greeneggs.com/joecblog/?p=6</guid>
		<description><![CDATA[One of the things that drives everyone nuts about Twitter is its unreliability. In fact, it&#8217;s having a little case of the no-updates this morning (2008.04.20). This is a less frequently discussed but ultimately, I think, more important disadvantage to walled gardens. If you rely on a service and it goes down for maintenance or [...]]]></description>
			<content:encoded><![CDATA[<p>One of the things that drives everyone nuts about <a href="http://twitter.com">Twitter </a>is its unreliability. In fact, it&#8217;s having a little case of the no-updates this morning (2008.04.20). This is a less frequently discussed but ultimately, I think, more important disadvantage to walled gardens. If you rely on a service and it goes down for maintenance or failure, or it is subjected to any one of several denial-of-service attacks by hackers, you&#8217;re out of luck. You&#8217;re subject to how much, or more properly, how little time and money the service&#8217;s administrators or financial backers have put into site dependability.</p>
<p>So, what could be done about this problem? The answer would be to design a service that is <strong>distributed</strong> instead of centralized. A distributed Twitter service would operate like email. There&#8217;s no single point of failure for  email because there&#8217;s no single super email server or service through which everything flows. Email is based on a protocol, not a single service, or even group of services. Anyone can run an email server and exchange mail with anyone else running a server. I think sometimes people would <em>like</em> email to go down for a few days because they feel overwhelmed by it, but that&#8217;s a different blog post entirely. <img src='http://joecascio.net/joecblog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Of course, you might say it all flows through the Internet, but the net itself is distributed. Yes, there are backbone subnets that carry huge amounts of traffic and would degrade your service if they went down (and they sometimes do), but there are well-known ways to re-route traffic around failed links.</p>
<p>Twitter could most definitely be distributed. Over the past few months there have been some real developments toward this end, and conversation in the development community about ways to do it. The single most complete development done so far is <a title="Prologue project page" href="http://prologuetheme.org/" target="_blank">Prologue, </a>a WordPress theme that allows several people to &#8220;co-author&#8221; blog posts and share their &#8220;update stream&#8221; via RSS. It&#8217;s been most talked about as a Twitter for corporate co-workers that in a certain sense gives the long-desired groups capability to Twitter.</p>
<p>Prologue is a step in the right direction and hints at what is possible, but is not a general solution that would scale to accommodate thousands of users. The guys at WordPress that developed Prologue have said they&#8217;re not particularly interested in developing a fully distributed Twitter-clone, but in the hopes that someone else might pick up the ball and advance it, they&#8217;ve made the code open and available under an open-source license.</p>
<p>I&#8217;ve been thinking a lot about this problem myself and in this introductory post I&#8217;d like to outline what I think are the major requirements and architectural issues to be solved in creating a fully distributed microblogging platform. Do you have any thoughts to offer on this? Please comment!</p>
<ol>
<li><strong>Open<br />
</strong>Although it&#8217;s certainly possible to design a closed microblogging architecture (ie, distributed but a proprietary implementation), it&#8217;s certainly in the general user&#8217;s interest to make it open, which would allow for competing implementations. Why break down a walled garden only to create another one?</li>
<li><strong>Protocol-defined<br />
</strong>In order to be open and distributed, the architecture must be defined by a protocol, not by a server program or database schema, or other implementation artifact.</li>
<li><strong>Minimal</strong><br />
Occam&#8217;s Razor should be the guideline. A new microblogging architecture should specify as little as possible in terms of UI, security techniques, etc. OpenID is an excellent example of a standard that leaves many important features (such as authentication technique) to the various implementations.</li>
<li><strong>Privacy-capable<br />
</strong>I think there is a rising concern and general wariness of the completely open and public nature of many social media and social networking applications. Users want to be sure they know who sees their posts and control who can see what &#8220;friends&#8221; they have. So, there should be a general capability to control access to your microblog and its meta-data.</li>
<li><strong>Extensible, Forward-Compatible</strong><br />
It should be possible for an implementation to add new capabilities without rendering previous versions inoperative or non-interoperable.</li>
<li><strong>Scalable</strong><br />
The architecture should be scalable to the entire internet. A given user should be able to subscribe to or service subscriptions from tens of thousands of other users.</li>
<li><strong>Efficient</strong><br />
In order to fulfill #6, the architecture must be efficient in its use of network and computing resources. In particular, this would argue against polling RSS feeds as a way to collect input. Polling a blog once a day works, but a microblog stream like Twitter needs to react almost immediately.</li>
<li><strong>Standards-based</strong><br />
Whenever possible, new open architectures should utilize existing standardized protocols and data formats. This does not mean that a new protocol or format is out of the question, but there had better be a very good reason why an existing standard cannot be used.</li>
<li><strong>Open-source</strong><br />
While proprietary implementations of a standard cannot be prevented, open source implementations should be favored, especially with respect to security issues. Only by inspecting the source can one be assured there are no easter-egg back doors or other weaknesses.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://joecascio.net/joecblog/2008/04/20/distributed-twitter/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>
