Haystack powers allyoucanupload and will soon power all of Webshots (Webshots is a photo sharing community with 19,000,000 members who have uploaded over 375,000,000 photos). Allyoucanupload is an image hosting service that we built to run alongside Webshots.
Haystack is designed to provide a very scalable, reliable and cost effective platform for object storage and delivery to the Internet. It just went live 2 weeks ago - we are currently using it for the allyoucanupload Webshots image hosting service (gif, jpeg and png). In the very near future, it will serve all Webshots photos, and soon, video.
I'm doing a long post because I'm very proud of what our technical team has accomplished, but also to give some insight into how finance, user and technical strategy intersect in our group. Building a large and sustainable (aka profitable) business in social media requires a balancing act between the delivering on a great user promise, a revenue model and keeping your costs under control. Haystack will allow us to deliver very robust storage solutions for users at a very low marginal cost.
Disclaimer/Credit: Almost all of the technical content in this blog post was written by Paul O, who runs CNET Networks' data center services, including database architecture, network systems and operations and the actual data center. Paul, Jim, Rodolphe, Marcus, and Matthew built Haystack - they don't (yet) have blogs so I'm doing this post. Please do not attribute any technical props to me because of this post. My only contribution to Haystack was to approve its development and cheerlead along the way.
Haystacks' content (social media files) has several interesting characteristics: it grows without bound; it tends toward write-once, read-many; the most recent content tends to be the most frequently accessed. Haystack's design leverage these characteristics.
The challenge: A big financial and therefore technical issue is the relationship of storage to delivery. It's relatively easy to deliver a small number of files to a lot of people. And a large number of files to a small number of people. A large number of files to a large number of people gets more complicated and can get very expensive very quickly.
Haystack gives us the ability to finely match the raw storage capacity of the system with it's overall IO capacity. Haystack grows very naturally through incremental addition of capacity. Haystack is designed to handle failures automatically and to keep reliability constant as the system grows. Haystack uses commodity hardware and software.
The promise of Haystack is that we can handle reliability at
scale at a very low (perhaps the lowest) marginal operating cost.
Reliability means that we never have to say we're sorry - we lost your
photos. Scale is scale. Low marginal cost directly goes to our
ability to give users as much storage as we can, and run the least
intrusive ads, while running Webshots with a sustainable profit margin
- and keeping the data center team focused on talent vs hands and
well paid:) You'll see the effects of Haystack on Webshots in our
upcoming redesign and soon to be revised storage limits.
BSU Haystack consists of many Basic Storage Units (BSU), which are just servers with a lot of disks. Content is scattered more or less randomly across all BSU and spindles to maximize the IO throughput of the system. Multiple copies of the content are maintained on disparate equipment so that no single failure can loose all copies of an object.
Failure: In the event of a component failure, Haystack immediately begins a process to copy the "missing" content from one of the redundant sources to the available components. Because the content is scattered across all available components, recovery time is on the order of 1/N. Of course, the failure rate is on the order of N, so the overall availability is on the order of N * 1/N or a constant. Recovery is also has very little impact on the overall performance of the system.
As new capacity is added, existing content is migrated to the new capacity to rebalance the storage and IO across all available units. The rebalancing time is approximately constant.
Separation of Church and State
To minimize the overhead in tracking the location of any given object, Haystack puts objects into buckets and needs to track only the location of the buckets. The applications using Haystack must independently track metadata about each object including it's bucket. At some level, a bucket is really just a directory and each BSU knows what buckets it contains. Haystack maintains a proper cache of the bucket locations and each BSU checks in periodically to report the state of its buckets. The proper cache can easily be rebuilt and the overall system is very tolerant of data inconsistencies between clients, the cache and the BSUs.
Processes
Various processes monitor the overall condition of Haystack and initiate actions as needed to maintain the health. For example, when new capacity is added, these monitoring processes detect the availability of "under-utilized" capacity and begin a bucket migration process to bring the new capacity up to the same levels as the old capacity.
Some of the jobs have names...they are:
leon's job is to identify and remove extra instances
jeopardy makes more copies of data that has too few copies
scalpel removes outdated copies of data
optimist moves data to fill up less-full nodes
pessimist moves data from busy nodes to less busy nodes
Changing the drives over time
Currently we use 400 GB sata drives. As time goes by, the average age of the content will grow and the average number of access per object will decrease. This will allow us to introduce larger capacity disk drives as the system grows, helping to keep the overall hosting and depreciations costs low. The number of sata drives attached to a given BSU is determined by the overall network throughput of the BSU and the ability of the BSU to effectively use the file system cache.
Caching
Haystack uses various content caching and redirection mechanisms to both hide the complexity of the system from clients and to leverage the raw IO capacity of the spindles. The system is very loosely coupled and designed to be quite tolerant of failures. This means that failures are very localized. In addition, the types of failures that can have the largest negative impact are with very proven technologies, such as file systems and disk arrays. Thus the overall reliability is high and the operational costs are low.
Rant - lots of private social media properties are venture funded and lose tons of money storing/serving content to get big to either get bought or change the model.
Finale The challenge at public companies like CNET Networks is to build a social media business model as well as a user & technical model from the "get go". That operating plan must be good for user and cheap enough to operate so we can return a healthly return on invested capital to our shareholders. Haystack is a significant innovation that should help us to build a better product for users, scale for marketing partners, and generate operating cash flow for shareholders.
Whew - if you're still reading then you are likely one of the Webshots or CNET Networks engineers - nice work folks :)
wow time card
wow time card
Posted by: wow time card | December 30, 2009 at 10:34 PM
wow cd key
wow cd key
Posted by: wow cd key | December 30, 2009 at 10:44 PM
Thank you for your sharing! I like i very much!
Posted by: cheap coach handbags | January 25, 2010 at 07:15 PM
Great comments! You are so nice, man! You never know how much i like'em!
Posted by: cheap coach bags | January 26, 2010 at 04:51 PM
Yes, that's cool. The device is amazing! Waiting for your next one!
Posted by: cheap coach purses | January 26, 2010 at 10:30 PM
I like i very much!
Posted by: babyface | February 03, 2010 at 12:46 PM
Great articles and it's so helpful. I want to add your blog into my rrs reader but i can't find the rrs address. Would you please send your address to my email? Thanks a lot!
Posted by: Coach Handbags | February 23, 2010 at 12:07 AM
what's your name? bmx
Posted by: aion time card | March 02, 2010 at 11:40 PM
what's your name? bmx
Posted by: silkroad.com game | March 02, 2010 at 11:42 PM
Thank you very good blog system, I got very good information thanks to your site. I am constantly followed.
Posted by: Konya chat | April 12, 2010 at 05:13 AM
Thank you very good blog system, I got very good information thanks to your site. I am constantly followed.
Posted by: Konya chat | April 12, 2010 at 05:13 AM
Thank you very good blog system, I got very good information thanks to your site. I am constantly followed.
Posted by: Konya chat | April 12, 2010 at 05:14 AM
Thank you very good blog system, I got very good information thanks to your site. I am constantly followed.
Posted by: Konya chat | April 12, 2010 at 05:14 AM
Thank you very good blog system, I got very good information thanks to your site. I am constantly followed.
Posted by: Konya chat | April 12, 2010 at 05:15 AM
I love your blog! You will be in our prayers and thoughts! Nice and informative post on this topic thanks for sharing with us.Thank you!
Posted by: Air Jordans | April 23, 2010 at 11:36 PM
It's great to hear from you and see what you've been up to. All of the projects look great! Thanks!
Posted by: Cheap Jordans | April 24, 2010 at 06:39 PM
Just a quick note to say great to see people blogging about climate change and if you want more information on the topic you might like to check out my blog. Links are always welcome, if you like the site but more importantly it may provide you with a few interesting stories for your readers.
Posted by: Air Jordan 6 | May 06, 2010 at 12:22 AM
How about some free psp games?
Posted by: christian louboutin | May 18, 2010 at 12:22 AM
Those numbers are quite staggering. Its amazing how one little concept can help so many people out.
Posted by: reverse lookup | July 19, 2010 at 01:21 PM
thanks you bro
Posted by: lez | September 05, 2010 at 07:14 PM
ariston beyaz eşya servisi, ariston servisi.
thanks man
Posted by: ariston servisi | October 02, 2010 at 02:11 AM
This website is very nice but a little slow. I wanted to report. Good work ...
Posted by: Gerçek Oyun Sitesi | November 16, 2010 at 01:57 PM
shares have been one of the nice people thank you for your blog
Posted by: Chat | November 25, 2010 at 07:50 AM
shares have been one of the nice people thank you for your blog..(http://www.memleketchat.com)
Posted by: chat siteleri | November 25, 2010 at 07:51 AM
ou want more information on the topic you might like to check out my blog. Links are always welcome, if you like the site but more importantly it may provide you with a few interesting stories for your reader.
( http://konychat.memleketchat.com)
Posted by: konyachat | November 25, 2010 at 07:54 AM