
Chowhound Re-opens

We purchased Chowhound a little while ago and began rebuilding the software (and updating the UI). It just re-launched over the weekend. Although it's built with the latest Ajax and Ruby on Rails, the site is designed to appeal to people passionate about food, not code.

The members of the site are *very* passionate treasure hunters who share their finds and opinions on restaurants and food around the world. For example, one of my favorite members is Melanie Wong. To give you a flavor of the site, Melanie just posted that prosciutto di Parma is on sale for $9.99 per half pound.

Prosciutto di Parma ($9.99/half-pound) at Traverso's

"Friday afternoon I stopped at Traverso's in downtown Santa Rosa to buy something cold to drink and found a prosciutto di parma sale in progress. Genuine prosciutto di parma is reduced from $32/lb. to $9.99/half-lb. until Wednesday.

The counter people know how to slice here, giving me a sample to make sure it was as thin as I wanted. It was almost gossamer, thin enough to see through but still holding together. I got six slices for a pre-dinner snack for two bucks. The prosciutto is quite delicate with a sweet and faintly salty flavor. I'm pretty sure that it was Fratelli Beretta but not entirely positive."

I love that we can build global neighborhoods around stuff like this.  Stay tuned for the upcoming launch of the new companion food site Chow.

under the radar

Just came back from the Under the Radar conference, where I had the pleasure of "judging" about 10 startups, from podcasting to video remixing. A few of my fellow CNET'ers were there: Rafe moderated one of the panels and Dan Farber was there with digital camera in hand, and each had insights galore. I met Debbie Landa, who runs the conference, and have to say how impressed I was - a real human being, smart, down to earth, opinionated - and I'd like to see her again. I will be back, and hope to bring more people from the CNET Networks business side. We all don't get out as much as we should.

Webshots redesign

We just posted the first public comp of our new design for Webshots in the Webshots blog to get user feedback, in advance of the upcoming beta in late June/early July. I'd love to get your thoughts too.

and yes, that's a video player where the photo should be.  more on that in a minute.

For the last several months we've been working with an advisory group of Webshots users to get their perspective on the needed changes to the front end design and UI.

Additionally, we have been adding talented designers and producers/developers and I now feel we have the necessary user centered design skills to do a complete overhaul of a site that serves 19 million people per month. 

Everyone knows that the current Webshots front end is functional, but it needs major, major changes.  our members knew.  you knew.  we knew.  it's all going to change.  for the better (we hope).

The new changes will be risky - because they touch most areas of the site, and they are pretty radical.

- all major user flows and pages are getting changed.  signing up, uploading, sharing, browsing, searching.  signing up to webshots currently takes something like 9 pages/steps and we ask people to download the software to "get started".  ugh.  changing. 
- category pages will be changed and will pave the way for tags (finally) - but we'll be combining tags with categories.
- our members' pages will be changed.  we are adding better ways for them to publish and share.
- most of the ads are being changed.  no more banners.  no more skyscrapers.  no one likes them.  out out out.  In with sponsorships, paid links, and what we call MPUs (square ads).
- the site will better highlight the great content we have in the community, and the breadth of passions from our users, from citizen news to travel, their hobbies, art, and good times.
- the site will show more of our members (that just doesn't read well).  anyway - we'll highlight people as well as photos.

- finally - the site redesign will pave the way for video.  The cameras we all have take both short videos as well as photos, so after you've captured both video and photos of a recent trip, why shouldn't the site allow you to share both, together? 

The photo page design layout here will be the same for photos and videos, and our albums will allow members to chronicle something with photos, text and video, and store it all in one container, so someone can see everything to do with your trip, or the next hurricane or whatever, all in one place.  You won't have to upload your photos to one service and your videos to another.

The Webshots logo in the upper left is a placeholder  - we're not changing the name - but the logo/font will be changing.  More later.

Consumating relaunch

We relaunched Consumating yesterday.  Ben and Josh have put some wonderful ideas into what is now an online version of a bar.  Consumating started as a dating site, which is what it was when it was just a twinkle in Ben's (and co-founder Adam's) eye(s).  Adam is now at Google.  Josh joined us in January to help Ben develop the site.

Like a bar, you may end up going home with someone, but part of the fun is going to a place that speaks to your personal brand, and hanging out with people who are into what you are into, meeting new people and then going home to your slightly more normal existence.  that's consumating.

The new features are detailed here.

Ben, Josh and consumating have been great additions to the CNET Community Group.    Mike Arrington covered the launch here.

More on Haystack

Haystack powers allyoucanupload and will soon power all of Webshots (Webshots is a photo sharing community with 19,000,000 members who have uploaded over 375,000,000 photos).  Allyoucanupload is an image hosting service that we built to run alongside Webshots.

Haystack is designed to provide a very scalable, reliable and cost-effective platform for object storage and delivery to the Internet.  It just went live 2 weeks ago - we are currently using it for the allyoucanupload image hosting service (gif, jpeg and png).  In the very near future, it will serve all Webshots photos, and soon, video.

I'm doing a long post because I'm very proud of what our technical team has accomplished, but also to give some insight into how finance, user and technical strategy intersect in our group.  Building a large and sustainable (aka profitable) business in social media requires a balancing act between delivering on a great user promise, building a revenue model and keeping your costs under control.  Haystack will allow us to deliver very robust storage solutions for users at a very low marginal cost.

Disclaimer/Credit: Almost all of the technical content in this blog post was written by Paul O, who runs CNET Networks' data center services, including database architecture, network systems and operations and the actual data center.  Paul, Jim, Rodolphe, Marcus, and Matthew built Haystack - they don't (yet) have blogs so I'm doing this post.  Please do not attribute any technical props to me because of this post.  My only contribution to Haystack was to approve its development and cheerlead along the way.

Haystack's content (social media files) has several interesting characteristics: it grows without bound; it tends toward write-once, read-many; the most recent content tends to be the most frequently accessed.  Haystack's design leverages these characteristics.

The challenge:
A big financial and therefore technical issue is the relationship of storage to delivery.  It's relatively easy to deliver a small number of files to a lot of people.  And a large number of files to a small number of people.  A large number of files to a large number of people gets more complicated and can get very expensive very quickly.

Haystack gives us the ability to finely match the raw storage capacity of the system with its overall IO capacity.  Haystack grows very naturally through incremental addition of capacity.  Haystack is designed to handle failures automatically and to keep reliability constant as the system grows.  Haystack uses commodity hardware and software.

The promise of Haystack is that we can handle reliability at scale at a very low (perhaps the lowest) marginal operating cost. Reliability means that we never have to say we're sorry - we lost your photos.  Scale is scale.  Low marginal cost directly goes to our ability to give users as much storage as we can, and run the least intrusive ads, while running Webshots with a sustainable profit margin - and keeping the data center team focused on talent vs hands and well paid:)  You'll see the effects of Haystack on Webshots in our upcoming redesign and soon to be revised storage limits. 

BSU
Haystack consists of many Basic Storage Units (BSUs), which are just servers with a lot of disks.  Content is scattered more or less randomly across all BSUs and spindles to maximize the IO throughput of the system.  Multiple copies of the content are maintained on disparate equipment so that no single failure can lose all copies of an object.
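To make the scattering idea concrete, here is a minimal sketch in Python. It is not Haystack's actual code: the function name, the BSU names and the number of copies are all assumptions made up for illustration. It only shows the property described above, that each copy lands on a different, randomly chosen unit.

```python
import random

def place_copies(units, copies, rng=random):
    """Pick `copies` distinct storage units at random for one piece of content.

    `units` is a list of unit names (hypothetical); sampling without
    replacement guarantees that no two copies share a unit.
    """
    if copies > len(units):
        raise ValueError("not enough units to keep every copy on separate equipment")
    return rng.sample(units, copies)

# Hypothetical example: four BSUs, two copies of each piece of content.
print(place_copies(["bsu-1", "bsu-2", "bsu-3", "bsu-4"], 2))
```

Because the choice is random, load spreads across all units on average, and losing any one unit still leaves at least one copy elsewhere.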

Failure: In the event of a component failure, Haystack immediately begins a process to copy the "missing" content from one of the redundant sources to the available components.  Because the content is scattered across all available components, the re-copy work is shared among them, so with N components the recovery time is on the order of 1/N.  Of course, the failure rate is on the order of N, so the total time spent recovering is on the order of N * 1/N, or a constant, and overall availability stays roughly constant as the system grows.  Recovery also has very little impact on the overall performance of the system.
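The N * 1/N argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: the failure rate and the re-copy time are invented numbers, not measurements from Haystack, and the function name is hypothetical.

```python
def hours_in_recovery_per_year(n_units, failures_per_unit_per_year, hours_to_recopy_alone):
    """Expected hours per year spent recovering, under the 1/N assumption.

    Failures scale with the number of units (N), while each recovery is
    shared by all units and so takes roughly 1/N of the single-unit time.
    """
    failures_per_year = n_units * failures_per_unit_per_year
    hours_per_recovery = hours_to_recopy_alone / n_units
    return failures_per_year * hours_per_recovery

# Made-up inputs: 5% annual failure rate per unit, 40 hours to re-copy alone.
for n in (10, 100, 1000):
    print(n, hours_in_recovery_per_year(n, 0.05, 40))
```

The result is the same for every value of N (up to floating-point rounding), which is the "constant" the paragraph above refers to.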

As new capacity is added, existing content is migrated to the new capacity to rebalance the storage and IO across all available units.  The rebalancing time is approximately constant.

Separation of Church and State
To minimize the overhead in tracking the location of any given object, Haystack puts objects into buckets and needs to track only the location of the buckets.  The applications using Haystack must independently track metadata about each object, including its bucket.  At some level, a bucket is really just a directory and each BSU knows what buckets it contains.  Haystack maintains a proper cache of the bucket locations and each BSU checks in periodically to report the state of its buckets.  The proper cache can easily be rebuilt and the overall system is very tolerant of data inconsistencies between clients, the cache and the BSUs.
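A rough sketch of how a bucket location cache of this kind could work, in Python. The class and method names are hypothetical, and the real system's check-in protocol is not described here, so treat this as an illustration of the idea only: the cache holds nothing that the BSUs cannot report again, which is why it is easy to rebuild.

```python
from collections import defaultdict

class BucketLocationCache:
    """Tracks which units hold which buckets, based only on unit check-ins."""

    def __init__(self):
        self._holders = defaultdict(set)  # bucket id -> set of unit names

    def check_in(self, unit, buckets):
        """Replace everything known about `unit` with its latest report."""
        for holders in self._holders.values():
            holders.discard(unit)
        for bucket in buckets:
            self._holders[bucket].add(unit)

    def locate(self, bucket):
        """Return the units currently reporting this bucket (possibly none)."""
        return sorted(self._holders.get(bucket, ()))

cache = BucketLocationCache()
cache.check_in("bsu-a", ["bucket-1", "bucket-2"])
cache.check_in("bsu-b", ["bucket-2"])
print(cache.locate("bucket-2"))
```

In this sketch the application would keep its own record of which bucket each object lives in, and ask the cache only where that bucket is.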

Processes
Various processes monitor the overall condition of Haystack and initiate actions as needed to maintain the health.  For example, when new capacity is added, these monitoring processes detect the availability of "under-utilized" capacity and begin a bucket migration process to bring the new capacity up to the same levels as the old capacity.
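As an illustration of what a bucket migration plan might look like, here is a small Python sketch. It is an assumption-laden simplification, not the real process: the function name is made up, it balances only by bucket count, and it ignores the rule that copies of the same bucket must stay on separate equipment.

```python
def plan_migrations(bucket_counts):
    """Return (source, destination) single-bucket moves that level the
    number of buckets per unit to within one bucket of each other."""
    counts = dict(bucket_counts)  # work on a copy
    moves = []
    if not counts:
        return moves
    while True:
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] <= 1:
            return moves
        counts[fullest] -= 1
        counts[emptiest] += 1
        moves.append((fullest, emptiest))

# Hypothetical example: two existing units and one newly added, empty unit.
print(plan_migrations({"old-1": 6, "old-2": 6, "new-1": 0}))
```

Every move in the example goes to the new unit, which matches the description above of bringing new capacity up to the same level as the old.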

Some of the jobs have names.  They are:
- leon identifies and removes extra instances
- jeopardy makes more copies of data that has too few copies
- scalpel removes outdated copies of data
- optimist moves data to fill up less-full nodes
- pessimist moves data from busy nodes to less busy nodes
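To give a feel for the kind of check that jobs like jeopardy and leon perform, here is a hedged Python sketch. The function name, the data layout and the target number of copies are assumptions for illustration; the actual jobs and their rules are not spelled out in this post.

```python
def plan_copy_fixes(holders_by_bucket, target_copies):
    """Split buckets into those with too few copies and those with too many.

    `holders_by_bucket` maps a bucket id to the collection of units holding it.
    A jeopardy-style job would act on the first list, a leon-style job on the second.
    """
    too_few = sorted(b for b, holders in holders_by_bucket.items()
                     if len(holders) < target_copies)
    too_many = sorted(b for b, holders in holders_by_bucket.items()
                      if len(holders) > target_copies)
    return too_few, too_many

# Hypothetical state with a target of two copies per bucket.
state = {
    "bucket-1": {"bsu-a"},
    "bucket-2": {"bsu-a", "bsu-b"},
    "bucket-3": {"bsu-a", "bsu-b", "bsu-c"},
}
print(plan_copy_fixes(state, 2))
```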

Changing the drives over time
Currently we use 400 GB SATA drives.  As time goes by, the average age of the content will grow and the average number of accesses per object will decrease.  This will allow us to introduce larger capacity disk drives as the system grows, helping to keep the overall hosting and depreciation costs low.  The number of SATA drives attached to a given BSU is determined by the overall network throughput of the BSU and the ability of the BSU to effectively use the file system cache.

Caching
Haystack uses various content caching and redirection mechanisms to both hide the complexity of the system from clients and to leverage the raw IO capacity of the spindles.  The system is very loosely coupled and designed to be quite tolerant of failures.  This means that failures are very localized.  In addition, the types of failures that can have the largest negative impact are with very proven technologies, such as file systems and disk arrays.  Thus the overall reliability is high and the operational costs are low.

Rant - lots of private social media properties are venture funded and lose tons of money storing/serving content to get big to either get bought or change the model. 

Finale
The challenge at public companies like CNET Networks is to build a social media business model as well as a user & technical model from the "get go".  That operating plan must be good for users and cheap enough to operate so we can deliver a healthy return on invested capital to our shareholders.  Haystack is a significant innovation that should help us to build a better product for users, scale for marketing partners, and generate operating cash flow for shareholders.

Whew - if you're still reading then you are likely one of the Webshots or CNET Networks engineers - nice work folks :)

String Theory

I wish I could post about string theory :)

We are scoping the suggestions received on allyoucanupload.com.  I'll post the results soon.  But I just learned something about integers and strings.   As someone on the business side of tech I have to say that it's awe-inspiring to see the technical complexity that is required to build simple and scalable things for users.  This is a neat example, because it's not even that complicated.

As I took the snip url idea back to James Park - who runs the product development team, he told me that the way to create unique but short URLs is to use numbers *and* letters.  That way you get lots of permutations within a small url.

Makes sense, I said - that's what tinyurl and snipurl do.  How hard is it to do letters and numbers?

Easy, said James.

"Perfect," I said...but James was looking up at the ceiling, then he turned to me and said, "but a letter is a string.  And a string is a lot slower than an integer at scale - and we need things that work very fast at scale."

Not so easy.  not so perfect.  nothing ever is.  and that's what's so damn fun about this business.

we're looking further into it.  If anyone knows anything relevant - let me know and that might help us get to the answer faster.  cheers. 
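One common way to get both properties, sketched below in Python, is to keep a plain integer as the key and convert it to letters and numbers only when building the URL. This is a general technique (base-62 encoding), offered as a possibility and not as what the team decided; the alphabet order and function names are assumptions.

```python
import string

# 10 digits + 26 lowercase + 26 uppercase letters = 62 symbols (order is an arbitrary choice)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)

def encode(number):
    """Turn a non-negative integer id into a short letters-and-digits code."""
    if number < 0:
        raise ValueError("id must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))

def decode(code):
    """Turn a short code back into the integer id it came from."""
    number = 0
    for ch in code:
        number = number * BASE + ALPHABET.index(ch)
    return number

photo_id = 375000000  # about the number of Webshots photos mentioned earlier
print(encode(photo_id), decode(encode(photo_id)))
```

With 62 symbols, five characters cover more than 900 million ids, so lookups can stay on integers while the URL stays short.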
