Highly Scalable Systems

This site (HighlyScalableSystems.com) discusses highly scalable systems. It publishes articles, tutorials and news on systems technologies, especially for cloud computing. Contributions of technical articles are welcome. Please check here for contribution information.


Recent posts

Benchmarks are important for understanding performance and for comparing different systems quantitatively and qualitatively. Many analytic frameworks, such as Hive, Impala and Shark, have been designed and implemented in recent years and have become fundamental software for processing big data. How to benchmark these big data analytic systems is an interesting problem. [...]
Mon, Mar 17, 2014, Continue reading at the source
Public cloud storage services such as Amazon S3, Google Cloud Storage and Windows Azure Storage replicate data to ensure high availability. On the other hand, because the data is replicated, these storage services exhibit particular data consistency models. Different cloud service providers currently employ different data consistency models. In this [...]
Tue, Feb 04, 2014, Continue reading at the source
John Ousterhout is a professor in the Department of Computer Science at Stanford University. One recent project he is working on is RAMCloud, a “new class of storage, based entirely in DRAM, that is 2-3 orders of magnitude faster than existing storage systems”. He posts his “Favorite Sayings” on his [...]
Wed, Sep 04, 2013, Continue reading at the source
When dealing with environments where memory is a constraint, it is important to design memory usage intelligently. Be it embedded systems or supercomputers, memory is always expensive. And with each boolean value using a byte, a lot of memory is actually wasted. If not for addressability in languages like C [...]
Fri, Aug 30, 2013, Continue reading at the source
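The point of the excerpt above, that one byte per boolean wastes seven of every eight bits, can be illustrated with a minimal bit-packing sketch. The class name `BitFlags` and its methods are hypothetical, chosen for illustration and not taken from the original post. Python is used only for brevity.

```python
class BitFlags:
    """Hypothetical helper: stores n boolean flags in ceil(n / 8) bytes."""

    def __init__(self, n):
        self.n = n
        # One bit per flag, rounded up to a whole number of bytes.
        self.data = bytearray((n + 7) // 8)

    def _check(self, i):
        if not 0 <= i < self.n:
            raise IndexError("flag index out of range")

    def set(self, i, value=True):
        self._check(i)
        byte, bit = divmod(i, 8)
        if value:
            self.data[byte] |= 1 << bit            # turn the bit on
        else:
            self.data[byte] &= ~(1 << bit) & 0xFF  # turn the bit off

    def get(self, i):
        self._check(i)
        byte, bit = divmod(i, 8)
        return bool((self.data[byte] >> bit) & 1)


flags = BitFlags(20)   # 20 flags fit in 3 bytes instead of 20
flags.set(0)
flags.set(19)
print(len(flags.data), flags.get(0), flags.get(1), flags.get(19))
```

The trade-off is that individual flags are no longer addressable on their own: every read or write needs a shift and a mask.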
Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. Slides download: Software Engineering Advice from Building Large-Scale Distributed Systems by Jeff Dean. Numbers Everyone Should Know: L1 cache reference 0.5 ns; branch mispredict 5 ns; L2 cache reference 7 ns; mutex lock/unlock 100 ns; main memory reference 100 [...]
Thu, Jul 18, 2013, Continue reading at the source
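To get a feel for the latencies quoted in the excerpt above, it can help to express each as a multiple of an L1 cache reference. The sketch below uses only the four figures that appear in full in the excerpt. The main memory entry is truncated there and is left out. The dictionary and function names are illustrative assumptions.

```python
# Latencies quoted in the excerpt, in nanoseconds. The "Main memory reference"
# entry is truncated in the excerpt and is deliberately omitted here.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5.0,
    "L2 cache reference": 7.0,
    "Mutex lock/unlock": 100.0,
}


def relative_cost(name, baseline="L1 cache reference"):
    """Return how many baseline operations fit in one `name` operation."""
    return LATENCY_NS[name] / LATENCY_NS[baseline]


for name in LATENCY_NS:
    print(f"{name}: {relative_cost(name):g}x an L1 cache reference")
```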
Here is a list of tutorials for learning how to write MapReduce programs on Hadoop, the open-source MapReduce implementation with HDFS. MapReduce Tutorials: the official tutorial on the Hadoop MapReduce framework: http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html. Yahoo! Hadoop Tutorial: a comprehensive tutorial on Hadoop from Yahoo! Developer Network: http://developer.yahoo.com/hadoop/tutorial/. More about MapReduce: To better understand [...]
Wed, Jul 17, 2013, Continue reading at the source
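For readers new to the model these tutorials teach, the map, shuffle and reduce phases can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model only. It does not use Hadoop or its API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_words(line):
    """Map phase: emit a (word, 1) pair for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(word, counts):
    """Reduce phase: sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(lines, mapper, reducer):
    """Run mapper over the input, group values by key, then apply reducer."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)  # "shuffle": collect values per key
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


print(run_mapreduce(["a b a", "B c"], map_words, reduce_counts))
```

In a real framework, the map and reduce calls run on many machines and the grouping step moves data across the network. The sketch keeps only the data flow.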
I compiled a list of good systems conferences and their deadlines for my own reference. I share the list here in the hope that it helps others who need one. The list is kept up to date. A PDF version: Systems Conference and Deadlines. Links to conference websites: Systems Conferences. Related posts: Conferences [...]
Tue, Apr 09, 2013, Continue reading at the source
Storage Architecture and Challenges, presented at the Faculty Summit, July 29, 2010, by Andrew Fikes, Principal Engineer. Download PDF. These slides introduce some of Google's storage systems, with insights and discussion of problems. Related posts: Colossus: Successor to the Google File System (GFS); Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google [...]
Tue, Jan 22, 2013, Continue reading at the source
Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean. Everyone who is interested in large distributed systems should read it: PDF for Designs, Lessons and Advice from Building Large Distributed Systems by Jeff Dean. Related posts: Large-scale Data Storage and Processing System in Datacenters; Storage Architecture and Challenges by Andrew [...]
Tue, Jan 22, 2013, Continue reading at the source
Update on Mar. 7, 2014: it seems that Faraz has graduated from Purdue and the webpage is no longer available. Please check the comment for the latest links. MapReduce is a well-known programming model designed for generating and processing large data sets. There are various MapReduce implementations. One widely known and [...]
Thu, Dec 20, 2012, Continue reading at the source

