How To: Check Couchbase Speed

Memcached is at the heart of Couchbase, and it's the world's fastest and most reliable cache solution. As with any process or application on a server, many variables go into determining speed.

PRO TIP – When testing or evaluating Couchbase, remember to install it on a vanilla OEM operating system image. I have seen Couchbase perform sub-par many times because a “standard corporate” image had other applications or settings that interfered with Couchbase operations.

How do we measure the performance of Couchbase? The tool to use for performance information is CBSTATS. CBSTATS is a CLI tool usually located on the Couchbase server at /opt/couchbase/bin/cbstats. It reports performance stats for one particular node and bucket at a time. The stats cover the Couchbase server side only and contain no client-side data.

#./cbstats <IP>:11210 <command> -b <bucket_name> -p <bucket_password>

<command>
    all
    allocator
    checkpoint [vbid]
    dcp
    dcpagg
    dispatcher [logs]
    failovers
    hash [detail]
    items
    kvstore
    kvtimings
    raw argument
    reset
    slabs
    tap [username password]
    tapagg
    timings
    vkey keyname vbid

NOTE – CBSTATS works on a per-node, per-bucket basis, and it only measures speed inside the Couchbase Server application itself, not in the SDK.

The specific command you want is timings:

#./cbstats 127.0.0.1:11210 timings -b <bucket_name> -p <bucket_password>

This will give you a histogram of all the specific events. You will not be able to track down a single, specific GET() or SET(), but you can see a history. Many times you will want to test, clear the history, and test again. The best method for that is to use the reset command to clear the histogram:

#./cbstats 127.0.0.1:11210 reset -b <bucket_name> -p <bucket_password>

For more details about CBSTATS, see the official documentation: http://docs.couchbase.com/admin/admin/CLI/cbstats-intro.html
The source code for CBSTATS is on GitHub: https://github.com/couchbase/couchbase-cli
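The reset, test, and timings cycle described above can be scripted. Below is a minimal Python sketch, assuming cbstats lives at the usual path and accepts exactly the arguments shown in this post. The function names and the workload callable are hypothetical, made up for illustration. The sketch returns the raw timings text and does not try to parse the histogram.

```python
import subprocess

# Usual location according to this post; adjust if cbstats is installed elsewhere.
CBSTATS = "/opt/couchbase/bin/cbstats"


def build_cbstats_args(host, command, bucket, password, port=11210):
    """Return the argv list for: cbstats <host>:<port> <command> -b <bucket> -p <password>."""
    return [CBSTATS, f"{host}:{port}", command, "-b", bucket, "-p", password]


def timings_after_reset(host, bucket, password, workload):
    """Clear the histograms, run the caller's workload, then return the raw timings text."""
    # Step 1: clear the history so old events do not pollute the new measurement.
    subprocess.run(build_cbstats_args(host, "reset", bucket, password), check=True)

    # Step 2: drive the test traffic (hypothetical callable supplied by the caller).
    workload()

    # Step 3: collect the histograms for this node and bucket.
    result = subprocess.run(
        build_cbstats_args(host, "timings", bucket, password),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Because CBSTATS is per node and per bucket, a cluster-wide picture would mean calling timings_after_reset once for each node and bucket of interest.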

Fast Machine Learning with PB-BLAS & Michael Payne in HPCC Systems

Watch Michael Payne, a PhD student at Clemson University, talk about PB-BLAS (Parallel Block Basic Linear Algebra Subprograms) at the 2014 HPCC Systems Summit. With PB-BLAS, machine learning algorithms can be more efficient and run 6X faster or more. The HPCC Systems Machine Learning Library has just been updated: https://github.com/hpcc-systems/ecl-ml

Wikipedia: https://en.wikipedia.org/wiki/PBLAS
Detailed Paper: http://www.netlib.org/utk/people/JackDongarra/PAPERS/079_1996_pb-blas-a-set-of-parallel-block-basic-linear-algebra-subroutines.pdf

Machine Learning

ECL is the main method to query HPCC Systems. To do machine learning in HPCC Systems, just import the machine learning ECL library from GitHub: https://github.com/hpcc-systems/ecl-ml

The library has lots of tools and pre-built functions to get the data scientist up and running. Check out the image below to see all the options.

To get started, go to: http://hpccsystems.com/ml/ml-getting-started

What is ECL (Enterprise Control Language)?

ECL (Enterprise Control Language) is a declarative query language, compiled to C++, for use with the HPCC Systems Big Data platform. ECL's syntax and format are very simple and easy to learn.

    // Schema to use on data
    Layout_Person := RECORD
      UNSIGNED1 PersonID;
      STRING15 FirstName;
      STRING25 LastName;
    END;

    // Calling the data to be used. Similar to USE DATABASE; (SQL)
    // Example inline data
    allPeople := DATASET([{1,'Fred','Smith'},
                          {2,'Joe','Blow'},
                          {3,'Jane','Smith'}], Layout_Person);

    // FILTER by LastName
    somePeople := allPeople(LastName = 'Smith');

    // Outputs ---
    OUTPUT(somePeople);

Hadoop Pig
Note – ECL is very similar to Hadoop’s Pig, but more expressive and feature rich.

How To: Installing HPCC Systems on a single machine – Getting Started

HPCC Systems is easy to install and takes about 5 minutes. It's so easy that even an 8-year-old can install it :-). Watch the video below to see how easy it is. (5:08 minutes)

After installing, go to http://<ip_address_of_install>:8010 to get to the administrator screen and query playground.

Download here: http://hpccsystems.com/download/free-community-edition/server-platform
Or get the source code on GitHub: https://github.com/hpcc-systems
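A quick way to confirm the install worked is to check that something answers on port 8010. Below is a minimal Python sketch using only the standard library. The function names and the example IP address are hypothetical, and the check only tells you that an HTTP server responded, not that the platform is healthy.

```python
import urllib.error
import urllib.request


def eclwatch_url(ip_address, port=8010):
    """Build the URL of the administrator screen given in this post."""
    return f"http://{ip_address}:{port}"


def is_reachable(url, timeout=5.0):
    """Return True if anything answers HTTP at url, even with an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server replied (for example 401), so it is up
    except OSError:
        return False  # connection refused, timed out, or host not found


if __name__ == "__main__":
    url = eclwatch_url("192.168.56.101")  # hypothetical address of the install
    print(url, "reachable" if is_reachable(url) else "not reachable")
```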

What is Couchbase?

No Schema, No Problem

Couchbase merges both Memcached and CouchDB into one. You get the speed and reliability of Memcached and the easy clustering and JSON database of CouchDB, all in one. Couchbase is the world's fastest NoSQL database. Do you have a large Memcached cluster, and are you tired of down nodes meaning down time? Couchbase is a great drop-in replacement that persists data to disk with replica copies, so that a down node no longer means down time.

Download Couchbase here: http://www.couchbase.com/nosql-databases/downloads
Or go to the AWS Marketplace to create a Couchbase cluster instantly: https://aws.amazon.com/marketplace/search/results/?page=1&searchTerms=couchbase

Couchbase 3.0 – NoSQL with Power

Streaming Change Replication

Couchbase Server 3.0 introduces Database Change Protocol (DCP), an innovative protocol for replicating changes to components, nodes, and data centers via streams. DCP is a fundamental extension of the memory-centric architecture of Couchbase Server, removing IO bottlenecks from indexing, rebalancing, recovery, and replication that lead to improvements ranging from 2x to more than 100x. By leveraging in-memory streams of changes, DCP enables real time processing and administration.

DCP is really why this is not just a better version of Couchbase 2.5.1. With this feature, Couchbase is taking its first step toward becoming an ACID compliant database. With DCP coupled with ForestDB, Couchbase’s homegrown data store (https://github.com/couchbaselabs/forestdb), we should be seeing transactions between two or more documents in the near future.

Optimization for Massive Data Sets

Couchbase Server 3.0 introduces optimization for massive data sets. By default, Couchbase Server caches all metadata in memory. However, while caching all metadata decreases latency for the entire data set, it may be impractical for massive data sets. With dynamically tunable memory, Couchbase Server can now be configured on the fly to cache a partial or full working set of data in memory, to optimize memory utilization for massive data sets.

The big plus and weakness of Couchbase 2.5.1 and lower was that you always had to keep your data set's KEY/META in memory. Now, with the ability to evict KEY/META/VALUE from memory and still store it on disk, you're only bound by the size of your hard drive.

Faster Cross Data Center Replication

By leveraging DCP, Couchbase Server 3.0 enables faster cross data center replication (XDCR) with memory-to-memory replication. The result is up to 4x lower latency ensuring consistency between multiple data centers. In addition, in the event communication is interrupted, DCP enables data centers to resume replication from where they left off rather than a checkpoint for increased efficiency.

Once again, DCP is a big game changer. In 2.5.1, items had to be written to disk before they could be replicated via XDCR to another cluster. Now changes are sent in real time without having to be written to disk first, so the only latency is the WAN or LAN trip to the other cluster.

Faster View Updates

By leveraging DCP, Couchbase Server 3.0 enables faster view updates by applying in-memory changes rather than waiting for all changes to be persisted first. The result is up to 50x lower latency for consistent views. Views can now be leveraged to power real-time dashboards summarizing continuous streams of data.

Yup, DCP again. Before, changes had to be written to disk first and then placed in the indexing queue. Now items go directly from DCP to the indexing queue. PLUS, a partial rewrite of the map/reduce engine in C also means a 2X-13X improvement in view indexing performance. If you are using CouchDB, now is the time to switch for that reason alone.

Automatic, Optimized Resource Utilization

Couchbase Server 3.0 introduces a shared thread pool for increased throughput and decreased latency. Couchbase Server automatically configures the shared thread pool with processor detection. In addition, the shared thread pool increases resource utilization with point-in-time workload optimization by allocating threads for reads and writes.

CPUs are getting cheaper, and customers now have 8 to 48 cores in a single node. Before, those extra CPUs would lie idle, but now you can use them all.

Source: http://www.couchbase.com/whats-new-in-3-0